[server-dev] [RFC] Interactive transactions in IProto
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Wed Nov 14 17:09:48 MSK 2018
Could you please put this RFC into a file in the doc/rfc folder and
push it to a branch, so that it can be merged into master later?
On 13/11/2018 20:51, Vladimir Davydov wrote:
> ** The problem **
>
> If a CALL request leaves a transaction open upon return, the transaction
> will be forcefully aborted. This makes sense for memtx, because it
> doesn't support fiber yield (yet). With vinyl, however, the user may
> want to continue executing the same transaction in the next CALL, which
> currently isn't possible.
>
> Another use case is the SQL EXECUTE request, which is used for executing
> SQL statements. The problem is that this request can only execute a
> single SQL statement, so without transactions in IProto it is impossible
> to implement SQL transactions on a remote client (e.g. via JDBC).
>
> See [1] for more details.
>
> ** The solution **
>
> Introduce the concept of streams within an IProto connection, as
> suggested by Georgy Kirichenko:
>
> 1. Introduce a new request header key, IPROTO_STREAM_ID.
> 2. The stream id is generated by the user and passed along with all
> requests that are supposed to be executed in the same stream.
> 3. All requests within the same stream are executed sequentially.
> 4. Requests from different streams may be executed in parallel.
> 5. If a transaction is left open by a request in a stream, the next
> request will reuse it.
> 6. If IPROTO_STREAM_ID is unset (0), everything works as before, i.e.
> no transaction preservation or request serialization will occur.
>
> The net.box API will look like this:
>
> c = net_box.connect(...)
> s = c:make_stream(stream_id)
> s:call(...)
Looks great.
>
> A net.box stream instance will be a wrapper around the connection it was
> created for. It will have all the same methods as the connection itself,
> but all requests sent on its behalf will have the stream id attached.
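To make the wrapper idea concrete, here is a toy model in Python. It is
not net.box code: the Connection and Stream classes, the recorded request
dicts, and the "stream_id" field name are all hypothetical stand-ins. It
only shows that a stream can forward to its connection and attach the id.

```python
class Connection:
    """Toy stand-in for a connection: records requests instead of sending them."""

    def __init__(self):
        self.sent = []

    def _request(self, kind, args, stream_id=0):
        # stream_id == 0 models an unset IPROTO_STREAM_ID (old behaviour).
        self.sent.append({"type": kind, "args": args, "stream_id": stream_id})

    def call(self, func, *args, stream_id=0):
        self._request("CALL", (func,) + args, stream_id)

    def execute(self, sql, *params, stream_id=0):
        self._request("EXECUTE", (sql,) + params, stream_id)

    def make_stream(self, stream_id):
        return Stream(self, stream_id)


class Stream:
    """Same request methods as the connection, with a fixed stream id attached."""

    def __init__(self, conn, stream_id):
        if stream_id == 0:
            raise ValueError("stream id 0 means 'no stream'")
        self._conn = conn
        self._stream_id = stream_id

    def call(self, func, *args):
        self._conn.call(func, *args, stream_id=self._stream_id)

    def execute(self, sql, *params):
        self._conn.execute(sql, *params, stream_id=self._stream_id)
```

With this model, c.make_stream(7).call("f", 1) records a CALL carrying
stream id 7, while a plain c.call("g") records stream id 0.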
>
> ** Benefits **
>
> The concept of a stream should be very clear to the user. Basically, it
> gives you the ability to multiplex several connections within one, even
> if there's no intention to reuse transactions.
>
> For example, we could use it in JDBC to multiplex different clients via
> the same connection: each client would be assigned a unique stream id,
> but they would all go through the same connection thus saving system
> resources.
>
> ** Technicalities **
>
> First, we will need to make it possible to detach transactions from
> fibers. This isn't as simple as it may seem, because all transaction
> data is allocated on the fiber->gc region. My proposal for how this
> should be done is given in [2].
>
> Second, we will have to add a hash table mapping stream ids to stream
> objects in the tx thread. A stream object would basically be a queue of
> requests awaiting execution plus an open transaction, if any. When a new
> request is sent to iproto, it is submitted to the tx thread and then
> either executed directly by a fiber of the fiber pool or queued to the
> stream object if there's already a request from the same stream being
> executed by another fiber. When a fiber finishes executing a request, it
> checks if there are more requests in the stream queue and continues
> execution if so.
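The queueing rule described above can be modelled roughly as follows.
This is a Python sketch, not Tarantool code: TxThread, StreamState,
submit() and finish() are hypothetical names, fibers are replaced by
explicit finish() calls, and only non-zero stream ids are modelled.

```python
from collections import deque


class StreamState:
    def __init__(self):
        self.queue = deque()  # requests awaiting execution
        self.running = None   # request currently being executed, if any
        self.txn = None       # open transaction, if any (not exercised here)


class TxThread:
    def __init__(self):
        self.streams = {}     # stream id -> StreamState (the hash table)
        self.log = []         # order in which requests started executing

    def submit(self, stream_id, request):
        """Called when iproto hands a request over to the tx thread."""
        stream = self.streams.setdefault(stream_id, StreamState())
        if stream.running is not None:
            # Another fiber is already executing a request of this stream.
            stream.queue.append(request)
        else:
            self._start(stream_id, stream, request)

    def _start(self, stream_id, stream, request):
        stream.running = request
        self.log.append((stream_id, request))

    def finish(self, stream_id):
        """Called when the fiber executing the stream's request is done."""
        stream = self.streams[stream_id]
        stream.running = None
        if stream.queue:
            # Continue with the next queued request of the same stream.
            self._start(stream_id, stream, stream.queue.popleft())
```

In this model, a second request submitted to stream 1 while the first is
still running does not start until finish(1) is called, whereas a request
submitted to stream 2 starts immediately.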
>
> ** Questions **
>
> Should we limit the number of streams somehow? I don't think so, at
> least not right now, because streams are completely user controlled,
> like iproto connections.
>
> How do we close a stream so that the corresponding stream object is
> destroyed on the server? Do we need to bother at all? Maybe dropping
> all streams along with a connection would be enough?
>
> Should we avoid a fiber_call() when queueing a request? In other words,
> should streams be implemented inside fiber_pool so that we don't need to
> execute a call in a fiber when all it's going to do is queue a request
> to be executed by another fiber? This would look cleaner and would
> probably be more efficient.
More questions to think about:
1. How to balance stream processing? How should the TX thread choose
which of the many streams to pick the next request from? I think
round-robin is fair enough, but maybe I am wrong.
2. How about storing and balancing request queues in the iproto thread?
I think we should not waste TX thread time on these purely technical
things. The only problem I see is that it requires storing struct txn *
in the iproto thread inside struct stream (or any other struct <name>).
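As an illustration of the round-robin option from question 1, here is a
small Python sketch. The function name and the dict-of-deques layout are
assumptions made for the example, and it ignores the rule that a stream's
next request may only start after the previous one has finished.

```python
from collections import deque


def round_robin(queues):
    """Yield (stream_id, request) pairs, one request per stream per turn.

    `queues` maps a stream id to a deque of pending requests.
    """
    ready = deque(sid for sid, q in queues.items() if q)
    while ready:
        sid = ready.popleft()
        yield sid, queues[sid].popleft()
        if queues[sid]:
            # The stream still has work: send it to the back of the line.
            ready.append(sid)
```

A stream with many pending requests cannot starve the others here: after
each request it goes to the back of the line behind every other stream
that still has work.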
>
> ** Alternatives **
>
> Instead of introducing streams, we could return transaction_id in reply
> to a CALL or EXECUTE request that opened a transaction. The user could
> then use this transaction_id with the next CALL to continue working with
> the same transaction. However,
>
> - This would break compatibility. Currently, if the transaction is open
> by the time a CALL returns, it will be aborted. If we don't do that
> we can end up accumulating stale transactions on the server side in
> case there's a mistake in the client code. We could probably add a
> flag that would specify if the client intends to leave the
> transaction open though, but that would mean we'd have to add yet
> another new key to the iproto protocol.
>
> - Generating transaction_id on the server and returning it back to the
> client would look rather ugly when it comes to implementing net_box
> client, because we would have to add extra return parameters to
> net.box 'call' method, which isn't very convenient. Compare with
> streams where the user gets a stream object that behaves exactly like
> a connection, but guarantees sequential execution and transaction
> preservation.
>
> - How to deal with transactions that are kept open by the server
> for too long? Abort them on timeout?
>
> - Currently, there's no way to guarantee that certain requests execute
> sequentially. Streams provide the user with such a way so they can be
> useful even without transactions. Returning transaction_id can't be
> used for anything else but interactive transactions.
>
> ** References **
>
> [1] https://github.com/tarantool/tarantool/issues/2503
> [2] https://github.com/tarantool/tarantool/issues/2503#issuecomment-415480435
>