[server-dev] [RFC] Interactive transactions in IProto

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Nov 14 17:09:48 MSK 2018


Could you please put this RFC into a file in the doc/rfc folder
and push it to a branch, so that it can be merged into master
later?

On 13/11/2018 20:51, Vladimir Davydov wrote:
> ** The problem **
> 
> If a CALL request leaves a transaction open upon return, the transaction
> will be forcefully aborted. This makes sense for memtx, because it
> doesn't support fiber yield (yet). However, in the case of vinyl, the
> user may want to continue executing the same transaction in the next
> CALL, which currently isn't possible.
> 
> Another use case for this is the SQL EXECUTE request, which is used for
> executing SQL statements. The problem is that this request can only
> execute a single SQL statement, so without transactions in IProto it is
> impossible to implement SQL transactions on a remote client (e.g. via
> JDBC).
> 
> See [1] for more details.
> 
> ** The solution **
> 
> Introduce the concept of streams within an IProto connection, as
> suggested by Georgy Kirichenko:
> 
>   1. Introduce new request header key IPROTO_STREAM_ID.
>   2. The stream id is generated by the user and passed along with all
>      requests that are supposed to be executed in the same stream.
>   3. All requests within the same stream are executed sequentially.
>   4. Requests from different streams may be executed in parallel.
>   5. If a transaction is left open by a request in a stream, the next
>      request will reuse it.
>   6. If IPROTO_STREAM_ID is unset (0), everything works as before, i.e.
>      no transaction preservation or request serialization will occur.
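
To check my understanding of rules 5 and 6, here is a small Python model of
the intended behaviour. It is only a sketch: the Server class, the handle()
method and the "begin"/"commit" operation strings are made up for
illustration and are not part of Tarantool or of the proposal.

```python
class Server:
    """Toy model of transaction preservation per stream (hypothetical names)."""

    def __init__(self):
        self.open_txn = {}   # stream id -> statements of the still-open txn
        self.committed = []  # list of committed transactions
        self.aborted = []    # list of forcefully aborted transactions

    def handle(self, stream_id, ops):
        # Rule 5: a request in a stream reuses the transaction left open
        # by the previous request of the same stream, if any.
        txn = self.open_txn.pop(stream_id, None)
        for op in ops:
            if op == "begin":
                txn = []
            elif op == "commit":
                if txn is not None:
                    self.committed.append(txn)
                txn = None
            elif txn is None:
                self.committed.append([op])  # autocommit outside a txn
            else:
                txn.append(op)
        if txn is None:
            return
        if stream_id == 0:
            # Rule 6: without a stream id everything works as before,
            # so a transaction left open is aborted.
            self.aborted.append(txn)
        else:
            self.open_txn[stream_id] = txn


srv = Server()
srv.handle(0, ["begin", "insert x"])   # no stream: aborted on return
srv.handle(5, ["begin", "insert y"])   # stream 5: txn stays open
srv.handle(5, ["insert z", "commit"])  # stream 5: continues the same txn
```

After these three requests the model has one aborted transaction (from the
request without a stream id) and one committed transaction containing both
statements sent via stream 5.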
> 
> The net.box API will look like this:
> 
>    c = net_box.connect(...)
>    s = c:make_stream(stream_id)
>    s:call(...)

Looks great.

> 
> A net.box stream instance will be a wrapper around the connection it was
> created for. It will have all the same methods as the connection itself,
> but all requests sent on its behalf will have the stream id attached.
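
For illustration, the wrapper idea could be modelled roughly like this (in
Python rather than Lua, just to keep it runnable on its own). FakeConnection,
Stream and the request() signature are assumptions of mine, not the net.box
API; rejecting stream id 0 on the client is also my assumption, based on 0
meaning "unset" in the proposal.

```python
class FakeConnection:
    """Hypothetical stand-in for a connection: records what would be sent."""

    def __init__(self):
        self.sent = []

    def request(self, method, args, stream_id=0):
        # stream_id == 0 models an unset IPROTO_STREAM_ID.
        self.sent.append(
            {"method": method, "args": args, "stream_id": stream_id}
        )


class Stream:
    """Thin wrapper: forwards to the connection, attaching the stream id."""

    def __init__(self, conn, stream_id):
        if stream_id == 0:
            # Assumption: 0 is reserved for "no stream".
            raise ValueError("stream id 0 means 'no stream'")
        self._conn = conn
        self._stream_id = stream_id

    def request(self, method, args):
        self._conn.request(method, args, stream_id=self._stream_id)

    def call(self, func, *args):
        self.request("call", (func, args))


conn = FakeConnection()
s = Stream(conn, 7)
s.call("box.begin")                 # goes out with stream id 7
conn.request("call", ("ping", ()))  # plain connection: stream id stays 0
```

The point of the sketch is that the stream object holds no state besides the
id, so creating one costs nothing on the client.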
> 
> ** Benefits **
> 
> The concept of a stream should be very clear to the user. Basically, it
> gives you the ability to multiplex several connections within one, even
> if there's no intention to reuse transactions.
> 
> For example, we could use it in JDBC to multiplex different clients via
> the same connection: each client would be assigned a unique stream id,
> but they would all go through the same connection thus saving system
> resources.
> 
> ** Technicalities **
> 
> First, we will need to make it possible to detach transactions from
> fibers. This isn't as simple as it may seem, because all transaction
> data is allocated on the fiber->gc region. My proposal for how this
> should be done is given in [2].
> 
> Second, we will have to add a hash table mapping stream ids to stream
> objects in the tx thread. A stream object would basically be a queue of
> requests awaiting execution plus an open transaction if any. When a new
> request is sent to iproto, it is submitted to the tx thread and then
> either executed directly by a fiber of the fiber pool or queued to the
> stream object if there's already a request from the same stream being
> executed by another fiber. When a fiber finishes executing a request, it
> checks if there are more requests in the stream queue and continues
> execution if so.
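
As I read it, the dispatch logic in the tx thread would be roughly the
following. This is a Python model under my own assumptions: TxThread,
StreamState, submit() and finish() are hypothetical names, and fibers are
not modelled at all (the caller plays the role of the fiber pool).

```python
from collections import deque


class StreamState:
    """Per-stream object: queued requests plus an open transaction, if any."""

    def __init__(self):
        self.queue = deque()
        self.busy = False  # True while some fiber executes this stream
        self.txn = None    # placeholder for the detached transaction


class TxThread:
    def __init__(self):
        self.streams = {}  # stream id -> StreamState (the hash table)

    def submit(self, stream_id, request):
        """Return True if the request may run now, False if it was queued."""
        stream = self.streams.setdefault(stream_id, StreamState())
        if stream.busy:
            # Another fiber is already executing a request of this stream.
            stream.queue.append(request)
            return False
        stream.busy = True
        return True

    def finish(self, stream_id):
        """Called when a request is done; return the next one or None."""
        stream = self.streams[stream_id]
        if stream.queue:
            return stream.queue.popleft()  # the same fiber continues
        stream.busy = False
        return None


tx = TxThread()
started_a = tx.submit(1, "a")  # stream 1 idle: runs directly
started_b = tx.submit(1, "b")  # stream 1 busy: queued
started_c = tx.submit(2, "c")  # other stream: runs in parallel
next_req = tx.finish(1)        # fiber of "a" picks up "b"
after = tx.finish(1)           # queue empty: stream becomes idle
```

Note that in this model nothing ever removes an entry from the hash table,
which is exactly the "how to close a stream" question.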
> 
> ** Questions **
> 
> Should we limit the number of streams somehow? I don't think so, at
> least not right now, because streams are completely user controlled,
> like iproto connections.
> 
> How to close a stream so that the corresponding stream object is
> destroyed on the server? Do we need to bother at all? Maybe dropping
> all streams along with a connection would be enough?
> 
> Should we avoid a fiber_call() when queueing a request? In other words,
> should streams be implemented inside fiber_pool so that we don't need to
> execute a call in a fiber in case all it's going to do is just queue a
> request to be executed by another fiber? This would look cleaner and
> would probably be more efficient.

More questions to think about:

1. How to balance processing between streams? When many streams have
pending requests, from which one should the TX thread pick up the next
request? I think round-robin is fair enough, but maybe I am wrong.

2. How about storing and balancing request queues in the iproto thread?
I think we should not waste TX thread time on these purely technical
things. The only problem I see is that it requires storing struct txn *
in the iproto thread inside struct stream (or any other struct <name>).
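
To make question 1 concrete, by round-robin I mean something like the
following: take one request from each stream that has pending requests, then
start over. It is a Python sketch only; round_robin() and the dict-of-lists
input are made up for the example and say nothing about which thread should
own the queues.

```python
from collections import deque


def round_robin(queues):
    """Yield (stream_id, request), one request per stream per pass.

    `queues` maps a stream id to its pending requests in arrival order.
    The order inside each stream is preserved.
    """
    pending = deque(
        (sid, deque(reqs)) for sid, reqs in queues.items() if reqs
    )
    while pending:
        sid, reqs = pending.popleft()
        yield sid, reqs.popleft()
        if reqs:
            # The stream still has work: put it at the back of the line.
            pending.append((sid, reqs))


order = list(round_robin({1: ["a1", "a2", "a3"], 2: ["b1"], 3: ["c1", "c2"]}))
```

With this input the requests come out as a1, b1, c1, a2, c2, a3, so a stream
with a long queue cannot starve the others.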

> 
> ** Alternatives **
> 
> Instead of introducing streams, we could return transaction_id in reply
> to a CALL or EXECUTE request that opened a transaction. The user could
> then use this transaction_id with the next CALL to continue working with
> the same transaction. However,
> 
>   - This would break compatibility. Currently, if the transaction is open
>     by the time a CALL returns, it will be aborted. If we don't do that
>     we can end up accumulating stale transactions on the server side in
>     case there's a mistake in the client code. We could probably add a
>     flag that would specify if the client intends to leave the
>     transaction open though, but that would mean we'd have to add yet
>     another new key to the iproto protocol.
> 
>   - Generating transaction_id on the server and returning it back to the
>     client would look rather ugly when it comes to implementing net_box
>     client, because we would have to add extra return parameters to
>     net.box 'call' method, which isn't very convenient. Compare with
>     streams where the user gets a stream object that behaves exactly like
>     a connection, but guarantees sequential execution and transaction
>     preservation.
> 
>   - How to deal with transactions that are kept open by the server
>     for too long? Abort them on timeout?
> 
>   - Currently, there's no way to guarantee that certain requests execute
>     sequentially. Streams provide the user with such a way so they can be
>     useful even without transactions. Returning transaction_id can't be
>     used for anything else but interactive transactions.
> 
> ** References **
> 
> [1] https://github.com/tarantool/tarantool/issues/2503
> [2] https://github.com/tarantool/tarantool/issues/2503#issuecomment-415480435
> 


