[server-dev] [RFC] Interactive transactions in IProto

Thu Nov 15 17:48:51 MSK 2018

* Георгий Кириченко <georgy at tarantool.org> [18/11/15 14:58]:

> > > > > > Definitely, it should be a global limit, a new box.cfg setting (as
> > > > > > much as I hate settings, there is no way we can do without it
> > > > > > here afaiu).
> > > > > 
> > > > > But why? We don't limit the number of incoming connections. Why should
> > > > > we bother with limiting streams?
> > > > 
> > > > The number of incoming connections is limited implicitly by
> > > > ulimit. A connection doesn't take database resources, it only
> > > > consumes memory buffers and file descriptors. An open transaction
> > > > potentially holds a lot more resources. E.g. a typical graph-based
> > > > deadlock detector has complexity O(N^2) on the number of
> > > > transactions.
> > > 
> > > Streams have nothing to do with deadlock resolvers or transactions.
> > > Even without streams and even now I can create thousands of active
> > > transactions. Streams are at a lower level than transactions. For some
> > > unknown reason you think that stream == transaction, but that is false.
> > 
> > You forget the reason we're adding streams. We can't use today's
> > connections since they are mostly stateless. We need to add a
> > state to the connection - an open transaction. And to be able to
> > multiplex multiple states over the same connection, we're adding
> > streams.
> > 
> > If you try to look where this is heading, it's full support of the
> > SQL features related to the current session.
> > 
> > The spec says very little about how changes to the current session
> > made in one stream affect another stream - and the effect can be
> > dramatic. What if, for example, I change sql_default_engine in one
> > stream, will it impact another? What about the current user?
> > 
> > In SQL, there are the following attributes of the session:
> > 
> > - current user
> > - transaction
> > - transaction isolation level
> > - client character set
> > - state of the diagnostics area
> > - temporary table data
> > 
> > Are these going to be shared between streams? In other words, are
> > you going to only make "the current transaction" a server side
> > context of the stream, and share everything else? I think then you
> > will stumble over the first ANSI requirement we eventually get
> > around to implementing. Besides, proxying won't work as intended.
> 
> The worst thing is that an SQL connection is strictly synchronous: the next
> call must not start processing until the previous one has finished. That
> breaks current tarantool network batching. And if you plan to rely on the
> transaction 'is open' state, you do not even know whether a request that
> has just started will finish with an open transaction or not. This might
> also break current behavior.
> 
> One of the biggest limitations of all known SQL servers (Oracle, MySQL,
> SQL Server) is that only one transaction per connection is allowed. That
> is the root cause behind a lot of connection pools. It also requires a
> dedicated connection behind the proxy for each client.
> 
> Also, returning a transaction id not only breaks backward compatibility but
> raises a lot of questions about how it should be done and how the server
> should react to various kinds of misuse. There are also a lot of undefined
> behaviors and semantic questions: for example, what is the state of the
> connection after two calls are batched, each call yields some number of
> times, and then starts a transaction? Or should we reset a transaction if
> a call yields? It is easy to see that the streams paradigm does not have
> these issues, because it defines very simple rules.
> 
> It looks like a transaction should be pinned to a stream and not be
> shareable between streams, even within one connection. A stream should be
> maintained only while a transaction is open or the corresponding tx queue
> is not empty. Also, if a user uses a stream, they consciously change how
> requests are processed, and tarantool can rely on that fact and preserve
> the transaction for future use. So a long-living stream survives only
> while a transaction exists. Obviously, streams allow us to provide
> backward compatibility without any client changes.
> 
> Streams allow us to do all of these things in an easy and clean manner.
> Yes, there are questions about exact server behavior, but those are more
> technical questions, like limits and error handling.

I agree, but let's simply stop pretending streams are "just about
the order", and say something like:

- IPROTO_BEGIN/COMMIT/ROLLBACK only work if IPROTO_STREAM_ID is non-zero
- if stream id is zero, then dangling transactions are rolled back
  as they are now
- all requests inside a stream are strictly sequential
- a stream owns its own diagnostics, transaction, transaction
  isolation level, and possibly authenticated user (see below).
- better yet, IPROTO_SQL_EXECUTE is only available if stream id is
  non-zero 
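
To make these rules concrete, here is a rough sketch of the check in
Python (for brevity only, this is not server code). StreamState, its
fields and handle() are names I made up; rejecting a nested BEGIN is
also just an assumption. Strict ordering of requests inside a stream
is not modelled, and how a stream gets created is left open.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Request types that the rules above restrict to non-zero stream ids.
STREAM_ONLY = frozenset(
    {"IPROTO_BEGIN", "IPROTO_COMMIT", "IPROTO_ROLLBACK", "IPROTO_SQL_EXECUTE"}
)


@dataclass
class StreamState:
    """Hypothetical per-stream context; field names are illustrative."""
    in_transaction: bool = False
    isolation_level: str = "default"
    diagnostics: List[str] = field(default_factory=list)
    user: Optional[str] = None


def handle(streams: Dict[int, StreamState], request_type: str,
           stream_id: int = 0) -> str:
    """Apply one request and return "ok" or an error string."""
    if stream_id == 0:
        if request_type in STREAM_ONLY:
            return "error: %s needs a non-zero IPROTO_STREAM_ID" % request_type
        # Stateless path: nothing survives the request, so a dangling
        # transaction would be rolled back exactly as it is today.
        return "ok"
    # Stream creation policy is out of scope here: look up or create.
    state = streams.setdefault(stream_id, StreamState())
    if request_type == "IPROTO_BEGIN":
        if state.in_transaction:  # assumption: nested BEGIN is an error
            state.diagnostics.append("transaction already active")
            return "error: transaction already active"
        state.in_transaction = True
    elif request_type in ("IPROTO_COMMIT", "IPROTO_ROLLBACK"):
        if not state.in_transaction:
            state.diagnostics.append("no active transaction")
            return "error: no active transaction"
        state.in_transaction = False
    return "ok"
```

The point of the sketch is that the transaction and the diagnostics
live in the stream, so an error in one stream is invisible to another
stream on the same connection.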

Then we need to decide how to manage streams. Since a stream may
have a lot of state, I don't like the idea of implicit open/close
of a stream. Imagine the server implicitly closes a stream
containing a session-local temporary table. The user may get
confused. Why not add a separate command to create a stream, or
extend IPROTO_AUTH with an option to create a stream?
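
A sketch of what an explicit lifecycle could look like, again in
Python and again with invented names (StreamRegistry, create, get,
close). Having the server assign the stream id is an assumption, the
client could pick it just as well.

```python
from typing import Dict


class StreamRegistry:
    """Hypothetical set of explicitly created streams for one connection."""

    def __init__(self) -> None:
        self._streams: Dict[int, dict] = {}
        self._next_id = 1

    def create(self) -> int:
        """Handle a hypothetical 'create stream' command; return the new id."""
        stream_id = self._next_id
        self._next_id += 1
        # Session-local state, e.g. temporary tables, lives in the stream.
        self._streams[stream_id] = {"temp_tables": {}}
        return stream_id

    def get(self, stream_id: int) -> dict:
        """Look up a stream; never create one implicitly."""
        try:
            return self._streams[stream_id]
        except KeyError:
            raise LookupError("no such stream: %d" % stream_id) from None

    def close(self, stream_id: int) -> None:
        """Handle a hypothetical 'close stream' command; drop all its state."""
        if self._streams.pop(stream_id, None) is None:
            raise LookupError("no such stream: %d" % stream_id)
```

With this shape a temporary table disappears only when the client
asks for it, and a request for an unknown stream id is an error
rather than a silent fresh start.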

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov