[server-dev] [RFC] Interactive transactions in IProto

Георгий Кириченко georgy at tarantool.org
Thu Nov 15 14:58:17 MSK 2018


On Thursday, November 15, 2018 11:34:07 AM MSK Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [18/11/15 01:41]:
> > > > > Definitely, it should be a global limit, a new box.cfg setting (as
> > > > > much as I hate settings there is no way we can do away without it
> > > > > here afaiu).
> > > > 
> > > > But why? We don't limit the number of incoming connections. Why should
> > > > we bother with limiting streams?
> > > 
> > > The number of incoming connections is limited implicitly by
> > > ulimit. A connection doesn't take database resources, it only
> > > consumes memory buffers and file descriptors. An open transaction
> > > potentially holds a lot more resources. E.g. a typical graph-based
> > > deadlock detector has complexity O(N^2) on the number of
> > > transactions.
> > 
> > Streams have nothing to do with deadlock resolvers or transactions.
> > Even without streams and even now I can create thousands of active
> > transactions. Streams are at a lower level than transactions. You
> > for unknown reason think that stream == transaction, but it is false.
> 
> You forget the reason we're adding streams. We can't use today's
> connections since they are mostly stateless. We need to add a
> state to the connection - an open transaction. And to be able to
> multiplex multiple states over the same connection, we're adding
> streams.
> 
> If you try to look where this is heading, it's a full support of
> SQL features related to current session.
> 
> The spec already says very little about impact on changes to the
> current session made in one stream to another stream - they can be
> dramatic. What if, for example, I change sql_default_engine in one
> stream, will it impact another? What about the current user?
> 
> In SQL, there are the following attributes of the session:
> 
> - current user
> - transaction
> - transaction isolation level
> - client character set
> - state of the diagnostics area
> - temporary table data
> 
> Are these going to be shared between streams? In other words, are
> you going to only make "the current transaction" a server side
> context of the stream, and share everything else? I think then you
> will stumble over the first subsequent requirement of ANSI we will
> eventually get to do. Besides, proxying won't work as intended.

The worst thing is that an SQL connection is strictly synchronous: processing
of the next call must not start until the previous one has finished. But this
breaks the current Tarantool network batching. And if you plan to rely on the
transaction 'is open' state, you do not even know whether a request that has
just started will finish with an open transaction or not. This might also
break current behavior.

One of the biggest limitations of all known SQL servers (Oracle, MySQL, SQL
Server) is that only one transaction per connection is allowed. This is the
root cause for the existence of many connection pools. It also requires a
dedicated connection behind the proxy for each client.

Also, returning a transaction id not only breaks backward compatibility but
raises a lot of questions about how it should be done and how the server
should react to various kinds of misuse. There are also lots of undefined
behaviors and semantic questions. For example, what is the state of the
connection after two calls are batched and each call yields some number of
times and then starts a transaction? Or should we reset a transaction if a
call produces a yield? It is easy to see that the streams paradigm does not
have these issues because it defines very simple rules.

It looks like a transaction should be pinned to a stream and not be shareable
between streams, even within one connection. A stream should be maintained
only while a transaction is open or the corresponding tx queue is not empty.
Also, if a user uses a stream, they consciously change the request processing
principles, and Tarantool can rely on that fact and preserve the transaction
for future use. So long-living stream state survives only while a transaction
exists. Obviously, streams allow us to provide backward compatibility without
any client changes.
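
To illustrate the rules above, here is a minimal sketch in Python. It is only
a toy model, not the proposed implementation: the names Stream, Connection,
submit and run, and the string request names "begin", "commit" and "rollback",
are all hypothetical. It assumes one transaction per stream at most, in-order
processing within a stream, and that the server drops a stream's state as
soon as the stream has no open transaction and an empty queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Stream:
    """Server-side state of one stream (hypothetical model)."""
    in_txn: bool = False                         # at most one transaction per stream
    queue: deque = field(default_factory=deque)  # requests waiting in this stream

    def idle(self) -> bool:
        # A stream is worth keeping only while it holds a transaction
        # or still has queued requests.
        return not self.in_txn and not self.queue


class Connection:
    """One network connection multiplexing several independent streams."""

    def __init__(self):
        # stream_id -> Stream; contains only streams that have live state
        self.streams = {}

    def submit(self, stream_id, op):
        """Queue a request on a stream, creating the stream state on demand."""
        self.streams.setdefault(stream_id, Stream()).queue.append(op)

    def run(self, stream_id):
        """Process the queued requests of one stream strictly in order."""
        stream = self.streams.get(stream_id)
        if stream is None:
            return
        while stream.queue:
            op = stream.queue.popleft()
            if op == "begin":
                stream.in_txn = True
            elif op in ("commit", "rollback"):
                stream.in_txn = False
            # Any other request would run inside the stream's transaction,
            # if one is open; it does not touch other streams.
        if stream.idle():
            # No transaction and nothing queued: drop the stream state.
            del self.streams[stream_id]


conn = Connection()
conn.submit(1, "begin")
conn.submit(1, "insert")
conn.submit(2, "begin")
conn.run(1)
conn.run(2)      # streams 1 and 2 now hold independent open transactions
conn.submit(1, "commit")
conn.run(1)      # stream 1 is idle again, so its state is dropped
```

In this model two transactions coexist on one connection, and a stream costs
nothing on the server once its transaction is finished.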

Streams allow us to do all of these things in an easy and clean manner. Yes,
there are questions about exact server behavior, but those are more technical
questions, such as limits and error handling.




