Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Alexander Turenko <alexander.turenko@tarantool.org>,
	Roman Khabibov <roman.habibov@tarantool.org>
Cc: tarantool-patches@freelists.org,
	Konstantin Osipov <kostja.osipov@gmail.com>,
	Kirill Yukhin <kyukhin@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH] net.box: ignore absence of _vcollation
Date: Thu, 25 Jul 2019 00:43:22 +0200	[thread overview]
Message-ID: <b2117535-46ee-2a6b-1a4a-eb3fd198ae39@tarantool.org> (raw)
In-Reply-To: <20190724212743.jwrti2iuclleogjq@tkn_work_nb>

Hi! Yeah, I remember that I should explain my vision of how that
task should be solved. See below a long-read.

Here are my explanations on how netbox works, and on how should this
issue be fixed (perhaps).

Netbox for each connection creates a worker fiber and a metadata object
keeping connect timeout, reconnect timeout, status, etc. The metadata
object is known as a netbox connection, and is the object with which a
user works. It is a result of 'netbox.connect()'. Worker fiber does real
work: establishes a tcp connection, sends/receives data.

But there appears a problem of cyclic references. The thing is that the
worker fiber needs some metadata from the netbox object. At least,
'reconnect_timeout'. On the other hand, the netbox object needs some
results from the worker fiber - responses on the requests, connection
status. In other words, both the worker fiber and the netbox object need
something from each other.

If the worker fiber and the netbox object would keep references to each
other, they would never be deleted - Lua GC is not that smart. So there is
a channel of communication between them, which references them both weakly.
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L834-L852

This function is called when a netbox object is created. It takes some
mandatory meta like host, port, and a so called 'callback'. It is a closure
defined here:
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L910-L954

The closure has access to the netbox object
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L909
and returns certain parameters of it. The worker fiber uses a weak reference
to this closure as a proxy to read metadata of the netbox object:
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L837-L840.

For example, when the worker fiber needs to obtain reconnect_timeout, it
calls the callback
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L443.
The callback here is a weak reference to the closure
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L948-L953.

And vice versa - when the worker fiber wants to return something back to
the netbox object, it calls the same callback, but with a different argument:
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L399.

How does that callback help to GC the connection? Lets assume, that there are
no references to a netbox object in user code. It is not referenced in the
worker fiber either, because the only reference is callback, and it is weak. So
it is deleted. Together with
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L844-L846.
It is a standard way of setting your own callbacks on GC of a table - you create a
dummy cdata object with a custom GC callback. On deletion of that gc-hook a
function
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L473-L481
is called. It cancels the worker fiber. Now both the netbox object and the worker
fiber are deleted.

Talking of this issue: your situation is simpler, as I see now, when I dived a
bit into the problem. You have nothing to do about the netbox object. Your target
is the worker fiber only. And in such a case I don't see any magic solution that
would shrink this patch https://github.com/tarantool/tarantool/commit/abc88ef8d38e21a6c7faf925a19105ce65d45467
to a couple of lines. The only obvious thing you can do here is to avoid passing
version id to iproto_auth_sm and iproto_schema_sm functions, and Alexander proposed.
Just make it a global value here
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L232
together with the others, if you still will need.

Your case is different from reconnect_timeout and other callback-stuff, because
these parameters are set by a user out of the worker fiber, and can be changed by
him. Callback allows the worker fiber to read actual values always. Version id
originates from the worker fiber, and so can be maintained and kept here.

Talking of how to remove branching - it has nothing to do with netbox specific code,
or GC, or anything application dependent. It is a pure technical matter. You have a
classic problem already solved in box/lua/upgrade.lua. The problem is that you have a
pipeline of versions, each adding, or removing, or changing something. And it is logical
to have a pipeline of functions building that schema step by step. IMO, you should have
several functions to fetch and to install schema.

Foe example:

function fetch_schema_2_2_1()
...
end

function fetch_schema_2_2_2()
   fetch_schema_2_2_1()
   -- + some v_collation things.
end

function fetch_schema_2_2_3()
   fetch_schema_2_2_3()
   --- + ...
end

And you have a global table where key is version id, value is a schema fetch
function. Here
https://github.com/tarantool/tarantool/commit/abc88ef8d38e21a6c7faf925a19105ce65d45467#diff-a9dd9be71f8af749f288abb6304a5185R733
you take a fetcher from the table like 'fetcher = fetchers[version_id]', and just
call it.

All these encode selects
https://github.com/tarantool/tarantool/commit/abc88ef8d38e21a6c7faf925a19105ce65d45467#diff-a9dd9be71f8af749f288abb6304a5185R754
should move to these schema fetchers. Of course, it wouldn't be that simple. At least,
because schema fetch requests are sent in parallel, as you can see here:
https://github.com/tarantool/tarantool/blob/0529e2f0ef7dd4c19283b9e8e6b0e7e43bf400a3/src/box/lua/net_box.lua#L743-L748
And still the idea is the same. Perhaps you will have not only fetcher for each
version, but two functions: start_fetch/end_fetch, or begin_fetch/end_fetch. A
first function sends a request (encode_select), a second one waits for a response.
Or let these fetchers return an asynchronous id, by which you will wait for
responses in one place. Don't know what to chose. If I were you, and I had time, I
would try to implement these step-wise fetchers. They would be quite easy to
support and to extend for other future schema changes.

Perhaps, it will remove the branching. Up to you what to do, to Alexander, and to
other managers. My only wish is to look at the final patch when it will be ready.

  reply	other threads:[~2019-07-24 22:41 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-08 12:13 [tarantool-patches] " Roman Khabibov
2019-07-12  9:55 ` [tarantool-patches] " Kirill Yukhin
2019-07-19 12:16   ` Roman Khabibov
2019-07-20 19:07   ` Alexander Turenko
2019-07-24 21:27     ` Alexander Turenko
2019-07-24 22:43       ` Vladislav Shpilevoy [this message]
2019-07-30 10:58         ` Alexander Turenko
2019-07-30 12:08           ` Konstantin Osipov
2019-07-30 20:18           ` Vladislav Shpilevoy
2019-07-30 22:35             ` Alexander Turenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b2117535-46ee-2a6b-1a4a-eb3fd198ae39@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=alexander.turenko@tarantool.org \
    --cc=kostja.osipov@gmail.com \
    --cc=kyukhin@tarantool.org \
    --cc=roman.habibov@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='[tarantool-patches] Re: [PATCH] net.box: ignore absence of _vcollation' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox