[Tarantool-patches] [PATCH 2/2] fiber: destroy fiber.storage created by iproto

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Dec 11 01:59:45 MSK 2019


Hi! Thanks for the review!

On 10/12/2019 09:32, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/12/08 23:47]:
>>  static int
>> -session_on_stop(struct trigger *trigger, void * /* event */)
>> +session_on_fiber_cleanup(struct trigger *trigger, void * /* event */)
> 
> Please come up with a better name, and provide a comment.
> 

Well, I don't know a better name. IMO this name perfectly
describes what is happening. 'Cleanup' means that the
fiber is getting cleaned up. Quite straightforward
explanation. If you have other ideas - please, bring them
up, and lets discuss.

I was thinking about 'on_reload', 'on_reuse', 'on_retire',
'on_reset', 'on_request_end', 'on_destroy'. Or keep 'on_stop'.
Reload - bad, because it is also called when the fiber goes
  to the dead list. It is not reload.
Reuse - bad by the same reason.
Retire - bad, because basically = death, but the fiber may be
  getting reused.
Reset - bad, too common.
Request_end - bad, because some fibers don't serve requests.
  For example, single fibers of background threads.
Stop/destroy - bad, because would be called after each
  request, and that != fiber stop or destruction.

Waiting for your feedback. I really don't know what an
other name to use.

Comment update:

===================================================================
+/**
+ * Destroy session of a background fiber when the
+ * fiber is getting destroyed.
+ */
 static int
 session_on_fiber_cleanup(struct trigger *trigger, void * /* event */)
 {
===================================================================

Even though I think that it is useless here. The function purpose
is obvious already.

>> -	trigger_add(&fiber()->on_stop, &s->fiber_on_stop);
>> +	trigger_create(&s->fiber_on_cleanup, session_on_fiber_cleanup,
>> +		       NULL, NULL);
>> +	/* Add a trigger to destroy session on fiber cleanup. */
>> +	trigger_add(&fiber()->on_cleanup, &s->fiber_on_cleanup);
> 
> I don't understand why these two concerns are intermixed here.
> Why is this code part of session now? 

Looks like you didn't read the patch carefully enough. This
piece of code is kept as is. I just used a proper
constructor for trigger object instead of assigning its
members manually. And added a comment. That is all. I would
not change this hunk at all, if no the 'on_stop' -> 'on_cleanup'
rename.

Talking of why is that code part of the session, *and always
was* - I guess so as to destroy only local sessions. We
can't just destroy any session in struct fiber.storage.session
when the fiber is being destroyed or reused, right in fiber.c.
Because remote sessions live longer than their fibers. This is
why local sessions schedule their own destruction via
on_stop/on_cleanup trigger in the fiber. And remote sessions
don't.

Although that problem could be solved by reference counting.
Anyway, that is not a subject to discuss in scope of that
issue.

> 
> I think if you change the semantics of the trigger, you should 
> move setting/resetting it to the place which controls the relevant
> object and its life cycle. Specifically, if you change
> fiber.storage semantics to be per-request *or* per Lua fiber,
> you should move the trigger to lua fiber create/destroy +
> iproto request processing start/end.

Well, this is not about Lua only. Background sessions can be
created from C API too. I could patch both Lua lbox_fiber_create
and API_EXPORT fiber_new(), but again certainly not here. I
would like to keep these not related changes away from the
problem of storage leaking and garbaging. I would be happy to
fix them separately.

Btw, Lua fiber storage in that patch is created and destroyed
in Lua part. So it conforms to your vision of how the trigger
should be used.

> Instead you move it
> to the core library, which is oblivious of these events. 

Nope. I don't move anything anywhere. Everything is kept
where it was.

> The way the core is implemented may change, and whoever is
> changing it will have to be aware of the legacy behaviour you make
> it depend on.
> 
>> index eff3d7a67..710e4919b 100644
>> --- a/src/box/session.h
>> +++ b/src/box/session.h
>> @@ -103,8 +103,11 @@ struct session {
>>  	union session_meta meta;
>>  	/** Session user id and global grants */
>>  	struct credentials credentials;
>> -	/** Trigger for fiber on_stop to cleanup created on-demand session */
>> -	struct trigger fiber_on_stop;
>> +	/**
>> +	 * Trigger for fiber on_cleanup to cleanup created
>> +	 * on-demand session.
>> +	 */
>> +	struct trigger fiber_on_cleanup;
> ...
> 
>> -	trigger_clear(&txn->fiber_on_stop);
>> +	trigger_clear(&txn->fiber_on_cleanup);
> 
> Please see the above comment. Fiber cleanup trigger, as defined
> now, does not have anything to do with the session, or fiber, or
> transaction.

But it does. As I said, my patch does not change anything here.
Txn and local session were destroyed in fiber cleanup, and they
still are.

> It belongs to the request processing layer and should be
> set/cleared from it.

Fiber loop and pool are the request processing layers. They process
all requests, local and remote. So cleanup was and still is
invoked there. But see my last comment below.

>> -	struct rlist on_stop;
>> +	/**
>> +	 * Triggers invoked before this fiber is
>> +	 * stopped/reset/recycled/destroyed/reused. In other
>> +	 * words, each time when the fiber has finished execution
>> +	 * of a request.
>> +	 */
>> +	struct rlist on_cleanup;
> 
> Please make the comment describing the timing when this
> trigger is called less broad and more accurate, calling out
> the specific places when the trigger is called, and call
> the trigger accordingly.
> on_request_or_fiber_end?

IMO, that names looks terrible. See my first comment.

Talking of the code comment. I am not sure what is wrong
with accuracy. I said exactly when cleanup is being done:
stop/reset/recycle/destroy/reuse/request end.

But ok:

===================================================================
@@ -463,6 +463,10 @@ struct fiber {
 	 * stopped/reset/recycled/destroyed/reused. In other
 	 * words, each time when the fiber has finished execution
 	 * of a request.
+	 * In particular, for fibers not from fiber pool the
+	 * cleanup is done before destroy and death.
+	 * Pooled fibers are cleaned up after each request, even
+	 * if they are never destroyed.
 	 */
 	struct rlist on_cleanup;
 	/**
===================================================================

I am going to stop here and not to give any source code
references in the comment, because such refs tend to
outdate.

>> +/** Invoke cleanup triggers and delete them. */
>> +void
>> +fiber_cleanup(struct fiber *f);
> 
> The call for a better name for the event and the accompanying 
> action applies to all of the new api
> methods this patch introduces.
> 

This is mutual. Propose a better name or choose one of
mine, please. I gave them in the first comment. I can't read
your mind, so if you want a particular name - say it, and lets
discuss.

>>  		cmsg_deliver(msg);
>> +		/*
>> +		 * Different messages = different requests. They
>> +		 * should not affect each other. This is why
>> +		 * cleanup is done here.
>> +		 */
>> +		fiber_cleanup(f);
> 
> Ugh. I think this is a layering violation, and the cleanup
> call should be somewhere inside cmsg_deliver for the relevant
> messages. The rlist with cleanup triggers is better stored in
> a fiber key.
> 

Sorry. I really don't do that on purpose. This is not because
I just like to disagree with everything. This is because I don't
understand you sometimes. - Here I again partially disagree :)

cmsg_deliver() is used in two places - the pooled fiber loop, and
cbus loop. Cbus loop is used in background threads such as vinyl
readers, relay, etc. They have no a concept of isolated request or
a context. The whole cbus_loop is one long request working in one
long-living fiber. So the only place needed that patch is the
fiber pool loop, which serves many requests. An attempt to call
cleanup in cbus loop would destroy that fiber's session after each
request, what looks like needless oscillation. That session is empty
anyway.

Also cbus has nothing to do with fiber control. It uses some
fiber things such as conds, but that is all.


On the other hand I agree, that fiber pooled loop probably should
not know anything about requests. Another option is to call
fiber_cleanup() right in iproto.cc after each request. We would need
to patch tx_process_misc(), tx_process_call(), tx_process1(),
tx_process_select(), tx_process_sql(), tx_process_join_subscribe().
Not so many places. What do you think?


More information about the Tarantool-patches mailing list