[Tarantool-patches] [PATCH v10 2/4] limbo: order access to the limbo terms

Cyrill Gorcunov gorcunov at gmail.com
Fri Aug 6 18:20:10 MSK 2021


On Fri, Aug 06, 2021 at 01:29:57AM +0200, Vladislav Shpilevoy wrote:
> >  apply_synchro_row(uint32_t replica_id, struct xrow_header *row)
> >  {
> >  	assert(iproto_type_is_synchro_request(row->type));
> > +	int rc = 0;
> >  
> >  	struct synchro_request req;
> >  	if (xrow_decode_synchro(row, &req) != 0)
> >  		goto err;
> >  
> > +	txn_limbo_term_lock(&txn_limbo);
> 
> Maybe you should hide the lock from the API. Instead, do similar to
> what transactions do:
> 
> 	int txn_limbo_process_begin(limbo *);
> 	void txn_limbo_process_commit(limbo *, request *);
> 	void txn_limbo_process_rollback(limbo *);
> 
> begin would take the lock, commit would do the request and
> unlock, rollback would only unlock. Commit and rollback you
> call from apply_synchro_row_cb depend in on the WAL write
> result.
> 
> Then the locks would disappear from the API, right?

Unfortunatelly locking is needed not only for processing but
for reading terms as well. We have a few helpers more which
are waiting the other fibers to complete before reading terms.

applier_apply_tx
  applier_synchro_filter_tx
    txn_limbo_is_replica_outdated
      txn_limbo_term_lock
        txn_limbo_replica_term_locked
      txn_limbo_term_unlock

And a number of calls for txn_limbo_replica_term which reads
term in a locked way because we need to eliminate potential
race here and fetch only last written data.

So no, locking won't disappear. Another option may be to
introduce preemption disabling (just like kernel does for
tasks which should not be rescheduled on a core while
they are wating for some action to complete). Then our
write for synchro packets would look like

	preempt_disable();
	rc = journal_write();
	preempt_enable();

which would guarantee us that while we're waiting the journal
to finish its write no other fibers from the cord will be
executed and we gotta be woken up once write is complete.

This way I think we will be allowed to drop locking at all
because main problem is exactly because of other fibers get
running while we're writing synchro data.

> In the next patch you would make txn_limbo_process_begin()
> also take the request to validate it. Then the 'filtering'
> would become internal to the limbo.
> 
> >  	struct replica_cb_data rcb_data;
> >  	struct synchro_entry entry;

	Cyrill


More information about the Tarantool-patches mailing list