From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id D7AD8445320 for ; Fri, 10 Jul 2020 03:23:49 +0300 (MSK) References: From: Vladislav Shpilevoy Message-ID: Date: Fri, 10 Jul 2020 02:23:47 +0200 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [Tarantool-patches] [PATCH 1/1] txn_limbo: introduce dynamic synchro config List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Leonid Vasiliev , tarantool-patches@dev.tarantool.org >>   diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c >> index e28e2016f..2575f4c25 100644 >> --- a/src/box/txn_limbo.c >> +++ b/src/box/txn_limbo.c >> @@ -159,45 +160,56 @@ txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry) >>         assert(!txn_has_flag(txn, TXN_IS_DONE)); >>       assert(txn_has_flag(txn, TXN_WAIT_SYNC)); >> -    bool cancellable = fiber_set_cancellable(false); >> -    bool timed_out = fiber_yield_timeout(txn_limbo_confirm_timeout(limbo)); >> -    fiber_set_cancellable(cancellable); >> -    if (timed_out) { >> -        assert(!txn_limbo_is_empty(limbo)); >> -        if (txn_limbo_first_entry(limbo) != entry) { >> -            /* >> -             * If this is not a first entry in the >> -             * limbo, it is definitely not a first >> -             * timed out entry. And since it managed >> -             * to time out too, it means there is >> -             * currently another fiber writing >> -             * rollback. Wait when it will finish and >> -             * wake us up. >> -             */ >> -            bool cancellable = fiber_set_cancellable(false); >> -            fiber_yield(); >> -            fiber_set_cancellable(cancellable); >> -            assert(txn_limbo_entry_is_complete(entry)); >> +    double start_time = fiber_clock(); >> +    while (true) { >> +        double deadline = start_time + txn_limbo_confirm_timeout(limbo); >> +        bool cancellable = fiber_set_cancellable(false); >> +        double timeout = deadline - fiber_clock(); > > Why not just timeout = txn_limbo_confirm_timeout(limbo) ? > It's look like > fiber_clock()(old) + txn_limbo_confirm_timeout(limbo) - fiber_clock()(new) ~= txn_limbo_confirm_timeout(limbo) Because it will wait infinitely in case I will constantly update the parameters. Deadline will move forward and forward. It shouldn't. >> +        bool timed_out = fiber_cond_wait_timeout(&limbo->wait_cond, >> +                             timeout); >> +        fiber_set_cancellable(cancellable); >> +        if (txn_limbo_entry_is_complete(entry)) >>               goto complete; >> -        } >> +        if (timed_out) >> +            goto do_rollback; >> +    } >> @@ -472,17 +484,26 @@ txn_limbo_wait_confirm(struct txn_limbo *limbo) >>       struct txn_limbo_entry *tle = txn_limbo_last_entry(limbo); >>       txn_on_commit(tle->txn, &on_complete); >>       txn_on_rollback(tle->txn, &on_rollback); >> - >> -    int rc = fiber_cond_wait_timeout(&cwp.confirm_cond, >> -                     txn_limbo_confirm_timeout(limbo)); >> -    fiber_cond_destroy(&cwp.confirm_cond); >> -    if (rc != 0) { >> -        /* Clear the triggers if the timeout has been reached. */ >> -        trigger_clear(&on_complete); >> -        trigger_clear(&on_rollback); >> -        diag_set(ClientError, ER_SYNC_QUORUM_TIMEOUT); >> -        return -1; >> +    double start_time = fiber_clock(); > +    while (true) { >> +        double deadline = start_time + txn_limbo_confirm_timeout(limbo); >> +        bool cancellable = fiber_set_cancellable(false); >> +        double timeout = deadline - fiber_clock(); > > Maybe add a comment about the possible reasons of wake up > (reconfiguration, triggers, timeout). Wait_cond has a comment about that already. >> +        int rc = fiber_cond_wait_timeout(&limbo->wait_cond, timeout); > > If I understand correctly, you use a trick for wake up from triggers. > In this case fiber_wakeup invoke manually without using the fiber_cond > API. In this context, I have few questions/comments: > - IMHO, this is not true way to use implemantation details of fiber_cond > passing over the fiber_cond API. > - In the case, we don't interact with waiters list inside condition > variable. Are you sure this is ok? I am sure that it is ok to use fiber_wakeup(). Moreover, you can't use fiber_cond API here, because you will wakeup other fibers, not related to the waiter. So the direct wakeup is the only way.