Tarantool development patches archive
From: Oleg Babin <olegrok@tarantool.org>
To: sergos@tarantool.org, tarantool-patches@dev.tarantool.org
Cc: v.shpilevoy@tarantool.org, alexander.turenko@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2] core: handle fiber cancellation for fiber.cond
Date: Sun, 1 Nov 2020 13:13:09 +0300
Message-ID: <ca7d33f3-e33a-231f-6a32-6b1ba0588213@tarantool.org>
In-Reply-To: <20201031162911.61876-1-sergos@tarantool.org>

Hi! Thanks for the changes. See two comments below.

On 31/10/2020 19:29, sergos@tarantool.org wrote:
> From: Sergey Ostanevich <sergos@tarantool.org>
>
> Hi!
>
> Thanks to Oleg Babin's comment I found there's no need to update any lua
> interfaces, since the reason was in C implementation. Also, there is one
> place the change is played, so after I fixed it I got complete testing
> pass.
> Force-pushed branch, v2 patch attached.
>
>
>
> Before this patch fiber.cond():wait() just returned for a cancelled
> fiber. In contrast, fiber.channel():get() threw a "fiber is
> canceled" error.
> This patch unifies the behaviour of channels and condvars and also fixes
> a related net.box module problem: it was impossible to interrupt
> a net.box call with fiber.cancel because it used fiber.cond under
> the hood. Test cases for both bugs are added.
>
> Closes #4834
> Closes #5013
>
> Co-authored-by: Oleg Babin <olegrok@tarantool.org>
>
> @TarantoolBot document
> Title: fiber.cond():wait() throws if fiber is cancelled
>
> Currently fiber.cond():wait() throws an error if waiting fiber is
> cancelled like in case with fiber.channel():get().
> ---
>
> Github: https://gitlab.com/tarantool/tarantool/-/commits/sergos/gh-5013-fiber-cond
> Issue: https://github.com/tarantool/tarantool/issues/5013
>
>   src/box/box.cc                                |  6 +-
>   src/lib/core/fiber_cond.c                     |  1 +
>   test/app-tap/gh-5013-fiber-cancel.test.lua    | 23 +++++++
>   test/box/net.box_fiber_cancel_gh-4834.result  | 65 +++++++++++++++++++
>   .../box/net.box_fiber_cancel_gh-4834.test.lua | 29 +++++++++
>   5 files changed, 120 insertions(+), 4 deletions(-)
>   create mode 100755 test/app-tap/gh-5013-fiber-cancel.test.lua
>   create mode 100644 test/box/net.box_fiber_cancel_gh-4834.result
>   create mode 100644 test/box/net.box_fiber_cancel_gh-4834.test.lua
>
> diff --git a/src/box/box.cc b/src/box/box.cc
> index 18568df3b..bfa1051f9 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -305,10 +305,8 @@ box_wait_ro(bool ro, double timeout)
>   {
>   	double deadline = ev_monotonic_now(loop()) + timeout;
>   	while (is_box_configured == false || box_is_ro() != ro) {
> -		if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0)
> -			return -1;
> -		if (fiber_is_cancelled()) {
> -			diag_set(FiberIsCancelled);
> +		if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0) {
> +                        if (fiber_is_cancelled()) diag_set(FiberIsCancelled);

Here you use spaces instead of tabs.

>   			return -1;
>   		}
>   	}
> diff --git a/src/lib/core/fiber_cond.c b/src/lib/core/fiber_cond.c
> index 904a350d9..b0645069e 100644
> --- a/src/lib/core/fiber_cond.c
> +++ b/src/lib/core/fiber_cond.c
> @@ -108,6 +108,7 @@ fiber_cond_wait_timeout(struct fiber_cond *c, double timeout)
>   		diag_set(TimedOut);
>   		return -1;
>   	}
> +	if (fiber_is_cancelled()) return -1;

It's quite strange to return -1 here without setting a reason in the
diagnostics area. See how it is done for channels

(https://github.com/tarantool/tarantool/blob/42c64d06d5d1a3ec937b3c596af083a672a68ad8/src/lib/core/fiber_channel.c#L180).

Without it the behaviour is inconsistent.
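
To illustrate the point, here is a minimal standalone sketch of the pattern I mean: every path that returns -1 also leaves a reason behind. It does not use the real Tarantool API. `mock_fiber`, `mock_diag_set` and `mock_cond_wait_finish` are hypothetical stand-ins for the fiber state, diag_set() and the tail of fiber_cond_wait_timeout().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the diagnostics area and fiber state. */
enum diag_reason { DIAG_NONE, DIAG_TIMED_OUT, DIAG_FIBER_IS_CANCELLED };

struct mock_fiber {
	bool is_cancelled;
	bool timed_out;
	enum diag_reason diag;
};

static void
mock_diag_set(struct mock_fiber *f, enum diag_reason reason)
{
	/* An error path must always name a concrete reason. */
	assert(f != NULL && reason != DIAG_NONE);
	f->diag = reason;
}

/*
 * Models what happens after the waiting fiber wakes up: each -1
 * return is paired with a reason, so the caller never sees a
 * failure with an empty diagnostics area.
 */
static int
mock_cond_wait_finish(struct mock_fiber *f)
{
	if (f->timed_out) {
		mock_diag_set(f, DIAG_TIMED_OUT);
		return -1;
	}
	if (f->is_cancelled) {
		mock_diag_set(f, DIAG_FIBER_IS_CANCELLED);
		return -1;
	}
	return 0;
}
```

With this shape the caller can tell a timeout from a cancellation by inspecting the reason, which is, as far as I can see, what the channel code allows today.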

I've looked a bit deeper at the failure I reported before. It seems the
problem is in the "cbus_unpair" function.

The problem appears only if FiberIsCancelled is set in the diag area in
the "fiber_cond_wait" function.

This is where my expertise ends, as I'm not familiar with "cbus".
However, I have some ideas on how it could be eliminated.

Let's mark the cbus_unpair fiber as non-cancellable and stop reporting
the is_cancelled flag for non-cancellable fibers. See a PoC below:


diff --git a/src/lib/core/cbus.c b/src/lib/core/cbus.c
index 5d91fb948..4167c756a 100644
--- a/src/lib/core/cbus.c
+++ b/src/lib/core/cbus.c
@@ -630,6 +630,7 @@ cbus_unpair(struct cpipe *dest_pipe, struct cpipe *src_pipe,
      msg.unpair_arg = unpair_arg;
      msg.src_pipe = src_pipe;
      msg.complete = false;
+    fiber_set_cancellable(false);
      fiber_cond_create(&msg.cond);

      cpipe_push(dest_pipe, &msg.cmsg);
@@ -643,6 +644,7 @@ cbus_unpair(struct cpipe *dest_pipe, struct cpipe *src_pipe,
          fiber_cond_wait(&msg.cond);
      }

+    fiber_set_cancellable(true);
      cpipe_destroy(dest_pipe);
  }

diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index 483ae3ce1..8100c9da6 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -553,6 +553,9 @@ fiber_set_cancellable(bool yesno)
  bool
  fiber_is_cancelled(void)
  {
+    if ((fiber()->flags & FIBER_IS_CANCELLABLE) == 0) {
+        return false;
+    }
      return fiber()->flags & FIBER_IS_CANCELLED;
  }
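
The flag logic in the PoC can be modelled in isolation. The sketch below is only an illustration under my own assumptions: `flag_fiber`, the `MOCK_*` flag values and both helpers are hypothetical and are not the real definitions from fiber.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the real ones are defined in fiber.h. */
enum {
	MOCK_FIBER_IS_CANCELLABLE = 1 << 0,
	MOCK_FIBER_IS_CANCELLED = 1 << 1,
};

struct flag_fiber {
	uint32_t flags;
};

/* Mirrors the intent of fiber_set_cancellable(yesno). */
static void
mock_set_cancellable(struct flag_fiber *f, bool yesno)
{
	assert(f != NULL);
	if (yesno)
		f->flags |= MOCK_FIBER_IS_CANCELLABLE;
	else
		f->flags &= ~(uint32_t)MOCK_FIBER_IS_CANCELLABLE;
}

/*
 * Mirrors the patched fiber_is_cancelled(): a pending cancel is
 * hidden while the fiber is marked as non-cancellable.
 */
static bool
mock_is_cancelled(const struct flag_fiber *f)
{
	if ((f->flags & MOCK_FIBER_IS_CANCELLABLE) == 0)
		return false;
	return (f->flags & MOCK_FIBER_IS_CANCELLED) != 0;
}
```

Note that in this model the cancelled bit is never cleared, so a cancel that arrives during the non-cancellable section becomes visible again once the fiber is marked cancellable. Whether the real fiber code behaves the same way would need to be checked.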


To be honest, I haven't checked this change carefully, and I also get a
segfault in replication/gc.test.lua for the "memtx" engine.

Finally, feel free to ignore this comment. I hope Vlad or Sasha can give
you more accurate advice.

