Tarantool development patches archive
 help / color / mirror / Atom feed
From: Sergey Ostanevich <sergos@tarantool.org>
To: Leonid Vasiliev <lvasiliev@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org, v.shpilevoy@tarantool.org,
	alexander.turenko@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2] core: handle fiber cancellation for fiber.cond
Date: Wed, 11 Nov 2020 00:16:16 +0300	[thread overview]
Message-ID: <D78775C3-B91B-4256-9F51-24F0332AD152@tarantool.org> (raw)
In-Reply-To: <7e2cf16f-f443-5a4b-d658-8d4e3ecd6f74@tarantool.org>

Hi Leonid, thank you for the review!

I put two parts together to handle them in one patch.

> On 4 Nov 2020, at 13:00, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
> 
> Hi! Thank you for the patch.
> See some comments below:
> 
> 1) The patch changes undocumented behavior, AFAIU.
> So, I have a question:"Do you plan to backport the patch to
> tarantool 1.10?". If the answer is "Yes" - I'm comfortable with the
> changes. But if the answer is "No" - I will object, because in this case
> both behaviors must be supported in all modules.
> 

Yes, the plan is to push it down to 1.10

> 2) I think changing the behavior in C doesn't cause much of a problem,
> because before when you wait without timeout, you don't need to check
> the return value (it's always 0). But in Lua it will cause the problems,
> because now throws an error if cancelled and all wait calls should be
> wrapped to pcall.

Exactly it is the change to Lua and we intentionally make it, since original
behaviour is wrong. Good things are: fiber.cond has no big presence in the
/tarantool sourcebase and the cancellation of fiber itself means the fiber
has to quit.
 
> 
> On 31.10.2020 19:29, sergos@tarantool.org wrote:
>> From: Sergey Ostanevich <sergos@tarantool.org>
>> Hi!
>> Thanks to Oleg Babin's comment I found there's no need to update any lua
>> interfaces, since the reason was in C implementation. Also, there is one
>> place the change is played, so after I fixed it I got complete testing
>> pass.
>> Force-pushed branch, v2 patch attached.
>> Before this patch fiber.cond():wait() just returns for cancelled
>> fiber. In contrast fiber.channel():get() threw "fiber is
>> canceled" error.
>> This patch unify behaviour of channels and condvars and also fixes
> 
> 3) behaviour -> behavior.

Depends on spellchecker - mine sets it to this British version.

> 
>> related net.box module problem - it was impossible to interrupt
>> net.box call with fiber.cancel because it used fiber.cond under
>> the hood. Test cases for both bugs are added.
>> Closes #4834
>> Closes #5013
>> Co-authored-by: Oleg Babin <olegrok@tarantool.org>
>> @TarantoolBot document
>> Title: fiber.cond():wait() throws if fiber is cancelled
>> Currently fiber.cond():wait() throws an error if waiting fiber is
>> cancelled like in case with fiber.channel():get().
> 
> 4) I don't think it's a good decision adding a comparison with
> fiber.channel():get() to the documentation. Up to you.
> 

I’d agree to you: let’s just put the new behaviour here.

> 5) Document the changes in module.h.

Done (need to rebuild, obviously).

> 
>> ---
> 
> 6) Add @ChangeLog.
> 
>> Github: https://gitlab.com/tarantool/tarantool/-/commits/sergos/gh-5013-fiber-cond
>> Issue: https://github.com/tarantool/tarantool/issues/5013
>>  src/box/box.cc                                |  6 +-
>>  src/lib/core/fiber_cond.c                     |  1 +
>>  test/app-tap/gh-5013-fiber-cancel.test.lua    | 23 +++++++
>>  test/box/net.box_fiber_cancel_gh-4834.result  | 65 +++++++++++++++++++
>>  .../box/net.box_fiber_cancel_gh-4834.test.lua | 29 +++++++++
>>  5 files changed, 120 insertions(+), 4 deletions(-)
>>  create mode 100755 test/app-tap/gh-5013-fiber-cancel.test.lua
>>  create mode 100644 test/box/net.box_fiber_cancel_gh-4834.result
>>  create mode 100644 test/box/net.box_fiber_cancel_gh-4834.test.lua
> 
>> diff --git a/src/box/box.cc b/src/box/box.cc
>> index 18568df3b..9e824453d 100644
>> --- a/src/box/box.cc
>> +++ b/src/box/box.cc
>> @@ -305,10 +305,8 @@ box_wait_ro(bool ro, double timeout)
>>  {
>>  	double deadline = ev_monotonic_now(loop()) + timeout;
>>  	while (is_box_configured == false || box_is_ro() != ro) {
>> -		if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0)
>> -			return -1;
>> -		if (fiber_is_cancelled()) {
>> -			diag_set(FiberIsCancelled);
>> +		if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0) {
>> +			if (fiber_is_cancelled()) diag_set(FiberIsCancelled);
> 
> 7) According to https://www.tarantool.io/en/doc/latest/dev_guide/c_style_guide/
> it's seems like you are trying to hide something)
> Use:
> ```
> if (condition)
>    action();
> ```
> Instead of:
> ```
> if (condition) action();
> another_action();
> ```

Fixed.

>>  			return -1;
>>  		}
>>  	}
>> diff --git a/src/lib/core/fiber_cond.c b/src/lib/core/fiber_cond.c
>> index 904a350d9..0c93c5842 100644
>> --- a/src/lib/core/fiber_cond.c
>> +++ b/src/lib/core/fiber_cond.c
>> @@ -108,6 +108,10 @@ fiber_cond_wait_timeout(struct fiber_cond *c, double timeout)
>>  		diag_set(TimedOut);
>>  		return -1;
>>  	}
>> +	if (fiber_is_cancelled()) {
>> +		if (diag_is_empty(diag_get())) diag_set(FiberIsCancelled);
> 
> 8) The same as previously:
> ```
> if (condition)
>    action();
> ```
> 

Fixed.

> 9) Checking diag on empty looks strange to me. I think we should add an
> error to diag without this check. If we want to save the previous error,
> I suggest to use the stack diag.

I made this because this interface is used inside the testing machinery, where 
exact cause of e.g. replica disconnection is reported. If we put the diag here
unconditionally it will break this use of diag. I believe the stack diag also 
is not supported there yet.


>> +		return -1;
>> +	}
>>  	return 0;
>>  }
>>  diff --git a/test/app-tap/gh-5013-fiber-cancel.test.lua b/test/app-tap/gh-5013-fiber-cancel.test.lua
>> new file mode 100755
>> index 000000000..ae805c5bf
>> --- /dev/null
>> +++ b/test/app-tap/gh-5013-fiber-cancel.test.lua
>> @@ -0,0 +1,23 @@
>> +#!/usr/bin/env tarantool
>> +
>> +local tap = require('tap')
>> +local fiber = require('fiber')
>> +local test = tap.test("gh-5013-fiber-cancel")
>> +
>> +test:plan(2)
>> +
>> +local result = {}
>> +
>> +function test_f()
>> +    local cond = fiber.cond()
>> +    local res, err = pcall(cond.wait, cond)
>> +    result.res = res
>> +    result.err = err
>> +end
>> +
>> +local f = fiber.create(test_f)
>> +f:cancel()
>> +fiber.yield()
>> +
>> +test:ok(result.res == false, tostring(result.res))
>> +test:ok(tostring(result.err) == 'fiber is cancelled', tostring(result.err))
> 
> 10) Use a user-frendly check message (in both checks).
> 
Fixed

>> diff --git a/test/box/net.box_fiber_cancel_gh-4834.result b/test/box/net.box_fiber_cancel_gh-4834.result
>> new file mode 100644
>> index 000000000..eab0a5e4d
>> --- /dev/null
>> +++ b/test/box/net.box_fiber_cancel_gh-4834.result
>> @@ -0,0 +1,65 @@
>> +-- test-run result file version 2
>> +remote = require 'net.box'
>> + | ---
>> + | ...
>> +fiber = require 'fiber'
>> + | ---
>> + | ...
>> +test_run = require('test_run').new()
>> + | ---
>> + | ...
>> +
>> +-- #4834: Cancelling fiber doesn't interrupt netbox operations
>> +function infinite_call() fiber.channel(1):get() end
>> + | ---
>> + | ...
>> +box.schema.func.create('infinite_call')
>> + | ---
>> + | ...
>> +box.schema.user.grant('guest', 'execute', 'function', 'infinite_call')
>> + | ---
>> + | ...
>> +
>> +error_msg = nil
>> + | ---
>> + | ...
>> +test_run:cmd("setopt delimiter ';'")
>> + | ---
>> + | - true
>> + | ...
>> +function gh4834( > +    local cn = remote.connect(box.cfg.listen)
>> +    local f = fiber.new(function()
>> +        _, error_msg = pcall(cn.call, cn, 'infinite_call')
>> +    end)
>> +    f:set_joinable(true)
>> +    fiber.yield()
>> +    f:cancel()
>> +    f:join()
>> +    cn:close()
>> +end;
>> + | ---
>> + | ...
>> +test_run:cmd("setopt delimiter ''");
>> + | ---
>> + | - true
>> + | ...
>> +gh4834()
>> + | ---
>> + | ...
>> +error_msg
>> + | ---
>> + | - fiber is cancelled
>> + | ...
>> +box.schema.func.drop('infinite_call')
>> + | ---
>> + | ...
>> +infinite_call = nil
>> + | ---
>> + | ...
>> +channel = nil
>> + | ---
>> + | ...
>> +error_msg = nil
>> + | ---
>> + | ...
>> diff --git a/test/box/net.box_fiber_cancel_gh-4834.test.lua b/test/box/net.box_fiber_cancel_gh-4834.test.lua
>> new file mode 100644
>> index 000000000..06fb3ceac
>> --- /dev/null
>> +++ b/test/box/net.box_fiber_cancel_gh-4834.test.lua
>> @@ -0,0 +1,29 @@
>> +remote = require 'net.box'
>> +fiber = require 'fiber'
>> +test_run = require('test_run').new()
>> +
>> +-- #4834: Cancelling fiber doesn't interrupt netbox operations
>> +function infinite_call() fiber.channel(1):get() end
>> +box.schema.func.create('infinite_call')
>> +box.schema.user.grant('guest', 'execute', 'function', 'infinite_call')
>> +
>> +error_msg = nil
>> +test_run:cmd("setopt delimiter ';'")
>> +function gh4834()
> 
> 11) I think this is not a self-documenting name for the function.
> 
Renamed.

>> +    local cn = remote.connect(box.cfg.listen)
>> +    local f = fiber.new(function()
>> +        _, error_msg = pcall(cn.call, cn, 'infinite_call')
>> +    end)
>> +    f:set_joinable(true)
>> +    fiber.yield()
>> +    f:cancel()
>> +    f:join()
>> +    cn:close()
>> +end;
>> +test_run:cmd("setopt delimiter ''");
>> +gh4834()
>> +error_msg
>> +box.schema.func.drop('infinite_call')
>> +infinite_call = nil
>> +channel = nil
> 
> 12) You didn't introduce the variable "channel". In addition, I looked
> at a couple of tests and didn't see them cleaning up variables. But I
> don't mind.
> 
Removed this one. In general, clean up prevents flaky.

>> +error_msg = nil

=================

Before this patch fiber.cond():wait() just returns for cancelled
fiber. In contrast fiber.channel():get() threw "fiber is
canceled" error.
This patch unify behaviour of channels and condvars and also fixes
related net.box module problem - it was impossible to interrupt
net.box call with fiber.cancel because it used fiber.cond under
the hood. Test cases for both bugs are added.

Closes #4834
Closes #5013

Co-authored-by: Oleg Babin <olegrok@tarantool.org>

@TarantoolBot document
Title: fiber.cond():wait() throws if fiber is cancelled

Currently fiber.cond():wait() throws an error if waiting fiber is
cancelled.
---

Github: https://gitlab.com/tarantool/tarantool/-/commits/sergos/gh-5013-fiber-cond
Issue: https://github.com/tarantool/tarantool/issues/5013

@Changelog
* fiber.cond().wait() now throws if fiber is cancelled

 src/box/box.cc                                |  6 +-
 src/lib/core/fiber_cond.c                     |  4 ++
 test/app-tap/gh-5013-fiber-cancel.test.lua    | 23 +++++++
 test/box/net.box_fiber_cancel_gh-4834.result  | 65 +++++++++++++++++++
 .../box/net.box_fiber_cancel_gh-4834.test.lua | 29 +++++++++
 5 files changed, 123 insertions(+), 4 deletions(-)
 create mode 100755 test/app-tap/gh-5013-fiber-cancel.test.lua
 create mode 100644 test/box/net.box_fiber_cancel_gh-4834.result
 create mode 100644 test/box/net.box_fiber_cancel_gh-4834.test.lua

diff --git a/src/box/box.cc b/src/box/box.cc
index 18568df3b..9e824453d 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -305,10 +305,8 @@ box_wait_ro(bool ro, double timeout)
 {
        double deadline = ev_monotonic_now(loop()) + timeout;
        while (is_box_configured == false || box_is_ro() != ro) {
-               if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0)
-                       return -1;
-               if (fiber_is_cancelled()) {
-                       diag_set(FiberIsCancelled);
+               if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0) {
+                       if (fiber_is_cancelled()) diag_set(FiberIsCancelled);
                        return -1;
                }
        }
diff --git a/src/lib/core/fiber_cond.c b/src/lib/core/fiber_cond.c
index 904a350d9..0c93c5842 100644
--- a/src/lib/core/fiber_cond.c
+++ b/src/lib/core/fiber_cond.c
@@ -108,6 +108,10 @@ fiber_cond_wait_timeout(struct fiber_cond *c, double timeout)
                diag_set(TimedOut);
                return -1;
        }
+       if (fiber_is_cancelled()) {
+               if (diag_is_empty(diag_get())) diag_set(FiberIsCancelled);
+               return -1;
+       }
        return 0;
 }

diff --git a/test/app-tap/gh-5013-fiber-cancel.test.lua b/test/app-tap/gh-5013-fiber-cancel.test.lua
new file mode 100755
index 000000000..34711fb31
--- /dev/null
+++ b/test/app-tap/gh-5013-fiber-cancel.test.lua
@@ -0,0 +1,23 @@
+#!/usr/bin/env tarantool
+
+local tap = require('tap')
+local fiber = require('fiber')
+local test = tap.test("gh-5013-fiber-cancel")
+
+test:plan(2)
+
+local result = {}
+
+function test_f()
+    local cond = fiber.cond()
+    local res, err = pcall(cond.wait, cond)
+    result.res = res
+    result.err = err
+end
+
+local f = fiber.create(test_f)
+f:cancel()
+fiber.yield()
+
+test:ok(result.res == false, tostring(result.res))
+test:ok(tostring(result.err) == 'fiber is cancelled', tostring(result.err))
diff --git a/test/box/net.box_fiber_cancel_gh-4834.result b/test/box/net.box_fiber_cancel_gh-4834.result
new file mode 100644
index 000000000..eab0a5e4d
--- /dev/null
+++ b/test/box/net.box_fiber_cancel_gh-4834.result
@@ -0,0 +1,65 @@
+-- test-run result file version 2
+remote = require 'net.box'
+ | ---
+ | ...
+fiber = require 'fiber'
+ | ---
+ | ...
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+-- #4834: Cancelling fiber doesn't interrupt netbox operations
+function infinite_call() fiber.channel(1):get() end
+ | ---
+ | ...
+box.schema.func.create('infinite_call')
+ | ---
+ | ...
+box.schema.user.grant('guest', 'execute', 'function', 'infinite_call')
+ | ---
+ | ...
+
+error_msg = nil
+ | ---
+ | ...
+test_run:cmd("setopt delimiter ';'")
+ | ---
+ | - true
+ | ...
+function gh4834()
+    local cn = remote.connect(box.cfg.listen)
+    local f = fiber.new(function()
+        _, error_msg = pcall(cn.call, cn, 'infinite_call')
+    end)
+    f:set_joinable(true)
+    fiber.yield()
+    f:cancel()
+    f:join()
+    cn:close()
+end;
+ | ---
+ | ...
+test_run:cmd("setopt delimiter ''");
+ | ---
+ | - true
+ | ...
+gh4834()
+ | ---
+ | ...
+error_msg
+ | ---
+ | - fiber is cancelled
+ | ...
+box.schema.func.drop('infinite_call')
+ | ---
+ | ...
+infinite_call = nil
+ | ---
+ | ...
+channel = nil
+ | ---
+ | ...
+error_msg = nil
+ | ---
+ | ...
diff --git a/test/box/net.box_fiber_cancel_gh-4834.test.lua b/test/box/net.box_fiber_cancel_gh-4834.test.lua
new file mode 100644
index 000000000..06fb3ceac
--- /dev/null
+++ b/test/box/net.box_fiber_cancel_gh-4834.test.lua
@@ -0,0 +1,29 @@
+remote = require 'net.box'
+fiber = require 'fiber'
+test_run = require('test_run').new()
+
+-- #4834: Cancelling fiber doesn't interrupt netbox operations
+function infinite_call() fiber.channel(1):get() end
+box.schema.func.create('infinite_call')
+box.schema.user.grant('guest', 'execute', 'function', 'infinite_call')
+
+error_msg = nil
+test_run:cmd("setopt delimiter ';'")
+function gh4834()
+    local cn = remote.connect(box.cfg.listen)
+    local f = fiber.new(function()
+        _, error_msg = pcall(cn.call, cn, 'infinite_call')
+    end)
+    f:set_joinable(true)
+    fiber.yield()
+    f:cancel()
+    f:join()
+    cn:close()
+end;
+test_run:cmd("setopt delimiter ''");
+gh4834()
+error_msg
+box.schema.func.drop('infinite_call')
+infinite_call = nil
+channel = nil
+error_msg = nil
-- 
2.29.2

  parent reply	other threads:[~2020-11-10 21:16 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-31 16:29 sergos
2020-11-01 10:13 ` Oleg Babin
2020-11-03 10:20   ` Sergey Ostanevich
2020-11-03 10:27     ` Oleg Babin
2020-11-04 10:00     ` Leonid Vasiliev
2020-11-16 22:12     ` Vladislav Shpilevoy
2020-11-18 22:05       ` Sergey Ostanevich
2020-11-22 16:01         ` Vladislav Shpilevoy
2020-11-23 21:47           ` Sergey Ostanevich
2020-11-24  7:31           ` Sergey Ostanevich
2020-11-04 10:00 ` Leonid Vasiliev
2020-11-05 20:42   ` Sergey Ostanevich
2020-11-10 21:16   ` Sergey Ostanevich [this message]
2020-11-12 20:15     ` Sergey Ostanevich
2020-11-13  8:26       ` Leonid Vasiliev
2020-11-30 22:49         ` Alexander Turenko
2020-11-16 22:12 ` Vladislav Shpilevoy
2020-11-25 21:32 ` Vladislav Shpilevoy
2020-11-29 21:41   ` Sergey Ostanevich
2020-11-30 21:46   ` Alexander Turenko
2020-11-30 21:01 ` Vladislav Shpilevoy
2020-12-02 10:58 ` Alexander V. Tikhonov
2020-12-02 22:18 ` Vladislav Shpilevoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D78775C3-B91B-4256-9F51-24F0332AD152@tarantool.org \
    --to=sergos@tarantool.org \
    --cc=alexander.turenko@tarantool.org \
    --cc=lvasiliev@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v2] core: handle fiber cancellation for fiber.cond' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox