[Tarantool-patches] [PATCH vshard 1/1] rebalancer: give more info at bucket_recv() fail

Oleg Babin olegrok at tarantool.org
Wed May 26 12:03:18 MSK 2021


Hi! Thanks for your patch. Two minor comments below.

On 25.05.2021 23:42, Vladislav Shpilevoy wrote:
> +--
>   -- Bucket transfer
>   --
>   -- Transfer to unknown replicaset.
> diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
> index 494e2e8..d1f3f50 100644
> --- a/test/storage/storage.test.lua
> +++ b/test/storage/storage.test.lua
> @@ -125,6 +125,23 @@ vshard.storage.bucket_recv(100, 'from_uuid', {{1000, {{1}}}})
>   res, err = vshard.storage.bucket_recv(4, util.replicasets[2], {{1000, {{1}}}})
>   util.portable_error(err)
>   while box.space._bucket:get{4} do vshard.storage.recovery_wakeup() fiber.sleep(0.01) end
> +--
> +-- gh-275: detailed info when couldn't insert into a space.
> +--
> +res, err = vshard.storage.bucket_recv(                                          \
> +    4, util.replicasets[2], {{box.space.test.id, {{9, 4}, {10, 4}, {1, 4}}}})
> +assert(not res)
> +assert(err.space == 'test')
> +assert(err.bucket_id == 4)
> +assert(tostring(err.tuple) == '[1, 4]')
> +assert(err.reason:match('Duplicate key exists') ~= nil)
> +err = err.message
> +assert(err:match('bucket 4 data in space "test" at tuple %[1, 4%]') ~= nil)
> +assert(err:match('Duplicate key exists') ~= nil)
> +while box.space._bucket:get{4} do                                               \
> +    vshard.storage.recovery_wakeup() fiber.sleep(0.01)                          \
> +end
> +assert(box.space.test:get{9} == nil and box.space.test:get{10} == nil)
>   
>   --
>   -- Bucket transfer
> diff --git a/vshard/error.lua b/vshard/error.lua
> index b02bfe9..bcbcd71 100644
> --- a/vshard/error.lua
> +++ b/vshard/error.lua
> @@ -149,6 +149,11 @@ local error_message_template = {
>           msg = 'Can not delete a storage ref: %s',
>           args = {'reason'},
>       },
> +    [30] = {
> +        name = 'BUCKET_RECV_DATA_ERROR',
> +        msg = 'Can not receive the bucket %s data in space "%s" at tuple %s: %s',
> +        args = {'bucket_id', 'space', 'tuple', 'reason'},
> +    }
>   }
>   
>   --
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 63e0398..7045d91 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -1254,7 +1254,13 @@ local function bucket_recv_xc(bucket_id, from, data, opts)
>           end
>           box.begin()
>           for _, tuple in ipairs(space_data) do
> -            space:insert(tuple)
> +            local ok, err = pcall(space.insert, space, tuple)
> +            if not ok then
> +                box.rollback()

Am I right that before this patch nobody rolled back the transaction in
case of an error?

How did it work?

> +                return nil, lerror.vshard(lerror.code.BUCKET_RECV_DATA_ERROR,
> +                                          bucket_id, space.name,
> +                                          box.tuple.new(tuple), err)
> +            end

Do you really need `box.tuple.new` here? Why isn't plain `tuple` enough?

AFAIU box.tuple.new doesn't just increment the tuple's ref-counter; it
constructs a new tuple.

Rebalancing is quite a CPU-intensive operation, so I'm not sure this
doesn't make the error case worse.
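
For reference, a minimal sketch of the pattern under discussion, written in
plain Lua so it runs without Tarantool. `new_mock_space` and `recv_tuples`
are hypothetical stand-ins, not vshard or box APIs; the mock only imitates
insert, commit and rollback. The sketch passes the raw Lua table into the
error object instead of converting it first.

```lua
-- Hypothetical mock of a space with a pending (uncommitted) change set.
-- This is NOT the Tarantool box API, only an illustration.
local function new_mock_space(name)
    local space = {name = name, rows = {}, pending = {}}

    -- Raise on a duplicate primary key (first field), like a real insert would.
    function space:insert(tuple)
        local key = tuple[1]
        if self.rows[key] ~= nil or self.pending[key] ~= nil then
            error('Duplicate key exists')
        end
        self.pending[key] = tuple
    end

    -- Make all pending rows visible.
    function space:commit()
        for key, tuple in pairs(self.pending) do
            self.rows[key] = tuple
        end
        self.pending = {}
    end

    -- Drop all pending rows.
    function space:rollback()
        self.pending = {}
    end

    return space
end

-- Insert all tuples or none. On failure roll back and return nil plus an
-- error table that carries the raw tuple (no conversion on the error path).
local function recv_tuples(space, bucket_id, tuples)
    for _, tuple in ipairs(tuples) do
        local ok, err = pcall(space.insert, space, tuple)
        if not ok then
            space:rollback()
            return nil, {
                bucket_id = bucket_id,
                space = space.name,
                tuple = tuple,
                reason = tostring(err),
            }
        end
    end
    space:commit()
    return true
end

-- Usage: key 1 already exists, so the third tuple fails and nothing is kept.
local space = new_mock_space('test')
space.rows[1] = {1, 4}
local res, err = recv_tuples(space, 4, {{9, 4}, {10, 4}, {1, 4}})
assert(res == nil)
assert(err.reason:match('Duplicate key exists') ~= nil)
assert(space.rows[9] == nil and space.rows[10] == nil)
```

Note that `tostring()` on a plain Lua table gives `table: 0x...` rather than
its contents, while the test in the patch compares `tostring(err.tuple)` with
'[1, 4]'. So if `tuple` is a plain Lua table at that point, passing it as is
would presumably require formatting it some other way.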


>               limit = limit - 1
>               if limit == 0 then
>                   box.commit()

