From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp3.mail.ru (smtp3.mail.ru [94.100.179.58]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 23E4D4696C3 for ; Fri, 1 May 2020 19:58:12 +0300 (MSK) From: Oleg Babin References: <5dc5d2b0aaad4fcfc157c5a6e6bd13025ba85a25.1588292014.git.v.shpilevoy@tarantool.org> Message-ID: <6c622f6c-19d9-a64b-5ccb-8799561ef7cf@tarantool.org> Date: Fri, 1 May 2020 19:58:10 +0300 MIME-Version: 1.0 In-Reply-To: <5dc5d2b0aaad4fcfc157c5a6e6bd13025ba85a25.1588292014.git.v.shpilevoy@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-GB Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH vshard 1/7] test: print errors in a portable way List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladislav Shpilevoy , tarantool-patches@dev.tarantool.org Hi! Thanks for the patch! See three small comments. On 01/05/2020 03:16, Vladislav Shpilevoy wrote: > > +-- Portable representation of an error, not depending on Tarantool > +-- version and on any additional fields it can add. Trace is also > +-- trimmed in order for the tests not to depend on line numbers of > +-- the source files, which may slip into a .result file. > +local function portable_error(err) > + return {code = err.code, type = err.type, message = err.message} > +end I propose to add "nil" check here. At first it allows to use this hepler not only when we expect error. Secondly this approach eliminates confusing error messages if tests break for some reasons. > + > return { > check_error = check_error, > shuffle_masters = shuffle_masters, > @@ -206,4 +214,5 @@ return { > SOURCEDIR = SOURCEDIR, > BUILDDIR = BUILDDIR, > git_checkout = git_checkout, > + portable_error = portable_error, > } > diff --git a/test/misc/check_uuid_on_connect.result b/test/misc/check_uuid_on_connect.result > index d65aefe..6ebc5d0 100644 > --- a/test/misc/check_uuid_on_connect.result > +++ b/test/misc/check_uuid_on_connect.result > @@ -4,10 +4,6 @@ test_run = require('test_run').new() > fiber = require('fiber') > --- > ... > -test_run:cmd("push filter 'line: *[0-9]+' to 'line: '") > ---- > -- true > -... > REPLICASET_1 = { 'bad_uuid_1_a', 'bad_uuid_1_b' } > --- > ... > @@ -42,15 +38,15 @@ vshard.storage.bucket_force_create(1) > ... > -- Fail, because replicaset_1 sees not the actual replicaset_2's > -- master UUID. > -vshard.storage.bucket_send(1, replicaset_uuid[2]) > +res, err = vshard.storage.bucket_send(1, replicaset_uuid[2]) > +--- > +... > +res, util.portable_error(err) > --- > - null > - type: ClientError > code: 77 > message: Connection closed > - trace: > - - file: builtin/box/net_box.lua > - line: > ... > test_run:grep_log('bad_uuid_1_a', 'Mismatch server UUID on replica bad_uuid_2_a%(storage%@') > --- > @@ -171,15 +167,15 @@ vshard.storage.bucket_force_create(2) > --- > - true > ... > -vshard.storage.bucket_send(2, replicaset_uuid[2]) > +res, err = vshard.storage.bucket_send(2, replicaset_uuid[2]) > +--- > +... > +res, util.portable_error(err) > --- > - null > - type: ClientError > code: 77 > message: Connection closed > - trace: > - - file: builtin/box/net_box.lua > - line: > ... > -- Close existing connection on a first error and log it. > test_run:grep_log('bad_uuid_1_a', 'Mismatch server UUID on replica bad_uuid_2_a') ~= nil > diff --git a/test/misc/check_uuid_on_connect.test.lua b/test/misc/check_uuid_on_connect.test.lua > index b568c73..62f6593 100644 > --- a/test/misc/check_uuid_on_connect.test.lua > +++ b/test/misc/check_uuid_on_connect.test.lua > @@ -1,6 +1,5 @@ > test_run = require('test_run').new() > fiber = require('fiber') > -test_run:cmd("push filter 'line: *[0-9]+' to 'line: '") > > REPLICASET_1 = { 'bad_uuid_1_a', 'bad_uuid_1_b' } > REPLICASET_2 = { 'bad_uuid_2_a', 'bad_uuid_2_b' } > @@ -16,7 +15,8 @@ util = require('util') > vshard.storage.bucket_force_create(1) > -- Fail, because replicaset_1 sees not the actual replicaset_2's > -- master UUID. > -vshard.storage.bucket_send(1, replicaset_uuid[2]) > +res, err = vshard.storage.bucket_send(1, replicaset_uuid[2]) > +res, util.portable_error(err) > test_run:grep_log('bad_uuid_1_a', 'Mismatch server UUID on replica bad_uuid_2_a%(storage%@') > box.space._bucket:select{} > -- Bucket sending fails, but it remains 'sending'. It is because > @@ -64,7 +64,8 @@ util.wait_master(test_run, REPLICASET_2, 'bad_uuid_2_a') > > test_run:switch('bad_uuid_1_a') > vshard.storage.bucket_force_create(2) > -vshard.storage.bucket_send(2, replicaset_uuid[2]) > +res, err = vshard.storage.bucket_send(2, replicaset_uuid[2]) > +res, util.portable_error(err) > -- Close existing connection on a first error and log it. > test_run:grep_log('bad_uuid_1_a', 'Mismatch server UUID on replica bad_uuid_2_a') ~= nil > > diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result > index b1f1b15..b66e449 100644 > --- a/test/rebalancer/bucket_ref.result > +++ b/test/rebalancer/bucket_ref.result > @@ -137,15 +137,15 @@ vshard.storage.internal.errinj.ERRINJ_LONG_RECEIVE = true > _ = test_run:switch('box_1_a') > --- > ... > -vshard.storage.bucket_send(1, util.replicasets[2]) > +res, err = vshard.storage.bucket_send(1, util.replicasets[2]) > +--- > +... > +res, util.portable_error(err) > --- > - null > - type: ClientError > code: 32 > message: Timeout exceeded > - trace: > - - file: '[C]' > - line: 4294967295 > ... > vshard.storage.buckets_info(1) > --- > diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua > index 1fe4d84..49ba583 100644 > --- a/test/rebalancer/bucket_ref.test.lua > +++ b/test/rebalancer/bucket_ref.test.lua > @@ -48,7 +48,8 @@ vshard.storage.buckets_info(1) > _ = test_run:switch('box_2_a') > vshard.storage.internal.errinj.ERRINJ_LONG_RECEIVE = true > _ = test_run:switch('box_1_a') > -vshard.storage.bucket_send(1, util.replicasets[2]) > +res, err = vshard.storage.bucket_send(1, util.replicasets[2]) > +res, util.portable_error(err) > vshard.storage.buckets_info(1) > vshard.storage.bucket_ref(1, 'write') > vshard.storage.bucket_unref(1, 'write') -- Error, no refs. > diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result > index 954b549..db6a67f 100644 > --- a/test/rebalancer/receiving_bucket.result > +++ b/test/rebalancer/receiving_bucket.result > @@ -160,15 +160,15 @@ vshard.storage.internal.errinj.ERRINJ_RECEIVE_PARTIALLY = true > _ = test_run:switch('box_2_a') > --- > ... > -vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 10}) > +res, err = vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 10}) > +--- > +... > +res, util.portable_error(err) > --- > - null > - type: ClientError > code: 32 > message: Error injection 'the bucket is received partially' > - trace: > - - file: '[C]' > - line: 4294967295 > ... > box.space._bucket:get{1} > --- > @@ -222,10 +222,7 @@ _ = test_run:switch('box_2_a') > _, err = vshard.storage.bucket_send(101, util.replicasets[1], {timeout = 0.1}) > --- > ... > -err.trace = nil > ---- > -... > -err > +util.portable_error(err) > --- > - type: ClientError > code: 78 > @@ -331,15 +328,12 @@ vshard.storage.bucket_refrw(1) > while f1:status() ~= 'dead' do fiber.sleep(0.01) end > --- > ... > -ret, err > +ret, util.portable_error(err) > --- > - null > - type: ClientError > code: 78 > message: Timeout exceeded > - trace: > - - file: '[C]' > - line: 4294967295 > ... > finish_long_thing = true > --- > diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua > index 4885815..1819cbb 100644 > --- a/test/rebalancer/receiving_bucket.test.lua > +++ b/test/rebalancer/receiving_bucket.test.lua > @@ -62,7 +62,8 @@ _ = test_run:switch('box_1_a') > while box.space._bucket:get{1} do fiber.sleep(0.01) end > vshard.storage.internal.errinj.ERRINJ_RECEIVE_PARTIALLY = true > _ = test_run:switch('box_2_a') > -vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 10}) > +res, err = vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 10}) > +res, util.portable_error(err) > box.space._bucket:get{1} > _ = test_run:switch('box_1_a') > box.space._bucket:get{1} > @@ -88,8 +89,7 @@ _ = test_run:switch('box_1_a') > vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = true > _ = test_run:switch('box_2_a') > _, err = vshard.storage.bucket_send(101, util.replicasets[1], {timeout = 0.1}) > -err.trace = nil > -err > +util.portable_error(err) > box.space._bucket:get{101} > while box.space._bucket:get{101}.status ~= vshard.consts.BUCKET.ACTIVE do vshard.storage.recovery_wakeup() fiber.sleep(0.01) end > box.space._bucket:get{101} > @@ -126,7 +126,7 @@ while f1:status() ~= 'suspended' do fiber.sleep(0.01) end > vshard.storage.buckets_info(1) > vshard.storage.bucket_refrw(1) > while f1:status() ~= 'dead' do fiber.sleep(0.01) end > -ret, err > +ret, util.portable_error(err) > finish_long_thing = true > while f:status() ~= 'dead' do fiber.sleep(0.01) end > vshard.storage.buckets_info(1) > diff --git a/test/router/retry_reads.result b/test/router/retry_reads.result > index 8873310..ae60267 100644 > --- a/test/router/retry_reads.result > +++ b/test/router/retry_reads.result > @@ -113,10 +113,7 @@ fiber.time() - start < 1 > --- > - true > ... > -e.trace = nil > ---- > -... > -e > +util.portable_error(e) > --- > - type: ClientError > code: 0 > @@ -125,10 +122,7 @@ e > _, e = rs1:callro('sleep', {1}, {timeout = 0.0001}) > --- > ... > -e.trace = nil > ---- > -... > -e > +util.portable_error(e) > --- > - type: ClientError > code: 78 > diff --git a/test/router/retry_reads.test.lua b/test/router/retry_reads.test.lua > index ebfd1a9..c1fb689 100644 > --- a/test/router/retry_reads.test.lua > +++ b/test/router/retry_reads.test.lua > @@ -40,12 +40,10 @@ fiber.time() - start < 1 > start = fiber.time() > _, e = rs1:callro('raise_client_error', {}, {timeout = 5}) > fiber.time() - start < 1 > -e.trace = nil > -e > +util.portable_error(e) > > _, e = rs1:callro('sleep', {1}, {timeout = 0.0001}) > -e.trace = nil > -e > +util.portable_error(e) > > -- > -- Do not send multiple requests during timeout - it brokes long > diff --git a/test/router/router.result b/test/router/router.result > index 31a351f..d85f718 100644 > --- a/test/router/router.result > +++ b/test/router/router.result > @@ -255,35 +255,21 @@ vshard.router.static.replicasets[util.replicasets[2]].bucket_count > _, e = vshard.router.callro(1, 'raise_client_error', {}, {}) > --- > ... > -e.trace = nil > ---- > -... > -e > +util.portable_error(e) > --- > - type: ClientError > - message: Unknown error > code: 32 > -... > -tostring(e) > ---- > -- '{"type":"ClientError","message":"Unknown error","code":32}' > + message: Unknown error > ... > _, e = vshard.router.route(1):callro('raise_client_error', {}) > --- > ... > -e.trace = nil > ---- > -... > -e > +util.portable_error(e) > --- > - type: ClientError > code: 0 > message: Unknown error > ... > -tostring(e) > ---- > -- '{"type":"ClientError","code":0,"message":"Unknown error"}' > -... > -- Ensure, that despite not working multi-return, it is allowed > -- to return 'nil, err_obj'. > vshard.router.callro(1, 'echo', {nil, 'error_object'}, {}) > @@ -632,11 +618,14 @@ future:is_ready() > future = vshard.router.callrw(bucket_id, 'raise_client_error', {}, {is_async = true}) > --- > ... > -future:wait_result() > +res, err = future:wait_result() > --- > -- null > -- {'type': 'ClientError', 'message': 'Unknown error', 'code': 32, 'trace': [{'file': '[C]', > - 'line': 4294967295}]} > +... > +util.portable_error(err) > +--- > +- type: ClientError > + code: 32 > + message: Unknown error > ... > future:is_ready() > --- > diff --git a/test/router/router.test.lua b/test/router/router.test.lua > index 571759a..abc1a3c 100644 > --- a/test/router/router.test.lua > +++ b/test/router/router.test.lua > @@ -95,13 +95,9 @@ vshard.router.static.replicasets[util.replicasets[2]].bucket_count > -- Test lua errors. > -- > _, e = vshard.router.callro(1, 'raise_client_error', {}, {}) > -e.trace = nil > -e > -tostring(e) > +util.portable_error(e) > _, e = vshard.router.route(1):callro('raise_client_error', {}) > -e.trace = nil > -e > -tostring(e) > +util.portable_error(e) > -- Ensure, that despite not working multi-return, it is allowed > -- to return 'nil, err_obj'. > vshard.router.callro(1, 'echo', {nil, 'error_object'}, {}) > @@ -216,7 +212,8 @@ future = vshard.router.callro(bucket_id, 'space_get', {'test', {1}}, {is_async = > future:wait_result() > future:is_ready() > future = vshard.router.callrw(bucket_id, 'raise_client_error', {}, {is_async = true}) > -future:wait_result() > +res, err = future:wait_result() > +util.portable_error(err) > future:is_ready() > future = vshard.router.callrw(bucket_id, 'do_push', args, {is_async = true}) > func, iter, i = future:pairs() > diff --git a/test/router/sync.result b/test/router/sync.result > index 83443af..6f0821d 100644 > --- a/test/router/sync.result > +++ b/test/router/sync.result > @@ -38,6 +38,9 @@ _ = test_run:cmd("start server router_1") > _ = test_run:switch("router_1") > --- > ... > +util = require('util') > +--- > +... > vshard.router.bootstrap() > --- > - true > @@ -47,15 +50,13 @@ vshard.router.sync(-1) > - null > - Timeout exceeded > ... > -vshard.router.sync(0) > +res, err = vshard.router.sync(0) > --- > -- null > -- replicaset: ac522f65-aa94-4134-9f64-51ee384f1a54 > +... > +util.portable_error(err) > +--- > +- type: ClientError > code: 78 > - trace: > - - file: builtin/box/net_box.lua > - line: > - type: ClientError > message: Timeout exceeded > ... > -- > diff --git a/test/router/sync.test.lua b/test/router/sync.test.lua > index 9602a4d..3150343 100644 > --- a/test/router/sync.test.lua > +++ b/test/router/sync.test.lua > @@ -11,11 +11,13 @@ util.map_evals(test_run, {REPLICASET_1, REPLICASET_2}, 'bootstrap_storage(\'memt > _ = test_run:cmd("create server router_1 with script='router/router_1.lua'") > _ = test_run:cmd("start server router_1") > _ = test_run:switch("router_1") > +util = require('util') > > vshard.router.bootstrap() > > vshard.router.sync(-1) > -vshard.router.sync(0) > +res, err = vshard.router.sync(0) > +util.portable_error(err) > > -- > -- gh-190: router should not ignore cfg.sync_timeout. > diff --git a/test/storage/storage.result b/test/storage/storage.result > index 007d9fb..424bc4c 100644 > --- a/test/storage/storage.result > +++ b/test/storage/storage.result > @@ -506,15 +506,14 @@ vshard.storage.bucket_recv(100, 'from_uuid', {{1000, {{1}}}}) > -- > -- Test not existing space in bucket data. > -- > -vshard.storage.bucket_recv(4, util.replicasets[2], {{1000, {{1}}}}) > +res, err = vshard.storage.bucket_recv(4, util.replicasets[2], {{1000, {{1}}}}) > +--- > +... > +util.portable_error(err) > --- > -- null > - type: ClientError > code: 36 > message: Space '1000' does not exist > - trace: > - - file: '[C]' > - line: 4294967295 > ... > while box.space._bucket:get{4} do vshard.storage.recovery_wakeup() fiber.sleep(0.01) end > --- > diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua > index 0e82cd1..d631b51 100644 > --- a/test/storage/storage.test.lua > +++ b/test/storage/storage.test.lua > @@ -121,7 +121,8 @@ vshard.storage.bucket_recv(100, 'from_uuid', {{1000, {{1}}}}) > -- > -- Test not existing space in bucket data. > -- > -vshard.storage.bucket_recv(4, util.replicasets[2], {{1000, {{1}}}}) > +res, err = vshard.storage.bucket_recv(4, util.replicasets[2], {{1000, {{1}}}}) > +util.portable_error(err) > while box.space._bucket:get{4} do vshard.storage.recovery_wakeup() fiber.sleep(0.01) end > > -- > diff --git a/test/unit/error.result b/test/unit/error.result > index f74438f..144df44 100644 > --- a/test/unit/error.result > +++ b/test/unit/error.result > @@ -19,9 +19,15 @@ ok, err = pcall(box.error, box.error.TIMEOUT) > box_error = lerror.box(err) > --- > ... > -tostring(box_error) > +str = tostring(box_error) > --- > -- '{"type":"ClientError","code":78,"message":"Timeout exceeded","trace":[{"file":"[C]","line":4294967295}]}' > +... > +-- Base_type appears in 2.4.1. Simple nullification of base_type > +-- won't work, since order of the old fields becomes different. > +assert(str == '{"type":"ClientError","code":78,"message":"Timeout exceeded","trace":[{"file":"[C]","line":4294967295}]}' or \ > + str == '{"code":78,"base_type":"ClientError","type":"ClientError","message":"Timeout exceeded","trace":[{"file":"[C]","line":4294967295}]}') > +--- > +- true > ... > vshard_error = lerror.vshard(lerror.code.UNREACHABLE_MASTER, 'uuid', 'reason') > --- > @@ -80,12 +86,12 @@ function raise_lua_err() assert(false) end > ok, err = pcall(raise_lua_err) > --- > ... > -lerror.make(err) > +err = lerror.make(err) > +--- > +... > +util.portable_error(err) > --- > - type: ClientError > code: 32 > message: '[string "function raise_lua_err() assert(false) end "]:1: assertion failed!' > - trace: > - - file: '[C]' > - line: 4294967295 > ... > diff --git a/test/unit/error.test.lua b/test/unit/error.test.lua > index 6c15707..2ba7da6 100644 > --- a/test/unit/error.test.lua > +++ b/test/unit/error.test.lua > @@ -8,7 +8,11 @@ lerror = vshard.error > -- > ok, err = pcall(box.error, box.error.TIMEOUT) > box_error = lerror.box(err) Why do you use pcall(box.error, box.error.TIMEOUT)? Is it needed for promote an error to box.error.last()? Why not simply `box.error.new(box.error.TIMEOUT)?` I doesn't expect that it will promote an error to box.error.last() (even if https://github.com/tarantool/tarantool/issues/4778 still works in 1.10/2.2/2.3), but I think that this isn't really needed if you return `nil, err`. > -tostring(box_error) > +str = tostring(box_error) > +-- Base_type appears in 2.4.1. Simple nullification of base_type > +-- won't work, since order of the old fields becomes different. > +assert(str == '{"type":"ClientError","code":78,"message":"Timeout exceeded","trace":[{"file":"[C]","line":4294967295}]}' or \ > + str == '{"code":78,"base_type":"ClientError","type":"ClientError","message":"Timeout exceeded","trace":[{"file":"[C]","line":4294967295}]}') > I assume that we don't have a guarantee that our error will be printed as `{"type":"ClientError","code":78 ... }` and not as `{"code":78, "type":"ClientError" ...}`. Also there is confusing "line" parameter. Is it stable for different versions? I propose to check it as ``` err.type == "ClientError" err.code == 78 ... ``` > vshard_error = lerror.vshard(lerror.code.UNREACHABLE_MASTER, 'uuid', 'reason') > tostring(vshard_error) > @@ -32,4 +36,5 @@ util.check_error(lerror.vshard, 'Wrong format code', 'arg1', 'arg2') > > function raise_lua_err() assert(false) end > ok, err = pcall(raise_lua_err) > -lerror.make(err) > +err = lerror.make(err) > +util.portable_error(err) >