[Tarantool-patches] [PATCH v2 10/10] iproto: support error stacked diagnostic area

Tue Mar 31 02:24:48 MSK 2020

Thanks for the patchset!

On future - it would be good to have diffs and answers on
my comments inlined. Otherwise I need to jump between the
new email, the old one, and the commit's diff to check what
and how was fixed. And I likely forget something.

See 10 comments below.

On 25/03/2020 02:43, Nikita Pettik wrote:
> This patch introduces support of stacked errors in IProto protocol and
> in net.box module.
> 
> @TarantoolBot document
> Title: Stacked error diagnostic area
> 
> Starting from now errors can be organized into lists. To achieve this
> Lua table representing error object is extended with .prev field and
> e:set_prev(err) method. .prev field returns previous error if any exist.
> e:set_prev(err) method expects err to be error object or nil and sets
> err as previous error of e. For instance:
> 
> e1 = box.error.new({code = 111, reason = "cause"})
> e2 = box.error.new({code = 111, reason = "cause of cause"})

1. I would wrap these code samples into ```. Otherwise you may
screw the whole ticket formatting accidentally. You can check how
will it look by clicking 'New issue', pasting the request, and
clicking 'Preview'.

Currently it looks not the best. For example,

    - error: 'builtin/error.lua: Cycles are not allowed'

is turned into a bullet point.

> e1:set_prev(e2)
> assert(e1.prev == e2) -- true
> 
> Cycles are not allowed for error lists:
> 
> e2:set_prev(e1)
> - error: 'builtin/error.lua: Cycles are not allowed'
> 
> Nil is valid input to :set_prev() method:
> 
> e1:set_prev(nil)
> assert(e1.prev == nil) -- true
> 
> Note that error can be 'previous' only to the one error at once:
> 
> e1:set_prev(e2)
> e3:set_prev(e2)
> assert(e1.prev == nil) -- true
> assert(e3.prev == e2) -- true
> 
> Setting previous error does not erase its own previous members:
> 
> -- e1 -> e2 -> e3 -> e4
> e1:set_prev(e2)
> e2:set_prev(e3)
> e3:set_prev(e4)
> e2:set_prev(e5)
> -- Now there are two lists: e1->e2->e5 and e3->e4
> assert(e1.prev == e2) -- true
> assert(e2.prev == e5) -- true
> assert(e3.prev == e4) -- true
> 
> Alternatively:
> 
> e1:set_prev(e2)
> e2:set_prev(e3)
> e3:set_prev(e4)
> e5:set_prev(e3)
> -- Now there are two lists: e1->e2 and e5->e3->e4
> assert(e1.prev == e2) -- true
> assert(e2.prev == nil) -- true
> assert(e5.prev == e3) -- true
> assert(e3.prev == e4) -- true
> 
> Stacked diagnostics is also supported by IProto protocol. Now responses
> containing errors always (even if there's only one error to be returned)
> include new IProto key: IPROTO_ERROR_STACK (0x51). So, body corresponding to
> error response now looks like:
> MAP{IPROTO_ERROR : string, IPROTO_ERROR_STACK : ARRAY[MAP{ERROR_CODE : uint, ERROR_MESSAGE : string}, MAP{...}, ...]}
> 
> where IPROTO_ERROR is 0x31 key, IPROTO_ERROR_STACK is 0x51, ERROR_CODE
> is 0x01 and ERROR_MESSAGE is 0x02.
> Instances of older versions (without support of stacked errors in
> protocol) simply ignore unknown keys and still rely only on IPROTO_ERROR
> key.
> 
> Closes #1148

2. Propose to move this to before the docbot request. So as not
to include it into the doc ticket.

> diff --git a/src/box/xrow.c b/src/box/xrow.c
> index 256dd4d91..ceb38b040 100644
> --- a/src/box/xrow.c
> +++ b/src/box/xrow.c
> @@ -1070,19 +1083,66 @@ xrow_encode_auth(struct xrow_header *packet, const char *salt, size_t salt_len,
>  	return 0;
>  }
>  
> +static int
> +iproto_decode_error_stack(const char **pos, const char *end)
> +{
> +	const char *reason = NULL;
> +	static_assert(TT_STATIC_BUF_LEN >= DIAG_ERRMSG_MAX, "static buffer is "\
> +		      "expected to be larger than error message max length");
> +	/*
> +	 * Erase previously set diag errors. It is required since
> +	 * box_error_add() does not replace previous errors.
> +	 */
> +	box_error_clear();
> +	if (mp_check_array(*pos, end) != 0)

3. mp_check() functions return > 0 on error.

When you call mp_check_array() without checking if it is
MP_ARRAY, it will crash. Also it will crash if *pos == end
in the checks below. But it does not matter anymore.

I noticed that we don't need MessagePack validness checking
here. I see now that mp_check() is called by some upper function
always. You need only check that MP_ types are correct. That it
is MP_ARRAY, keys are MP_UINT and so on.

I don't see any tests on that. Are there some? Maybe at least
unit tests, in xrow.cc? Although a proper test would try to
send an error stack through applier. Because xrow_decode_error()
is used only there.

> +		return -1;
> +	uint32_t stack_sz = mp_decode_array(pos);
> +	for (uint32_t i = 0; i < stack_sz; i++) {
> +		uint32_t code = 0;
> +		if (mp_typeof(**pos) != MP_MAP || mp_check_map(*pos, end) != 0)
> +			return -1;
> +		uint32_t map_sz = mp_decode_map(pos);
> +		for (uint32_t key_idx = 0; key_idx < map_sz; key_idx++) {
> +			if (mp_typeof(**pos) != MP_UINT ||
> +			    mp_check_uint(*pos, end) != 0)
> +				return -1;
> +			uint8_t key = mp_decode_uint(pos);
> +			if (key == IPROTO_ERROR_CODE) {
> +				if (mp_typeof(**pos) != MP_UINT ||
> +				    mp_check_uint(*pos, end) != 0)
> +					return -1;
> +				code = mp_decode_uint(pos);
> +			} else if (key == IPROTO_ERROR_MESSAGE) {
> +				if (mp_typeof(**pos) != MP_STR ||
> +				    mp_check_strl(*pos, end) != 0)
> +					return -1;
> +				uint32_t len;
> +				const char *str = mp_decode_str(pos, &len);
> +				reason = tt_cstr(str, len);
> +			} else {
> +				mp_next(pos);
> +				continue;
> +			}
> +			box_error_add(__FILE__, __LINE__, code, reason);
> +		}
> +	}
> +	return 0;
> +}
> diff --git a/test/box/iproto.result b/test/box/iproto.result
> new file mode 100644
> index 000000000..bb369be5a
> --- /dev/null
> +++ b/test/box/iproto.result
> @@ -0,0 +1,141 @@
> +test_run = require('test_run').new()

4. test_run objects seems to be not necessary here.

> +---
> +...
> +net_box = require('net.box')
> +---
> +...
> +urilib = require('uri')
> +---
> +...
> +msgpack = require('msgpack')
> +---
> +...
> +IPROTO_REQUEST_TYPE   = 0x00
> +---
> +...
> +IPROTO_INSERT         = 0x02
> +---
> +...
> +IPROTO_SYNC           = 0x01
> +---
> +...
> +IPROTO_SPACE_ID       = 0x10
> +---
> +...
> +IPROTO_TUPLE          = 0x21
> +---
> +...
> +IPROTO_ERROR          = 0x31
> +---
> +...
> +IPROTO_ERROR_STACK    = 0x52
> +---
> +...
> +IPROTO_ERROR_CODE     = 0x01
> +---
> +...
> +IPROTO_ERROR_MESSAGE  = 0x02
> +---
> +...
> +IPROTO_OK             = 0x00
> +---
> +...
> +IPROTO_SCHEMA_VERSION = 0x05
> +---
> +...
> +IPROTO_STATUS_KEY     = 0x00

5. The last 3 keys are unused.

> +---
> +...
> +-- gh-1148: test capabilities of stacked diagnostics bypassing net.box.
> +--
> +lua_func = [[function(tuple) local json = require('json') return json.encode(tuple) end]]
> +---
> +...
> +box.schema.func.create('f1', {body = lua_func, is_deterministic = true, is_sandboxed = true})
> +---
> +...
> +s = box.schema.space.create('s')
> +---
> +...
> +pk = s:create_index('pk')
> +---
> +...
> +idx = s:create_index('idx', {func = box.func.f1.id, parts = {{1, 'string'}}})
> +---
> +...
> +box.schema.user.grant('guest', 'read, write, execute', 'universe')
> +---
> +...
> +next_request_id = 16
> +---
> +...
> +header = { \
> +    [IPROTO_REQUEST_TYPE] = IPROTO_INSERT, \
> +    [IPROTO_SYNC]         = next_request_id, \
> +}
> +---
> +...
> +body = { \
> +    [IPROTO_SPACE_ID] = s.id, \
> +    [IPROTO_TUPLE]    = box.tuple.new({1}) \
> +}
> +---
> +...
> +uri = urilib.parse(box.cfg.listen)
> +---
> +...
> +sock = net_box.establish_connection(uri.host, uri.service)
> +---
> +...
> +response = iproto_request(sock, header, body)
> +---
> +...
> +sock:close()
> +---
> +- true
> +...
> +-- Both keys (obsolete and stack ones) are present in response.
> +--
> +assert(response.body[IPROTO_ERROR_STACK] ~= nil)
> +---
> +- true
> +...
> +assert(response.body[IPROTO_ERROR] ~= nil)
> +---
> +- true
> +...
> +err = response.body[IPROTO_ERROR_STACK][1]
> +---
> +...
> +assert(err[IPROTO_ERROR_MESSAGE] == body[IPROTO_ERROR])
> +---
> +- error: assertion failed!

6.Shouldn't it be true? You meant response.body, not just body?

> +...
> +err = response.body[IPROTO_ERROR_STACK][2]
> +---
> +...
> +assert(err[IPROTO_ERROR_CODE] ~= nil)
> +---
> +- true
> +...
> +assert(type(err[IPROTO_ERROR_CODE]) == 'number')
> +---
> +- true
> +...
> +assert(err[IPROTO_ERROR_MESSAGE] ~= nil)
> +---
> +- true
> +...
> +assert(type(err[IPROTO_ERROR_MESSAGE]) == 'string')

7. This test covers the previous assertion. type(nil)
is nil, so it won't be equal to 'string'.

> +---
> +- true
> +...
> +box.schema.user.revoke('guest', 'read,write,execute', 'universe')
> +---
> +...
> +s:drop()
> +---
> diff --git a/test/box/net.box.result b/test/box/net.box.result
> index e3dabf7d9..eeefc0e3f 100644
> --- a/test/box/net.box.result
> +++ b/test/box/net.box.result
> @@ -3902,6 +3902,66 @@ sock:close()
>  ---
>  - true
>  ...
> +-- gh-1148: test stacked diagnostics.
> +--
> +test_run:cmd("push filter \"file: .*\" to \"file: <filename>\"")

8. I removed the filter and nothing changed. Why do you
need it?

> +---
> +- true
> +...
> +lua_code = [[function(tuple) local json = require('json') return json.encode(tuple) end]]
> +---
> +...
> +box.schema.func.create('f1', {body = lua_code, is_deterministic = true, is_sandboxed = true})
> +---
> +...
> +s = box.schema.space.create('s')
> +---
> +...
> +pk = s:create_index('pk')
> +---
> +...
> +idx = s:create_index('idx', {func = box.func.f1.id, parts = {{1, 'string'}}})

9. Functional index is an overkill for such a simple test. You
may prefer creating a stored function, which creates a stack of 2
errors, and calls box.error(). That would be simpler IMO, and you can
test more than 2 errors too. The same for the iproto.test.lua. To get
that error you can use IPROTO_CALL. No need to create spaces, indexes,
bother with some 'determenistic sandboxed' shit. No even need to call
box.schema.func.create assuming you have 'universe execute' access.

> +---
> +...
> +box.schema.user.grant('guest', 'read,write,execute', 'universe')
> +---
> +...
> +c = net.connect(box.cfg.listen)
> +---
> +...
> +f = function(...) return c.space.s:insert(...) end
> +---
> +...
> +_, e = pcall(f, {1})
> +---
> +...
> +assert(e ~= nil)
> +---
> +- true
> +...
> +e:unpack().message
> +---
> +- 'Failed to build a key for functional index ''idx'' of space ''s'': can''t evaluate
> +  function'
> +...
> +e.prev.message
> +---
> +- error: '[string "return e.prev.message "]:1: attempt to index field ''prev'' (a
> +    nil value)'

10. Something is wrong here, no?