[tarantool-patches] Re: [PATCH v1 1/3] box: rfc for stacked diagnostic area in Tarantool

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Aug 6 00:16:11 MSK 2019


Hi! Thanks for the patch!

See 9 comments below.

On 01/08/2019 13:13, Kirill Shcherbatov wrote:
> Part of #1148
> ---
>  doc/rfc/1148-stacked-diagnostics.md | 136 ++++++++++++++++++++++++++++
>  1 file changed, 136 insertions(+)
>  create mode 100644 doc/rfc/1148-stacked-diagnostics.md
> 
> diff --git a/doc/rfc/1148-stacked-diagnostics.md b/doc/rfc/1148-stacked-diagnostics.md
> new file mode 100644
> index 000000000..a007491aa
> --- /dev/null
> +++ b/doc/rfc/1148-stacked-diagnostics.md
> @@ -0,0 +1,136 @@
> +# Stacked Diagnostics in Tarantool
> +
> +* **Status**: In progress
> +* **Start date**: 30-08-2019

1. Today is 5 August. I guess it is a typo.

> +* **Authors**: Kirill Shcherbatov @kshcherbatov kshcherbatov at tarantool.org, Konstantin Osipov @kostja kostja at tarantool.org, Georgy Kirichenko @GeorgyKirichenko georgy at tarantool.org, @Totktonada Alexander Turenko alexander.turenko at tarantool.org
> +* **Issues**: [#1148](https://github.com/tarantool/<repository\>/issues/1148)
> +
> +## Summary
> +Support stacked diagnostics for Tarantool allows to accumulate all occurred errors during processing a request. This allows to better understand what has happened and handle errors
> +correspondingly.
> +
> +## Background and motivation
> +
> +Tarantool statements must produce diagnostic information that populates the diagnostics area. This is a Standard SQL requirement and other vendors and languages also have such feature.
> +Diagnostics area stack must contain a diagnostics area for each nested execution context.
> +
> +### Current Tarantool's error diagnostics
> +Currently Tarantool has `diag_set()` mechanism to set a diagnostic error.
> +The last error is exported with `box.error.last()` endpoint.
> +
> +In total there are 5 error classes in Tarantool.
> Each error class is associated with set of error codes (@see ClientError::get_errcode(const struct error *e)).
> +```
> +ClientError      ->     ClientError::errcode() -> ...;
> +OutOfMemory      ->     MEMORY_ISSUE
> +SystemError      ->     SYSTEM
> +CollationError   ->     CANT_CREATE_COLLATION
> +Lua error        ->     PROC_LUA
> +```
> +All system-defined errors are exported with their unique error codes in box.error folder by NAME.
> +
> +You may use box.error.new endpoint to create a new error instance of predefined type:
> +```
> +tarantool> t = box.error.new(box.error.CREATE_SPACE, "myspace", "just I can't")
> +tarantool> t:unpack()
> +---
> +- type: ClientError
> +  code: 9
> +  message: 'Failed to create space ''myspace'': just I can''t'
> +  trace:
> +  - file: '[string "t = box.error.new(box.error.CREATE_SPACE, "my..."]'
> +    line: 1
> +```
> +
> +User is allowed to define own errors with any code (including system errors) with
> +```
> +box.error.new{code = user_code, reason = user_error_msg}
> +```
> +where `user_code` must be greater that the biggest registered system error, say `10000`.

2. 'greater than'.

> +
> +Error cdata object has `:unpack()`, `:raise()`, `:match(...)`, `:__serialize()` methods and uniform `.type`, `.message` and `.trace` fields.
> +
> +
> +## Proposal
> +In some cases a diagnostic information must be more complicated.
> +For example, when persistent Lua function referenced by functional index has a bug in it's definition, Lua handler sets an diag message. Then functional index extractor code sets an own,  more specialized error.
> +
> +In this and similar situations, we need stack-based error diagnostics:
> +
> +- The serialize method should return the most recent error in stack:
> +```
> +tarantool> box.error.last()
> +Failed to build a key for functional index ''idx'' of space ''withdata': function error
> +```
> +
> +- The `:unpack()` method must return the complete diagnostics stack - a table of errors sorted in their historical order:
> +```
> +tarantool> box.error.last():unpack()
> +---
> +  - type: LuajitError
> +    message: '[string "return function(tuple)..."]:2: attempt to call global
> +     ''require'' (a nil value)'
> +    trace:
> +    - file: /home/kir/WORK/tarantool/src/lua/utils.c
> +      line: 1000
> +  - type: ClientError
> +    code: 198
> +    message: 'Failed to build a key for functional index ''idx'' of space ''withdata'':
> +      function error
> +    trace:
> +    - file: /home/kir/WORK/tarantool/src/box/key_list.c
> +      line: 68
> +```
> +

3. VShard uses unpack: https://github.com/tarantool/vshard/blob/a6826b26f9ab38e338a34cb7dd7f46dabbc67af5/vshard/error.lua#L149
Is it compatible with what you are doing? Here I need to turn an error into a table to
set a different serialization method.

4. Have you considered an idea to unpack errors not as an array, but
as a list? When each error object has a field 'reason' pointing to
another object, and so on. It would allow not to change unpack() output
format - it would still return the newest error, but with a new field.

> +- We need to introduce `next_error:wrap(error)` and  `error:unwrap()` methods.
> +The `wrap` function can take any previous and last error objects to append error to the end of next_error trace. The next_error backtrace is expected to be empty.
> +The `unwrap` function removes the element's parent and returns the unwrapped content. This is effectively the inverse of the `wrap()` method.
> +```
> +Workaround:

5. Workaround for what?

> +
> +function create_new_user()
> +        name, err = pcall(box.internal.func_call,
> +                          box.func.generate_user_name.name)
> +        if not name then
> +                local e = box.error.new({
> +                              code = APP_ERROR_CODE,
> +                              reason = "create_new_user"}):
> +                e:wrap(err):raise()
> +        end
> +        box.schema.user.create(name)
> +end
> +```
> +
> +- We also need `error:has` method to test whether given error object contain given error code:

6. 'object contains'.

> +```
> +box.error.last():has(box.error.LuajitError)

7. Something is wrong.

tarantool> box.error.LuajitError
---
- null
...

> +```
> +
> +## Detailed design
> +
> +Diagnostic area object has a good encapsulated API.
> +
> +Lets extend error object with a pointer to it's reason error and introduce a few new methods (we need to rename diag_add_error to diag_set_error because it replaces an existent diag error in fact):
> +```
> +struct error {
> +...
> +/* A pointer to the reason error. */

8. Broken alignment.

> +	struct error *reason;
> +};
> +
> +/**
> + * Add a new error to the diagnostics area: the previous error
> + * becomes a reason of a current.
> + * \param diag diagnostics area
> + * \param e error to add
> + */
> +static inline void
> +diag_add_error(struct diag *diag, struct error *e);
> +
> +/** Same as diag_set, but softly extend diagnostic information. */
> +#define diag_add(class, ...)
> +```
> +
> +### IPROTO Communication
> +Currently errors are sent as (IPROTO_TYPE_ERROR | errcode) messages with error message string payload.
> +Lets introduce a new IPROTO_TYPE_ERROR_TRACE error code with
> +[[errcode1, "errstring1"], [errcode2, "errstring2"]..] payload, where errors are serialized in historical order.
> 

9. Please, describe the new binary protocol in the similar way as it
is done here: https://github.com/tarantool/tarantool/blob/master/doc/rfc/3328-wire_protocol.md#body
And add it to the docbot request with code values included.




More information about the Tarantool-patches mailing list