From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <tarantool-patches-bounce@freelists.org>
Received: from localhost (localhost [127.0.0.1])
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id D1F3C264A9
	for <tarantool-patches@freelists.org>; Wed,  7 Aug 2019 19:28:13 -0400 (EDT)
Received: from turing.freelists.org ([127.0.0.1])
	by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id PtijaDOaLerv for <tarantool-patches@freelists.org>;
	Wed,  7 Aug 2019 19:28:13 -0400 (EDT)
Received: from smtp57.i.mail.ru (smtp57.i.mail.ru [217.69.128.37])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 1D0542649B
	for <tarantool-patches@freelists.org>; Wed,  7 Aug 2019 19:28:13 -0400 (EDT)
Date: Thu, 8 Aug 2019 02:27:56 +0300
From: Alexander Turenko <alexander.turenko@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH v1 1/3] box: rfc for stacked diagnostic area in Tarantool
Message-ID: <20190807232755.s7gruhxeja75hob7@tkn_work_nb>
References: <cover.1564657285.git.kshcherbatov@tarantool.org>
 <101e2b28c29ebfc4cf58b45e534207fd4f6d9b3a.1564657285.git.kshcherbatov@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <101e2b28c29ebfc4cf58b45e534207fd4f6d9b3a.1564657285.git.kshcherbatov@tarantool.org>
Sender: tarantool-patches-bounce@freelists.org
Errors-to: tarantool-patches-bounce@freelists.org
Reply-To: tarantool-patches@freelists.org
List-Help: <mailto:ecartis@freelists.org?Subject=help>
List-Unsubscribe: <mailto:tarantool-patches-request@freelists.org?Subject=unsubscribe>
List-software: Ecartis version 1.0.0
List-Id: tarantool-patches <tarantool-patches.freelists.org>
List-Subscribe: <mailto:tarantool-patches-request@freelists.org?Subject=subscribe>
List-Owner: <mailto:>
List-post: <mailto:tarantool-patches@freelists.org>
List-Archive: <http://www.freelists.org/archives/tarantool-patches>
To: Kirill Shcherbatov <kshcherbatov@tarantool.org>
Cc: tarantool-patches@freelists.org, v.shpilevoy@tarantool.org, kostja@tarantool.org

Thanks for the document, the question is interesting.

There are three 'objects' we interested in this discussion: diagnostics
area, struct error and cdata<struct error> in Lua. Your document jumps
over all three objects and it is hard to understand what exactly you
propose.

I mean: what fields and functions are proposed to add for struct diag,
struct error and the Lua part? It would be easier to understand the
proposal if API changes will be split for those three parts and will be
described in an api-reference manner.

It also seems that you want to handle some part of #4398 ('Expose
box.error to be used by applications'). I propose to don't do it here.

I had the objection against a list of errors in the diagnostic area: a
user may want to save an error, perform some new commands, which can
rewrite a last diagnostic (replace the diagnostic list) and then (s)he
may want to look at the stack of reasons of the saved error. But now it
seems you discarded the idea. (BTW, it would be good to briefly mention
changes you made for a new revision of your document: the text looks
similar and it is hard to find differences.)

I think that struct error should have a pointer to its cause, which is
another error.

I like Java terminology here: cause is exactly what we want to save for
an error. Wrap / unwrap (Go terminology) look too abstract and maybe
even unclear for me. Maybe this terms appears from merry / xerror
packages, which were inspired to wrap built-in errors with additional
information (say, stack traces). It is up to maintainers, but my vote is
for 'cause' field, 'get_cause()' and 'set_cause()' functions (maybe with
some prefix, because it is C).

'Is' method (Go terminology, here it is named 'has') also don't look
self-descriptive enough for me. I would name it 'find_cause()'.

Despite I propose to concentrate changes on struct error and its Lua
counterpart, I think that it worth to add a helper to append an error to
the diagnostic area and set a previous last error as its cause. I would
name it 'diag_append()', but I don't have strict objections against
'diag_add()'. The former just looks a bit more intuitive for me.

BTW, it would be good to link documents you are lean on: I guess
https://go.googlesource.com/proposal/+/master/design/29934-error-values.md
is good candidate if you want to stick with Go terms. It also would be
good to mention stacked diagnostics in SQL and some Java reference /
article if you will agree with Java terms.

I doubt a bit that iproto change you proposed would be
backward-compatible: imagine that you connect from an old tarantool to a
new tarantool using net.box or handle errors using another existing
connector. I would send those messages as before, but add a field with
IPROTO_ERROR_LIST key in addition to existing IPROTO_ERROR (0x31) to a
body map. The value can be a list of maps like {code = <mp_uint>,
message = <mp_str>} or a list of list [<code>, <message>]. The
IPROTO_ERROR_LIST should start from a last (top) error I think.

Maybe we should use 'warnings' term here if this feature is intended to
be used for SQL warnings. If we want to use the proposed mechanics for
warnings, then my proposal re using 'cause' term looks doubtful. Don't
sure whether we should introduce some kind of warnings list for the
diagnostic area or reuse 'cause' / 'parent' / ... field of struct error.

I didn't look at your patch: I want to agree on terms and overall
approach first. At least one of maintainers should approve it.

WBR, Alexander Turenko.

On Thu, Aug 01, 2019 at 02:13:26PM +0300, Kirill Shcherbatov wrote:
> Part of #1148
> ---
>  doc/rfc/1148-stacked-diagnostics.md | 136 ++++++++++++++++++++++++++++
>  1 file changed, 136 insertions(+)
>  create mode 100644 doc/rfc/1148-stacked-diagnostics.md
> 
> diff --git a/doc/rfc/1148-stacked-diagnostics.md b/doc/rfc/1148-stacked-diagnostics.md
> new file mode 100644
> index 000000000..a007491aa
> --- /dev/null
> +++ b/doc/rfc/1148-stacked-diagnostics.md
> @@ -0,0 +1,136 @@
> +# Stacked Diagnostics in Tarantool
> +
> +* **Status**: In progress
> +* **Start date**: 30-08-2019
> +* **Authors**: Kirill Shcherbatov @kshcherbatov kshcherbatov@tarantool.org, Konstantin Osipov @kostja kostja@tarantool.org, Georgy Kirichenko @GeorgyKirichenko georgy@tarantool.org, @Totktonada Alexander Turenko alexander.turenko@tarantool.org
> +* **Issues**: [#1148](https://github.com/tarantool/<repository\>/issues/1148)
> +
> +## Summary
> +Support stacked diagnostics for Tarantool allows to accumulate all occurred errors during processing a request. This allows to better understand what has happened and handle errors
> +correspondingly.
> +
> +## Background and motivation
> +
> +Tarantool statements must produce diagnostic information that populates the diagnostics area. This is a Standard SQL requirement and other vendors and languages also have such feature.
> +Diagnostics area stack must contain a diagnostics area for each nested execution context.
> +
> +### Current Tarantool's error diagnostics
> +Currently Tarantool has `diag_set()` mechanism to set a diagnostic error.
> +The last error is exported with `box.error.last()` endpoint.
> +
> +In total there are 5 error classes in Tarantool. Each error class is associated with set of error codes (@see ClientError::get_errcode(const struct error *e)).
> +```
> +ClientError      ->     ClientError::errcode() -> ...;
> +OutOfMemory      ->     MEMORY_ISSUE
> +SystemError      ->     SYSTEM
> +CollationError   ->     CANT_CREATE_COLLATION
> +Lua error        ->     PROC_LUA
> +```
> +All system-defined errors are exported with their unique error codes in box.error folder by NAME.
> +
> +You may use box.error.new endpoint to create a new error instance of predefined type:
> +```
> +tarantool> t = box.error.new(box.error.CREATE_SPACE, "myspace", "just I can't")
> +tarantool> t:unpack()
> +---
> +- type: ClientError
> +  code: 9
> +  message: 'Failed to create space ''myspace'': just I can''t'
> +  trace:
> +  - file: '[string "t = box.error.new(box.error.CREATE_SPACE, "my..."]'
> +    line: 1
> +```
> +
> +User is allowed to define own errors with any code (including system errors) with
> +```
> +box.error.new{code = user_code, reason = user_error_msg}
> +```
> +where `user_code` must be greater that the biggest registered system error, say `10000`.
> +
> +Error cdata object has `:unpack()`, `:raise()`, `:match(...)`, `:__serialize()` methods and uniform `.type`, `.message` and `.trace` fields.
> +
> +
> +## Proposal
> +In some cases a diagnostic information must be more complicated.
> +For example, when persistent Lua function referenced by functional index has a bug in it's definition, Lua handler sets an diag message. Then functional index extractor code sets an own,  more specialized error.
> +
> +In this and similar situations, we need stack-based error diagnostics:
> +
> +- The serialize method should return the most recent error in stack:
> +```
> +tarantool> box.error.last()
> +Failed to build a key for functional index ''idx'' of space ''withdata': function error
> +```
> +
> +- The `:unpack()` method must return the complete diagnostics stack - a table of errors sorted in their historical order:
> +```
> +tarantool> box.error.last():unpack()
> +---
> +  - type: LuajitError
> +    message: '[string "return function(tuple)..."]:2: attempt to call global
> +     ''require'' (a nil value)'
> +    trace:
> +    - file: /home/kir/WORK/tarantool/src/lua/utils.c
> +      line: 1000
> +  - type: ClientError
> +    code: 198
> +    message: 'Failed to build a key for functional index ''idx'' of space ''withdata'':
> +      function error
> +    trace:
> +    - file: /home/kir/WORK/tarantool/src/box/key_list.c
> +      line: 68
> +```
> +
> +- We need to introduce `next_error:wrap(error)` and  `error:unwrap()` methods.
> +The `wrap` function can take any previous and last error objects to append error to the end of next_error trace. The next_error backtrace is expected to be empty.
> +The `unwrap` function removes the element's parent and returns the unwrapped content. This is effectively the inverse of the `wrap()` method.
> +```
> +Workaround:
> +
> +function create_new_user()
> +        name, err = pcall(box.internal.func_call,
> +                          box.func.generate_user_name.name)
> +        if not name then
> +                local e = box.error.new({
> +                              code = APP_ERROR_CODE,
> +                              reason = "create_new_user"}):
> +                e:wrap(err):raise()
> +        end
> +        box.schema.user.create(name)
> +end
> +```
> +
> +- We also need `error:has` method to test whether given error object contain given error code:
> +```
> +box.error.last():has(box.error.LuajitError)
> +```
> +
> +## Detailed design
> +
> +Diagnostic area object has a good encapsulated API.
> +
> +Lets extend error object with a pointer to it's reason error and introduce a few new methods (we need to rename diag_add_error to diag_set_error because it replaces an existent diag error in fact):
> +```
> +struct error {
> +...
> +/* A pointer to the reason error. */
> +	struct error *reason;
> +};
> +
> +/**
> + * Add a new error to the diagnostics area: the previous error
> + * becomes a reason of a current.
> + * \param diag diagnostics area
> + * \param e error to add
> + */
> +static inline void
> +diag_add_error(struct diag *diag, struct error *e);
> +
> +/** Same as diag_set, but softly extend diagnostic information. */
> +#define diag_add(class, ...)
> +```
> +
> +### IPROTO Communication
> +Currently errors are sent as (IPROTO_TYPE_ERROR | errcode) messages with error message string payload.
> +Lets introduce a new IPROTO_TYPE_ERROR_TRACE error code with
> +[[errcode1, "errstring1"], [errcode2, "errstring2"]..] payload, where errors are serialized in historical order.
> -- 
> 2.22.0
>