From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id D1F3C264A9 for ; Wed, 7 Aug 2019 19:28:13 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id PtijaDOaLerv for ; Wed, 7 Aug 2019 19:28:13 -0400 (EDT) Received: from smtp57.i.mail.ru (smtp57.i.mail.ru [217.69.128.37]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 1D0542649B for ; Wed, 7 Aug 2019 19:28:13 -0400 (EDT) Date: Thu, 8 Aug 2019 02:27:56 +0300 From: Alexander Turenko Subject: [tarantool-patches] Re: [PATCH v1 1/3] box: rfc for stacked diagnostic area in Tarantool Message-ID: <20190807232755.s7gruhxeja75hob7@tkn_work_nb> References: <101e2b28c29ebfc4cf58b45e534207fd4f6d9b3a.1564657285.git.kshcherbatov@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <101e2b28c29ebfc4cf58b45e534207fd4f6d9b3a.1564657285.git.kshcherbatov@tarantool.org> Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: Kirill Shcherbatov Cc: tarantool-patches@freelists.org, v.shpilevoy@tarantool.org, kostja@tarantool.org Thanks for the document, the question is interesting. There are three 'objects' we interested in this discussion: diagnostics area, struct error and cdata in Lua. Your document jumps over all three objects and it is hard to understand what exactly you propose. I mean: what fields and functions are proposed to add for struct diag, struct error and the Lua part? It would be easier to understand the proposal if API changes will be split for those three parts and will be described in an api-reference manner. It also seems that you want to handle some part of #4398 ('Expose box.error to be used by applications'). I propose to don't do it here. I had the objection against a list of errors in the diagnostic area: a user may want to save an error, perform some new commands, which can rewrite a last diagnostic (replace the diagnostic list) and then (s)he may want to look at the stack of reasons of the saved error. But now it seems you discarded the idea. (BTW, it would be good to briefly mention changes you made for a new revision of your document: the text looks similar and it is hard to find differences.) I think that struct error should have a pointer to its cause, which is another error. I like Java terminology here: cause is exactly what we want to save for an error. Wrap / unwrap (Go terminology) look too abstract and maybe even unclear for me. Maybe this terms appears from merry / xerror packages, which were inspired to wrap built-in errors with additional information (say, stack traces). It is up to maintainers, but my vote is for 'cause' field, 'get_cause()' and 'set_cause()' functions (maybe with some prefix, because it is C). 'Is' method (Go terminology, here it is named 'has') also don't look self-descriptive enough for me. I would name it 'find_cause()'. Despite I propose to concentrate changes on struct error and its Lua counterpart, I think that it worth to add a helper to append an error to the diagnostic area and set a previous last error as its cause. I would name it 'diag_append()', but I don't have strict objections against 'diag_add()'. The former just looks a bit more intuitive for me. BTW, it would be good to link documents you are lean on: I guess https://go.googlesource.com/proposal/+/master/design/29934-error-values.md is good candidate if you want to stick with Go terms. It also would be good to mention stacked diagnostics in SQL and some Java reference / article if you will agree with Java terms. I doubt a bit that iproto change you proposed would be backward-compatible: imagine that you connect from an old tarantool to a new tarantool using net.box or handle errors using another existing connector. I would send those messages as before, but add a field with IPROTO_ERROR_LIST key in addition to existing IPROTO_ERROR (0x31) to a body map. The value can be a list of maps like {code = , message = } or a list of list [, ]. The IPROTO_ERROR_LIST should start from a last (top) error I think. Maybe we should use 'warnings' term here if this feature is intended to be used for SQL warnings. If we want to use the proposed mechanics for warnings, then my proposal re using 'cause' term looks doubtful. Don't sure whether we should introduce some kind of warnings list for the diagnostic area or reuse 'cause' / 'parent' / ... field of struct error. I didn't look at your patch: I want to agree on terms and overall approach first. At least one of maintainers should approve it. WBR, Alexander Turenko. On Thu, Aug 01, 2019 at 02:13:26PM +0300, Kirill Shcherbatov wrote: > Part of #1148 > --- > doc/rfc/1148-stacked-diagnostics.md | 136 ++++++++++++++++++++++++++++ > 1 file changed, 136 insertions(+) > create mode 100644 doc/rfc/1148-stacked-diagnostics.md > > diff --git a/doc/rfc/1148-stacked-diagnostics.md b/doc/rfc/1148-stacked-diagnostics.md > new file mode 100644 > index 000000000..a007491aa > --- /dev/null > +++ b/doc/rfc/1148-stacked-diagnostics.md > @@ -0,0 +1,136 @@ > +# Stacked Diagnostics in Tarantool > + > +* **Status**: In progress > +* **Start date**: 30-08-2019 > +* **Authors**: Kirill Shcherbatov @kshcherbatov kshcherbatov@tarantool.org, Konstantin Osipov @kostja kostja@tarantool.org, Georgy Kirichenko @GeorgyKirichenko georgy@tarantool.org, @Totktonada Alexander Turenko alexander.turenko@tarantool.org > +* **Issues**: [#1148](https://github.com/tarantool//issues/1148) > + > +## Summary > +Support stacked diagnostics for Tarantool allows to accumulate all occurred errors during processing a request. This allows to better understand what has happened and handle errors > +correspondingly. > + > +## Background and motivation > + > +Tarantool statements must produce diagnostic information that populates the diagnostics area. This is a Standard SQL requirement and other vendors and languages also have such feature. > +Diagnostics area stack must contain a diagnostics area for each nested execution context. > + > +### Current Tarantool's error diagnostics > +Currently Tarantool has `diag_set()` mechanism to set a diagnostic error. > +The last error is exported with `box.error.last()` endpoint. > + > +In total there are 5 error classes in Tarantool. Each error class is associated with set of error codes (@see ClientError::get_errcode(const struct error *e)). > +``` > +ClientError -> ClientError::errcode() -> ...; > +OutOfMemory -> MEMORY_ISSUE > +SystemError -> SYSTEM > +CollationError -> CANT_CREATE_COLLATION > +Lua error -> PROC_LUA > +``` > +All system-defined errors are exported with their unique error codes in box.error folder by NAME. > + > +You may use box.error.new endpoint to create a new error instance of predefined type: > +``` > +tarantool> t = box.error.new(box.error.CREATE_SPACE, "myspace", "just I can't") > +tarantool> t:unpack() > +--- > +- type: ClientError > + code: 9 > + message: 'Failed to create space ''myspace'': just I can''t' > + trace: > + - file: '[string "t = box.error.new(box.error.CREATE_SPACE, "my..."]' > + line: 1 > +``` > + > +User is allowed to define own errors with any code (including system errors) with > +``` > +box.error.new{code = user_code, reason = user_error_msg} > +``` > +where `user_code` must be greater that the biggest registered system error, say `10000`. > + > +Error cdata object has `:unpack()`, `:raise()`, `:match(...)`, `:__serialize()` methods and uniform `.type`, `.message` and `.trace` fields. > + > + > +## Proposal > +In some cases a diagnostic information must be more complicated. > +For example, when persistent Lua function referenced by functional index has a bug in it's definition, Lua handler sets an diag message. Then functional index extractor code sets an own, more specialized error. > + > +In this and similar situations, we need stack-based error diagnostics: > + > +- The serialize method should return the most recent error in stack: > +``` > +tarantool> box.error.last() > +Failed to build a key for functional index ''idx'' of space ''withdata': function error > +``` > + > +- The `:unpack()` method must return the complete diagnostics stack - a table of errors sorted in their historical order: > +``` > +tarantool> box.error.last():unpack() > +--- > + - type: LuajitError > + message: '[string "return function(tuple)..."]:2: attempt to call global > + ''require'' (a nil value)' > + trace: > + - file: /home/kir/WORK/tarantool/src/lua/utils.c > + line: 1000 > + - type: ClientError > + code: 198 > + message: 'Failed to build a key for functional index ''idx'' of space ''withdata'': > + function error > + trace: > + - file: /home/kir/WORK/tarantool/src/box/key_list.c > + line: 68 > +``` > + > +- We need to introduce `next_error:wrap(error)` and `error:unwrap()` methods. > +The `wrap` function can take any previous and last error objects to append error to the end of next_error trace. The next_error backtrace is expected to be empty. > +The `unwrap` function removes the element's parent and returns the unwrapped content. This is effectively the inverse of the `wrap()` method. > +``` > +Workaround: > + > +function create_new_user() > + name, err = pcall(box.internal.func_call, > + box.func.generate_user_name.name) > + if not name then > + local e = box.error.new({ > + code = APP_ERROR_CODE, > + reason = "create_new_user"}): > + e:wrap(err):raise() > + end > + box.schema.user.create(name) > +end > +``` > + > +- We also need `error:has` method to test whether given error object contain given error code: > +``` > +box.error.last():has(box.error.LuajitError) > +``` > + > +## Detailed design > + > +Diagnostic area object has a good encapsulated API. > + > +Lets extend error object with a pointer to it's reason error and introduce a few new methods (we need to rename diag_add_error to diag_set_error because it replaces an existent diag error in fact): > +``` > +struct error { > +... > +/* A pointer to the reason error. */ > + struct error *reason; > +}; > + > +/** > + * Add a new error to the diagnostics area: the previous error > + * becomes a reason of a current. > + * \param diag diagnostics area > + * \param e error to add > + */ > +static inline void > +diag_add_error(struct diag *diag, struct error *e); > + > +/** Same as diag_set, but softly extend diagnostic information. */ > +#define diag_add(class, ...) > +``` > + > +### IPROTO Communication > +Currently errors are sent as (IPROTO_TYPE_ERROR | errcode) messages with error message string payload. > +Lets introduce a new IPROTO_TYPE_ERROR_TRACE error code with > +[[errcode1, "errstring1"], [errcode2, "errstring2"]..] payload, where errors are serialized in historical order. > -- > 2.22.0 >