From: Nikita Pettik <korablev@tarantool.org> To: tarantool-patches@dev.tarantool.org Cc: v.shpilevoy@tarantool.org Subject: [Tarantool-patches] [PATCH v2 01/10] box: rfc for stacked diagnostic area Date: Wed, 25 Mar 2020 04:42:57 +0300 [thread overview] Message-ID: <a9553e91674504017befa6890c60298e0a67328a.1585097339.git.korablev@tarantool.org> (raw) In-Reply-To: <cover.1585097339.git.korablev@tarantool.org> In-Reply-To: <cover.1585097339.git.korablev@tarantool.org> From: Kirill Shcherbatov <kshcherbatov@tarantool.org> Part of #1148 --- doc/rfc/1148-stacked-diagnostics.md | 225 ++++++++++++++++++++++++++++ 1 file changed, 225 insertions(+) create mode 100644 doc/rfc/1148-stacked-diagnostics.md diff --git a/doc/rfc/1148-stacked-diagnostics.md b/doc/rfc/1148-stacked-diagnostics.md new file mode 100644 index 000000000..ccc588d46 --- /dev/null +++ b/doc/rfc/1148-stacked-diagnostics.md @@ -0,0 +1,225 @@ +# Stacked Diagnostics + +* **Status**: In progress +* **Start date**: 30-07-2019 +* **Authors**: Kirill Shcherbatov @kshcherbatov kshcherbatov@tarantool.org, + Pettik Nikita @Korablev77 korablev@tarantool.org +* **Issues**: [#1148](https://github.com/tarantool/<repository\>/issues/1148) + + +## Summary + +The document describes a stacked diagnostics feature. It is needed for the cases, +when there is a complex and huge call stack with variety of subsystems in it. +It may turn out that the most low-level error is not descriptive enough. In turn +user may want to be able to look at errors along the whole call stack from the +place where a first error happened. In terms of implementation single fiber's +error object is going to be replaced with a list of objects forming diagnostic +stack. Its first element will always be created by the deepest and the most +basic error. Both C and Lua APIs are extended to support adding errors in stack. + +## Background and motivation + +Support stacked diagnostics for Tarantool allows to accumulate all errors +occurred during request processing. It allows to better understand what +happened, and handle errors appropriately. Consider following example: +persistent Lua function referenced by functional index has a bug in it's +definition, Lua handler sets an diag message. Then functional index extractor +code setups an own, more specialized error. + +### Current error diagnostics + +Currently Tarantool has `diag_set()` mechanism to set a diagnostic error. +Object representing error featuring following properties: + - type (string) error’s C++ class; + - code (number) error’s number; + - message (string) error’s message; + - file (string) Tarantool source file; + - line (number) line number in the Tarantool source file. + +The last error raised is exported with `box.error.last()` function. + +Type of error is represented by a few C++ classes (all are inherited from +Exception class). For instance hierarchy for ClientError is following: +``` +ClientError + | LoggedError + | AccessDeniedError + | UnsupportedIndexFeature +``` + +All codes and names of ClientError class are available in box.error. +User is able to create a new error instance of predefined type using +box.error.new() function. For example: +``` +tarantool> t = box.error.new(box.error.CREATE_SPACE, "myspace", "just cause") +tarantool> t:unpack() +--- +- type: ClientError + code: 9 + message: 'Failed to create space ''myspace'': just cause' + trace: + - file: '[string "t = box.error.new(box.error.CREATE_SPACE, "my..."]' + line: 1 +``` + +User is also capable of defining own errors with any code by means of: +``` +box.error.new({code = user_code, reason = user_error_msg}) +``` +For instance: +``` +e = box.error.new({code = 500, reason = 'just cause'}) +``` + +Error cdata object has `:unpack()`, `:raise()`, `:match(...)`, `:__serialize()` +methods and `.type`, `.message` and `.trace` fields. + +## Proposal + +In some cases a diagnostic area should be more complicated than +one last raised error to provide decent information concerning incident (see +motivating example above). Without stacked diagnostic area, only last error is +delivered to user. One way to deal with this problem is to introduce stack +accumulating all errors happened during request processing. + +### C API + +Let's keep existent `diag_set()` method as is. It is supposed to replace the +last error in diagnostic area with a new one. To add new error at the top of +existing one, let's introduce new method `diag_add()`. It is assumed to keep +an existent error message in diagnostic area (if any) and sets it as a previous +error for a recently-constructed error object. Note that `diag_set()` is not +going to preserve pointer to previous error which is held in error to be +substituted. To illustrate last point consider example: + +``` +0. Errors: <NULL> +1. diag_set(code = 1) +Errors: <e1(code = 1) -> NULL> +2. diag_add(code = 2) +Errors: <e1(code = 1) -> e2(code = 2) -> NULL> +3. diag_set(code = 3) +Errors: <e3(code = 3) -> NULL> +``` + +Hence, developer takes responsibility of placing `diag_set()` where the most +basic error should be raised. For instance, if during request processing +`diag_add()` is called before `diag_set()` then it will result in inheritance +of all errors from previous error raise: + +``` +-- Processing of request #1 +1. diag_set(code = 1) +Errors: <e1(code = 1) -> NULL> +2. diag_add(code = 2) +Errors: <e1(code = 1) -> e2(code = 2) -> NULL> +-- End of execution + +-- Processing of request #2 +1. diag_add(code = 1) +Errors: <e1(code = 1) -> e2(code = 2) -> e3(code = 1) -> NULL> +-- End of execution +``` + +As a result, at the end of execution of second request, three errors in +stack are reported instead of one. + +Another way to resolve this issue is to erase diagnostic area before +request processing. However, it breaks current user-visible behaviour +since box.error.last() will preserve last occurred error only until execution +of the next request. + +The diagnostic area (now) contains (nothing but) pointer to the top error: +``` +struct diag { + struct error *last; +}; + +``` + +To organize errors in a list let's extend error structure with pointer to +the previous element. Or alternatively, add member of any data structure +providing list properties (struct rlist, struct stailq or whatever): +``` +struct diag { + struct stailq *errors; +}; + +struct error { + ... + struct stailq_entry *in_errors; +}; +``` +When error is set to diagnostics area, its reference counter is incremented; +on the other hand if error is added (i.e. linked to the head of diagnostics +area list), its reference counter remains unchanged. The same concerns linking +two errors: only counter of referenced error is incremented. During error +destruction (that is the moment when error's reference counter hits 0 value) +the next error in the list (if any) is also affected: its reference counter +is decremented as well. + +### Lua API + +Tarantool returns a last-set (diag::last) error as `cdata` object from central +diagnostic area to Lua in case of error. User should be unable to modify it +(since it is considered to be a bad practice - in fact object doesn't belong +to user). On the other hand, user needs an ability to inspect a collected +diagnostic information. Hence, let's extend the error object API with a method +which provides the way to get the previous error (reason): `:prev()` (and +correspondingly `.prev` field). + +``` +-- Return a reason error object for given error object 'e' +-- (when exists, nil otherwise). +e:prev(error) == error.prev +``` + +Furthermore, let's extend signature of `box.error.new()` with new (optional) +argument - 'prev' - previous error object: + +``` +e1 = box.error.new({code = 111, reason = "just cause"}) +e2 = box.error.new({code = 222, reason = "just cause x2", prev = e1}) +``` + +User may want to link already existing errors. To achieve this let's add +`set_prev` method to error object so that one can join two errors: +``` +e1 = box.error.new({code = 111, reason = "just cause"}) +e2 = box.error.new({code = 222, reason = "just cause x2"}) +... +e2.set_prev(e1) -- e2.prev == e1 +``` +### Binary protocol + +Currently errors are sent as `(IPROTO_ERROR | errcode)` response with an +string message containing error details as a payload. There are not so many +options to extend current protocol wihtout breaking backward compatibility +(which is obviously one of implementation requirements). One way is to extend +existent binary protocol with a new key IPROTO_ERROR_STACK (or +IPROTO_ERROR_REASON or simply IPROTO_ERROR_V2): +``` +{ + // backward compatibility + IPROTO_ERROR: "the most recent error message", + // modern error message + IPROTO_ERROR_STACK: { + { + // the most recent error object + IPROTO_ERROR_CODE: error_code_number, + IPROTO_ERROR_MESSAGE: error_message_string, + }, + ... + { + // the oldest (reason) error object + }, + } +} +``` + +IPROTO_ERROR is always sent (as in older versions) in case of error. +IPROTO_ERROR_STACK is presented in response only if there's at least two +elements in diagnostic list. Map which contains error stack can be optimized +in terms of space, so that avoid including error which is already encoded +in IPROTO_ERROR. -- 2.17.1
next prev parent reply other threads:[~2020-03-25 1:43 UTC|newest] Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-03-25 1:42 [Tarantool-patches] [PATCH v2 00/10] Stacked diagnostics Nikita Pettik 2020-03-25 1:42 ` Nikita Pettik [this message] 2020-03-25 8:27 ` [Tarantool-patches] [PATCH v2 01/10] box: rfc for stacked diagnostic area Konstantin Osipov 2020-03-25 14:08 ` Nikita Pettik 2020-03-25 1:42 ` [Tarantool-patches] [PATCH v2 02/10] box: rename diag_add_error to diag_set_error Nikita Pettik 2020-03-25 8:27 ` Konstantin Osipov 2020-03-26 0:22 ` Vladislav Shpilevoy 2020-03-26 12:31 ` Nikita Pettik 2020-03-25 1:42 ` [Tarantool-patches] [PATCH v2 03/10] test: move box.error tests to box/error.test.lua Nikita Pettik 2020-03-25 8:28 ` Konstantin Osipov 2020-03-26 0:22 ` Vladislav Shpilevoy 2020-03-26 12:31 ` Nikita Pettik 2020-03-25 1:43 ` [Tarantool-patches] [PATCH v2 04/10] box/error: introduce box.error.set() method Nikita Pettik 2020-03-25 8:33 ` Konstantin Osipov 2020-03-25 17:41 ` Nikita Pettik 2020-03-26 0:22 ` Vladislav Shpilevoy 2020-03-26 12:31 ` Nikita Pettik 2020-03-25 1:43 ` [Tarantool-patches] [PATCH v2 05/10] box/error: don't set error created via box.error.new to diag Nikita Pettik 2020-03-26 16:50 ` Konstantin Osipov 2020-03-26 17:59 ` Nikita Pettik 2020-03-26 18:06 ` Nikita Pettik 2020-03-26 18:07 ` Alexander Turenko 2020-03-27 0:19 ` Vladislav Shpilevoy 2020-03-27 13:09 ` Nikita Pettik 2020-03-25 1:43 ` [Tarantool-patches] [PATCH v2 06/10] box: introduce stacked diagnostic area Nikita Pettik 2020-03-26 16:54 ` Konstantin Osipov 2020-03-26 18:03 ` Nikita Pettik 2020-03-26 18:08 ` Konstantin Osipov 2020-03-28 18:40 ` Vladislav Shpilevoy 2020-04-01 16:09 ` Nikita Pettik 2020-04-02 0:29 ` Vladislav Shpilevoy 2020-04-02 17:42 ` Nikita Pettik 2020-04-02 22:20 ` Vladislav Shpilevoy 2020-04-03 1:54 ` Nikita Pettik 2020-04-03 23:17 ` Vladislav Shpilevoy 2020-03-28 18:59 ` Vladislav Shpilevoy 2020-03-31 17:44 ` Nikita Pettik 2020-04-02 0:29 ` Vladislav Shpilevoy 2020-04-02 14:16 ` Nikita Pettik 2020-03-25 1:43 ` [Tarantool-patches] [PATCH v2 07/10] box: use stacked diagnostic area for functional indexes Nikita Pettik 2020-03-30 23:24 ` Vladislav Shpilevoy 2020-04-01 15:53 ` Nikita Pettik 2020-03-25 1:43 ` [Tarantool-patches] [PATCH v2 08/10] box/error: clarify purpose of reference counting in struct error Nikita Pettik 2020-03-30 23:24 ` Vladislav Shpilevoy 2020-03-25 1:43 ` [Tarantool-patches] [PATCH v2 09/10] iproto: refactor error encoding with mpstream Nikita Pettik 2020-03-30 23:24 ` Vladislav Shpilevoy 2020-04-01 15:54 ` Nikita Pettik 2020-03-25 1:43 ` [Tarantool-patches] [PATCH v2 10/10] iproto: support error stacked diagnostic area Nikita Pettik 2020-03-30 23:24 ` Vladislav Shpilevoy 2020-04-01 16:26 ` Nikita Pettik 2020-04-01 22:24 ` Nikita Pettik 2020-04-02 0:29 ` Vladislav Shpilevoy 2020-04-02 14:01 ` Nikita Pettik 2020-04-02 22:20 ` Vladislav Shpilevoy 2020-04-03 2:16 ` Nikita Pettik 2020-04-03 23:17 ` Vladislav Shpilevoy 2020-04-06 11:07 ` Nikita Pettik
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=a9553e91674504017befa6890c60298e0a67328a.1585097339.git.korablev@tarantool.org \ --to=korablev@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v2 01/10] box: rfc for stacked diagnostic area' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox