Tarantool development patches archive
 help / color / mirror / Atom feed
From: Nikita Pettik <korablev@tarantool.org>
To: tarantool-patches@dev.tarantool.org
Cc: v.shpilevoy@tarantool.org
Subject: [Tarantool-patches] [PATCH v2 01/10] box: rfc for stacked diagnostic area
Date: Wed, 25 Mar 2020 04:42:57 +0300	[thread overview]
Message-ID: <a9553e91674504017befa6890c60298e0a67328a.1585097339.git.korablev@tarantool.org> (raw)
In-Reply-To: <cover.1585097339.git.korablev@tarantool.org>
In-Reply-To: <cover.1585097339.git.korablev@tarantool.org>

From: Kirill Shcherbatov <kshcherbatov@tarantool.org>

Part of #1148
---
 doc/rfc/1148-stacked-diagnostics.md | 225 ++++++++++++++++++++++++++++
 1 file changed, 225 insertions(+)
 create mode 100644 doc/rfc/1148-stacked-diagnostics.md

diff --git a/doc/rfc/1148-stacked-diagnostics.md b/doc/rfc/1148-stacked-diagnostics.md
new file mode 100644
index 000000000..ccc588d46
--- /dev/null
+++ b/doc/rfc/1148-stacked-diagnostics.md
@@ -0,0 +1,225 @@
+# Stacked Diagnostics
+
+* **Status**: In progress
+* **Start date**: 30-07-2019
+* **Authors**: Kirill Shcherbatov @kshcherbatov kshcherbatov@tarantool.org,
+               Pettik Nikita @Korablev77 korablev@tarantool.org
+* **Issues**: [#1148](https://github.com/tarantool/<repository\>/issues/1148)
+
+
+## Summary
+
+The document describes a stacked diagnostics feature. It is needed for the cases,
+when there is a complex and huge call stack with variety of subsystems in it.
+It may turn out that the most low-level error is not descriptive enough. In turn
+user may want to be able to look at errors along the whole call stack from the
+place where a first error happened. In terms of implementation single fiber's
+error object is going to be replaced with a list of objects forming diagnostic
+stack. Its first element will always be created by the deepest and the most
+basic error. Both C and Lua APIs are extended to support adding errors in stack.
+
+## Background and motivation
+
+Support stacked diagnostics for Tarantool allows to accumulate all errors
+occurred during request processing. It allows to better understand what
+happened, and handle errors appropriately. Consider following example:
+persistent Lua function referenced by functional index has a bug in it's
+definition, Lua handler sets an diag message. Then functional index extractor
+code setups an own, more specialized error.
+
+### Current error diagnostics
+
+Currently Tarantool has `diag_set()` mechanism to set a diagnostic error.
+Object representing error featuring following properties:
+ - type (string) error’s C++ class;
+ - code (number) error’s number;
+ - message (string) error’s message;
+ - file (string) Tarantool source file;
+ - line (number) line number in the Tarantool source file.
+
+The last error raised is exported with `box.error.last()` function.
+
+Type of error is represented by a few C++ classes (all are inherited from
+Exception class). For instance hierarchy for ClientError is following:
+```
+ClientError
+ | LoggedError
+ | AccessDeniedError
+ | UnsupportedIndexFeature
+```
+
+All codes and names of ClientError class are available in box.error.
+User is able to create a new error instance of predefined type using
+box.error.new() function. For example:
+```
+tarantool> t = box.error.new(box.error.CREATE_SPACE, "myspace", "just cause")
+tarantool> t:unpack()
+---
+- type: ClientError
+  code: 9
+  message: 'Failed to create space ''myspace'': just cause'
+  trace:
+  - file: '[string "t = box.error.new(box.error.CREATE_SPACE, "my..."]'
+    line: 1
+```
+
+User is also capable of defining own errors with any code  by means of:
+```
+box.error.new({code = user_code, reason = user_error_msg})
+```
+For instance:
+```
+e = box.error.new({code = 500, reason = 'just cause'})
+```
+
+Error cdata object has `:unpack()`, `:raise()`, `:match(...)`, `:__serialize()`
+methods and `.type`, `.message` and `.trace` fields.
+
+## Proposal
+
+In some cases a diagnostic area should be more complicated than
+one last raised error to provide decent information concerning incident (see
+motivating example above). Without stacked diagnostic area, only last error is
+delivered to user. One way to deal with this problem is to introduce stack
+accumulating all errors happened during request processing.
+
+### C API
+
+Let's keep existent `diag_set()` method as is. It is supposed to replace the
+last error in diagnostic area with a new one. To add new error at the top of
+existing one, let's introduce new method `diag_add()`. It is assumed to keep
+an existent error message in diagnostic area (if any) and sets it as a previous
+error for a recently-constructed error object. Note that `diag_set()` is not
+going to preserve pointer to previous error which is held in error to be
+substituted. To illustrate last point consider example:
+
+```
+0. Errors: <NULL>
+1. diag_set(code = 1)
+Errors: <e1(code = 1) -> NULL>
+2. diag_add(code = 2)
+Errors: <e1(code = 1) -> e2(code = 2) -> NULL>
+3. diag_set(code = 3)
+Errors: <e3(code = 3) -> NULL>
+```
+
+Hence, developer takes responsibility of placing `diag_set()` where the most
+basic error should be raised. For instance, if during request processing
+`diag_add()` is called before `diag_set()` then it will result in inheritance
+of all errors from previous error raise:
+
+```
+-- Processing of request #1
+1. diag_set(code = 1)
+Errors: <e1(code = 1) -> NULL>
+2. diag_add(code = 2)
+Errors: <e1(code = 1) -> e2(code = 2) -> NULL>
+-- End of execution
+
+-- Processing of request #2
+1. diag_add(code = 1)
+Errors: <e1(code = 1) -> e2(code = 2) -> e3(code = 1) -> NULL>
+-- End of execution
+```
+
+As a result, at the end of execution of second request, three errors in
+stack are reported instead of one.
+
+Another way to resolve this issue is to erase diagnostic area before
+request processing. However, it breaks current user-visible behaviour
+since box.error.last() will preserve last occurred error only until execution
+of the next request.
+
+The diagnostic area (now) contains (nothing but) pointer to the top error:
+```
+struct diag {
+  struct error *last;
+};
+
+```
+
+To organize errors in a list let's extend error structure with pointer to
+the previous element. Or alternatively, add member of any data structure
+providing list properties (struct rlist, struct stailq or whatever):
+```
+struct diag {
+  struct stailq *errors;
+};
+
+struct error {
+   ...
+   struct stailq_entry *in_errors;
+};
+```
+When error is set to diagnostics area, its reference counter is incremented;
+on the other hand if error is added (i.e. linked to the head of diagnostics
+area list), its reference counter remains unchanged. The same concerns linking
+two errors: only counter of referenced error is incremented. During error
+destruction (that is the moment when error's reference counter hits 0 value)
+the next error in the list (if any) is also affected: its reference counter
+is decremented as well.
+
+### Lua API
+
+Tarantool returns a last-set (diag::last) error as `cdata` object from central
+diagnostic area to Lua in case of error. User should be unable to modify it
+(since it is considered to be a bad practice - in fact object doesn't belong
+to user). On the other hand, user needs an ability to inspect a collected
+diagnostic information. Hence, let's extend the error object API with a method
+which provides the way to get the previous error (reason): `:prev()` (and
+correspondingly `.prev` field).
+
+```
+-- Return a reason error object for given error object 'e'
+-- (when exists, nil otherwise).
+e:prev(error) == error.prev
+```
+
+Furthermore, let's extend signature of `box.error.new()` with new (optional)
+argument - 'prev' - previous error object:
+
+```
+e1 = box.error.new({code = 111, reason = "just cause"})
+e2 = box.error.new({code = 222, reason = "just cause x2", prev = e1})
+```
+
+User may want to link already existing errors. To achieve this let's add
+`set_prev` method to error object so that one can join two errors:
+```
+e1 = box.error.new({code = 111, reason = "just cause"})
+e2 = box.error.new({code = 222, reason = "just cause x2"})
+...
+e2.set_prev(e1) -- e2.prev == e1
+```
+### Binary protocol
+
+Currently errors are sent as `(IPROTO_ERROR | errcode)` response with an
+string message containing error details as a payload. There are not so many
+options to extend current protocol wihtout breaking backward compatibility
+(which is obviously one of implementation requirements). One way is to extend
+existent binary protocol with a new key IPROTO_ERROR_STACK (or
+IPROTO_ERROR_REASON or simply IPROTO_ERROR_V2):
+```
+{
+        // backward compatibility
+        IPROTO_ERROR: "the most recent error message",
+        // modern error message
+        IPROTO_ERROR_STACK: {
+                {
+                        // the most recent error object
+                        IPROTO_ERROR_CODE: error_code_number,
+                        IPROTO_ERROR_MESSAGE: error_message_string,
+                },
+                ...
+                {
+                        // the oldest (reason) error object
+                },
+        }
+}
+```
+
+IPROTO_ERROR is always sent (as in older versions) in case of error.
+IPROTO_ERROR_STACK is presented in response only if there's at least two
+elements in diagnostic list. Map which contains error stack can be optimized
+in terms of space, so that avoid including error which is already encoded
+in IPROTO_ERROR.
-- 
2.17.1

  reply	other threads:[~2020-03-25  1:43 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-25  1:42 [Tarantool-patches] [PATCH v2 00/10] Stacked diagnostics Nikita Pettik
2020-03-25  1:42 ` Nikita Pettik [this message]
2020-03-25  8:27   ` [Tarantool-patches] [PATCH v2 01/10] box: rfc for stacked diagnostic area Konstantin Osipov
2020-03-25 14:08     ` Nikita Pettik
2020-03-25  1:42 ` [Tarantool-patches] [PATCH v2 02/10] box: rename diag_add_error to diag_set_error Nikita Pettik
2020-03-25  8:27   ` Konstantin Osipov
2020-03-26  0:22   ` Vladislav Shpilevoy
2020-03-26 12:31     ` Nikita Pettik
2020-03-25  1:42 ` [Tarantool-patches] [PATCH v2 03/10] test: move box.error tests to box/error.test.lua Nikita Pettik
2020-03-25  8:28   ` Konstantin Osipov
2020-03-26  0:22   ` Vladislav Shpilevoy
2020-03-26 12:31     ` Nikita Pettik
2020-03-25  1:43 ` [Tarantool-patches] [PATCH v2 04/10] box/error: introduce box.error.set() method Nikita Pettik
2020-03-25  8:33   ` Konstantin Osipov
2020-03-25 17:41     ` Nikita Pettik
2020-03-26  0:22   ` Vladislav Shpilevoy
2020-03-26 12:31     ` Nikita Pettik
2020-03-25  1:43 ` [Tarantool-patches] [PATCH v2 05/10] box/error: don't set error created via box.error.new to diag Nikita Pettik
2020-03-26 16:50   ` Konstantin Osipov
2020-03-26 17:59     ` Nikita Pettik
2020-03-26 18:06       ` Nikita Pettik
2020-03-26 18:07       ` Alexander Turenko
2020-03-27  0:19   ` Vladislav Shpilevoy
2020-03-27 13:09     ` Nikita Pettik
2020-03-25  1:43 ` [Tarantool-patches] [PATCH v2 06/10] box: introduce stacked diagnostic area Nikita Pettik
2020-03-26 16:54   ` Konstantin Osipov
2020-03-26 18:03     ` Nikita Pettik
2020-03-26 18:08       ` Konstantin Osipov
2020-03-28 18:40   ` Vladislav Shpilevoy
2020-04-01 16:09     ` Nikita Pettik
2020-04-02  0:29       ` Vladislav Shpilevoy
2020-04-02 17:42         ` Nikita Pettik
2020-04-02 22:20           ` Vladislav Shpilevoy
2020-04-03  1:54             ` Nikita Pettik
2020-04-03 23:17               ` Vladislav Shpilevoy
2020-03-28 18:59   ` Vladislav Shpilevoy
2020-03-31 17:44     ` Nikita Pettik
2020-04-02  0:29       ` Vladislav Shpilevoy
2020-04-02 14:16         ` Nikita Pettik
2020-03-25  1:43 ` [Tarantool-patches] [PATCH v2 07/10] box: use stacked diagnostic area for functional indexes Nikita Pettik
2020-03-30 23:24   ` Vladislav Shpilevoy
2020-04-01 15:53     ` Nikita Pettik
2020-03-25  1:43 ` [Tarantool-patches] [PATCH v2 08/10] box/error: clarify purpose of reference counting in struct error Nikita Pettik
2020-03-30 23:24   ` Vladislav Shpilevoy
2020-03-25  1:43 ` [Tarantool-patches] [PATCH v2 09/10] iproto: refactor error encoding with mpstream Nikita Pettik
2020-03-30 23:24   ` Vladislav Shpilevoy
2020-04-01 15:54     ` Nikita Pettik
2020-03-25  1:43 ` [Tarantool-patches] [PATCH v2 10/10] iproto: support error stacked diagnostic area Nikita Pettik
2020-03-30 23:24   ` Vladislav Shpilevoy
2020-04-01 16:26     ` Nikita Pettik
2020-04-01 22:24       ` Nikita Pettik
2020-04-02  0:29         ` Vladislav Shpilevoy
2020-04-02 14:01           ` Nikita Pettik
2020-04-02 22:20             ` Vladislav Shpilevoy
2020-04-03  2:16               ` Nikita Pettik
2020-04-03 23:17                 ` Vladislav Shpilevoy
2020-04-06 11:07                   ` Nikita Pettik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a9553e91674504017befa6890c60298e0a67328a.1585097339.git.korablev@tarantool.org \
    --to=korablev@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v2 01/10] box: rfc for stacked diagnostic area' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox