From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp40.i.mail.ru (smtp40.i.mail.ru [94.100.177.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 96CDE46970E for ; Thu, 23 Jan 2020 02:07:17 +0300 (MSK) References: <3f49ddc14c2e07f4ac96fc71f3912fd0c5663882.1579032293.git.korablev@tarantool.org> <20200122135409.GB1144@tarantool.org> From: Vladislav Shpilevoy Message-ID: <0ea9eb51-093d-039e-0042-d97c302de6cc@tarantool.org> Date: Thu, 23 Jan 2020 00:07:14 +0100 MIME-Version: 1.0 In-Reply-To: <20200122135409.GB1144@tarantool.org> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [Tarantool-patches] [PATCH] box: rfc for stacked diagnostic area List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Nikita Pettik , tarantool-patches@dev.tarantool.org I will take a look. I wasn't in To/CC in the first email in this thread, so I didn't pay attention. On 22/01/2020 14:54, Nikita Pettik wrote: > On 14 Jan 23:16, Nikita Pettik wrote: >> From: Kirill Shcherbatov > > Hi guys, > > Does anybody want to look at this RFC? It's quite similar to the one which > got LGTM from Konstantin, but it is even more simplified. I have to get > formal approvement to start fixing code itself. Thanks. > >> Part of #1148 >> --- >> rfc in human-readable format: https://github.com/tarantool/tarantool/blob/np/gh-1148-stacked-diag/doc/rfc/1148-stacked-diagnostics.md >> Previous version of rfc: https://github.com/tarantool/tarantool/commit/c2a7e1a10732fdb231780ac04d1a4bc1618c0468 >> >> Changes in this version: >> - All savepoint routines have been removed as meaningless; >> - Removed SQL warnings mentions since they are unrelated to the subject of rfc; >> - Removed mentions about exposing box.error.new() with filename and >> line parameters (again, it is barely related to the subject of rfc); >> - Described in details difference between diag_set() in diag_add() >> functions as a part of C interface; >> - Other minor clean-ups. >> >> Now RFC looks quite brief and straightforward in its implementation; >> it attempts at solving only #1148 issue. >> >> doc/rfc/1148-stacked-diagnostics.md | 214 ++++++++++++++++++++++++++++++++++++ >> 1 file changed, 214 insertions(+) >> create mode 100644 doc/rfc/1148-stacked-diagnostics.md >> >> diff --git a/doc/rfc/1148-stacked-diagnostics.md b/doc/rfc/1148-stacked-diagnostics.md >> new file mode 100644 >> index 000000000..68fa6d21a >> --- /dev/null >> +++ b/doc/rfc/1148-stacked-diagnostics.md >> @@ -0,0 +1,214 @@ >> +# Stacked Diagnostics >> + >> +* **Status**: In progress >> +* **Start date**: 30-07-2019 >> +* **Authors**: Kirill Shcherbatov @kshcherbatov kshcherbatov@tarantool.org, >> + Pettik Nikita korablev@tarantool.org >> +* **Issues**: [#1148](https://github.com/tarantool//issues/1148) >> + >> +## Background and motivation >> + >> +Support stacked diagnostics for Tarantool allows to accumulate all errors >> +occurred during request processing. It allows to better understand what >> +happened, and handle errors appropriately. >> + >> +### Current error diagnostics >> + >> +Currently Tarantool has `diag_set()` mechanism to set a diagnostic error. >> +Object representing error featuring following properties: >> + - type (string) error’s C++ class; >> + - code (number) error’s number; >> + - message (string) error’s message; >> + - file (string) Tarantool source file; >> + - line (number) line number in the Tarantool source file. >> + >> +The last error raised is exported with `box.error.last()` function. >> + >> +Type of error is represented by a few C++ classes (all are inherited from >> +Exception class): >> +``` >> +ClientError >> + | LoggedError >> + | AccessDeniedError >> + | UnsupportedIndexFeature >> + >> +XlogError >> + | XlogGapError >> + >> +SystemError >> + | SocketError >> + | OutOfMemory >> + | TimedOut >> + >> +ChannelIsClosed >> +FiberIsCancelled >> +LuajitError >> +IllegalParams >> +CollationError >> +SwimError >> +CryptoError >> +``` >> + >> +All codes and names of ClientError class are available in box.error. >> +User is able to create a new error instance of predefined type using >> +box.error.new() function. For example: >> +``` >> +tarantool> t = box.error.new(box.error.CREATE_SPACE, "myspace", "just cause") >> +tarantool> t:unpack() >> +--- >> +- type: ClientError >> + code: 9 >> + message: 'Failed to create space ''myspace'': just because' >> + trace: >> + - file: '[string "t = box.error.new(box.error.CREATE_SPACE, "my..."]' >> + line: 1 >> +``` >> + >> +User is also capable of defining own errors with any code by means of: >> +``` >> +box.error.new({code = user_code, reason = user_error_msg}) >> +``` >> +For instance: >> +``` >> +e = box.error.new({code = 500, reason = 'just cause'}) >> +``` >> + >> +Error cdata object has `:unpack()`, `:raise()`, `:match(...)`, `:__serialize()` >> +methods and `.type`, `.message` and `.trace` fields. >> + >> +## Proposal >> + >> +In some cases a diagnostic information should be more complicated than >> +one last raised error. Consider following example: persistent Lua function >> +referenced by functional index has a bug in it's definition, Lua handler sets >> +an diag message. Then functional index extractor code setups an own, more >> +specialized error. Without stacked diagnostic area, only last error is >> +delivired to user. One way to deal with this problem is to introduce stack >> +accumulating all errors happened during request processing. >> + >> +### C API >> + >> +Let's keep existent `diag_set()` method as is. It is supposed to replace the >> +last error in diagnostic area with a new one. To add new error at the top of >> +existing one, let's introduce new method `diag_add()`. It is assumed to keep >> +an existent error message in diagnostic area (if any) and sets it as a reason >> +error for a recently-constructed error object. Note that `diag_set()` is not >> +going to preserve pointer to previous error which is held in error to be >> +substituted. To illustrate last point consider example: >> + >> +``` >> +0. Errors: >> +1. diag_set(code = 1) >> +Errors: NULL> >> +2. diag_add(code = 2) >> +Errors: e2(code = 2) -> NULL> >> +3. diag_set(code = 3) >> +Errors: NULL> >> +``` >> + >> +Hence, developer takes responsibility of placing `diag_set()` where the most >> +basic error should be raised. For instance, if during request processing >> +`diag_add()` is called before `diag_set()` then it will result in inheritance >> +of all errors from previous error raise: >> + >> +``` >> +-- Processing of request #1 >> +1. diag_set(code = 1) >> +Errors: NULL> >> +2. diag_add(code = 2) >> +Errors: e2(code = 2) -> NULL> >> +-- End of execution >> + >> +-- Processing of request #2 >> +1. diag_add(code = 1) >> +Errors: e2(code = 2) -> e3(code = 1) -> NULL> >> +-- End of execution >> +``` >> + >> +As a result, at the end of execution fo second request, three errors in >> +stack are reported instead of one. >> + >> +Another way to resolve this issue is to erase diagnostic area before >> +request processing. However, it breaks current user-visible behaviour >> +since box.error.last() will preserve last occurred error only until execution >> +of the next request. >> + >> +The diagnostic area (now) contains (nothing but) pointer to the top error: >> +``` >> +struct diag { >> + struct error *last; >> +}; >> + >> +``` >> + >> +To organize errors in a list let's extend error structure with pointer to >> +the previous element. Or alternatively, add member of any data structure >> +providing list properties (struct rlist, struct stailq or whatever): >> +``` >> +struct diag { >> + struct stailq *errors; >> +}; >> + >> +struct error { >> + ... >> + struct stailq_entry *in_errors; >> +}; >> +``` >> + >> + >> +### Lua API >> + >> +Tarantool returns a last-set (diag::last) error as `cdata` object from central >> +diagnostic area to Lua in case of error. User should be unable to modify it >> +(since it is considered to be a bad practice - in fact object doesn't belong >> +to user). On the other hand, user needs an ability to inspect a collected >> +diagnostic information. Hence, let's extend the `box.error` API with a function >> +which provides the way to get the previous error (reason): `:prev()` (and >> +correspondingly `.prev` field). >> + >> +``` >> +-- Return a reason error object for given error >> +-- (when exists, nil otherwise). >> +box.error.prev(error) == error.prev >> +``` >> + >> +Furthermore, let's extend signature of `box.error.new()` with new (optional) >> +argument - the 'reason' parent error object: >> + >> +``` >> +e1 = box.error.new({code = 111, reason = "just cause"}) >> +e2 = box.error.new({code = 222, reason = "just cause x2", prev = e1}) >> +``` >> + >> +### Binary protocol >> + >> +Currently errors are sent as `(IPROTO_ERROR | errcode)` response with an >> +string message containing error details as a payload. There are not so many >> +options to extend current protocol wihtout breaking backward compatibility >> +(which is obviously one of implementation requirements). One way is to extend >> +existent binary protocol with a new key IPROTO_ERROR_STACK (or >> +IPROTO_ERROR_REASON or simply IPROTO_ERROR_V2): >> +``` >> +{ >> + // backward compatibility >> + IPROTO_ERROR: "the most recent error message", >> + // modern error message >> + IPROTO_ERROR_STACK: { >> + { >> + // the most recent error object >> + IPROTO_ERROR_CODE: error_code_number, >> + IPROTO_ERROR_REASON: error_reason_string, >> + }, >> + ... >> + { >> + // the oldest (reason) error object >> + }, >> + } >> +} >> +``` >> + >> +IPROTO_ERROR is always sent (as in older versions) in case of error. >> +IPROTO_ERROR_STACK is presented in response only if there's at least two >> +elements in diagnostic list. Map which contains error stack can be optimized >> +in terms of space, so that avoid including error which is already encoded >> +in IPROTO_ERROR. >> -- >> 2.15.1 >>