From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Wed, 27 Mar 2019 12:18:57 +0300 From: Vladimir Davydov Subject: Re: [tarantool-patches] [PATCH] xrow: print corrupted rows on decoding error. Message-ID: <20190327091857.tk5yxwhhgy2eloia@esperanza> References: <20190315160238.37882-1-sergepetrenko@tarantool.org> <20190318175829.rxdmzizp67wwfdkq@esperanza> <8222472E-3EB1-4B68-8AA3-F5677EB5F001@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <8222472E-3EB1-4B68-8AA3-F5677EB5F001@tarantool.org> To: Serge Petrenko Cc: tarantool-patches@freelists.org List-ID: On Tue, Mar 19, 2019 at 03:19:31PM +0300, Serge Petrenko wrote: > Hi! > Thankyou for review! > The fixes are on the branch, and an incremental diff is below. > > > 18 марта 2019 г., в 20:58, Vladimir Davydov написал(а): > > > > On Fri, Mar 15, 2019 at 07:02:38PM +0300, Serge Petrenko wrote: > >> Add row hex printing to log on verbose level. This would be useful > >> during investigation of errors related to invalid msgpack packet > >> arrival. > >> > >> Related to #4040 > >> --- > >> https://github.com/tarantool/tarantool/tree/sp/xrow-decode-verbose-err > >> > >> > >> src/box/xrow.c | 27 +++++++++++++++++++++++++++ > >> 1 file changed, 27 insertions(+) > >> > >> diff --git a/src/box/xrow.c b/src/box/xrow.c > >> index bddae1d5b..8a5254902 100644 > >> --- a/src/box/xrow.c > >> +++ b/src/box/xrow.c > >> @@ -88,14 +88,29 @@ mp_decode_vclock(const char **data, struct vclock *vclock) > >> return 0; > >> } > >> > >> +static inline void > >> +say_dump_row_hex(const char *start, const char *end) > >> +{ > >> + char *buf = (char *)malloc(3 * (end - start) + 1); > > > > Theoretically, malloc() may fail. You'd better check it and print > > a warning in this case. Anyway, do we really need to use malloc() > > here at all? Won't tt_static_buf be enough for typical cases. > > Done, please see the incremental diff below. There are too many changes for an incremental diff. It's time for v2. > I still use malloc for sizes greater than TT_STATIC_BUF_LEN, > but check for an allocation error. No, please don't. I don't want branching here - it complicates the code without yielding any benefits. I just suggested to use tt_static_buf *instead of* malloc. If you think it's inadequate, then please use malloc() as before. > > > > >> + char *pos = buf; > >> + for (const char *cur = start; cur < end; ++cur) { > >> + pos += sprintf(pos, "%02hhX ", *cur); > >> + } > >> + *pos = 0; > >> + say_verbose("Corrupted row is: %s", buf); > > > > say_verbose() will turn into no-op, but you allocate a buffer and format > > the message anyway. Please fix that by using say_log_level_is_enabled() > > before doing anything. > > Fixed. > > > >> + free(buf); > >> +} > >> + > >> int > >> xrow_header_decode(struct xrow_header *header, const char **pos, > >> const char *end, bool end_is_exact) > >> { > >> memset(header, 0, sizeof(struct xrow_header)); > >> const char *tmp = *pos; > >> + const char * const start = *pos; > >> if (mp_check(&tmp, end) != 0) { > >> error: > >> + say_dump_row_hex(start, end); > >> diag_set(ClientError, ER_INVALID_MSGPACK, "packet header"); > >> return -1; > >> } > >> @@ -163,6 +178,7 @@ error: > >> if (*pos < end && header->type != IPROTO_NOP) { > >> const char *body = *pos; > >> if (mp_check(pos, end)) { > >> + say_dump_row_hex(start, end); > >> diag_set(ClientError, ER_INVALID_MSGPACK, "packet body"); > >> return -1; > >> } > >> @@ -171,6 +187,7 @@ error: > >> header->body[0].iov_len = *pos - body; > >> } > >> if (end_is_exact && *pos < end) { > >> + say_dump_row_hex(start,end); > >> diag_set(ClientError, ER_INVALID_MSGPACK, "packet body"); > > > > It would be better to add a helper function that would set diag and > > print a corrupted row so that we don't miss it in case we add another > > point of failure. Dunno, may be it's even worth adding a new exception > > class with logging built in. > > Done, please, see below. > I don’t like the idea with a new exception class. > If we introduce it, we’ll have to store the whole > corrupted row, which can be quite big, until logging happens. Okay. > > diff --git a/src/box/xrow.c b/src/box/xrow.c > index 3854874cc..000647111 100644 > --- a/src/box/xrow.c > +++ b/src/box/xrow.c > @@ -88,17 +88,44 @@ mp_decode_vclock(const char **data, struct vclock *vclock) > return 0; > } > > +/** > + * Set diagnostics with an error and log the corrupted row > + * which caused the error. > + * @param what - exception to set. > + * @param desc_str - error description string. > + */ > static inline void > -say_dump_row_hex(const char *start, const char *end) > +xrow_on_decode_err(const char *start, const char *end, uint32_t what, > + const char *desc_str) This should be a macro (perhaps a macro + a helper function), otherwise the line number where an error occurred will be lost. > { > - char *buf = (char *)malloc(3 * (end - start) + 1); > + diag_set(ClientError, what, desc_str); > + > + if (!say_log_level_is_enabled(S_VERBOSE)) > + return; > + > + size_t len = 3 * (end - start + 1); > + char *buf; > + > + if (len <= TT_STATIC_BUF_LEN) { > + buf = tt_static_buf(); > + } else { > + buf = (char *)malloc(len); > + if (!buf) { > + say_verbose("Got a corrupted row during decoding. "\ > + "Not enough memory to dump row contents."); > + return; > + } > + } > + > char *pos = buf; > for (const char *cur = start; cur < end; ++cur) { > pos += sprintf(pos, "%02hhX ", *cur); > } Please use snprintf. I'm not quite familiar with the %02hhX specifier. What does the output look like? Could you give an example in the commit message? BTW, it'd be really nice to have a test covering this code - should be pretty easy to add. Also, is the output format consistent with some unix tool (hexdump maybe)? > *pos = 0; > say_verbose("Corrupted row is: %s", buf); > - free(buf); > + > + if (len > TT_STATIC_BUF_LEN) > + free(buf); > } > > int > @@ -110,8 +137,7 @@ xrow_header_decode(struct xrow_header *header, const char **pos, > const char * const start = *pos; > if (mp_check(&tmp, end) != 0) { > error: > - say_dump_row_hex(start, end); > - diag_set(ClientError, ER_INVALID_MSGPACK, "packet header"); > + xrow_on_decode_err(start, end, ER_INVALID_MSGPACK, "packet header"); > return -1; > } > > @@ -178,8 +204,7 @@ error: > if (*pos < end && header->type != IPROTO_NOP) { > const char *body = *pos; > if (mp_check(pos, end)) { > - say_dump_row_hex(start, end); > - diag_set(ClientError, ER_INVALID_MSGPACK, "packet body"); > + xrow_on_decode_err(start, end, ER_INVALID_MSGPACK, "packet body"); You skipped quite a few places where a decoding error might occur, e.g. xrow_decode_sql => ER_MISSING_REQUEST_FIELD doesn't have extra logging while xrow_decode_dml => ER_MISSING_REQUEST_FIELD does. I'd like to see some pattern here. Perhaps, logging every error doesn't make much sense, especially if you primarily target at corrupted input. Let's discuss.