From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id B9C0322412 for ; Thu, 13 Dec 2018 11:30:57 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id vBqo8qCIJBPA for ; Thu, 13 Dec 2018 11:30:57 -0500 (EST) Received: from smtp35.i.mail.ru (smtp35.i.mail.ru [94.100.177.95]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id F255B2240F for ; Thu, 13 Dec 2018 11:30:56 -0500 (EST) Subject: [tarantool-patches] Re: [PATCH] sql: terminate with \0 encoded msgpack References: <20181211162100.55972-1-korablev@tarantool.org> <2F544190-12E5-4F38-913A-68C0EC4FF438@tarantool.org> From: Vladislav Shpilevoy Message-ID: <838a7d2d-1853-e7b8-d452-cdef2edffc9a@tarantool.org> Date: Thu, 13 Dec 2018 19:30:51 +0300 MIME-Version: 1.0 In-Reply-To: <2F544190-12E5-4F38-913A-68C0EC4FF438@tarantool.org> Content-Type: text/plain; charset="utf-8"; format="flowed" Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: "n.pettik" , tarantool-patches@freelists.org On 13/12/2018 16:53, n.pettik wrote: > >> Please, show where exactly strlen() is called on msgpack? > > Trace which leads to overflow is shown in issue description: > https://github.com/tarantool/tarantool/issues/3868 > > ==55841==ERROR: AddressSanitizer: heap-buffer-overflow > #0 0x10384026a in wrap_strlen (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x1626a) > #1 0x100ce617b in sqlite3Strlen30 util.c:129 > #2 0x100d5acfb in sqlite3VdbeMemSetStr vdbemem.c:908 > #3 0x100d3f451 in sqlite3VdbeList vdbeaux.c:1685 > #4 0x100d2c86d in sqlite3Step vdbeapi.c:573 > #5 0x100d2bab2 in sqlite3_step vdbeapi.c:629 > #6 0x100365d7a in lua_sql_execute (tarantool:x86_64+0x100365d7a) > #7 0x10047188c in lj_BC_FUNCC (tarantool:x86_64+0x10047188c) > >> I can not find this place. Also, terminating msgpack with 0 is >> a crutch, IMO. You will be able to print msgpack and use strlen(), >> but as you said, it will not be valid msgpack. I will not be >> able to decode it. This is because of msgpack having zeroes in a >> middle. So those columns in EXPLAIN with invalid msgpack makes no >> sense to show in such a case - invalid msgpack is useless. > > Well, that’s true. We can operate with Blob in other way: since > its length is always comes as first argument of OP_Blob, we > can rely only on that length and disregard null terminations at all. > Here’s kind of workaround for this approach: > > diff --git a/src/box/sql/vdbeaux.c b/src/box/sql/vdbeaux.c > index fc805e3aa..06f07dbb9 100644 > --- a/src/box/sql/vdbeaux.c > +++ b/src/box/sql/vdbeaux.c > @@ -1213,8 +1213,13 @@ displayComment(const Op * pOp, /* The opcode to be commented */ > if (c == 'P') { > c = zSynopsis[++ii]; > if (c == '4') { > - sqlite3_snprintf(nTemp - jj, zTemp + jj, > - "%s", zP4); > + if (pOp->opcode == OP_Blob) { > + memcpy(zTemp + jj, zP4, pOp->p1); > + } else { > + sqlite3_snprintf(nTemp - jj, > + zTemp + jj, > + "%s", zP4); > + } > } else if (c == 'X') { > sqlite3_snprintf(nTemp - jj, zTemp + jj, > "%s", pOp->zComment); > @@ -1418,8 +1423,7 @@ sqlite3VdbePrintOp(FILE * pOut, int pc, Op * pOp) > char *zP4; > char zPtr[50]; > char zCom[100]; > - static const char *zFormat1 = > - "%4d> %4d %-13s %4d %4d %4d %-13s %.2X %s\n"; > + static const char *zFormat1 = "%4d> %4d %-13s %4d %4d %4d "; > if (pOut == 0) > pOut = stdout; > zP4 = displayP4(pOp, zPtr, sizeof(zPtr)); > @@ -1433,8 +1437,13 @@ sqlite3VdbePrintOp(FILE * pOut, int pc, Op * pOp) > * information from the vdbe.c source text > */ > fprintf(pOut, zFormat1, fiber_self()->fid, pc, > - sqlite3OpcodeName(pOp->opcode), pOp->p1, pOp->p2, pOp->p3, zP4, > - pOp->p5, zCom); > + sqlite3OpcodeName(pOp->opcode), pOp->p1, pOp->p2, pOp->p3); > + if (pOp->opcode == OP_Blob) { > + fwrite(zP4, sizeof(char), pOp->p1, pOut); > + fprintf(pOut, "%.2X %s \n", pOp->p5, zCom); > + } else { > + fprintf(pOut, "%-13s %.2X %s \n", zP4, pOp->p5, zCom); > + } > fflush(pOut); > } > #endif > @@ -1681,8 +1690,13 @@ sqlite3VdbeList(Vdbe * p) > pMem->flags = MEM_Str | MEM_Term; > zP4 = displayP4(pOp, pMem->z, pMem->szMalloc); > if (zP4 != pMem->z) { > - pMem->n = 0; > - sqlite3VdbeMemSetStr(pMem, zP4, -1, 1, 0); > + /* > + * BLOB may contain encoded mspack which > + * disrespects null termination. > + */ > + int str_len = pOp->opcode == OP_Blob ? pOp->p1 : 0; > + pMem->n = str_len; > + sqlite3VdbeMemSetStr(pMem, zP4, str_len, 1, 0); > } else { > > It may fail on some tests, I didn’t check it. But it certainly doesn’t result in > buffer-overflow, that I’ve checked. > > If this solution still looks ugly/unacceptable, lets consider other options. It is better, but msgpack in EXPLAIN still is unreadable. > >> I remember, that msgpack 0 termination existed in earlier >> SQLite + Tarantool versions, but I removed it deliberately so >> as to return it without string methods usage. >> >> All places, which you crutched up with 0 termination, have >> SQL_SUBTYPE_MSGPACK subtype, using which you can determine >> if it is msgpack. > > On the other hand I guess not only msgpacks can contain nulls while > being encoded..See sample above.> >> I understand, that from EXPLAIN we can not return Lua maps/arrays >> since 6th column of EXPLAIN output is declared as a string. But >> we can decode msgpack into JSON or YAML. This output would be >> both parsable and readable. > > I don’t think that decoding msgpack for user is fair. > IMHO blob should come as is. > I do not agree that blob should come as is. It does not already, in fact. We return decoded msgpack as Lua objects when use box.sql.execute() to select from a space. What is more, I think that in EXPLAIN, in Vdbe comments, raw msgpack makes no sense. It is useless, unreadable. A sense of a comment to be read by a human and help to detect some problems or something. But I can not read raw msgpack. So I still think that EXPLAIN should decode msgpack into a readable form. For me it does not matter: YAML, JSON or lua objects.