From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 8C90422896 for ; Thu, 13 Dec 2018 08:53:12 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id mOp2Zgrr2Dv2 for ; Thu, 13 Dec 2018 08:53:12 -0500 (EST) Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 01AB5222BE for ; Thu, 13 Dec 2018 08:53:11 -0500 (EST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.0 \(3445.100.39\)) Subject: [tarantool-patches] Re: [PATCH] sql: terminate with \0 encoded msgpack From: "n.pettik" In-Reply-To: Date: Thu, 13 Dec 2018 16:53:08 +0300 Content-Transfer-Encoding: quoted-printable Message-Id: <2F544190-12E5-4F38-913A-68C0EC4FF438@tarantool.org> References: <20181211162100.55972-1-korablev@tarantool.org> Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: tarantool-patches@freelists.org Cc: Vladislav Shpilevoy > Please, show where exactly strlen() is called on msgpack? Trace which leads to overflow is shown in issue description: https://github.com/tarantool/tarantool/issues/3868 =3D=3D55841=3D=3DERROR: AddressSanitizer: heap-buffer-overflow #0 0x10384026a in wrap_strlen = (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x1626a) #1 0x100ce617b in sqlite3Strlen30 util.c:129 #2 0x100d5acfb in sqlite3VdbeMemSetStr vdbemem.c:908 #3 0x100d3f451 in sqlite3VdbeList vdbeaux.c:1685 #4 0x100d2c86d in sqlite3Step vdbeapi.c:573 #5 0x100d2bab2 in sqlite3_step vdbeapi.c:629 #6 0x100365d7a in lua_sql_execute (tarantool:x86_64+0x100365d7a) #7 0x10047188c in lj_BC_FUNCC (tarantool:x86_64+0x10047188c) > I can not find this place. Also, terminating msgpack with 0 is > a crutch, IMO. You will be able to print msgpack and use strlen(), > but as you said, it will not be valid msgpack. I will not be > able to decode it. This is because of msgpack having zeroes in a > middle. So those columns in EXPLAIN with invalid msgpack makes no > sense to show in such a case - invalid msgpack is useless. Well, that=E2=80=99s true. We can operate with Blob in other way: since its length is always comes as first argument of OP_Blob, we can rely only on that length and disregard null terminations at all. Here=E2=80=99s kind of workaround for this approach: diff --git a/src/box/sql/vdbeaux.c b/src/box/sql/vdbeaux.c index fc805e3aa..06f07dbb9 100644 --- a/src/box/sql/vdbeaux.c +++ b/src/box/sql/vdbeaux.c @@ -1213,8 +1213,13 @@ displayComment(const Op * pOp, /* The opcode to = be commented */ if (c =3D=3D 'P') { c =3D zSynopsis[++ii]; if (c =3D=3D '4') { - sqlite3_snprintf(nTemp - jj, = zTemp + jj, - "%s", zP4); + if (pOp->opcode =3D=3D OP_Blob) = { + memcpy(zTemp + jj, zP4, = pOp->p1); + } else { + sqlite3_snprintf(nTemp - = jj, + zTemp + = jj, + "%s", = zP4); + } } else if (c =3D=3D 'X') { sqlite3_snprintf(nTemp - jj, = zTemp + jj, "%s", = pOp->zComment); @@ -1418,8 +1423,7 @@ sqlite3VdbePrintOp(FILE * pOut, int pc, Op * pOp) char *zP4; char zPtr[50]; char zCom[100]; - static const char *zFormat1 =3D - "%4d> %4d %-13s %4d %4d %4d %-13s %.2X %s\n"; + static const char *zFormat1 =3D "%4d> %4d %-13s %4d %4d %4d "; if (pOut =3D=3D 0) pOut =3D stdout; zP4 =3D displayP4(pOp, zPtr, sizeof(zPtr)); @@ -1433,8 +1437,13 @@ sqlite3VdbePrintOp(FILE * pOut, int pc, Op * pOp) * information from the vdbe.c source text */ fprintf(pOut, zFormat1, fiber_self()->fid, pc, - sqlite3OpcodeName(pOp->opcode), pOp->p1, pOp->p2, = pOp->p3, zP4, - pOp->p5, zCom); + sqlite3OpcodeName(pOp->opcode), pOp->p1, pOp->p2, = pOp->p3); + if (pOp->opcode =3D=3D OP_Blob) { + fwrite(zP4, sizeof(char), pOp->p1, pOut); + fprintf(pOut, "%.2X %s \n", pOp->p5, zCom); + } else { + fprintf(pOut, "%-13s %.2X %s \n", zP4, pOp->p5, zCom); + } fflush(pOut); } #endif @@ -1681,8 +1690,13 @@ sqlite3VdbeList(Vdbe * p) pMem->flags =3D MEM_Str | MEM_Term; zP4 =3D displayP4(pOp, pMem->z, pMem->szMalloc); if (zP4 !=3D pMem->z) { - pMem->n =3D 0; - sqlite3VdbeMemSetStr(pMem, zP4, -1, 1, 0); + /* + * BLOB may contain encoded mspack which + * disrespects null termination. + */ + int str_len =3D pOp->opcode =3D=3D OP_Blob ? = pOp->p1 : 0; + pMem->n =3D str_len; + sqlite3VdbeMemSetStr(pMem, zP4, str_len, 1, 0); } else { It may fail on some tests, I didn=E2=80=99t check it. But it certainly = doesn=E2=80=99t result in buffer-overflow, that I=E2=80=99ve checked. If this solution still looks ugly/unacceptable, lets consider other = options. > I remember, that msgpack 0 termination existed in earlier > SQLite + Tarantool versions, but I removed it deliberately so > as to return it without string methods usage. >=20 > All places, which you crutched up with 0 termination, have > SQL_SUBTYPE_MSGPACK subtype, using which you can determine > if it is msgpack. On the other hand I guess not only msgpacks can contain nulls while being encoded..See sample above. > I understand, that from EXPLAIN we can not return Lua maps/arrays > since 6th column of EXPLAIN output is declared as a string. But > we can decode msgpack into JSON or YAML. This output would be > both parsable and readable. I don=E2=80=99t think that decoding msgpack for user is fair. IMHO blob should come as is.