From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id D298E23D69 for ; Fri, 15 Feb 2019 16:28:32 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tpsiM9920YwQ for ; Fri, 15 Feb 2019 16:28:32 -0500 (EST) Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 6410823C66 for ; Fri, 15 Feb 2019 16:28:32 -0500 (EST) Subject: [tarantool-patches] Re: [PATCH v2 1/3] lua-yaml: verify arguments count References: <58ccdb031d0befd0e80d50a4684f5e7a59182062.1548123025.git.alexander.turenko@tarantool.org> <04a54285-aa7c-2726-0077-64c14a40dde6@tarantool.org> <20190205032929.gbcapbbgbnpfolnl@tkn_work_nb> <58b73713-df96-57df-b705-6858e6a54e59@tarantool.org> <20190211133218.yiwkf6jqf4uxtyyh@tkn_work_nb> From: Vladislav Shpilevoy Message-ID: <42422054-b590-1197-c08b-0cd6f8af0f8c@tarantool.org> Date: Sat, 16 Feb 2019 00:28:29 +0300 MIME-Version: 1.0 In-Reply-To: <20190211133218.yiwkf6jqf4uxtyyh@tkn_work_nb> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: tarantool-patches@freelists.org, Alexander Turenko , Kirill Yukhin This commit LGTM. Next is pending of fixes. On 11/02/2019 14:32, Alexander Turenko wrote: > lua_is* really checks whether an acceptable index is a valid one, so > there are two possible approaches, one of which we should stick I think: > > * Verify lua_gettop() upper and lower bounds right at start of a > function. > * Use lua_is* (including lua_isnone() and lua_isnoneornil()) and don't > verify arguments count explicitly. > > I think we should use one of these ways within a module: this is more > important then the patch size. The only difference for a user is that > the latter approach does not check for extra arguments. > > Now I implemented the latter approach as I see you want to minimize > explicit checks. See the patch at end of the email. > > It is possible to reduce the patch further, but loss consistency in what > we check: lua_is* or lua_gettop(). I'll do if you insist, but don't > think it is the right way to proceed. > > NB: branch: kh/gh-3662-yaml-2.1 > > WBR, Alexander Turenko. > > On Tue, Feb 05, 2019 at 10:36:41PM +0300, Vladislav Shpilevoy wrote: >> Hi! Thanks for the fixes! >> >>>>> functions. >>>>> >>>>> Without these checks the functions could read garbage outside of a Lua >>>>> stack when called w/o arguments. >>>> >>>> Honestly, I do not understand how is it possible. Please, >>>> provide a test for both functions. See my 3 doubts below. >>> >>> lua_isstring(L, 1) checks a garbage w/o preliminary lua_gettop() check. >>> yaml.encode() gives me "unsupported Lua type 'thread'" on the current >>> tarantool 2.1. >> >> I looked at lua_isstring implementation, and I see, that it checks >> top. If an index is above top, then the type is nil. >> >> static TValue *index2adr(lua_State *L, int idx) >> { >> if (idx > 0) { >> TValue *o = L->base + (idx - 1); >> return o < L->top ? o : niltv(L); >> ... >> > > Ouch, lua_isstring() is called only in case of top == 2, so this is out > of scope of the discussion. The real cause of this weird "unsupported > Lua type 'thread'" error is lua_yaml_encode() code: it calls > `lua_newthread(L)` and then `lua_pushvalue(L, 1);`. A 1st stack item > should be a value we encode, but when there are no arguments for > yaml.encode() the new lua thread is the 1st item. > > Anyway, I don't think that "unsupported Lua type 'thread'" is the right > error message for `yaml.encode()`. Are you agree? > >>> >>> Anyway, added bad API usage test cases. Also I changed this: >>> >>> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc >>> index 3a427263e..46374970f 100644 >>> --- a/third_party/lua-yaml/lyaml.cc >>> +++ b/third_party/lua-yaml/lyaml.cc >>> @@ -453,7 +453,7 @@ usage_error: >>> return luaL_error(L, OOM_ERRMSG); >>> yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len); >>> bool tag_only; >>> - if (lua_gettop(L) == 2) { >>> + if (lua_gettop(L) == 2 && ! lua_isnil(L, 2)) { >>> if (! lua_istable(L, 2)) >>> goto usage_error; >>> lua_getfield(L, 2, "tag_only"); >>> >>> We should not raise an usage error for yaml.decode(object, nil). >> >> Why? It is said, that the second value either does not exist, or >> is a table. Nil is not a table. So why? If your logic was about >> considering nil as a not existing value, then why don't we handle >> cases like this: yaml.decode(object, nil, nil, nil, nil) ? The same >> for l_dump() and encode. > > There is the difference between `yaml.decode(object, nil)` and > `yaml.decode(object, nil, nil, nil, nil)`. The former one is likely to > appear due to passing though the 2nd argument, say: > > ``` > local function load_cfg(raw, opts) > local object = yaml.decode(raw, opts) > ...some post-processing... > return object > end > ``` > > The latter is definitely wrong usage. > > But now I removed checks for extra args, see above. > >> >>>>> usage_error: >>>>> return luaL_error(L, "Usage: yaml.decode(document, "\ >>>>> "[{tag_only = boolean}])"); >>>>> @@ -416,7 +417,7 @@ usage_error: >>>>> return luaL_error(L, OOM_ERRMSG); >>>>> yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len); >>>>> bool tag_only; >>>>> - if (lua_gettop(L) > 1) { >>>>> + if (lua_gettop(L) == 2) { >>>> >>>> 2. This function never touches anything beyond second value on >>>> the stack, so here lua_gettop(L) > 1 means the same as >>>> lua_gettop(L) == 2 - the second argument exist. Third and next >>>> values do not matter. >>> >>> I read this as 'those are equivalent' (correct me if I'm wrong). Ok. I'd >>> prefer to leave it with ==. Also note the fix I pasted above. >> >> Why? Again. I do not see any reason behind this change except personal >> preference. > > It does not matter much, because I anyway need to add ` && ! > lua_isnil(L, 2)` or use `! lua_isnoneornil(L, 2)` here to make decode > behaviour consistent with encode one (against 2nd argument). Yep, it is > personal preference. Anyway, now it is `! lua_isnoneornil(L, 2)`. > >> I reverted all the changes about l_load() function, and the >> tests passed. So why do we need to make diff bigger? > > yaml.decode('', nil, {}) don't pass before (don't raise an error). > Other tests are passed, because of two reasons: > > * no test on yaml.decode('', nil); > * lua_isstring() checks stack size. > > Re test: added for encode and decode. > > Re lua_isstring(): okay, now I understood that it checks given index by > the API: > > http://pgl.yoyo.org/luai/i/lua_isstring ("acceptable index") > https://www.lua.org/manual/5.3/manual.html#4.3 ("Valid and Acceptable Indices") > https://www.lua.org/manual/5.1/manual.html#3.2 (the same for Lua 5.1) > > So I changed the description of the commit to make it clear that the > reason of the change is to make the code more consistent. > >> >>> >>>> >>>>> if (! lua_istable(L, 2)) >>>>> goto usage_error; >>>>> lua_getfield(L, 2, "tag_only"); >>>>> @@ -794,7 +795,7 @@ error: >>>>> static int l_dump(lua_State *L) { >>>>> struct luaL_serializer *serializer = luaL_checkserializer(L); >>>>> int top = lua_gettop(L); >>>>> - if (top > 2) { >>>>> + if (!(top == 1 || top == 2)) { >>>> >>>> 3. Here my reasoning is the same - the previous checking works >>>> as well. >>> >>> It will not give an error in case of yaml.encode() and yaml.encode({}, >>> {}, {}). >> >> Decent. Here you are right. >> >>> >>>> >>>>> usage_error: >>>>> return luaL_error(L, "Usage: encode(object, {tag_prefix = , "\ >>>>> "tag_handle = })"); >>>>> >> >> My diff, which reverts some changes and makes this patch one-liner: >> >> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc >> index 354cafe86..854794dd1 100644 >> --- a/third_party/lua-yaml/lyaml.cc >> +++ b/third_party/lua-yaml/lyaml.cc >> @@ -400,8 +400,7 @@ static void load(struct lua_yaml_loader *loader) { >> */ >> static int l_load(lua_State *L) { >> struct lua_yaml_loader loader; >> - int top = lua_gettop(L); >> - if (!(top == 1 || top == 2) || !lua_isstring(L, 1)) { >> + if (! lua_isstring(L, 1)) { >> usage_error: >> return luaL_error(L, "Usage: yaml.decode(document, "\ >> "[{tag_only = boolean}])"); >> @@ -417,7 +416,7 @@ usage_error: >> return luaL_error(L, OOM_ERRMSG); >> yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len); >> bool tag_only; >> - if (lua_gettop(L) == 2 && ! lua_isnil(L, 2)) { >> + if (lua_gettop(L) > 1) { >> if (! lua_istable(L, 2)) >> goto usage_error; >> lua_getfield(L, 2, "tag_only"); > > ---- > > The new patch description and diff (w/o tests): > > lua-yaml: verify args in a consistent manner > > Use lua_is*() functions instead of explicit lua_gettop() checks in > yaml.encode() and yaml.decode() functions. > > Behaviour changes: > > * yaml.decode(object, nil) ignores nil (it is consistent with encode > behaviour). > * yaml.encode() gives an usage error instead of "unsupported Lua type > 'thread'". > * yaml.encode('', {}, {}) ignores 3rd argument (it is consistent with > decode behaviour). > > diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc > index c6d118a79..bd876ab29 100644 > --- a/third_party/lua-yaml/lyaml.cc > +++ b/third_party/lua-yaml/lyaml.cc > @@ -416,7 +416,7 @@ usage_error: > return luaL_error(L, OOM_ERRMSG); > yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len); > bool tag_only; > - if (lua_gettop(L) > 1) { > + if (! lua_isnoneornil(L, 2)) { > if (! lua_istable(L, 2)) > goto usage_error; > lua_getfield(L, 2, "tag_only"); > @@ -793,14 +793,13 @@ error: > */ > static int l_dump(lua_State *L) { > struct luaL_serializer *serializer = luaL_checkserializer(L); > - int top = lua_gettop(L); > - if (top > 2) { > + if (lua_isnone(L, 1)) { > usage_error: > return luaL_error(L, "Usage: encode(object, {tag_prefix = , "\ > "tag_handle = })"); > } > const char *prefix = NULL, *handle = NULL; > - if (top == 2 && !lua_isnil(L, 2)) { > + if (! lua_isnoneornil(L, 2)) { > if (! lua_istable(L, 2)) > goto usage_error; > lua_getfield(L, 2, "tag_prefix"); >