From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp56.i.mail.ru (smtp56.i.mail.ru [217.69.128.36]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 9662C469719 for ; Fri, 14 Feb 2020 18:55:03 +0300 (MSK) References: <20200210075707.86953-1-arkholga@tarantool.org> <20200213231744.tag2ign4ed6avwmx@tkn_work_nb> From: Olga Arkhangelskaia Message-ID: Date: Fri, 14 Feb 2020 18:55:01 +0300 MIME-Version: 1.0 In-Reply-To: <20200213231744.tag2ign4ed6avwmx@tkn_work_nb> Content-Type: text/plain; charset="utf-8"; format="flowed" Content-Transfer-Encoding: 8bit Content-Language: en-GB Subject: Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alexander Turenko Cc: tarantool-patches@dev.tarantool.org Hi! Thanks a lot for the review. I did not make separate version because there is any major change. I have fixed everything you mentioned and moved the test case to the separate file. You will find new diff below. 14.02.2020 2:17, Alexander Turenko пишет: >> Issue:https://github.com/tarantool/tarantool/issues/4761 >> Branch:https://github.com/tarantool/tarantool/tree/OKriw/gh-4761-json.decode-silently-changes-config-when-used-with-config-settings > Nice catch! > > The patch itself is quite straighforward and obviously okay. I have > several comments around wording, formatting and the test case. > > I'll paste the commit from the updated branch to comment it inline. > >> commit ddc51e76469f86b29ba0c37c6e2911bbba569bdb >> Author: Olga Arkhangelskaia >> Date: Wed Feb 5 14:05:25 2020 +0300 >> >> json: fix silent change of global json settings > 'global' is not precise term here: it can be arbitrary json > (de)serializer instance (created by json.new()), not necessarily the > default one. It is better to say 'instance options'. Something like > "don't spoil instance options with per-call ones". > >> >> When json.decode is used with 2 arguments, 2nd argument seeps out to global >> json settings. Moreover, due to current serializer.cfg implementation it > Same for 'global'. > >> remains invisible while checking settings by json.cfg. To prevent such >> behaviour we stop writing to global serializer struct and use local one, >> to get one-time action. >> As was mention before json.cfg can not be trusted in this case, so to check that >> everything remained unchanged we call decode twice with and without 2nd >> argument. > Didn't get the second paragraph. Is it about the test? We usually don't > describe tests in commit messages, but rather provide comments in the > code of a test when it is necessary. > > Please, mention the commit where the degradation occurs (see [2] for > example). > >> >> Closes #4761 > Several nits: > > * Fit a commit body within 72 symbols. > * It is better to split paragraphs with an empty line. > > See [1] for formal rules and examples. > > [1]: https://www.tarantool.io/en/doc/1.10/dev_guide/developer_guidelines/ > [2]: https://github.com/tarantool/tarantool/commit/ccacba28f813fb99fd9eaf07fb41bf604dd341bc > >> diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua >> index fadfc74ec..a6b36ff3d 100755 >> --- a/test/app-tap/json.test.lua >> +++ b/test/app-tap/json.test.lua >> @@ -22,7 +22,7 @@ end >> >> tap.test("json", function(test) >> local serializer = require('json') >> - test:plan(40) >> + test:plan(41) >> >> test:test("unsigned", common.test_unsigned, serializer) >> test:test("signed", common.test_signed, serializer) >> @@ -94,6 +94,11 @@ tap.test("json", function(test) >> 'error: too many nested data structures') >> test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth, >> 'global option remains unchanged') >> + -- >> + -- gh-4761 json.decode silently changes global settings of json when called >> + -- with 2d parameter >> + -- > Several nits: > > * Separate the comment from the previous test case with an empty line. > * Use a colon after 'gh-xxxx' to unify it with other descriptions within > the file. > * Fit a comment within 66 symbols (see [2]; it is for C, but in fact we > apply the rule to Lua). > * Typo: 2d -> 2nd. > * Same as above for 'global'. > > [2]: https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/ > >> + test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}')) > The test case should be as much independent from other as possible. Here > it uses the previous one, which calls .decode() with > per-call options. Moreover, 'Tarantool Engineer Standard Operating > Procedures' document now obligates a developer to add a test case for a > bug fix within a separate file (see 'Writing tests' section) and the > reason is mostly to push developers to guarantee test cases > independence. Please, note that it also holds a certain naming policy. > > You can use '{"foo": "bar"}' json string and {decode_max_depth = 1} > option to minimize the test case. It will be easier to read. > > test.ok() has four parameters: test object (the colon adds it), > condition to check, message and extra data to show when the test case > fails. Let's provide the message, because of two reasons: > > * It eases initial analyzing of a fail when it occurs and so generally > recommended. > * In case of fail the second return value of pcall() will be shown as > the message and it looks as unintended effect. > > I think it would be good to provide a regression test of the same kind > for json.encode(). > >> >> -- >> -- gh-3514: fix parsing integers with exponent in json >> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c >> index 3d25814f3..5925e7e6f 100644 >> --- a/third_party/lua-cjson/lua_cjson.c >> +++ b/third_party/lua-cjson/lua_cjson.c >> @@ -1004,13 +1004,22 @@ static int json_decode(lua_State *l) >> luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1, >> "expected 1 or 2 arguments"); >> >> + struct luaL_serializer *cfg = luaL_checkserializer(l); >> + struct luaL_serializer user_cfg; >> + /* >> + * user_cfg is per-call local version of global cfg: it is >> + * used if user passes custom options to :decode() method >> + * as a separate arguments. In this case it is required >> + * to avoid modifying global parameters. Life span of >> + * user_cfg is restricted by the scope of :decode() so it >> + * is enough to allocate it on the stack. >> + */ > Nit: We usually provide a description before a field / function / > variable, not after. > >> + json.cfg = cfg; >> if (lua_gettop(l) == 2) { >> - struct luaL_serializer *user_cfg = luaL_checkserializer(l); >> - luaL_serializer_parse_options(l, user_cfg); >> + user_cfg = *cfg; > Technically it also copies triggers. I would rather use > luaL_serializer_copy_options() and left a comment that triggers are left > uninitialized intentionally: the code should not run them. It is better > because of two reasons: > > * trigger_run() would segfault if it will be called on user_cfg somehow > and it will explicitly shows that something is going in a wrong way > (likely on tests during development). It is better than change other > serializer options using a trigger. > * It is more obvious what is going here (I guess that the problem that > you fixed here appears purely because this assignment was considered > as the mistake and a pointer assignment should be here). > >> + luaL_serializer_parse_options(l, &user_cfg); >> lua_pop(l, 1); >> - json.cfg = user_cfg; >> - } else { >> - json.cfg = luaL_checkserializer(l); >> + json.cfg = &user_cfg; >> } >> >> json.data = luaL_checklstring(l, 1, &json_len); -Subject: [PATCH] json: fix silent change of global json settings +Subject: [PATCH] json: don't spoil instance options with per-call -When json.decode is used with 2 arguments, 2nd argument seeps out to global -json settings. Morover, due to current serialier.cfg implementation it -remains invisible while checking settings by json.cfg. To prevent sucj -behaviour we stop writing to global serializer struct and use local one, -to get one-time action. -As was mention before json.cfg can not be trusted in this case, so to check that -everything remained unchanged we call decode twice with and without 2nd -argument. +When json.decode is used with 2 arguments, 2nd argument seeps out to the json +configuration of the instance. Moreover, due to current serializer.cfg +implementation it remains invisible while checking settings by json.cfg. + +This fixes commit 6508ddb ("json: fix stack-use-after-scope in json_decode()"). -diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua -index fadfc74ec..a6b36ff3d 100755 ---- a/test/app-tap/json.test.lua -+++ b/test/app-tap/json.test.lua -@@ -22,7 +22,7 @@ end - - tap.test("json", function(test) -     local serializer = require('json') -- test:plan(40) -+ test:plan(41) - -     test:test("unsigned", common.test_unsigned, serializer) -     test:test("signed", common.test_signed, serializer) -@@ -94,6 +94,11 @@ tap.test("json", function(test) -             'error: too many nested data structures') -     test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth, -             'global option remains unchanged') ++++ b/test/app-tap/gh-4761-fix-json-decode.test.lua +@@ -0,0 +1,28 @@ ++#!/usr/bin/env tarantool ++ ++local tap = require('tap') ++tap.test("json", function(test) ++    local serializer = require('json') ++ test:plan(4) ++  + -- -+    -- gh-4761 json.decode silently changes global settings of json when called -+    -- with 2d parameter ++    -- gh-4761: json.decode silently changes instances settings when called ++    -- with 2nd parameter.  + -- -+ test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}')) - - -- -     -- gh-3514: fix parsing integers with exponent in json ++    test:ok(not pcall(serializer.decode, '{"foo":{"bar": 1}}', ++                      {decode_max_depth = 1}), ++            'error: too many nested data structures') ++    test:ok(pcall(serializer.decode, '{"foo":{"bar": 1}}'), ++            'json instance settings are unchanged') ++ ++ -- ++    -- Same check for json.encode. ++ -- ++    local nan = 1/0 ++    test:ok(not pcall(serializer.encode, {a = 1/0}, ++                      {encode_invalid_numbers = false}), ++            'expected error with NaN encoding with .encode') ++    test:ok(pcall(serializer.encode, {a=nan}), ++            "json instance settings are unchanged") ++ ++end)  diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c -index 3d25814f3..f855cbd80 100644 +index 3d25814f3..8ff2ca89a 100644  --- a/third_party/lua-cjson/lua_cjson.c  +++ b/third_party/lua-cjson/lua_cjson.c -@@ -1004,13 +1004,13 @@ static int json_decode(lua_State *l) +@@ -1004,13 +1004,26 @@ static int json_decode(lua_State *l)       luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,                     "expected 1 or 2 arguments");  +    struct luaL_serializer *cfg = luaL_checkserializer(l); -+    struct luaL_serializer user_cfg = *cfg; ++ ++    /* user_cfg is per-call local version of global cfg: it is ++     * used if user passes custom options to :decode() method ++     * as a separate arguments. In this case it is required ++     * to avoid modifying global parameters. Life span of ++     * user_cfg is restricted by the scope of :decode() so it ++     * is enough to allocate it on the stack. */ ++    struct luaL_serializer user_cfg;  +    json.cfg = cfg;       if (lua_gettop(l) == 2) {  -        struct luaL_serializer *user_cfg = luaL_checkserializer(l);  -        luaL_serializer_parse_options(l, user_cfg); -+        luaL_serializer_parse_options(l, &user_cfg); -         lua_pop(l, 1); +-        lua_pop(l, 1);  -        json.cfg = user_cfg;  -    } else {