From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp44.i.mail.ru (smtp44.i.mail.ru [94.100.177.104]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 1A5CF469719 for ; Fri, 14 Feb 2020 02:17:29 +0300 (MSK) Date: Fri, 14 Feb 2020 02:17:44 +0300 From: Alexander Turenko Message-ID: <20200213231744.tag2ign4ed6avwmx@tkn_work_nb> References: <20200210075707.86953-1-arkholga@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20200210075707.86953-1-arkholga@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Olga Arkhangelskaia Cc: tarantool-patches@dev.tarantool.org > Issue:https://github.com/tarantool/tarantool/issues/4761 > Branch:https://github.com/tarantool/tarantool/tree/OKriw/gh-4761-json.decode-silently-changes-config-when-used-with-config-settings Nice catch! The patch itself is quite straighforward and obviously okay. I have several comments around wording, formatting and the test case. I'll paste the commit from the updated branch to comment it inline. > commit ddc51e76469f86b29ba0c37c6e2911bbba569bdb > Author: Olga Arkhangelskaia > Date: Wed Feb 5 14:05:25 2020 +0300 > > json: fix silent change of global json settings 'global' is not precise term here: it can be arbitrary json (de)serializer instance (created by json.new()), not necessarily the default one. It is better to say 'instance options'. Something like "don't spoil instance options with per-call ones". > > When json.decode is used with 2 arguments, 2nd argument seeps out to global > json settings. Moreover, due to current serializer.cfg implementation it Same for 'global'. > remains invisible while checking settings by json.cfg. To prevent such > behaviour we stop writing to global serializer struct and use local one, > to get one-time action. > As was mention before json.cfg can not be trusted in this case, so to check that > everything remained unchanged we call decode twice with and without 2nd > argument. Didn't get the second paragraph. Is it about the test? We usually don't describe tests in commit messages, but rather provide comments in the code of a test when it is necessary. Please, mention the commit where the degradation occurs (see [2] for example). > > Closes #4761 Several nits: * Fit a commit body within 72 symbols. * It is better to split paragraphs with an empty line. See [1] for formal rules and examples. [1]: https://www.tarantool.io/en/doc/1.10/dev_guide/developer_guidelines/ [2]: https://github.com/tarantool/tarantool/commit/ccacba28f813fb99fd9eaf07fb41bf604dd341bc > > diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua > index fadfc74ec..a6b36ff3d 100755 > --- a/test/app-tap/json.test.lua > +++ b/test/app-tap/json.test.lua > @@ -22,7 +22,7 @@ end > > tap.test("json", function(test) > local serializer = require('json') > - test:plan(40) > + test:plan(41) > > test:test("unsigned", common.test_unsigned, serializer) > test:test("signed", common.test_signed, serializer) > @@ -94,6 +94,11 @@ tap.test("json", function(test) > 'error: too many nested data structures') > test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth, > 'global option remains unchanged') > + -- > + -- gh-4761 json.decode silently changes global settings of json when called > + -- with 2d parameter > + -- Several nits: * Separate the comment from the previous test case with an empty line. * Use a colon after 'gh-xxxx' to unify it with other descriptions within the file. * Fit a comment within 66 symbols (see [2]; it is for C, but in fact we apply the rule to Lua). * Typo: 2d -> 2nd. * Same as above for 'global'. [2]: https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/ > + test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}')) The test case should be as much independent from other as possible. Here it uses the previous one, which calls .decode() with per-call options. Moreover, 'Tarantool Engineer Standard Operating Procedures' document now obligates a developer to add a test case for a bug fix within a separate file (see 'Writing tests' section) and the reason is mostly to push developers to guarantee test cases independence. Please, note that it also holds a certain naming policy. You can use '{"foo": "bar"}' json string and {decode_max_depth = 1} option to minimize the test case. It will be easier to read. test.ok() has four parameters: test object (the colon adds it), condition to check, message and extra data to show when the test case fails. Let's provide the message, because of two reasons: * It eases initial analyzing of a fail when it occurs and so generally recommended. * In case of fail the second return value of pcall() will be shown as the message and it looks as unintended effect. I think it would be good to provide a regression test of the same kind for json.encode(). > > -- > -- gh-3514: fix parsing integers with exponent in json > diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c > index 3d25814f3..5925e7e6f 100644 > --- a/third_party/lua-cjson/lua_cjson.c > +++ b/third_party/lua-cjson/lua_cjson.c > @@ -1004,13 +1004,22 @@ static int json_decode(lua_State *l) > luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1, > "expected 1 or 2 arguments"); > > + struct luaL_serializer *cfg = luaL_checkserializer(l); > + struct luaL_serializer user_cfg; > + /* > + * user_cfg is per-call local version of global cfg: it is > + * used if user passes custom options to :decode() method > + * as a separate arguments. In this case it is required > + * to avoid modifying global parameters. Life span of > + * user_cfg is restricted by the scope of :decode() so it > + * is enough to allocate it on the stack. > + */ Nit: We usually provide a description before a field / function / variable, not after. > + json.cfg = cfg; > if (lua_gettop(l) == 2) { > - struct luaL_serializer *user_cfg = luaL_checkserializer(l); > - luaL_serializer_parse_options(l, user_cfg); > + user_cfg = *cfg; Technically it also copies triggers. I would rather use luaL_serializer_copy_options() and left a comment that triggers are left uninitialized intentionally: the code should not run them. It is better because of two reasons: * trigger_run() would segfault if it will be called on user_cfg somehow and it will explicitly shows that something is going in a wrong way (likely on tests during development). It is better than change other serializer options using a trigger. * It is more obvious what is going here (I guess that the problem that you fixed here appears purely because this assignment was considered as the mistake and a pointer assignment should be here). > + luaL_serializer_parse_options(l, &user_cfg); > lua_pop(l, 1); > - json.cfg = user_cfg; > - } else { > - json.cfg = luaL_checkserializer(l); > + json.cfg = &user_cfg; > } > > json.data = luaL_checklstring(l, 1, &json_len);