From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp40.i.mail.ru (smtp40.i.mail.ru [94.100.177.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 4A920469719 for ; Sun, 16 Feb 2020 03:06:26 +0300 (MSK) Date: Sun, 16 Feb 2020 03:06:43 +0300 From: Alexander Turenko Message-ID: <20200216000643.qfqpnlszbpipvic3@tkn_work_nb> References: <20200210075707.86953-1-arkholga@tarantool.org> <20200213231744.tag2ign4ed6avwmx@tkn_work_nb> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Subject: Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Olga Arkhangelskaia Cc: tarantool-patches@dev.tarantool.org Pushed to master, 2.3, 2.2 and 1.10 with minor changes. CCed Kirill. The changes I made are described below and I hope they are NOT about my personal feeling how code should be organized and commented, but how the whole team operates on it. I explained a reason that is under each change (where it is not very obvious), so you may verify future patches versus those points youself. Some changes, however, may look as my personal taste of a good code shape. Excuse me if it is so, I tried to refrain myself from this kind of modifications. WBR, Alexander Turenko. ---- Pasted the new patch to comment it, because the diff of diffs you pasted does not have all recent changes (at least usage of luaL_serializer_copy_options()). > commit cc7e3fc49cc71ac0b4616c6f073a40fdc7ed3979 > Author: Olga Arkhangelskaia > Date: Wed Feb 5 14:05:25 2020 +0300 > > json: don't spoil instance options with per-call I see, you removed 'ones' from the end to fit 50 symbols limit. However 'per-call' is adjective and it can not be used w/o a noun (or noun group). So I rephrased it a bit: | json: don't spoil instance with per-call options > > When json.decode is used with 2 arguments, 2nd argument seeps out to the json > configuration of the instance. Moreover, due to current serializer.cfg > implementation it remains invisible while checking settings by json.cfg. Reformatted to fit 72 symbols. The limit comes from the tradition to format any text in the way that will look good on 80x25 screen: 4 symbols for the indent `git log` adds before the text, 72 for the text itself and 4 symbols to make the text visually centered on the screen. Changed 'by json.cfg' (at end of the sentence) to 'using json.cfg table', it looks more strict. > > This fixes commit 6508ddb ("json: fix stack-use-after-scope in json_decode()"). Use full 40 hex digits commit hash to follow our usual style (GitHub renders it to shorter hash, but git log shows as is). > > Closes #4761 > > diff --git a/test/app-tap/gh-4761-fix-json-decode.test.lua b/test/app-tap/gh-4761-fix-json-decode.test.lua > new file mode 100755 > index 000000000..d10849317 > --- /dev/null > +++ b/test/app-tap/gh-4761-fix-json-decode.test.lua Changed the test name to 'gh-4761-json-per-call-options.test.lua' to follow the style of other 'gh-*' regression tests: it is either noun group describing a component where the problem was or a noun group explaining the problem itself. Not a sentence in the imperative mood. > @@ -0,0 +1,28 @@ > +#!/usr/bin/env tarantool > + > +local tap = require('tap') > +tap.test("json", function(test) Changed double quites to single quotes to unify the style. Changed the test name to 'gh-4761-json-per-call-options', because 'json' looks too common here. > + local serializer = require('json') Moved the 'require' statement to the top level and changed the variable name to 'json': it is more usual for our Lua code. > + test:plan(4) > + > + -- > + -- gh-4761: json.decode silently changes instances settings when called > + -- with 2nd parameter. Cite from the previous review notes: | * Fit a comment within 66 symbols (see [2]; it is for C, but in fact we | apply the rule to Lua). | | [2]: https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/ Reformatted the comment to follow this rule. Fixed typo: 'instances' -> 'instance' (plural is not needed here). Moved the comment above: since we test both :decode() and :encode() and the comment describes the situation that is theoretically possible for both methods, it looks more applicable for the entire test. Added 'verify json.encode as well' sentence. > + -- > + test:ok(not pcall(serializer.decode, '{"foo":{"bar": 1}}', > + {decode_max_depth = 1}), > + 'error: too many nested data structures') I would use test:ok() and so for the test case itself, but assert() for test preparation code. I would also split the call to :decode() and result verification code for readability matters. So, rewrote in this way: | -- Preparation code: call :decode() with a custom option. | local ok, err = pcall(json.decode, '{"foo": {"bar": 1}}', | {decode_max_depth = 1}) | assert(not ok, 'expect "too many nested data structures" error') > + test:ok(pcall(serializer.decode, '{"foo":{"bar": 1}}'), > + 'json instance settings are unchanged') Splitted :decode() call from test:ok() as above. Also verify the result of decoding: | -- Verify that the instance option remains unchanged. | local exp_res = {foo = {bar = 1}} | local ok, res = pcall(json.decode, '{"foo": {"bar": 1}}') | test:is_deeply({ok, res}, {true, exp_res}, | 'json instance settings remain unchanged after :decode()') Added :decode() to the test case name to differentiate it from the one below (re :encode()) in the TAP13 output. > + > + -- > + -- Same check for json.encode. > + -- > + local nan = 1/0 > + test:ok(not pcall(serializer.encode, {a = 1/0}, Changed '1/0' to 'nan'. > + {encode_invalid_numbers = false}), > + 'expected error with NaN encoding with .encode') > + test:ok(pcall(serializer.encode, {a=nan}), > + "json instance settings are unchanged") Made the changes in the spirit of ones described above. > + > +end) It is good to exit from a test with a non-zero exit code in case of a fail (because our test harness may miss some kinds of fails listed in TAP13 output), so I have added the check of tap.test() result: | local res = tap.test(<...>, function(test) | <...test cases...> | end) | | os.exit(res and 0 or 1) It is mentioned in [1] for the following form: | local test = tap.test(<...>) | <...test cases...> | os.exit(test:check() and 0 or 1) test:check() returns exactly same code as tap.test(<...>) when the latter is called with a function. [1]: https://github.com/tarantool/doc/issues/1004 > diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c > index 3d25814f3..8ff2ca89a 100644 > --- a/third_party/lua-cjson/lua_cjson.c > +++ b/third_party/lua-cjson/lua_cjson.c > @@ -1004,13 +1004,26 @@ static int json_decode(lua_State *l) > luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1, > "expected 1 or 2 arguments"); > > + struct luaL_serializer *cfg = luaL_checkserializer(l); > + > + /* user_cfg is per-call local version of global cfg: it is Changed 'global cfg' to 'serializer instance options'. > + * used if user passes custom options to :decode() method Changed: user -> a user. > + * as a separate arguments. In this case it is required Changed 'arguments' to the singular form. > + * to avoid modifying global parameters. Life span of Changed 'global parameters' to 'options of the instance'. > + * user_cfg is restricted by the scope of :decode() so it > + * is enough to allocate it on the stack. */ > + struct luaL_serializer user_cfg; We use two styles of comments (for one-line and multi-line comments): | /* Hello. */ | /* | * Hello, | * world. | */ Changed the comment to follow the latter style. If you're curious, Linus about comments style (strong language, you have been warned): https://lkml.org/lkml/2016/7/8/625 > + json.cfg = cfg; > if (lua_gettop(l) == 2) { > - struct luaL_serializer *user_cfg = luaL_checkserializer(l); > - luaL_serializer_parse_options(l, user_cfg); > - lua_pop(l, 1); > - json.cfg = user_cfg; > - } else { > - json.cfg = luaL_checkserializer(l); > + > + /* Only copy cfg options and left update triggers, see > + * struct luaL_serializer, uninitialized. The code should > + * never run them, since we must not change the json config of the > + * instance. */ > + luaL_serializer_copy_options(&user_cfg, cfg); > + luaL_serializer_parse_options(l, &user_cfg); > + lua_pop(l, 1); > + json.cfg = &user_cfg; > } Tab / space mix. The code is inherited from an other source, so we should follow the surrounding style and use spaces for indentation. Fixed. I rewrote the comment: | /* | * on_update triggers are left uninitialized for user_cfg. | * The decoding code don't (and shouldn't) run them. | */ The reason is to make the comment technically strict: * The code not only copy options from a serializer instance, but also parse and apply user-provided options. I removed this clause to don't describe this (it is quite obvious from the code now). * Only changes from Lua run on_update triggers, because C code don't change options expect when a serializer is created or when the change is result of on_update() trigger run. So even if the decoding code would change an option the triggers will not be run. So I only said that triggers don't and shouldn't be run rather than that the decoding code don't change the options. Hope you don't mind. > > json.data = luaL_checklstring(l, 1, &json_len);