[Tarantool-patches] [PATCH] json: fix silent change of global json settings
Alexander Turenko
alexander.turenko at tarantool.org
Fri Feb 14 02:17:44 MSK 2020
> Issue:https://github.com/tarantool/tarantool/issues/4761
> Branch:https://github.com/tarantool/tarantool/tree/OKriw/gh-4761-json.decode-silently-changes-config-when-used-with-config-settings
Nice catch!
The patch itself is quite straightforward and obviously okay. I have
several comments around wording, formatting and the test case.
I'll paste the commit from the updated branch to comment it inline.
> commit ddc51e76469f86b29ba0c37c6e2911bbba569bdb
> Author: Olga Arkhangelskaia <arkholga at tarantool.org>
> Date: Wed Feb 5 14:05:25 2020 +0300
>
> json: fix silent change of global json settings
'global' is not a precise term here: it can be arbitrary json
(de)serializer instance (created by json.new()), not necessarily the
default one. It is better to say 'instance options'. Something like
"don't spoil instance options with per-call ones".
>
> When json.decode is used with 2 arguments, 2nd argument seeps out to global
> json settings. Moreover, due to current serializer.cfg implementation it
Same for 'global'.
> remains invisible while checking settings by json.cfg. To prevent such
> behaviour we stop writing to global serializer struct and use local one,
> to get one-time action.
> As was mention before json.cfg can not be trusted in this case, so to check that
> everything remained unchanged we call decode twice with and without 2nd
> argument.
Didn't get the second paragraph. Is it about the test? We usually don't
describe tests in commit messages, but rather provide comments in the
code of a test when it is necessary.
Please, mention the commit where the degradation occurs (see [2] for
example).
>
> Closes #4761
Several nits:
* Fit a commit body within 72 symbols.
* It is better to split paragraphs with an empty line.
See [1] for formal rules and examples.
[1]: https://www.tarantool.io/en/doc/1.10/dev_guide/developer_guidelines/
[2]: https://github.com/tarantool/tarantool/commit/ccacba28f813fb99fd9eaf07fb41bf604dd341bc
>
> diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
> index fadfc74ec..a6b36ff3d 100755
> --- a/test/app-tap/json.test.lua
> +++ b/test/app-tap/json.test.lua
> @@ -22,7 +22,7 @@ end
>
> tap.test("json", function(test)
> local serializer = require('json')
> - test:plan(40)
> + test:plan(41)
>
> test:test("unsigned", common.test_unsigned, serializer)
> test:test("signed", common.test_signed, serializer)
> @@ -94,6 +94,11 @@ tap.test("json", function(test)
> 'error: too many nested data structures')
> test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth,
> 'global option remains unchanged')
> + --
> + -- gh-4761 json.decode silently changes global settings of json when called
> + -- with 2d parameter
> + --
Several nits:
* Separate the comment from the previous test case with an empty line.
* Use a colon after 'gh-xxxx' to unify it with other descriptions within
the file.
* Fit a comment within 66 symbols (see [3]; it is for C, but in fact we
apply the rule to Lua).
* Typo: 2d -> 2nd.
* Same as above for 'global'.
[3]: https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/
> + test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))
The test case should be as independent from the others as possible. Here
it uses the previous one, which calls <json instance>.decode() with
per-call options. Moreover, the 'Tarantool Engineer Standard Operating
Procedures' document now obligates a developer to add a test case for a
bug fix within a separate file (see the 'Writing tests' section), and the
reason is mostly to push developers to guarantee test case
independence. Please, note that it also prescribes a certain naming policy.
You can use '{"foo": "bar"}' json string and {decode_max_depth = 1}
option to minimize the test case. It will be easier to read.
test:ok() has four parameters: the test object (the colon adds it), the
condition to check, a message, and extra data to show when the test case
fails. Let's provide the message, for two reasons:
* It eases the initial analysis of a failure when it occurs and so is
generally recommended.
* In case of a failure the second return value of pcall() will be shown
as the message, which looks like an unintended effect.
I think it would be good to provide a regression test of the same kind
for json.encode().
>
> --
> -- gh-3514: fix parsing integers with exponent in json
> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
> index 3d25814f3..5925e7e6f 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c
> @@ -1004,13 +1004,22 @@ static int json_decode(lua_State *l)
> luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
> "expected 1 or 2 arguments");
>
> + struct luaL_serializer *cfg = luaL_checkserializer(l);
> + struct luaL_serializer user_cfg;
> + /*
> + * user_cfg is per-call local version of global cfg: it is
> + * used if user passes custom options to :decode() method
> + * as a separate arguments. In this case it is required
> + * to avoid modifying global parameters. Life span of
> + * user_cfg is restricted by the scope of :decode() so it
> + * is enough to allocate it on the stack.
> + */
Nit: We usually provide a description before a field / function /
variable, not after.
> + json.cfg = cfg;
> if (lua_gettop(l) == 2) {
> - struct luaL_serializer *user_cfg = luaL_checkserializer(l);
> - luaL_serializer_parse_options(l, user_cfg);
> + user_cfg = *cfg;
Technically it also copies triggers. I would rather use
luaL_serializer_copy_options() and leave a comment that triggers are left
uninitialized intentionally: the code should not run them. It is better
for two reasons:
* trigger_run() would segfault if it were somehow called on user_cfg,
which would explicitly show that something is going wrong (likely in
tests during development). That is better than changing other
serializer options using a trigger.
* It is more obvious what is going on here (I guess that the problem
you fixed here appeared purely because this assignment was considered
a mistake and a pointer assignment was expected here).
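To illustrate the difference between a whole-struct assignment and an
options-only copy, here is a minimal standalone sketch. All names in it
(demo_options, demo_serializer, demo_copy_options) are hypothetical and
it is not the real luaL_serializer layout. Unlike the suggestion above,
the sketch sets the trigger field to NULL rather than leaving it
uninitialized, only to make the effect observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain options of a serializer. */
struct demo_options {
	int encode_max_depth;
	int decode_max_depth;
};

/* Hypothetical serializer: options plus per-instance trigger state. */
struct demo_serializer {
	struct demo_options opts;
	/* Stand-in for a trigger list head; not an option. */
	void *triggers;
};

/*
 * Copy only the options from src to dst. The trigger state is
 * deliberately not copied: a per-call copy must never run (or own)
 * the instance's triggers. NULL is an assumption of this sketch.
 */
static void
demo_copy_options(struct demo_serializer *dst,
		  const struct demo_serializer *src)
{
	assert(dst != NULL && src != NULL);
	dst->opts = src->opts;
	dst->triggers = NULL;
}
```

With 'local = *instance' the triggers pointer would be shared between
the instance and the per-call copy; with demo_copy_options() only the
option values travel.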
> + luaL_serializer_parse_options(l, &user_cfg);
> lua_pop(l, 1);
> - json.cfg = user_cfg;
> - } else {
> - json.cfg = luaL_checkserializer(l);
> + json.cfg = &user_cfg;
> }
>
> json.data = luaL_checklstring(l, 1, &json_len);
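For reference, the shape of the bug and of the fix can be reduced to the
following standalone sketch. It is not the lua_cjson.c code: demo_cfg,
demo_parse_options and both demo_decode_* functions are hypothetical,
and "decoding" is reduced to returning the effective depth limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical serializer instance options. */
struct demo_cfg {
	int decode_max_depth;
};

/* Stand-in for parsing per-call options into a cfg struct. */
static void
demo_parse_options(struct demo_cfg *cfg, int max_depth)
{
	cfg->decode_max_depth = max_depth;
}

/*
 * Old shape: per-call options are parsed straight into the instance,
 * so they silently persist after the call.
 */
static int
demo_decode_leaky(struct demo_cfg *instance, const int *per_call_depth)
{
	if (per_call_depth != NULL)
		demo_parse_options(instance, *per_call_depth);
	return instance->decode_max_depth;
}

/*
 * Fixed shape: per-call options go into a stack copy that lives only
 * for the duration of the call; the instance is left untouched.
 */
static int
demo_decode_fixed(const struct demo_cfg *instance,
		  const int *per_call_depth)
{
	const struct demo_cfg *cfg = instance;
	struct demo_cfg user_cfg;
	if (per_call_depth != NULL) {
		user_cfg = *instance;
		demo_parse_options(&user_cfg, *per_call_depth);
		cfg = &user_cfg;
	}
	return cfg->decode_max_depth;
}
```

A regression check then boils down to: call with a per-call option, and
verify that a following call without options still sees the instance's
original value.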