[Tarantool-patches] [PATCH] json: fix silent change of global json settings

Alexander Turenko alexander.turenko at tarantool.org
Fri Feb 14 02:17:44 MSK 2020


> Issue:https://github.com/tarantool/tarantool/issues/4761
> Branch:https://github.com/tarantool/tarantool/tree/OKriw/gh-4761-json.decode-silently-changes-config-when-used-with-config-settings

Nice catch!

The patch itself is quite straightforward and obviously okay. I have
several comments around wording, formatting and the test case.

I'll paste the commit from the updated branch to comment it inline.

> commit ddc51e76469f86b29ba0c37c6e2911bbba569bdb
> Author: Olga Arkhangelskaia <arkholga at tarantool.org>
> Date:   Wed Feb 5 14:05:25 2020 +0300
> 
>     json: fix silent change of global json settings

'global' is not a precise term here: it can be an arbitrary json
(de)serializer instance (created by json.new()), not necessarily the
default one. It is better to say 'instance options'. Something like
"don't spoil instance options with per-call ones".

>     
>     When json.decode is used with 2 arguments, 2nd argument seeps out to global
>     json settings. Moreover, due to current serializer.cfg implementation it

Same for 'global'.

>     remains invisible while checking settings by json.cfg. To prevent such
>     behaviour we stop writing to global serializer struct and use local one,
>     to get one-time action.
>     As was mention before json.cfg can not be trusted in this case, so to check that
>     everything remained unchanged we call decode twice with and without 2nd
>     argument.

I didn't get the second paragraph. Is it about the test? We usually
don't describe tests in commit messages, but rather provide comments in
the code of a test when necessary.

Please mention the commit that introduced the regression (see [2] for
an example).

>     
>     Closes #4761

Several nits:

* Keep the commit body lines within 72 characters.
* Separate paragraphs with an empty line.

See [1] for formal rules and examples.

[1]: https://www.tarantool.io/en/doc/1.10/dev_guide/developer_guidelines/
[2]: https://github.com/tarantool/tarantool/commit/ccacba28f813fb99fd9eaf07fb41bf604dd341bc

> 
> diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
> index fadfc74ec..a6b36ff3d 100755
> --- a/test/app-tap/json.test.lua
> +++ b/test/app-tap/json.test.lua
> @@ -22,7 +22,7 @@ end
>  
>  tap.test("json", function(test)
>      local serializer = require('json')
> -    test:plan(40)
> +    test:plan(41)
>  
>      test:test("unsigned", common.test_unsigned, serializer)
>      test:test("signed", common.test_signed, serializer)
> @@ -94,6 +94,11 @@ tap.test("json", function(test)
>              'error: too many nested data structures')
>      test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth,
>              'global option remains unchanged')
> +    --
> +    -- gh-4761 json.decode silently changes global settings of json when called
> +    -- with 2d parameter
> +    --

Several nits:

* Separate the comment from the previous test case with an empty line.
* Use a colon after 'gh-xxxx' to unify it with other descriptions within
  the file.
* Keep comment lines within 66 characters (see [3]; it is for C, but in
  practice we apply the rule to Lua as well).
* Typo: 2d -> 2nd.
* Same as above for 'global'.

[3]: https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/

> +    test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))

The test case should be as independent from the others as possible.
Here it relies on the previous one, which calls <json instance>.decode()
with per-call options. Moreover, the 'Tarantool Engineer Standard
Operating Procedures' document now obliges a developer to add a test
case for a bug fix in a separate file (see the 'Writing tests' section),
mostly to push developers to guarantee that test cases are independent.
Please note that it also prescribes a certain naming policy.

You can use the '{"foo": "bar"}' json string and the
{decode_max_depth = 1} option to minimize the test case. It will be
easier to read.

test.ok() has four parameters: the test object (the colon adds it), the
condition to check, a message and extra data to show when the test case
fails. Let's provide the message, for two reasons:

* It eases the initial analysis of a failure when one occurs, and so is
  generally recommended.
* In case of a failure the second return value of pcall() will be shown
  as the message, which looks like an unintended effect.

I think it would be good to provide a regression test of the same kind
for json.encode().

>  
>      --
>      -- gh-3514: fix parsing integers with exponent in json
> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
> index 3d25814f3..5925e7e6f 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c
> @@ -1004,13 +1004,22 @@ static int json_decode(lua_State *l)
>      luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
>                    "expected 1 or 2 arguments");
>  
> +    struct luaL_serializer *cfg = luaL_checkserializer(l);
> +    struct luaL_serializer user_cfg;
> +    /*
> +     * user_cfg is per-call local version of global cfg: it is
> +     * used if user passes custom options to :decode() method
> +     * as a separate arguments. In this case it is required
> +     * to avoid modifying global parameters. Life span of
> +     * user_cfg is restricted by the scope of :decode() so it
> +     * is enough to allocate it on the stack.
> +     */

Nit: We usually provide a description before a field / function /
variable, not after.

> +    json.cfg = cfg;
>      if (lua_gettop(l) == 2) {
> -        struct luaL_serializer *user_cfg = luaL_checkserializer(l);
> -        luaL_serializer_parse_options(l, user_cfg);
> +        user_cfg = *cfg;

Technically it also copies triggers. I would rather use
luaL_serializer_copy_options() and leave a comment that triggers are
left uninitialized intentionally: the code should not run them. It is
better for two reasons:

* trigger_run() would segfault if it were somehow called on user_cfg,
  which would explicitly show that something is going wrong (likely in
  tests during development). That is better than changing other
  serializer options through a trigger.
* It is more obvious what is going on here (I guess the problem you
  fixed appeared purely because this assignment was considered a
  mistake, as if a pointer assignment should be here instead).
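
To illustrate the point about copying options without triggers, here
is a minimal self-contained sketch. It does not use the real Tarantool
API: struct serializer, serializer_copy_options() and
decode_depth_limit() are hypothetical stand-ins for
struct luaL_serializer, an options-only copy helper and json_decode(),
and the single function pointer stands in for the trigger list.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct luaL_serializer. */
struct serializer {
    int decode_max_depth;
    int encode_max_depth;
    /* Stand-in for the trigger list of an instance. */
    void (*on_update)(struct serializer *);
};

/*
 * Copy option fields only. The trigger field of dst is
 * intentionally left untouched: a per-call copy must never
 * run triggers.
 */
static void
serializer_copy_options(struct serializer *dst,
                        const struct serializer *src)
{
    dst->decode_max_depth = src->decode_max_depth;
    dst->encode_max_depth = src->encode_max_depth;
}

/*
 * Return the depth limit a decode call would use. When a
 * per-call limit is passed, it is applied to a stack-local
 * copy of the options, so the instance is never modified.
 */
static int
decode_depth_limit(const struct serializer *instance,
                   const int *per_call_depth)
{
    const struct serializer *cfg = instance;
    /* on_update is left uninitialized on purpose. */
    struct serializer user_cfg;

    if (per_call_depth != NULL) {
        serializer_copy_options(&user_cfg, instance);
        user_cfg.decode_max_depth = *per_call_depth;
        cfg = &user_cfg;
    }
    return cfg->decode_max_depth;
}
```

With an instance whose decode_max_depth is 128, a call with a per-call
limit of 1 uses 1, while the instance keeps 128 and a following call
without per-call options uses 128 again.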

> +        luaL_serializer_parse_options(l, &user_cfg);
>          lua_pop(l, 1);
> -        json.cfg = user_cfg;
> -    } else {
> -        json.cfg = luaL_checkserializer(l);
> +        json.cfg = &user_cfg;
>      }
>  
>      json.data = luaL_checklstring(l, 1, &json_len);

