* [Tarantool-patches] [PATCH] json: fix silent change of global json settings
@ 2020-02-10 7:57 Olga Arkhangelskaia
2020-02-10 13:08 ` Nikita Pettik
2020-02-13 23:17 ` Alexander Turenko
0 siblings, 2 replies; 9+ messages in thread
From: Olga Arkhangelskaia @ 2020-02-10 7:57 UTC (permalink / raw)
To: tarantool-patches
When json.decode is used with 2 arguments, 2nd argument seeps out to global
json settings. Morover, due to current serialier.cfg implementation it
remains invisible while checking settings by json.cfg. To prevent sucj
behaviour we stop writing to global serializer struct and use local one,
to get one-time action.
As was mention before json.cfg can not be trusted in this case, so to check that
everything remained unchanged we call decode twice with and without 2nd
argument.
Closes #4761
---
Issue:https://github.com/tarantool/tarantool/issues/4761
Branch:https://github.com/tarantool/tarantool/tree/OKriw/gh-4761-json.decode-silently-changes-config-when-used-with-config-settings
test/app-tap/json.test.lua | 7 ++++++-
third_party/lua-cjson/lua_cjson.c | 10 +++++-----
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
index fadfc74ec..a6b36ff3d 100755
--- a/test/app-tap/json.test.lua
+++ b/test/app-tap/json.test.lua
@@ -22,7 +22,7 @@ end
tap.test("json", function(test)
local serializer = require('json')
- test:plan(40)
+ test:plan(41)
test:test("unsigned", common.test_unsigned, serializer)
test:test("signed", common.test_signed, serializer)
@@ -94,6 +94,11 @@ tap.test("json", function(test)
'error: too many nested data structures')
test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth,
'global option remains unchanged')
+ --
+ -- gh-4761 json.decode silently changes global settings of json when called
+ -- with 2d parameter
+ --
+ test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))
--
-- gh-3514: fix parsing integers with exponent in json
diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
index 3d25814f3..f855cbd80 100644
--- a/third_party/lua-cjson/lua_cjson.c
+++ b/third_party/lua-cjson/lua_cjson.c
@@ -1004,13 +1004,13 @@ static int json_decode(lua_State *l)
luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
"expected 1 or 2 arguments");
+ struct luaL_serializer *cfg = luaL_checkserializer(l);
+ struct luaL_serializer user_cfg = *cfg;
+ json.cfg = cfg;
if (lua_gettop(l) == 2) {
- struct luaL_serializer *user_cfg = luaL_checkserializer(l);
- luaL_serializer_parse_options(l, user_cfg);
+ luaL_serializer_parse_options(l, &user_cfg);
lua_pop(l, 1);
- json.cfg = user_cfg;
- } else {
- json.cfg = luaL_checkserializer(l);
+ json.cfg = &user_cfg;
}
json.data = luaL_checklstring(l, 1, &json_len);
--
2.20.1 (Apple Git-117)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings
2020-02-10 7:57 [Tarantool-patches] [PATCH] json: fix silent change of global json settings Olga Arkhangelskaia
@ 2020-02-10 13:08 ` Nikita Pettik
2020-02-11 8:46 ` Olga Arkhangelskaia
2020-02-13 23:17 ` Alexander Turenko
1 sibling, 1 reply; 9+ messages in thread
From: Nikita Pettik @ 2020-02-10 13:08 UTC (permalink / raw)
To: Olga Arkhangelskaia; +Cc: tarantool-patches
On 10 Feb 10:57, Olga Arkhangelskaia wrote:
> When json.decode is used with 2 arguments, 2nd argument seeps out to global
> json settings. Morover,
Nit: Moreover.
> due to current serialier.cfg implementation it
-> serializer
> remains invisible while checking settings by json.cfg. To prevent sucj
-> such
> behaviour we stop writing to global serializer struct and use local one,
> to get one-time action.
> As was mention before json.cfg can not be trusted in this case, so to check that
> everything remained unchanged we call decode twice with and without 2nd
> argument.
>
> Closes #4761
Note that there's no 'Closes #4761' label on your actual branch.
I guess you simply forgot to push updated branch.
> ---
> + --
> + -- gh-4761 json.decode silently changes global settings of json when called
> + -- with 2d parameter
> + --
> + test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))
>
> --
> -- gh-3514: fix parsing integers with exponent in json
> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
> index 3d25814f3..f855cbd80 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c
> @@ -1004,13 +1004,13 @@ static int json_decode(lua_State *l)
> luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
> "expected 1 or 2 arguments");
>
> + struct luaL_serializer *cfg = luaL_checkserializer(l);
Nit: I'd add a brief comment here (to avoid any confusions concerning
copying object on the stack):
diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
index f855cbd80..c9c987c8c 100644
--- a/third_party/lua-cjson/lua_cjson.c
+++ b/third_party/lua-cjson/lua_cjson.c
@@ -1006,6 +1006,14 @@ static int json_decode(lua_State *l)
struct luaL_serializer *cfg = luaL_checkserializer(l);
struct luaL_serializer user_cfg = *cfg;
+ /*
+ * user_cfg is per-call local version of global cfg: it is
+ * used if user passes custom options to :decode() method
+ * as a separate arguments. In this case it is required
+ * to avoid modifying global parameters. Life span of
+ * user_cfg is restricted by the scope of :decode() so it
+ * is enough to allocate it on the stack.
+ */
json.cfg = cfg;
if (lua_gettop(l) == 2) {
> + struct luaL_serializer user_cfg = *cfg;
> + json.cfg = cfg;
What is more, you can avoid premature copying on the stack:
@@ -1005,9 +1005,10 @@ static int json_decode(lua_State *l)
"expected 1 or 2 arguments");
struct luaL_serializer *cfg = luaL_checkserializer(l);
- struct luaL_serializer user_cfg = *cfg;
+ struct luaL_serializer user_cfg;
json.cfg = cfg;
if (lua_gettop(l) == 2) {
+ user_cfg = *cfg;
luaL_serializer_parse_options(l, &user_cfg);
lua_pop(l, 1);
json.cfg = &user_cfg;
> if (lua_gettop(l) == 2) {
> - struct luaL_serializer *user_cfg = luaL_checkserializer(l);
> - luaL_serializer_parse_options(l, user_cfg);
> + luaL_serializer_parse_options(l, &user_cfg);
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings
2020-02-10 13:08 ` Nikita Pettik
@ 2020-02-11 8:46 ` Olga Arkhangelskaia
2020-02-11 12:36 ` Nikita Pettik
0 siblings, 1 reply; 9+ messages in thread
From: Olga Arkhangelskaia @ 2020-02-11 8:46 UTC (permalink / raw)
To: Nikita Pettik; +Cc: tarantool-patches
Hi Nikita! Thanks for the review!
I fixed all issues that you have pointed out.
And pushed branch.
10.02.2020 16:08, Nikita Pettik пишет:
> On 10 Feb 10:57, Olga Arkhangelskaia wrote:
>> When json.decode is used with 2 arguments, 2nd argument seeps out to global
>> json settings. Morover,
> Nit: Moreover.
>
>> due to current serialier.cfg implementation it
> -> serializer
>
>> remains invisible while checking settings by json.cfg. To prevent sucj
> -> such
>
>> behaviour we stop writing to global serializer struct and use local one,
>> to get one-time action.
>> As was mention before json.cfg can not be trusted in this case, so to check that
>> everything remained unchanged we call decode twice with and without 2nd
>> argument.
>>
>> Closes #4761
> Note that there's no 'Closes #4761' label on your actual branch.
> I guess you simply forgot to push updated branch.
>
>> ---
>> + --
>> + -- gh-4761 json.decode silently changes global settings of json when called
>> + -- with 2d parameter
>> + --
>> + test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))
>>
>> --
>> -- gh-3514: fix parsing integers with exponent in json
>> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
>> index 3d25814f3..f855cbd80 100644
>> --- a/third_party/lua-cjson/lua_cjson.c
>> +++ b/third_party/lua-cjson/lua_cjson.c
>> @@ -1004,13 +1004,13 @@ static int json_decode(lua_State *l)
>> luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
>> "expected 1 or 2 arguments");
>>
>> + struct luaL_serializer *cfg = luaL_checkserializer(l);
> Nit: I'd add a brief comment here (to avoid any confusions concerning
> copying object on the stack):
>
> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
> index f855cbd80..c9c987c8c 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c
> @@ -1006,6 +1006,14 @@ static int json_decode(lua_State *l)
>
> struct luaL_serializer *cfg = luaL_checkserializer(l);
> struct luaL_serializer user_cfg = *cfg;
> + /*
> + * user_cfg is per-call local version of global cfg: it is
> + * used if user passes custom options to :decode() method
> + * as a separate arguments. In this case it is required
> + * to avoid modifying global parameters. Life span of
> + * user_cfg is restricted by the scope of :decode() so it
> + * is enough to allocate it on the stack.
> + */
> json.cfg = cfg;
> if (lua_gettop(l) == 2) {
>
>> + struct luaL_serializer user_cfg = *cfg;
>> + json.cfg = cfg;
> What is more, you can avoid premature copying on the stack:
>
> @@ -1005,9 +1005,10 @@ static int json_decode(lua_State *l)
> "expected 1 or 2 arguments");
>
> struct luaL_serializer *cfg = luaL_checkserializer(l);
> - struct luaL_serializer user_cfg = *cfg;
> + struct luaL_serializer user_cfg;
> json.cfg = cfg;
> if (lua_gettop(l) == 2) {
> + user_cfg = *cfg;
> luaL_serializer_parse_options(l, &user_cfg);
> lua_pop(l, 1);
> json.cfg = &user_cfg;
>
>> if (lua_gettop(l) == 2) {
>> - struct luaL_serializer *user_cfg = luaL_checkserializer(l);
>> - luaL_serializer_parse_options(l, user_cfg);
>> + luaL_serializer_parse_options(l, &user_cfg);
Here is new diff
diff --git a/third_party/lua-cjson/lua_cjson.c
b/third_party/lua-cjson/lua_cjson.c
index f855cbd80..5925e7e6f 100644
--- a/third_party/lua-cjson/lua_cjson.c
+++ b/third_party/lua-cjson/lua_cjson.c
@@ -1005,9 +1005,18 @@ static int json_decode(lua_State *l)
"expected 1 or 2 arguments");
struct luaL_serializer *cfg = luaL_checkserializer(l);
- struct luaL_serializer user_cfg = *cfg;
+ struct luaL_serializer user_cfg;
+ /*
+ * user_cfg is per-call local version of global cfg: it is
+ * used if user passes custom options to :decode() method
+ * as a separate arguments. In this case it is required
+ * to avoid modifying global parameters. Life span of
+ * user_cfg is restricted by the scope of :decode() so it
+ * is enough to allocate it on the stack.
+ */
json.cfg = cfg;
if (lua_gettop(l) == 2) {
+ user_cfg = *cfg;
luaL_serializer_parse_options(l, &user_cfg);
lua_pop(l, 1);
json.cfg = &user_cfg;
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings
2020-02-11 8:46 ` Olga Arkhangelskaia
@ 2020-02-11 12:36 ` Nikita Pettik
0 siblings, 0 replies; 9+ messages in thread
From: Nikita Pettik @ 2020-02-11 12:36 UTC (permalink / raw)
To: Olga Arkhangelskaia; +Cc: tarantool-patches
On 11 Feb 11:46, Olga Arkhangelskaia wrote:
> Hi Nikita! Thanks for the review!
>
> I fixed all issues that you have pointed out.
>
> And pushed branch.
LGTM
> 10.02.2020 16:08, Nikita Pettik пишет:
> > On 10 Feb 10:57, Olga Arkhangelskaia wrote:
>
> Here is new diff
>
> diff --git a/third_party/lua-cjson/lua_cjson.c
> b/third_party/lua-cjson/lua_cjson.c
> index f855cbd80..5925e7e6f 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c
> @@ -1005,9 +1005,18 @@ static int json_decode(lua_State *l)
> "expected 1 or 2 arguments");
>
> struct luaL_serializer *cfg = luaL_checkserializer(l);
> - struct luaL_serializer user_cfg = *cfg;
> + struct luaL_serializer user_cfg;
> + /*
> + * user_cfg is per-call local version of global cfg: it is
> + * used if user passes custom options to :decode() method
> + * as a separate arguments. In this case it is required
> + * to avoid modifying global parameters. Life span of
> + * user_cfg is restricted by the scope of :decode() so it
> + * is enough to allocate it on the stack.
> + */
> json.cfg = cfg;
> if (lua_gettop(l) == 2) {
> + user_cfg = *cfg;
> luaL_serializer_parse_options(l, &user_cfg);
> lua_pop(l, 1);
> json.cfg = &user_cfg;
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings
2020-02-10 7:57 [Tarantool-patches] [PATCH] json: fix silent change of global json settings Olga Arkhangelskaia
2020-02-10 13:08 ` Nikita Pettik
@ 2020-02-13 23:17 ` Alexander Turenko
2020-02-14 15:55 ` Olga Arkhangelskaia
1 sibling, 1 reply; 9+ messages in thread
From: Alexander Turenko @ 2020-02-13 23:17 UTC (permalink / raw)
To: Olga Arkhangelskaia; +Cc: tarantool-patches
> Issue:https://github.com/tarantool/tarantool/issues/4761
> Branch:https://github.com/tarantool/tarantool/tree/OKriw/gh-4761-json.decode-silently-changes-config-when-used-with-config-settings
Nice catch!
The patch itself is quite straighforward and obviously okay. I have
several comments around wording, formatting and the test case.
I'll paste the commit from the updated branch to comment it inline.
> commit ddc51e76469f86b29ba0c37c6e2911bbba569bdb
> Author: Olga Arkhangelskaia <arkholga@tarantool.org>
> Date: Wed Feb 5 14:05:25 2020 +0300
>
> json: fix silent change of global json settings
'global' is not precise term here: it can be arbitrary json
(de)serializer instance (created by json.new()), not necessarily the
default one. It is better to say 'instance options'. Something like
"don't spoil instance options with per-call ones".
>
> When json.decode is used with 2 arguments, 2nd argument seeps out to global
> json settings. Moreover, due to current serializer.cfg implementation it
Same for 'global'.
> remains invisible while checking settings by json.cfg. To prevent such
> behaviour we stop writing to global serializer struct and use local one,
> to get one-time action.
> As was mention before json.cfg can not be trusted in this case, so to check that
> everything remained unchanged we call decode twice with and without 2nd
> argument.
Didn't get the second paragraph. Is it about the test? We usually don't
describe tests in commit messages, but rather provide comments in the
code of a test when it is necessary.
Please, mention the commit where the degradation occurs (see [2] for
example).
>
> Closes #4761
Several nits:
* Fit a commit body within 72 symbols.
* It is better to split paragraphs with an empty line.
See [1] for formal rules and examples.
[1]: https://www.tarantool.io/en/doc/1.10/dev_guide/developer_guidelines/
[2]: https://github.com/tarantool/tarantool/commit/ccacba28f813fb99fd9eaf07fb41bf604dd341bc
>
> diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
> index fadfc74ec..a6b36ff3d 100755
> --- a/test/app-tap/json.test.lua
> +++ b/test/app-tap/json.test.lua
> @@ -22,7 +22,7 @@ end
>
> tap.test("json", function(test)
> local serializer = require('json')
> - test:plan(40)
> + test:plan(41)
>
> test:test("unsigned", common.test_unsigned, serializer)
> test:test("signed", common.test_signed, serializer)
> @@ -94,6 +94,11 @@ tap.test("json", function(test)
> 'error: too many nested data structures')
> test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth,
> 'global option remains unchanged')
> + --
> + -- gh-4761 json.decode silently changes global settings of json when called
> + -- with 2d parameter
> + --
Several nits:
* Separate the comment from the previous test case with an empty line.
* Use a colon after 'gh-xxxx' to unify it with other descriptions within
the file.
* Fit a comment within 66 symbols (see [2]; it is for C, but in fact we
apply the rule to Lua).
* Typo: 2d -> 2nd.
* Same as above for 'global'.
[2]: https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/
> + test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))
The test case should be as much independent from other as possible. Here
it uses the previous one, which calls <json instance>.decode() with
per-call options. Moreover, 'Tarantool Engineer Standard Operating
Procedures' document now obligates a developer to add a test case for a
bug fix within a separate file (see 'Writing tests' section) and the
reason is mostly to push developers to guarantee test cases
independence. Please, note that it also holds a certain naming policy.
You can use '{"foo": "bar"}' json string and {decode_max_depth = 1}
option to minimize the test case. It will be easier to read.
test.ok() has four parameters: test object (the colon adds it),
condition to check, message and extra data to show when the test case
fails. Let's provide the message, because of two reasons:
* It eases initial analyzing of a fail when it occurs and so generally
recommended.
* In case of fail the second return value of pcall() will be shown as
the message and it looks as unintended effect.
I think it would be good to provide a regression test of the same kind
for json.encode().
>
> --
> -- gh-3514: fix parsing integers with exponent in json
> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
> index 3d25814f3..5925e7e6f 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c
> @@ -1004,13 +1004,22 @@ static int json_decode(lua_State *l)
> luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
> "expected 1 or 2 arguments");
>
> + struct luaL_serializer *cfg = luaL_checkserializer(l);
> + struct luaL_serializer user_cfg;
> + /*
> + * user_cfg is per-call local version of global cfg: it is
> + * used if user passes custom options to :decode() method
> + * as a separate arguments. In this case it is required
> + * to avoid modifying global parameters. Life span of
> + * user_cfg is restricted by the scope of :decode() so it
> + * is enough to allocate it on the stack.
> + */
Nit: We usually provide a description before a field / function /
variable, not after.
> + json.cfg = cfg;
> if (lua_gettop(l) == 2) {
> - struct luaL_serializer *user_cfg = luaL_checkserializer(l);
> - luaL_serializer_parse_options(l, user_cfg);
> + user_cfg = *cfg;
Technically it also copies triggers. I would rather use
luaL_serializer_copy_options() and left a comment that triggers are left
uninitialized intentionally: the code should not run them. It is better
because of two reasons:
* trigger_run() would segfault if it will be called on user_cfg somehow
and it will explicitly shows that something is going in a wrong way
(likely on tests during development). It is better than change other
serializer options using a trigger.
* It is more obvious what is going here (I guess that the problem that
you fixed here appears purely because this assignment was considered
as the mistake and a pointer assignment should be here).
> + luaL_serializer_parse_options(l, &user_cfg);
> lua_pop(l, 1);
> - json.cfg = user_cfg;
> - } else {
> - json.cfg = luaL_checkserializer(l);
> + json.cfg = &user_cfg;
> }
>
> json.data = luaL_checklstring(l, 1, &json_len);
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings
2020-02-13 23:17 ` Alexander Turenko
@ 2020-02-14 15:55 ` Olga Arkhangelskaia
2020-02-16 0:06 ` Alexander Turenko
0 siblings, 1 reply; 9+ messages in thread
From: Olga Arkhangelskaia @ 2020-02-14 15:55 UTC (permalink / raw)
To: Alexander Turenko; +Cc: tarantool-patches
Hi! Thanks a lot for the review. I did not make separate version
because there is any major change. I have fixed everything you mentioned
and moved the test case to the separate file.
You will find new diff below.
14.02.2020 2:17, Alexander Turenko пишет:
>> Issue:https://github.com/tarantool/tarantool/issues/4761
>> Branch:https://github.com/tarantool/tarantool/tree/OKriw/gh-4761-json.decode-silently-changes-config-when-used-with-config-settings
> Nice catch!
>
> The patch itself is quite straighforward and obviously okay. I have
> several comments around wording, formatting and the test case.
>
> I'll paste the commit from the updated branch to comment it inline.
>
>> commit ddc51e76469f86b29ba0c37c6e2911bbba569bdb
>> Author: Olga Arkhangelskaia <arkholga@tarantool.org>
>> Date: Wed Feb 5 14:05:25 2020 +0300
>>
>> json: fix silent change of global json settings
> 'global' is not precise term here: it can be arbitrary json
> (de)serializer instance (created by json.new()), not necessarily the
> default one. It is better to say 'instance options'. Something like
> "don't spoil instance options with per-call ones".
>
>>
>> When json.decode is used with 2 arguments, 2nd argument seeps out to global
>> json settings. Moreover, due to current serializer.cfg implementation it
> Same for 'global'.
>
>> remains invisible while checking settings by json.cfg. To prevent such
>> behaviour we stop writing to global serializer struct and use local one,
>> to get one-time action.
>> As was mention before json.cfg can not be trusted in this case, so to check that
>> everything remained unchanged we call decode twice with and without 2nd
>> argument.
> Didn't get the second paragraph. Is it about the test? We usually don't
> describe tests in commit messages, but rather provide comments in the
> code of a test when it is necessary.
>
> Please, mention the commit where the degradation occurs (see [2] for
> example).
>
>>
>> Closes #4761
> Several nits:
>
> * Fit a commit body within 72 symbols.
> * It is better to split paragraphs with an empty line.
>
> See [1] for formal rules and examples.
>
> [1]: https://www.tarantool.io/en/doc/1.10/dev_guide/developer_guidelines/
> [2]: https://github.com/tarantool/tarantool/commit/ccacba28f813fb99fd9eaf07fb41bf604dd341bc
>
>> diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
>> index fadfc74ec..a6b36ff3d 100755
>> --- a/test/app-tap/json.test.lua
>> +++ b/test/app-tap/json.test.lua
>> @@ -22,7 +22,7 @@ end
>>
>> tap.test("json", function(test)
>> local serializer = require('json')
>> - test:plan(40)
>> + test:plan(41)
>>
>> test:test("unsigned", common.test_unsigned, serializer)
>> test:test("signed", common.test_signed, serializer)
>> @@ -94,6 +94,11 @@ tap.test("json", function(test)
>> 'error: too many nested data structures')
>> test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth,
>> 'global option remains unchanged')
>> + --
>> + -- gh-4761 json.decode silently changes global settings of json when called
>> + -- with 2d parameter
>> + --
> Several nits:
>
> * Separate the comment from the previous test case with an empty line.
> * Use a colon after 'gh-xxxx' to unify it with other descriptions within
> the file.
> * Fit a comment within 66 symbols (see [2]; it is for C, but in fact we
> apply the rule to Lua).
> * Typo: 2d -> 2nd.
> * Same as above for 'global'.
>
> [2]: https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/
>
>> + test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))
> The test case should be as much independent from other as possible. Here
> it uses the previous one, which calls <json instance>.decode() with
> per-call options. Moreover, 'Tarantool Engineer Standard Operating
> Procedures' document now obligates a developer to add a test case for a
> bug fix within a separate file (see 'Writing tests' section) and the
> reason is mostly to push developers to guarantee test cases
> independence. Please, note that it also holds a certain naming policy.
>
> You can use '{"foo": "bar"}' json string and {decode_max_depth = 1}
> option to minimize the test case. It will be easier to read.
>
> test.ok() has four parameters: test object (the colon adds it),
> condition to check, message and extra data to show when the test case
> fails. Let's provide the message, because of two reasons:
>
> * It eases initial analyzing of a fail when it occurs and so generally
> recommended.
> * In case of fail the second return value of pcall() will be shown as
> the message and it looks as unintended effect.
>
> I think it would be good to provide a regression test of the same kind
> for json.encode().
>
>>
>> --
>> -- gh-3514: fix parsing integers with exponent in json
>> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
>> index 3d25814f3..5925e7e6f 100644
>> --- a/third_party/lua-cjson/lua_cjson.c
>> +++ b/third_party/lua-cjson/lua_cjson.c
>> @@ -1004,13 +1004,22 @@ static int json_decode(lua_State *l)
>> luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
>> "expected 1 or 2 arguments");
>>
>> + struct luaL_serializer *cfg = luaL_checkserializer(l);
>> + struct luaL_serializer user_cfg;
>> + /*
>> + * user_cfg is per-call local version of global cfg: it is
>> + * used if user passes custom options to :decode() method
>> + * as a separate arguments. In this case it is required
>> + * to avoid modifying global parameters. Life span of
>> + * user_cfg is restricted by the scope of :decode() so it
>> + * is enough to allocate it on the stack.
>> + */
> Nit: We usually provide a description before a field / function /
> variable, not after.
>
>> + json.cfg = cfg;
>> if (lua_gettop(l) == 2) {
>> - struct luaL_serializer *user_cfg = luaL_checkserializer(l);
>> - luaL_serializer_parse_options(l, user_cfg);
>> + user_cfg = *cfg;
> Technically it also copies triggers. I would rather use
> luaL_serializer_copy_options() and left a comment that triggers are left
> uninitialized intentionally: the code should not run them. It is better
> because of two reasons:
>
> * trigger_run() would segfault if it will be called on user_cfg somehow
> and it will explicitly shows that something is going in a wrong way
> (likely on tests during development). It is better than change other
> serializer options using a trigger.
> * It is more obvious what is going here (I guess that the problem that
> you fixed here appears purely because this assignment was considered
> as the mistake and a pointer assignment should be here).
>
>> + luaL_serializer_parse_options(l, &user_cfg);
>> lua_pop(l, 1);
>> - json.cfg = user_cfg;
>> - } else {
>> - json.cfg = luaL_checkserializer(l);
>> + json.cfg = &user_cfg;
>> }
>>
>> json.data = luaL_checklstring(l, 1, &json_len);
-Subject: [PATCH] json: fix silent change of global json settings
+Subject: [PATCH] json: don't spoil instance options with per-call
-When json.decode is used with 2 arguments, 2nd argument seeps out to
global
-json settings. Morover, due to current serialier.cfg implementation it
-remains invisible while checking settings by json.cfg. To prevent sucj
-behaviour we stop writing to global serializer struct and use local one,
-to get one-time action.
-As was mention before json.cfg can not be trusted in this case, so to
check that
-everything remained unchanged we call decode twice with and without 2nd
-argument.
+When json.decode is used with 2 arguments, 2nd argument seeps out to
the json
+configuration of the instance. Moreover, due to current serializer.cfg
+implementation it remains invisible while checking settings by json.cfg.
+
+This fixes commit 6508ddb ("json: fix stack-use-after-scope in
json_decode()").
-diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
-index fadfc74ec..a6b36ff3d 100755
---- a/test/app-tap/json.test.lua
-+++ b/test/app-tap/json.test.lua
-@@ -22,7 +22,7 @@ end
-
- tap.test("json", function(test)
- local serializer = require('json')
-- test:plan(40)
-+ test:plan(41)
-
- test:test("unsigned", common.test_unsigned, serializer)
- test:test("signed", common.test_signed, serializer)
-@@ -94,6 +94,11 @@ tap.test("json", function(test)
- 'error: too many nested data structures')
- test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth,
- 'global option remains unchanged')
++++ b/test/app-tap/gh-4761-fix-json-decode.test.lua
+@@ -0,0 +1,28 @@
++#!/usr/bin/env tarantool
++
++local tap = require('tap')
++tap.test("json", function(test)
++ local serializer = require('json')
++ test:plan(4)
++
+ --
-+ -- gh-4761 json.decode silently changes global settings of json
when called
-+ -- with 2d parameter
++ -- gh-4761: json.decode silently changes instances settings when
called
++ -- with 2nd parameter.
+ --
-+ test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))
-
- --
- -- gh-3514: fix parsing integers with exponent in json
++ test:ok(not pcall(serializer.decode, '{"foo":{"bar": 1}}',
++ {decode_max_depth = 1}),
++ 'error: too many nested data structures')
++ test:ok(pcall(serializer.decode, '{"foo":{"bar": 1}}'),
++ 'json instance settings are unchanged')
++
++ --
++ -- Same check for json.encode.
++ --
++ local nan = 1/0
++ test:ok(not pcall(serializer.encode, {a = 1/0},
++ {encode_invalid_numbers = false}),
++ 'expected error with NaN encoding with .encode')
++ test:ok(pcall(serializer.encode, {a=nan}),
++ "json instance settings are unchanged")
++
++end)
diff --git a/third_party/lua-cjson/lua_cjson.c
b/third_party/lua-cjson/lua_cjson.c
-index 3d25814f3..f855cbd80 100644
+index 3d25814f3..8ff2ca89a 100644
--- a/third_party/lua-cjson/lua_cjson.c
+++ b/third_party/lua-cjson/lua_cjson.c
-@@ -1004,13 +1004,13 @@ static int json_decode(lua_State *l)
+@@ -1004,13 +1004,26 @@ static int json_decode(lua_State *l)
luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
"expected 1 or 2 arguments");
+ struct luaL_serializer *cfg = luaL_checkserializer(l);
-+ struct luaL_serializer user_cfg = *cfg;
++
++ /* user_cfg is per-call local version of global cfg: it is
++ * used if user passes custom options to :decode() method
++ * as a separate arguments. In this case it is required
++ * to avoid modifying global parameters. Life span of
++ * user_cfg is restricted by the scope of :decode() so it
++ * is enough to allocate it on the stack. */
++ struct luaL_serializer user_cfg;
+ json.cfg = cfg;
if (lua_gettop(l) == 2) {
- struct luaL_serializer *user_cfg = luaL_checkserializer(l);
- luaL_serializer_parse_options(l, user_cfg);
-+ luaL_serializer_parse_options(l, &user_cfg);
- lua_pop(l, 1);
+- lua_pop(l, 1);
- json.cfg = user_cfg;
- } else {
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings
2020-02-14 15:55 ` Olga Arkhangelskaia
@ 2020-02-16 0:06 ` Alexander Turenko
0 siblings, 0 replies; 9+ messages in thread
From: Alexander Turenko @ 2020-02-16 0:06 UTC (permalink / raw)
To: Olga Arkhangelskaia; +Cc: tarantool-patches
Pushed to master, 2.3, 2.2 and 1.10 with minor changes. CCed Kirill.
The changes I made are described below and I hope they are NOT about my
personal feeling how code should be organized and commented, but how the
whole team operates on it. I explained a reason that is under each
change (where it is not very obvious), so you may verify future patches
versus those points youself.
Some changes, however, may look as my personal taste of a good code
shape. Excuse me if it is so, I tried to refrain myself from this kind
of modifications.
WBR, Alexander Turenko.
----
Pasted the new patch to comment it, because the diff of diffs you pasted
does not have all recent changes (at least usage of
luaL_serializer_copy_options()).
> commit cc7e3fc49cc71ac0b4616c6f073a40fdc7ed3979
> Author: Olga Arkhangelskaia <arkholga@tarantool.org>
> Date: Wed Feb 5 14:05:25 2020 +0300
>
> json: don't spoil instance options with per-call
I see, you removed 'ones' from the end to fit 50 symbols limit. However
'per-call' is adjective and it can not be used w/o a noun (or noun
group). So I rephrased it a bit:
| json: don't spoil instance with per-call options
>
> When json.decode is used with 2 arguments, 2nd argument seeps out to the json
> configuration of the instance. Moreover, due to current serializer.cfg
> implementation it remains invisible while checking settings by json.cfg.
Reformatted to fit 72 symbols. The limit comes from the tradition to
format any text in the way that will look good on 80x25 screen: 4
symbols for the indent `git log` adds before the text, 72 for the text
itself and 4 symbols to make the text visually centered on the screen.
Changed 'by json.cfg' (at end of the sentence) to 'using json.cfg
table', it looks more strict.
>
> This fixes commit 6508ddb ("json: fix stack-use-after-scope in json_decode()").
Use full 40 hex digits commit hash to follow our usual style (GitHub
renders it to shorter hash, but git log shows as is).
>
> Closes #4761
>
> diff --git a/test/app-tap/gh-4761-fix-json-decode.test.lua b/test/app-tap/gh-4761-fix-json-decode.test.lua
> new file mode 100755
> index 000000000..d10849317
> --- /dev/null
> +++ b/test/app-tap/gh-4761-fix-json-decode.test.lua
Changed the test name to 'gh-4761-json-per-call-options.test.lua' to
follow the style of other 'gh-*' regression tests: it is either noun
group describing a component where the problem was or a noun group
explaining the problem itself. Not a sentence in the imperative mood.
> @@ -0,0 +1,28 @@
> +#!/usr/bin/env tarantool
> +
> +local tap = require('tap')
> +tap.test("json", function(test)
Changed double quites to single quotes to unify the style.
Changed the test name to 'gh-4761-json-per-call-options',
because 'json' looks too common here.
> + local serializer = require('json')
Moved the 'require' statement to the top level and changed the variable
name to 'json': it is more usual for our Lua code.
> + test:plan(4)
> +
> + --
> + -- gh-4761: json.decode silently changes instances settings when called
> + -- with 2nd parameter.
Cite from the previous review notes:
| * Fit a comment within 66 symbols (see [2]; it is for C, but in fact we
| apply the rule to Lua).
|
| [2]: https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/
Reformatted the comment to follow this rule. Fixed typo: 'instances' ->
'instance' (plural is not needed here).
Moved the comment above: since we test both :decode() and :encode() and
the comment describes the situation that is theoretically possible for
both methods, it looks more applicable for the entire test. Added
'verify json.encode as well' sentence.
> + --
> + test:ok(not pcall(serializer.decode, '{"foo":{"bar": 1}}',
> + {decode_max_depth = 1}),
> + 'error: too many nested data structures')
I would use test:ok() and so for the test case itself, but assert() for
test preparation code.
I would also split the call to :decode() and result verification code
for readability matters.
So, rewrote in this way:
| -- Preparation code: call :decode() with a custom option.
| local ok, err = pcall(json.decode, '{"foo": {"bar": 1}}',
| {decode_max_depth = 1})
| assert(not ok, 'expect "too many nested data structures" error')
> + test:ok(pcall(serializer.decode, '{"foo":{"bar": 1}}'),
> + 'json instance settings are unchanged')
Splitted :decode() call from test:ok() as above. Also verify the result
of decoding:
| -- Verify that the instance option remains unchanged.
| local exp_res = {foo = {bar = 1}}
| local ok, res = pcall(json.decode, '{"foo": {"bar": 1}}')
| test:is_deeply({ok, res}, {true, exp_res},
| 'json instance settings remain unchanged after :decode()')
Added :decode() to the test case name to differentiate it from the one
below (re :encode()) in the TAP13 output.
> +
> + --
> + -- Same check for json.encode.
> + --
> + local nan = 1/0
> + test:ok(not pcall(serializer.encode, {a = 1/0},
Changed '1/0' to 'nan'.
> + {encode_invalid_numbers = false}),
> + 'expected error with NaN encoding with .encode')
> + test:ok(pcall(serializer.encode, {a=nan}),
> + "json instance settings are unchanged")
Made the changes in the spirit of ones described above.
> +
> +end)
It is good to exit from a test with a non-zero exit code in case of a
fail (because our test harness may miss some kinds of fails listed in
TAP13 output), so I have added the check of tap.test() result:
| local res = tap.test(<...>, function(test)
| <...test cases...>
| end)
|
| os.exit(res and 0 or 1)
It is mentioned in [1] for the following form:
| local test = tap.test(<...>)
| <...test cases...>
| os.exit(test:check() and 0 or 1)
test:check() returns exactly same code as tap.test(<...>) when the
latter is called with a function.
[1]: https://github.com/tarantool/doc/issues/1004
> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
> index 3d25814f3..8ff2ca89a 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c
> @@ -1004,13 +1004,26 @@ static int json_decode(lua_State *l)
> luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
> "expected 1 or 2 arguments");
>
> + struct luaL_serializer *cfg = luaL_checkserializer(l);
> +
> + /* user_cfg is per-call local version of global cfg: it is
Changed 'global cfg' to 'serializer instance options'.
> + * used if user passes custom options to :decode() method
Changed: user -> a user.
> + * as a separate arguments. In this case it is required
Changed 'arguments' to the singular form.
> + * to avoid modifying global parameters. Life span of
Changed 'global parameters' to 'options of the instance'.
> + * user_cfg is restricted by the scope of :decode() so it
> + * is enough to allocate it on the stack. */
> + struct luaL_serializer user_cfg;
We use two styles of comments (for one-line and multi-line comments):
| /* Hello. */
| /*
| * Hello,
| * world.
| */
Changed the comment to follow the latter style.
If you're curious, Linus about comments style (strong language, you have
been warned): https://lkml.org/lkml/2016/7/8/625
> + json.cfg = cfg;
> if (lua_gettop(l) == 2) {
> - struct luaL_serializer *user_cfg = luaL_checkserializer(l);
> - luaL_serializer_parse_options(l, user_cfg);
> - lua_pop(l, 1);
> - json.cfg = user_cfg;
> - } else {
> - json.cfg = luaL_checkserializer(l);
> +
> + /* Only copy cfg options and left update triggers, see
> + * struct luaL_serializer, uninitialized. The code should
> + * never run them, since we must not change the json config of the
> + * instance. */
> + luaL_serializer_copy_options(&user_cfg, cfg);
> + luaL_serializer_parse_options(l, &user_cfg);
> + lua_pop(l, 1);
> + json.cfg = &user_cfg;
> }
Tab / space mix. The code is inherited from an other source, so we
should follow the surrounding style and use spaces for indentation.
Fixed.
I rewrote the comment:
| /*
| * on_update triggers are left uninitialized for user_cfg.
| * The decoding code don't (and shouldn't) run them.
| */
The reason is to make the comment technically strict:
* The code not only copy options from a serializer instance, but also
parse and apply user-provided options. I removed this clause to don't
describe this (it is quite obvious from the code now).
* Only changes from Lua run on_update triggers, because C code don't
change options expect when a serializer is created or when the change
is result of on_update() trigger run. So even if the decoding code
would change an option the triggers will not be run. So I only said
that triggers don't and shouldn't be run rather than that the decoding
code don't change the options.
Hope you don't mind.
>
> json.data = luaL_checklstring(l, 1, &json_len);
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Tarantool-patches] [PATCH] json: fix silent change of global json settings
2020-02-05 12:28 Olga Arkhangelskaia
@ 2020-02-10 7:57 ` Olga Arkhangelskaia
0 siblings, 0 replies; 9+ messages in thread
From: Olga Arkhangelskaia @ 2020-02-10 7:57 UTC (permalink / raw)
To: tarantool-patches
Forget to add issue number and branch. Resend.
05.02.2020 15:28, Olga Arkhangelskaia пишет:
> When json.decode is used with 2 arguments, 2nd argument seeps out to global
> json settings. Morover, due to current serialier.cfg implementation it
> remains invisible while checking settings by json.cfg. To prevent sucj
> behaviour we stop writing to global serializer struct and use local one,
> to get one-time action.
> As was mention before json.cfg can not be trusted in this case, so to check that
> everything remained unchanged we call decode twice with and without 2nd
> argument.
> ---
> test/app-tap/json.test.lua | 7 ++++++-
> third_party/lua-cjson/lua_cjson.c | 10 +++++-----
> 2 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
> index fadfc74ec..a6b36ff3d 100755
> --- a/test/app-tap/json.test.lua
> +++ b/test/app-tap/json.test.lua
> @@ -22,7 +22,7 @@ end
>
> tap.test("json", function(test)
> local serializer = require('json')
> - test:plan(40)
> + test:plan(41)
>
> test:test("unsigned", common.test_unsigned, serializer)
> test:test("signed", common.test_signed, serializer)
> @@ -94,6 +94,11 @@ tap.test("json", function(test)
> 'error: too many nested data structures')
> test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth,
> 'global option remains unchanged')
> + --
> + -- gh-4761 json.decode silently changes global settings of json when called
> + -- with 2d parameter
> + --
> + test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))
>
> --
> -- gh-3514: fix parsing integers with exponent in json
> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
> index 3d25814f3..f855cbd80 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c
> @@ -1004,13 +1004,13 @@ static int json_decode(lua_State *l)
> luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
> "expected 1 or 2 arguments");
>
> + struct luaL_serializer *cfg = luaL_checkserializer(l);
> + struct luaL_serializer user_cfg = *cfg;
> + json.cfg = cfg;
> if (lua_gettop(l) == 2) {
> - struct luaL_serializer *user_cfg = luaL_checkserializer(l);
> - luaL_serializer_parse_options(l, user_cfg);
> + luaL_serializer_parse_options(l, &user_cfg);
> lua_pop(l, 1);
> - json.cfg = user_cfg;
> - } else {
> - json.cfg = luaL_checkserializer(l);
> + json.cfg = &user_cfg;
> }
>
> json.data = luaL_checklstring(l, 1, &json_len);
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Tarantool-patches] [PATCH] json: fix silent change of global json settings
@ 2020-02-05 12:28 Olga Arkhangelskaia
2020-02-10 7:57 ` Olga Arkhangelskaia
0 siblings, 1 reply; 9+ messages in thread
From: Olga Arkhangelskaia @ 2020-02-05 12:28 UTC (permalink / raw)
To: tarantool-patches
When json.decode is used with 2 arguments, 2nd argument seeps out to global
json settings. Morover, due to current serialier.cfg implementation it
remains invisible while checking settings by json.cfg. To prevent sucj
behaviour we stop writing to global serializer struct and use local one,
to get one-time action.
As was mention before json.cfg can not be trusted in this case, so to check that
everything remained unchanged we call decode twice with and without 2nd
argument.
---
test/app-tap/json.test.lua | 7 ++++++-
third_party/lua-cjson/lua_cjson.c | 10 +++++-----
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
index fadfc74ec..a6b36ff3d 100755
--- a/test/app-tap/json.test.lua
+++ b/test/app-tap/json.test.lua
@@ -22,7 +22,7 @@ end
tap.test("json", function(test)
local serializer = require('json')
- test:plan(40)
+ test:plan(41)
test:test("unsigned", common.test_unsigned, serializer)
test:test("signed", common.test_signed, serializer)
@@ -94,6 +94,11 @@ tap.test("json", function(test)
'error: too many nested data structures')
test:is(serializer.cfg.decode_max_depth, orig_decode_max_depth,
'global option remains unchanged')
+ --
+ -- gh-4761 json.decode silently changes global settings of json when called
+ -- with 2d parameter
+ --
+ test:ok(pcall(serializer.decode,'{"1":{"b":{"c":1,"d":null}},"a":1}'))
--
-- gh-3514: fix parsing integers with exponent in json
diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
index 3d25814f3..f855cbd80 100644
--- a/third_party/lua-cjson/lua_cjson.c
+++ b/third_party/lua-cjson/lua_cjson.c
@@ -1004,13 +1004,13 @@ static int json_decode(lua_State *l)
luaL_argcheck(l, lua_gettop(l) == 2 || lua_gettop(l) == 1, 1,
"expected 1 or 2 arguments");
+ struct luaL_serializer *cfg = luaL_checkserializer(l);
+ struct luaL_serializer user_cfg = *cfg;
+ json.cfg = cfg;
if (lua_gettop(l) == 2) {
- struct luaL_serializer *user_cfg = luaL_checkserializer(l);
- luaL_serializer_parse_options(l, user_cfg);
+ luaL_serializer_parse_options(l, &user_cfg);
lua_pop(l, 1);
- json.cfg = user_cfg;
- } else {
- json.cfg = luaL_checkserializer(l);
+ json.cfg = &user_cfg;
}
json.data = luaL_checklstring(l, 1, &json_len);
--
2.20.1 (Apple Git-117)
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2020-02-16 0:06 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-10 7:57 [Tarantool-patches] [PATCH] json: fix silent change of global json settings Olga Arkhangelskaia
2020-02-10 13:08 ` Nikita Pettik
2020-02-11 8:46 ` Olga Arkhangelskaia
2020-02-11 12:36 ` Nikita Pettik
2020-02-13 23:17 ` Alexander Turenko
2020-02-14 15:55 ` Olga Arkhangelskaia
2020-02-16 0:06 ` Alexander Turenko
-- strict thread matches above, loose matches on Subject: below --
2020-02-05 12:28 Olga Arkhangelskaia
2020-02-10 7:57 ` Olga Arkhangelskaia
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox