[tarantool-patches] Re: [PATCH] json: improve error messages
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Thu Oct 17 00:45:51 MSK 2019
Hi! Thanks for the patch!
Sorry, but I am not sure that now the messages are
better. Just look at it:
tarantool> json.decode('{')
---
- error: Expected object key string but found '\0' on line 1 at character 2 here ^^^...
...
What is `^^^`, what is `...`? In the ticket it is said, that a user should have a
context, but there is no a context.
I was thinking, that the author of 4339 rather meant that ^^^ should be below the
error in an error string, like this:
tarantool> json.decode('{')
---
- error: Expected object key string but found '\0' on line 1 at character 2 here:
"{ "
^^^
...
But I do realize, that it is too complex. And still ^^^ looks confusing.
It looks like arrows pointing up, while you are trying to put a
context after them.
At least, you could make a context like this:
tarantool> json.decode('{')
---
- error: Expected object key string but found '\0' on line 1 at character 2 here
"{>>\0<<"
...
So you would have arrows pointing right at the bad position.
I tried an error not related to eof, and got the same confusing
result:
tarantool> json.decode('{1: "world"}')
---
- error: 'Expected object key string but found int on line 1 at character 2 here ^^^:
"world"}...'
...
>From that message looks like the error is in ':', or in "world", or
somewhere above because of up arrows.
See 3 comments below.
> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
> index 3d7edbf28..aab662585 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c
> @@ -72,23 +72,23 @@ typedef enum {
> } json_token_type_t;
>
> static const char *json_token_type_name[] = {
> - "T_OBJ_BEGIN",
> - "T_OBJ_END",
> - "T_ARR_BEGIN",
> - "T_ARR_END",
> - "T_STRING",
> - "T_UINT",
> - "T_INT",
> - "T_NUMBER",
> - "T_BOOLEAN",
> - "T_NULL",
> - "T_COLON",
> - "T_COMMA",
> - "T_END",
> - "T_WHITESPACE",
> - "T_LINEFEED",
> - "T_ERROR",
> - "T_UNKNOWN",
> + "'{'",
> + "'}'",
> + "'['",
> + "']'",
> + "string",
> + "unsigned int",
> + "int",
> + "number",
> + "boolean",
> + "null",
> + "':'",
> + "','",
> + "'\\0'",
1. I don't think this is a good idea. Probably,
just 'end' would be better. Users don't write
the terminating zero explicitly, and its presence
in an error message is confusing.
> + "whitespace",
> + "'\\n'",
2. Please, no special characters: no \0, no \n, no \t
or anything else. Here 'new line' would look easier
to understand.
> + "error",
> + "unknown symbol",
> NULL
> };
>
> @@ -825,6 +825,21 @@ static void json_next_token(json_parse_t *json, json_token_t *token)
> json_set_token_error(token, json, "invalid token");
> }
>
> +/*
> + * Copy 10 or less (if string @a ptr is shorter than 10)
> + * characters to static string buffer @a aux_str.
> + *
> + * @param aux_str String static buffer to fill.
> + * @param ptr String with the context.
> + */
> +static void fill_context(char *aux_str, const char *ptr)
3. So as a context you understand 10 symbols *after* the error?
I don't think that will help with error understanding, because
a user won't even see the error position. I think, you need to
take some symbols before and some after the bad position.
Also, please, make 10 a enum constant.
> +{
> + int i = 0;
> + for(; ptr[i] != '\0' && i < 10; i++)
> + aux_str[i] = ptr[i];
> + aux_str[i] = '\0';
> +}
> +
> /* This function does not return.
> * DO NOT CALL WITH DYNAMIC MEMORY ALLOCATED.
> * The only supported exception is the temporary parser string
More information about the Tarantool-patches
mailing list