[Tarantool-patches] [PATCH luajit 2/2][v2] Followup fix for embedded bytecode loader.
Sergey Kaplun
skaplun at tarantool.org
Tue Sep 5 15:55:53 MSK 2023
Hi, Sergey!
Thanks for the patch!
Please, consider my comments below.
On 31.08.23, Sergey Bronnikov wrote:
> From: Sergey Bronnikov <sergeyb at tarantool.org>
>
> (cherry-picked from commit e49863eda13d095b1a78fd4ca0fd3a6a9a17d782)
>
> The patch follows up a previous patch and limits the total size of a
> chunk load by `lua_load` with size `LJ_MAX_BUF - 1`.
>
> Sergey Bronnikov:
> * added the description and the test
> ---
> src/lj_lex.c | 1 +
> test/tarantool-c-tests/lj-549-lua_load.test.c | 134 ++++++++++++++++++
I suggest renaming the test to lj-549-lua-load.test.c to be consistent with
other tests.
> 2 files changed, 135 insertions(+)
> create mode 100644 test/tarantool-c-tests/lj-549-lua_load.test.c
>
> diff --git a/src/lj_lex.c b/src/lj_lex.c
> index 6291705f..13495c41 100644
> --- a/src/lj_lex.c
> +++ b/src/lj_lex.c
> @@ -51,6 +51,7 @@ static LJ_NOINLINE LexChar lex_more(LexState *ls)
> if (sz >= LJ_MAX_BUF) {
> if (sz != ~(size_t)0) lj_err_mem(ls->L);
> sz = ~(uintptr_t)0 - (uintptr_t)p;
> + if (sz >= LJ_MAX_BUF) sz = LJ_MAX_BUF-1;
> ls->endmark = 1;
> }
> ls->pe = p + sz;
> diff --git a/test/tarantool-c-tests/lj-549-lua_load.test.c b/test/tarantool-c-tests/lj-549-lua_load.test.c
> new file mode 100644
> index 00000000..9baa7a1a
> --- /dev/null
> +++ b/test/tarantool-c-tests/lj-549-lua_load.test.c
> @@ -0,0 +1,134 @@
> +#include <assert.h>
This include is redundant.
> +#include <stdint.h>
Ditto.
> +#include <stddef.h>
> +#include <string.h>
Ditto.
> +#include <stdlib.h>
> +#include <stdio.h>
Ditto.
> +
> +#include <lua.h>
> +#include <lualib.h>
This include is redundant since all libs are opened via utils.
> +#include <lauxlib.h>
This include is redundant since no `luaL*` functions or structures are
used (and neither are `LUA_ERRFILE`, `LUA_NOREF`, or `LUA_REFNIL`).
> +
> +#include "test.h"
> +#include "utils.h"
> +
> +/* Need for skipcond. */
> +#include "lj_arch.h"
There are no skip conditions, so this include may be dropped.
> +
> +/* Defined in lj_def.h. */
> +#define LJ_MAX_MEM32 0x7fffff00 /* Max. 32 bit memory allocation. */
> +#define LJ_MAX_BUF LJ_MAX_MEM32 /* Max. buffer length. */
Why not use `#include "lj_def.h"` instead and mention what we need
from it?
Reminder: these are a kind of unit test (or these C tests may implement
unit tests). So, we can include internal headers, and this is OK for
__our C tests__.
> +
> +/* Defined in lua.h. */
> +/* mark for precompiled code (`<esc>Lua') */
> +#define LUA_SIGNATURE "\033Lua"
We already included <lua.h>, so this define isn't required.
> +
> +#define UNUSED(x) ((void)(x))
> +
> +/**
There is no need for a double '*' outside functions (we're not in Kansas
anymore. :))
I suggest being consistent with the codebase of the other tests and
using just `/*`.
> + * Function generates a huge chunk of "bytecode" with a size bigger than
> + * LJ_MAX_BUF. Generated chunk must enable endmark in a Lex state.
Nit: Comment line width is greater than 66 symbols.
Typo: s/Generated/The generated/
(I'll proceed with the branch version below.)
| static const char *
| bc_reader_with_endmark(lua_State *L, void *data, size_t *size)
A comment about the resulting chunk is desirable:
According to the Lua 5.1 Reference Manual [1]:
| To signal the end of the chunk, the reader must return `NULL` or set
| `size` to zero.
So, since this function returns `NULL`, the resulting chunk is
treated as "", which produces the following bytecode:
| "endmark":0-1
| 0000 FUNCV rbase: 1
| 0001 RET0 rbase: 0 lit: 1
This also avoids the test's failure before the patch: we just return
earlier:
| <src/lj_lex.c:50>
| if (p == NULL || sz == 0) return LEX_EOF;
So, it looks like the test doesn't check the patch itself.
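To make the point above concrete, here is a minimal standalone sketch that mirrors the control flow of the quoted `lex_more` hunk. It does not call LuaJIT; `sim_lex_more`, `struct more_state`, and the `MORE_*` codes are hypothetical names invented for this illustration, and `LJ_MAX_BUF` is copied from the quoted test rather than taken from `lj_def.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value copied from the quoted test; LuaJIT defines it in lj_def.h. */
#define LJ_MAX_BUF 0x7fffff00

/* Hypothetical outcome codes, used by this sketch only. */
enum more_result { MORE_EOF, MORE_ERRMEM, MORE_OK };

/* Hypothetical stand-in for the LexState fields touched here. */
struct more_state {
	size_t len;  /* Buffer length after the clamp. */
	int endmark; /* Set when the size was the ~(size_t)0 marker. */
};

/*
 * Follows the branches of the quoted `lex_more` hunk for a reader
 * result (p, sz). This is a simulation, not LuaJIT code.
 */
static enum more_result
sim_lex_more(struct more_state *st, const char *p, size_t sz)
{
	assert(st != NULL);
	st->len = 0;
	st->endmark = 0;
	/* Early return: the patched lines are never reached. */
	if (p == NULL || sz == 0)
		return MORE_EOF;
	if (sz >= LJ_MAX_BUF) {
		if (sz != ~(size_t)0)
			return MORE_ERRMEM;
		sz = ~(uintptr_t)0 - (uintptr_t)p;
		/* The line added by the patch. */
		if (sz >= LJ_MAX_BUF)
			sz = LJ_MAX_BUF - 1;
		st->endmark = 1;
	}
	st->len = sz;
	return MORE_OK;
}
```

In this model, a reader that returns `NULL` with `*size = ~(size_t)0` ends in the early return, while a non-NULL pointer with the same size reaches the clamp and sets the endmark. Whether a real reader of that shape is the right fix for the test is a separate question.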
> + {
> + UNUSED(data);
> + *size = ~(size_t)0;
> +
> + return NULL;
> + }
> +
> + static int bc_loader_with_endmark(void *test_state)
> + {
> + lua_State *L = test_state;
> + void *ud = NULL;
> + int res = lua_load(L, bc_reader_with_endmark, ud, "endmark");
> +
> + /*
> + * Make sure we passed the condition with lj_err_mem in the function
Nit: Comment line width is greater than 66 symbols.
> + * `lex_more`.
> + */
> + assert_true(res != LUA_ERRMEM);
Maybe it's better to use the condition `res == LUA_OK` here?
> + lua_settop(L, 0);
> +
> + return TEST_EXIT_SUCCESS;
> + }
> +
> + enum bc_emission_state {
> + EMIT_BC,
> + EMIT_EOF,
> + };
> +
> + typedef struct {
> + enum bc_emission_state state;
> + } dt;
> +
> + /**
Typo: s</**></*>
> + * Function returns a bytecode chunk on the first call and NULL
> + * and size equal to zero on the second call. Triggers the flag
> + * `END_OF_STREAM` in the function `lex_more`.
> + */
> + static const char *
> + bc_reader_with_eof(lua_State *L, void *data, size_t *size)
> + {
> + UNUSED(L);
> + dt *test_data = (dt *)data;
> + if (test_data->state == EMIT_EOF) {
> + *size = 0;
> + return NULL;
> + }
> +
> + static char *bc_chunk = NULL;
> + free(bc_chunk);
This free is called only once, when bc_chunk is already NULL.
I suggest moving the initialization of `bc_chunk` to the beginning
of the scope and calling `free()` only for the `EMIT_EOF` state (it's
also a little bit more readable -- a reader shouldn't have to remember
that `free(NULL)` is OK).
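A rough sketch of the layout I mean, written so that it compiles without the Lua headers: the `lua_State` typedef is only a stand-in for the one from <lua.h>, `'\033'` stands in for `LUA_SIGNATURE[0]`, and zeroing the second byte is my addition (the patch leaves it uninitialized). Take it as one possible shape, not the required one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the opaque type from <lua.h>, for this sketch only. */
typedef struct lua_State lua_State;

enum bc_emission_state { EMIT_BC, EMIT_EOF };

typedef struct {
	enum bc_emission_state state;
} dt;

static const char *
bc_reader_with_eof(lua_State *L, void *data, size_t *size)
{
	static char *bc_chunk = NULL;
	dt *test_data = (dt *)data;
	(void)L;
	assert(test_data != NULL && size != NULL);

	if (test_data->state == EMIT_EOF) {
		/* The chunk was handed out on the previous call. */
		free(bc_chunk);
		bc_chunk = NULL;
		*size = 0;
		return NULL;
	}

	/* The signature (1 byte) and the bytecode itself (1 byte). */
	*size = 2;
	bc_chunk = malloc(*size);
	assert(bc_chunk != NULL);
	bc_chunk[0] = '\033'; /* Same value as LUA_SIGNATURE[0]. */
	bc_chunk[1] = 0;      /* Assumption: any byte will do here. */
	test_data->state = EMIT_EOF;

	return bc_chunk;
}
```

Freeing on the second call is fine with respect to the reader contract: per the manual [1], the block only has to exist until the reader function is called again.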
> +
> + /**
Typo: s</**></*>
> + * Minimal size of a buffer with bytecode:
> + * signiture (1 byte) and a bytecode itself (1 byte).
Typo: s/a bytecode/the bytecode/
Typo: s/signiture/The signature/
> + */
> + size_t sz = 2;
> + bc_chunk = malloc(sz);
> + /**
Typo: s</**></*>
> + * `lua_load` automatically detects whether the chunk is text or binary,
Typo: s/binary,/binary/
> + * and loads it accordingly. We need a trace for *bytecode* input,
> + * so it is necessary to deceive a check in `lj_lex_setup`, that
> + * makes a sanity check and detects whether input is bytecode or text
> + * by the first char. Put `LUA_SIGNATURE[0]` at the beginning of the
> + * allocated region.
Nit: Comment line width is greater than 66 symbols.
> + */
> + bc_chunk[0] = LUA_SIGNATURE[0];
> + *size = sz;
> + test_data->state = EMIT_EOF;
> +
> + return bc_chunk;
> + }
> +
> + static int bc_loader_with_eof(void *test_state)
> + {
> + lua_State *L = test_state;
> + dt test_data = {0};
> + test_data.state = EMIT_BC;
> + int res = lua_load(L, bc_reader_with_eof, &test_data, "eof");
> + assert_true(res = LUA_ERRSYNTAX);
Typo: s/=/==/
But res is indeed `LUA_ERRSYNTAX` for now :).
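To show why the typo matters even though the test passes today, here is a tiny standalone sketch. The `check_*` and `show_difference` helpers are hypothetical names for this illustration; the two status values are the ones Lua 5.1's lua.h defines.

```c
#include <assert.h>

/* Values as defined in Lua 5.1's lua.h. */
#define LUA_ERRSYNTAX 3
#define LUA_ERRMEM 4

/* What the quoted line does: assign, then test the assigned value. */
static int check_by_assignment(int res)
{
	return ((res = LUA_ERRSYNTAX)) != 0;
}

/* What was intended: compare the status with the expected one. */
static int check_by_comparison(int res)
{
	return res == LUA_ERRSYNTAX;
}

static void show_difference(void)
{
	/* The assignment form accepts any status at all. */
	assert(check_by_assignment(LUA_ERRMEM));
	/* The comparison form rejects an unexpected status. */
	assert(!check_by_comparison(LUA_ERRMEM));
	assert(check_by_comparison(LUA_ERRSYNTAX));
}
```

So with `=` the assertion can never fail, whatever `lua_load` returns.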
> + lua_settop(L, 0);
> +
> + return TEST_EXIT_SUCCESS;
> + }
> +
> + int main(void)
> + {
> + lua_State *L = utils_lua_init();
> + const struct test_unit tgroup[] = {
> + test_unit_def(bc_loader_with_endmark),
> + test_unit_def(bc_loader_with_eof)
> + };
> +
> + const int test_result = test_run_group(tgroup, L);
> + utils_lua_close(L);
> + return test_result;
> + }
[1]: https://www.lua.org/manual/5.1/manual.html#lua_Reader
--
Best regards,
Sergey Kaplun