[Tarantool-patches] [PATCH luajit 2/2][v2] Followup fix for embedded bytecode loader.

Sergey Bronnikov sergeyb at tarantool.org
Thu Sep 7 10:04:31 MSK 2023


Hi, Sergey

On 9/5/23 15:55, Sergey Kaplun wrote:
> Hi, Sergey!
> Thanks for the patch!
> Please, consider my comments below.
>
> On 31.08.23, Sergey Bronnikov wrote:
>> From: Sergey Bronnikov <sergeyb at tarantool.org>
>>
>> (cherry-picked from commit e49863eda13d095b1a78fd4ca0fd3a6a9a17d782)
>>
>> The patch follows up a previous patch and limits the total size of a
>> chunk load by `lua_load` with size `LJ_MAX_BUF - 1`.
>>
>> Sergey Bronnikov:
>> * added the description and the test
>> ---
>>   src/lj_lex.c                                  |   1 +
>>   test/tarantool-c-tests/lj-549-lua_load.test.c | 134 ++++++++++++++++++
> I suggest renaming the test to lj-549-lua-load.test.c to be consistent with
> other tests.
Renamed.
>
>>   2 files changed, 135 insertions(+)
>>   create mode 100644 test/tarantool-c-tests/lj-549-lua_load.test.c
>>
>> diff --git a/src/lj_lex.c b/src/lj_lex.c
>> index 6291705f..13495c41 100644
>> --- a/src/lj_lex.c
>> +++ b/src/lj_lex.c
>> @@ -51,6 +51,7 @@ static LJ_NOINLINE LexChar lex_more(LexState *ls)
>>     if (sz >= LJ_MAX_BUF) {
>>       if (sz != ~(size_t)0) lj_err_mem(ls->L);
>>       sz = ~(uintptr_t)0 - (uintptr_t)p;
>> +    if (sz >= LJ_MAX_BUF) sz = LJ_MAX_BUF-1;
>>       ls->endmark = 1;
>>     }
>>     ls->pe = p + sz;
>> diff --git a/test/tarantool-c-tests/lj-549-lua_load.test.c b/test/tarantool-c-tests/lj-549-lua_load.test.c
>> new file mode 100644
>> index 00000000..9baa7a1a
>> --- /dev/null
>> +++ b/test/tarantool-c-tests/lj-549-lua_load.test.c
>> @@ -0,0 +1,134 @@
>> +#include <assert.h>
> This include is excess.
Removed.
>
>> +#include <stdint.h>
> Ditto.

Removed.


>
>> +#include <stddef.h>
>> +#include <string.h>
> Ditto.

Removed.


>
>> +#include <stdlib.h>
>> +#include <stdio.h>
> Ditto.
Removed.
>
>> +
>> +#include <lua.h>
>> +#include <lualib.h>
> This include is excess since all libs are opened via utils.
Removed.
>
>> +#include <lauxlib.h>
> This include is excess since no `luaL*` functions or structures are
> used (nor are `LUA_ERRFILE`, `LUA_NOREF`, or `LUA_REFNIL`).
Removed.
>
>> +
>> +#include "test.h"
>> +#include "utils.h"
>> +
>> +/* Need for skipcond. */
>> +#include "lj_arch.h"
> There are no skip conditions, so this include may be dropped.
Removed.
>
>> +
>> +/* Defined in lj_def.h. */
>> +#define LJ_MAX_MEM32    0x7fffff00      /* Max. 32 bit memory allocation. */
>> +#define LJ_MAX_BUF      LJ_MAX_MEM32    /* Max. buffer length. */
> Why not use `#include "lj_def.h"` instead and mention what we need
> from it?
> Reminder: these are a kind of unit test (or these C tests may implement
> unit tests). So, we can include internal libraries, and this is OK for
> __our C tests__.
>
Fixed.
>> +
>> +/* Defined in lua.h. */
>> +/* mark for precompiled code (`<esc>Lua') */
>> +#define	LUA_SIGNATURE	"\033Lua"
> We already included <lua.h>, so this define isn't required.
Fixed.
>
>> +
>> +#define UNUSED(x) ((void)(x))
>> +
>> +/**
> There is no need for a double '*' outside functions (we're not in Kansas
> anymore. :))
> I suggest being consistent with the other tests' codebase and using just `/*`.
Fixed.
>
>> + * Function generates a huge chunk of "bytecode" with a size bigger than
>> + * LJ_MAX_BUF. Generated chunk must enable endmark in a Lex state.
> Nit: Comment line width is greater than 66 symbols.
> Typo: s/Generated/The generated/
Fixed.
>
> (I'll proceed with the branch version below.)
>
> | static const char *
> | bc_reader_with_endmark(lua_State *L, void *data, size_t *size)
>
> A comment about the resulting chunk is desirable.
> According to the Lua 5.1 Reference Manual [1]:
> | To signal the end of the chunk, the reader must return `NULL` or set
> | `size` to zero.
>
> So, since this function returns `NULL`, the resulting chunk should be
> treated as "". Which provides the following bytecode:
> | "endmark":0-1
> | 0000 FUNCV  rbase:   1
> | 0001 RET0   rbase:   0 lit:     1
>
> This also avoids the test's failure before the patch: we just return
> earlier:
>
> | <src/lj_lex.c:50>
> | if (p == NULL || sz == 0) return LEX_EOF;
>
> So, it looks like the test doesn't check the patch itself.

This is exactly the case covered by the test case; note that the names of
the test case and the reader function contain the postfix "_eof".

Yes, it doesn't check the patch itself, because it is not trivial to test
the endmark introduced in the patch.

I added a note about the test to the second commit too.
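
For reference, the size handling discussed here can be modelled outside
of LuaJIT. The following is only a minimal sketch, not the actual
`lex_more` code: `model_chunk_size`, `MODEL_MAX_BUF` and the `MODEL_*`
statuses are hypothetical names, and the constant is the `LJ_MAX_BUF`
value quoted in the diff. It illustrates why a reader that returns NULL
leaves before the line added by the patch is reached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as LJ_MAX_BUF quoted in the diff (assumed unchanged). */
#define MODEL_MAX_BUF ((size_t)0x7fffff00)

enum model_status { MODEL_OK, MODEL_EOF, MODEL_ERRMEM };

/*
 * Hypothetical model of the size handling in lex_more():
 * `p` is the address returned by the reader, `sz` is the size it
 * reported. Writes the size to use and whether endmark is set.
 */
static enum model_status
model_chunk_size(uintptr_t p, size_t sz, size_t *out_sz, int *endmark)
{
	assert(out_sz != NULL && endmark != NULL);
	*out_sz = 0;
	*endmark = 0;
	/* A NULL chunk or a zero size means end of stream. */
	if (p == 0 || sz == 0)
		return MODEL_EOF;
	if (sz >= MODEL_MAX_BUF) {
		/* Only the "unlimited" marker ~0 is accepted. */
		if (sz != ~(size_t)0)
			return MODEL_ERRMEM;
		/* Bytes left until the end of the address space. */
		sz = (size_t)(~(uintptr_t)0 - p);
		/* The clamp added by the patch. */
		if (sz >= MODEL_MAX_BUF)
			sz = MODEL_MAX_BUF - 1;
		*endmark = 1;
	}
	*out_sz = sz;
	return MODEL_OK;
}
```

In this model, passing `p == 0` with `sz == ~(size_t)0` yields
`MODEL_EOF`, while a non-NULL `p` with the same size yields
`MODEL_MAX_BUF - 1` and a set endmark.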

>
>> + {
>> + 	UNUSED(data);
>> + 	*size = ~(size_t)0;
>> +
>> + 	return NULL;
>> + }
>> +
>> + static int bc_loader_with_endmark(void *test_state)
>> + {
>> + 	lua_State *L = test_state;
>> + 	void *ud = NULL;
>> + 	int res = lua_load(L, bc_reader_with_endmark, ud, "endmark");
>> +
>> + 	/*
>> + 	 * Make sure we passed the condition with lj_err_mem in the function
> Nit: Comment line width is greater than 66 symbols.
Fixed.
>
>> + 	 * `lex_more`.
>> + 	 */
>> + 	assert_true(res != LUA_ERRMEM);
> Maybe it's better to use the condition res == LUA_OK here?
>
>> + 	lua_settop(L, 0);
>> +
>> + 	return TEST_EXIT_SUCCESS;
>> + }
>> +
>> + enum bc_emission_state {
>> + 	EMIT_BC,
>> + 	EMIT_EOF,
>> + };
>> +
>> + typedef struct {
>> + 	enum bc_emission_state state;
>> + } dt;
>> +
>> + /**
> Typo: s</**></*>

Fixed.


>
>> +  * Function returns a bytecode chunk on the first call and NULL
>> +  * and size equal to zero on the second call. Triggers the flag
>> +  * `END_OF_STREAM` in the function `lex_more`.
>> +  */
>> + static const char *
>> + bc_reader_with_eof(lua_State *L, void *data, size_t *size)
>> + {
>> + 	UNUSED(L);
>> + 	dt *test_data = (dt *)data;
>> + 	if (test_data->state == EMIT_EOF) {
>> + 		*size = 0;
>> + 		return NULL;
>> + 	}
>> +
>> + 	static char *bc_chunk = NULL;
>> + 	free(bc_chunk);
> This free is called only once, when bc_chunk is already NULL.
> I suggest moving the initialization of the `bc_chunk` to the beginning
> of the scope and calling `free()` only for the `EMIT_EOF` state (it's
> also a little bit more readable -- a reader shouldn't remember that
> `free(NULL)` is OK).
Updated.
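
A rough sketch of the reworked shape, under a few assumptions: the
function name `bc_reader_with_eof_sketch` and the `SIGNATURE_BYTE` macro
are made up for illustration, `L` is typed as `void *` so the snippet
builds without the Lua headers, and the second byte and the malloc
failure path are filled in here only to keep the example well defined.
The version on the branch may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum bc_emission_state { EMIT_BC, EMIT_EOF };
typedef struct { enum bc_emission_state state; } dt;

/* First byte of LUA_SIGNATURE ("\033Lua") from lua.h. */
#define SIGNATURE_BYTE '\033'

static const char *
bc_reader_with_eof_sketch(void *L, void *data, size_t *size)
{
	/* Initialized once, at the beginning of the scope. */
	static char *bc_chunk = NULL;
	dt *test_data = (dt *)data;
	(void)L;
	assert(test_data != NULL && size != NULL);

	if (test_data->state == EMIT_EOF) {
		/* Release the chunk only when signalling the end. */
		free(bc_chunk);
		bc_chunk = NULL;
		*size = 0;
		return NULL;
	}

	/* The signature (1 byte) and the bytecode itself (1 byte). */
	size_t sz = 2;
	bc_chunk = malloc(sz);
	if (bc_chunk == NULL) {
		/* Assumption: treat allocation failure as end of chunk. */
		*size = 0;
		return NULL;
	}
	bc_chunk[0] = SIGNATURE_BYTE;
	bc_chunk[1] = 0; /* Placeholder byte, not valid bytecode. */
	*size = sz;
	test_data->state = EMIT_EOF;
	return bc_chunk;
}
```

The first call hands out the two-byte chunk and switches the state; the
second call frees it and returns NULL with a zero size.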
>
>> +
>> + 	/**
> Typo: s</**></*>
Fixed.
>
>> + 	 * Minimal size of a buffer with bytecode:
>> + 	 * signiture (1 byte) and a bytecode itself (1 byte).
> Typo: s/a bytecode/the bytecode/
> Typo: s/signiture/The signature/


Fixed.

>
>> + 	 */
>> + 	size_t sz = 2;
>> + 	bc_chunk = malloc(sz);
>> + 	/**
> Typo: s</**></*>
Fixed.
>
>> + 	 * `lua_load` automatically detects whether the chunk is text or binary,
> Typo: s/binary,/binary/
Fixed.
>
>> + 	 * and loads it accordingly. We need a trace for *bytecode* input,
>> + 	 * so it is necessary to deceive a check in `lj_lex_setup`, that
>> + 	 * makes a sanity check and detects whether input is bytecode or text
>> + 	 * by the first char. Put `LUA_SIGNATURE[0]` at the beginning of the
>> + 	 * allocated region.
> Nit: Comment line width is greater than 66 symbols.
Fixed.
>
>> + 	 */
>> + 	bc_chunk[0] = LUA_SIGNATURE[0];
>> + 	*size = sz;
>> + 	test_data->state = EMIT_EOF;
>> +
>> + 	return bc_chunk;
>> + }
>> +
>> + static int bc_loader_with_eof(void *test_state)
>> + {
>> + 	lua_State *L = test_state;
>> + 	dt test_data = {0};
>> + 	test_data.state = EMIT_BC;
>> + 	int res = lua_load(L, bc_reader_with_eof, &test_data, "eof");
>> + 	assert_true(res = LUA_ERRSYNTAX);
> Typo: s/=/==/

Fixed.


> But res is indeed `LUA_ERRSYNTAX` for now :).
>
>> + 	lua_settop(L, 0);
>> +
>> + 	return TEST_EXIT_SUCCESS;
>> + }
>> +
>> + int main(void)
>> + {
>> + 	lua_State *L = utils_lua_init();
>> + 	const struct test_unit tgroup[] = {
>> + 		test_unit_def(bc_loader_with_endmark),
>> + 		test_unit_def(bc_loader_with_eof)
>> + 	};
>> +
>> + 	const int test_result = test_run_group(tgroup, L);
>> + 	utils_lua_close(L);
>> + 	return test_result;
>> + }
> [1]: https://www.lua.org/manual/5.1/manual.html#lua_Reader
>
> --
> Best regards,
> Sergey Kaplun

