[Tarantool-patches] [PATCH v2] luajit: proxy -j and -b flags

Sergey Kaplun skaplun at tarantool.org
Wed Dec 8 10:36:10 MSK 2021


Hi, Maxim!

Thanks for the patch!
Please consider my comments below.

First of all please change spaces to tabs considering Tarantool code style
guide. Secondly, please add some tests for new flags (i.e. tests for -b[opt]
-j[command]). We don't need to test their functionality, but test parsing of
those flags by us (includes tricky part for jitcmd).

Thirdly, please check CI [1] before sending patch for review ;) :

| $ ./test-run.py args.test.py
| ...
| [001] box-py/args.test.py                                             [ fail ]
| [001]
| [001] Test failed! Result content mismatch:
| [001] --- box-py/args.result    Mon Dec  6 09:30:03 2021
| [001] +++ var/rejects/box-py/args.reject        Wed Dec  8 10:16:35 2021
| [001] @@ -9,6 +9,8 @@
| [001]    -v, --version                  print program version and exit
| [001]    -e EXPR                        execute string 'EXPR'
| [001]    -l NAME                        require library 'NAME'
| [001] +  -j CMD                         perform LuaJIT control command
| [001] +  -b[options] input output       save LuaJIT bytecode
| [001]    -i                             enter interactive mode after executing 'SCRIPT'
| [001]    --                             stop handling options
| [001]    -                              execute stdin and stop handling options
| [001] @@ -27,6 +29,8 @@
| [001]    -v, --version                  print program version and exit
| [001]    -e EXPR                        execute string 'EXPR'
| [001]    -l NAME                        require library 'NAME'
| [001] +  -j CMD                         perform LuaJIT control command
| [001] +  -b[options] input output       save LuaJIT bytecode
| [001]    -i                             enter interactive mode after executing 'SCRIPT'
| [001]    --                             stop handling options
| [001]    -                              execute stdin and stop handling options

On 14.09.21, Maxim Kokryashkin wrote:
> There are two flags in the LuaJIT useful for debugging purposes: `-j`

Nit: debugging and runtime configuration (-joff/jon).

> and `-b`. However, if you want to check the same Lua code from the
> Tarantool, you will need to make some adjustments in the script itself,
> as those flags are not present in the Tarantool's CLI.
> 
> This patch introduces those flags to the Tarantool, so debugging is
> much more convenient now. Flags are working the same as they do in
> LuaJIT.
> 
> Closes #5541
> ---
> GitHub branch:
> https://github.com/tarantool/tarantool/tree/fckxorg/gh-5541-proxy-luajit-flags
> 
> Issues:
> https://github.com/tarantool/tarantool/issues/5541
> 
> Q: Why do we need fiber_sleep here?
> Please, drop a comment.
> 
> > + aux_loop_is_run = true;
> > + for (argv++; *argv != NULL; narg++, argv++)
> > + lua_pushstring(L, *argv);
> > + int res = lua_pcall(L, narg, 0, 0);
> > + if(res)
> > + goto error;
> > +end:
> > + if (!aux_loop_is_run)
> > + fiber_sleep(0.0);
> A:That chunk of code is taken almost verbatim from the `run_script_f()`
> as I need to achieve almost the same thing. So the first
> `fiber_sleep()` is needed, because `tarantool_lua_dump_bytecode()`
> needs to start auxiliary event loop and re-schedule this fiber. The

OK, got it now. Please add the same comment as for `run_script_f()` here.

> second `fiber_sleep()` is needed because event loop should be started
> before the fiber calls the `ev_break()`.
> 
> Q: IINM, bcsave module don't run a user's code, just precompile it to the
> bytecode. Do we need a separate fiber for it?
> A: Yes, we need it since tarantool has it's own definition of os.exit().
> If you call os.exit() when running lua code outside of any fiber, you will
> get a panic. Since there is no easy way to redefine it temporarily and
> not break anything, I decided to stick with execution inside a fiber.

Oh! Thanks for clarifying. Please add the corresponding comment then.

> 
>  src/lua/init.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++
>  src/lua/init.h |   2 +
>  src/main.cc    | 113 ++++++++++++++++++------------
>  3 files changed, 255 insertions(+), 45 deletions(-)
> 
> diff --git a/src/lua/init.c b/src/lua/init.c
> index f9738025d..54d07926a 100644
> --- a/src/lua/init.c
> +++ b/src/lua/init.c
> @@ -313,6 +313,154 @@ skip:		base = (base == -1 ? 10 : base);
>  
>  /* }}} */
>  
> +static void l_message(const char *msg)
> +{
> +        printf("%s\n", msg);
> +        fflush(stdout);
> +}

Side note: I'm not sure that this is really convinient way for Tarantool to
report error this way. Wait for Igor's opinion.

> +
> +/**
> + * This function checks if there is an error,
> + * and outputs error message corresponding to the
> + * error object on the top of the Lua stack,
> + * if there is one,

Typo: s/,/./

> + *
> + * This function pops the error message from the Lua stack

Typo: missed dot in the end of the sentence.

Nit: Also I suggest to rephrase this comment like the following:
| This function gets an error message as its argument on the Lua stack and prints
| it to stdout. After that it pops the error message from the stack.

Feel free to ignore or change it on your own.

> + */
> +static int report(lua_State *L, int status)
> +{
> +        if (status && !lua_isnil(L, -1)) {
> +                const char *msg = lua_tostring(L, -1);
> +                if (msg == NULL)
> +                        msg = "(error object is not a string)";
> +                l_message(msg);
> +                lua_pop(L, 1);
> +        }
> +        return status;
> +}
> +
> +/**
> + * Load add-on module. This function has no
> + * effects on the Lua stack.
> + */
> +static int loadjitmodule(lua_State* L)

Typo: s/lua_State* L/lua_State *L/

> +{
> +        lua_getglobal(L, "require");
> +        lua_pushliteral(L, "jit.");
> +        lua_pushvalue(L, -3);

Side note: It is argument from Lua stack, you can mention it in the
comment above.

> +        lua_concat(L, 2);
> +        if (lua_pcall(L, 1, 1, 0)) {

Nit: I prefer more clear way: `(lua_pcall(L, 1, 1, 0) != LUA_OK)`.
Feel free to ignore.

Also, is it possible to get these modules unloaded as far as they precompiled
into Tarantool in build-time? Is it possible  with multiple -e flag?

> +                const char *msg = lua_tostring(L, -1);
> +                if (msg && !strncmp(msg, "module ", 7))
> +                        goto nomodule;
> +                return report(L, 1);
> +        }
> +        lua_getfield(L, -1, "start");
> +        if (lua_isnil(L, -1)) {
> +        nomodule:

Side note: I'm note sure about that but looks like label should be aligned with
the code below, according Tarantool's guidelines.

> +                l_message("unknown luaJIT command or jit.* modules not installed");
> +                return 1;
> +        }
> +        /* Drop module table. */
> +        lua_remove(L, -2);
> +        return 0;
> +}
> +/**
> + * This function dump bytecode for Lua script.
> + * Has no effect on the Lua stack

Typo: missed dot at the end of the sentence.

> + */
> +int dobytecode(va_list ap)
> +{
> +        char **argv = va_arg(ap, char **);
> +        struct diag *diag = va_arg(ap, struct diag *);
> +
> +        int narg = 0;
> +        bool aux_loop_is_run = false;
> +
> +        lua_pushliteral(tarantool_L, "bcsave");
> +        if (loadjitmodule(tarantool_L))

Nit: For functions returned some status it is better to check that status is
not ok manually according Tarantool's guidelines, i.e.:
| if loadjitmodule(tarantool_L) != 0

> +                goto error;
> +        if (argv[0][2]) {
> +                narg++;
> +                argv[0][1] = '-';
> +                lua_pushstring(tarantool_L, argv[0] + 1);
> +        }
> +
> +        fiber_sleep(0.0);
> +        aux_loop_is_run = true;
> +
> +        for (argv++; *argv != NULL; narg++, argv++)
> +                lua_pushstring(tarantool_L, *argv);
> +        int res = lua_pcall(tarantool_L, narg, 0, 0);
> +        if (res != LUA_OK)
> +                goto error;

Nit: I prefer to add empty line here to better readability:
For me this label is blurred with the code above.
Feel free to ignore.

> +        end:
> +        if (!aux_loop_is_run)
> +                fiber_sleep(0.0);
> +	ev_break(loop(), EVBREAK_ALL);
> +	return 0;
> +
> +        error:

Shouldn't we set diag as LuajitError like it is done for run_script_f?

> +        diag_move(diag_get(), diag);
> +        goto end;
> +}
> +
> +/**
> + * Run command with options.

All commands or jit module only?

> + * Has no effect on the Lua stack.
> + */
> +static int runcmdopt(lua_State *L, const char *opt)
> +{
> +        int narg = 0;
> +        if (opt && *opt) {

Nit: we can add comment here that this loop splits arguments.

> +                for (;;) {
> +                        const char *p = strchr(opt, ',');
> +                        narg++;
> +                        if (!p) break;
> +                        if (p == opt)
> +                                lua_pushnil(L);
> +                        else
> +                                lua_pushlstring(L, opt, (size_t)(p - opt));
> +                        opt = p + 1;
> +                }
> +                if (*opt)
> +                        lua_pushstring(L, opt);
> +                else
> +                        lua_pushnil(L);
> +        }
> +        return report(L, lua_pcall(L, narg, 0, 0));
> +}
> +
> +/**
> + * JIT engine control command: try jit
> + * library first or load add-on module.
> + * Has no effect on the Lua stack.

It's not true. At least it removes module name (incoming parameter on Lua
stack) from the stack.

> + */
> +int dojitcmd(const char *cmd)
> +{
> +        const char *opt = strchr(cmd, '=');
> +        lua_pushlstring(tarantool_L, cmd, opt ? (size_t)(opt - cmd) : strlen(cmd));
> +        lua_getfield(tarantool_L, LUA_REGISTRYINDEX, "_LOADED");
> +        /* Get jit.* module table. */
> +        lua_getfield(tarantool_L, -1, "jit");
> +        lua_remove(tarantool_L, -2);
> +        lua_pushvalue(tarantool_L, -2);
> +        /* Lookup library function. */
> +        lua_gettable(tarantool_L, -2);
> +        if (!lua_isfunction(tarantool_L, -1)) {
> +                /* Drop non-function and jit.* table, keep module name. */
> +                lua_pop(tarantool_L, 2);

> +        if (loadjitmodule(tarantool_L))

Something strange with indentation here.

> +                return 1;
> +        } else {
> +                /* Drop jit.* table. */
> +                lua_remove(tarantool_L, -2);
> +        }
> +        /* Drop module name. */
> +        lua_remove(tarantool_L, -2);
> +        return runcmdopt(tarantool_L, opt ? opt + 1 : opt);
> +}
> +
>  /**
>   * Original LuaJIT/Lua logic: <luajit/src/lib_package.c - function setpath>
>   *
> @@ -608,6 +756,10 @@ run_script_f(va_list ap)
>  			lua_setglobal(L, optv[i + 1]);
>  			lua_settop(L, 0);
>  			break;
> +                case 'j':
> +                        if (dojitcmd(optv[i + 1]) != 0)
> +                                goto error;
> +                        break;

Why do you remove workaround from LuaJIT here?

| $ src/tarantool -j -e ''
| Can't open script : No such file or directory

| $ luajit -j -e ''
| luajit: unknown luaJIT command or jit.* modules not installed

I mean the following part:

| const char *cmd = argv[i] + 2;
| if (*cmd == '\0') cmd = argv[++i];
| lua_assert(cmd != NULL);
| if (dojitcmd(L, cmd))
|   return 1;
| break;

>  		case 'e':
>  			/*
>  			 * Execute chunk
> @@ -747,6 +899,39 @@ tarantool_lua_run_script(char *path, bool interactive,
>  	return diag_is_empty(diag_get()) ? 0 : -1;
>  }
>  
> +int
> +tarantool_lua_dump_bytecode(char **argv) {

<snipped>

> +}
> +
>  void
>  tarantool_lua_free()
>  {
> diff --git a/src/lua/init.h b/src/lua/init.h
> index 7fc0b1a31..3c9489468 100644
> --- a/src/lua/init.h
> +++ b/src/lua/init.h
> @@ -74,6 +74,8 @@ int
>  tarantool_lua_run_script(char *path, bool force_interactive,
>  			 int optc, const char **optv,
>  			 int argc, char **argv);

Nit: missing empty line here.
Also, we can add a comment with functions decription.

> +int
> +tarantool_lua_dump_bytecode(char **argv);
>  
>  extern char *history;
>  
> diff --git a/src/main.cc b/src/main.cc
> index 91ed79fab..65248e655 100644
> --- a/src/main.cc
> +++ b/src/main.cc
> @@ -576,6 +576,8 @@ print_help(const char *program)
>  	puts("  -v, --version\t\t\tprint program version and exit");
>  	puts("  -e EXPR\t\t\texecute string 'EXPR'");
>  	puts("  -l NAME\t\t\trequire library 'NAME'");
> +	puts("  -j CMD \t\t\tperform LuaJIT control command");
> +	puts("  -b[options] input output\tsave LuaJIT bytecode");

Nit: I prefer the same description as for LuaJIT:
| -b ...    Save or list bytecode.
| -j cmd    Perform LuaJIT control command.

There is no need (IMO) to use uppercased words as far as they aren't mentioned
in the option description.

>  	puts("  -i\t\t\t\tenter interactive mode after executing 'SCRIPT'");
>  	puts("  --\t\t\t\tstop handling options");
>  	puts("  -\t\t\t\texecute stdin and stop handling options");
> @@ -584,6 +586,12 @@ print_help(const char *program)
>  	puts("to see online documentation, submit bugs or contribute a patch.");
>  }
>  
> +#define O_INTERACTIVE 0x1
> +#define O_BYTECODE    0x2
> +
> +extern "C" void **
> +export_syms(void);

For what do we need these lines?

> +
>  int
>  main(int argc, char **argv)
>  {
> @@ -595,7 +603,7 @@ main(int argc, char **argv)
>  	fpconv_check();
>  
>  	/* Enter interactive mode after executing 'script' */
> -	bool interactive = false;
> +        uint32_t opt_mask = 0;

Maybe it is enough to introduce one more boolean variable?
At least it looks OK for 2 options (interactive|bytecode).
Feel free to ignore.

>  	/* Lua interpeter options, e.g. -e and -l */
>  	int optc = 0;
>  	const char **optv = NULL;
> @@ -606,38 +614,47 @@ main(int argc, char **argv)
>  		{"version", no_argument, 0, 'v'},
>  		{NULL, 0, 0, 0},
>  	};
> -	static const char *opts = "+hVvie:l:";
> +	static const char *opts = "+hVvbij:e:l:";

Why do we need ':' option here for j?

>  
>  	int ch;
> +        bool lj_arg = false;

There is no need in special variable -- we can just use goto here.

Also, I don't know how we can should work here with multiple options.
For example, we have -i and -bl options together:

| $ src/tarantool -i -bl -e ''
| -- BYTECODE -- "":0-1
| 0001    RET0     0   1

| $ luajit -i -bl -e ''
| usage: luajit [options]... [script [args]...].
| Available options are:
| ...

The behaviour in the second example looks more convenient for me: user gets
messy with contradicting flags (-i launches interactive mode, but -bl only
dumps bytecode without chunk execution) and is noticed about that via an error.

>  	while ((ch = getopt_long(argc, argv, opts, longopts, NULL)) != -1) {
> -		switch (ch) {
> -		case 'V':
> -		case 'v':
> -			print_version();
> -			return 0;
> -		case 'h':
> -			print_help(basename(argv[0]));
> -			return 0;
> -		case 'i':
> -			/* Force interactive mode */
> -			interactive = true;
> -			break;
> -		case 'l':
> -		case 'e':
> -			/* Save Lua interepter options to optv as is */
> -			if (optc == 0) {
> -				optv = (const char **) calloc(argc,
> -							      sizeof(optv[0]));
> -				if (optv == NULL)
> -					panic_syserror("No enough memory for arguments");
> -			}
> -			optv[optc++] = ch == 'l' ? "-l" : "-e";
> -			optv[optc++] = optarg;
> -			break;
> -		default:
> -			/* "invalid option" is printed by getopt */
> -			return EX_USAGE;
> -		}
> +                switch (ch) {
> +                        case 'V':
> +                        case 'v':
> +                                print_version();
> +                                return 0;
> +                        case 'h':
> +                                print_help(basename(argv[0]));
> +                                return 0;
> +                        case 'i':
> +                                /* Force interactive mode */
> +                                opt_mask |= O_INTERACTIVE;
> +                                break;
> +                        case 'b':
> +                                opt_mask |= O_BYTECODE;
> +                                lj_arg = true;
> +                                break;
> +                        case 'j':
> +                        case 'l':
> +                        case 'e':
> +                                /* Save Lua interepter options to optv as is */
> +                                if (optc == 0) {
> +                                        optv = (const char **) calloc(argc, sizeof(optv[0]));
> +                                        if (optv == NULL)
> +                                                panic_syserror("No enough memory for arguments");
> +                                }
> +                                if(ch == 'l') optv[optc++] = "-l";
> +                                else if(ch == 'j') optv[optc++] = "-j";
> +                                else optv[optc++] = "-e";
> +                                optv[optc++] = optarg;
> +                                break;
> +                        default:
> +                                /* "invalid option" is printed by getopt */
> +                                return EX_USAGE;
> +                }
> +
> +                if(lj_arg) break;
>  	}
>  
>  	/* Shift arguments */
> @@ -645,19 +662,21 @@ main(int argc, char **argv)
>  	for (int i = 1; i < argc; i++)
>  		argv[i] = argv[optind + i - 1];
>  
> -	if (argc > 1 && strcmp(argv[1], "-") && access(argv[1], R_OK) != 0) {
> -		/*
> -		 * Somebody made a mistake in the file
> -		 * name. Be nice: open the file to set
> -		 * errno.
> -		 */
> -		int fd = open(argv[1], O_RDONLY);
> -		int save_errno = errno;
> -		if (fd >= 0)
> -			close(fd);
> -		printf("Can't open script %s: %s\n", argv[1], strerror(save_errno));
> -		return save_errno;
> -	}
> +        if(!(opt_mask & O_BYTECODE)) {

Nit: Looks like we can join if conditions, i.e.:
| !(opt_mask & O_BYTECODE) && argc > 1 && ...

Also, why do we need to avoid setting of errno in that case?

> +                if (argc > 1 && strcmp(argv[1], "-") && access(argv[1], R_OK) != 0) {

Nit: linewidth is more than 80 characters.

> +                        /*
> +                        * Somebody made a mistake in the file
> +                        * name. Be nice: open the file to set
> +                        * errno.
> +                        */
> +                        int fd = open(argv[1], O_RDONLY);
> +                        int save_errno = errno;
> +                        if (fd >= 0)
> +                                close(fd);
> +                        printf("Can't open script %s: %s\n", argv[1], strerror(save_errno));

Nit: Ditto.

> +                                return save_errno;
> +                }
> +        }
>  
>  	argv = title_init(argc, argv);
>  	/*
> @@ -747,13 +766,17 @@ main(int argc, char **argv)
>  			panic("%s", "can't init event loop");
>  
>  		int events = ev_activecnt(loop());
> -		/*
> +
> +                if(opt_mask & O_BYTECODE) {

Nit: {} are excess.

> +                        return tarantool_lua_dump_bytecode(argv);
> +                }
> +                /*
>  		 * Load user init script.  The script should have access
>  		 * to Tarantool Lua API (box.cfg, box.fiber, etc...) that
>  		 * is why script must run only after the server was fully
>  		 * initialized.
>  		 */
> -		if (tarantool_lua_run_script(script, interactive, optc, optv,
> +		if (tarantool_lua_run_script(script, opt_mask & O_INTERACTIVE, optc, optv,
>  					     main_argc, main_argv) != 0)
>  			diag_raise();
>  		/*
> -- 
> 2.33.0
> 

[1]: https://github.com/tarantool/tarantool/runs/3601797316?check_suite_focus=true#step:5:5089

-- 
Best regards,
Sergey Kaplun



More information about the Tarantool-patches mailing list