[Tarantool-patches] [PATCH luajit 2/2] test: rewrite sysprof test using managed execution

Maxim Kokryashkin m.kokryashkin at tarantool.org
Sun Dec 3 17:17:12 MSK 2023


Hi, Igor!
Thanks for the patch!
Please consider my comments below.

Side note: This new testing approach is amazing, thanks a lot!

On Tue, Nov 28, 2023 at 02:53:17PM +0000, Igor Munkin wrote:
> Often tests for sampling profiler require long running loops to be
Typo: s/Often/Often,/
> executed, so a certain situation is likely to occur. Thus the test added
Typo: s/Thus/Thus,/
> in the commit 285a1b0a16c1c3224ab93f7cf1fef17ab29e4ab4 ("sysprof: fix
Typo: s/in the/in/
> crash during FFUNC stream") expects FFUNC VM state (and even the
Typo: s/expects FFUNC/expects the FFUNC/
> particular instruction to be executed) at the moment when stacktrace is
> being collected. Unfortunately, it leads to test routine hang for
Typo: s/to test routine hang/to the test routine hanging/
> several environments and if it does not we cannot guarantee that the
Typo: s/environments/environments,/
Typo: s/and if it does not/and even if it does not,/
> desired scenario is tested (only rely on statistics). As a result the
Typo: s/As a result/As a result,/
> test for the aforementioned patch was disabled for Tarantool CI in the
> commit fef60a1052b8444be61e9a9932ab4875aff0b2ed ("test: prevent hanging
> Tarantool CI by sysprof test") until the issue is not resolved.
>
> This patch introduces the new approach for testing our sampling profiler
> via <ptrace> implementing precise managed execution of the profiled
Typo: s/<ptrace>/<ptrace>,/
> instance mentioned in tarantool/tarantool#9387.
>
> Instead of running around <tostring> gazillion times we accurately step
Typo: s/times/times,/
> to the exact place where the issue reproduces and manually emit SIGPROF
> to the Lua VM being profiled. The particular approach implemented in
> this patch is described below.
>
> As it was mentioned, the test makes sysprof to collect the particular
Typo: s/to collect/collect/
> event (FFUNC) at the certain instruction in Lua VM (<lj_fff_res1>) to
> reproduce the issue from tarantool/tarantool#8594. Hence it's enough to
Typo: s/Hence/Hence,/
> call <tostring> fast function in the profiled instance (i.e. "tracee").
Typo: s/i.e./i.e.,/
> To emit SIGPROF right at <lj_fff_res1> in scope of <tostring> builtin,
Typo: s/in scope/in the scope/
> the manager (i.e. "tracer") is implemented.
Typo: s/i.e./i.e.,/
>
> Here are the main steps (see comments and `man 2 ptrace' for more info):
>   1. Poison <int 3> instruction as the first instruction at
>      <lj_ff_tostring> to stop at the beginning of the fast function;
>   2. Resume the "tracee" from the "tracer";
>   3. Hit the emitted interruption, restore the original instruction and
>      "rewind" the RIP to "replay" the instruction at <lj_ff_tostring>;
>   4. Do the hack 1-3 for <lj_fff_res1>;
>   5. Emit SIGPROF while resumimg the "tracee";
Typo: s/resumimg/resuming/
>
> As a result sysprof collects the full backtrace with <tostring> fast
Typo: s/a result/a result,/
> function as the topmost frame.
>
> Resolves tarantool/tarantool#9387
> Follows up tarantool/tarantool#8594
>
> Signed-off-by: Igor Munkin <imun at tarantool.org>
> ---
>  .../gh-8594-sysprof-ffunc-crash.test.c        | 269 ++++++++++++++++++
>  .../gh-8594-sysprof-ffunc-crash.test.lua      |  55 ----
>  2 files changed, 269 insertions(+), 55 deletions(-)
>  create mode 100644 test/tarantool-c-tests/gh-8594-sysprof-ffunc-crash.test.c
>  delete mode 100644 test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
>
> diff --git a/test/tarantool-c-tests/gh-8594-sysprof-ffunc-crash.test.c b/test/tarantool-c-tests/gh-8594-sysprof-ffunc-crash.test.c
> new file mode 100644
> index 00000000..4caed8cb
> --- /dev/null
> +++ b/test/tarantool-c-tests/gh-8594-sysprof-ffunc-crash.test.c
> @@ -0,0 +1,269 @@
> +#include "lauxlib.h"
> +#include "lmisclib.h"
> +#include "lua.h"
> +
> +#include "test.h"
> +#include "utils.h"
> +
> +#include <signal.h>
> +#include <sys/ptrace.h>
> +#include <sys/user.h>
> +#include <sys/wait.h>
> +#include <unistd.h>
> +
> +/* XXX: Still need normal assert inside <tracee> and helpers. */
> +#undef NDEBUG
> +#include <assert.h>
> +
> +/*
> + * XXX: The test is *very* Linux/x86_64 specific. Fortunately, so
> + * does the sampling profiler. <lj_arch.> is needed for LUAJIT_OS
> + * and LUAJIT_TARGET.
> + */
> +#include "lj_arch.h"
> +
> +#if LUAJIT_OS == LUAJIT_OS_LINUX && LUAJIT_TARGET == LUAJIT_ARCH_X64
Why not use the `LJ_HASSYSPROF` flag?
> +
> +/*
> + * XXX: The test makes sysprof to collect the particular event
Typo: s/sysprof to/sysprof/
> + * (FFUNC) at the particular instruction (<lj_fff_res1>) to
> + * reproduce the issue #8594. Hence it's enough to call <tostring>
Typo: s/Hence/Hence,/
> + * fast function (this is done in <tracee> function). To emit
> + * SIGPROF right at <lj_fff_res1> in scope of <tostring> fast
> + * function, the managed execution is implemented in the function
> + * <tracer>: <int 3> instruction is poisoned as the first
> + * instruction at <lj_ff_tostring> to stop <tracee> at the
> + * beginning of the fast function; <tracer> resumes the <tracee>;
> + * the same hack is done for <lj_fff_res1>. When the <tracee> hits
> + * the interruption at <lj_fff_res1>, SIGPROF is emitted while
> + * resuming the <tracee>. As a result sysprof collects the full
Typo: s/a result/a result,/
> + * backtrace with <tostring> fast function as the topmost frame.
> + *
> + * See more info here:
> + * * https://man7.org/linux/man-pages/man2/ptrace.2.html
> + * * https://github.com/tarantool/tarantool/issues/8594
> + * * https://github.com/tarantool/tarantool/issues/9387
> + */
> +
> +#define MESSAGE "Canary is alive"
> +#define LUACALL "local a = tostring('" MESSAGE "') return a"
> +/* XXX: Resolve the necessary addresses from VM engine. */
> +extern void *lj_ff_tostring(void);
> +extern void *lj_fff_res1(void);
> +
> +/* Sysprof "/dev/null" stream helpers. {{{ */
> +
> +/*
> + * Yep, 8Mb. Tuned in order not to bother the platform with too
> + * often flushes.
> + */
> +#define STREAM_BUFFER_SIZE (8 * 1024 * 1024)
> +#define DEVNULL -1
> +
> +struct devnull_ctx {
> +	/*
> +	 * XXX: Dummy file descriptor to be used as "/dev/null"
> +	 * context indicator in writer and on_stop callback.
> +	 */
> +	int fd;
> +	/* Buffer for data recorded by sysprof. */
> +	uint8_t buf[STREAM_BUFFER_SIZE];
> +};
> +
> +static int stream_new(struct luam_Sysprof_Options *options) {
> +	struct devnull_ctx *ctx = calloc(1, sizeof(struct devnull_ctx));
> +	if (ctx == NULL)
> +		return PROFILE_ERRIO;
> +
> +	/* Set "/dev/null" context indicator. */
> +	ctx->fd = DEVNULL;
> +	options->ctx = ctx;
> +	options->buf = ctx->buf;
> +	options->len = STREAM_BUFFER_SIZE;
> +
> +	return PROFILE_SUCCESS;
> +}
> +
> +static int stream_delete(void *rawctx, uint8_t *buf) {
> +	struct devnull_ctx *ctx = rawctx;
> +	assert(ctx->fd == DEVNULL);
> +	free(ctx);
> +	return PROFILE_SUCCESS;
> +}
> +
> +static size_t stream_writer(const void **buf_addr, size_t len, void *rawctx) {
> +	struct devnull_ctx *ctx = rawctx;
> +	assert(ctx->fd == DEVNULL);
> +	/* Do nothing, just return back to the profiler. */
> +	return STREAM_BUFFER_SIZE;
> +}
> +
> +/* }}} Sysprof "/dev/null" stream helpers. */
> +
> +static int tracee(const char *luacode) {
> +	struct luam_Sysprof_Counters counters = {};
> +	struct luam_Sysprof_Options opt = {
> +		/* Collect full backtraces per event. */
> +		.mode = LUAM_SYSPROF_CALLGRAPH,
> +		/*
> +		 * XXX: Setting the "endless timer". The test
> +		 * requires the single event to be streamed at
> +		 * <lj_fff_res1> instruction, so to avoid spoiling
> +		 * the stream with other unwanted events, the
> +		 * timer is set to some unreachable point, so the
> +		 * profiler will be guaranteed to stop before any
> +		 * event is emitted.
> +		 */
> +		.interval = -1ULL,
> +	};
> +
> +	/* Allow tracing for this process */
> +	if (ptrace(PTRACE_TRACEME, 0, 0, 0) < 0) {
> +		perror("Failed to turn the calling thread into a tracee");
Typo?: s/thread/process/
> +		return EXIT_FAILURE;
> +	}
> +
> +	/*
> +	 * XXX: Allow parent (which is our tracer now) to observe
> +	 * our signal-delivery-stop (i.e. the tracee is ready).
> +	 * For more info see ptrace(2), "Attaching and detaching".
> +	 */
> +	raise(SIGSTOP);
> +
> +	lua_State *L = utils_lua_init();
> +
> +	/* Customize and start profiler. */
> +	assert(stream_new(&opt) == PROFILE_SUCCESS);
> +	assert(luaM_sysprof_set_writer(stream_writer) == PROFILE_SUCCESS);
> +	assert(luaM_sysprof_set_on_stop(stream_delete) == PROFILE_SUCCESS);
> +	assert(luaM_sysprof_start(L, &opt) == PROFILE_SUCCESS);
> +
> +	/* FIXME: Make this part test agnostic. */
> +	assert(luaL_dostring(L, luacode) == LUA_OK);
> +	assert(strcmp(lua_tostring(L, -1), MESSAGE) == 0);
> +
> +	/* Terminate profiler and Lua universe. */
> +	assert(luaM_sysprof_stop(L) == PROFILE_SUCCESS);
> +	utils_lua_close(L);
It doesn't seem to be semantically correct to terminate Lua universe
before obtaining the counters, even though there is no obstacles to do
so in the API implementation itself.
> +
> +	/*
> +	 * XXX: The only event to be streamed must be FFUNC at
> +	 * <lj_fff_res1> instruction.
> +	 * FIXME: Make this part test agnostic.
> +	 */
> +	assert(luaM_sysprof_report(&counters) == PROFILE_SUCCESS);
> +	assert(counters.samples == 1);
> +	assert(counters.vmst_ffunc == 1);
> +
> +	return EXIT_SUCCESS;
> +}
> +
> +const uint8_t INT3 = 0xCC;
> +static inline unsigned long int3poison(unsigned long instruction) {
> +	const size_t int3bits = sizeof(INT3) * 8;
> +	const unsigned long int3mask = -1UL >> int3bits << int3bits;
> +	return (instruction & int3mask) | INT3;
> +}
> +
> +static int continue_until(pid_t chpid, void *addr) {
> +	int wstatus;
> +	struct user_regs_struct regs;
> +
> +	/* Obtain the instructions at the <addr>. */
> +	unsigned long data = ptrace(PTRACE_PEEKTEXT, chpid, addr, 0);
> +	/*
> +	 * Emit the <int 3> instruction to the <addr>.
> +	 * XXX: <int 3> is poisoned as the LSB to the <data>
> +	 * obtained from the <addr> above.
> +	 */
> +	ptrace(PTRACE_POKETEXT, chpid, addr, int3poison(data));
> +
> +	/* Resume <chpid> tracee until SIGTRAP occurs. */
> +	ptrace(PTRACE_CONT, chpid, 0, 0);
> +	/* Wait <chpid> tracee signal-delivery-stop. */
> +	waitpid(chpid, &wstatus, 0);
Should we check the `wstatus` here too?
> +
> +	/* Obtain GPR set to tweak RIP for further execution. */
> +	ptrace(PTRACE_GETREGS, chpid, 0, &regs);
> +	/*
> +	 * Make sure we indeed are stopped at <addr>.
> +	 * XXX: RIP points right after <int 3> instruction.
> +	 */
> +	assert(regs.rip == (long)addr + sizeof(INT3));
> +
> +	/*
> +	 * XXX: Restore the original instruction at <addr> and
> +	 * "rewind" RIP by <int 3> size to "replay" the poisoned
> +	 * instruction at the <addr>.
> +	 */
> +	regs.rip -= sizeof(INT3);
> +	ptrace(PTRACE_SETREGS, chpid, 0, &regs);
> +	ptrace(PTRACE_POKETEXT, chpid, addr, data);
> +
> +	/* Return wait status to the caller for test checks. */
> +	return wstatus;
> +}
> +
> +static int tracer(pid_t chpid) {
> +	int wstatus;
> +
> +	/* Wait until <chpid> tracee is ready. */
> +	waitpid(chpid, &wstatus, 0);
Should we check the process status here too?
> +
> +	/* Resume <chpid> tracee until <lj_ff_tostring>. */
> +	wstatus = continue_until(chpid, lj_ff_tostring);
> +
> +	/* The tracee has to be alive and stopped by SIGTRAP. */
> +	assert_false(WIFEXITED(wstatus));
> +	assert_true(WIFSTOPPED(wstatus));
> +
> +	/* Resume <chpid> tracee until <lj_fff_res1>. */
> +	wstatus = continue_until(chpid, lj_fff_res1);
> +
> +	/* The tracee has to be alive and stopped by SIGTRAP. */
> +	assert_false(WIFEXITED(wstatus));
> +	assert_true(WIFSTOPPED(wstatus));
This part is duplicated, let's move it to a separate `healthcheck`
function, or something like that.
> +
> +	/* Send SIGPROF to make sysprof collect the event. */
> +	ptrace(PTRACE_CONT, chpid, 0, SIGPROF);
> +
> +	/* Wait until <chpid> tracee successfully exits. */
> +	waitpid(chpid, &wstatus, 0);
> +	assert_true(WIFEXITED(wstatus));
> +
> +	return TEST_EXIT_SUCCESS;
> +}
> +
> +static int test_tostring_call(void *ctx) {
> +	pid_t chpid = fork();
> +	switch(chpid) {
> +	case -1:
> +		bail_out("Tracee fork failed");
> +	case 0:
> +		/*
> +		 * XXX: Tracee has to <exit> instead of <return>
> +		 * to avoid duplicate reports in <test_run_group>.
> +		 * Test assertions are used only in the <tracer>,
> +		 * so the <tracer> ought to report whether the
> +		 * test succeeded or not.
> +		 */
> +		exit(tracee(LUACALL));
> +	default:
> +		return tracer(chpid);
> +	}
> +}
> +
> +#else /* LUAJIT_OS == LUAJIT_OS_LINUX && LUAJIT_TARGET == LUAJIT_ARCH_X64 */
> +
> +static int test_tostring_call(void *ctx) {
> +	return skip_all("sysprof is implemented for Linux/x86_64 only");
> +}
> +
> +#endif /* LUAJIT_OS == LUAJIT_OS_LINUX && LUAJIT_TARGET == LUAJIT_ARCH_X64 */
> +
> +int main(void) {
> +	const struct test_unit tgroup[] = {
> +		test_unit_def(test_tostring_call),
> +	};
> +	return test_run_group(tgroup, NULL);
> +}
> diff --git a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua b/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> deleted file mode 100644
> index f8b409ae..00000000
> --- a/test/tarantool-tests/gh-8594-sysprof-ffunc-crash.test.lua
> +++ /dev/null
<snipped>


More information about the Tarantool-patches mailing list