From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 6B04B6FC87; Wed, 29 Sep 2021 09:55:27 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 6B04B6FC87 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1632898527; bh=urVkT3DZm4Ipf513bHk/7Jx/ku+lwFKLsYBxiWwWPfw=; h=Date:To:References:In-Reply-To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=InKWm105iUewiPhi0JfGNZ11u6ex1CqtYn76eIzXI10ALruI4waSDuno/VSNUBCXH oXRc5Kf3+mVOXkvrb/8mp+N7gqV+KH1NEKbp472JdF+1jqfzjij3/1pPQksv/e3A8X TMFEt90j/QHQER0tVIWXy6I5inAmwEtm/mkXEgBM= Received: from smtp35.i.mail.ru (smtp35.i.mail.ru [94.100.177.95]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id BA6146FC87 for ; Wed, 29 Sep 2021 09:55:25 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org BA6146FC87 Received: by smtp35.i.mail.ru with esmtpa (envelope-from ) id 1mVTUy-0002R2-Fw; Wed, 29 Sep 2021 09:55:25 +0300 Date: Wed, 29 Sep 2021 09:53:56 +0300 To: Maxim Kokryashkin , Mikhail Shishatskiy Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-4EC0790: 10 X-7564579A: EEAE043A70213CC8 X-77F55803: 4F1203BC0FB41BD96A58C36AA2E996499DED20B83AFF2DE90C450EFC080BE8B7182A05F5380850405013DE2374196C540B9D1E3DB3F8FDEB46E9E467832F8C58C87DE0C9DD3F8B29 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE73F300A97FDD4E158EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637B4ADA525623CF2D68638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D822A2BF5C9158164B6016FE21B7B0A57F117882F4460429724CE54428C33FAD305F5C1EE8F4F765FCAA867293B0326636D2E47CDBA5A96583BD4B6F7A4D31EC0BC014FD901B82EE079FA2833FD35BB23D27C277FBC8AE2E8B2EE5AD8F952D28FBA471835C12D1D977C4224003CC8364762BB6847A3DEAEFB0F43C7A68FF6260569E8FC8737B5C2249EC8D19AE6D49635B68655334FD4449CB9ECD01F8117BC8BEAAAE862A0553A39223F8577A6DFFEA7CB24F08513AFFAC7943847C11F186F3C59DAA53EE0834AAEE X-C1DE0DAB: 0D63561A33F958A52B2BE309BFE8C5AA79EEC78C44B6411482B78DB340E5F7B8D59269BC5F550898D99A6476B3ADF6B47008B74DF8BB9EF7333BD3B22AA88B938A852937E12ACA75BFC02AB3DF06BA5A410CA545F18667F91A7EA1CDA0B5A7A0 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D34BD150993ED26DDF47A0B84A52C5ABE5635EB0452EFEA0BBD4A3F40A004B9039B21928ED36AF9D0C81D7E09C32AA3244CFB70077A4F45F178C541ACBCED1D8E9DE8FBBEFAE1C4874C927AC6DF5659F194 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojWaDhU1ub98YdbXUVGUh5Iw== X-Mailru-Sender: 3B9A0136629DC91206CBC582EFEF4CB4022736A18AB250725E67EDBF9F48B43B021A003F19E14F19F2400F607609286E924004A7DEC283833C7120B22964430C52B393F8C72A41A89437F6177E88F7363CDA0F3B3F5B9367 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH luajit 4/7] core: introduce lua and platform profiler X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Sergey Kaplun via Tarantool-patches Reply-To: Sergey Kaplun Cc: tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Hi, Maxim, Mikhail! Thanks for the patch! Please consider my comments below. On 08.09.21, Maxim Kokryashkin wrote: > From: Mikhail Shishatskiy > > This patch introduces a sampling platform profiler for > the Lua machine. > > The profiler uses the signal sampling backend from the > low-level profiler built in vanilla LuaJIT, which was put > into a separate module in one of the previous patches. I suppose we can omit this clarification -- if somebody looks at git log he'll see it. > Thus, one cannot use both profilers at the same time. Why we can define the signal (other that used for LuaJIT profiler) to avoid conflict? > > First of all profiler dumps the definitions of all > loaded Lua functions and all loaded shared libraries > (symtab) via the write buffer introduced in one of > the previous patches. Suggest to drop "introduced in one of ..." > > As the profiling signal may occur at any time, we need > to provide some guarantees for the sampling profiler > to unwind consistent stack frames. > So, the VM is adjusted > to save stack's current top frame into a dedicated > variable `guesttop` in the virtual machine global state. It is done in previous patch, so I suppose we shouldn't mention it in this commit. > > When signal occurs, the profiler dumps current VM state Typo: s/signal/a signal/ > and additional info about about stacks that can be > considered consistent in a given state. Also, profiling > can be done without writing to file: vmstate counters > are accumulated in static memory and can be received via > the `luaM_sysprof_report` function. For more details see Typo: s/details see/details, see/ > the header file. > > When profiling is over, the old signal handler is restored, > and a special epilogue header is written. > > Part of tarantool/tarantool#781 I suggest to move tests and Lua C API to commit, that introduses Lua API as well. And in this commit we will only introduce core functionality without API. > --- > CMakeLists.txt | 6 + > src/CMakeLists.txt | 1 + > src/Makefile.dep.original | 8 +- > src/lj_arch.h | 11 + > src/lj_errmsg.h | 2 +- > src/lj_mapi.c | 26 + > src/lj_state.c | 7 + > src/lj_sysprof.c | 483 ++++++++++++++++++ > src/lj_sysprof.h | 94 ++++ > src/lmisclib.h | 93 ++++ > test/tarantool-tests/CMakeLists.txt | 1 + > .../misclib-sysprof-capi.test.lua | 53 ++ > .../misclib-sysprof-capi/CMakeLists.txt | 1 + > .../misclib-sysprof-capi/testsysprof.c | 269 ++++++++++ > 14 files changed, 1052 insertions(+), 3 deletions(-) > create mode 100644 src/lj_sysprof.c > create mode 100644 src/lj_sysprof.h > create mode 100644 test/tarantool-tests/misclib-sysprof-capi.test.lua > create mode 100644 test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt > create mode 100644 test/tarantool-tests/misclib-sysprof-capi/testsysprof.c Please update for build with disabled sysprof as well. > > diff --git a/CMakeLists.txt b/CMakeLists.txt > index 5348e043..0c702093 100644 > --- a/CMakeLists.txt > +++ b/CMakeLists.txt > @@ -184,6 +184,12 @@ if(LUAJIT_DISABLE_MEMPROF) > AppendFlags(TARGET_C_FLAGS -DLUAJIT_DISABLE_MEMPROF) > endif() > > +# Disable platform and lua profiler. > +option(LUAJIT_DISABLE_SYSPROF "LuaJIT platform and lua profiler support" OFF) Typo: s/lua/Lua/ > +if(LUAJIT_DISABLE_SYSPROF) > + AppendFlags(TARGET_C_FLAGS -DLUAJIT_DISABLE_SYSPROF) > +endif() > + > # Switch to harder (and slower) hash function when a collision > # chain in the string hash table exceeds a certain length. > option(LUAJIT_SMART_STRINGS "Harder string hashing function" ON) > diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt > index 0f3d888f..c19d8b54 100644 > --- a/src/CMakeLists.txt > +++ b/src/CMakeLists.txt > @@ -69,6 +69,7 @@ make_source_list(SOURCES_UTILS > make_source_list(SOURCES_PROFILER > SOURCES > lj_memprof.c > + lj_sysprof.c > lj_profile_timer.c > lj_profile.c Minor: lj_sysprof.c should be the last entry (alphabetically). > ) > diff --git a/src/Makefile.dep.original b/src/Makefile.dep.original > index b7d82719..503817bb 100644 > --- a/src/Makefile.dep.original > +++ b/src/Makefile.dep.original > @@ -140,8 +140,8 @@ lj_lib.o: lj_lib.c lauxlib.h lua.h luaconf.h lj_obj.h lj_def.h lj_arch.h \ > lj_load.o: lj_load.c lua.h luaconf.h lauxlib.h lj_obj.h lj_def.h \ > lj_arch.h lj_gc.h lj_err.h lj_errmsg.h lj_buf.h lj_str.h lj_func.h \ > lj_frame.h lj_bc.h lj_vm.h lj_lex.h lj_bcdump.h lj_parse.h > -lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h lj_arch.h \ > - lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h > +lj_mapi.o: lj_mapi.c lua.h luaconf.h lmisclib.h lj_obj.h lj_def.h \ > + lj_arch.h lj_dispatch.h lj_bc.h lj_jit.h lj_ir.h lj_sysprof.h > lj_mcode.o: lj_mcode.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \ > lj_gc.h lj_err.h lj_errmsg.h lj_jit.h lj_ir.h lj_mcode.h lj_trace.h \ > lj_dispatch.h lj_bc.h lj_traceerr.h lj_vm.h > @@ -204,6 +204,10 @@ lj_strscan.o: lj_strscan.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \ > lj_char.h lj_strscan.h > lj_symtab.o: lj_symtab.c lj_symtab.h lj_wbuf.h lj_def.h lua.h luaconf.h \ > lj_obj.h lj_arch.h > +lj_sysprof.o: lj_sysprof.c lj_arch.h lua.h luaconf.h lj_sysprof.h \ > + lj_obj.h lj_def.h lmisclib.h lj_debug.h lj_dispatch.h lj_bc.h lj_jit.h \ > + lj_ir.h lj_frame.h lj_trace.h lj_traceerr.h lj_wbuf.h lj_profile_timer.h \ > + lj_symtab.h > lj_tab.o: lj_tab.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h lj_gc.h \ > lj_err.h lj_errmsg.h lj_tab.h > lj_trace.o: lj_trace.c lj_obj.h lua.h luaconf.h lj_def.h lj_arch.h \ Looks like lj_state.o dependencies should be updated too, IINM: 0) We can fix missed dependencies from lj_memprof.h 1) We should add dependency by lj_sysprof.h > diff --git a/src/lj_arch.h b/src/lj_arch.h > index 5bf0afb8..f0b60092 100644 > --- a/src/lj_arch.h > +++ b/src/lj_arch.h > @@ -585,4 +589,11 @@ > #define LJ_HASMEMPROF 1 > #endif > > +/* Disable or enable the platform and lua profiler. */ Typo: s/lua/Lua/ > +#if defined(LUAJIT_DISABLE_SYSPROF) || defined(LJ_ARCH_NOSYSPROF) || LJ_TARGET_WINDOWS || LJ_TARGET_CYGWIN || LJ_TARGET_PS3 || LJ_TARGET_PS4 || LJ_TARGET_XBOX360 Side note: I not sure that all available systems have pthread.h. Is that so? > +#define LJ_HASSYSPROF 0 > +#else > +#define LJ_HASSYSPROF 1 > +#endif > + > #endif > diff --git a/src/lj_errmsg.h b/src/lj_errmsg.h > index ae0a18c0..77a08cb0 100644 > --- a/src/lj_errmsg.h > +++ b/src/lj_errmsg.h > diff --git a/src/lj_mapi.c b/src/lj_mapi.c > index b2b35a17..0b2284f0 100644 > --- a/src/lj_mapi.c > +++ b/src/lj_mapi.c > diff --git a/src/lj_state.c b/src/lj_state.c > index f82b1b5b..cc6f92f1 100644 > --- a/src/lj_state.c > +++ b/src/lj_state.c > diff --git a/src/lj_sysprof.c b/src/lj_sysprof.c > new file mode 100644 > index 00000000..c0c83fa9 > --- /dev/null > +++ b/src/lj_sysprof.c > @@ -0,0 +1,483 @@ Minor: It would be nice to add some module description like it is done for . > +#define lj_sysprof_c > +#define LUA_CORE > + > +#include "lj_arch.h" > +#include "lj_sysprof.h" > + > +#if LJ_HASSYSPROF > + > +#include "lj_obj.h" > +#include "lj_debug.h" > +#include "lj_dispatch.h" > +#include "lj_frame.h" > + > +#if LJ_HASJIT > +#include "lj_jit.h" > +#include "lj_trace.h" > +#endif > + > +#include "lj_wbuf.h" > +#include "lj_profile_timer.h" > +#include "lj_symtab.h" > + > +#include > +#include > +#include Minor: I supposet that we should check that execinfo is available on the system and `backtrace` symbol is defined. Not sure that it is true for Alpine Linux, for example. > + > +#define SYSPROF_HANDLER_STACK_DEPTH 4 Minor: please add a comment why it is 4 and not 5 or 3. > +#define SYSPROF_BACKTRACE_BUF_SIZE 4096 Firstly, I thought that this is the size of the buffer in bytes. I suggest to rename this macro to SYSPROF_BACKTRACE_FRAME_MAX or some think like that (mention FRAME is important -- to get that it stands for the highest possible frame). Feel free to change it on your own. Also, I suppose that 4096 values (i.e. 4 Kb) on the stack is too much. Do you know somebody ready to examine backtrace with at least 1k entries? Also, if for LuaJIT expected limit *may be* really around 4K entries with the following code: | luajit -e 'local function rev (s) return string.gsub(s, "(.)(.+)", function (c,s1) return rev(s1)..c end) end local x = string.rep("0", 1000) rev(x)' Tarantool's Lua fiber's stack is limited by ~300 entries, IIRC. (All numbers are empirically established.) In real life I never see backtraces with more than 50 entries. So, I suppose that 512 entries are more than enough. Thoughts? > + > +enum sysprof_state { > + /* Profiler needs to be configured. */ > + SPS_UNCONFIGURED, In my opinion, it is better to have default config that profiler starts with by default. Thoughts? > + /* Profiler is not running. */ > + SPS_IDLE, > + /* Profiler is running. */ > + SPS_PROFILE, > + /* > + ** Stopped in case of stopped or failed stream. > + ** Saved errno is set at luaM_sysprof_stop. > + */ > + SPS_HALT > +}; > + > +struct sysprof { > + global_State *g; /* Profiled VM. */ > + pthread_t thread; /* Profiled thread. */ > + enum sysprof_state state; /* Internal state. */ As far as this internal state can be changed inside a signal handler (see `sysprof_record_sample()`) it should be declared as volatile, to avoid compile optimization of this code part. Moreover, it should be used `sig_atomic_t` here to be sure that its usage is atomic in context of usage inside the signal handler. > + struct lj_wbuf out; /* Output accumulator. */ > + struct luam_sysprof_counters counters; /* Profiling counters. */ > + struct luam_sysprof_options opt; /* Profiling options. */ > + struct luam_sysprof_config config; /* Profiler configurations. */ > + lj_profile_timer timer; /* Profiling timer. */ > + int saved_errno; /* Saved errno when profiler failed. */ > +}; > + Minor: I suppose we should add a comment here, that only one VM (`global_State` in terms) can be profiled at a time. > +static struct sysprof sysprof = {0}; > + > +/* --- Stream ------------------------------------------------------------- */ > + > +static const uint8_t ljp_header[] = {'l', 'j', 'p', LJP_FORMAT_VERSION, > + 0x0, 0x0, 0x0}; Minor: use tabs for alignment here. > + > +static int stream_is_needed(struct sysprof *sp) > +{ > + return LUAM_SYSPROF_DEFAULT != sp->opt.mode; Nit: `sp->opt.mode != LUAM_SYSPROF_DEFAULT` is more readable for me. Feel free to ignore. > +} > + > +static void stream_prologue(struct sysprof *sp) > +{ > + lj_symtab_dump(&sp->out, sp->g); > + lj_wbuf_addn(&sp->out, ljp_header, sizeof(ljp_header)); > +} > + > +static void stream_epilogue(struct sysprof *sp) > +{ > + lj_wbuf_addbyte(&sp->out, LJP_EPILOGUE_BYTE); > +} > + > +static void stream_lfunc(struct lj_wbuf *buf, GCfunc *func) Nit: I suppose, that we can use `const GCfunc` instead to accent, that this func cann't change here. Here and below. > +{ Minor: I suppose, that we should add lua_assert(isluafunc(func)) here. > + GCproto *pt = funcproto(func); Nit: GC proto can be declared as `const` here, isn't it? > + lua_assert(pt != NULL); > + lj_wbuf_addbyte(buf, LJP_FRAME_LFUNC); > + lj_wbuf_addu64(buf, (uintptr_t)pt); > + lj_wbuf_addu64(buf, (uint64_t)pt->firstline); > +} > + > +static void stream_cfunc(struct lj_wbuf *buf, GCfunc *func) > +{ Minor: I suppose, that we should add lua_assert(iscfunc(func)) here. > + lj_wbuf_addbyte(buf, LJP_FRAME_CFUNC); > + lj_wbuf_addu64(buf, (uintptr_t)func->c.f); > +} > + > +static void stream_ffunc(struct lj_wbuf *buf, GCfunc *func) > +{ Minor: I suppose, that we should add lua_assert(isffunc(func)) here. > + lj_wbuf_addbyte(buf, LJP_FRAME_FFUNC); > + lj_wbuf_addu64(buf, func->c.ffid); > +} > + > +static void stream_frame_lua(struct lj_wbuf *buf, cTValue *frame) > +{ > + GCfunc *func = frame_func(frame); Nit: GCfunc may be declared as const here, IINM. > + lua_assert(NULL != func); Side note: why do you use the reversed condition order here? :) > + if (isluafunc(func)) { > + stream_lfunc(buf, func); > + } else if (isffunc(func)) { > + stream_ffunc(buf, func); > + } else { Minor: I suppose that we should use also `else if` here with else lua_assert(0) unreachable branch below. > + stream_cfunc(buf, func); > + } > +} > + > +static void stream_backtrace_lua(struct sysprof *sp) > +{ > + global_State *g = sp->g; > + struct lj_wbuf *buf = &sp->out; > + cTValue *top_frame = NULL, *frame = NULL, *bot = NULL; > + lua_State *L = NULL; Nit: This initialization looks excess. Just declaration is enough. Same for top_frame, frame and bot. > + > + lua_assert(g != NULL); > + L = gco2th(gcref(g->cur_L)); > + lua_assert(L != NULL); > + > + top_frame = sp->g->top_frame.guesttop.interp_base - 1; Looks like this line expects that interp_base always has Lua function on top. Does it conflict with fast functions and C functions? Should it be `- (1 + LJ_FR2)` instead? Nit: s/sp->g/g/ > + bot = tvref(L->stack) + LJ_FR2; > + /* Traverse frames backwards */ > + for (frame = top_frame; frame > bot; frame = frame_prev(frame)) { > + if (frame_gc(frame) == obj2gco(L)) { Nit: `{` and `}` are excess. > + continue; /* Skip dummy frames. See lj_err_optype_call(). */ > + } > + stream_frame_lua(buf, frame); Should we skip VARAG pseudo-frames as well? > + } > + > + lj_wbuf_addbyte(buf, LJP_FRAME_LUA_LAST); > +} > + Minor: please add comment about return values. I don't get when this function returns the same address was given, and when NULL. And for what NULL stands for. > +static void *stream_frame_host(int frame_no, void *addr) Nit: also, looks like addr can be declared as a constant, can't it? > +{ > + struct sysprof *sp = &sysprof; > + > + if (LJ_UNLIKELY(frame_no <= SYSPROF_HANDLER_STACK_DEPTH)) { > + /* > + ** We don't want the profiler stack to be streamed, as it will > + ** burden the profile with unnecessary information. > + */ > + return addr; > + } else if (LJ_UNLIKELY(sp->opt.mode == LUAM_SYSPROF_LEAF && > + frame_no > SYSPROF_HANDLER_STACK_DEPTH + 1)) { Looks like it should be `frame_no > SYSPROF_HANDLER_STACK_DEPTH, indeed If `SYSPROF_HANDLER_STACK_DEPTH + 1 == frame_no` the first branch is taken. > + return NULL; > + } Nit: I suggest to move comment upper and omit curly brackets. Feel free to ignore. > + > + lj_wbuf_addu64(&sp->out, (uintptr_t)addr); > + return addr; > +} > + > +static void default_backtrace_host(void *(writer)(int frame_no, void *addr)) > +{ > + static void* backtrace_buf[SYSPROF_BACKTRACE_BUF_SIZE] = {}; Typo: s/void* backtrace_buf/void *backtrace_buf/ > + > + struct sysprof *sp = &sysprof; > + const int depth = backtrace(backtrace_buf, > + sp->opt.mode == LUAM_SYSPROF_LEAF > + ? SYSPROF_HANDLER_STACK_DEPTH + 1 > + : SYSPROF_BACKTRACE_BUF_SIZE); > + lua_assert(depth >= SYSPROF_HANDLER_STACK_DEPTH); Nit: strictly saying it is the following: | depth <= sp->opt.mode == LUAM_SYSPROF_LEAF ? | SYSPROF_HANDLER_STACK_DEPTH + 1 : SYSPROF_BACKTRACE_BUF_SIZE So I suggest to add a variable to use in backtrace configuration and the assert. > + > + for (int i = SYSPROF_HANDLER_STACK_DEPTH; i < depth; ++i) { > + if (!writer(i - SYSPROF_HANDLER_STACK_DEPTH + 1, backtrace_buf[i])) { Nit: { } brackets are excess. > + return; > + } > + } > +} > + > +static void stream_backtrace_host(struct sysprof *sp) > +{ > + lua_assert(sp->config.backtracer != NULL); > + sp->config.backtracer(stream_frame_host); > + lj_wbuf_addu64(&sp->out, (uintptr_t)LJP_FRAME_HOST_LAST); > +} > + > +static void stream_trace(struct sysprof *sp) > +{ I suppose we should add the following ifdef here. | #if LJ_HASJIT > + struct lj_wbuf *out = &sp->out; > + Nit: this empty line is excess. > + uint32_t traceno = sp->g->vmstate; > + jit_State *J = G2J(sp->g); > + > + lj_wbuf_addu64(out, traceno); > + lj_wbuf_addu64(out, (uintptr_t)J->prev_pt); > + lj_wbuf_addu64(out, J->prev_line); Don't get this line. J->prev_line is not the line of currently executed trace, IINM. But the line of code that is recorded now. Also, prev_line is updated only by LuaJIT's profiler in 'l' mode. Ditto for `J->prev_pt`. > +} > + > +static void stream_guest(struct sysprof *sp) > +{ > + stream_backtrace_lua(sp); > + stream_backtrace_host(sp); > +} > + > +static void stream_host(struct sysprof *sp) > +{ > + stream_backtrace_host(sp); > +} > + > +typedef void (*event_streamer)(struct sysprof *sp); > + > +static event_streamer event_streamers[] = { > + /* XXX: order is important */ > + stream_host, /* LJ_VMST_INTERP */ > + stream_guest, /* LJ_VMST_LFUNC */ > + stream_guest, /* LJ_VMST_FFUNC */ > + stream_guest, /* LJ_VMST_CFUNC */ > + stream_host, /* LJ_VMST_GC */ > + stream_host, /* LJ_VMST_EXIT */ > + stream_host, /* LJ_VMST_RECORD */ > + stream_host, /* LJ_VMST_OPT */ > + stream_host, /* LJ_VMST_ASM */ > + stream_trace /* LJ_VMST_TRACE */ > +}; I don't understand the following point: Assume I have the following Lua code named : | 1 local tsp = require "testsysprof" | 2 jit.off() jit.flush() | 3 | 4 function lc(nsec) -- Lua call to C | 5 local r = 0 | 6 for i = 1, 1e5 do r = r + i end | 7 return tsp.c(nsec) -- Regular C function does n sleeps to 1 sec | 8 end | 9 | 10 local function lclc(nsec) -- Lua call to C call to Lua call to C | 11 local r = 0 | 12 for i = 1, 1e5 do r = r + i end | 13 return tsp.clc(nsec) -- C call to Lua call to C | 14 end | 15 | 16 local s, r = misc.sysprof.start( {mode = "C", interval = 1, path = "sysprof.bin"}) | 17 assert(s,r) | 18 | 19 for i = 1, 100 do | 20 assert(lclc(3) == true) -- 3 sec sleep | 21 end | 22 | 23 s,r = misc.sysprof.stop() | 24 assert(s,r) I expect to see the following callgraph: Lua->C->Lua->C: +----------------------------+ | CFUNC | +----------------------------+ |....CFUNCS related to VM....| +----------------------------+ | tsysprof.lua:4 | +----------------------------+ |....CFUNCS related to VM....| +----------------------------+ | CFUNC | +----------------------------+ |....CFUNCS related to VM....| +----------------------------+ | tsysprof.lua:10 | +----------------------------+---------------------------+ | tsysprof.lua:0 | MB some other code flow | +----------------------------+---------------------------+ | tsysprof.lua:0 | +--------------------------------------------------------+ |......................CFUNCS till main()................| +--------------------------------------------------------+ >From the code flow above IINM we report separate information about host stack and guest stack. Will they be merged together by sysprof-parser? If not -- then instead excpected callgraph I'll see something like the following: +---------------------------+ | CFUNC | +---------------------------+ |....CFUNCS related to VM...| +---------------------------+ | tsysprof.lua:4 | +----------------------------+---------------------------+ |....CFUNCS related to VM....|....CFUNCS related to VM...| +----------------------------+---------------------------+ | tsysprof.lua:10 | CFUNC | +----------------------------+---------------------------+-------------------------+ | sysprof.lua:0 ttttt |.MB some other code flow.| +--------------------------------------------------------+-------------------------+ | tsysprof.lua:0 | +----------------------------------------------------------------------------------+ |......................CFUNCS till main()..........................................| +----------------------------------------------------------------------------------+ But, if them merged please add the corresponding comment here and in RFC, to avoid confusing. > + > +static void stream_event(struct sysprof *sp, uint32_t vmstate) > +{ > + event_streamer stream = NULL; > + > + /* Check that vmstate fits in 4 bits (see streaming format) */ > + lua_assert((vmstate & ~(uint32_t)((1 << 4) - 1)) == 0); Minor: May be it is better to add corresponding define in the header? > + lj_wbuf_addbyte(&sp->out, (uint8_t)vmstate); > + Nit: the following empty line is excess. Feel free to ignor. > + stream = event_streamers[vmstate]; > + lua_assert(NULL != stream); > + stream(sp); > +} > + > +/* -- Signal handler ------------------------------------------------------ */ > + > +static void sysprof_record_sample(struct sysprof *sp, siginfo_t *info) > +{ > + global_State *g = sp->g; > + uint32_t _vmstate = ~(uint32_t)(g->vmstate); > + uint32_t vmstate = _vmstate < LJ_VMST_TRACE ? _vmstate : LJ_VMST_TRACE; > + Some comments about the assert below are wellcome. > + lua_assert(pthread_self() == sp->thread); >From Programmer's Manual PTHREAD_SELF(3) | SYNOPSIS | #include | pthread_t pthread_self(void); | Compile and link with -pthread. So it looks like we should add corresponding flag in CMake. > + > + /* Caveat: order of counters must match vmstate order in . */ > + ((uint64_t *)&sp->counters)[vmstate]++; > + > + sp->counters.samples++; > + sp->counters.overruns += info->si_overrun; > + > + if (stream_is_needed(sp)) { Minor: `if (!stream_is_needed(sp)) return;` (early return) looks pretier to me. > + stream_event(sp, vmstate); > + if (LJ_UNLIKELY(lj_wbuf_test_flag(&sp->out, STREAM_ERRIO|STREAM_STOP))) { > + sp->saved_errno = lj_wbuf_errno(&sp->out); > + lj_wbuf_terminate(&sp->out); > + sp->state = SPS_HALT; > + } > + } > +} > + > +static void sysprof_signal_handler(int sig, siginfo_t *info, void *ctx) > +{ > + struct sysprof *sp = &sysprof; > + UNUSED(sig); > + UNUSED(ctx); > + > + switch (sp->state) { > + case SPS_PROFILE: > + sysprof_record_sample(sp, info); > + break; > + > + case SPS_IDLE: > + case SPS_HALT: > + /* noop */ > + break; > + > + default: > + lua_assert(0); > + break; > + } > +} > + > +/* -- Internal ------------------------------------------------------------ */ > + > +static int sysprof_validate(struct sysprof *sp, > + const struct luam_sysprof_options *opt) Side note: See my question about default configuration above. > +{ > + switch (sp->state) { > + case SPS_UNCONFIGURED: > + return SYSPROF_ERRUSE; > + > + case SPS_IDLE: { Nit: {} looks unnecessary here. > + if (opt->mode > LUAM_SYSPROF_CALLGRAPH) { As far as it is enum it can be < than LUAM_SYSPROF_DEFAULT. > + return SYSPROF_ERRUSE; > + } else if (opt->mode != LUAM_SYSPROF_DEFAULT && > + (opt->buf == NULL || opt->len == 0 || > + sp->config.writer == NULL || sp->config.on_stop == NULL)) { If they all are set by public API may be it is better to use lua_apicheck() instead? > + return SYSPROF_ERRUSE; > + } else if (opt->interval == 0) { > + return SYSPROF_ERRUSE; > + } > + break; > + } > + > + case SPS_PROFILE: > + case SPS_HALT: > + return SYSPROF_ERRRUN; And the similar check is missed for memprof start (ERRRUN return only if the state is MPS_PROFILE). I suppose we should forbid to start profiling if the profiler is in HALT state -- it means that the profiling with error to report still running. Wait Igor's opinion here. If yes, it should be fixed in memprof in the separate patchset. > + > + default: > + lua_assert(0); Nit: `break;` is missing. >From The C programming language - Second edition (K&R 2): Chapter 3.4 Switch | As a matter of good form, put a break after the last case (the default | here) even though it's logically unnecessary. Some day when another case | gets added at the end, this bit of defensive programming will save you. > + } > + > + return SYSPROF_SUCCESS; > +} > + > +static int sysprof_init(struct sysprof *sp, lua_State *L, > + const struct luam_sysprof_options *opt) > +{ > + int status = sysprof_validate(sp, opt); Nit: status may be declared as const. > + if (SYSPROF_SUCCESS != status) { Nit: {} are excess. > + return status; > + } > + > + /* Copy validated options to sysprof state. */ > + memcpy(&sp->opt, opt, sizeof(sp->opt)); > + > + /* Init general fields. */ > + sp->g = G(L); > + sp->thread = pthread_self(); > + > + /* Reset counters. */ > + memset(&sp->counters, 0, sizeof(sp->counters)); > + > + /* Reset saved errno. */ > + sp->saved_errno = 0; > + > + if (stream_is_needed(sp)) { > + lj_wbuf_init(&sp->out, sp->config.writer, opt->ctx, opt->buf, opt->len); > + } Nit: {} are excess. > + > + return SYSPROF_SUCCESS; > +} > + > +/* -- Public profiling API ------------------------------------------------ */ > + > +int lj_sysprof_configure(const struct luam_sysprof_config *config) > +{ > + struct sysprof *sp = &sysprof; > + lua_assert(config != NULL); > + if (sp->state != SPS_UNCONFIGURED && sp->state != SPS_IDLE) { Nit: {} are excess. > + return SYSPROF_ERRUSE; > + } > + > + memcpy(&sp->config, config, sizeof(*config)); > + > + if (sp->config.backtracer == NULL) { Ditto. > + sp->config.backtracer = default_backtrace_host; > + } > + > + sp->state = SPS_IDLE; > + > + return SYSPROF_SUCCESS; > +} > + > +int lj_sysprof_start(lua_State *L, const struct luam_sysprof_options *opt) > +{ > + struct sysprof *sp = &sysprof; > + > + int status = sysprof_init(sp, L, opt); > + if (SYSPROF_SUCCESS != status) { > + if (NULL != sp->config.on_stop) { I suppose as far as this situation is only possible by the user's mistake during code development it will be better to use lua_assert() or lua_apicheck() here. Thoughts? > + /* > + ** Initialization may fail in case of unconfigured sysprof, > + ** so we cannot guarantee cleaning up resources in this case. > + */ > + sp->config.on_stop(opt->ctx, opt->buf); > + } > + return status; > + } > + > + sp->state = SPS_PROFILE; > + > + if (stream_is_needed(sp)) { > + stream_prologue(sp); > + if (LJ_UNLIKELY(lj_wbuf_test_flag(&sp->out, STREAM_ERRIO|STREAM_STOP))) { > + /* on_stop call may change errno value. */ > + int saved_errno = lj_wbuf_errno(&sp->out); Nit: saved_errno may be declared as const. > + /* Ignore possible errors. mp->out.buf may be NULL here. */ > + sp->config.on_stop(opt->ctx, sp->out.buf); > + lj_wbuf_terminate(&sp->out); > + sp->state = SPS_IDLE; > + errno = saved_errno; > + return SYSPROF_ERRIO; > + } > + } > + > + sp->timer.opt.interval_msec = opt->interval; > + sp->timer.opt.handler = sysprof_signal_handler; > + lj_profile_timer_start(&sp->timer); > + > + return SYSPROF_SUCCESS; > +} > + > +int lj_sysprof_stop(lua_State *L) > +{ > + struct sysprof *sp = &sysprof; > + global_State *g = sp->g; > + struct lj_wbuf *out = &sp->out; > + > + if (SPS_IDLE == sp->state) { > + return SYSPROF_ERRSTOP; Why it is bad to have SPS_IDLE here (why UNCONFIGURED is OK)? We can use PROFILE_ERRUSE as for this error for consistency with memprof. > + } else if (G(L) != g) { > + return SYSPROF_ERRUSE; > + } Nit: {} are excess. > + > + lj_profile_timer_stop(&sp->timer); > + > + if (SPS_HALT == sp->state) { > + errno = sp->saved_errno; > + sp->state = SPS_IDLE; > + /* wbuf was terminated when error occured. */ > + return SYSPROF_ERRIO; > + } > + > + sp->state = SPS_IDLE; > + > + if (stream_is_needed(sp)) { > + int cb_status = 0; Nit: This initialization looks excess. > + > + stream_epilogue(sp); > + lj_wbuf_flush(out); > + > + cb_status = sp->config.on_stop(sp->opt.ctx, out->buf); > + if (LJ_UNLIKELY(lj_wbuf_test_flag(out, STREAM_ERRIO|STREAM_STOP)) || Typo: s/ERRIO|STREAM/ERRIO | STREAM/ > + 0 != cb_status) { Side note: why do you use here 0 != cb_status, but not cb_status != 0? > + errno = lj_wbuf_errno(out); > + lj_wbuf_terminate(out); > + return SYSPROF_ERRIO; > + } > + > + lj_wbuf_terminate(out); > + } > + > + return SYSPROF_SUCCESS; > +} > + > +int lj_sysprof_report(struct luam_sysprof_counters *counters) > +{ > + const struct sysprof *sp = &sysprof; > + if (sp->state != SPS_IDLE) { Nit: {} are excess. > + return SYSPROF_ERRUSE; > + } > + memcpy(counters, &sp->counters, sizeof(sp->counters)); > + return SYSPROF_SUCCESS; > +} > + > +#else /* LJ_HASSYSPROF */ > +#endif /* LJ_HASSYSPROF */ > + Nit: this empty line is excess. > diff --git a/src/lj_sysprof.h b/src/lj_sysprof.h > new file mode 100644 > index 00000000..515bc08c > --- /dev/null > +++ b/src/lj_sysprof.h > @@ -0,0 +1,94 @@ > +/* > +** Sysprof - platform and Lua profiler Typo: s/profiler/profiler/. > +*/ > + > +/* > +** XXX: Platform profiler is not thread safe. Please, don't try to > +** use it inside several VM, you can profile only one at a time. > +*/ > + > +/* > +** XXX: Platform profiler uses the same signal backend as lj_profile. Please, > +** don't use both at the same time. So, may be it is better to use another one signal for sysprof profilling? > +*/ > + > +#ifndef _LJ_SYSPROF_H > +#define _LJ_SYSPROF_H > + > +#include "lj_obj.h" > +#include "lmisclib.h" > + > +#define LJP_FORMAT_VERSION 0x1 > + > +/* > +** Event stream format: > +*/ > + > +#define LJP_FRAME_LFUNC ((uint8_t)1) > +#define LJP_FRAME_CFUNC ((uint8_t)2) > +#define LJP_FRAME_FFUNC ((uint8_t)3) > +#define LJP_FRAME_LUA_LAST 0x80 > +#define LJP_FRAME_HOST_LAST NULL > + > +#define LJP_EPILOGUE_BYTE 0x80 > + > +int lj_sysprof_configure(const struct luam_sysprof_config *config); > + > +int lj_sysprof_start(lua_State *L, const struct luam_sysprof_options *opt); > + > +int lj_sysprof_stop(lua_State *L); > + > +int lj_sysprof_report(struct luam_sysprof_counters *counters); > + > +#endif > + Nit: this empty line is excess. > diff --git a/src/lmisclib.h b/src/lmisclib.h > index 0c07707e..3545ff47 100644 > --- a/src/lmisclib.h > +++ b/src/lmisclib.h > @@ -60,6 +60,99 @@ struct luam_Metrics { > > LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics); > > +/* --- Sysprof - platform and lua profiler -------------------------------- */ > + > +/* Profiler configurations */ Typo: s/configurations/configurations./ > +struct luam_sysprof_config { Minor: luam_Sysprof_Config is more consistent. Perfect variant is the luam_, but I don't know how to name it. > + /* > + ** Writer function for profile events. > + ** Should return amount of written bytes on success or zero in case of error. > + ** Setting *data to NULL means end of profiling. > + ** For details see . Please add also, that this writer *must be* asinc-safe function (see also man 7 signal-safety). > + */ > + size_t (*writer)(const void **data, size_t len, void *ctx); > + /* > + ** Callback on profiler stopping. Required for correctly cleaning > + ** at VM finalization when profiler is still running. > + ** Returns zero on success. > + */ > + int (*on_stop)(void *ctx, uint8_t *buf); > + /* > + ** Backtracing function for the host stack. Should call `frame_writer` on > + ** each frame in the stack in the order from the stack top to the stack > + ** bottom. The `frame_writer` function is implemented inside the sysprof > + ** and will be passed to the `backtracer` function. If `frame_writer` returns > + ** NULL, backtracing should be stopped. If `frame_writer` returns not NULL, > + ** the backtracing should be continued if there are frames left. > + */ > + void (*backtracer)(void *(*frame_writer)(int frame_no, void *addr)); > +}; > + > +enum luam_sysprof_mode { Nit: I don't think that we need to provide the enum here, just macro defines is enough. Feel free to ignore, if not: Minor: Ditto for naming as for luam_sysprof_config. > + /* > + ** DEFAULT mode collects only data for luam_sysprof_counters, which is stored > + ** in memory and can be collected with luaM_sysprof_report after profiler > + ** stops. > + */ > + LUAM_SYSPROF_DEFAULT, > + /* > + ** LEAF mode = DEFAULT + streams samples with only top frames of host and > + ** guests stacks in format described in > + */ > + LUAM_SYSPROF_LEAF, > + /* > + ** CALLGRAPH mode = DEFAULT + streams samples with full callchains of host > + ** and guest stacks in format described in > + */ > + LUAM_SYSPROF_CALLGRAPH > +}; > + > +struct luam_sysprof_counters { Minor: Ditto for naming as for luam_sysprof_config. > + uint64_t vmst_interp; > + uint64_t vmst_lfunc; > + uint64_t vmst_ffunc; > + uint64_t vmst_cfunc; > + uint64_t vmst_gc; > + uint64_t vmst_exit; > + uint64_t vmst_record; > + uint64_t vmst_opt; > + uint64_t vmst_asm; > + uint64_t vmst_trace; > + /* XXX: order of vmst counters is important */ Minor: please add clarification that it must be the same as the order of the vmstates. Or even better: add LUAJIT_STATIC_ASSERT. Typo: s/important/important./ > + uint64_t samples; > + uint64_t overruns; > +}; > + > +/* Profiler options */ Typo: s/options/options./ > +struct luam_sysprof_options { Minor: Ditto for naming as for luam_sysprof_config. > + /* Profiling mode */ Typo: s/mode/mode./ > + enum luam_sysprof_mode mode; > + /* Sampling interval in msec */ Typo: s/msec/msec./ > + uint64_t interval; > + /* Custom buffer to write data. */ > + uint8_t *buf; > + /* The buffer's size. */ > + size_t len; > + /* Context for the profile writer and final callback. */ > + void *ctx; > +}; > + > +#define SYSPROF_SUCCESS (0) > +#define SYSPROF_ERRUSE (1) > +#define SYSPROF_ERRRUN (2) > +#define SYSPROF_ERRSTOP (3) > +#define SYSPROF_ERRIO (4) There is no need in () here. Also why we can't move PROFILER status from here instead? > + > +LUAMISC_API int luaM_sysprof_configure(const struct luam_sysprof_config *config); > + > +LUAMISC_API int luaM_sysprof_start(lua_State *L, > + const struct luam_sysprof_options *opt); > + > +LUAMISC_API int luaM_sysprof_stop(lua_State *L); > + > +LUAMISC_API int luaM_sysprof_report(struct luam_sysprof_counters *counters); > + > + > #define LUAM_MISCLIBNAME "misc" > LUALIB_API int luaopen_misc(lua_State *L); > > diff --git a/test/tarantool-tests/CMakeLists.txt b/test/tarantool-tests/CMakeLists.txt > index a872fa5e..cf47cc67 100644 > --- a/test/tarantool-tests/CMakeLists.txt > +++ b/test/tarantool-tests/CMakeLists.txt > @@ -62,6 +62,7 @@ add_subdirectory(gh-6189-cur_L) > diff --git a/test/tarantool-tests/misclib-sysprof-capi.test.lua b/test/tarantool-tests/misclib-sysprof-capi.test.lua > new file mode 100644 > index 00000000..d468572d > --- /dev/null > +++ b/test/tarantool-tests/misclib-sysprof-capi.test.lua > @@ -0,0 +1,53 @@ > +-- Sysprof is implemented for x86 and x64 architectures only. > +require("utils").skipcond( > + jit.arch ~= "x86" and jit.arch ~= "x64", > + jit.arch.." architecture is NIY for memprof" Typo: s/memprof/sysprof/ > +) > + > +local testsysprof = require("testsysprof") > + > +local tap = require("tap") > + > +local test = tap.test("clib-misc-sysprof") > +test:plan(4) > + > +test:ok(testsysprof.base()) > +test:ok(testsysprof.validation()) > + > +local function lua_payload(n) > + if n <= 1 then > + return n > + end > + return lua_payload(n - 1) + lua_payload(n - 2) > +end > + > +local function payload() > + local n_iterations = 500000 > + > + local co = coroutine.create(function () > + for i = 1, n_iterations do > + if i % 2 == 0 then > + testsysprof.c_payload(10) > + else > + lua_payload(10) > + end > + coroutine.yield() > + end > + end) > + > + for _ = 1, n_iterations do > + coroutine.resume(co) > + end > +end > + > +local jit = require('jit') > + > +jit.off() > + > +test:ok(testsysprof.profile_func(payload)) > + > +jit.on() > +jit.flush() > + > +test:ok(testsysprof.profile_func(payload)) > + Minor: please, add the following line to unify test with the others: | os.exit(test:check() and 0 or 1) > diff --git a/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt b/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt > new file mode 100644 > index 00000000..d9fb1a1a > --- /dev/null > +++ b/test/tarantool-tests/misclib-sysprof-capi/CMakeLists.txt > @@ -0,0 +1 @@ > +BuildTestCLib(testsysprof testsysprof.c) > diff --git a/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c b/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c > new file mode 100644 > index 00000000..46970a72 > --- /dev/null > +++ b/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c > @@ -0,0 +1,269 @@ I see the following warnings during compilation: | /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c: In function 'on_stop_cb_default': | /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:74:51: warning: unused parameter 'buf' [-Wunused-parameter] | 74 | static int on_stop_cb_default(void *opt, uint8_t *buf) | | ~~~~~~~~~^~~ | /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c: In function 'c_payload': | /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:115:1: warning: no return statement in function returning non-void [-Wreturn-type] | 115 | } | | ^ | /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c: In function 'base': | /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:126:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] | 126 | struct luam_sysprof_options opt = {}; | | ^~~~~~ | /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:133:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] | 133 | struct luam_sysprof_counters cnt = {}; | | ^~~~~~ Also test fails with the following coredump in backtrace: | #0 0x000055c326294d8d in lj_BC_RET () at buildvm_x86.dasc:811 | #1 0x000055c326296138 in lj_ff_coroutine_resume () at buildvm_x86.dasc:1688 | #2 0x000055c3262271da in lua_pcall (L=0x4197d378, nargs=0, nresults=0, errfunc=0) at /home/burii/reviews/luajit/sysprof/src/lj_api.c:1187 | #3 0x00007f5639a8c9b0 in profile_func (L=0x4197d378) at /home/burii/reviews/luajit/sysprof/test/tarantool-tests/misclib-sysprof-capi/testsysprof.c:231 | #4 0x000055c326295313 in lj_BC_FUNCC () at buildvm_x86.dasc:811 | #5 0x000055c3262271da in lua_pcall (L=0x4197d378, nargs=0, nresults=-1, errfunc=2) at /home/burii/reviews/luajit/sysprof/src/lj_api.c:1187 | #6 0x000055c326220a9e in docall (L=0x4197d378, narg=0, clear=0) at /home/burii/reviews/luajit/sysprof/src/luajit.c:121 | #7 0x000055c32622146a in handle_script (L=0x4197d378, argx=0x7fffcbbb3010) at /home/burii/reviews/luajit/sysprof/src/luajit.c:291 | #8 0x000055c3262220c3 in pmain (L=0x4197d378) at /home/burii/reviews/luajit/sysprof/src/luajit.c:551 | #9 0x000055c326295313 in lj_BC_FUNCC () at buildvm_x86.dasc:811 | #10 0x000055c3262274f9 in lua_cpcall (L=0x4197d378, func=0x55c326221ee2 , ud=0x0) at /home/burii/reviews/luajit/sysprof/src/lj_api.c:1221 | #11 0x000055c3262221d5 in main (argc=4, argv=0x7fffcbbb2ff8) at /home/burii/reviews/luajit/sysprof/src/luajit.c:580 > +#include > +#include > +#include > + > +#include > + > +#include > +#include > + > +#undef NDEBUG > +#include > + > + > +/* --- utils -------------------------------------------------------------- */ > + > +#define SYSPROF_INTERVAL_DEFAULT 11 > + > +/* > +** Yep, 8Mb. Tuned in order not to bother the platform with too often flushes. > +*/ > +#define STREAM_BUFFER_SIZE (8 * 1024 * 1024) Minor: I suppose that we can take a smaller size for test. Feel free to ignore. > + > +/* Structure given as ctx to memprof writer and on_stop callback. */ > +struct sysprof_ctx { Minor: For our test files we use tabs (as in Tarantool) instead "half-quoter tabs" (yes I know, that it is inconsistent). Here and below. > + /* Output file stream for data. */ > + FILE *stream; > + /* Buffer for data. */ > + uint8_t buf[STREAM_BUFFER_SIZE]; > +}; > + > +/* > +** Default buffer writer function. > +** Just call fwrite to the corresponding FILE. This writer function is not async-signal-safe. We should use just write here instead. > +*/ > +static size_t buffer_writer_default(const void **buf_addr, size_t len, > + void *opt) > +{ > + struct sysprof_ctx *ctx = opt; > + FILE *stream = ctx->stream; > + const void * const buf_start = *buf_addr; > + const void *data = *buf_addr; > + size_t write_total = 0; > + > + assert(len <= STREAM_BUFFER_SIZE); > + > + for (;;) { > + const size_t written = fwrite(data, 1, len - write_total, stream); > + > + if (written == 0) { > + /* Re-tries write in case of EINTR. */ > + if (errno != EINTR) { > + /* Will be freed as whole chunk later. */ > + *buf_addr = NULL; > + return write_total; > + } > + errno = 0; > + continue; > + } > + > + write_total += written; > + assert(write_total <= len); > + > + if (write_total == len) > + break; > + > + data = (uint8_t *)data + (ptrdiff_t)written; > + } > + > + *buf_addr = buf_start; > + return write_total; > +} > + > +/* Default on stop callback. Just close the corresponding stream. */ > +static int on_stop_cb_default(void *opt, uint8_t *buf) > +{ > + struct sysprof_ctx *ctx = opt; > + FILE *stream = ctx->stream; > + free(ctx); > + return fclose(stream); > +} > + > +static int stream_init(struct luam_sysprof_options *opt) > +{ > + struct sysprof_ctx *ctx = calloc(1, sizeof(struct sysprof_ctx)); > + if (NULL == ctx) { Nit: {} are excess. > + return SYSPROF_ERRIO; > + } > + > + ctx->stream = fopen("/dev/null", "wb"); > + if (NULL == ctx->stream) { > + free(ctx); > + return SYSPROF_ERRIO; > + } > + > + opt->ctx = ctx; > + opt->buf = ctx->buf; > + opt->len = STREAM_BUFFER_SIZE; > + > + return SYSPROF_SUCCESS; > +} > + > +/* --- Payload ------------------------------------------------------------ */ > + > +static double fib(double n) > +{ > + if (n <= 1) { > + return n; > + } Nit: {} are excess. > + return fib(n - 1) + fib(n - 2); > +} > + > +static int c_payload(lua_State *L) > +{ > + fib(luaL_checknumber(L, 1)); How many values does this function return? > +} > + > +/* --- sysprof C API tests ------------------------------------------------ */ > + > +static int base(lua_State *L) > +{ > + struct luam_sysprof_config config = {}; > + (void)config.writer; > + (void)config.on_stop; > + (void)config.backtracer; > + > + struct luam_sysprof_options opt = {}; > + (void)opt.interval; > + (void)opt.mode; > + (void)opt.ctx; > + (void)opt.buf; > + (void)opt.len; > + > + struct luam_sysprof_counters cnt = {}; > + luaM_sysprof_report(&cnt); > + > + (void)cnt.samples; > + (void)cnt.overruns; > + (void)cnt.vmst_interp; > + (void)cnt.vmst_lfunc; > + (void)cnt.vmst_ffunc; > + (void)cnt.vmst_cfunc; > + (void)cnt.vmst_gc; > + (void)cnt.vmst_exit; > + (void)cnt.vmst_record; > + (void)cnt.vmst_opt; > + (void)cnt.vmst_asm; > + (void)cnt.vmst_trace; > + > + lua_pushboolean(L, 1); > + return 1; > +} > + > +static int validation(lua_State *L) > +{ > + struct luam_sysprof_config config = {}; > + struct luam_sysprof_options opt = {}; > + int status = SYSPROF_SUCCESS; > + > + status = luaM_sysprof_configure(&config); > + assert(SYSPROF_SUCCESS == status); > + > + /* Unknown mode */ Typo: s/mode/mode./ > + opt.mode = 0x40; > + status = luaM_sysprof_start(L, &opt); > + assert(SYSPROF_ERRUSE == status); > + > + /* Buffer not configured */ Typo: s/configured/configured./ > + opt.mode = LUAM_SYSPROF_CALLGRAPH; > + opt.buf = NULL; > + status = luaM_sysprof_start(L, &opt); > + assert(SYSPROF_ERRUSE == status); > + > + /* Bad interval */ Typo: s/interval/interval./ > + opt.mode = LUAM_SYSPROF_DEFAULT; > + opt.interval = 0; > + status = luaM_sysprof_start(L, &opt); > + assert(SYSPROF_ERRUSE == status); > + > + /* Check if profiling started */ Typo: s/started/started./ > + opt.mode = LUAM_SYSPROF_DEFAULT; > + opt.interval = SYSPROF_INTERVAL_DEFAULT; > + status = luaM_sysprof_start(L, &opt); > + assert(SYSPROF_SUCCESS == status); > + > + /* Already running */ Typo: s/running/running./ > + status = luaM_sysprof_start(L, &opt); > + assert(SYSPROF_ERRRUN == status); > + > + /* Profiler stopping */ Typo: s/stopping/stopping./ > + status = luaM_sysprof_stop(L); > + assert(SYSPROF_SUCCESS == status); > + > + /* Stopping profiler which is not running */ Typo: s/running/running./ > + status = luaM_sysprof_stop(L); > + assert(SYSPROF_ERRSTOP == status); Suggest to use ERRRUN here, see comment about consistensy with memprof above. > + > + lua_pushboolean(L, 1); > + return 1; > +} > + > +static int profile_func(lua_State *L) > +{ > + struct luam_sysprof_config config = {}; > + struct luam_sysprof_options opt = {}; > + struct luam_sysprof_counters cnt = {}; > + int status = SYSPROF_ERRUSE; > + > + int n = lua_gettop(L); > + if (n != 1 || !lua_isfunction(L, 1)) { Nit: {} are excess. > + luaL_error(L, "incorrect argument: 1 function is required"); > + } > + > + /* > + ** Since all the other modes functionality is the > + ** subset of CALLGRAPH mode, run this mode to test > + ** the profiler's behavior. > + */ > + opt.mode = LUAM_SYSPROF_CALLGRAPH; > + opt.interval = SYSPROF_INTERVAL_DEFAULT; > + stream_init(&opt); > + > + config.on_stop = on_stop_cb_default; > + config.writer = buffer_writer_default; > + status = luaM_sysprof_configure(&config); > + assert(SYSPROF_SUCCESS == status); > + > + status = luaM_sysprof_start(L, &opt); > + assert(SYSPROF_SUCCESS == status); > + > + /* Run payload. */ > + if (lua_pcall(L, 0, 0, 0) != 0) { Minor: it is better to use != LUA_OK Nit: {} are excess. > + luaL_error(L, "error running payload: %s", lua_tostring(L, -1)); > + } > + > + status = luaM_sysprof_stop(L); > + assert(SYSPROF_SUCCESS == status); > + > + status = luaM_sysprof_report(&cnt); > + assert(SYSPROF_SUCCESS == status); > + > + assert(cnt.samples > 1); > + assert(cnt.samples == cnt.vmst_asm + > + cnt.vmst_cfunc + > + cnt.vmst_exit + > + cnt.vmst_ffunc + > + cnt.vmst_gc + > + cnt.vmst_interp + > + cnt.vmst_lfunc + > + cnt.vmst_opt + > + cnt.vmst_record + > + cnt.vmst_trace); > + > + lua_pushboolean(L, 1); > + return 1; > +} > + > +static const struct luaL_Reg testsysprof[] = { > + {"c_payload", c_payload}, > + {"base", base}, > + {"validation", validation}, > + {"profile_func", profile_func}, Nit: Please sort entries alphabetically. > + {NULL, NULL} > +}; > + > +LUA_API int luaopen_testsysprof(lua_State *L) > +{ > + luaL_register(L, "testsysprof", testsysprof); > + return 1; > +} > -- > 2.33.0 > -- Best regards, Sergey Kaplun