From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 15 Jan 2021 16:14:24 +0300
To: Sergey Kaplun
Message-ID: <20210115131424.GA5460@tarantool.org>
References: <20201225113431.9538-1-skaplun@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20201225113431.9538-1-skaplun@tarantool.org>
User-Agent: Mutt/1.10.1 (2018-07-13)
Subject: Re: [Tarantool-discussions] [RFC luajit v3] rfc: describe a LuaJIT memory profiler
List-Id: Tarantool development process
From: Igor Munkin via Tarantool-discussions
Reply-To: Igor Munkin
Cc:
tarantool-discussions@dev.tarantool.org
Errors-To: tarantool-discussions-bounces@dev.tarantool.org
Sender: "Tarantool-discussions"

Sergey,

Thanks for the changes. There is a bit of nitpicking below and I believe
we'll push the next version of the doc to the trunk.

On 25.12.20, Sergey Kaplun wrote:
> Part of #5442
> ---
>
> RFC on branch: https://github.com/tarantool/tarantool/blob/skaplun/gh-5442-luajit-memory-profiler/doc/rfc/5442-luajit-memory-profiler.md
>
> Changes in v3:
> * More comments in example.
> * More verbose benchmark information.
> * Grammar and spelling fixes.
>
> Changes in v2:
> * Removed C API, Tarantool integration and description of additional
>   features -- they will be added in another RFC if necessary.
> * Removed checking profile is running from the public API.
> * Added benchmarks and more meaningful example.
> * Grammar fixes.
>
>  doc/rfc/5442-luajit-memory-profiler.md | 314 +++++++++++++++++++++++++
>  1 file changed, 314 insertions(+)
>  create mode 100644 doc/rfc/5442-luajit-memory-profiler.md
>
> diff --git a/doc/rfc/5442-luajit-memory-profiler.md b/doc/rfc/5442-luajit-memory-profiler.md
> new file mode 100644
> index 000000000..85a61462a
> --- /dev/null
> +++ b/doc/rfc/5442-luajit-memory-profiler.md
> @@ -0,0 +1,314 @@
> +### Prerequisites
> +
> +This section describes additional changes in LuaJIT required for the feature
> +implementation. This version of LuaJIT memory profiler does not support verbose
> +reporting allocations from traces. All allocation from traces are reported as

Typo: s/reporting allocations from/reporting for allocations made on/.

> +internal. But trace code semantics should be totally the same as for the Lua
> +interpreter (excluding sink optimizations). Also all deallocations reported as

Typo: s/deallocations reported/deallocations are reported/.

> +internal too.
> +
> +There are two different representations of functions in LuaJIT: the function's
> +prototype (`GCproto`) and the function object so called closure (`GCfunc`).
> +The closures are represented as `GCfuncL` and `GCfuncC` for Lua and C closures
> +correspondingly. Also LuaJIT has a special function's type aka Fast Function.

Typo: s/correspondingly/respectively/.

> +It is used for LuaJIT builtins.

It's better not to split this sentence. Consider the rewording:
| Besides, LuaJIT has a special function type a.k.a. Fast Function that
| is used for LuaJIT builtins.

> +
> +Usually developers are not interested in information about allocations inside
> +builtins. So if fast function was called from a Lua function all
> +allocations are attributed to this Lua function. Otherwise attribute this event
> +to a C function.

I propose the following rewording:
| Lua developers can do nothing with allocations made inside the
| builtins except reduce their usage. So if a fast function is called
| from a Lua function, all allocations made in its scope are attributed
| to this Lua function (i.e. the builtin caller). Otherwise this event
| is attributed to a C function.

> +
> +If one run the chunk above the profiler reports approximately the following

Typo: s/run/runs/.

> +(see legend [here](#reading-and-displaying-saved-data)):
> +So we need to know a type of function being executed by the virtual machine
> +(VM). Currently VM state identifies C function execution only, so Fast and Lua
> +functions states will be added.

Typo: s/will be/are/.

> +
> +To determine currently allocating coroutine (that may not be equal to currently
> +executed one) a new field called `mem_L` is added to `global_State` structure
> +to keep the coroutine address. This field is set at each reallocation to

Typo: s/at each reallocation to/on each reallocation to the/.

> +corresponding `L` with which it was called.

Typo: s/it was/it is/.

> +
> +When the profiling is stopped the `fclose()` is called. If it is impossible to

Typo: s/the `fclose()`/`fclose()`/.

> +open a file for writing or profiler fails to start, returns `nil` on failure

Typo: s/returns `nil`/`nil` is returned/.

> +(plus an error message as a second result and a system-dependent error code as
> +a third result). Otherwise returns some true value.

It would be nice to mention that the function contract is similar to
other standard io.* interfaces.

I glanced at the source code: it's not "some" true value; it is exactly
the *true* value.

> +
> +Memory profiler is expected to be thread safe, so it has a corresponding
> +lock/unlock at internal mutex whenever you call corresponding memprof
> +functions. If you want to build LuaJIT without thread safety use
> +`-DLUAJIT_DISABLE_THREAD_SAFE`.

This is not implemented in the scope of the MVP, so drop this part.

> +
> +### Reading and displaying saved data
> +
> +Binary data can be read by `lj-parse-memprof` utility. It parses the binary

Typo: s/lj-parse-memprof/luajit-parse-memprof/.

> +format provided by memory profiler and render it on human-readable format.

Typo: s/it on/it to/.

> +
> +This table shows performance deviation in relation to REFerence value (before
> +commit) with stopped and running profiler. The table shows the average value
> +for 11 runs. The first field of the column indicates the change in the average
> +time in seconds (less is better). The second field is the standard deviation
> +for the found difference.
> +
> +```
> +      Name      | REF  | AFTER, memprof off | AFTER, memprof on
> +----------------+------+--------------------+------------------
> +array3d         | 0.21 | +0.00 (0.01)       | +0.00 (0.01)
> +binary-trees    | 3.25 | -0.01 (0.06)       | +0.53 (0.10)
> +chameneos       | 2.97 | +0.14 (0.04)       | +0.13 (0.06)
> +coroutine-ring  | 1.00 | +0.01 (0.04)       | +0.01 (0.04)
> +euler14-bit     | 1.03 | +0.01 (0.02)       | +0.00 (0.02)
> +fannkuch        | 6.81 | -0.21 (0.06)       | -0.20 (0.06)
> +fasta           | 8.20 | -0.07 (0.05)       | -0.08 (0.03)

Side note: Still curious how this can happen.
A negative difference looks OK when it is within its deviation. But this
is sorta magic.

> +life            | 0.46 | +0.00 (0.01)       | +0.35 (0.01)
> +mandelbrot      | 2.65 | +0.00 (0.01)       | +0.01 (0.01)
> +mandelbrot-bit  | 1.97 | +0.00 (0.01)       | +0.01 (0.02)
> +md5             | 1.58 | -0.01 (0.04)       | -0.04 (0.04)
> +nbody           | 1.34 | +0.00 (0.01)       | -0.02 (0.01)
> +nsieve          | 2.07 | -0.03 (0.03)       | -0.01 (0.04)
> +nsieve-bit      | 1.50 | -0.02 (0.04)       | +0.00 (0.04)
> +nsieve-bit-fp   | 4.44 | -0.03 (0.07)       | -0.01 (0.07)
> +partialsums     | 0.54 | +0.00 (0.01)       | +0.00 (0.01)
> +pidigits-nogmp  | 3.47 | -0.01 (0.02)       | -0.10 (0.02)
> +ray             | 1.62 | -0.02 (0.03)       | +0.00 (0.02)
> +recursive-ack   | 0.20 | +0.00 (0.01)       | +0.00 (0.01)
> +recursive-fib   | 1.63 | +0.00 (0.01)       | +0.01 (0.02)
> +scimark-fft     | 5.72 | +0.06 (0.09)       | -0.01 (0.10)
> +scimark-lu      | 3.47 | +0.02 (0.27)       | -0.03 (0.26)
> +scimark-sor     | 2.34 | +0.00 (0.01)       | -0.01 (0.01)
> +scimark-sparse  | 4.95 | -0.02 (0.04)       | -0.02 (0.04)
> +series          | 0.95 | +0.00 (0.02)       | +0.00 (0.01)
> +spectral-norm   | 0.96 | +0.00 (0.02)       | -0.01 (0.02)
> +```
> --
> 2.28.0
>

--
Best regards,
IM