From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp33.i.mail.ru (smtp33.i.mail.ru [94.100.177.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 3CF8D4765E0 for ; Wed, 30 Dec 2020 12:53:57 +0300 (MSK) From: Sergey Ostanevich Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_174CCDC7-222B-49E0-B605-5A27D2904B80" Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\)) Date: Wed, 30 Dec 2020 12:53:55 +0300 In-Reply-To: <20201230093216.GP14702@root> References: <061599ac978b162d42ce80ac65106ba67c6d5fcb.1609278043.git.skaplun@tarantool.org> <20201230090630.GW5396@tarantool.org> <20201230093216.GP14702@root> Subject: Re: [Tarantool-patches] [PATCH luajit 2/3] core: fix resources leak in memory profiler List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sergey Kaplun Cc: tarantool-patches@dev.tarantool.org --Apple-Mail=_174CCDC7-222B-49E0-B605-5A27D2904B80 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii LGTM. Sergos > On 30 Dec 2020, at 12:32, Sergey Kaplun wrote: >=20 > Igor, >=20 > Thanks for the review! >=20 > I've updated commit message to the following as you purposed: >=20 > | core: fix resources leak in memory profiler > | > | When memory profiler fails to start with PROFILE_ERRRUN status both > | stream and ctx are not released. At the same time when memory = profiler > | fails to start with the PROFILE_ERRIO status both stream and ctx are > | released twice. Both cases occur due to invalid return status = checking. > | > | To fix the leakage on_stop callback is called manually inside > | the profiler when error on start occurs. Checks in > | misc.memprof.start() are omitted. > | > | Follows up tarantool/tarantool#5442 >=20 >=20 > Branch is force-pushed. >=20 > On 30.12.20, Igor Munkin wrote: >> Sergey, >>=20 >> Thanks for the patch! LGTM except the wording in commit message >> (consider the comments below). >>=20 >> Side note: I want to notice that with on_start callback things would = be >> clearer. Let's return to this again later. >>=20 >> On 30.12.20, Sergey Kaplun wrote: >>> When the profiler is failing to start with error different from >>=20 >> Typo: s/is failing/fails/. >> Typo: s/with error/with the error/. >>=20 >>> PROFILE_ERRIO neither a file stream is closed nor ctx is freed >>> in case of incorrect return status checking. >>=20 >> Strictly saying there are two problems: >> * Possible leakage for PROFILE_ERRRUN >> * Double free for PROFILE_ERRIO >>=20 >> So I propose the following wording: >>=20 >> | When memory profiler fails to start with PROFILE_ERRRUN status both >> | stream and ctx are not released. At the same time when memory = profiler >> | fails to start with the PROFILE_ERRIO status both stream and ctx = are >> | released twice. Both cases occur due to invalid return status = checking. >>=20 >>>=20 >>> To avoid this behaviour on_stop callback is called manually inside >>=20 >> Minor: s/To avoid this behaviour/To fix the leakage/. >>=20 >>> the profiler when error on start is occurring. Checks in >>=20 >> Typo: s/is occuring/occurs/. >>=20 >>> misc.memprof.start() are omitted. >>>=20 >>> Follows up tarantool/tarantool#5442 >>> --- >>>=20 >>> * How patch was checked: >>> Before patch you can occur the error like: >>> | $ src/luajit -e ' >>> | local f, msg, errno =3D = misc.memprof.start("/tmp/tmp_memprofile.bin") >>> | misc.memprof.start("/tmp/tmp_memprofile.bin") print(f,msg,errno) >>> | ' >>> | true nil nil >>> | luajit: lj_state.c:178: close_state: Assertion `g->gc.total =3D=3D = sizeof(GG_State)' failed. >>> This patch fixes it. >>>=20 >>> * Why this assertion is not failed in tests (we have the test with = same >>> functionality)? >>> This assertion failed inside close_state. Tarantool in some reason >>> doesn't call lua_close on stop. It's weird to me. I'll try to find = an >>> explanation and will create a ticket. >>>=20 >>> * Why I don't create a test case. >>> The best idea is to do something like this and waiting for OOM: >>> | for _ =3D 1, 10000 do >>> | misc.memprof.start("/tmp/tmp_memprofile.bin") >>> | end >>> But it's disgusting, so as I've discussed with Igor offline test = case >>> will be ommited. >>=20 >> Side note: The test aims to hit LUA_ERRMEM and need to be run for a = long >> time. The exact reason we didn't face this failure is the omitted >> in Tarantool. >=20 > Yep, I've found this pretty comment inside >: >=20 > | #if 0 > | /* > | * This doesn't work reliably since things > | * are too interconnected. > | */ > | tarantool_lua_free(); > | session_free(); > | user_cache_free(); > | fiber_free(); > | memory_free(); > | random_free(); > | #endif >=20 >>=20 >>>=20 >>> src/lib_misc.c | 4 ---- >>> src/lj_memprof.c | 8 ++++++-- >>> 2 files changed, 6 insertions(+), 6 deletions(-) >>>=20 >>=20 >> >>=20 >>>=20 >>> --=20 >>> 2.28.0 >>>=20 >>=20 >> --=20 >> Best regards, >> IM >=20 > --=20 > Best regards, > Sergey Kaplun --Apple-Mail=_174CCDC7-222B-49E0-B605-5A27D2904B80 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=us-ascii LGTM.

Sergos

On 30 Dec 2020, at 12:32, = Sergey Kaplun <skaplun@tarantool.org> wrote:

Igor,
Thanks for = the review!

I've updated = commit message to the following as you purposed:

| core: fix = resources leak in memory profiler
|
| When memory = profiler fails to start with PROFILE_ERRRUN status both
| stream and = ctx are not released. At the same time when memory profiler
| fails to = start with the PROFILE_ERRIO status both stream and ctx are
| released = twice. Both cases occur due to invalid return status checking.
|
| To fix the = leakage on_stop callback is called manually inside
| the = profiler when error on start occurs. Checks in
| = misc.memprof.start() are omitted.
|
| Follows up = tarantool/tarantool#5442


Branch is force-pushed.

On 30.12.20, Igor Munkin wrote:
Sergey,

Thanks for the patch! LGTM except the wording in commit = message
(consider the comments below).

Side note: I want to notice that with on_start callback = things would be
clearer. Let's return to this again = later.

On 30.12.20, Sergey Kaplun wrote:
When the profiler is = failing to start with error different from

Typo: s/is failing/fails/.
Typo: s/with = error/with the error/.

PROFILE_ERRIO neither a file stream is closed = nor ctx is freed
in case of incorrect return status = checking.

Strictly saying = there are two problems:
* Possible leakage for = PROFILE_ERRRUN
* Double free for PROFILE_ERRIO

So I propose the following wording:

| When memory profiler fails to start with = PROFILE_ERRRUN status both
| stream and ctx are not = released. At the same time when memory profiler
| fails to = start with the PROFILE_ERRIO status both stream and ctx are
| released twice. Both cases occur due to invalid return = status checking.


To avoid this behaviour on_stop callback is = called manually inside

Minor: = s/To avoid this behaviour/To fix the leakage/.

the profiler when error = on start is occurring. Checks in

Typo: s/is occuring/occurs/.

misc.memprof.start() are = omitted.

Follows up = tarantool/tarantool#5442
---

* = How patch was checked:
Before patch you can occur the = error like:
| $ src/luajit -e '
| local f, = msg, errno =3D misc.memprof.start("/tmp/tmp_memprofile.bin")
| misc.memprof.start("/tmp/tmp_memprofile.bin") = print(f,msg,errno)
| '
| true =    nil     nil
| = luajit: lj_state.c:178: close_state: Assertion `g->gc.total =3D=3D = sizeof(GG_State)' failed.
This patch fixes it.

* Why this assertion is not failed in tests = (we have the test with same
functionality)?
This assertion failed inside close_state. Tarantool in some = reason
doesn't call lua_close on stop. It's weird to me. = I'll try to find an
explanation and will create a = ticket.

* Why I don't create a test = case.
The best idea is to do something like this and = waiting for OOM:
| for _ =3D 1, 10000 do
| =   misc.memprof.start("/tmp/tmp_memprofile.bin")
| = end
But it's disgusting, so as I've discussed with Igor = offline test case
will be ommited.

Side note: The test aims to hit = LUA_ERRMEM and need to be run for a long
time. The exact = reason we didn't face this failure is the omitted
<lua_close> in Tarantool.

Yep, I've = found this pretty comment inside <src/main.cc>:
| #if = 0
|  /*
|   * This doesn't work = reliably since things
|   * are too = interconnected.
|   */
|  tarantool_lua_free();
|  session_free();
|  user_cache_free();
|  fiber_free();
|  memory_free();
|  random_free();
| #endif



src/lib_misc.c   | 4 = ----
src/lj_memprof.c | 8 ++++++--
2 files = changed, 6 insertions(+), 6 deletions(-)


<snipped>


-- 
2.28.0


-- 
Best = regards,
IM

-- 
Best = regards,
Sergey = Kaplun

= --Apple-Mail=_174CCDC7-222B-49E0-B605-5A27D2904B80--