From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id E1E74469719
 for ; Wed, 9 Sep 2020 15:14:37 +0300 (MSK)
Date: Wed, 9 Sep 2020 15:04:07 +0300
From: Igor Munkin
Message-ID: <20200909120407.GH18920@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Subject: Re: [Tarantool-patches] [PATCH v1] test: fix flaky box/on_shutdown.test.lua on asan
List-Id: Tarantool development patches
To: "Alexander V. Tikhonov"
Cc: tarantool-patches@dev.tarantool.org, Alexander Turenko

Sasha,

Thanks for the patch! I'm just dumping my thoughts regarding the issue,
as you asked offline. I'm by no means a test-run expert, so I can only
comment from the Tarantool Lua sockets and Lua GC side.

connection releases the descriptor in two ways:
* via explicit [1] call
* implicitly using __gc metamethod[2]

I see no explicit close for the socket created within function, so I
guess you hit the latter case.

I guess your solution fixes the issue, since the GC machinery *might*
call the corresponding __gc metamethod after returns. However, with
call the GC engine makes a full collection cycle and the "dead" sock
object is released (strictly speaking, at least its __gc metamethod is
called).

I have several questions for you:
* How does test-run handle these commands and connections?
* Why do other start/stop actions in this test chunk stay unaffected?

I believe you need to dig a bit deeper to find the exact root cause.
Then you can describe the issue more precisely. Feel free to ask me
about any Lua behaviour that you find strange or mystifying.

On 31.08.20, Alexander V.
Tikhonov wrote:
> Found that box/on_shutdown.test.lua test fails on asan build with:
>
> 2020-08-26 09:04:06.750 [42629] main/102/on_shutdown [string "_ = box.ctl.on_shutdown(function() log.warn("..."]:1 W> on_shutdown 5
> Starting instance proxy...
> Run console at unix/:/tnt/test/var/001_box/proxy.control
> Start failed: builtin/box/console.lua:865: failed to create server unix/:/tnt/test/var/001_box/proxy.control: Address already in use
>
> It happened on ASAN build, because server stop routine
>
> test-run/lib/preprocessor.py:TestState.server_stop() ->
> test-run/lib/tarantool_server.py:TarantoolServer.stop()
>
> needs to free the proxy.control socket created by
>
> test-run/lib/preprocessor.py:TestState.server_start() ->
> tarantoolctl:process_local()
>
> On some builds like ASAN server stop routine needs more time to free
> the 'proxy.control' socket. So instead of time delay before server
> restart need to use garbage collector to be sure that it will be freed.
>
> Closes #5260
> Part of #4360
> ---
>
> Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-5260-on-shutdown-test
> Issue: https://github.com/tarantool/tarantool/issues/5260
> Issue: https://github.com/tarantool/tarantool/issues/4360
>

[1]: https://github.com/tarantool/tarantool/blob/master/src/lua/socket.lua#L108L119
[2]: https://github.com/tarantool/tarantool/blob/master/src/lua/socket.lua#L64L72

-- 
Best regards,
IM
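
The two release paths discussed in the reply (an explicit close versus
a finalizer that runs only once the collector reaches the dead object)
can be illustrated outside Tarantool. What follows is a minimal Python
sketch and not Tarantool's socket.lua: FakeSocket, demo and the log
list are hypothetical names, Python's __del__ and gc.collect() stand in
for Lua's __gc metamethod and a forced full collection, and a
self-reference is added on purpose so that CPython's reference counting
alone cannot free the object.

```python
import gc


class FakeSocket:
    """Hypothetical stand-in for an object that owns a descriptor."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.closed = False
        # Deliberate reference cycle: only the cycle collector can
        # reclaim this object once the last outside reference is gone.
        self.cycle = self

    def close(self):
        """Explicit release path: takes effect immediately."""
        if not self.closed:
            self.closed = True
            self.log.append(self.name)

    def __del__(self):
        """Implicit release path: runs whenever the collector decides."""
        self.close()


def demo():
    log = []
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections out of the picture
    try:
        explicit = FakeSocket("explicit", log)
        explicit.close()            # released right here

        implicit = FakeSocket("implicit", log)
        del implicit                # unreachable, but not released yet
        before = list(log)

        gc.collect()                # forced full collection runs __del__
        after = list(log)
    finally:
        if was_enabled:
            gc.enable()
    return before, after
```

Under CPython, demo() returns (['explicit'], ['explicit', 'implicit']):
the unreferenced object stays unreleased until a full collection is
forced, which mirrors why an explicit collection call can hide a
missing explicit close without explaining the root cause.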