From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp61.i.mail.ru (smtp61.i.mail.ru [217.69.128.41])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id 648EE469710
	for ; Thu, 7 May 2020 03:38:45 +0300 (MSK)
Date: Thu, 7 May 2020 00:38:44 +0000
From: Nikita Pettik
Message-ID: <20200507003844.GC9992@tarantool.org>
References: <7965217ceed66d448cee453c690e8d91ba7a841b.1587948306.git.korablev@tarantool.org>
	<19283eae-5eb9-3edb-2dea-8588874e174b@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <19283eae-5eb9-3edb-2dea-8588874e174b@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v3 3/3] vinyl: clean-up write iterator if vy_task_write_run() fails
List-Id: Tarantool development patches
To: Vladislav Shpilevoy
Cc: tarantool-patches@dev.tarantool.org

On 01 May 02:55, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
> See 3 comments below.
>
> > diff --git a/src/box/vy_scheduler.c b/src/box/vy_scheduler.c
> > index 9dba93d34..387f58723 100644
> > --- a/src/box/vy_scheduler.c
> > +++ b/src/box/vy_scheduler.c
> > @@ -1065,8 +1065,10 @@ vy_task_write_run(struct vy_task *task, bool no_compression)
> >  			       no_compression) != 0)
> >  		goto fail;
> >
> > -	if (wi->iface->start(wi) != 0)
> > +	if (wi->iface->start(wi) != 0) {
> > +		wi->iface->stop(wi);
>
> 1. I would better make start() more self-sufficient. Otherwise it
> failed to start, and yet you somewhy need to stop it. Looks confusing.

Ok, here's diff:

diff --git a/src/box/vy_scheduler.c b/src/box/vy_scheduler.c
index 387f58723..9dba93d34 100644
--- a/src/box/vy_scheduler.c
+++ b/src/box/vy_scheduler.c
@@ -1065,10 +1065,8 @@ vy_task_write_run(struct vy_task *task, bool no_compression)
 			       no_compression) != 0)
 		goto fail;
 
-	if (wi->iface->start(wi) != 0) {
-		wi->iface->stop(wi);
+	if (wi->iface->start(wi) != 0)
 		goto fail_abort_writer;
-	}
 	int rc;
 	int loops = 0;
 	struct tuple *stmt = NULL;
diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
index 21c18d3dc..33ad5ed51 100644
--- a/src/box/vy_write_iterator.c
+++ b/src/box/vy_write_iterator.c
@@ -401,18 +401,23 @@ vy_write_iterator_start(struct vy_stmt_stream *vstream)
 	struct vy_write_src *src, *tmp;
 	rlist_foreach_entry_safe(src, &stream->src_list, in_src_list, tmp) {
 		if (vy_write_iterator_add_src(stream, src) != 0)
-			return -1;
+			goto fail;
 #ifndef NDEBUG
 		struct errinj *inj = errinj(ERRINJ_VY_WRITE_ITERATOR_START_FAIL, ERRINJ_BOOL);
 		if (inj != NULL && inj->bparam) {
 			inj->bparam = false;
 			diag_set(OutOfMemory, 666, "malloc", "struct vy_stmt");
-			return -1;
+			goto fail;
 		}
 #endif
 	}
 	return 0;
+fail:
+	/* Clean-up all previously added sources. */
+	rlist_foreach_entry_safe(src, &stream->src_list, in_src_list, tmp)
+		vy_write_iterator_delete_src(stream, src);
+	return -1;
 }

> > diff --git a/test/vinyl/gh-4864-stmt-alloc-fail-compact.result b/test/vinyl/gh-4864-stmt-alloc-fail-compact.result
> > index af116a4b4..ea8dce0ba 100644
> > --- a/test/vinyl/gh-4864-stmt-alloc-fail-compact.result
> > +++ b/test/vinyl/gh-4864-stmt-alloc-fail-compact.result
> > @@ -242,6 +242,91 @@ s:drop()
> > +assert(s.index.pk:stat().range_count == 1)
> > + | ---
> > + | - true
> > + | ...
> > +assert(s.index.pk:stat().run_count == 2)
> > + | ---
> > + | - true
> > + | ...
> > +
> > +errinj.set('ERRINJ_VY_WRITE_ITERATOR_START_FAIL', true)
> > + | ---
> > + | - ok
> > + | ...
> > +-- Prevent next attempt to compact in a row.
> > +--
> > +errinj.set("ERRINJ_VY_SCHED_TIMEOUT", 1)
> > + | ---
> > + | - ok
> > + | ...
> > +
> > +s.index.pk:compact()
> > + | ---
> > + | ...
> > +-- Leave a time gap between compaction and index drop just in case
> > +-- (to make sure that compaction is already finished (re-scheduled)
> > +-- when at the moment of index drop).
> > +--
> > +fiber.sleep(0.5)
>
> 2. Can't you wait for compaction actively on some condition? Such as
> smaller run count. Half of second is quite a big timeout for a regular
> test.

Ok, but here we can't rely on run/range count, but can use explicit
scheduler.tasks_completed statistics. Diff:

diff --git a/test/vinyl/gh-4864-stmt-alloc-fail-compact.test.lua b/test/vinyl/gh-4864-stmt-alloc-fail-compact.test.lua
index 3c2b38160..547ab628e 100644
--- a/test/vinyl/gh-4864-stmt-alloc-fail-compact.test.lua
+++ b/test/vinyl/gh-4864-stmt-alloc-fail-compact.test.lua
@@ -124,16 +124,15 @@ assert(s.index.pk:stat().range_count == 1)
 assert(s.index.pk:stat().run_count == 2)
 
 errinj.set('ERRINJ_VY_WRITE_ITERATOR_START_FAIL', true)
--- Prevent next attempt to compact in a row.
---
-errinj.set("ERRINJ_VY_SCHED_TIMEOUT", 1)
-
+errinj.set("ERRINJ_VY_SCHED_TIMEOUT", 0.1)
+tasks_completed = box.stat.vinyl().scheduler.tasks_completed
 s.index.pk:compact()
--- Leave a time gap between compaction and index drop just in case
--- (to make sure that compaction is already finished (re-scheduled)
--- when at the moment of index drop).
+-- Tuple clean-up takes place after compaction is completed.
+-- Meanwhile range count is updated during compaction process.
+-- So instead of relying on range/run match, let's check explicitly
+-- number of completed tasks.
 --
-fiber.sleep(0.5)
+repeat fiber.sleep(0.001) until box.stat.vinyl().scheduler.tasks_completed >= tasks_completed + 1
 
 -- Drop is required to unref all tuples.
 --
@@ -142,10 +141,7 @@ s:drop()
 -- they may be still referenced (while being pushed) in Lua. So
 -- invoke GC explicitly.
 --
-collectgarbage("collect")
--- Give GC some time to operate on.
---
-fiber.sleep(1)
+_ = collectgarbage("collect")
 
 assert(errinj.get('ERRINJ_VY_WRITE_ITERATOR_START_FAIL') == false)
 errinj.set('ERRINJ_VY_WRITE_ITERATOR_START_FAIL', false)

> > +-- Drop is required to unref all tuples.
> > +--
> > +s:drop()
> > + | ---
> > + | ...
> > +-- After index is dropped, not all tuples are deallocated at once:
> > +-- they may be still referenced (while being pushed) in Lua. So
> > +-- invoke GC explicitly.
> > +--
> > +collectgarbage("collect")
> > + | ---
> > + | - 0
> > + | ...
> > +-- Give GC some time to operate on.
> > +--
> > +fiber.sleep(1)
>
> 3. GC is synchronous. So if collectgarbage() has returned, GC is done.

Indeed, removed this sleep.
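
For reference, the idea behind comment 1 (a start() that undoes its own
partial work on failure, so the caller never has to call stop() on an
iterator that did not start) can be sketched in isolation. This is only
an illustration under stated assumptions: every demo_* name below is
hypothetical and none of it is vinyl code. Unlike the patch, which walks
the whole src_list in its fail path, the sketch rolls back only the
sources it managed to add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the vinyl structures. */
enum { DEMO_MAX_SRC = 8 };

struct demo_iterator {
	/* Number of sources start() must activate. */
	int src_count;
	/* Index at which activation fails; -1 means never. */
	int fail_at;
	/* Which sources are currently active. */
	bool active[DEMO_MAX_SRC];
};

/* Activate one source; fails at the configured index. */
static int
demo_add_src(struct demo_iterator *it, int i)
{
	if (i == it->fail_at)
		return -1;
	it->active[i] = true;
	return 0;
}

/* Deactivate one previously activated source. */
static void
demo_delete_src(struct demo_iterator *it, int i)
{
	it->active[i] = false;
}

/*
 * Self-sufficient start: on failure it rolls back everything it has
 * added so far, so the caller can simply bail out on a non-zero result.
 */
int
demo_iterator_start(struct demo_iterator *it)
{
	assert(it != NULL && it->src_count <= DEMO_MAX_SRC);
	int i;
	for (i = 0; i < it->src_count; i++) {
		if (demo_add_src(it, i) != 0)
			goto fail;
	}
	return 0;
fail:
	/* Undo sources [0, i) that were added before the failure. */
	while (--i >= 0)
		demo_delete_src(it, i);
	return -1;
}

/* Count active sources; handy for checking the rollback. */
int
demo_active_count(const struct demo_iterator *it)
{
	int n = 0;
	for (int i = 0; i < DEMO_MAX_SRC; i++)
		n += it->active[i] ? 1 : 0;
	return n;
}
```

With src_count = 3 and fail_at = 2, demo_iterator_start() returns -1 and
leaves no source active, which is the property the caller in
vy_task_write_run() relies on when it jumps straight to the failure
label.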