From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Mon, 21 May 2018 01:36:00 +0300 From: Konstantin Osipov Subject: Re: [PATCH] vinyl: allow to build secondary index for non-empty space Message-ID: <20180520223600.GA18203@atlas> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: To: Vladimir Davydov Cc: tarantool-patches@freelists.org List-ID: Looks like there may be concurrency issues with invoking the callback in on_replace() trigger. The trigger operates in "read uncommitted" mode, i.e. sees actions of uncommitted transactions. This may lead to a unique constraint violation. To construct a test case I would need to know where you revert the actions on the temporary index in case of transaction rollback. It seems to be nowhere, which is itself a problem. So we should begin discussing this by having a test for transaction rollback. Once we have an on_rollback trigger, non-unique indexes should work. For unique indexes we should think about a way of maintaining uniqueness. This could be done by moving the on_replace work to on_commit(). The trouble is with reads, which may yield. I'm not sure it's a good idea to yield in on_commit() - this may break the transaction manager. So apparently we need to include the temporary LSM to transaction management. Or, which is similar, we could continue doing reads and yields in on_replace(), while a moving the final uniqueness check only at on_commit(). Let's discuss this f2f. The second comment is related to recovery. Looks like we could simplify it. I yet don't know how, but I'm not convinced it's impossible. We could optimistically write a record to the WAL as soon as we have done iterating over the primary key and building the new index. But we still commit the index in vylog after *both* WAL write *and* successful dump. Then, since we read vylog before WAL recovery, we can either skip recovery if VY_LOG_CREATE_LSM is present, or rebuild the index from scratch when encountering the WAL record if matching VY_LOG_CREATE_LSM is missing. I think this idea is ugly, because I don't want to rebuild the index from scratch during recovery. Perhaps we need to have an intermediate vylog record, which is written after all unique checks are done and a dump is complete, to speed up replay during recovery. Perhaps I don't understand your patch really well. Let's discuss. * Vladimir Davydov [18/04/21 21:20]: -- Konstantin Osipov, Moscow, Russia, +7 903 626 22 32 http://tarantool.io - www.twitter.com/kostja_osipov