From: Cyrill Gorcunov via Tarantool-patches
Reply-To: Cyrill Gorcunov
To: tml
Cc: Vladislav Shpilevoy
Date: Fri, 26 Mar 2021 15:06:02 +0300
Message-Id: <20210326120605.2160131-1-gorcunov@gmail.com>
X-Mailer: git-send-email 2.30.2
List-Id: Tarantool development patches
Subject: [Tarantool-patches] [PATCH v5 0/3] gc/xlog: delay xlog cleanup until relays are subscribed

Take a look please.

v2:
 - rebase code to the fresh master branch
 - keep wal_cleanup_delay option name
 - pass wal_cleanup_delay as an option to gc_init, so it won't be
   dependent on cfg engine
 - add comment about gc_delay_unref in plain bootstrap mode
 - allow setting wal_cleanup_delay dynamically
 - update comment in gc_wait_cleanup and call it conditionally
 - declare wal_cleanup_delay as a double
 - rename gc.cleanup_is_paused to gc.is_paused and update output
 - do not show ref counter in box.info.gc() output
 - update documentation
 - move gc_delay_unref inside relay_subscribe call which runs
   in tx context (instead of relay's context)
 - update tests:
   - add a comment why we need a temp space on replica node
   - use explicit insert/snapshot operations
   - shrink the number of insert/snapshot to speed up testing
   - use "restart" instead of stop/start pair
   - use wait_log helper instead of own function
   - add is_paused test

v3:
 - fix changelog
 - rework box_check_wal_cleanup_delay: the replication_anon setting
   is considered only in box_set_wal_cleanup_delay, i.e. when the
   config is checked and parsed; moreover, the option is set up
   after "replication_anon" option processing
 - the delay cycle now uses a deadline instead of a per-cycle
   calculation
 - use `double` type for timestamp
 - test update:
   - verify `.is_paused` value
   - minimize number of inserts
   - no need to use temporary space, regular space works as well
   - add comments on why we should restart the master node

v4:
 - drop the argument from gc_init(): since the delay value is
   configured from the load_cfg.lua script, there is no need to read
   the delay early; simply start gc paused and unpause it on demand
 - move unpause message to main wait cycle
 - test update:
   - verify tests and fix replication/replica_rejoin since it waits
     for xlogs to be cleaned up too early
   - use 10 seconds for XlogGapError instead of 0.1 second, this is
     a common deadline value

v5:
 - define limits for `wal_cleanup_delay`: it should be either 0,
   or in range [0.001; TIMEOUT_INFINITY]. This is done so that
   a floating-point epsilon is not treated as a meaningful value
 - fix comment about why anon replica is not using delay
 - rework the delayed cleanup cycle
 - test update:
   - update vinyl/replica_rejoin -- we need to disable cleanup
     delay explicitly
   - update replication/replica_rejoin for same reason
   - drop unneeded test_run:switch() calls
   - add a testcase where timeout is decreased and cleanup fiber
     is kicked to run even with stuck replica

issue https://github.com/tarantool/tarantool/issues/5806
branch gorcunov/gh-5806-xlog-gc-5

Cyrill Gorcunov (3):
  gc/xlog: delay xlog cleanup until relays are subscribed
  test: add a test for wal_cleanup_delay option
  test: box-tap/gc -- add test for is_paused field

 .../unreleased/add-wal_cleanup_delay.md       |   5 +
 src/box/box.cc                                |  41 ++
 src/box/box.h                                 |   1 +
 src/box/gc.c                                  |  95 +++-
 src/box/gc.h                                  |  36 ++
 src/box/lua/cfg.cc                            |   9 +
 src/box/lua/info.c                            |   4 +
 src/box/lua/load_cfg.lua                      |   5 +
 src/box/relay.cc                              |   1 +
 src/box/replication.cc                        |   2 +
 test/app-tap/init_script.result               |   1 +
 test/box-tap/gc.test.lua                      |   3 +-
 test/box/admin.result                         |   2 +
 test/box/cfg.result                           |   4 +
 test/replication/gh-5806-master.lua           |   8 +
 test/replication/gh-5806-slave.lua            |   8 +
 test/replication/gh-5806-xlog-cleanup.result  | 435 ++++++++++++++++++
 .../replication/gh-5806-xlog-cleanup.test.lua | 188 ++++++++
 test/replication/replica_rejoin.lua           |  22 +
 test/replication/replica_rejoin.result        |  18 +-
 test/replication/replica_rejoin.test.lua      |  11 +-
 test/vinyl/replica_rejoin.lua                 |   5 +-
 test/vinyl/replica_rejoin.result              |  13 +
 test/vinyl/replica_rejoin.test.lua            |   8 +
 24 files changed, 916 insertions(+), 9 deletions(-)
 create mode 100644 changelogs/unreleased/add-wal_cleanup_delay.md
 create mode 100644 test/replication/gh-5806-master.lua
 create mode 100644 test/replication/gh-5806-slave.lua
 create mode 100644 test/replication/gh-5806-xlog-cleanup.result
 create mode 100644 test/replication/gh-5806-xlog-cleanup.test.lua
 create mode 100644 test/replication/replica_rejoin.lua

base-commit: f4e248c0c13a46beee238fbebc38ef687ef09d02
-- 
2.30.2
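
A side note on two points from the changelog: the v5 limit for
`wal_cleanup_delay` (either 0, or in range [0.001; TIMEOUT_INFINITY])
and the deadline-based delay cycle from v3. They can be sketched in
plain C roughly as below. This is an illustration only, not the code
from the series: every `sketch_*` name is hypothetical, and
SKETCH_TIMEOUT_INFINITY is an assumed stand-in for Tarantool's
TIMEOUT_INFINITY, whose value is not stated in this letter.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumption: a stand-in upper bound (about 100 years in seconds).
 * The real TIMEOUT_INFINITY value is not given in the cover letter.
 */
#define SKETCH_TIMEOUT_INFINITY (365.0 * 86400 * 100)

/* Smallest non-zero delay accepted, per the v5 notes. */
#define SKETCH_MIN_DELAY 0.001

/*
 * Return true if the delay is exactly 0 (no delay) or lies in
 * [SKETCH_MIN_DELAY; SKETCH_TIMEOUT_INFINITY]. Tiny positive values
 * such as a floating-point epsilon are rejected, as are negative
 * values and NaN (every comparison with NaN is false).
 */
static bool
sketch_delay_is_valid(double delay)
{
	if (delay == 0)
		return true;
	return delay >= SKETCH_MIN_DELAY &&
	       delay <= SKETCH_TIMEOUT_INFINITY;
}

/*
 * Deadline-based wait: the deadline is derived once from the start
 * time, so each wakeup only compares the clock against it instead
 * of recomputing a per-cycle remainder. Returns the seconds left,
 * or 0 once the deadline has passed.
 */
static double
sketch_time_left(double start, double delay, double now)
{
	assert(sketch_delay_is_valid(delay));
	double deadline = start + delay;
	return now < deadline ? deadline - now : 0;
}
```

For instance, sketch_delay_is_valid(0.0005) is false while
sketch_delay_is_valid(0.001) is true, and with start = 100 and
delay = 10 the helper reports 6 seconds left at now = 104.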