From: Cyrill Gorcunov via Tarantool-patches
Reply-To: Cyrill Gorcunov
To: tml
Cc: Vladislav Shpilevoy
Date: Fri, 4 Jun 2021 20:06:05 +0300
Message-Id: <20210604170607.1127177-1-gorcunov@gmail.com>
Subject: [Tarantool-patches] [RFC v7 0/2] relay: provide downstream lag information

Guys, take a look once time permits. The previous version is here

https://lists.tarantool.org/tarantool-patches/20210506214132.533913-1-gorcunov@gmail.com/

Not for merging yet!

I think instead of the applier_wal_stat structure we might need some
_commonly shared_ statistics structure, probably bound to the WAL code,
which all other threads update in a lockless way, because we might need
to collect more details on WAL processing in the future.
I thought of something like

	enum {
		WAL_ICNT__APPLIER_TXN_START_TM,

		WAL_ICNT__MAX,
	};

	struct wal_stat {
		int64_t icnt[WAL_ICNT__MAX];
	} wal_st[VCLOCK_MAX];

and introduce

	wal_st__icnt_read(unsigned id);
	wal_st__icnt_write(unsigned id);

then the applier will simply push the last timestamp to the
WAL_ICNT__APPLIER_TXN_START_TM counter, and later when we need to send
an ACK we use wal_st__icnt_read() to fetch it back. We won't need to
allocate any dynamic memory for it but rather use statically
preallocated memory shared between threads. Not sure though.

v4 (by Vlad):
 - add a test case
 - add docbot request
 - dropped xrow_encode_vclock_timed, we use an open-coded assignment
   for the tm value when sending an ack
 - struct awstat renamed to applier_wal_stat. Vlad, I think this is a
   better name than "applier_lag" because this is statistics on WAL,
   we simply track remote WAL propagation here, so a more general name
   is better for grep's sake and for future extensions
 - instead of passing the applier structure we pass replica_id
 - the real keeper of this statistics moves into the "replica"
   structure, thus unbound from the applier itself
 - for synchro entries we pass a pointer to the applier_wal_stat
   instead of using replica_id = 0 as a sign that we don't need to
   update statistics for initial and final join cases
 - to write and read statistics we provide wal_stat_update and
   wal_stat_ack helpers to cover the case where a single ACK spans
   several transactions

v7:
 - reworked the idea, so we always send the last applied transaction's
   timestamp inside the applier's ACK message
 - fixed changelog
 - in the replica structure open-code the applier_txn_start_tm member
 - drop multiple ifs in applier_apply_tx
 - drop the if statement in apply_synchro_row

Vlad, you pointed out

> 4. Lag is updated in the relay thread, therefore you can't
>    simply read it in TX thread like you do in the diff block
>    above.

Actually I can read the relay's lag in box.info() output: if the relay
object is removed then it won't have the RELAY_FOLLOW state, so we're
safe. Is this what you meant?
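To make the shared-statistics idea from the top of this letter a bit more concrete, here is a rough sketch of how the counters could be read and written without locks using C11 atomics. It is not part of the series. The extra replica_id and value arguments, the nanosecond unit, the relaxed memory ordering, and the local VCLOCK_MAX value are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in value for this sketch; the real constant lives in the vclock code. */
#define VCLOCK_MAX 32

enum {
	/* Timestamp of the last applied transaction's first row. */
	WAL_ICNT__APPLIER_TXN_START_TM,

	WAL_ICNT__MAX,
};

struct wal_stat {
	/* One atomic slot per counter, so no lock is needed. */
	_Atomic int64_t icnt[WAL_ICNT__MAX];
};

/* Statically preallocated, one entry per replica id, zero-initialized. */
static struct wal_stat wal_st[VCLOCK_MAX];

/*
 * Hypothetical signature: the letter only names the helper, the
 * replica_id and value arguments are assumed here.
 */
static inline void
wal_st__icnt_write(unsigned replica_id, unsigned id, int64_t value)
{
	assert(replica_id < VCLOCK_MAX && id < WAL_ICNT__MAX);
	/* A single independent value, so relaxed ordering is enough here. */
	atomic_store_explicit(&wal_st[replica_id].icnt[id], value,
			      memory_order_relaxed);
}

static inline int64_t
wal_st__icnt_read(unsigned replica_id, unsigned id)
{
	assert(replica_id < VCLOCK_MAX && id < WAL_ICNT__MAX);
	return atomic_load_explicit(&wal_st[replica_id].icnt[id],
				    memory_order_relaxed);
}
```

With something like this the applier thread would call wal_st__icnt_write() once a transaction is applied, and the ACK writer would call wal_st__icnt_read() when it builds the message. If several counters ever had to be read as one consistent snapshot, plain per-slot atomics would not be enough and some sequence counter would be needed on top.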

branch gorcunov/gh-5447-relay-lag-7-notest
issue https://github.com/tarantool/tarantool/issues/5447

Cyrill Gorcunov (2):
  applier: send transaction's first row WAL time in the applier_writer_f
  relay: provide information about downstream lag

 .../unreleased/gh-5447-downstream-lag.md      |  6 ++
 src/box/applier.cc                            | 74 ++++++++++++---
 src/box/applier.h                             | 14 +++
 src/box/lua/info.c                            |  3 +
 src/box/relay.cc                              | 51 ++++++++++
 src/box/relay.h                               |  6 ++
 src/box/replication.cc                        |  1 +
 src/box/replication.h                         |  5 +
 .../replication/gh-5447-downstream-lag.result | 93 +++++++++++++++++++
 .../gh-5447-downstream-lag.test.lua           | 41 ++++++++
 10 files changed, 279 insertions(+), 15 deletions(-)
 create mode 100644 changelogs/unreleased/gh-5447-downstream-lag.md
 create mode 100644 test/replication/gh-5447-downstream-lag.result
 create mode 100644 test/replication/gh-5447-downstream-lag.test.lua

-- 
2.31.1