From mboxrd@z Thu Jan 1 00:00:00 1970
To: Vladislav Shpilevoy, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
References: <20210212112541.27561-1-sergepetrenko@tarantool.org>
Message-ID: <57b04874-1bb7-3d62-856d-b60df700514a@tarantool.org>
Date: Mon, 15 Feb 2021 11:40:19 +0300
Subject: Re: [Tarantool-patches] [PATCH] relay: yield explicitly every N sent rows
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko

On 13.02.2021 00:48, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
> On 12.02.2021 12:25, Serge Petrenko via Tarantool-patches wrote:
>> While sending a WAL, relay only yields in `coio_write_xrow`, once it
>> sees the socket isn't ready for writes.
>> It may happen that the socket is always ready for a long period of time,
>> and relay doesn't yield at all while recovering a whole .xlog file. This
>> may take well more than a minute.
>> During this period of time, relay doesn't read replica's ACKs due to
>> relay reader fiber not being scheduled, and once the reader is finally
>> live it times out immediately, causing the replica to reconnect.
>>
>> The problem is amplified by the fact that replica waits for
>> replication_timeout to pass prior to reconnecting, which lets master
>> pile up even more ready WALs, and effectively making it impossible for
>> the replica to sync.
> I couldn't understand this part. Why is it bad? Yeah, replica waits,
> but replica is applier, on another instance. How is it related? And
> relay_reader does not send anything. So why is it bad?

Thanks for the review!

I probably shouldn't have included this paragraph in the explanation.
I tried to explain how this bug leads to the replica not being able to
sync with the master when the master is under load.

I reworded the commit message a bit; I hope it's clearer now.

>
> Couldn't the problem be fixed by reading all the non-consumed data after
> reading WAL?

Relay does read every ack received while feeding a WAL, but it reads the
acks only after it has finished reading the WAL, so all the reads time out.

>
> The current solution also looks fine. Maybe even better because it
> becomes consistent with local recovery.
> However I still want to
> understand this part about replica.
>
>> To fix the problem let's yield explicitly in relay_send_row every
>> WAL_ROWS_PER_YIELD rows. The same is already done in local recovery, and
>> serves the same purpose: to not block the event loop for too long.
>>
>> Closes #5762
>> ---
>> diff --git a/src/box/relay.cc b/src/box/relay.cc
>> index df04f8198..afc57dfbc 100644
>> --- a/src/box/relay.cc
>> +++ b/src/box/relay.cc
>> @@ -836,11 +836,20 @@ relay_send(struct relay *relay, struct xrow_header *packet)
>>  {
>>         ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY);
>>
>> +       static uint64_t row_cnt = 0;
> Relays are in threads. So this variable either should be thread-local,
> or be in struct relay. Otherwise you get non-atomic updates which may
> lead to some increments disappearing.
>
> Given that thread-local variable access is not free, I would go for
> having it in struct relay, but up to you.

Thanks for noticing! Let it be in relay then.

Diff:

================================================

diff --git a/src/box/relay.cc b/src/box/relay.cc
index 1d8edf116..6d9269e1d 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -117,6 +117,11 @@ struct relay {
          * is passed by the replica on subscribe.
          */
         uint32_t id_filter;
+       /**
+        * How many rows has this relay sent to the replica. Used to yield once
+        * in a while when reading a WAL to unblock the event loop.
+        */
+       size_t row_cnt;
         /**
          * Local vclock at the moment of subscribe, used to check
          * dataset on the other side and send missing data rows if any.
@@ -218,6 +223,7 @@ relay_start(struct relay *relay, int fd, uint64_t sync,
         coio_create(&relay->io, fd);
         relay->sync = sync;
         relay->state = RELAY_FOLLOW;
+       relay->row_cnt = 0;
         relay->last_row_time = ev_monotonic_now(loop());
  }

@@ -836,7 +842,6 @@ relay_send(struct relay *relay, struct xrow_header *packet)
  {
         ERROR_INJECT_YIELD(ERRINJ_RELAY_SEND_DELAY);

-       static size_t row_cnt = 0;
         packet->sync = relay->sync;
         relay->last_row_time = ev_monotonic_now(loop());
         coio_write_xrow(&relay->io, packet);
@@ -846,7 +851,7 @@ relay_send(struct relay *relay, struct xrow_header *packet)
          * It may happen that the socket is always ready for write, so yield
          * explicitly every now and then to not block the event loop.
          */
-       if (++row_cnt % WAL_ROWS_PER_YIELD == 0)
+       if (++relay->row_cnt % WAL_ROWS_PER_YIELD == 0)
                 fiber_sleep(0);

         struct errinj *inj = errinj(ERRINJ_RELAY_TIMEOUT, ERRINJ_DOUBLE);

>
>>         packet->sync = relay->sync;
>>         relay->last_row_time = ev_monotonic_now(loop());
>>         coio_write_xrow(&relay->io, packet);
>>         fiber_gc();
>>
>> +       /*
>> +        * It may happen that the socket is always ready for write, so yield
>> +        * explicitly every now and then to not block the event loop.
>> +        */
>> +       row_cnt++;
>> +       if (row_cnt % WAL_ROWS_PER_YIELD == 0) {
>> +               fiber_sleep(0);
>> +       }
> Maybe better drop {} as the if's body is just one line.

Already fixed in reply to Cyrill.

-- 
Serge Petrenko
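
For illustration only, and not part of the patch: a minimal standalone C
sketch of the per-relay counter discussed above. Everything named here is
a hypothetical stand-in: `relay_sketch` replaces struct relay,
`sketch_yield` replaces fiber_sleep(0), and ROWS_PER_YIELD = 4 replaces
WAL_ROWS_PER_YIELD (whose real value is not shown in this thread). The
point it tries to show is that each relay counts its own rows, so relays
running in different threads never update a shared counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical period; stands in for WAL_ROWS_PER_YIELD. */
enum { ROWS_PER_YIELD = 4 };

/* Hypothetical stand-in for struct relay: only the counters we need. */
struct relay_sketch {
	/* Rows sent by this particular relay. */
	size_t row_cnt;
	/* Yields performed by this relay (kept only for the demo). */
	size_t yield_cnt;
};

/* Stand-in for fiber_sleep(0): just record that a yield happened. */
static void
sketch_yield(struct relay_sketch *relay)
{
	relay->yield_cnt++;
}

/* Stand-in for relay_send(): "send" one row, yield every Nth row. */
static void
sketch_send_row(struct relay_sketch *relay)
{
	if (++relay->row_cnt % ROWS_PER_YIELD == 0)
		sketch_yield(relay);
}

/* Send n rows through one relay; return its total yield count so far. */
static size_t
sketch_send_rows(struct relay_sketch *relay, size_t n)
{
	assert(relay != NULL);
	for (size_t i = 0; i < n; i++)
		sketch_send_row(relay);
	return relay->yield_cnt;
}
```

With two independent `relay_sketch` instances, rows sent through one do
not move the other closer to its next yield, which is the property a
function-level static counter cannot give when relays run in separate
threads.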