* [Tarantool-discussions] [server-dev] [rfc] iproto connections processing improvements
@ 2020-05-29 13:47 Ilya Kosarev
  [not found] ` <20200529145512.GA189726@atlas>

From: Ilya Kosarev @ 2020-05-29 13:47 UTC
To: server-dev, tarantool-discussions

Hello everyone!

Tarantool processes connections through the iproto subsystem. Because of
the problems roughly described in the mentioned tickets, some aspects of
this subsystem's behavior should be reconsidered.

The proposed changes are meant to solve at least the following problems.
The first is violation of the file descriptor rlimit when clients keep
sending requests while the tx thread is unresponsive. According to
Yaroslav, 12 vshard routers reconnecting every 10.5 seconds for 15
minutes are enough to make recovery die with the «can't initialize
storage: error reading directory: too many open files» error.
The second is dirty reads and similar problems that arise because tx can
respond although bootstrap is not finished.

The solution is basically to give iproto more freedom, at least in some
cases. As far as I can see, it can be implemented with a simple state
machine. The alternative is a vtab, which seems like overkill here: it is
less transparent, and there are only 2 options for each request, to
process it or to reject it. To start with, we can use 2 states to solve
the first problem, which seems the more painful one, and then introduce
new states to solve the second problem and possibly others. These states
may be called "solo" and "assist". The "assist" state mostly corresponds
to the current iproto behavior and should be the default, while the
"solo" state is meant to be enabled by the tx thread when it is about to
become unresponsive for a considerable time (for example, while building
secondary keys). In the "solo" state iproto does not communicate with tx
and simply answers every request with "tx is busy". The alternative is
some kind of heartbeat from tx to iproto that lets iproto decide by
itself whether to change its state, but that also seems like overkill.
If a user, for example, loads tx so heavily that it cannot communicate
with iproto, that is the user's own problem.

This approach is needed because right now iproto can only accept
connections, and so it keeps spending sockets when the tx thread cannot
answer. Currently tx has to prepare the greeting, and only then can
iproto send it. It works the same way for all other requests: tx
prepares the answer and then iproto processes it.
The proposed approach allows iproto itself to close connections or ask
them to wait in the "solo" state. This will solve the descriptor leak.
Later, more states can be added in which iproto, for example, answers by
itself only to DML requests (while tx is not ready for them). This idea
is partly implemented, and it behaves satisfactorily with an
unresponsive tx.

One thing causes trouble: output buffers use a thread-local slab_cache.
Currently the obuf's slab_cache belongs to tx, while the proposed
changes mean that both tx and iproto have to be able to use obufs,
depending on the state and request type. I am currently looking for the
best approach here. One option is to use more obufs (4 instead of 2),
2 of them belonging to the tx thread and 2 to the iproto thread.

It is also debatable whether connections to iproto in the "solo" state
should be closed or should retry their requests after some timeout. I
propose to close them, although there is an opinion that this is not the
right behavior. I think it is more transparent and understandable for
users to reconnect by themselves, especially since this unresponsive tx
state might last for quite a long time.

--
Ilya Kosarev
[parent not found: <20200529145512.GA189726@atlas>]
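The two-state idea from the RFC above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Tarantool's code: the class `Iproto`, the method names, the `ER_TX_BUSY` reply string and the in-memory `tx_queue` are all hypothetical, and the tx thread is modelled as never draining its queue (the "unresponsive tx" case).

```python
from enum import Enum


class IprotoState(Enum):
    ASSIST = "assist"  # current behaviour: every request needs the tx thread
    SOLO = "solo"      # tx announced it will be busy: iproto answers alone


class Iproto:
    """Toy model of the network thread (hypothetical, for illustration)."""

    def __init__(self):
        self.state = IprotoState.ASSIST
        self.open_connections = 0  # stands in for used file descriptors
        self.tx_queue = []         # requests waiting for an unresponsive tx

    def set_state(self, state):
        # In the proposal this switch is driven by tx, before and after
        # a long blocking job such as a secondary key build.
        self.state = state

    def on_connect(self, request):
        self.open_connections += 1
        if self.state is IprotoState.SOLO:
            # Answer without tx and release the socket right away.
            self.open_connections -= 1
            return "ER_TX_BUSY"  # hypothetical reply, not a real error code
        # ASSIST: the connection stays open until tx prepares an answer.
        self.tx_queue.append(request)
        return None


def simulate(state, attempts):
    """Return how many connections stay open after `attempts` connects."""
    iproto = Iproto()
    iproto.set_state(state)
    for i in range(attempts):
        iproto.on_connect(f"req-{i}")
    return iproto.open_connections
```

In this model `simulate(IprotoState.ASSIST, 1000)` leaves every connection open, which is the descriptor leak described in the RFC, while `simulate(IprotoState.SOLO, 1000)` leaves none, because iproto replies and closes on its own.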
* Re: [Tarantool-discussions] [server-dev] [rfc] iproto connections processing improvements
  [not found] ` <20200529145512.GA189726@atlas>
@ 2020-06-30 11:18 ` Ilya Kosarev
  2020-06-30 11:47 ` Konstantin Osipov

From: Ilya Kosarev @ 2020-06-30 11:18 UTC
To: Konstantin Osipov, server-dev; +Cc: tarantool-discussions

The links were inlined under words.
I did not take into account that this does not work in all clients.
https://github.com/tarantool/tarantool/issues/3776
https://github.com/tarantool/tarantool/issues/4646
https://github.com/tarantool/tarantool/issues/4910

The vtab vs. states question is of course a separate one; I wrote how I
see it.

iproto_msg_decode is exactly where one can be separated from the other.
Or is that not what you mean?
The proposal is, among other things, precisely about running
iproto_msg_decode without waiting for tx (currently tx is too actively
involved in network interaction). Besides, since tx eventually has to
take part in the process anyway, when it is "busy" we want to cut off
any connections in iproto so that, at the very least, descriptors do not
leak.

--
Ilya Kosarev

>Friday, 29 May 2020, 17:55 +03:00 from Konstantin Osipov <kostja@scylladb.com>:
>
>* Ilya Kosarev <i.kosarev@tarantool.org> [20/05/29 16:49]:
>
>Ilya, this would be hard to make sense of even in Russian, let alone
>in English.
>
>There are no links to the "mentioned tickets".
>
>So vtab is overkill: is that a reference to my comment in the ticket,
>apparently?
>You could have discussed it with me directly.
>
>On the whole, the question here is not rfc vs vtab, but how to
>separate, in the traffic, connections from replicas, which must be
>accepted during bootstrap, from connections from clients.
>
>As of today the protocol makes no such distinction.
>
>The email says nothing about this.
>
>--
>Konstantin Osipov, Moscow, Russia
* Re: [Tarantool-discussions] [server-dev] [rfc] iproto connections processing improvements
  2020-06-30 11:18 ` Ilya Kosarev
@ 2020-06-30 11:47 ` Konstantin Osipov
  2020-07-02 14:22 ` [Tarantool-discussions] [dev] " Ilya Kosarev

From: Konstantin Osipov @ 2020-06-30 11:47 UTC
To: Ilya Kosarev, server-dev; +Cc: tarantool-discussions

* Ilya Kosarev <i.kosarev@tarantool.org> [20/06/30 14:19]:
>
> The links were inlined under words.
> I did not take into account that this does not work in all clients.
> https://github.com/tarantool/tarantool/issues/3776
> https://github.com/tarantool/tarantool/issues/4646
> https://github.com/tarantool/tarantool/issues/4910
>
> The vtab vs. states question is of course a separate one; I wrote how
> I see it.
>
> iproto_msg_decode is exactly where one can be separated from the other.

How exactly?

--
Konstantin Osipov, Moscow, Russia
* Re: [Tarantool-discussions] [dev] [rfc] iproto connections processing improvements
  2020-06-30 11:47 ` Konstantin Osipov
@ 2020-07-02 14:22 ` Ilya Kosarev

From: Ilya Kosarev @ 2020-07-02 14:22 UTC
To: Konstantin Osipov, dev; +Cc: tarantool-discussions

As a result of a private discussion, here are the steps to be
implemented:

1. The greeting should be done solely by iproto. This means session
   creation has to be moved to a later point (after iproto_msg_decode).
   Thus iproto has to be able to reach iproto_msg_decode without tx
   assistance. Iproto should also be able to finish a connection by
   itself when that is possible (a connection rejected by iproto).
2. Introduce a state machine managed from tx. tx should be able to
   enable different iproto states depending on the tx work phase, for
   example, to reject all connections during a secondary index build.
3. To be more specific, we need to be able to classify different types
   of connections, for example, a replica connection vs a client
   connection. This means we need to add a specific flag for replica
   authentication and prioritize it if needed, depending on the iproto
   state.
4. The new approach to connection handling means we need to reconsider
   client behavior: a specific error for this rejection type, and
   reconnection on timeout.

--
Ilya Kosarev

>Tuesday, 30 June 2020, 14:47 +03:00 from Konstantin Osipov <kostja.osipov@gmail.com>:
>
>* Ilya Kosarev <i.kosarev@tarantool.org> [20/06/30 14:19]:
>>
>> The links were inlined under words.
>> I did not take into account that this does not work in all clients.
>> https://github.com/tarantool/tarantool/issues/3776
>> https://github.com/tarantool/tarantool/issues/4646
>> https://github.com/tarantool/tarantool/issues/4910
>>
>> The vtab vs. states question is of course a separate one; I wrote how
>> I see it.
>>
>> iproto_msg_decode is exactly where one can be separated from the other.
>How exactly?
>
>--
>Konstantin Osipov, Moscow, Russia
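Steps 2 to 4 of the plan above could look roughly like the following toy model. It is a sketch only: the phase names, the `ConnKind` classes, the `Rejected` exception and the admission table are assumptions made for illustration, none of them is an existing Tarantool API. The table encodes two points taken from the thread: replicas must still be accepted during bootstrap, and all connections are rejected during a secondary index build.

```python
import time
from enum import Enum, auto


class TxPhase(Enum):       # hypothetical iproto states, set from tx (step 2)
    NORMAL = auto()
    BOOTSTRAP = auto()
    INDEX_BUILD = auto()


class ConnKind(Enum):      # hypothetical connection classes (step 3)
    CLIENT = auto()
    REPLICA = auto()


class Rejected(Exception):
    """Hypothetical dedicated error for an iproto-side rejection (step 4)."""


# Which connection kinds iproto admits in each phase (assumed policy).
ADMIT = {
    TxPhase.NORMAL: {ConnKind.CLIENT, ConnKind.REPLICA},
    TxPhase.BOOTSTRAP: {ConnKind.REPLICA},
    TxPhase.INDEX_BUILD: set(),
}


def admit(phase, kind):
    """Decide in iproto, without tx, whether to keep the connection."""
    if kind not in ADMIT[phase]:
        raise Rejected(f"{kind.name} rejected during {phase.name}")
    return True


def connect_with_retry(try_connect, attempts, delay, sleep=time.sleep):
    """Client side: reconnect after `delay` seconds when rejected."""
    for attempt in range(attempts):
        try:
            return try_connect()
        except Rejected:
            if attempt == attempts - 1:
                raise          # give up after the last attempt
            sleep(delay)
```

A client would wrap its connect call, for example `connect_with_retry(lambda: admit(current_phase(), ConnKind.CLIENT), attempts=5, delay=1.0)`, where `current_phase` is again a hypothetical helper. Passing `sleep` as a parameter keeps the retry loop testable without real waiting.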