Tarantool discussions archive
From: "Ilya Kosarev" <i.kosarev@tarantool.org>
To: "Konstantin Osipov" <kostja@scylladb.com>, server-dev@tarantool.org
Cc: tarantool-discussions <tarantool-discussions@dev.tarantool.org>
Subject: Re: [Tarantool-discussions] [server-dev]  [rfc] iproto connections processing improvements
Date: Tue, 30 Jun 2020 14:18:12 +0300
Message-ID: <1593515892.514490810@f382.i.mail.ru>
In-Reply-To: <20200529145512.GA189726@atlas>


The links were inlined under the words.
I did not take into account that this does not work in every mail client.
The vtab/states question is of course a separate one; I wrote down how I see it.
iproto_msg_decode is exactly where one can be separated from the other.
Or is that not what you mean? Among other things, the proposal is precisely
about running iproto_msg_decode without waiting for tx (right now tx is too
actively involved in the network interaction).
Besides, since tx has to take part in the process eventually anyway,
when it is "busy" we want to cut off any connections
in iproto, so that descriptors do not actually leak, as
Ilya Kosarev
>Friday, 29 May 2020, 17:55 +03:00 from Konstantin Osipov <kostja@scylladb.com>:
>* Ilya Kosarev <i.kosarev@tarantool.org> [20/05/29 16:49]:
>Ilya, this would be hard to make sense of even in Russian, and in English
>all the more so.
>There are no links to the "mentioned tickets".
>"vtab is an overkill": is that a reference
>to my comment in the ticket, apparently?
>You could have discussed it with me directly.
>Overall, the question here is not rfc vs vtab, but how to separate,
>within the traffic, connections from replicas, which must be accepted during
>bootstrap, from connections from clients.
>As of today the protocol makes no such distinction.
>The letter says nothing about this.
>> Hello everyone!
>> It is well known that Tarantool processes connections through the
>> iproto subsystem. Due to some problems, roughly described in the
>> mentioned tickets, it turns out that the behavior of this subsystem
>> should be reconsidered in some aspects.
>> The proposed changes are supposed to solve at least the following
>> problems. The first one is violation of the descriptor rlimit when
>> some clients perform enough requests while the tx thread is
>> unresponsive. According to Yaroslav, 12 vshard routers reconnecting
>> every 10.5 seconds for 15 minutes are enough for recovery to die with
>> the «can't initialize storage: error reading directory: too many open
>> files» error.
>> The second one is dirty reads and similar issues, when tx can respond
>> although bootstrap is not finished.
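A rough back-of-the-envelope check of the numbers quoted above, as a Python sketch. It assumes that every reconnect attempt leaves one descriptor open while tx is unresponsive, that each router makes its first attempt at time zero, and that the soft limit on open files is 1024 (a common default, not stated in the thread). The function and constant names are made up for illustration.

```python
import math

# Assumed, not from the thread: a common default soft limit for open files.
ASSUMED_NOFILE_SOFT_LIMIT = 1024


def leaked_descriptors(routers, reconnect_period_s, duration_s):
    """Count sockets left open if no connection attempt is ever closed.

    Each router is assumed to connect at t = 0 and then once per period.
    """
    attempts_per_router = math.floor(duration_s / reconnect_period_s) + 1
    return routers * attempts_per_router


leaked = leaked_descriptors(routers=12, reconnect_period_s=10.5,
                            duration_s=15 * 60)
print(leaked, leaked > ASSUMED_NOFILE_SOFT_LIMIT)
```

Under these assumptions the count comes to 1032, just above the assumed limit, and that is before counting the files the server itself keeps open.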
>> The solution is basically to give iproto more freedom, at least in
>> some cases. As far as I can see, it can be implemented with a simple
>> state machine. The alternative is a vtab, and it seems like overkill
>> here, since it is less transparent and there can be only 2 options
>> for each request: to process it or to reject it. To start with, we can
>> use 2 states to solve the first problem, which seems to be the more
>> painful one, and then introduce new states to solve the second problem
>> and possibly some more. These states may be called "solo" and
>> "assist". The "assist" state mostly implies the current iproto
>> behavior and should be the basic one, while the "solo" state is
>> intended to be enabled by the tx thread when it is going to become
>> unresponsive for a considerable time (for example, while building
>> secondary keys). The "solo" state means that iproto won't communicate
>> with tx and will simply answer every request, from anyone, that tx is
>> busy. The alternative is some kind of heartbeat from tx to iproto that
>> lets iproto decide for itself whether it needs to change its state,
>> however it also seems like overkill. If a user, for example, loads tx
>> so much that it can't communicate with iproto, that is the user's own
>> problem.
>> This approach is needed because right now iproto can only accept
>> connections, consequently spending sockets when the tx thread can't
>> answer. Currently tx needs to prepare the greeting, and only then can
>> iproto send it. It works the same with all other requests: tx needs to
>> prepare the answer, and then iproto processes it.
>> The proposed approach allows iproto itself to close connections or
>> ask them to wait while in the "solo" state. This will solve the
>> leaking descriptors problem. Later, more states can be added, where
>> iproto, for example, will itself answer only DML requests (while tx
>> is not ready for them). This idea is partly implemented, and it shows
>> satisfactory behavior in the case of an unresponsive tx.
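The two-state idea described above can be illustrated with a small toy model. This is a Python sketch, not Tarantool's actual C code: the class names, the `tx_handler` callback and the reply string are all hypothetical, and it models the "close the connection" variant of the "solo" state.

```python
from enum import Enum


class IprotoState(Enum):
    ASSIST = "assist"  # current behavior: tx prepares every answer
    SOLO = "solo"      # tx is busy: iproto answers by itself


class IprotoSketch:
    """Toy model of the iproto side (hypothetical, for illustration)."""

    def __init__(self, tx_handler):
        self.state = IprotoState.ASSIST
        self.tx_handler = tx_handler  # hypothetical callback into tx

    def set_state(self, state):
        # In the proposal, tx switches the state itself before it
        # becomes unresponsive and switches it back afterwards.
        self.state = state

    def handle(self, request):
        """Return (reply, keep_connection_open)."""
        if self.state is IprotoState.SOLO:
            # Never touch tx: reject, and let the socket be closed.
            return ("tx is busy", False)
        return (self.tx_handler(request), True)


tx_calls = []


def fake_tx(request):
    tx_calls.append(request)
    return "ok: " + request


iproto = IprotoSketch(fake_tx)
assist_reply = iproto.handle("select")
iproto.set_state(IprotoState.SOLO)   # e.g. before building secondary keys
solo_reply = iproto.handle("select")
print(assist_reply, solo_reply, tx_calls)
```

In the "solo" branch the tx callback is never invoked, which is the property that lets iproto release the descriptor without waiting for tx.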
>> There is one thing that causes trouble: using output buffers (obufs)
>> with a thread-local slab_cache. Now the obuf's slab_cache belongs to
>> tx, while the proposed changes mean that both tx and iproto have to be
>> able to use obufs depending on the state and request type. I am
>> currently searching for the best approach here. There is an option to
>> use more obufs (4 instead of 2), 2 of them belonging to the tx thread
>> and 2 of them belonging to the iproto thread.
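The "4 obufs instead of 2" option can be pictured as each thread writing only into buffers backed by its own allocator. The following Python sketch is a toy model under that assumption. The names are hypothetical, the ownership check stands in for the thread-local slab_cache constraint, and the idea that each pair alternates between two buffers is also an assumption.

```python
class Obuf:
    """Toy output buffer bound to the thread whose allocator backs it."""

    def __init__(self, owner):
        self.owner = owner
        self.chunks = []

    def append(self, thread, data):
        if thread != self.owner:
            raise RuntimeError(
                f"{thread} cannot write to an obuf owned by {self.owner}")
        self.chunks.append(data)


class ConnectionSketch:
    """Per-connection output state with two obufs per thread (4 total)."""

    def __init__(self):
        self.obufs = {
            "tx": [Obuf("tx"), Obuf("tx")],
            "iproto": [Obuf("iproto"), Obuf("iproto")],
        }
        self.current = {"tx": 0, "iproto": 0}

    def write(self, thread, data):
        # Each thread only ever touches its own pair of buffers.
        self.obufs[thread][self.current[thread]].append(thread, data)

    def rotate(self, thread):
        # Switch to the thread's other buffer (assumed to happen while
        # the first one is being flushed to the socket).
        self.current[thread] ^= 1


conn = ConnectionSketch()
conn.write("tx", "reply prepared by tx")
conn.write("iproto", "tx is busy")
conn.rotate("iproto")
conn.write("iproto", "tx is busy (second buffer)")
```

The cost of this option is twice as many buffers per connection; the benefit is that neither thread ever allocates from the other's slab_cache.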
>> It is also debatable whether connections to iproto in the "solo"
>> state should be closed or should retry their requests after some
>> timeout. I propose to close them, although there is an opinion that
>> this is not the right behavior. Still, I think it is more transparent
>> and understandable for users to reconnect by themselves, especially
>> since this unresponsive tx state might last for quite a long time.
>> --
>> Ilya Kosarev
>Konstantin Osipov, Moscow, Russia

Thread overview: 4+ messages
2020-05-29 13:47 Ilya Kosarev
     [not found] ` <20200529145512.GA189726@atlas>
2020-06-30 11:18   ` Ilya Kosarev [this message]
2020-06-30 11:47     ` Konstantin Osipov
2020-07-02 14:22       ` [Tarantool-discussions] [dev] " Ilya Kosarev
