[Tarantool-discussions] [server-dev] [rfc] iproto connections processing improvements

Ilya Kosarev i.kosarev at tarantool.org
Tue Jun 30 14:18:12 MSK 2020


The links were inlined under the words.
I didn't take into account that this doesn't work in all mail clients.
https://github.com/tarantool/tarantool/issues/3776
https://github.com/tarantool/tarantool/issues/4646
https://github.com/tarantool/tarantool/issues/4910
 
The question of vtab vs. states is of course a separate one; I wrote down how I see it.
 
iproto_msg_decode is exactly the place where one can be separated from
the other. Or is that not what you mean? Among other things, the proposal
is precisely about running iproto_msg_decode without waiting for tx
(right now tx is too actively involved in network interaction).
 
Besides, since tx has to take part in the process eventually anyway,
while it is "busy" we want to cut off any connections in iproto, at the
very least so that descriptors don't actually leak.
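
A minimal sketch of that cutoff, using the "solo" and "assist" state names from the quoted RFC below. All identifiers here are hypothetical and do not exist in Tarantool; the sketch assumes tx flips a shared flag before and after a long blocking operation, and iproto checks it before handing a request to tx.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical names for illustration only; not Tarantool's API. */
enum sketch_iproto_state {
	SKETCH_STATE_ASSIST = 0, /* current behavior: tx prepares every answer */
	SKETCH_STATE_SOLO = 1,   /* tx is busy: iproto answers by itself */
};

enum sketch_verdict {
	SKETCH_FORWARD_TO_TX, /* hand the request over to the tx thread */
	SKETCH_REJECT_BUSY,   /* reply "tx is busy" without involving tx */
};

/* Written by tx, read by iproto, hence atomic. */
static atomic_int sketch_state = SKETCH_STATE_ASSIST;

/* Called from tx around an operation that makes it unresponsive. */
static void
sketch_tx_set_state(enum sketch_iproto_state state)
{
	assert(state == SKETCH_STATE_ASSIST || state == SKETCH_STATE_SOLO);
	atomic_store(&sketch_state, state);
}

/* Called from iproto for every incoming request or new connection. */
static enum sketch_verdict
sketch_iproto_route(void)
{
	if (atomic_load(&sketch_state) == SKETCH_STATE_SOLO)
		return SKETCH_REJECT_BUSY;
	return SKETCH_FORWARD_TO_TX;
}
```

With only two outcomes per request, a single flag check is enough; whether a rejected connection is then closed or asked to retry is a separate decision that this sketch leaves open.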
 
--
Ilya Kosarev
>Friday, 29 May 2020, 17:55 +03:00 from Konstantin Osipov < kostja at scylladb.com >:
> 
>* Ilya Kosarev < i.kosarev at tarantool.org > [20/05/29 16:49]:
>
>Ilya, this would be hard enough to make sense of in Russian, let alone
>in English.
>
>There are no links to the "mentioned tickets".
>
>"vtab is overkill": is that a reference to my comment in the ticket,
>apparently?
>You could have discussed it with me directly.
>
>
>Overall, the question here is not rfc vs vtab, but how to separate, in
>the traffic, connections from replicas, which must be accepted during
>bootstrap, from connections from clients.
>
>Today the protocol makes no such distinction.
>
>The email says nothing about this.
>
> 
>>
>> Hello everyone!
>>  
>> It is well known that tarantool processes connections through the
>> iproto subsystem. Due to some problems, roughly described in the
>> mentioned tickets, it turns out that the behavior of this subsystem
>> should be reconsidered in some aspects.
>>  
>> The proposed changes are supposed to solve at least the following
>> problems. The first one is violation of the descriptor rlimit when
>> clients keep performing requests while the tx thread is unresponsive.
>> According to Yaroslav, 12 vshard routers reconnecting every 10.5
>> seconds for 15 minutes are enough for recovery to die with the
>> «can't initialize storage: error reading directory: too many open
>> files» error.
>> The second one is dirty reads and similar issues that occur when tx
>> can respond although bootstrap is not finished.
>>  
>> The solution is basically to give iproto more freedom, at least in
>> some cases. As far as I can see, it can be implemented with a simple
>> state machine. The alternative is a vtab, which seems like overkill
>> here: it is less transparent, and there are only 2 options for each
>> request anyway, to process it or to reject it. To start with, we can
>> use 2 states to solve the first problem, which seems to be the more
>> painful one, and then introduce new states to solve the second problem
>> and possibly some more. These states may be called "solo" and
>> "assist". The "assist" state mostly implies the current iproto
>> behavior and should be the default one, while the "solo" state is
>> intended to be enabled by the tx thread when it is going to become
>> unresponsive for a considerable time (for example, while building
>> secondary keys). The "solo" state means that iproto won't communicate
>> with tx and will simply answer any request from anyone with "tx is
>> busy". The alternative is some kind of heartbeat from tx to iproto
>> that lets iproto decide by itself whether it needs to change its
>> state, but that also seems like overkill. If a user, for example,
>> loads tx so much that it can't communicate with iproto, that is the
>> user's own problem.
>>  
>> This approach is needed because currently iproto can only accept
>> connections, and so it keeps spending sockets when the tx thread
>> can't answer. Currently tx needs to prepare the greeting, and only
>> then can iproto send it. It works the same way with all other
>> requests: tx needs to prepare the answer, and then iproto processes
>> it.
>> The proposed approach allows iproto itself to close connections or
>> ask them to wait in the "solo" state. This will solve the leaking
>> descriptors problem. Later, more states can be added in which iproto,
>> for example, answers by itself only to dml requests (while tx is not
>> ready for them). This idea is partly implemented, and it shows
>> satisfactory behavior in the case of an unresponsive tx.
>>  
>> There is one thing that causes trouble: output bufs use a
>> thread-local slab_cache. Currently the obuf's slab_cache belongs to
>> tx, while the proposed changes mean that both tx and iproto have to
>> be able to use obufs, depending on the state and request type. I am
>> currently searching for the best approach here. One option is to use
>> more obufs (4 instead of 2), 2 of them belonging to the tx thread and
>> 2 of them belonging to the iproto thread.
>>  
>> It is also debatable whether connections to iproto in the "solo"
>> state should be closed or should retry their requests after some
>> timeout. I propose to close them, although there is an opinion that
>> this is not the right behavior. I think it is more transparent and
>> understandable for users to reconnect by themselves, especially since
>> this unresponsive tx state might last for quite a long time.
>>  
>> --
>> Ilya Kosarev
>--
>Konstantin Osipov, Moscow, Russia
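
The 4-obuf option from the quoted RFC (2 obufs owned by tx, 2 owned by iproto) could look roughly like the sketch below. It is only an illustration: the types and functions are hypothetical and are not Tarantool's obuf API, and the sketch assumes that the two buffers of each pair alternate (one is filled while the other is flushed), which the RFC does not spell out. The point is that each thread only ever touches buffers backed by its own thread-local slab_cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only; not Tarantool's obuf API. */
enum sketch_owner { SKETCH_OWNER_TX = 0, SKETCH_OWNER_IPROTO = 1 };

struct sketch_obuf {
	enum sketch_owner owner; /* thread whose slab_cache backs this buffer */
	size_t used;             /* bytes written so far */
};

struct sketch_conn {
	struct sketch_obuf obuf[2][2]; /* indexed as [owner][slot] */
	int active[2];                 /* per owner: slot currently being filled */
};

static void
sketch_conn_init(struct sketch_conn *conn)
{
	for (int owner = 0; owner < 2; owner++) {
		conn->active[owner] = 0;
		for (int slot = 0; slot < 2; slot++) {
			conn->obuf[owner][slot].owner = (enum sketch_owner)owner;
			conn->obuf[owner][slot].used = 0;
		}
	}
}

/* Return the buffer that the given thread may write its reply into. */
static struct sketch_obuf *
sketch_conn_obuf(struct sketch_conn *conn, enum sketch_owner writer)
{
	assert(writer == SKETCH_OWNER_TX || writer == SKETCH_OWNER_IPROTO);
	return &conn->obuf[writer][conn->active[writer]];
}

/* Switch the writer's pair to its other buffer, e.g. when a flush starts. */
static void
sketch_conn_rotate(struct sketch_conn *conn, enum sketch_owner writer)
{
	conn->active[writer] ^= 1;
}
```

The writer would be chosen from the current state and the request type: iproto when it answers by itself, tx otherwise. Ordering replies that end up in different pairs is not addressed here.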
 
 
 