From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 9079845C304 for ; Wed, 16 Dec 2020 02:04:43 +0300 (MSK) References: <9781cebc8d5e7b2fbf7a38b692b78047169e08b6.1607815050.git.i.kosarev@tarantool.org> From: Vladislav Shpilevoy Message-ID: <1a6b7eea-323e-a9c7-5a6a-c4b48b54648f@tarantool.org> Date: Wed, 16 Dec 2020 00:04:41 +0100 MIME-Version: 1.0 In-Reply-To: <9781cebc8d5e7b2fbf7a38b692b78047169e08b6.1607815050.git.i.kosarev@tarantool.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH v6 3/3] iproto: move greeting from tx thread to iproto List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Ilya Kosarev , alyapunov@tarantool.org Cc: tarantool-patches@dev.tarantool.org Hi! Thanks for the patch! See 4 comments below. On 13.12.2020 00:38, Ilya Kosarev wrote: > On connection, an evio service callback is invoked to accept it. The > next step after acceptance was to process connection to tx thread > through cbus. This meant that any connection interaction involves tx > thread even before we get to send the greeting to the client. > Consequently, the client might reach the instance with enormous number > of connections, leading to the file descriptors limit exhaustion in > case of unresponsive tx thread (for example, when building secondary > index) and extra tx_process_connects afterwords, even in case the > instance doesn't fail with too many open files error. > This patch allows iproto to accept connection and send greeting by > itself. Thus the connection is being established and stays in > fetch_schema state while tx thread is unresponsive. > > Closes #3776 > --- > src/box/iproto.cc | 118 +++++++++++++----- > test/app/gh-4787-netbox-empty-errmsg.result | 18 --- > test/app/gh-4787-netbox-empty-errmsg.test.lua | 8 -- > 3 files changed, 88 insertions(+), 56 deletions(-) > > diff --git a/src/box/iproto.cc b/src/box/iproto.cc > index f7330af21d..a7da115925 100644 > --- a/src/box/iproto.cc > +++ b/src/box/iproto.cc > @@ -1091,6 +1093,46 @@ iproto_connection_on_output(ev_loop *loop, struct ev_io *watcher, > } > } > > +static int > +iproto_buf_flush(struct iproto_connection *con) 1. The function lacks a comment. It is not so trivial why do we have 2 flush functions. Also maybe worth renaming it to iproto_connection_greeting_flush. Because it is a method of 'iproto_connection' (thus the prefix), and works with the greeting only. > +{ > + struct iovec greeting; > + greeting.iov_base = &con->greeting_buf; > + greeting.iov_len = IPROTO_GREETING_SIZE; > + ssize_t nwr = sio_writev(con->output.fd, &greeting, 1); > + > + if (nwr > 0) { > + /* Count statistics */ > + rmean_collect(rmean_net, IPROTO_SENT, nwr); > + return 1; > + } else if (nwr < 0 && !sio_wouldblock(errno)) { > + return -1; > + } > + > + return 0; > +} > + > +static void > +iproto_connection_on_greeting(ev_loop *loop, struct ev_io *watcher, > + int /* revents */) > +{ > + struct iproto_connection *con = > + (struct iproto_connection *)watcher->data; > + int rc = iproto_buf_flush(con); > + if (rc <= 0) { > + if (rc == 0) { > + ev_io_start(loop, &con->output); > + } else { > + diag_log(); > + con->state = IPROTO_CONNECTION_CLOSED; 2. I don't understand how this part works. You didn't close the socket, and didn't delete it from libev. But IPROTO_CONNECTION_CLOSED clearly states /** * Socket was closed, a notification is sent to the TX * thread to close the session. */ The way you change it here does not match the description. AFAIU, the only legal function able to set this state is iproto_connection_close(), and it should remain so. Moreover, I see this should crash in an assertion in iproto_connection_close(). Because this function will be called by net_finish_connect() due to state being CLOSED (which is weird - why is close() called if the state is already CLOSED?). But inside it has "assert(con->state == IPROTO_CONNECTION_ALIVE)" in case the ev_watcher still has a descriptor - it still has, so it will go there any crash. Did I miss something? > + } > + return; > + } > + ev_io_stop(con->loop, &con->output); > + ev_io_init(&con->output, iproto_connection_on_output, > + con->output.fd, EV_WRITE); > +} 3. What if sio_writev() has written only half of the bytes? You didn't check and immediately proceeded to the normal operation mode. But the greeting wasn't sent yet. Not all. Also you don't propagate iov_base in case only a part of the data was sent. You can easily check it using a temporary error injection. Before sio_writev() try to inject iov_len = max(iov_len, IPROTO_GREETING_SIZE / 2) to force it to work in 2 iterations, and you will see the issue. Partial write is a rare thing, but it is totally real. It is not some kind of an impossible thing only from the books. Even if it is just 128 bytes. > + > static struct iproto_connection * > iproto_connection_new(int fd) > { > @@ -1938,50 +1980,64 @@ net_end_subscribe(struct cmsg *m) > } > > /** > - * Handshake a connection: invoke the on-connect trigger > - * and possibly authenticate. Try to send the client an error > - * upon a failure. > + * Handshake a connection: send greeting for it. > + */ > +static void > +iproto_process_connect(struct iproto_msg *msg) 4. Why do you take iproto_msg object if the only thing you do with it is take the connection object? Wouldn't it be simpler and more understandable, if it would be iproto_connection_start() or something like that? With an iproto_connection * argument. > +{ > + struct iproto_connection *con = msg->connection; > + /* > + * INSTANCE_UUID is guaranteed to be inited before this moment. > + * We start listening either in local_recovery() or bootstrap(). > + * The INSTANCE_UUID is ensured to be inited in the beginning of > + * both methods. In case of local_recovery() it is verified that > + * INSTANCE_UUID was read from the snapshot in memtx_engine_new(). > + * In bootstrap() INSTANCE_UUID is either taken from the > + * instance_uuid box.cfg{} param or created on the spot. > + */ > + random_bytes(con->salt, IPROTO_SALT_SIZE); > + greeting_encode(con->greeting_buf, tarantool_version_id(), > + &INSTANCE_UUID, con->salt, IPROTO_SALT_SIZE); > + assert(evio_has_fd(&con->output)); > + ev_feed_event(con->loop, &con->output, EV_WRITE); > +}