Hi!   Thanks for your review.   Sent v7 considering your comments. Some answers below. >Среда, 16 декабря 2020, 2:04 +03:00 от Vladislav Shpilevoy : >  >Hi! Thanks for the patch! > >See 4 comments below. > >On 13.12.2020 00:38, Ilya Kosarev wrote: >> On connection, an evio service callback is invoked to accept it. The >> next step after acceptance was to process connection to tx thread >> through cbus. This meant that any connection interaction involves tx >> thread even before we get to send the greeting to the client. >> Consequently, the client might reach the instance with enormous number >> of connections, leading to the file descriptors limit exhaustion in >> case of unresponsive tx thread (for example, when building secondary >> index) and extra tx_process_connects afterwords, even in case the >> instance doesn't fail with too many open files error. >> This patch allows iproto to accept connection and send greeting by >> itself. Thus the connection is being established and stays in >> fetch_schema state while tx thread is unresponsive. >> >> Closes #3776 >> --- >> src/box/iproto.cc | 118 +++++++++++++----- >> test/app/gh-4787-netbox-empty-errmsg.result | 18 --- >> test/app/gh-4787-netbox-empty-errmsg.test.lua | 8 -- >> 3 files changed, 88 insertions(+), 56 deletions(-) >> >> diff --git a/src/box/iproto.cc b/src/box/iproto.cc >> index f7330af21d..a7da115925 100644 >> --- a/src/box/iproto.cc >> +++ b/src/box/iproto.cc >> @@ -1091,6 +1093,46 @@ iproto_connection_on_output(ev_loop *loop, struct ev_io *watcher, >> } >> } >> >> +static int >> +iproto_buf_flush(struct iproto_connection *con) > >1. The function lacks a comment. It is not so trivial why do we >have 2 flush functions. Also maybe worth renaming it to >iproto_connection_greeting_flush. Because it is a method of >'iproto_connection' (thus the prefix), and works with the >greeting only. Right, fixed in v7 of the patch. > >> +{ >> + struct iovec greeting; >> + greeting.iov_base = &con->greeting_buf; >> + greeting.iov_len = IPROTO_GREETING_SIZE; >> + ssize_t nwr = sio_writev(con->output.fd, &greeting, 1); >> + >> + if (nwr > 0) { >> + /* Count statistics */ >> + rmean_collect(rmean_net, IPROTO_SENT, nwr); >> + return 1; >> + } else if (nwr < 0 && !sio_wouldblock(errno)) { >> + return -1; >> + } >> + >> + return 0; >> +} >> + >> +static void >> +iproto_connection_on_greeting(ev_loop *loop, struct ev_io *watcher, >> + int /* revents */) >> +{ >> + struct iproto_connection *con = >> + (struct iproto_connection *)watcher->data; >> + int rc = iproto_buf_flush(con); >> + if (rc <= 0) { >> + if (rc == 0) { >> + ev_io_start(loop, &con->output); >> + } else { >> + diag_log(); >> + con->state = IPROTO_CONNECTION_CLOSED; > >2. I don't understand how this part works. You didn't close the >socket, and didn't delete it from libev. But IPROTO_CONNECTION_CLOSED >clearly states > >/** >* Socket was closed, a notification is sent to the TX >* thread to close the session. >*/ > >The way you change it here does not match the description. AFAIU, >the only legal function able to set this state is >iproto_connection_close(), and it should remain so. > >Moreover, I see this should crash in an assertion in >iproto_connection_close(). Because this function will be called by >net_finish_connect() due to state being CLOSED (which is weird - >why is close() called if the state is already CLOSED?). But inside >it has "assert(con->state == IPROTO_CONNECTION_ALIVE)" in case the >ev_watcher still has a descriptor - it still has, so it will go there >any crash. > >Did I miss something? Yes, my bad. It definitely needs new state. There was not going to be an assertion fail as long as i switch it back, but it is still not good. Fixed in v7. > >> + } >> + return; >> + } >> + ev_io_stop(con->loop, &con->output); >> + ev_io_init(&con->output, iproto_connection_on_output, >> + con->output.fd, EV_WRITE); >> +} > >3. What if sio_writev() has written only half of the bytes? >You didn't check and immediately proceeded to the normal >operation mode. But the greeting wasn't sent yet. Not all. > >Also you don't propagate iov_base in case only a part of >the data was sent. > >You can easily check it using a temporary error injection. >Before sio_writev() try to inject iov_len = >max(iov_len, IPROTO_GREETING_SIZE / 2) to force it to work >in 2 iterations, and you will see the issue. > >Partial write is a rare thing, but it is totally real. It >is not some kind of an impossible thing only from the books. >Even if it is just 128 bytes. Fixed in v7. > >> + >> static struct iproto_connection * >> iproto_connection_new(int fd) >> { >> @@ -1938,50 +1980,64 @@ net_end_subscribe(struct cmsg *m) >> } >> >> /** >> - * Handshake a connection: invoke the on-connect trigger >> - * and possibly authenticate. Try to send the client an error >> - * upon a failure. >> + * Handshake a connection: send greeting for it. >> + */ >> +static void >> +iproto_process_connect(struct iproto_msg *msg) > >4. Why do you take iproto_msg object if the only thing you do >with it is take the connection object? Wouldn't it be simpler >and more understandable, if it would be iproto_connection_start() >or something like that? With an iproto_connection * argument. Fixed in v7. > >> +{ >> + struct iproto_connection *con = msg->connection; >> + /* >> + * INSTANCE_UUID is guaranteed to be inited before this moment. >> + * We start listening either in local_recovery() or bootstrap(). >> + * The INSTANCE_UUID is ensured to be inited in the beginning of >> + * both methods. In case of local_recovery() it is verified that >> + * INSTANCE_UUID was read from the snapshot in memtx_engine_new(). >> + * In bootstrap() INSTANCE_UUID is either taken from the >> + * instance_uuid box.cfg{} param or created on the spot. >> + */ >> + random_bytes(con->salt, IPROTO_SALT_SIZE); >> + greeting_encode(con->greeting_buf, tarantool_version_id(), >> + &INSTANCE_UUID, con->salt, IPROTO_SALT_SIZE); >> + assert(evio_has_fd(&con->output)); >> + ev_feed_event(con->loop, &con->output, EV_WRITE); >> +}     -- Ilya Kosarev