<HTML><BODY><div><div>Hi,</div><div>Thanks for your review!<br> </div><div>Sent v6 of the patch (patchset) considering your comments. Some answers below.</div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;">Пятница, 11 декабря 2020, 0:52 +03:00 от Vladislav Shpilevoy <v.shpilevoy@tarantool.org>:<br> <div id=""><div class="js-helper js-readmsg-msg"><div><div id="style_16076371611466119070_BODY">Hi! Thanks for the patch!<br><br>I suggest you to invite Alexander L. to start a second review. We<br>don't have much time until release.<br><br>See 13 comments below.<br><br>On 08.12.2020 23:09, Ilya Kosarev wrote:<br>> On connection, an evio service callback is invoked to accept it.The<br>> next step after acception was to process connection to tx thread<br><br>1. acception -> acceptance.</div></div></div></div></blockquote></div><div>Ok, right, fixed in v6 of the patch.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>Also missing whitespace in the first line after 'it.'.<br><br>> through cbus. This meant that any connection interaction involves<br>> tx thread even before we get to decode what does the client want<br>> from us. Consequently, a number of problems appears. The main one<br>> is that we might get descriptor leak in case of unresponsive<br>> tx thread (for example, when building secondary index).<br><br>2. It wasn't a leak. Socket close events were queued infinitely until<br>tx finished bootstrap. Please, provide more info here. Now it is<br>just misleading.<br><br>On the whole, the commit message is super miser. Imagine how will it<br>look to somebody who didn't do any reviews and didn't discuss it before.<br>The problem is far from trivial, but from the commit message it looks<br>like somebody just forgot a close() somewhere.</div></div></div></div></blockquote></div><div>Ok, the commit message is rewritten in v6.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> This patch allows iproto to accept connection and send greeting by<br>> itself. Thus the connection is being established and stays in<br>> fetch_schema state while tx thread is unresponsive. It solves<br>> descriptors leakage problem.<br>><br>> Closes #3776<br>> ---<br>> src/box/iproto.cc | 147 ++++++++++++++----<br>> test/app/gh-4787-netbox-empty-errmsg.result | 18 ---<br>> test/app/gh-4787-netbox-empty-errmsg.test.lua | 8 -<br>> 3 files changed, 113 insertions(+), 60 deletions(-)<br>><br>> diff --git a/src/box/iproto.cc b/src/box/iproto.cc<br>> index b8f65e5eca..8c122dc58d 100644<br>> --- a/src/box/iproto.cc<br>> +++ b/src/box/iproto.cc<br>> @@ -256,6 +256,9 @@ iproto_msg_delete(struct iproto_msg *msg)<br>> iproto_resume();<br>> }<br>><br>> +static inline void<br>> +iproto_connection_delete(struct iproto_connection *con);<br><br>3. This additional declaration is not necessary.</div></div></div></div></blockquote></div><div>Right, it was extra change, removed in v6.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> +<br>> /**<br>> * A single global queue for all requests in all connections. All<br>> * requests from all connections are processed concurrently.<br>> @@ -453,6 +456,8 @@ struct iproto_connection<br>> * meaningless.<br>> */<br>> size_t parse_size;<br>> + /** Iproto buffer used to send greeting. */<br>> + struct iovec iproto_output_buf;<br>> /**<br>> * Nubmer of active long polling requests that have already<br>> * discarded their arguments in order not to stall other<br>> @@ -566,6 +571,7 @@ iproto_msg_new(struct iproto_connection *con)<br>> return NULL;<br>> }<br>> msg->connection = con;<br>> + msg->close_connection = false;<br><br>4. You don't need this change now. However still looks dangerous. So I would<br>move it to a separate commit.</div></div></div></div></blockquote></div><div>Ok, moved to separate commit.</div><div> </div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> rmean_collect(rmean_net, IPROTO_REQUESTS, 1);<br>> return msg;<br>> }<br>> @@ -1090,6 +1096,51 @@ iproto_connection_on_output(ev_loop *loop, struct ev_io *watcher,<br>> }<br>> }<br>><br>> +static int<br>> +iproto_buf_flush(struct iproto_connection *con)<br>> +{<br>> + int fd = con->output.fd;<br>> + ssize_t nwr = sio_writev(fd, &con->iproto_output_buf, 1);<br>> +<br>> + if (nwr > 0) {<br>> + /* Count statistics */<br>> + rmean_collect(rmean_net, IPROTO_SENT, nwr);<br>> + return 1;<br>> + } else if (nwr < 0 && ! sio_wouldblock(errno)) {<br><br>5. We don't use whitespaces after unary operators. The same in some<br>other places. Please, fix all of them.</div></div></div></div></blockquote></div><div>Right, copied old codestyle, fixed in v6.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> + diag_raise();<br><br>6. Please, don't use exceptions in the new code. I know you<br>copy-pasted it from iproto_flush, but it does not mean it is good.<br><br>> + }<br>> +<br>> + return nwr;<br>> +}<br>> +<br>> +static void<br>> +iproto_connection_on_greeting(ev_loop *loop, struct ev_io *watcher,<br>> + int /* revents */)<br>> +{<br>> + struct iproto_connection *con = (struct iproto_connection *) watcher->data;<br><br>7. This is out of 80 symbols.</div></div></div></div></blockquote></div><div>Right, fixed in v6.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> + try {<br>> + int rc;<br>> + while ((rc = iproto_buf_flush(con)) <= 0) {<br>> + if (rc != 0) {<br>> + ev_io_start(loop, &con->output);<br>> + return;<br>> + }<br>> + }<br>> + if (ev_is_active(&con->output))<br>> + ev_io_stop(con->loop, &con->output);<br><br>8. ev_io_stop() is safe to call on a non-active ev_watcher. You don't<br>need this 'if'.</div></div></div></div></blockquote></div><div>Right, fixed in v6.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> + ev_io_init(&con->output, iproto_connection_on_output,<br>> + con->output.fd, EV_WRITE);<br>> + if (con->input.cb != iproto_connection_on_input)<br>> + ev_io_init(&con->input, iproto_connection_on_input,<br>> + con->input.fd, EV_READ);<br>> + else<br>> + ev_feed_event(loop, &con->input, EV_READ);<br><br>9. This code block above I don't understand. Why is it conditional? Why<br>cb is not equal to iproto_connection_on_input always? And why do you use<br>a callback as a flag that it is time to start reading. I thought<br>you are supposed to use the connection state for that.</div></div></div></div></blockquote></div><div>Right, it is not really needed. Fixed in v6.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> + } catch (Exception *e) {<br>> + e->log();<br>> + con->state = IPROTO_CONNECTION_CLOSED;<br>> + }<br>> +}<br>> +<br>> static struct iproto_connection *<br>> iproto_connection_new(int fd)<br>> {<br>> @@ -1101,8 +1152,8 @@ iproto_connection_new(int fd)<br>> }<br>> con->input.data = con->output.data = con;<br>> con->loop = loop();<br>> - ev_io_init(&con->input, iproto_connection_on_input, fd, EV_READ);<br>> - ev_io_init(&con->output, iproto_connection_on_output, fd, EV_WRITE);<br>> + ev_io_init(&con->input, NULL, fd, EV_READ);<br><br>10. Why did you nullify the callback? Ev_io_init does not lead to event<br>listen start. It only initializes the struct.</div></div></div></div></blockquote></div><div>Well, I was using it as a flag. But it is neither good nor really needed, fixed in v6.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> + ev_io_init(&con->output, iproto_connection_on_greeting, fd, EV_WRITE);<br>> ibuf_create(&con->ibuf[0], cord_slab_cache(), iproto_readahead);<br>> ibuf_create(&con->ibuf[1], cord_slab_cache(), iproto_readahead);<br>> obuf_create(&con->obuf[0], &net_slabc, iproto_readahead);<br>> @@ -1378,13 +1429,14 @@ tx_process_destroy(struct cmsg *m)<br>> {<br>> struct iproto_connection *con =<br>> container_of(m, struct iproto_connection, destroy_msg);<br>> + assert(con->state == IPROTO_CONNECTION_DESTROYED);<br><br>11. Please, introduce the states in a separate commit. I think I<br>already asked for that when we discussed the previous version,<br>but not sure.</div></div></div></div></blockquote></div><div>Nothing is introduced, it is just a comment fixed and assertion added.</div><div>Moved to separate commit.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> if (con->session) {<br>> session_destroy(con->session);<br>> con->session = NULL; /* safety */<br>> }<br>> /*<br>> - * Got to be done in iproto thread since<br>> - * that's where the memory is allocated.<br>> + * obuf is being destroyed in tx thread cause it is where<br>> + * it was allocated.<br>> */<br>> obuf_destroy(&con->obuf[0]);<br>> obuf_destroy(&con->obuf[1]);<br>> @@ -1936,50 +1988,72 @@ net_end_subscribe(struct cmsg *m)<br>> }<br>><br>> /**<br>> - * Handshake a connection: invoke the on-connect trigger<br>> - * and possibly authenticate. Try to send the client an error<br>> - * upon a failure.<br>> + * Handshake a connection: send greeting for it.<br>> + */<br>> +static void<br>> +iproto_process_connect(struct iproto_msg *msg)<br>> +{<br>> + struct iproto_connection *con = msg->connection;<br>> + con->iproto_output_buf.iov_base = static_alloc(IPROTO_GREETING_SIZE);<br><br>12. There is a serious problem in that allocation. Static allocator is<br>very volatile. What is returned from static_alloc(), does not belong to<br>you. This memory is heavily reused, and can be harnessed only for very<br>short living memory if you are sure there are no other static allocs<br>around.<br><br>Here you allocate a long-living object on it. So it will be just overridden<br>in case of any serious load, because all new connections will call this<br>alloc, and the static buffer will be recycled and turned into a pile of<br>garbage.</div></div></div></div></blockquote></div><div>All right, my bad, I decided to put it into struct iproto_connection as char</div><div>buffer as long as it is already 2176, and thus with greeting buf it is 2304.</div><div>struct iproto connection itself is allocated with mempool on cord slab cache.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> + con->iproto_output_buf.iov_len = IPROTO_GREETING_SIZE;<br>> + /*<br>> + * INSTANCE_UUID is guaranteed to be inited before this moment.<br>> + * We start listening either in local_recovery() or bootstrap().<br>> + * The INSTANCE_UUID is ensured to be inited in the beginning of<br>> + * both methods. In case of local_recovery() it is verified that<br>> + * INSTANCE_UUID was read from the snapshot in memtx_engine_new().<br>> + * In bootstrap() INSTANCE_UUID is either taken from the<br>> + * instance_uuid box.cfg{} param or created on the spot.<br>> + */<br>> + struct tt_uuid uuid = INSTANCE_UUID;<br><br>13. Why do you need to copy it on the stack? Why can't you use pointer<br>at INSTANCE_UUID variable?</div></div></div></div></blockquote></div><div>Right, fixed in v6.</div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> + random_bytes(con->salt, IPROTO_SALT_SIZE);<br>> + greeting_encode((char *)con->iproto_output_buf.iov_base,<br>> + tarantool_version_id(), &uuid,<br>> + con->salt, IPROTO_SALT_SIZE);<br>> + assert(evio_has_fd(&con->output));<br>> + ev_feed_event(con->loop, &con->output, EV_WRITE);<br>> +}</div></div></div></div></blockquote><div> </div><div data-signature-widget="container"><div data-signature-widget="content"><div>--<br>Ilya Kosarev</div></div></div><div> </div></div></BODY></HTML>