[Tarantool-patches] [PATCH] socket: fix error while closing socket.tcp_server

Tue May 19 22:52:54 MSK 2020

Hi! Thanks for the review.

On 13.05.2020 17:21, Alexander Turenko wrote:
> TL;DR: I would ask you a case that will show the difference you mention.
> 
> The same, but with more words, below.
> 
> WBR, Alexander Turenko.
> 
I couldn't find a valid way to close the socket outside. So, you
won this fight :). I sent a new version of the patchset.
>>>> @@ -1082,6 +1085,9 @@ local function tcp_server_loop(server, s, addr)
>>>>        fiber.name(format("%s/%s:%s", server.name, addr.host, addr.port), {truncate = true})
>>>>        log.info("started")
>>>>        while socket_readable(s) do
>>>> +        if s._server_socket_closed then
>>>> +            break
>>>> +        end
>>>
>>> Can we use the same check `<s>._gc_socket.fd >= 0` as in check_socket()
>>> (we can wrap it to a function like socket_is_closed())?
>>>
>>> It seems that using the same way for determining whether a socket is
>>> closed owuld simplify the code a bit.
>>>
>>> WBR, Alexander Turenko.
>>>
>> If I understand correctly, in this case we must not only check whether
>> the socket is closed or not, but also check whether it was closed
>> without errors by checking errno. In fact, we come to my first patchset,
> 
> Execution path is the following:
> 
>   | tcp_server()
>   |     local s, addr = tcp_server_bind(...)
>   |     if not s then <...> end
>   |     <s is correct socket here>
>   |     fiber.create(tcp_server_loop, server, s, addr)
>   |         while socket_readable(s) do -- it is yield
>   |             <a check should be added here>
>   |             local sc, from = socket_accept(s)
>   |             <...>
>   |         end
> 
> What can appear with <s> when we're in yield? It can be closed or
> mangled in some bad way by a user (say, `socket._gc_socket = {}`). The
> latter is invalid usage, so we should handle only closing the socket.
> 
> The next question is what condition we should use to check whether a
> socket is closed. Look at check_socket() and socket_close() functions
> (marked your new code with '+' marks):
> 
>   | local function check_socket(socket)
>   |     local gc_socket = type(socket) == 'table' and socket._gc_socket
>   |     if ffi.istype(gc_socket_t, gc_socket) then
>   |         local fd = gc_socket.fd
>   |         if fd >= 0 then
>   |             return fd
>   |         else
>   |             error("attempt to use closed socket")
>   |         end
>   |     else
>   |         local msg = "Usage: socket:method()"
>   |         if socket ~= nil then msg = msg .. ", called with non-socket" end
>   |         error(msg)
>   |     end
>   | end
> 
>   | local gc_socket_sentinel = ffi.new(gc_socket_t, { fd = -1 })
>   |
>   | local function socket_close(socket)
>   |     local fd = check_socket(socket)
>   +     if socket._server_socket_closed ~= nil then
>   +         socket._server_socket_closed = true
>   +     end
>   |     socket._errno = nil
>   |     local r = ffi.C.coio_close(fd)
>   |     -- .fd is const to prevent tampering
>   |     ffi.copy(socket._gc_socket, gc_socket_sentinel, ffi.sizeof(gc_socket_t))
>   |     if r ~= 0 then
>   |         socket._errno = boxerrno()
>   |         return false
>   |     end
>   |     return true
>   | end
> 
> check_socket() use `<s>._gc_socket.fd >= 0` to determine whether socket
> is closed. I don't see a reason to add another way to determine it. But
> I would extract the check into a function:
> 
>   | local function socket_is_closed(socket)
>   |     return socket._gc_socket.fd < 0
>   | end
> 
> If I miss some case, please, show reproducer code, which reveals it.
> 
>> where I disable throwing an exception from the check_socket(). Disabling
> 
> Does not look the same: check_socket() is used everywhere in the API
> functions.
I was thinking of something like that you can return nil in case of
a closed socket and throw exception in case of an invalid argument.
> 
>> throwing an exception is the right way because now our API doesn't
>> correspond to the documentation (I will create a ticket for this). But
>> fixing a socket check leads to big changes in the code that are not
>> related to the problem and will not pass the review. I can wrap the flag
> 
> We discussed Leonid's question externally. It is about raising an error
> for a closed socket handle. I would consider closed socket as an illegal
> parameter and raise an error as for any other illegal parameter error.
> At least until someone will show that a correct code may pass a closed
> socket to a socket module function / a handle method due to some
> external influence (when other end of socket becomes closed or so).
> 
> Our documentation usually miss to mention that an illegal parameter
> error is raised, not returned. This is not good, but it is usual
> convention for our modules in fact.
> 
For closed socket a separate exception used without usual "Usage ...".
I suggest not discussing the problem of API documentation here. I was 
wrong when started.
>> check in a method, but then all sockets should have a flag and it's look
>> like overkill (this will duplicate the functionality of the
>> check_socket() function after fix).
> 
> Since check_socket() change you propose (don't raise an error) is
> questionable I cannot take it as the argument.
Ok.
> 
>> In addition, we don't check EV_ERROR and it seems to me that this can
>> also lead to the fact that the check you proposed may be insufficient.
> 
> To be honest, I don't get how it is related to the question we discuss,
> but anyway.
> 
>  From http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod
> 
>   | EV_ERROR
>   |
>   | An unspecified error has occurred, the watcher has been stopped. This
>   | might happen because the watcher could not be properly started
>   | because libev ran out of memory, a file descriptor was found to be
>   | closed or any other problem. Libev considers these application bugs.
>   |                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   |
>   | You best act on it by reporting the problem and somehow coping with
>   | the watcher being stopped. Note that well-written programs should not
>   | receive an error ever, so when your watcher receives it, this usually
>   | indicates a bug in your program.
>   |
>   | Libev will usually signal a few "dummy" events together with an
>   | error, for example it might indicate that a fd is readable or
>   | writable, and if your callbacks is well-written it can just attempt
>   |                                                        ^^^^^^^^^^^^
>   | the operation and cope with the error from read() or write(). This
>   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   | will not work in multi-threaded programs, though, as the fd could
>   | already be closed and reused for another thing, so beware.
> 
> So, first: it should not appear at all. Second, we can just read() /
> write() and look at the return value + errno.
>