[PATCH v2] socket: evaluate buffer size in recv / recvfrom
Vladimir Davydov
vdavydov.dev at gmail.com
Tue Aug 28 14:32:14 MSK 2018
On Tue, Aug 28, 2018 at 12:05:10AM +0300, Alexander Turenko wrote:
> When size parameter is not passed to socket:recv() or socket:recvfrom()
> it will call a) or b) on the socket to evaluate size of the buffer to
> store the receiving datagram. Before this commit a datagram will be
> truncated to 512 bytes in the case.
>
> a) Linux: recv(fd, NULL, 0 , MSG_TRUNC | MSG_PEEK)
> b) Mac OS: getsockopt(fd, SOL_SOCKET, SO_NREAD, &val, &len)
>
> It is recommended to set 'size' parameter (size of the input buffer)
> explicitly based on known message format and known network conditions
> (say, set it less then MTU to prevent IP fragmentation, which can be
> inefficient) or pass it from a configuration option of a library / an
> application. The reason is that explicit buffer size provided allows to
> avoid extra syscall to evaluate necessary buffer size.
>
> When 'size' parameter is set explicitly for recv / recvfrom on a UDP
> socket and the next datagram length is larger then the size, the
> returned message will be truncated to the size provided and the rest of
> the datagram will be discarded. Of course, the tail will not be
> discarded in case of a TCP socket and will be available to read by the
> next recv / recvfrom call.
>
> Fixes #3619.
> ---
>
> branch: Totktonada/gh-3619-socket-recvfrom-detect-a-cropped-packet
Please put a hyperlink to the branch, not just its name.
Also, prefixes "branch:", "issue:", "travis-ci" aren't necessary as it
is apparent which is which.
> issue: https://github.com/tarantool/tarantool/issues/3619
> travis-ci: https://travis-ci.org/tarantool/tarantool/builds/421272709
>
> Changes since the previous version of the patch:
>
> - discarded the patch to prevent UDP datagram truncation as Yaroslav
> asks;
> - moved itype saving into the socket constructor;
> - removed is_udp and dgram_length from the public API (moved to
> socket.internal reflecting the msgpackffi approach);
> - adopted and added test cases for recv / recvfrom from the discarded
> patch to hold the current behaviour (truncate with explicit size).
>
> src/lua/socket.c | 3 +
> src/lua/socket.lua | 68 +++-
> test/app/socket.result | 672 ++++++++++++++++++++++++++++++++++++++-
> test/app/socket.test.lua | 249 ++++++++++++++-
> 4 files changed, 980 insertions(+), 12 deletions(-)
>
> diff --git a/src/lua/socket.c b/src/lua/socket.c
> index 1458d3ba4..84711b441 100644
> --- a/src/lua/socket.c
> +++ b/src/lua/socket.c
> @@ -268,6 +268,9 @@ static const struct lbox_sockopt_reg so_opts[] = {
> #ifdef SO_PROTOCOL
> {"SO_PROTOCOL", SO_PROTOCOL, 1, 0, },
> #endif
> +#ifdef SO_NREAD /* available on Darwin */
> + {"SO_NREAD", SO_NREAD, 1, 0, },
> +#endif
>
> {"SO_TYPE", SO_TYPE, 1, 0, },
>
> diff --git a/src/lua/socket.lua b/src/lua/socket.lua
> index 06306eae2..ddf7ebd53 100644
> --- a/src/lua/socket.lua
> +++ b/src/lua/socket.lua
> @@ -86,8 +86,12 @@ local function check_socket(socket)
> end
> end
>
> -local function make_socket(fd)
> - local socket = { _gc_socket = ffi.new(gc_socket_t, { fd = fd }) }
> +local function make_socket(fd, itype)
> + assert(itype ~= nil)
> + local socket = {
> + _gc_socket = ffi.new(gc_socket_t, { fd = fd }),
> + itype = itype,
> + }
> return setmetatable(socket, socket_mt)
> end
>
> @@ -590,7 +594,7 @@ local function socket_accept(self)
> self._errno = boxerrno()
> return nil
> end
> - local client = make_socket(client_fd)
> + local client = make_socket(client_fd, self.itype)
> if not client:nonblock(true) then
> client:close()
> return
> @@ -770,6 +774,38 @@ local function socket_send(self, octets, flags)
> return tonumber(res)
> end
>
> +local function socket_is_udp_internal(self)
> + assert(self.itype ~= nil)
> + return self.itype == internal.SO_TYPE['SOCK_DGRAM']
Nit: get_ivalue?
Also, I'd inline this function as it's a one-liner, and remove the
assertion as itype is going to be set anyway.
> +end
> +
> +-- returns nil and sets EAGAIN when there are no datagrams in the receive queue
> +-- as well as when zero-length datagram is the next one
> +local function socket_dgram_length_internal(self)
I don't like the function name. Let's call it simply get_dgram_len?
> + local fd = check_socket(self)
> + if not socket_is_udp_internal(self) then
> + error("socket_dgram_length_internal() supports only a UDP socket")
> + end
I'd remove these checks as this function is only called by recv() and
recvfrom(), where these checks have already been carried out.
> +
> + local len
> + if jit.os == 'OSX' then
> + self._errno = nil
> + len = self:getsockopt('SOL_SOCKET', 'SO_NREAD')
> + else
> + local iflags = get_iflags(internal.SEND_FLAGS, {'MSG_TRUNC', 'MSG_PEEK'})
> + assert(iflags ~= nil)
> + self._errno = nil
Nit: 'self._errno = nil' can be moved before 'if'.
> + len = tonumber(ffi.C.recv(fd, nil, 0, iflags))
Nit: tonumber is not necessary here.
> + end
> +
> + if len == -1 or len == 0 then
AFAIU errno isn't set unless the return value is -1. Please check and
remove 'len == 0' from the expression if so.
> + self._errno = boxerrno()
> + return nil
> + end
> +
> + return len
> +end
> +
> local function socket_recv(self, size, flags)
> local fd = check_socket(self)
> local iflags = get_iflags(internal.SEND_FLAGS, flags)
> @@ -778,7 +814,15 @@ local function socket_recv(self, size, flags)
> return nil
> end
>
> - size = size or 512
> + if socket_is_udp_internal(self) then
> + size = size or socket_dgram_length_internal(self)
> + if size == nil then
> + return nil
> + end
> + else
> + size = size or 512
> + end
> +
> self._errno = nil
> local buf = ffi.new("char[?]", size)
>
> @@ -799,7 +843,15 @@ local function socket_recvfrom(self, size, flags)
> return nil
> end
>
> - size = size or 512
> + if socket_is_udp_internal(self) then
> + size = size or socket_dgram_length_internal(self)
> + if size == nil then
> + return nil
> + end
> + else
> + size = size or 512
> + end
> +
Don't really like the code duplication. What about factoring it out in a
helper function? Then you wouldn't need get_dgram_len() at all as you
could inline it in there. I'm not sure how to call such a function. What
about get_recv_size(size)?
> self._errno = nil
> local res, from = internal.recvfrom(fd, size, iflags)
> if res == nil then
> @@ -860,7 +912,7 @@ local function socket_new(domain, stype, proto)
>
> local fd = ffi.C.socket(idomain, itype, iproto)
> if fd >= 0 then
> - local socket = make_socket(fd)
> + local socket = make_socket(fd, itype)
> if not socket:nonblock(true) then
> socket:close()
> else
> @@ -1171,6 +1223,10 @@ socket_mt = {
> name = socket_name;
> peer = socket_peer;
> fd = socket_fd;
> + internal = { -- for testing
> + is_udp = socket_is_udp_internal;
> + dgram_length = socket_dgram_length_internal;
> + };
I don't think it's worth exporting these functions, even for testing,
because you cover them for both stream and dgram sockets anyway. I'd
remove them and corresponding tests.
> };
> __tostring = function(self)
> local fd = check_socket(self)
> diff --git a/test/app/socket.test.lua b/test/app/socket.test.lua
> index f6539438b..cef71fc95 100644
> --- a/test/app/socket.test.lua
> +++ b/test/app/socket.test.lua
> @@ -591,6 +593,7 @@ test_run:cmd("setopt delimiter ''");
> --------------------------------------------------------------------------------
> -- Lua Socket Emulation
> --------------------------------------------------------------------------------
> +-- {{{
Nit: we never use {{{ }}} for separating test groups in tests so this
looks inconsistent.
More information about the Tarantool-patches
mailing list