Tarantool development patches archive
 help / color / mirror / Atom feed
From: Alexander Turenko <alexander.turenko@tarantool.org>
To: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: tarantool-patches@freelists.org,
	Yaroslav Dynnikov <yaroslav.dynnikov@tarantool.org>
Subject: Re: [PATCH v2] socket: evaluate buffer size in recv / recvfrom
Date: Tue, 28 Aug 2018 18:01:53 +0300	[thread overview]
Message-ID: <20180828150152.i7hzbgnby5apuydm@tkn_work_nb> (raw)
In-Reply-To: <20180828113214.kg77lrydeg6t467e@esperanza>

I'll send the patch v3 soon. Answered to the comments below.

I have tested the patch with zero length datagrams using patched sendto
method:

diff --git a/src/lua/socket.lua b/src/lua/socket.lua
index 5d98fd2b2..d3ce6c3f5 100644
--- a/src/lua/socket.lua
+++ b/src/lua/socket.lua
@@ -862,7 +862,7 @@ local function socket_sendto(self, host, port, octets, flags)
     end

     self._errno = nil
-    if octets == nil or octets == '' then
+    if octets == nil then -- or octets == '' then
         return true
     end

On the following test cases:

diff --git a/test/app/socket.test.lua b/test/app/socket.test.lua
index 840f042e4..af8495aa2 100644
--- a/test/app/socket.test.lua
+++ b/test/app/socket.test.lua
@@ -761,6 +761,24 @@ from == nil -- expected true
 from
 e == errno.EAGAIN -- expected true
 
+-- case: recv, zero datagram
+sending_socket:sendto('127.0.0.1', receiving_socket_port, '')
+received_message = receiving_socket:recv()
+e = receiving_socket:errno()
+received_message == '' -- expected true
+received_message
+e == 0 -- expected true
+
+-- case: recvfrom, zero datagram
+sending_socket:sendto('127.0.0.1', receiving_socket_port, '')
+received_message, from = receiving_socket:recvfrom()
+e = receiving_socket:errno()
+received_message == '' -- expected true
+received_message
+from ~= nil -- expected true
+from.host == '127.0.0.1' -- expected true
+e == 0 -- expected true
+
 -- case: recv, no datagram, explicit size
 received_message = receiving_socket:recv(512)
 e = receiving_socket:errno()
@@ -777,6 +795,24 @@ from == nil -- expected true
 from
 e == errno.EAGAIN -- expected true
 
+-- case: recv, zero datagram, explicit size
+sending_socket:sendto('127.0.0.1', receiving_socket_port, '')
+received_message = receiving_socket:recv(512)
+e = receiving_socket:errno()
+received_message == '' -- expected true
+received_message
+e == 0 -- expected true
+
+-- case: recvfrom, zero datagram, explicit size
+sending_socket:sendto('127.0.0.1', receiving_socket_port, '')
+received_message, from = receiving_socket:recvfrom(512)
+e = receiving_socket:errno()
+received_message == '' -- expected true
+received_message
+from ~= nil -- expected true
+from.host == '127.0.0.1' -- expected true
+e == 0 -- expected true
+
 message_len = 1025
 message = string.rep('x', message_len)

But I'll not include this in the patch, because it necessary to copy
patched sendto function into the test to allow send zero-length
datagrams (or change the original sendto with possible compatibility
break); Vladimir seems to be against it.

WBR, Alexander Turenko.

On Tue, Aug 28, 2018 at 02:32:14PM +0300, Vladimir Davydov wrote:
> On Tue, Aug 28, 2018 at 12:05:10AM +0300, Alexander Turenko wrote:
> > When size parameter is not passed to socket:recv() or socket:recvfrom()
> > it will call a) or b) on the socket to evaluate size of the buffer to
> > store the receiving datagram. Before this commit a datagram will be
> > truncated to 512 bytes in the case.
> > 
> > a) Linux: recv(fd, NULL, 0 , MSG_TRUNC | MSG_PEEK)
> > b) Mac OS: getsockopt(fd, SOL_SOCKET, SO_NREAD, &val, &len)
> > 
> > It is recommended to set 'size' parameter (size of the input buffer)
> > explicitly based on known message format and known network conditions
> > (say, set it less then MTU to prevent IP fragmentation, which can be
> > inefficient) or pass it from a configuration option of a library / an
> > application. The reason is that explicit buffer size provided allows to
> > avoid extra syscall to evaluate necessary buffer size.
> > 
> > When 'size' parameter is set explicitly for recv / recvfrom on a UDP
> > socket and the next datagram length is larger then the size, the
> > returned message will be truncated to the size provided and the rest of
> > the datagram will be discarded. Of course, the tail will not be
> > discarded in case of a TCP socket and will be available to read by the
> > next recv / recvfrom call.
> > 
> > Fixes #3619.
> > ---
> > 
> > branch: Totktonada/gh-3619-socket-recvfrom-detect-a-cropped-packet
> 
> Please put a hyperlink to the branch, not just its name.
> 
> Also, prefixes "branch:", "issue:", "travis-ci" aren't necessary as it
> is apparent which is which.
> 

Ok.

> > issue: https://github.com/tarantool/tarantool/issues/3619
> > travis-ci: https://travis-ci.org/tarantool/tarantool/builds/421272709
> > 
> > Changes since the previous version of the patch:
> > 
> > - discarded the patch to prevent UDP datagram truncation as Yaroslav
> >   asks;
> > - moved itype saving into the socket constructor;
> > - removed is_udp and dgram_length from the public API (moved to
> >   socket.internal reflecting the msgpackffi approach);
> > - adopted and added test cases for recv / recvfrom from the discarded
> >   patch to hold the current behaviour (truncate with explicit size).
> > 
> >  src/lua/socket.c         |   3 +
> >  src/lua/socket.lua       |  68 +++-
> >  test/app/socket.result   | 672 ++++++++++++++++++++++++++++++++++++++-
> >  test/app/socket.test.lua | 249 ++++++++++++++-
> >  4 files changed, 980 insertions(+), 12 deletions(-)
> > 
> > diff --git a/src/lua/socket.c b/src/lua/socket.c
> > index 1458d3ba4..84711b441 100644
> > --- a/src/lua/socket.c
> > +++ b/src/lua/socket.c
> > @@ -268,6 +268,9 @@ static const struct lbox_sockopt_reg so_opts[] = {
> >  #ifdef SO_PROTOCOL
> >  	{"SO_PROTOCOL",		SO_PROTOCOL,		1,	0, },
> >  #endif
> > +#ifdef SO_NREAD /* available on Darwin */
> > +	{"SO_NREAD",		SO_NREAD,		1,	0, },
> > +#endif
> >  
> >  	{"SO_TYPE",		SO_TYPE,		1,	0, },
> >  
> > diff --git a/src/lua/socket.lua b/src/lua/socket.lua
> > index 06306eae2..ddf7ebd53 100644
> > --- a/src/lua/socket.lua
> > +++ b/src/lua/socket.lua
> > @@ -86,8 +86,12 @@ local function check_socket(socket)
> >      end
> >  end
> >  
> > -local function make_socket(fd)
> > -    local socket = { _gc_socket = ffi.new(gc_socket_t, { fd = fd }) }
> > +local function make_socket(fd, itype)
> > +    assert(itype ~= nil)
> > +    local socket = {
> > +        _gc_socket = ffi.new(gc_socket_t, { fd = fd }),
> > +        itype = itype,
> > +    }
> >      return setmetatable(socket, socket_mt)
> >  end
> >  
> > @@ -590,7 +594,7 @@ local function socket_accept(self)
> >          self._errno = boxerrno()
> >          return nil
> >      end
> > -    local client = make_socket(client_fd)
> > +    local client = make_socket(client_fd, self.itype)
> >      if not client:nonblock(true) then
> >          client:close()
> >          return
> > @@ -770,6 +774,38 @@ local function socket_send(self, octets, flags)
> >      return tonumber(res)
> >  end
> >  
> > +local function socket_is_udp_internal(self)
> > +    assert(self.itype ~= nil)
> > +    return self.itype == internal.SO_TYPE['SOCK_DGRAM']
> 
> Nit: get_ivalue?
> 

Ok.

> Also, I'd inline this function as it's a one-liner, and remove the
> assertion as itype is going to be set anyway.
> 
> > +end
> > +
> > +-- returns nil and sets EAGAIN when there are no datagrams in the receive queue
> > +-- as well as when zero-length datagram is the next one
> > +local function socket_dgram_length_internal(self)
> 
> I don't like the function name. Let's call it simply get_dgram_len?
> 

Ok. Inlined into get_recv_size.

> > +    local fd = check_socket(self)
> > +    if not socket_is_udp_internal(self) then
> > +        error("socket_dgram_length_internal() supports only a UDP socket")
> > +    end
> 
> I'd remove these checks as this function is only called by recv() and
> recvfrom(), where these checks have already been carried out.
> 

Ok.

> > +
> > +    local len
> > +    if jit.os == 'OSX' then
> > +        self._errno = nil
> > +        len = self:getsockopt('SOL_SOCKET', 'SO_NREAD')
> > +    else
> > +        local iflags = get_iflags(internal.SEND_FLAGS, {'MSG_TRUNC', 'MSG_PEEK'})
> > +        assert(iflags ~= nil)
> > +        self._errno = nil
> 
> Nit: 'self._errno = nil' can be moved before 'if'.
> 

Ok.

> > +        len = tonumber(ffi.C.recv(fd, nil, 0, iflags))
> 

It is needed, because lua_tointeger in ffi.C.recvfrom() cannot handle
cdata<int64_t> correctly. It is possible to add a function like
luaL_convertint64 from src/lua/utils.c, but w/o converting a string to
number. I think it is overkill, especially, because tonumber() is
already used here and there this way in socket.lua.

> Nit: tonumber is not necessary here.
> 
> > +    end
> > +
> > +    if len == -1 or len == 0 then
> 
> AFAIU errno isn't set unless the return value is -1. Please check and
> remove 'len == 0' from the expression if so.
> 

You are right. Removed.

> > +        self._errno = boxerrno()
> > +        return nil
> > +    end
> > +
> > +    return len
> > +end
> > +
> >  local function socket_recv(self, size, flags)
> >      local fd = check_socket(self)
> >      local iflags = get_iflags(internal.SEND_FLAGS, flags)
> > @@ -778,7 +814,15 @@ local function socket_recv(self, size, flags)
> >          return nil
> >      end
> >  
> > -    size = size or 512
> > +    if socket_is_udp_internal(self) then
> > +        size = size or socket_dgram_length_internal(self)
> > +        if size == nil then
> > +            return nil
> > +        end
> > +    else
> > +        size = size or 512
> > +    end
> > +
> >      self._errno = nil
> >      local buf = ffi.new("char[?]", size)
> >  
> > @@ -799,7 +843,15 @@ local function socket_recvfrom(self, size, flags)
> >          return nil
> >      end
> >  
> > -    size = size or 512
> > +    if socket_is_udp_internal(self) then
> > +        size = size or socket_dgram_length_internal(self)
> > +        if size == nil then
> > +            return nil
> > +        end
> > +    else
> > +        size = size or 512
> > +    end
> > +
> 
> Don't really like the code duplication. What about factoring it out in a
> helper function? Then you wouldn't need get_dgram_len() at all as you
> could inline it in there. I'm not sure how to call such a function. What
> about get_recv_size(size)?
>

Ok.

> >      self._errno = nil
> >      local res, from = internal.recvfrom(fd, size, iflags)
> >      if res == nil then
> > @@ -860,7 +912,7 @@ local function socket_new(domain, stype, proto)
> >  
> >      local fd = ffi.C.socket(idomain, itype, iproto)
> >      if fd >= 0 then
> > -        local socket = make_socket(fd)
> > +        local socket = make_socket(fd, itype)
> >          if not socket:nonblock(true) then
> >              socket:close()
> >          else
> > @@ -1171,6 +1223,10 @@ socket_mt   = {
> >          name = socket_name;
> >          peer = socket_peer;
> >          fd = socket_fd;
> > +        internal = { -- for testing
> > +            is_udp = socket_is_udp_internal;
> > +            dgram_length = socket_dgram_length_internal;
> > +        };
> 
> I don't think it's worth exporting these functions, even for testing,
> because you cover them for both stream and dgram sockets anyway. I'd
> remove them and corresponding tests.
> 

Ok.

> >      };
> >      __tostring  = function(self)
> >          local fd = check_socket(self)
> 
> > diff --git a/test/app/socket.test.lua b/test/app/socket.test.lua
> > index f6539438b..cef71fc95 100644
> > --- a/test/app/socket.test.lua
> > +++ b/test/app/socket.test.lua
> > @@ -591,6 +593,7 @@ test_run:cmd("setopt delimiter ''");
> >  --------------------------------------------------------------------------------
> >  -- Lua Socket Emulation
> >  --------------------------------------------------------------------------------
> > +-- {{{
> 
> Nit: we never use {{{ }}} for separating test groups in tests so this
> looks inconsistent.

Ok.

      reply	other threads:[~2018-08-28 15:01 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-27 21:05 Alexander Turenko
2018-08-28 11:32 ` Vladimir Davydov
2018-08-28 15:01   ` Alexander Turenko [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180828150152.i7hzbgnby5apuydm@tkn_work_nb \
    --to=alexander.turenko@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --cc=vdavydov.dev@gmail.com \
    --cc=yaroslav.dynnikov@tarantool.org \
    --subject='Re: [PATCH v2] socket: evaluate buffer size in recv / recvfrom' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox