Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10
@ 2021-03-24 21:24 Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 01/15] fio: don't use shared buffer in pread() Vladislav Shpilevoy via Tarantool-patches
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

The patchset is a ported fix for 5632 from the master branch.

No big changes except these:

- 1.10 didn't have static_alloc in Lua directly, but still the static buffer
  could be returned indirectly from tt_uuid_str() for example;

- 1.10 didn't have Lua registers, but had some global variables created via
  ffi.new(). For example, in uuid.lua, msgpackffi.lua;

- sio address formatting function was formatted differently.

Nonetheless the commit messages are left intact. They still mention
static_alloc and Lua registers.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5632-global-buf-crash-1.10
Issue: https://github.com/tarantool/tarantool/issues/5632

Vladislav Shpilevoy (15):
  fio: don't use shared buffer in pread()
  test: don't use IBUF_SHARED in the tests
  tuple: pass global ibuf explicitly where possible
  iconv: take errno before reseting the context
  cord_buf: introduce cord_buf API
  cord_buf: introduce ownership management
  buffer: implement ffi stash
  uuid: replace static_alloc with ffi stash
  uuid: drop tt_uuid_str() from Lua
  uri: replace static_alloc with ffi stash and ibuf
  lua: use lua_pushfstring() instead of tt_sprintf()
  sio: rework sio_strfaddr()
  sio: increase SERVICE_NAME_MAXLEN size
  sio: introduce and use sio_snprintf()
  buffer: remove Lua registers

 changelogs/unreleased/fix-ibuf-static.md   |   7 +
 extra/exports                              |   5 +-
 src/CMakeLists.txt                         |   1 +
 src/box/lua/schema.lua                     |  47 ++++--
 src/box/lua/session.c                      |   7 +-
 src/box/lua/space.cc                       |   6 +-
 src/box/lua/tuple.c                        |  80 +++++-----
 src/box/lua/tuple.lua                      |  14 +-
 src/cord_buf.c                             | 165 +++++++++++++++++++++
 src/cord_buf.h                             |  49 ++++++
 src/lua/buffer.lua                         | 101 ++++++++++++-
 src/lua/fio.lua                            |   3 +-
 src/lua/iconv.lua                          |  27 ++--
 src/lua/init.c                             |   3 -
 src/lua/msgpack.c                          |   6 +-
 src/lua/msgpackffi.lua                     |  46 +++---
 src/lua/socket.lua                         |  14 +-
 src/lua/uri.lua                            |  22 ++-
 src/lua/utf8.c                             |  21 +--
 src/lua/utils.h                            |   1 -
 src/lua/uuid.lua                           |  37 +++--
 src/sio.cc                                 |  67 +++++----
 src/sio.h                                  |  16 +-
 test/app-tap/buffer.test.lua               |  59 ++++++++
 test/app-tap/gh-5632-gc-buf-reuse.test.lua | 151 +++++++++++++++++++
 test/app-tap/module_api.test.lua           |   3 +-
 test/app/fio.result                        | 103 ++++++++-----
 test/app/fio.test.lua                      |  68 +++++----
 test/app/msgpack.result                    |   7 +-
 test/app/msgpack.test.lua                  |   5 +-
 test/app/uuid.result                       |   2 +-
 test/unit/luaT_tuple_new.c                 |   4 -
 32 files changed, 901 insertions(+), 246 deletions(-)
 create mode 100644 changelogs/unreleased/fix-ibuf-static.md
 create mode 100644 src/cord_buf.c
 create mode 100644 src/cord_buf.h
 create mode 100755 test/app-tap/buffer.test.lua
 create mode 100755 test/app-tap/gh-5632-gc-buf-reuse.test.lua

-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 01/15] fio: don't use shared buffer in pread()
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 10/15] uri: replace static_alloc with ffi stash and ibuf Vladislav Shpilevoy via Tarantool-patches
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

fio:pread() used buffer.IBUF_SHARED, which might be reused after a
yield. As a result, if pread() was called from 2 different fibers
or in parallel with something else using IBUF_SHARED, it would
turn the buffer into garbage for all parallel usages.

The same problem existed for read(), and was fixed in
c7c24f841322528c17714482d150b769d0afcddb ("fio: Fix race condition
in fio.read"). But apparently pread() was missed.

What is worse, the original commit's test passed even without the
fix from that commit. Because it didn't check the results of
read()s called from 2 fibers.

The patch fixes pread() and adds a test covering both read() and
pread(). The old test from the original commit is dropped.

Follow up #3187

(cherry picked from commit 24d86294b693867afd02a8847649026ce6b6fb4f)
---
 src/lua/fio.lua       |   3 +-
 test/app/fio.result   | 103 ++++++++++++++++++++++++++----------------
 test/app/fio.test.lua |  68 ++++++++++++++++------------
 3 files changed, 103 insertions(+), 71 deletions(-)

diff --git a/src/lua/fio.lua b/src/lua/fio.lua
index 748efed14..a47bf2192 100644
--- a/src/lua/fio.lua
+++ b/src/lua/fio.lua
@@ -95,8 +95,7 @@ fio_methods.pread = function(self, buf, size, offset)
     if not ffi.istype(const_char_ptr_t, buf) then
         offset = size
         size = buf
-        tmpbuf = buffer.IBUF_SHARED
-        tmpbuf:reset()
+        tmpbuf = buffer.ibuf()
         buf = tmpbuf:reserve(size)
     end
     local res, err = internal.pread(self.fh, buf, size, offset)
diff --git a/test/app/fio.result b/test/app/fio.result
index f5820c199..03134b6af 100644
--- a/test/app/fio.result
+++ b/test/app/fio.result
@@ -7,6 +7,9 @@ ffi = require 'ffi'
 buffer = require 'buffer'
 ---
 ...
+fiber = require('fiber')
+---
+...
 test_run = require('test_run').new()
 ---
 ...
@@ -1179,90 +1182,110 @@ fio.unlink(tmpfile)
 ---
 - true
 ...
-tmp1 = fio.pathjoin(tmpdir, "tmp1")
+fio.rmdir(tmpdir)
 ---
+- true
 ...
-tmp2= fio.pathjoin(tmpdir, "tmp2")
+--
+-- gh-4439: mktree error handling fix - creation of existing file/directory
+--
+fio.mktree('/dev/null')
 ---
+- false
+- 'Error creating directory /dev/null: File exists'
 ...
-test_run:cmd("setopt delimiter ';'")
+fio.mktree('/dev/null/dir')
 ---
-- true
+- false
+- 'Error creating directory /dev/null: File exists'
 ...
-function write_file(name, odd)
-    local fh = fio.open(name, { 'O_RDWR', 'O_TRUNC', 'O_CREAT' }, 0777)
-    if odd then
-        fh:write(string.rep('1', 100))
-    else
-        fh:write(string.rep('2', 100))
-    end
-    fh:write(name)
-    fh:seek(0)
-    return fh
-end;
+--
+-- read() and pread() should not use a shared buffer so as not to clash with
+-- other fibers during yield.
+--
+rights = tonumber('0777', 8)
 ---
 ...
-test_run:cmd("setopt delimiter ''");
+flags = {'O_RDWR', 'O_TRUNC', 'O_CREAT'}
 ---
-- true
 ...
-fh1 = write_file(tmp1)
+tmpdir = fio.tempdir()
 ---
 ...
-fh2 = write_file(tmp2)
+file1 = fio.pathjoin(tmpdir, 'file1')
 ---
 ...
-fiber = require('fiber')
+file2 = fio.pathjoin(tmpdir, 'file2')
 ---
 ...
-digest = require('digest')
+fd1 = fio.open(file1, flags, rights)
 ---
 ...
-str = fh1:read()
+fd2 = fio.open(file2, flags, rights)
 ---
 ...
-fh1:seek(0)
+fd1:write('1'), fd1:seek(0)
 ---
+- true
 - 0
 ...
-hash = digest.crc32(str)
+fd2:write('2'), fd2:seek(0)
 ---
+- true
+- 0
 ...
-ch = fiber.channel(1)
+res1, res2 = nil
 ---
 ...
-f1 = fiber.create(function() str = fh1:read() ch:put(digest.crc32(str)) end)
+do                                                                              \
+    fiber.create(function() res1 = fd1:read() end)                              \
+    fiber.create(function() res2 = fd2:read() end)                              \
+end
 ---
 ...
-f2 = fiber.create(function() str = fh2:read() end)
+_ = test_run:wait_cond(function() return res1 and res2 end)
 ---
 ...
-ch:get() == hash
+assert(res1 == '1')
 ---
 - true
 ...
-fio.unlink(tmp1)
+assert(res2 == '2')
 ---
 - true
 ...
-fio.unlink(tmp2)
+--
+-- The same with pread().
+--
+res1, res2 = nil
+---
+...
+do                                                                              \
+    fiber.create(function() res1 = fd1:pread(1, 0) end)                         \
+    fiber.create(function() res2 = fd2:pread(1, 0) end)                         \
+end
+---
+...
+_ = test_run:wait_cond(function() return res1 and res2 end)
+---
+...
+assert(res1 == '1')
 ---
 - true
 ...
-fio.rmdir(tmpdir)
+assert(res2 == '2')
 ---
 - true
 ...
---
--- gh-4439: mktree error handling fix - creation of existing file/directory
---
-fio.mktree('/dev/null')
+fd1:close()
 ---
-- false
-- 'Error creating directory /dev/null: File exists'
+- true
 ...
-fio.mktree('/dev/null/dir')
+fd2:close()
 ---
-- false
-- 'Error creating directory /dev/null: File exists'
+- true
+...
+fio.rmtree(tmpdir)
+---
+- true
 ...
diff --git a/test/app/fio.test.lua b/test/app/fio.test.lua
index b2c53e3e3..4e0a2e87f 100644
--- a/test/app/fio.test.lua
+++ b/test/app/fio.test.lua
@@ -1,6 +1,7 @@
 fio = require 'fio'
 ffi = require 'ffi'
 buffer = require 'buffer'
+fiber = require('fiber')
 test_run = require('test_run').new()
 -- umask
 
@@ -381,38 +382,47 @@ fio.path.lexists(link)
 
 fio.unlink(link)
 fio.unlink(tmpfile)
-tmp1 = fio.pathjoin(tmpdir, "tmp1")
-tmp2= fio.pathjoin(tmpdir, "tmp2")
-test_run:cmd("setopt delimiter ';'")
-function write_file(name, odd)
-    local fh = fio.open(name, { 'O_RDWR', 'O_TRUNC', 'O_CREAT' }, 0777)
-    if odd then
-        fh:write(string.rep('1', 100))
-    else
-        fh:write(string.rep('2', 100))
-    end
-    fh:write(name)
-    fh:seek(0)
-    return fh
-end;
-test_run:cmd("setopt delimiter ''");
-fh1 = write_file(tmp1)
-fh2 = write_file(tmp2)
-fiber = require('fiber')
-digest = require('digest')
-str = fh1:read()
-fh1:seek(0)
-hash = digest.crc32(str)
-ch = fiber.channel(1)
-f1 = fiber.create(function() str = fh1:read() ch:put(digest.crc32(str)) end)
-f2 = fiber.create(function() str = fh2:read() end)
-ch:get() == hash
-
-fio.unlink(tmp1)
-fio.unlink(tmp2)
 fio.rmdir(tmpdir)
 --
 -- gh-4439: mktree error handling fix - creation of existing file/directory
 --
 fio.mktree('/dev/null')
 fio.mktree('/dev/null/dir')
+
+--
+-- read() and pread() should not use a shared buffer so as not to clash with
+-- other fibers during yield.
+--
+rights = tonumber('0777', 8)
+flags = {'O_RDWR', 'O_TRUNC', 'O_CREAT'}
+tmpdir = fio.tempdir()
+file1 = fio.pathjoin(tmpdir, 'file1')
+file2 = fio.pathjoin(tmpdir, 'file2')
+fd1 = fio.open(file1, flags, rights)
+fd2 = fio.open(file2, flags, rights)
+fd1:write('1'), fd1:seek(0)
+fd2:write('2'), fd2:seek(0)
+
+res1, res2 = nil
+do                                                                              \
+    fiber.create(function() res1 = fd1:read() end)                              \
+    fiber.create(function() res2 = fd2:read() end)                              \
+end
+_ = test_run:wait_cond(function() return res1 and res2 end)
+assert(res1 == '1')
+assert(res2 == '2')
+--
+-- The same with pread().
+--
+res1, res2 = nil
+do                                                                              \
+    fiber.create(function() res1 = fd1:pread(1, 0) end)                         \
+    fiber.create(function() res2 = fd2:pread(1, 0) end)                         \
+end
+_ = test_run:wait_cond(function() return res1 and res2 end)
+assert(res1 == '1')
+assert(res2 == '2')
+
+fd1:close()
+fd2:close()
+fio.rmtree(tmpdir)
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 10/15] uri: replace static_alloc with ffi stash and ibuf
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 01/15] fio: don't use shared buffer in pread() Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 11/15] lua: use lua_pushfstring() instead of tt_sprintf() Vladislav Shpilevoy via Tarantool-patches
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

static_alloc() appears not to be safe to use in Lua, because it
does not provide any ownership protection for the returned values.

The problem appears when something is allocated, then Lua GC
starts, and some __gc handlers might also use static_alloc(). In
Lua and in C - both lead to the buffer being corrupted in its
original usage place.

The patch is a part of activity of getting rid of static_alloc()
in Lua. It removes it from uri Lua module and makes it use the
new FFI stash feature, which helps to cache frequently used and
heavy to allocate FFI values.

In one place static_alloc() was used for an actual buffer - it was
replaced with cord_ibuf which is equally fast when preallocated.

ffi.new() for temporary struct uri is not used, because

- It produces a new GC object;

- ffi.new('struct uri') costs around 20ns while FFI stash
  costs around 0.8ns. The hack with 'struct uri[1]' does not help
  because size of uri is > 128 bytes;

- Without JIT ffi.new() costs about the same as the stash, not
  better as well;

The patch makes uri perf a bit better in the places where
static_alloc() was used, because its cost was around 7ns for one
allocation.

(cherry picked from commit 7175b43e842fe42a04bba2b88006732f20bd7552)
---
 src/lua/uri.lua                            | 22 ++++++--
 test/app-tap/gh-5632-gc-buf-reuse.test.lua | 60 +++++++++++++++++++++-
 2 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/src/lua/uri.lua b/src/lua/uri.lua
index d2946cd2d..98f4e02ec 100644
--- a/src/lua/uri.lua
+++ b/src/lua/uri.lua
@@ -1,6 +1,7 @@
 -- uri.lua (internal file)
 
 local ffi = require('ffi')
+local buffer = require('buffer')
 
 ffi.cdef[[
 struct uri {
@@ -31,14 +32,19 @@ uri_format(char *str, size_t len, struct uri *uri, bool write_password);
 ]]
 
 local builtin = ffi.C;
-
-local uribuf = ffi.new('struct uri')
+local uri_stash = buffer.ffi_stash_new('struct uri')
+local uri_stash_take = uri_stash.take
+local uri_stash_put = uri_stash.put
+local cord_ibuf_take = buffer.internal.cord_ibuf_take
+local cord_ibuf_put = buffer.internal.cord_ibuf_put
 
 local function parse(str)
     if str == nil then
         error("Usage: uri.parse(string)")
     end
+    local uribuf = uri_stash_take()
     if builtin.uri_parse(uribuf, str) ~= 0 then
+        uri_stash_put(uribuf)
         return nil
     end
     local result = {}
@@ -55,10 +61,12 @@ local function parse(str)
     elseif uribuf.host_hint == 3 then
         result.unix = result.service
     end
+    uri_stash_put(uribuf)
     return result
 end
 
 local function format(uri, write_password)
+    local uribuf = uri_stash_take()
     uribuf.scheme = uri.scheme
     uribuf.scheme_len = string.len(uri.scheme or '')
     uribuf.login = uri.login
@@ -75,9 +83,13 @@ local function format(uri, write_password)
     uribuf.query_len = string.len(uri.query or '')
     uribuf.fragment = uri.fragment
     uribuf.fragment_len = string.len(uri.fragment or '')
-    local str = ffi.new('char[1024]')
-    builtin.uri_format(str, 1024, uribuf, write_password and 1 or 0)
-    return ffi.string(str)
+    local ibuf = cord_ibuf_take()
+    local str = ibuf:alloc(1024)
+    local len = builtin.uri_format(str, 1024, uribuf, write_password and 1 or 0)
+    uri_stash_put(uribuf)
+    str = ffi.string(str, len)
+    cord_ibuf_put(ibuf)
+    return str
 end
 
 return {
diff --git a/test/app-tap/gh-5632-gc-buf-reuse.test.lua b/test/app-tap/gh-5632-gc-buf-reuse.test.lua
index b09b1bf3e..81dafd36e 100755
--- a/test/app-tap/gh-5632-gc-buf-reuse.test.lua
+++ b/test/app-tap/gh-5632-gc-buf-reuse.test.lua
@@ -10,6 +10,7 @@
 local tap = require('tap')
 local ffi = require('ffi')
 local uuid = require('uuid')
+local uri = require('uri')
 
 local function test_uuid(test)
     test:plan(1)
@@ -42,8 +43,65 @@ local function test_uuid(test)
     test:ok(is_success, 'uuid in gc')
 end
 
+local function test_uri(test)
+    test:plan(1)
+
+    local gc_count = 100
+    local iter_count = 1000
+    local port = 1
+    local ip = 1
+    local login = 1
+    local pass = 1
+    local is_success = true
+
+    local function uri_parse()
+        local loc_ip = ip
+        local loc_port = port
+        local loc_pass = pass
+        local loc_login = login
+
+        ip = ip + 1
+        port = port + 1
+        pass = pass + 1
+        login = login + 1
+        if port > 60000 then
+            port = 1
+        end
+        if ip > 255 then
+            ip = 1
+        end
+
+        loc_ip = string.format('127.0.0.%s', loc_ip)
+        loc_port = tostring(loc_port)
+        loc_pass = string.format('password%s', loc_pass)
+        loc_login = string.format('login%s', loc_login)
+        local host = string.format('%s:%s@%s:%s', loc_login, loc_pass,
+                                   loc_ip, loc_port)
+        local u = uri.parse(host)
+        if u.host ~= loc_ip or u.login ~= loc_login or u.service ~= loc_port or
+           u.password ~= loc_pass then
+            is_success = false
+            assert(false)
+        end
+    end
+
+    local function create_gc()
+        for _ = 1, gc_count do
+            ffi.gc(ffi.new('char[1]'), uri_parse)
+        end
+    end
+
+    for _ = 1, iter_count do
+        create_gc()
+        uri_parse()
+    end
+
+    test:ok(is_success, 'uri in gc')
+end
+
 local test = tap.test('gh-5632-gc-buf-reuse')
-test:plan(1)
+test:plan(2)
 test:test('uuid in __gc', test_uuid)
+test:test('uri in __gc', test_uri)
 
 os.exit(test:check() and 0 or 1)
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 11/15] lua: use lua_pushfstring() instead of tt_sprintf()
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 01/15] fio: don't use shared buffer in pread() Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 10/15] uri: replace static_alloc with ffi stash and ibuf Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 12/15] sio: rework sio_strfaddr() Vladislav Shpilevoy via Tarantool-patches
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

In a few places to push a formatted string was used 2 calls:
tt_sprintf() + lua_pushstring(). It wasn't necessary because Lua
API has lua_pushfstring() with a big enough subset of printf
format features.

But more importantly - it was a bug. lua_pushstring() is a GC
point. Before copying the passed string it tries to invoke Lua GC,
which might invoke a __gc handler for some cdata, where static
alloc might be used, and it can rewrite the string passed to
lua_pushstring() in the beginning of the stack.

Part of #5632

(cherry picked from commit b3872a38b1e4be6a84d0621c833a196baf70284f)
---
 src/box/lua/space.cc | 6 ++----
 src/lua/utf8.c       | 5 ++---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/src/box/lua/space.cc b/src/box/lua/space.cc
index 9ea0d6f7e..962796e6a 100644
--- a/src/box/lua/space.cc
+++ b/src/box/lua/space.cc
@@ -480,8 +480,7 @@ lbox_space_frommap(struct lua_State *L)
 	space = space_by_id(id);
 	if (space == NULL) {
 		lua_pushnil(L);
-		lua_pushstring(L, tt_sprintf("Space with id '%d' "\
-					     "doesn't exist", id));
+		lua_pushfstring(L, "Space with id '%d' doesn't exist", id);
 		return 2;
 	}
 	assert(space->format != NULL);
@@ -498,8 +497,7 @@ lbox_space_frommap(struct lua_State *L)
 		if (tuple_fieldno_by_name(dict, key, key_len, key_hash,
 					  &fieldno)) {
 			lua_pushnil(L);
-			lua_pushstring(L, tt_sprintf("Unknown field '%s'",
-						     key));
+			lua_pushfstring(L, "Unknown field '%s'", key);
 			return 2;
 		}
 		lua_rawseti(L, -3, fieldno+1);
diff --git a/src/lua/utf8.c b/src/lua/utf8.c
index 0d9c49a8b..53cdcb023 100644
--- a/src/lua/utf8.c
+++ b/src/lua/utf8.c
@@ -83,9 +83,8 @@ utf8_str_to_case(struct lua_State *L, const char *src, int src_bsize,
 		} else {
 			cord_ibuf_put(ibuf);
 			lua_pushnil(L);
-			lua_pushstring(L, tt_sprintf("error during ICU case "\
-						     "transform: %s",
-						     u_errorName(err)));
+			lua_pushfstring(L, "error during ICU case "
+					"transform: %s", u_errorName(err));
 			return 2;
 		}
 		/*
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 12/15] sio: rework sio_strfaddr()
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (2 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 11/15] lua: use lua_pushfstring() instead of tt_sprintf() Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 13/15] sio: increase SERVICE_NAME_MAXLEN size Vladislav Shpilevoy via Tarantool-patches
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

The function was overcomplicated, and made it harder to update it
in the next patches with functional changes.

The main source of the complication was usage of both inet_ntoa()
and getnameinfo(). The latter is more universal, it can cover the
case of the former.

The patch makes it use only getnameinfo() for IP addresses
regardless of v4 or v6.

Needed for #5632

(cherry picked from commit 441cb814d4cdd1ec4999b812d72e04cb5528687f)
---
 src/sio.cc | 39 ++++++++++++++++-----------------------
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/src/sio.cc b/src/sio.cc
index 3becd1ee2..0ba19346c 100644
--- a/src/sio.cc
+++ b/src/sio.cc
@@ -514,29 +514,22 @@ const char *
 sio_strfaddr(struct sockaddr *addr, socklen_t addrlen)
 {
 	static __thread char name[NI_MAXHOST + _POSIX_PATH_MAX + 2];
-	switch(addr->sa_family) {
-		case AF_UNIX:
-			if (addrlen >= sizeof(sockaddr_un)) {
-				snprintf(name, sizeof(name), "unix/:%s",
-					((struct sockaddr_un *)addr)->sun_path);
-			} else {
-				snprintf(name, sizeof(name),
-					 "unix/:(socket)");
-			}
-			break;
-		default: {
-			char host[NI_MAXHOST], serv[NI_MAXSERV];
-			if (getnameinfo(addr, addrlen, host, sizeof(host),
-					serv, sizeof(serv),
-					NI_NUMERICHOST | NI_NUMERICSERV) == 0) {
-				snprintf(name, sizeof(name),
-					 addr->sa_family == AF_INET
-					 ? "%s:%s" : "[%s]:%s", host, serv);
-			} else {
-				snprintf(name, sizeof(name), "(host):(port)");
-			}
-			break;
-		}
+	if (addr->sa_family == AF_UNIX) {
+		struct sockaddr_un *u = (struct sockaddr_un *)addr;
+		if (addrlen >= sizeof(*u))
+			snprintf(name, sizeof(name), "unix/:%s", u->sun_path);
+		else
+			snprintf(name, sizeof(name), "unix/:(socket)");
+	} else {
+		char host[NI_MAXHOST], serv[NI_MAXSERV];
+		int flags = NI_NUMERICHOST | NI_NUMERICSERV;
+		if (getnameinfo(addr, addrlen, host, sizeof(host),
+				serv, sizeof(serv), flags) != 0)
+			snprintf(name, sizeof(name), "(host):(port)");
+		else if (addr->sa_family == AF_INET)
+			snprintf(name, sizeof(name), "%s:%s", host, serv);
+		else
+			snprintf(name, sizeof(name), "[%s]:%s", host, serv);
 	}
 	return name;
 }
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 13/15] sio: increase SERVICE_NAME_MAXLEN size
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (3 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 12/15] sio: rework sio_strfaddr() Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 14/15] sio: introduce and use sio_snprintf() Vladislav Shpilevoy via Tarantool-patches
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

It was 32, and couldn't fit long IPv6 and Unix socket addresses.

The patch makes it 200 so now it fits any supported addresses
family used in the code.

Having SERVICE_NAME_MAXLEN valid is necessary to be able to save
a complete address string on the stack in the places where the
static buffer returned by sio_strfaddr() can't be used safely. For
instance, in the code working with Lua due to Lua GC which might
be invoked any moment and in a __gc handler could overwrite the
static buffer.

Needed for #5632

(cherry picked from commit 6b331f7a78aa409ad68ffc8ceb3deb3733a32473)
---
 src/sio.cc |  2 +-
 src/sio.h  | 11 ++++++++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/sio.cc b/src/sio.cc
index 0ba19346c..a425b8383 100644
--- a/src/sio.cc
+++ b/src/sio.cc
@@ -81,7 +81,7 @@ sio_socketname(int fd)
 {
 	/* Preserve errno */
 	int save_errno = errno;
-	static __thread char name[2 * SERVICE_NAME_MAXLEN];
+	static __thread char name[SERVICE_NAME_MAXLEN];
 	int rc = sio_socketname_to_buffer(fd, name, sizeof(name));
 	/*
 	 * Could fail only because of a bad format in snprintf, but it is not
diff --git a/src/sio.h b/src/sio.h
index f728af547..3b91f7eee 100644
--- a/src/sio.h
+++ b/src/sio.h
@@ -46,7 +46,16 @@
 extern "C" {
 #endif /* defined(__cplusplus) */
 
-enum { SERVICE_NAME_MAXLEN = 32 };
+enum {
+	/**
+	 * - Unix socket path is 108 bytes max;
+	 * - IP(v4, v6) max string len is 45;
+	 *
+	 * Max result is rounded up just in case the numbers are a bit different
+	 * on various platforms.
+	 */
+	SERVICE_NAME_MAXLEN = 200,
+};
 
 const char *
 sio_strfaddr(struct sockaddr *addr, socklen_t addrlen);
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 14/15] sio: introduce and use sio_snprintf()
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (4 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 13/15] sio: increase SERVICE_NAME_MAXLEN size Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 15/15] buffer: remove Lua registers Vladislav Shpilevoy via Tarantool-patches
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

sio_strfaddr() can't be used in the places where static buffer
is not acceptable - in any code which wants to push the value to
Lua, or the address string must be long living.

The patch introduces sio_snprintf(), which does the same, but
saves the result into a provided buffer with a limited size.

In the Lua C code the patch saves the address string on the stack
which makes it safe against Lua GC interruptions.

Part of #5632

(cherry picked from commit fde44b569bf920a08469b9569eab1701d4e57299)
---
 src/box/lua/session.c |  7 +++++--
 src/sio.cc            | 38 +++++++++++++++++++++++++-------------
 src/sio.h             |  5 +++++
 3 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/src/box/lua/session.c b/src/box/lua/session.c
index d1d0da2d2..26315bc4b 100644
--- a/src/box/lua/session.c
+++ b/src/box/lua/session.c
@@ -264,10 +264,13 @@ lbox_session_peer(struct lua_State *L)
 
 	struct sockaddr_storage addr;
 	socklen_t addrlen = sizeof(addr);
-	if (sio_getpeername(fd, (struct sockaddr *)&addr, &addrlen) < 0)
+	struct sockaddr *addr_base = (struct sockaddr *)&addr;
+	if (sio_getpeername(fd, addr_base, &addrlen) < 0)
 		luaL_error(L, "session.peer(): getpeername() failed");
 
-	lua_pushstring(L, sio_strfaddr((struct sockaddr *)&addr, addrlen));
+	char addrbuf[SERVICE_NAME_MAXLEN];
+	sio_addr_snprintf(addrbuf, sizeof(addrbuf), addr_base, addrlen);
+	lua_pushstring(L, addrbuf);
 	return 1;
 }
 
diff --git a/src/sio.cc b/src/sio.cc
index a425b8383..1f7800d6d 100644
--- a/src/sio.cc
+++ b/src/sio.cc
@@ -62,16 +62,17 @@ sio_socketname_to_buffer(int fd, char *buf, int size)
 		return 0;
 	struct sockaddr_storage addr;
 	socklen_t addrlen = sizeof(addr);
-	int rc = getsockname(fd, (struct sockaddr *) &addr, &addrlen);
+	struct sockaddr *base_addr = (struct sockaddr *)&addr;
+	int rc = getsockname(fd, base_addr, &addrlen);
 	if (rc == 0) {
-		SNPRINT(n, snprintf, buf, size, ", aka %s",
-			sio_strfaddr((struct sockaddr *)&addr, addrlen));
+		SNPRINT(n, snprintf, buf, size, ", aka ");
+		SNPRINT(n, sio_addr_snprintf, buf, size, base_addr, addrlen);
 	}
 	addrlen = sizeof(addr);
 	rc = getpeername(fd, (struct sockaddr *) &addr, &addrlen);
 	if (rc == 0) {
-		SNPRINT(n, snprintf, buf, size, ", peer of %s",
-			sio_strfaddr((struct sockaddr *)&addr, addrlen));
+		SNPRINT(n, snprintf, buf, size, ", peer of ");
+		SNPRINT(n, sio_addr_snprintf, buf, size, base_addr, addrlen);
 	}
 	return 0;
 }
@@ -510,26 +511,37 @@ sio_getpeername(int fd, struct sockaddr *addr, socklen_t *addrlen)
 }
 
 /** Pretty print a peer address. */
-const char *
-sio_strfaddr(struct sockaddr *addr, socklen_t addrlen)
+int
+sio_addr_snprintf(char *buf, size_t size, const struct sockaddr *addr,
+		  socklen_t addrlen)
 {
-	static __thread char name[NI_MAXHOST + _POSIX_PATH_MAX + 2];
+	int res;
 	if (addr->sa_family == AF_UNIX) {
 		struct sockaddr_un *u = (struct sockaddr_un *)addr;
 		if (addrlen >= sizeof(*u))
-			snprintf(name, sizeof(name), "unix/:%s", u->sun_path);
+			res = snprintf(buf, size, "unix/:%s", u->sun_path);
 		else
-			snprintf(name, sizeof(name), "unix/:(socket)");
+			res = snprintf(buf, size, "unix/:(socket)");
 	} else {
 		char host[NI_MAXHOST], serv[NI_MAXSERV];
 		int flags = NI_NUMERICHOST | NI_NUMERICSERV;
 		if (getnameinfo(addr, addrlen, host, sizeof(host),
 				serv, sizeof(serv), flags) != 0)
-			snprintf(name, sizeof(name), "(host):(port)");
+			res = snprintf(buf, size, "(host):(port)");
 		else if (addr->sa_family == AF_INET)
-			snprintf(name, sizeof(name), "%s:%s", host, serv);
+			res = snprintf(buf, size, "%s:%s", host, serv);
 		else
-			snprintf(name, sizeof(name), "[%s]:%s", host, serv);
+			res = snprintf(buf, size, "[%s]:%s", host, serv);
 	}
+	assert(res + 1 < SERVICE_NAME_MAXLEN);
+	assert(res >= 0);
+	return res;
+}
+
+const char *
+sio_strfaddr(struct sockaddr *addr, socklen_t addrlen)
+{
+	static __thread char name[SERVICE_NAME_MAXLEN];
+	sio_addr_snprintf(name, sizeof(name), addr, addrlen);
 	return name;
 }
diff --git a/src/sio.h b/src/sio.h
index 3b91f7eee..262bfcab8 100644
--- a/src/sio.h
+++ b/src/sio.h
@@ -57,6 +57,11 @@ enum {
 	SERVICE_NAME_MAXLEN = 200,
 };
 
+/** Format the address into the given buffer. Behaves like snprintf(). */
+int
+sio_addr_snprintf(char *buf, size_t size, const struct sockaddr *addr,
+		  socklen_t addrlen);
+
 const char *
 sio_strfaddr(struct sockaddr *addr, socklen_t addrlen);
 
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 15/15] buffer: remove Lua registers
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (5 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 14/15] sio: introduce and use sio_snprintf() Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 02/15] test: don't use IBUF_SHARED in the tests Vladislav Shpilevoy via Tarantool-patches
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

Lua buffer module used to have a couple of preallocated objects of
type 'union c_register'. It was a bunch of C scalar and array
types intended for use instead of ffi.new() where it was needed to
allocate a temporary object like 'int[1]' just to be able to pass
'int *' into a C function via FFI.

It was a bit faster than ffi.new() even for small sizes. For
instance (when JIT works), getting a register to use it as
'int[1]' cost around 0.2-0.3 ns while ffi.new('int[1]') costs
around 0.4 ns. Also the code looked cleaner.

But Lua registers were global and therefore had the same issue as
IBUF_SHARED and static_alloc() in Lua - no ownership, and sudden
reuse when GC starts right the register is still in use in some
Lua code. __gc handlers could wipe the register values making the
original code behave unpredictably.

IBUF_SHARED was fixed by proper ownership implementation, but it
is not necessary with Lua registers. It could be done with the
buffer.ffi_stash_new() feature, but its performance is about 0.8
ns which is worse than plain ffi.new() for simple scalar types.

This patch eliminates Lua registers, and uses ffi.new() instead
everywhere.

Closes #5632

(cherry picked from commit 911ca60e202986ea283341bb31bfd7a7a5688559)
---
 changelogs/unreleased/fix-ibuf-static.md   |  7 ++++
 src/lua/msgpackffi.lua                     | 39 +++++++++---------
 test/app-tap/gh-5632-gc-buf-reuse.test.lua | 46 +++++++++++++++++++++-
 3 files changed, 72 insertions(+), 20 deletions(-)
 create mode 100644 changelogs/unreleased/fix-ibuf-static.md

diff --git a/changelogs/unreleased/fix-ibuf-static.md b/changelogs/unreleased/fix-ibuf-static.md
new file mode 100644
index 000000000..34450b85d
--- /dev/null
+++ b/changelogs/unreleased/fix-ibuf-static.md
@@ -0,0 +1,7 @@
+## bugfix/core
+
+* Extensive usage of `uri` and `uuid` modules with debug log level could lead to
+  a crash or corrupted result of the functions from these modules. Also their
+  usage from the callbacks passed to `ffi.gc()` could lead to the same but much
+  easier. The same could happen with some functions from the modules `fio`,
+  `box.tuple`, `iconv` (gh-5632).
diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua
index ad7998ed1..b07f0e7f0 100644
--- a/src/lua/msgpackffi.lua
+++ b/src/lua/msgpackffi.lua
@@ -31,11 +31,6 @@ union tmpint {
 
 local strict_alignment = (jit.arch == 'arm')
 
-local tmpint
-if strict_alignment then
-   tmpint = ffi.new('union tmpint[1]')
-end
-
 local function bswap_u16(num)
     return bit.rshift(bit.bswap(tonumber(num)), 16)
 end
@@ -71,7 +66,7 @@ end
 local encode_u16
 if strict_alignment then
     encode_u16 = function(buf, code, num)
-        tmpint[0].u16 = bswap_u16(num)
+        local tmpint = ffi.new('uint16_t[1]', bswap_u16(num))
         local p = buf:alloc(3)
         p[0] = code
         ffi.copy(p + 1, tmpint, 2)
@@ -87,8 +82,9 @@ end
 local encode_u32
 if strict_alignment then
     encode_u32 = function(buf, code, num)
-        tmpint[0].u32 =
-            ffi.cast('uint32_t', bit.bswap(tonumber(num)))
+        local tmpint =
+            ffi.new('uint32_t[1]', ffi.cast('uint32_t',
+                                            bit.bswap(tonumber(num))))
         local p = buf:alloc(5)
         p[0] = code
         ffi.copy(p + 1, tmpint, 4)
@@ -105,7 +101,8 @@ end
 local encode_u64
 if strict_alignment then
     encode_u64 = function(buf, code, num)
-        tmpint[0].u64 = bit.bswap(ffi.cast('uint64_t', num))
+        local tmpint =
+            ffi.new('uint64_t[1]', bit.bswap(ffi.cast('uint64_t', num)))
         local p = buf:alloc(9)
         p[0] = code
         ffi.copy(p + 1, tmpint, 8)
@@ -328,9 +325,10 @@ end
 local decode_u16
 if strict_alignment then
     decode_u16 = function(data)
+        local tmpint = ffi.new('uint16_t[1]')
         ffi.copy(tmpint, data[0], 2)
         data[0] = data[0] + 2
-        return tonumber(bswap_u16(tmpint[0].u16))
+        return tonumber(bswap_u16(tmpint[0]))
     end
 else
     decode_u16 = function(data)
@@ -343,10 +341,11 @@ end
 local decode_u32
 if strict_alignment then
     decode_u32 = function(data)
+        local tmpint = ffi.new('uint32_t[1]')
         ffi.copy(tmpint, data[0], 4)
         data[0] = data[0] + 4
         return tonumber(
-            ffi.cast('uint32_t', bit.bswap(tonumber(tmpint[0].u32))))
+            ffi.cast('uint32_t', bit.bswap(tonumber(tmpint[0]))))
     end
 else
     decode_u32 = function(data)
@@ -360,9 +359,10 @@ end
 local decode_u64
 if strict_alignment then
     decode_u64 = function(data)
+        local tmpint = ffi.new('uint64_t[1]')
         ffi.copy(tmpint, data[0], 8);
         data[0] = data[0] + 8
-        local num = bit.bswap(tmpint[0].u64)
+        local num = bit.bswap(tmpint[0])
         if num <= DBL_INT_MAX then
             return tonumber(num) -- return as 'number'
         end
@@ -389,8 +389,9 @@ end
 local decode_i16
 if strict_alignment then
     decode_i16 = function(data)
+        local tmpint = ffi.new('uint16_t[1]')
         ffi.copy(tmpint, data[0], 2)
-        local num = bswap_u16(tmpint[0].u16)
+        local num = bswap_u16(tmpint[0])
         data[0] = data[0] + 2
         -- note: this double cast is actually necessary
         return tonumber(ffi.cast('int16_t', ffi.cast('uint16_t', num)))
@@ -407,8 +408,9 @@ end
 local decode_i32
 if strict_alignment then
     decode_i32 = function(data)
+        local tmpint = ffi.new('uint32_t[1]')
         ffi.copy(tmpint, data[0], 4)
-        local num = bit.bswap(tonumber(tmpint[0].u32))
+        local num = bit.bswap(tonumber(tmpint[0]))
         data[0] = data[0] + 4
         return num
     end
@@ -423,9 +425,10 @@ end
 local decode_i64
 if strict_alignment then
     decode_i64 = function(data)
+        local tmpint = ffi.new('int64_t[1]')
         ffi.copy(tmpint, data[0], 8)
         data[0] = data[0] + 8
-        local num = bit.bswap(ffi.cast('int64_t', tmpint[0].u64))
+        local num = bit.bswap(tmpint[0])
         if num >= -DBL_INT_MAX and num <= DBL_INT_MAX then
             return tonumber(num) -- return as 'number'
         end
@@ -550,13 +553,11 @@ decode_r = function(data)
 end
 
 ---
--- A temporary const char ** buffer.
 -- All decode_XXX functions accept const char **data as its first argument,
 -- like libmsgpuck does. After decoding data[0] position is changed to the next
 -- element. It is significally faster on LuaJIT to use double pointer than
 -- return result, newpos.
 --
-local bufp = ffi.new('const unsigned char *[1]');
 
 local function check_offset(offset, len)
     if offset == nil then
@@ -576,13 +577,13 @@ local function decode_unchecked(str, offset)
     if type(str) == "string" then
         offset = check_offset(offset, #str)
         local buf = ffi.cast(char_ptr_t, str)
-        bufp[0] = buf + offset - 1
+        local bufp = ffi.new('const unsigned char *[1]', buf + offset - 1)
         local r = decode_r(bufp)
         return r, bufp[0] - buf + 1
     elseif ffi.istype(char_ptr_t, str) then
         -- Note: ffi.istype() ignores the const qualifier, so both
         -- (char *) and (const char *) buffers are valid.
-        bufp[0] = str
+        local bufp = ffi.new('const unsigned char *[1]', str)
         local r = decode_r(bufp)
         return r, ffi.cast(ffi.typeof(str), bufp[0])
     else
diff --git a/test/app-tap/gh-5632-gc-buf-reuse.test.lua b/test/app-tap/gh-5632-gc-buf-reuse.test.lua
index 81dafd36e..6efddb714 100755
--- a/test/app-tap/gh-5632-gc-buf-reuse.test.lua
+++ b/test/app-tap/gh-5632-gc-buf-reuse.test.lua
@@ -11,6 +11,7 @@ local tap = require('tap')
 local ffi = require('ffi')
 local uuid = require('uuid')
 local uri = require('uri')
+local msgpackffi = require('msgpackffi')
 
 local function test_uuid(test)
     test:plan(1)
@@ -99,9 +100,52 @@ local function test_uri(test)
     test:ok(is_success, 'uri in gc')
 end
 
+local function test_msgpackffi(test)
+    test:plan(1)
+
+    local mp_encode = msgpackffi.encode
+    local mp_decode = msgpackffi.decode
+    local gc_count = 100
+    local iter_count = 1000
+    local is_success = true
+    local data = {0, 1, 1000, 100000000, 'str', true, 1.1}
+
+    local function do_encode()
+        if not is_success then
+            return
+        end
+        local t = mp_encode(data)
+        t = mp_decode(t)
+        if #t ~= #data then
+            is_success = false
+            return
+        end
+        for i = 1, #t do
+            if t[i] ~= data[i] then
+                is_success = false
+                return
+            end
+        end
+    end
+
+    local function create_gc()
+        for _ = 1, gc_count do
+            ffi.gc(ffi.new('char[1]'), do_encode)
+        end
+    end
+
+    for _ = 1, iter_count do
+        create_gc()
+        do_encode()
+    end
+
+    test:ok(is_success, 'msgpackffi in gc')
+end
+
 local test = tap.test('gh-5632-gc-buf-reuse')
-test:plan(2)
+test:plan(3)
 test:test('uuid in __gc', test_uuid)
 test:test('uri in __gc', test_uri)
+test:test('msgpackffi in __gc', test_msgpackffi)
 
 os.exit(test:check() and 0 or 1)
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 02/15] test: don't use IBUF_SHARED in the tests
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (6 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 15/15] buffer: remove Lua registers Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 03/15] tuple: pass global ibuf explicitly where possible Vladislav Shpilevoy via Tarantool-patches
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

In msgpack test it is used only to check that 'struct ibuf *' can
be passed to encode() functions. But soon IBUF_SHARED will be
deleted, and its alternative won't be yield-tolerant. This means
it can't be used in this test. There are yields between the buffer
usages.

In varbinary test it is used in a too complicated way to be able
to put it back normally. And otherwise its usage does not make
much sense - without put() it is going to be created from the
scratch on non-first usage until a yield.

In the module_api test it is used to check if some function works
with 'struct ibuf *'. Can be done without IBUF_SHARED.

Part of #5632

(cherry picked from commit d0f0fc4772bc05de8557094832d6df94740a0055)
---
 test/app-tap/module_api.test.lua | 3 ++-
 test/app/msgpack.result          | 7 +++++--
 test/app/msgpack.test.lua        | 5 +++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/test/app-tap/module_api.test.lua b/test/app-tap/module_api.test.lua
index 23b62f115..a181e7492 100755
--- a/test/app-tap/module_api.test.lua
+++ b/test/app-tap/module_api.test.lua
@@ -45,13 +45,14 @@ local function test_buffers(test, module)
 
     local ibuf = buffer.ibuf()
     local pbuf = ibuf:alloc(128)
+    local ibuf_ptr = ffi.cast('struct ibuf *', ibuf)
 
     test:ok(not module.toibuf(nil), 'toibuf of nil')
     test:ok(not module.toibuf({}), 'toibuf of {}')
     test:ok(not module.toibuf(1LL), 'toibuf of 1LL')
     test:ok(not module.toibuf(box.NULL), 'toibuf of box.NULL')
     test:ok(not module.toibuf(buffer.reg1), 'toibuf of reg1')
-    test:ok(module.toibuf(buffer.IBUF_SHARED), "toibuf of ibuf*")
+    test:ok(module.toibuf(ibuf_ptr), "toibuf of ibuf*")
     test:ok(module.toibuf(ibuf), 'toibuf of ibuf')
     test:ok(not module.toibuf(pbuf), 'toibuf of pointer to ibuf data')
 end
diff --git a/test/app/msgpack.result b/test/app/msgpack.result
index 54714a59d..e7031e281 100644
--- a/test/app/msgpack.result
+++ b/test/app/msgpack.result
@@ -214,10 +214,10 @@ msgpack.decode(ffi.cast('char *', '\x04\x05\x06'), -1)
 - error: 'msgpack.decode: size can''t be negative'
 ...
 -- Provide a buffer. Try both 'struct ibuf' and 'struct ibuf *'.
-buf = buffer.IBUF_SHARED
+buf_storage = buffer.ibuf()
 ---
 ...
-buf:reset()
+buf = ffi.cast('struct ibuf *', buf_storage)
 ---
 ...
 size = msgpack.encode({a = 1, b = 2}, buf)
@@ -227,6 +227,9 @@ size = msgpack.encode({a = 1, b = 2}, buf)
 ---
 - {'a': 1, 'b': 2}
 ...
+buf_storage = nil
+---
+...
 buf = buffer.ibuf()
 ---
 ...
diff --git a/test/app/msgpack.test.lua b/test/app/msgpack.test.lua
index f501b1bf1..4193e2312 100644
--- a/test/app/msgpack.test.lua
+++ b/test/app/msgpack.test.lua
@@ -71,10 +71,11 @@ msgpack.decode(buf.rpos, buf:size() - 1)
 msgpack.decode(ffi.cast('char *', '\x04\x05\x06'), -1)
 
 -- Provide a buffer. Try both 'struct ibuf' and 'struct ibuf *'.
-buf = buffer.IBUF_SHARED
-buf:reset()
+buf_storage = buffer.ibuf()
+buf = ffi.cast('struct ibuf *', buf_storage)
 size = msgpack.encode({a = 1, b = 2}, buf)
 (msgpack.decode(buf.rpos, size))
+buf_storage = nil
 buf = buffer.ibuf()
 size = msgpack.encode({c = 3, d = 4}, buf)
 (msgpack.decode(buf.rpos, size))
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 03/15] tuple: pass global ibuf explicitly where possible
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (7 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 02/15] test: don't use IBUF_SHARED in the tests Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 04/15] iconv: take errno before reseting the context Vladislav Shpilevoy via Tarantool-patches
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

Code in lua/tuple.c used global tarantool_lua_ibuf in many places
relying on it never being changed and not reused by other code
until a yield.

But it is not so. In fact, as it was discovered in #5632, in any
Lua function may be started GC. Any GC handler might touch some
API also using tarantool_lua_ibuf inside.

This makes the first usage in lua/tuple.c invalid - the buffer
could be reset or reallocated or its wpos/rpos could change during
GC.

In order to fix this, first of all there should be clear points
where the buffer is taken, and where it becomes not needed
anymore.

The patch makes code in lua/tuple.c take tarantool_lua_ibuf when
it is needed first time. Not during usage. The same is done for
the fiber region for the API symmetry.

Part of #5632

(cherry picked from commit fabf0d57ea31d69be848fa33c98397a9a4a964e9)
---
 src/box/lua/tuple.c | 60 ++++++++++++++++++++++++++++-----------------
 1 file changed, 37 insertions(+), 23 deletions(-)

diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
index 617acc9aa..00ac3b44c 100644
--- a/src/box/lua/tuple.c
+++ b/src/box/lua/tuple.c
@@ -134,10 +134,8 @@ luaT_istuple(struct lua_State *L, int narg)
  * Helper for <lbox_tuple_new>().
  */
 static int
-luaT_tuple_encode_values(struct lua_State *L)
+luaT_tuple_encode_values(struct lua_State *L, struct ibuf *buf)
 {
-	struct ibuf *buf = tarantool_lua_ibuf;
-	ibuf_reset(buf);
 	struct mpstream stream;
 	mpstream_init(&stream, buf, ibuf_reserve_cb, ibuf_alloc_cb, luamp_error,
 		      L);
@@ -150,22 +148,31 @@ luaT_tuple_encode_values(struct lua_State *L)
 	return 0;
 }
 
-typedef void luaT_mpstream_init_f(struct mpstream *stream, struct lua_State *L);
+typedef void luaT_mpstream_init_f(struct mpstream *stream, struct lua_State *L,
+				  void *buffer);
 
 static void
-luaT_mpstream_init_lua_ibuf(struct mpstream *stream, struct lua_State *L)
+luaT_mpstream_init_lua_ibuf(struct mpstream *stream, struct lua_State *L,
+			    void *buffer)
 {
-	mpstream_init(stream, tarantool_lua_ibuf, ibuf_reserve_cb,
+	mpstream_init(stream, (struct ibuf *)buffer, ibuf_reserve_cb,
 		      ibuf_alloc_cb, luamp_error, L);
 }
 
 static void
-luaT_mpstream_init_box_region(struct mpstream *stream, struct lua_State *L)
+luaT_mpstream_init_box_region(struct mpstream *stream, struct lua_State *L,
+			      void *buffer)
 {
-	mpstream_init(stream, &fiber()->gc, region_reserve_cb, region_alloc_cb,
-		      luamp_error, L);
+	mpstream_init(stream, (struct region *)buffer, region_reserve_cb,
+		      region_alloc_cb, luamp_error, L);
 }
 
+/** Helper to pass parameters into luaT_tuple_encode_table via the Lua stack. */
+struct luaT_tuple_encode_ctx {
+	luaT_mpstream_init_f *mpstream_init_f;
+	void *buffer;
+};
+
 /**
  * Encode a Lua table or a tuple as MsgPack.
  *
@@ -177,8 +184,8 @@ static int
 luaT_tuple_encode_table(struct lua_State *L)
 {
 	struct mpstream stream;
-	luaT_mpstream_init_f *luaT_mpstream_init_f = lua_topointer(L, 1);
-	luaT_mpstream_init_f(&stream, L);
+	const struct luaT_tuple_encode_ctx *ctx = lua_topointer(L, 1);
+	ctx->mpstream_init_f(&stream, L, ctx->buffer);
 	luamp_encode_tuple(L, &tuple_serializer, &stream, 2);
 	mpstream_flush(&stream);
 	return 0;
@@ -189,7 +196,8 @@ luaT_tuple_encode_table(struct lua_State *L)
  */
 static int
 luaT_tuple_encode_on_mpstream(struct lua_State *L, int idx,
-			      luaT_mpstream_init_f *luaT_mpstream_init_f)
+			      luaT_mpstream_init_f *luaT_mpstream_init_f,
+			      void *buffer)
 {
 	assert(idx != 0);
 	if (!lua_istable(L, idx) && !luaT_istuple(L, idx)) {
@@ -212,7 +220,11 @@ luaT_tuple_encode_on_mpstream(struct lua_State *L, int idx,
 	lua_rawgeti(L, LUA_REGISTRYINDEX, luaT_tuple_encode_table_ref);
 	assert(lua_isfunction(L, -1));
 
-	lua_pushlightuserdata(L, luaT_mpstream_init_f);
+	struct luaT_tuple_encode_ctx ctx = {
+		.mpstream_init_f = luaT_mpstream_init_f,
+		.buffer = buffer,
+	};
+	lua_pushlightuserdata(L, &ctx);
 	lua_pushvalue(L, idx);
 
 	int rc = luaT_call(L, 2, 0);
@@ -225,12 +237,11 @@ luaT_tuple_encode_on_mpstream(struct lua_State *L, int idx,
  */
 static char *
 luaT_tuple_encode_on_lua_ibuf(struct lua_State *L, int idx,
-			      size_t *tuple_len_ptr)
+			      size_t *tuple_len_ptr, struct ibuf *buf)
 {
-	struct ibuf *buf = tarantool_lua_ibuf;
-	ibuf_reset(buf);
 	if (luaT_tuple_encode_on_mpstream(L, idx,
-					  luaT_mpstream_init_lua_ibuf) != 0)
+					  luaT_mpstream_init_lua_ibuf,
+					  buf) != 0)
 		return NULL;
 	if (tuple_len_ptr != NULL)
 		*tuple_len_ptr = ibuf_used(buf);
@@ -246,7 +257,8 @@ luaT_tuple_encode(struct lua_State *L, int idx, size_t *tuple_len_ptr)
 	struct region *region = &fiber()->gc;
 	size_t region_svp = region_used(region);
 	if (luaT_tuple_encode_on_mpstream(L, idx,
-					  luaT_mpstream_init_box_region) != 0) {
+					  luaT_mpstream_init_box_region,
+					  region) != 0) {
 		region_truncate(region, region_svp);
 		return NULL;
 	}
@@ -267,15 +279,16 @@ luaT_tuple_encode(struct lua_State *L, int idx, size_t *tuple_len_ptr)
 box_tuple_t *
 luaT_tuple_new(struct lua_State *L, int idx, box_tuple_format_t *format)
 {
+	struct ibuf *ibuf = tarantool_lua_ibuf;
+	ibuf_reset(ibuf);
 	size_t tuple_len;
-	char *tuple_data = luaT_tuple_encode_on_lua_ibuf(L, idx, &tuple_len);
+	char *tuple_data = luaT_tuple_encode_on_lua_ibuf(L, idx, &tuple_len,
+							 ibuf);
 	if (tuple_data == NULL)
 		return NULL;
 	box_tuple_t *tuple = box_tuple_new(format, tuple_data,
 					   tuple_data + tuple_len);
-	if (tuple == NULL)
-		return NULL;
-	ibuf_reinit(tarantool_lua_ibuf);
+	ibuf_reinit(ibuf);
 	return tuple;
 }
 
@@ -295,7 +308,8 @@ lbox_tuple_new(lua_State *L)
 	box_tuple_format_t *fmt = box_tuple_format_default();
 	if (argc != 1 || (!lua_istable(L, 1) && !luaT_istuple(L, 1))) {
 		struct ibuf *buf = tarantool_lua_ibuf;
-		luaT_tuple_encode_values(L); /* may raise */
+		ibuf_reset(buf);
+		luaT_tuple_encode_values(L, buf); /* may raise */
 		struct tuple *tuple = box_tuple_new(fmt, buf->buf,
 						    buf->buf + ibuf_used(buf));
 		ibuf_reinit(buf);
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 04/15] iconv: take errno before reseting the context
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (8 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 03/15] tuple: pass global ibuf explicitly where possible Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 05/15] cord_buf: introduce cord_buf API Vladislav Shpilevoy via Tarantool-patches
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

In Lua iconv_convert() in case ffi.C.tnt_iconv() with normal
arguments failed, tried to clear iconv context by calling the
function again with all arguments NULL. Then it looked at errno.

But the second call could do anything with errno. For instance, it
could also fail, and change errno.

The patch saves errno into a variable before calling tnt_iconv()
second time.

It still does not give a perfect protection as it was discovered
in scope of #5632, but still better.

The patch is mostly motivated by the next patches about #5632
which will add another call to the error path, and it should
better be after errno save.

Needed for #5632

(cherry picked from commit 8c48974448f846629873302f0200c0a1505d40b7)
---
 src/lua/iconv.lua | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/lua/iconv.lua b/src/lua/iconv.lua
index ade66d760..e68509dec 100644
--- a/src/lua/iconv.lua
+++ b/src/lua/iconv.lua
@@ -43,15 +43,18 @@ local function iconv_convert(iconv, data)
         buf_left[0] = buf:unused()
         local res = ffi.C.tnt_iconv(iconv, data_ptr, data_left,
                                 buf_ptr, buf_left)
-        if res == ffi.cast('size_t', -1) and errno() ~= E2BIG then
-            ffi.C.tnt_iconv(iconv, nil, nil, nil, nil)
-            if errno() == EINVAL then
-                error('Invalid multibyte sequence')
+        if res == ffi.cast('size_t', -1) then
+            local err = errno()
+            if err ~= E2BIG then
+                ffi.C.tnt_iconv(iconv, nil, nil, nil, nil)
+                if err == EINVAL then
+                    error('Invalid multibyte sequence')
+                end
+                if err == EILSEQ then
+                    error('Incomplete multibyte sequence')
+                end
+                error('Unknown conversion error: ' .. errno.strerror(err))
             end
-            if errno() == EILSEQ then
-                error('Incomplete multibyte sequence')
-            end
-            error('Unknown conversion error: ' .. errno.strerror())
         end
         buf:alloc(buf:unused() - buf_left[0])
     end
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 05/15] cord_buf: introduce cord_buf API
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (9 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 04/15] iconv: take errno before reseting the context Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 06/15] cord_buf: introduce ownership management Vladislav Shpilevoy via Tarantool-patches
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

There was a global ibuf object called tarantool_lua_ibuf. It was
used in all the places working with Lua which didn't have yields,
and where fiber's region could be potentially slower due to not
being able to guarantee the allocated memory is contiguous.

Yields during the ibuf usage were prohibited because another fiber
would take the same ibuf and override its previous content which
was still used by another fiber.

But it wasn't taken into account that there is Lua GC. It can be
invoked from any Lua function in Lua C code, and almost on any
line in the Lua scripts. During GC some deleted objects might have
GC handlers installed as __gc metamethods. From the handler they
could call Tarantool functions, including the ones using the
global ibuf.

Therefore ibuf could be overridden not only at yields, but almost
in any moment. Because with the Lua GC at hand, the multitasking
is not strictly "cooperative" anymore.

It is necessary to implement ownership for the global buffer. The
patch prepares the API for this: the buffer is moved to its own
file, and has methods take(), put(), and drop().

Take() is supposed to make the current fiber own the buffer. Put()
makes it available again. Drop() does the same but also clears the
buffer (frees its memory). The ownership itself is a subject for
the next patches. Here only the API is prepared.

The patch "hits" performance a little. Previously the get of
buffer.IBUF_SHARED cost around 1 ns. Now cord_ibuf_take() +
cord_ibuf_put() cost around 5 ns together. The next patches will
make it worse, up to 15 ns until #5871 is done.

Part of #5632

(cherry picked from commit ade45685f4fdab47204e11cbd82048bb68bbd03b)
---
 extra/exports              |  4 ++-
 src/CMakeLists.txt         |  1 +
 src/box/lua/schema.lua     | 47 ++++++++++++++++++++++++-----------
 src/box/lua/tuple.c        | 28 ++++++++++-----------
 src/box/lua/tuple.lua      | 14 +++++++----
 src/cord_buf.c             | 47 +++++++++++++++++++++++++++++++++++
 src/cord_buf.h             | 45 +++++++++++++++++++++++++++++++++
 src/lua/buffer.lua         | 51 ++++++++++++++++++++++++++++++++++++--
 src/lua/iconv.lua          |  8 +++---
 src/lua/init.c             |  3 ---
 src/lua/msgpack.c          |  6 ++---
 src/lua/msgpackffi.lua     |  7 +++---
 src/lua/socket.lua         | 14 +++++------
 src/lua/utf8.c             | 16 +++++++-----
 src/lua/utils.h            |  1 -
 test/unit/luaT_tuple_new.c |  4 ---
 16 files changed, 229 insertions(+), 67 deletions(-)
 create mode 100644 src/cord_buf.c
 create mode 100644 src/cord_buf.h

diff --git a/extra/exports b/extra/exports
index 94e04264e..91094206d 100644
--- a/extra/exports
+++ b/extra/exports
@@ -44,7 +44,6 @@ tnt_iconv
 exception_get_string
 exception_get_int
 
-tarantool_lua_ibuf
 uuid_nil
 tt_uuid_create
 tt_uuid_str
@@ -111,6 +110,9 @@ fiber_cond_signal
 fiber_cond_broadcast
 fiber_cond_wait_timeout
 fiber_cond_wait
+cord_ibuf_drop
+cord_ibuf_put
+cord_ibuf_take
 cord_slab_cache
 coio_wait
 coio_close
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index d14997413..1e840aab9 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -107,6 +107,7 @@ set (core_sources
      http_parser.c
      coll.c
      coll_def.c
+     cord_buf.c
  )
 
 if (TARGET_OS_NETBSD)
diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua
index a23fe0813..513dab1be 100644
--- a/src/box/lua/schema.lua
+++ b/src/box/lua/schema.lua
@@ -1,6 +1,7 @@
 -- schema.lua (internal file)
 --
 local ffi = require('ffi')
+local buffer = require('buffer')
 local msgpack = require('msgpack')
 local msgpackffi = require('msgpackffi')
 local fun = require('fun')
@@ -20,6 +21,8 @@ local tuple_encode = box.tuple.encode
 local tuple_bless = box.tuple.bless
 local is_tuple = box.tuple.is
 assert(tuple_encode ~= nil and tuple_bless ~= nil and is_tuple ~= nil)
+local cord_ibuf_take = buffer.internal.cord_ibuf_take
+local cord_ibuf_put = buffer.internal.cord_ibuf_put
 
 local INT64_MIN = tonumber64('-9223372036854775808')
 local INT64_MAX = tonumber64('9223372036854775807')
@@ -1145,9 +1148,12 @@ base_index_mt.__len = base_index_mt.len
 -- min and max
 base_index_mt.min_ffi = function(index, key)
     check_index_arg(index, 'min')
-    local pkey, pkey_end = tuple_encode(key)
-    if builtin.box_index_min(index.space_id, index.id,
-                             pkey, pkey_end, ptuple) ~= 0 then
+    local ibuf = cord_ibuf_take()
+    local pkey, pkey_end = tuple_encode(ibuf, key)
+    local nok = builtin.box_index_min(index.space_id, index.id, pkey, pkey_end,
+                                      ptuple) ~= 0
+    cord_ibuf_put(ibuf)
+    if nok then
         box.error() -- error
     elseif ptuple[0] ~= nil then
         return tuple_bless(ptuple[0])
@@ -1162,9 +1168,12 @@ base_index_mt.min_luac = function(index, key)
 end
 base_index_mt.max_ffi = function(index, key)
     check_index_arg(index, 'max')
-    local pkey, pkey_end = tuple_encode(key)
-    if builtin.box_index_max(index.space_id, index.id,
-                             pkey, pkey_end, ptuple) ~= 0 then
+    local ibuf = cord_ibuf_take()
+    local pkey, pkey_end = tuple_encode(ibuf, key)
+    local nok = builtin.box_index_max(index.space_id, index.id, pkey, pkey_end,
+                                      ptuple) ~= 0
+    cord_ibuf_put(ibuf)
+    if nok then
         box.error() -- error
     elseif ptuple[0] ~= nil then
         return tuple_bless(ptuple[0])
@@ -1197,10 +1206,12 @@ end
 -- iteration
 base_index_mt.pairs_ffi = function(index, key, opts)
     check_index_arg(index, 'pairs')
-    local pkey, pkey_end = tuple_encode(key)
+    local ibuf = cord_ibuf_take()
+    local pkey, pkey_end = tuple_encode(ibuf, key)
     local itype = check_iterator_type(opts, pkey + 1 >= pkey_end);
 
     local keybuf = ffi.string(pkey, pkey_end - pkey)
+    cord_ibuf_put(ibuf)
     local pkeybuf = ffi.cast('const char *', keybuf)
     local cdata = builtin.box_index_iterator(index.space_id, index.id,
         itype, pkeybuf, pkeybuf + #keybuf);
@@ -1224,10 +1235,12 @@ end
 -- index subtree size
 base_index_mt.count_ffi = function(index, key, opts)
     check_index_arg(index, 'count')
-    local pkey, pkey_end = tuple_encode(key)
+    local ibuf = cord_ibuf_take()
+    local pkey, pkey_end = tuple_encode(ibuf, key)
     local itype = check_iterator_type(opts, pkey + 1 >= pkey_end);
     local count = builtin.box_index_count(index.space_id, index.id,
         itype, pkey, pkey_end);
+    cord_ibuf_put(ibuf)
     if count == -1 then
         box.error()
     end
@@ -1242,9 +1255,12 @@ end
 
 base_index_mt.get_ffi = function(index, key)
     check_index_arg(index, 'get')
-    local key, key_end = tuple_encode(key)
-    if builtin.box_index_get(index.space_id, index.id,
-                             key, key_end, ptuple) ~= 0 then
+    local ibuf = cord_ibuf_take()
+    local key, key_end = tuple_encode(ibuf, key)
+    local nok = builtin.box_index_get(index.space_id, index.id, key, key_end,
+                                      ptuple) ~= 0
+    cord_ibuf_put(ibuf)
+    if nok then
         return box.error() -- error
     elseif ptuple[0] ~= nil then
         return tuple_bless(ptuple[0])
@@ -1275,13 +1291,16 @@ end
 
 base_index_mt.select_ffi = function(index, key, opts)
     check_index_arg(index, 'select')
-    local key, key_end = tuple_encode(key)
+    local ibuf = cord_ibuf_take()
+    local key, key_end = tuple_encode(ibuf, key)
     local iterator, offset, limit = check_select_opts(opts, key + 1 >= key_end)
 
     local port = ffi.cast('struct port *', port_tuple)
 
-    if builtin.box_select(index.space_id, index.id,
-        iterator, offset, limit, key, key_end, port) ~= 0 then
+    local nok = builtin.box_select(index.space_id, index.id, iterator, offset,
+                                   limit, key, key_end, port) ~= 0
+    cord_ibuf_put(ibuf)
+    if nok then
         return box.error()
     end
 
diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
index 00ac3b44c..ad056a3bc 100644
--- a/src/box/lua/tuple.c
+++ b/src/box/lua/tuple.c
@@ -36,6 +36,7 @@
 #include "diag.h" /* diag_set() */
 #include <small/ibuf.h>
 #include <small/region.h>
+#include "cord_buf.h"
 #include <fiber.h>
 
 #include "box/tuple.h"
@@ -279,16 +280,18 @@ luaT_tuple_encode(struct lua_State *L, int idx, size_t *tuple_len_ptr)
 box_tuple_t *
 luaT_tuple_new(struct lua_State *L, int idx, box_tuple_format_t *format)
 {
-	struct ibuf *ibuf = tarantool_lua_ibuf;
-	ibuf_reset(ibuf);
+	struct ibuf *ibuf = cord_ibuf_take();
 	size_t tuple_len;
+	box_tuple_t *tuple;
 	char *tuple_data = luaT_tuple_encode_on_lua_ibuf(L, idx, &tuple_len,
 							 ibuf);
-	if (tuple_data == NULL)
-		return NULL;
-	box_tuple_t *tuple = box_tuple_new(format, tuple_data,
-					   tuple_data + tuple_len);
-	ibuf_reinit(ibuf);
+	if (tuple_data == NULL) {
+		tuple = NULL;
+	} else {
+		tuple = box_tuple_new(format, tuple_data,
+				      tuple_data + tuple_len);
+	}
+	cord_ibuf_drop(ibuf);
 	return tuple;
 }
 
@@ -307,12 +310,11 @@ lbox_tuple_new(lua_State *L)
 	 */
 	box_tuple_format_t *fmt = box_tuple_format_default();
 	if (argc != 1 || (!lua_istable(L, 1) && !luaT_istuple(L, 1))) {
-		struct ibuf *buf = tarantool_lua_ibuf;
-		ibuf_reset(buf);
+		struct ibuf *buf = cord_ibuf_take();
 		luaT_tuple_encode_values(L, buf); /* may raise */
 		struct tuple *tuple = box_tuple_new(fmt, buf->buf,
 						    buf->buf + ibuf_used(buf));
-		ibuf_reinit(buf);
+		cord_ibuf_drop(buf);
 		if (tuple == NULL)
 			return luaT_error(L);
 		luaT_pushtuple(L, tuple);
@@ -586,8 +588,7 @@ lbox_tuple_transform(struct lua_State *L)
 		return 1;
 	}
 
-	struct ibuf *buf = tarantool_lua_ibuf;
-	ibuf_reset(buf);
+	struct ibuf *buf = cord_ibuf_take();
 	struct mpstream stream;
 	mpstream_init(&stream, buf, ibuf_reserve_cb, ibuf_alloc_cb,
 		      luamp_error, L);
@@ -632,12 +633,11 @@ lbox_tuple_transform(struct lua_State *L)
 		new_tuple = tuple_new(box_tuple_format_default(),
 				      new_data, new_data + new_size);
 	region_truncate(region, used);
-
+	cord_ibuf_put(buf);
 	if (new_tuple == NULL)
 		luaT_error(L);
 
 	luaT_pushtuple(L, new_tuple);
-	ibuf_reset(buf);
 	return 1;
 }
 
diff --git a/src/box/lua/tuple.lua b/src/box/lua/tuple.lua
index 63ea73e43..186216420 100644
--- a/src/box/lua/tuple.lua
+++ b/src/box/lua/tuple.lua
@@ -6,6 +6,8 @@ local msgpackffi = require('msgpackffi')
 local fun = require('fun')
 local buffer = require('buffer')
 local internal = require('box.internal')
+local cord_ibuf_take = buffer.internal.cord_ibuf_take
+local cord_ibuf_put = buffer.internal.cord_ibuf_put
 
 ffi.cdef[[
 /** \cond public */
@@ -72,9 +74,7 @@ local encode_fix = msgpackffi.internal.encode_fix
 local encode_array = msgpackffi.internal.encode_array
 local encode_r = msgpackffi.internal.encode_r
 
-local tuple_encode = function(obj)
-    local tmpbuf = buffer.IBUF_SHARED
-    tmpbuf:reset()
+local tuple_encode = function(tmpbuf, obj)
     if obj == nil then
         encode_fix(tmpbuf, 0x90, 0)  -- empty array
     elseif is_tuple(obj) then
@@ -231,8 +231,10 @@ local function tuple_update(tuple, expr)
     if type(expr) ~= 'table' then
         error("Usage: tuple:update({ { op, field, arg}+ })")
     end
-    local pexpr, pexpr_end = tuple_encode(expr)
+    local ibuf = cord_ibuf_take()
+    local pexpr, pexpr_end = tuple_encode(ibuf, expr)
     local tuple = builtin.box_tuple_update(tuple, pexpr, pexpr_end)
+    cord_ibuf_put(ibuf)
     if tuple == nil then
         return box.error()
     end
@@ -244,8 +246,10 @@ local function tuple_upsert(tuple, expr)
     if type(expr) ~= 'table' then
         error("Usage: tuple:upsert({ { op, field, arg}+ })")
     end
-    local pexpr, pexpr_end = tuple_encode(expr)
+    local ibuf = cord_ibuf_take()
+    local pexpr, pexpr_end = tuple_encode(ibuf, expr)
     local tuple = builtin.box_tuple_upsert(tuple, pexpr, pexpr_end)
+    cord_ibuf_put(ibuf)
     if tuple == nil then
         return box.error()
     end
diff --git a/src/cord_buf.c b/src/cord_buf.c
new file mode 100644
index 000000000..cac508c3d
--- /dev/null
+++ b/src/cord_buf.c
@@ -0,0 +1,47 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
+ */
+#include "cord_buf.h"
+#include "fiber.h"
+
+#include "small/ibuf.h"
+
+enum {
+	/* No any reason why that value. Historical constant. */
+	CORD_IBUF_START_CAPACITY = 16384,
+};
+
+static struct ibuf *cord_buf_global = NULL;
+
+struct ibuf *
+cord_ibuf_take(void)
+{
+	assert(cord_is_main());
+	struct ibuf *buf = cord_buf_global;
+	if (buf != NULL) {
+		ibuf_reset(buf);
+		return buf;
+	}
+	buf = malloc(sizeof(*buf));
+	if (buf == NULL)
+		panic("Couldn't allocate thread buffer");
+	ibuf_create(buf, &cord()->slabc, CORD_IBUF_START_CAPACITY);
+	cord_buf_global = buf;
+	return buf;
+}
+
+void
+cord_ibuf_put(struct ibuf *ibuf)
+{
+	(void)ibuf;
+	assert(ibuf == cord_buf_global);
+}
+
+void
+cord_ibuf_drop(struct ibuf *ibuf)
+{
+	ibuf_reinit(ibuf);
+	assert(ibuf == cord_buf_global);
+}
diff --git a/src/cord_buf.h b/src/cord_buf.h
new file mode 100644
index 000000000..59f429c8f
--- /dev/null
+++ b/src/cord_buf.h
@@ -0,0 +1,45 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
+ */
+#pragma once
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+struct ibuf;
+
+/**
+ * Take the global ibuf, or allocate a new one if the stash is empty.
+ */
+struct ibuf *
+cord_ibuf_take(void);
+
+/**
+ * Put the global ibuf back.
+ */
+void
+cord_ibuf_put(struct ibuf *ibuf);
+
+/**
+ * Put the global ibuf back and free its memory. So only the buffer object
+ * itself is saved to the stash. Main reason why it is a dedicated function is
+ * because it is often needed from Lua, and allows not to call :recycle() there,
+ * which would be an additional FFI call before cord_ibuf_put().
+ *
+ * XXX: recycle of the global buffer is a workaround for the ibuf being used in
+ * some places working with Lua API, where it wasn't wanted to "reuse" it
+ * anyhow. Instead, the global buffer is used to protect from the buffer leak in
+ * case it would be created locally, and then a Lua error would be raised. When
+ * the buffer is global, it is not a problem, because it is reused/recycled
+ * later. But it hurts the places, where re-usage could be good. Probably it is
+ * worth to separate take/put() from new/drop() API. Or delete drop() entirely.
+ */
+void
+cord_ibuf_drop(struct ibuf *ibuf);
+
+#if defined(__cplusplus)
+}
+#endif /* defined(__cplusplus) */
diff --git a/src/lua/buffer.lua b/src/lua/buffer.lua
index a72d8d1f9..398319cf2 100644
--- a/src/lua/buffer.lua
+++ b/src/lua/buffer.lua
@@ -7,7 +7,15 @@ ffi.cdef[[
 struct slab_cache;
 struct slab_cache *
 tarantool_lua_slab_cache();
-extern struct ibuf *tarantool_lua_ibuf;
+
+struct ibuf *
+cord_ibuf_take(void);
+
+void
+cord_ibuf_put(struct ibuf *ibuf);
+
+void
+cord_ibuf_drop(struct ibuf *ibuf);
 
 struct ibuf
 {
@@ -174,8 +182,47 @@ local function ibuf_new(arg, arg2)
     errorf('Usage: ibuf([size])')
 end
 
+--
+-- Cord buffer is useful for the places, where
+--
+-- * Want to reuse the already allocated memory which might be stored in the
+--   cord buf. Although sometimes the buffer is recycled, so should not rely on
+--   being able to reuse it always. When reused, the win is the biggest -
+--   becomes about x20 times faster than a new buffer creation (~5ns vs ~100ns);
+--
+-- * Want to avoid allocation of a new ibuf because it produces a new GC object
+--   which is additional load for Lua GC. Although according to benches it is
+--   not super expensive;
+--
+-- * Almost always can put the buffer back manually. Not rely on it being
+--   recycled automatically. It is recycled, but still should not rely on that;
+--
+-- It is important to wrap the C functions, not expose them directly. Because
+-- JIT works a bit better when C functions are called as 'ffi.C.func()' than
+-- 'func()' with func being cached. The only pros is to cache 'ffi.C' itself.
+-- It is quite strange though how having them wrapped into a Lua function is
+-- faster than cached directly as C functions.
+--
+local function cord_ibuf_take()
+    return builtin.cord_ibuf_take()
+end
+
+local function cord_ibuf_put(buf)
+    return builtin.cord_ibuf_put(buf)
+end
+
+local function cord_ibuf_drop(buf)
+    return builtin.cord_ibuf_drop(buf)
+end
+
+local internal = {
+    cord_ibuf_take = cord_ibuf_take,
+    cord_ibuf_put = cord_ibuf_put,
+    cord_ibuf_drop = cord_ibuf_drop,
+}
+
 return {
+    internal = internal,
     ibuf = ibuf_new;
-    IBUF_SHARED = ffi.C.tarantool_lua_ibuf;
     READAHEAD = READAHEAD;
 }
diff --git a/src/lua/iconv.lua b/src/lua/iconv.lua
index e68509dec..732b80514 100644
--- a/src/lua/iconv.lua
+++ b/src/lua/iconv.lua
@@ -1,6 +1,8 @@
 local ffi    = require('ffi')
 local errno  = require('errno')
 local buffer = require('buffer')
+local cord_ibuf_take = buffer.internal.cord_ibuf_take
+local cord_ibuf_put = buffer.internal.cord_ibuf_put
 
 ffi.cdef[[
 typedef struct iconv *iconv_t;
@@ -33,10 +35,9 @@ local function iconv_convert(iconv, data)
 
     -- prepare at lease BUF_SIZE and at most data_len bytes in shared buffer
     local output_len = data_len >= BUF_SIZE and data_len or BUF_SIZE
-    local buf      = buffer.IBUF_SHARED;
+    local buf      = cord_ibuf_take();
     local buf_ptr  = char_ptr_arr_t()
     local buf_left = size_t_arr_t()
-    buf:reset()
 
     while data_left[0] > 0 do
         buf_ptr[0]  = buf:reserve(output_len)
@@ -46,6 +47,7 @@ local function iconv_convert(iconv, data)
         if res == ffi.cast('size_t', -1) then
             local err = errno()
             if err ~= E2BIG then
+                cord_ibuf_put(buf)
                 ffi.C.tnt_iconv(iconv, nil, nil, nil, nil)
                 if err == EINVAL then
                     error('Invalid multibyte sequence')
@@ -62,7 +64,7 @@ local function iconv_convert(iconv, data)
     -- iconv function sets cd's conversion state to the initial state
     ffi.C.tnt_iconv(iconv, nil, nil, nil, nil)
     local result = ffi.string(buf.rpos, buf:size())
-    buf:reset()
+    cord_ibuf_put(buf)
     return result
 end
 
diff --git a/src/lua/init.c b/src/lua/init.c
index d11ab8bf3..3809f6068 100644
--- a/src/lua/init.c
+++ b/src/lua/init.c
@@ -70,8 +70,6 @@
  * The single Lua state of the transaction processor (tx) thread.
  */
 struct lua_State *tarantool_L;
-static struct ibuf tarantool_lua_ibuf_body;
-struct ibuf *tarantool_lua_ibuf = &tarantool_lua_ibuf_body;
 /**
  * The fiber running the startup Lua script
  */
@@ -429,7 +427,6 @@ tarantool_lua_init(const char *tarantool_bin, int argc, char **argv)
 	if (L == NULL) {
 		panic("failed to initialize Lua");
 	}
-	ibuf_create(tarantool_lua_ibuf, tarantool_lua_slab_cache(), 16000);
 	luaL_openlibs(L);
 	tarantool_lua_setpaths(L);
 
diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c
index 838f582ff..ea63570bd 100644
--- a/src/lua/msgpack.c
+++ b/src/lua/msgpack.c
@@ -41,6 +41,7 @@
 #include <small/region.h>
 #include <small/ibuf.h>
 
+#include "cord_buf.h"
 #include <fiber.h>
 
 void
@@ -448,8 +449,7 @@ lua_msgpack_encode(lua_State *L)
 					  "must be of type 'struct ibuf'");
 		}
 	} else {
-		buf = tarantool_lua_ibuf;
-		ibuf_reset(buf);
+		buf = cord_ibuf_take();
 	}
 	size_t used = ibuf_used(buf);
 
@@ -466,7 +466,7 @@ lua_msgpack_encode(lua_State *L)
 		lua_pushinteger(L, ibuf_used(buf) - used);
 	} else {
 		lua_pushlstring(L, buf->buf, ibuf_used(buf));
-		ibuf_reinit(buf);
+		cord_ibuf_drop(buf);
 	}
 	return 1;
 }
diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua
index abcbd54fa..ad7998ed1 100644
--- a/src/lua/msgpackffi.lua
+++ b/src/lua/msgpackffi.lua
@@ -10,6 +10,8 @@ local uint16_ptr_t = ffi.typeof('uint16_t *')
 local uint32_ptr_t = ffi.typeof('uint32_t *')
 local uint64_ptr_t = ffi.typeof('uint64_t *')
 local char_ptr_t = ffi.typeof('char *')
+local cord_ibuf_take = buffer.internal.cord_ibuf_take
+local cord_ibuf_drop = buffer.internal.cord_ibuf_drop
 
 ffi.cdef([[
 char *
@@ -280,11 +282,10 @@ local function encode_r(buf, obj, level)
 end
 
 local function encode(obj)
-    local tmpbuf = buffer.IBUF_SHARED
-    tmpbuf:reset()
+    local tmpbuf = cord_ibuf_take()
     encode_r(tmpbuf, obj, 0)
     local r = ffi.string(tmpbuf.rpos, tmpbuf:size())
-    tmpbuf:recycle()
+    cord_ibuf_drop(tmpbuf)
     return r
 end
 
diff --git a/src/lua/socket.lua b/src/lua/socket.lua
index 7c2b1ac6b..f68270876 100644
--- a/src/lua/socket.lua
+++ b/src/lua/socket.lua
@@ -10,6 +10,8 @@ local fiber = require('fiber')
 local fio = require('fio')
 local log = require('log')
 local buffer = require('buffer')
+local cord_ibuf_take = buffer.internal.cord_ibuf_take
+local cord_ibuf_drop = buffer.internal.cord_ibuf_drop
 
 local format = string.format
 
@@ -293,19 +295,15 @@ local function socket_sysread(self, arg1, arg2)
         error('socket:sysread(): size can not be negative')
     end
 
-    local buf = buffer.IBUF_SHARED
-    buf:reset()
+    local buf = cord_ibuf_take()
     local p = buf:alloc(size)
 
     local res = sysread(self, p, size)
     if res then
-        local str = ffi.string(p, res)
-        buf:recycle()
-        return str
-    else
-        buf:recycle()
-        return res
+        res = ffi.string(p, res)
     end
+    cord_ibuf_drop(buf)
+    return res
 end
 
 local function socket_nonblock(self, nb)
diff --git a/src/lua/utf8.c b/src/lua/utf8.c
index 6d3e4d39a..0d9c49a8b 100644
--- a/src/lua/utf8.c
+++ b/src/lua/utf8.c
@@ -33,11 +33,10 @@
 #include <coll.h>
 #include "lua/utils.h"
 #include "lua/utf8.h"
+#include "cord_buf.h"
 #include "diag.h"
 #include "small/ibuf.h"
 
-extern struct ibuf *tarantool_lua_ibuf;
-
 /** Default universal casemap for case transformations. */
 static UCaseMap *root_map = NULL;
 
@@ -54,12 +53,13 @@ utf8_str_to_case(struct lua_State *L, const char *src, int src_bsize,
 	int i = 0;
 	int dst_bsize = src_bsize;
 	(void) i;
+	struct ibuf *ibuf = cord_ibuf_take();
 	do {
 		UErrorCode err = U_ZERO_ERROR;
-		ibuf_reset(tarantool_lua_ibuf);
-		char *dst = ibuf_alloc(tarantool_lua_ibuf, dst_bsize);
+		char *dst = ibuf_alloc(ibuf, dst_bsize);
 		if (dst == NULL) {
 			diag_set(OutOfMemory, dst_bsize, "ibuf_alloc", "dst");
+			cord_ibuf_put(ibuf);
 			return luaT_error(L);
 		}
 		int real_bsize;
@@ -75,11 +75,13 @@ utf8_str_to_case(struct lua_State *L, const char *src, int src_bsize,
 		if (err == U_ZERO_ERROR ||
 		    err == U_STRING_NOT_TERMINATED_WARNING) {
 			lua_pushlstring(L, dst, real_bsize);
+			cord_ibuf_put(ibuf);
 			return 1;
 		} else if (err == U_BUFFER_OVERFLOW_ERROR) {
 			assert(real_bsize > dst_bsize);
 			dst_bsize = real_bsize;
 		} else {
+			cord_ibuf_put(ibuf);
 			lua_pushnil(L);
 			lua_pushstring(L, tt_sprintf("error during ICU case "\
 						     "transform: %s",
@@ -251,9 +253,10 @@ utf8_char(struct lua_State *L)
 		return 1;
 	}
 	/* Slow way - use dynamic buffer. */
-	ibuf_reset(tarantool_lua_ibuf);
-	char *str = ibuf_alloc(tarantool_lua_ibuf, top * U8_MAX_LENGTH);
+	struct ibuf *ibuf = cord_ibuf_take();
+	char *str = ibuf_alloc(ibuf, top * U8_MAX_LENGTH);
 	if (str == NULL) {
+		cord_ibuf_put(ibuf);
 		diag_set(OutOfMemory, top * U8_MAX_LENGTH, "ibuf_alloc",
 			 "str");
 		return luaT_error(L);
@@ -263,6 +266,7 @@ utf8_char(struct lua_State *L)
 		U8_APPEND_UNSAFE(str, len, c);
 	}
 	lua_pushlstring(L, str, len);
+	cord_ibuf_put(ibuf);
 	return 1;
 }
 
diff --git a/src/lua/utils.h b/src/lua/utils.h
index cc2e4211f..a61212102 100644
--- a/src/lua/utils.h
+++ b/src/lua/utils.h
@@ -66,7 +66,6 @@ typedef struct ibuf box_ibuf_t;
  * snprintf(m_errmsg, sizeof(m_errmsg), "%s", msg ? msg : "");
  */
 extern struct lua_State *tarantool_L;
-extern struct ibuf *tarantool_lua_ibuf;
 
 /** \cond public */
 
diff --git a/test/unit/luaT_tuple_new.c b/test/unit/luaT_tuple_new.c
index dcf16ef02..60bc522dc 100644
--- a/test/unit/luaT_tuple_new.c
+++ b/test/unit/luaT_tuple_new.c
@@ -25,8 +25,6 @@
  * box/tuple.test.lua.
  */
 
-extern struct ibuf *tarantool_lua_ibuf;
-
 uint32_t
 min_u32(uint32_t a, uint32_t b)
 {
@@ -180,8 +178,6 @@ main()
 	memory_init();
 	fiber_init(fiber_c_invoke);
 
-	ibuf_create(tarantool_lua_ibuf, &cord()->slabc, 16000);
-
 	struct lua_State *L = luaL_newstate();
 	luaL_openlibs(L);
 
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 06/15] cord_buf: introduce ownership management
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (10 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 05/15] cord_buf: introduce cord_buf API Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 07/15] buffer: implement ffi stash Vladislav Shpilevoy via Tarantool-patches
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

The global ibuf used for hot Lua and Lua C code didn't have
ownership management. As a result, it could be reused in some
unexpected ways during Lua GC via __gc handlers, even if it was
currently in use in some code below the stack.

The patch makes cord_ibuf_take() steal the global buffer from its
global stash, and assign to the current fiber. cord_ibuf_put()
puts it back to the stash, and detaches from the fiber. If yield
happens before cord_ibuf_put(), the buffer is detached
automatically.

Fiber attach/detach is done via on_yield/on_stop triggers. The
buffer is not supposed to survive a yield, so this allows to
free/put the buffer back to the stash even if the owner didn't do
that. For instance, if a Lua exception was raised before
cord_ibuf_put() was called.

This makes cord buffer being safe to use in any yield-free code,
even if Lua GC might be started. And in non-Lua code as well.

Part of #5632

(cherry picked from commit c20e0449c52e36a987fb1c8fa3c18f398a395851)
---
 src/cord_buf.c               | 148 +++++++++++++++++++++++++++++++----
 src/cord_buf.h               |   6 +-
 test/app-tap/buffer.test.lua |  59 ++++++++++++++
 3 files changed, 197 insertions(+), 16 deletions(-)
 create mode 100755 test/app-tap/buffer.test.lua

diff --git a/src/cord_buf.c b/src/cord_buf.c
index cac508c3d..661c60010 100644
--- a/src/cord_buf.c
+++ b/src/cord_buf.c
@@ -5,6 +5,7 @@
  */
 #include "cord_buf.h"
 #include "fiber.h"
+#include "trigger.h"
 
 #include "small/ibuf.h"
 
@@ -13,35 +14,152 @@ enum {
 	CORD_IBUF_START_CAPACITY = 16384,
 };
 
-static struct ibuf *cord_buf_global = NULL;
+/** Global buffer with automatic collection on fiber yield. */
+struct cord_buf {
+	/** Base buffer. */
+	struct ibuf base;
+	/**
+	 * Triggers on fiber stop/yield when the buffer is either destroyed or
+	 * cached to the global stash for later reuse.
+	 */
+	struct trigger on_stop;
+	struct trigger on_yield;
+#ifndef NDEBUG
+	/**
+	 * Fiber owning the buffer right now. Used for debug and sanity checks
+	 * only.
+	 */
+	struct fiber *owner;
+#endif
+};
 
-struct ibuf *
-cord_ibuf_take(void)
+/**
+ * The global buffer last saved to the cache. Having it here is supposed to
+ * help to reuse the buffer's already allocated data sometimes.
+ */
+static struct cord_buf *cord_buf_global = NULL;
+
+static inline void
+cord_buf_put(struct cord_buf *buf);
+
+static void
+cord_buf_delete(struct cord_buf *buf);
+
+static inline void
+cord_buf_set_owner(struct cord_buf *buf)
 {
-	assert(cord_is_main());
-	struct ibuf *buf = cord_buf_global;
-	if (buf != NULL) {
-		ibuf_reset(buf);
-		return buf;
-	}
-	buf = malloc(sizeof(*buf));
+	assert(buf->owner == NULL);
+	struct fiber *f = fiber();
+	trigger_add(&f->on_stop, &buf->on_stop);
+	trigger_add(&f->on_yield, &buf->on_yield);
+#ifndef NDEBUG
+	buf->owner = f;
+#endif
+	ibuf_reset(&buf->base);
+}
+
+static inline void
+cord_buf_clear_owner(struct cord_buf *buf)
+{
+	assert(buf->owner == fiber());
+	trigger_clear(&buf->on_stop);
+	trigger_clear(&buf->on_yield);
+#ifndef NDEBUG
+	buf->owner = NULL;
+#endif
+}
+
+static void
+cord_buf_on_stop(struct trigger *trigger, void *event)
+{
+	(void)event;
+	struct cord_buf *buf = trigger->data;
+	assert(trigger == &buf->on_stop);
+	cord_buf_put(buf);
+}
+
+static void
+cord_buf_on_yield(struct trigger *trigger, void *event)
+{
+	(void)event;
+	struct cord_buf *buf = trigger->data;
+	assert(trigger == &buf->on_yield);
+	cord_buf_put(buf);
+}
+
+static struct cord_buf *
+cord_buf_new(void)
+{
+	struct cord_buf *buf = malloc(sizeof(*buf));
 	if (buf == NULL)
 		panic("Couldn't allocate thread buffer");
-	ibuf_create(buf, &cord()->slabc, CORD_IBUF_START_CAPACITY);
-	cord_buf_global = buf;
+	ibuf_create(&buf->base, &cord()->slabc, CORD_IBUF_START_CAPACITY);
+	trigger_create(&buf->on_stop, cord_buf_on_stop, buf, NULL);
+	trigger_create(&buf->on_yield, cord_buf_on_yield, buf, NULL);
+#ifndef NDEBUG
+	buf->owner = NULL;
+#endif
+	return buf;
+}
+
+static inline void
+cord_buf_put(struct cord_buf *buf)
+{
+	assert(cord_is_main());
+	cord_buf_clear_owner(buf);
+	/*
+	 * Delete if the stash is busy. It could happen if there was >= 2
+	 * buffers at some point and one of them is already saved back to the
+	 * stash.
+	 *
+	 * XXX: in future it might be useful to consider saving the buffers into
+	 * a list. Maybe keep always at most 2 buffers, because usually there
+	 * are at most 2 contexts: normal Lua and Lua during GC. Recursive
+	 * GC is supposed to be rare, no need to optimize it.
+	 */
+	if (cord_buf_global == NULL)
+		cord_buf_global = buf;
+	else
+		cord_buf_delete(buf);
+}
+
+static inline struct cord_buf *
+cord_buf_take(void)
+{
+	assert(cord_is_main());
+	struct cord_buf *buf = cord_buf_global;
+	if (buf != NULL)
+		cord_buf_global = NULL;
+	else
+		buf = cord_buf_new();
+	cord_buf_set_owner(buf);
 	return buf;
 }
 
+static void
+cord_buf_delete(struct cord_buf *buf)
+{
+	assert(buf->owner == NULL);
+	ibuf_destroy(&buf->base);
+	TRASH(buf);
+	free(buf);
+}
+
+struct ibuf *
+cord_ibuf_take(void)
+{
+	return &cord_buf_take()->base;
+}
+
 void
 cord_ibuf_put(struct ibuf *ibuf)
 {
-	(void)ibuf;
-	assert(ibuf == cord_buf_global);
+	cord_buf_put((struct cord_buf *)ibuf);
 }
 
 void
 cord_ibuf_drop(struct ibuf *ibuf)
 {
 	ibuf_reinit(ibuf);
-	assert(ibuf == cord_buf_global);
+	cord_ibuf_put(ibuf);
 }
diff --git a/src/cord_buf.h b/src/cord_buf.h
index 59f429c8f..5e65d138b 100644
--- a/src/cord_buf.h
+++ b/src/cord_buf.h
@@ -18,7 +18,9 @@ struct ibuf *
 cord_ibuf_take(void);
 
 /**
- * Put the global ibuf back.
+ * Put the global ibuf back. It is not necessary - the buffer is put back on the
+ * next yield. But then it can't be reused/freed until the yield. Put it back
+ * manually when possible.
  */
 void
 cord_ibuf_put(struct ibuf *ibuf);
@@ -29,6 +31,8 @@ cord_ibuf_put(struct ibuf *ibuf);
  * because it is often needed from Lua, and allows not to call :recycle() there,
  * which would be an additional FFI call before cord_ibuf_put().
  *
+ * Drop is not necessary though, see the put() comment.
+ *
  * XXX: recycle of the global buffer is a workaround for the ibuf being used in
  * some places working with Lua API, where it wasn't wanted to "reuse" it
  * anyhow. Instead, the global buffer is used to protect from the buffer leak in
diff --git a/test/app-tap/buffer.test.lua b/test/app-tap/buffer.test.lua
new file mode 100755
index 000000000..f57b3cf45
--- /dev/null
+++ b/test/app-tap/buffer.test.lua
@@ -0,0 +1,59 @@
+#!/usr/bin/env tarantool
+
+local tap = require('tap')
+local fiber = require('fiber')
+local buffer = require('buffer')
+local cord_ibuf_take = buffer.internal.cord_ibuf_take
+local cord_ibuf_put = buffer.internal.cord_ibuf_put
+local cord_ibuf_drop = buffer.internal.cord_ibuf_drop
+
+local function test_cord_ibuf(test)
+    test:plan(10)
+
+    local ibuf1 = cord_ibuf_take()
+    test:is(ibuf1:size(), 0, 'is empty')
+    ibuf1:alloc(1)
+    test:is(ibuf1:size(), 1, 'alloc 1')
+    cord_ibuf_put(ibuf1)
+
+    ibuf1 = cord_ibuf_take()
+    test:is(ibuf1:size(), 0, 'is empty again')
+    ibuf1:alloc(1)
+    cord_ibuf_drop(ibuf1)
+
+    ibuf1 = cord_ibuf_take()
+    test:is(ibuf1:capacity(), 0, 'has no capacity')
+    local pos1 = ibuf1:alloc(1)
+    pos1[0] = 1
+
+    local ibuf2 = cord_ibuf_take()
+    test:isnt(ibuf1, ibuf2, 'can have 2 cord buffers')
+    test:is(ibuf2:size(), 0, 'second is empty')
+    local pos2 = ibuf2:alloc(1)
+    pos2[0] = 2
+    test:is(pos1[0], 1, 'change does not affect the first buffer')
+    cord_ibuf_put(ibuf2)
+    ibuf1 = ibuf2
+
+    fiber.yield()
+    ibuf2 = cord_ibuf_take()
+    test:is(ibuf1, ibuf2, 'yield drops the ownership')
+    cord_ibuf_put(ibuf2)
+
+    ibuf1 = nil
+    local f = fiber.new(function()
+        ibuf1 = cord_ibuf_take()
+    end)
+    f:set_joinable(true)
+    f:join()
+    test:isnt(ibuf1, nil, 'took a cord buf in a new fiber')
+    ibuf2 = cord_ibuf_take()
+    test:is(ibuf1, ibuf2, 'was freed on fiber stop and reused')
+    cord_ibuf_put(ibuf2)
+end
+
+local test = tap.test('buffer')
+test:plan(1)
+test:test("cord buffer", test_cord_ibuf)
+
+os.exit(test:check() and 0 or 1)
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 07/15] buffer: implement ffi stash
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (11 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 06/15] cord_buf: introduce ownership management Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 08/15] uuid: replace static_alloc with " Vladislav Shpilevoy via Tarantool-patches
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

Buffer module now exposes ffi_stash_new() function which returns 2
functions take() and put().

FFI stash implements proper ownership of global heavy-to-create
objects which can only be created via FFI. Such as structs,
pointers, arrays.

It should help to fix buffer's registers (buffer.reg1,
buffer.reg2, buffer.reg_array), and other global FFI objects such
as 'struct port_c' in schema.lua.

The issue is that when these objects are global, they might be
re-used right during usage in case Lua starts GC and invokes
__gc handlers. Just like it happened with IBUF_SHARED and
static_alloc().

Part of #5632

(cherry picked from commit a5549a44db835a59ad6ab82d206172f9d8047a2d)
---
 src/lua/buffer.lua | 50 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/src/lua/buffer.lua b/src/lua/buffer.lua
index 398319cf2..555970ea8 100644
--- a/src/lua/buffer.lua
+++ b/src/lua/buffer.lua
@@ -182,6 +182,55 @@ local function ibuf_new(arg, arg2)
     errorf('Usage: ibuf([size])')
 end
 
+--
+-- Stash keeps an FFI object for re-usage and helps to ensure the proper
+-- ownership. Is supposed to be used in yield-free code when almost always it is
+-- possible to put the taken object back.
+-- Then cost of the stash is almost the same as ffi.new() for small objects like
+-- 'int[1]' even when jitted. Examples:
+--
+-- * ffi.new('int[1]') is about ~0.4ns, while the stash take() + put() is about
+--   ~0.8ns;
+--
+-- * Much better on objects > 128 bytes in size. ffi.new('struct uri[1]') is
+--   ~300ns, while the stash is still ~0.8ns;
+--
+-- * For structs not allocated as an array is also much better than ffi.new().
+--   For instance, ffi.new('struct tt_uuid') is ~300ns, the stash is ~0.8ns.
+--   Even though 'struct tt_uuid' is 16 bytes;
+--
+local function ffi_stash_new(c_type)
+    local item = nil
+
+    local function take()
+        local res
+        -- This line is guaranteed to be GC-safe. GC is not invoked. Because
+        -- there are no allocation. So it can be considered 'atomic'.
+        res, item = item, nil
+        -- The next lines don't need to be atomic and can survive GC. The only
+        -- important part was to take the global item and set it to nil.
+        if res then
+            return res
+        end
+        return ffi.new(c_type)
+    end
+
+    local function put(i)
+        -- It is ok to rewrite the existing global item if it was set. Does
+        -- not matter. They are all the same.
+        item = i
+    end
+
+    -- Due to some random reason if the stash returns a table with methods it
+    -- works faster than returning them as multiple values. Regardless of how
+    -- the methods are used later. Even if the caller will cache take and put
+    -- methods anyway.
+    return {
+        take = take,
+        put = put,
+    }
+end
+
 --
 -- Cord buffer is useful for the places, where
 --
@@ -225,4 +274,5 @@ return {
     internal = internal,
     ibuf = ibuf_new;
     READAHEAD = READAHEAD;
+    ffi_stash_new = ffi_stash_new,
 }
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 08/15] uuid: replace static_alloc with ffi stash
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (12 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 07/15] buffer: implement ffi stash Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 09/15] uuid: drop tt_uuid_str() from Lua Vladislav Shpilevoy via Tarantool-patches
  2021-03-29 15:41 ` [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Kirill Yukhin via Tarantool-patches
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

static_alloc() appears not to be safe to use in Lua, because it
does not provide any ownership protection for the returned values.

The problem appears when something is allocated, then Lua GC
starts, and some __gc handlers might also use static_alloc(). In
Lua and in C - both lead to the buffer being corrupted in its
original usage place.

The patch is a part of activity of getting rid of static_alloc()
in Lua. It removes it from uuid Lua module and makes it use the
new FFI stash feature, which helps to cache frequently used and
heavy to allocate FFI values.

ffi.new() is not used, because

- It produces a new GC object;

- ffi.new('struct tt_uuid') costs around 300ns while FFI stash
  costs around 0.8ns (although it is magically fixed when
  ffi.new('struct tt_uuid[1]') is used);

- Without JIT ffi.new() costs about the same as the stash, ~280ns
  for small objects like tt_uuid.

The patch makes uuid perf a bit better in the places where
static_alloc() was used, because its cost was around 7ns for one
allocation.

(cherry picked from commit 5cbb69077ff9fe22224f94aeaa05d8da9ed9cb6e)
---
 src/lua/uuid.lua     | 24 +++++++++++++++++-------
 test/app/uuid.result |  2 +-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/src/lua/uuid.lua b/src/lua/uuid.lua
index 956ad6e36..46b35075a 100644
--- a/src/lua/uuid.lua
+++ b/src/lua/uuid.lua
@@ -1,6 +1,7 @@
 -- uuid.lua (internal file)
 
 local ffi = require("ffi")
+local buffer = require('buffer')
 local builtin = ffi.C
 
 ffi.cdef[[
@@ -33,7 +34,9 @@ extern const struct tt_uuid uuid_nil;
 local uuid_t = ffi.typeof('struct tt_uuid')
 local UUID_STR_LEN = 36
 local UUID_LEN = ffi.sizeof(uuid_t)
-local uuidbuf = ffi.new(uuid_t)
+local uuid_stash = buffer.ffi_stash_new(uuid_t)
+local uuid_stash_take = uuid_stash.take
+local uuid_stash_put = uuid_stash.put
 
 local uuid_tostring = function(uu)
     if not ffi.istype(uuid_t, uu) then
@@ -69,11 +72,12 @@ local uuid_tobin = function(uu, byteorder)
         return error('Usage: uuid:bin([byteorder])')
     end
     if need_bswap(byteorder) then
-        if uu ~= uuidbuf then
-            ffi.copy(uuidbuf, uu, UUID_LEN)
-        end
+        local uuidbuf = uuid_stash_take()
+        ffi.copy(uuidbuf, uu, UUID_LEN)
         builtin.tt_uuid_bswap(uuidbuf)
-        return ffi.string(ffi.cast('char *', uuidbuf), UUID_LEN)
+        uu = ffi.string(ffi.cast('char *', uuidbuf), UUID_LEN)
+        uuid_stash_put(uuidbuf)
+        return uu
     end
     return ffi.string(ffi.cast('char *', uu), UUID_LEN)
 end
@@ -114,12 +118,18 @@ local uuid_new = function()
 end
 
 local uuid_new_bin = function(byteorder)
+    local uuidbuf = uuid_stash_take()
     builtin.tt_uuid_create(uuidbuf)
-    return uuid_tobin(uuidbuf, byteorder)
+    local res = uuid_tobin(uuidbuf, byteorder)
+    uuid_stash_put(uuidbuf)
+    return res
 end
 local uuid_new_str = function()
+    local uuidbuf = uuid_stash_take()
     builtin.tt_uuid_create(uuidbuf)
-    return uuid_tostring(uuidbuf)
+    local res = uuid_tostring(uuidbuf)
+    uuid_stash_put(uuidbuf)
+    return res
 end
 
 local uuid_mt = {
diff --git a/test/app/uuid.result b/test/app/uuid.result
index 0713614c6..1da8b3e58 100644
--- a/test/app/uuid.result
+++ b/test/app/uuid.result
@@ -106,7 +106,7 @@ uu.node[5]
 -- invalid values
 uuid.fromstr(nil)
 ---
-- error: 'builtin/uuid.lua:47: fromstr(str)'
+- error: 'builtin/uuid.lua:50: fromstr(str)'
 ...
 uuid.fromstr('')
 ---
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Tarantool-patches] [PATCH 09/15] uuid: drop tt_uuid_str() from Lua
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (13 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 08/15] uuid: replace static_alloc with " Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-24 21:24 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-29 15:41 ` [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Kirill Yukhin via Tarantool-patches
  15 siblings, 0 replies; 17+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-24 21:24 UTC (permalink / raw)
  To: tarantool-patches, kyukhin

The function converts struct tt_uuid * to a string. The string is
allocated on the static buffer, which can't be used in Lua due to
unpredictable GC behaviour. It can start working any moment even
if tt_uuid_str() has returned, but its result wasn't passed to
ffi.string() yet. Then the buffer might be overwritten.

Lua uuid now uses tt_uuid_to_string() which does the same but
takes the buffer pointer. The buffer is stored in an ffi stash,
because it is x4 times faster than ffi.new('char[37]') (where 37
is length of a UUID string + terminating 0) (2.4 ns vs 0.8 ns).

After this patch UUID is supposed to be fully compatible with Lua
GC handlers.

Part of #5632

(cherry picked from commit acf8745ed8fef47e6d1f1c31708c7c9d6324d2f3)
---
 extra/exports                              |  1 +
 src/lua/uuid.lua                           | 13 ++++--
 test/app-tap/gh-5632-gc-buf-reuse.test.lua | 49 ++++++++++++++++++++++
 test/app/uuid.result                       |  2 +-
 4 files changed, 61 insertions(+), 4 deletions(-)
 create mode 100755 test/app-tap/gh-5632-gc-buf-reuse.test.lua

diff --git a/extra/exports b/extra/exports
index 91094206d..b27b22a7c 100644
--- a/extra/exports
+++ b/extra/exports
@@ -51,6 +51,7 @@ tt_uuid_is_equal
 tt_uuid_is_nil
 tt_uuid_bswap
 tt_uuid_from_string
+tt_uuid_to_string
 log_level
 log_format
 uri_parse
diff --git a/src/lua/uuid.lua b/src/lua/uuid.lua
index 46b35075a..3efb7f66b 100644
--- a/src/lua/uuid.lua
+++ b/src/lua/uuid.lua
@@ -26,8 +26,6 @@ bool
 tt_uuid_is_nil(const struct tt_uuid *uu);
 bool
 tt_uuid_is_equal(const struct tt_uuid *lhs, const struct tt_uuid *rhs);
-char *
-tt_uuid_str(const struct tt_uuid *uu);
 extern const struct tt_uuid uuid_nil;
 ]]
 
@@ -38,11 +36,20 @@ local uuid_stash = buffer.ffi_stash_new(uuid_t)
 local uuid_stash_take = uuid_stash.take
 local uuid_stash_put = uuid_stash.put
 
+local uuid_str_stash =
+    buffer.ffi_stash_new(string.format('char[%s]', UUID_STR_LEN + 1))
+local uuid_str_stash_take = uuid_str_stash.take
+local uuid_str_stash_put = uuid_str_stash.put
+
 local uuid_tostring = function(uu)
     if not ffi.istype(uuid_t, uu) then
         return error('Usage: uuid:str()')
     end
-    return ffi.string(builtin.tt_uuid_str(uu), UUID_STR_LEN)
+    local strbuf = uuid_str_stash_take()
+    builtin.tt_uuid_to_string(uu, strbuf)
+    uu = ffi.string(strbuf, UUID_STR_LEN)
+    uuid_str_stash_put(strbuf)
+    return uu
 end
 
 local uuid_fromstr = function(str)
diff --git a/test/app-tap/gh-5632-gc-buf-reuse.test.lua b/test/app-tap/gh-5632-gc-buf-reuse.test.lua
new file mode 100755
index 000000000..b09b1bf3e
--- /dev/null
+++ b/test/app-tap/gh-5632-gc-buf-reuse.test.lua
@@ -0,0 +1,49 @@
+#!/usr/bin/env tarantool
+
+--
+-- gh-5632: Lua code should not use any global buffers or objects without
+-- proper ownership protection. Otherwise these items might be suddenly reused
+-- during Lua GC which happens almost at any moment. That might lead to data
+-- corruption.
+--
+
+local tap = require('tap')
+local ffi = require('ffi')
+local uuid = require('uuid')
+
+local function test_uuid(test)
+    test:plan(1)
+
+    local gc_count = 100
+    local iter_count = 1000
+    local is_success = true
+
+    local function uuid_to_str()
+        local uu = uuid.new()
+        local str1 = uu:str()
+        local str2 = uu:str()
+        if str1 ~= str2 then
+            is_success = false
+            assert(false)
+        end
+    end
+
+    local function create_gc()
+        for _ = 1, gc_count do
+            ffi.gc(ffi.new('char[1]'), function() uuid_to_str() end)
+        end
+    end
+
+    for _ = 1, iter_count do
+        create_gc()
+        uuid_to_str()
+    end
+
+    test:ok(is_success, 'uuid in gc')
+end
+
+local test = tap.test('gh-5632-gc-buf-reuse')
+test:plan(1)
+test:test('uuid in __gc', test_uuid)
+
+os.exit(test:check() and 0 or 1)
diff --git a/test/app/uuid.result b/test/app/uuid.result
index 1da8b3e58..b252c497d 100644
--- a/test/app/uuid.result
+++ b/test/app/uuid.result
@@ -106,7 +106,7 @@ uu.node[5]
 -- invalid values
 uuid.fromstr(nil)
 ---
-- error: 'builtin/uuid.lua:50: fromstr(str)'
+- error: 'builtin/uuid.lua:57: fromstr(str)'
 ...
 uuid.fromstr('')
 ---
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10
  2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
                   ` (14 preceding siblings ...)
  2021-03-24 21:24 ` [Tarantool-patches] [PATCH 09/15] uuid: drop tt_uuid_str() from Lua Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-29 15:41 ` Kirill Yukhin via Tarantool-patches
  15 siblings, 0 replies; 17+ messages in thread
From: Kirill Yukhin via Tarantool-patches @ 2021-03-29 15:41 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hello,

On 24 мар 22:24, Vladislav Shpilevoy wrote:
> The patchset is a ported fix for 5632 from the master branch.
> 
> No big changes except these:
> 
> - 1.10 didn't have static_alloc in Lua directly, but still the static buffer
>   could be returned indirectly from tt_uuid_str() for example;
> 
> - 1.10 didn't have Lua registers, but had some global variables created via
>   ffi.new(). For example, in uuid.lua, msgpackffi.lua;
> 
> - sio address formatting function was formatted differently.
> 
> Nonetheless the commit messages are left intact. They still mention
> static_alloc and Lua registers.
> 
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5632-global-buf-crash-1.10
> Issue: https://github.com/tarantool/tarantool/issues/5632

I've cherry-pick your patch set into 1.10.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-03-29 15:41 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-24 21:24 [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 01/15] fio: don't use shared buffer in pread() Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 10/15] uri: replace static_alloc with ffi stash and ibuf Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 11/15] lua: use lua_pushfstring() instead of tt_sprintf() Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 12/15] sio: rework sio_strfaddr() Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 13/15] sio: increase SERVICE_NAME_MAXLEN size Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 14/15] sio: introduce and use sio_snprintf() Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 15/15] buffer: remove Lua registers Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 02/15] test: don't use IBUF_SHARED in the tests Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 03/15] tuple: pass global ibuf explicitly where possible Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 04/15] iconv: take errno before reseting the context Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 05/15] cord_buf: introduce cord_buf API Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 06/15] cord_buf: introduce ownership management Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 07/15] buffer: implement ffi stash Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 08/15] uuid: replace static_alloc with " Vladislav Shpilevoy via Tarantool-patches
2021-03-24 21:24 ` [Tarantool-patches] [PATCH 09/15] uuid: drop tt_uuid_str() from Lua Vladislav Shpilevoy via Tarantool-patches
2021-03-29 15:41 ` [Tarantool-patches] [PATCH 00/15] Cord buffer, static alloc, and Lua GC bug for 1.10 Kirill Yukhin via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox