[Tarantool-patches] [PATCH] tuple: make tuple_bless() compilable

Sergey Kaplun skaplun at tarantool.org
Fri Oct 22 16:02:25 MSK 2021


tuple_bless() returns to the caller through a tail call to ffi.gc().
This tail call replaces the current (tuple_bless) frame with the frame
of the callee (ffi.gc). When the JIT compiler tries to record the
return from `ffi.gc()` to the frame below, it aborts the trace
recording with the error "NYI: return to lower frame".

This patch replaces the tail call with an assignment to an additional
local variable, which is returned to the caller right after.
---

Actually, this patch became possible thanks to Michael Filonenko and his
benchmarks of TDG runs with jit.dump() enabled. After analyzing this
dump we realized that tuple_bless() is not compiled. This uncompiled
chunk of code leads to JIT cancer for all workflows that use
tuple_bless() (i.e. tuple:update() and tuple:upsert()). The change is
trivial, but it gives an almost 2x performance improvement (about 1.8x
in the runs below) for the tuple:update()/upsert() scenario. I hope this
patch will be a stimulus for including benchmarks of our products such
as TDG in routine performance runs, with the corresponding profiler
dumps.
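
For readers less familiar with tail calls, the difference between the
two forms can be sketched in plain Lua. This is only an illustration
under stated assumptions: wrap(), bless_tail(), bless_local() and
countdown() are hypothetical names, and wrap() merely stands in for the
ffi.gc(ffi.cast(...), tuple_gc) call so that the snippet runs without
Tarantool or the FFI. It does not reproduce the trace abort itself.

```lua
-- Hypothetical stand-in for the ffi.gc(ffi.cast(...), tuple_gc) call;
-- it only wraps a value so the sketch runs without Tarantool or FFI.
local function wrap(value)
    return { value = value }
end

-- Tail-call form (as before the patch): `return f(...)` lets the
-- callee reuse the caller's frame, so wrap() returns past bless_tail().
local function bless_tail(value)
    return wrap(value)
end

-- Non-tail form (as after the patch): the result lands in a local,
-- so wrap() returns into bless_local()'s own frame first.
local function bless_local(value)
    local wrapped = wrap(value)
    return wrapped
end

-- Frame reuse is observable: a tail-recursive countdown does not
-- grow the stack, no matter how deep it goes.
local function countdown(n)
    if n == 0 then
        return "done"
    end
    return countdown(n - 1)
end

local a = bless_tail(42)
local b = bless_local(42)
local c = countdown(100000)
```

Both forms return the same value; they differ only in which frame the
callee returns to, which is the part the trace recorder cares about.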

Benchmarks:

Before patch:

Update:
| Tarantool 2.10.0-beta1-90-g31594b427
| type 'help' for interactive help
| tarantool> local t = {}
|            for i = 1, 1e6 do
|                table.insert(t, box.tuple.new{'abc', 'def', 'ghi', 'abc'})
|            end
|            local clock = require"clock"
|            local S = clock.proc()
|            for i = 1, 1e6 do t[i]:update{{"=", 3, "xxx"}} end
|            return clock.proc() - S;
| ---
| - 4.208298872

Upsert: 4.158661731

After patch:

Update:
| Tarantool 2.10.0-beta1-90-g31594b427
| type 'help' for interactive help
| tarantool> local t = {}
|            for i = 1, 1e6 do
|                table.insert(t, box.tuple.new{'abc', 'def', 'ghi', 'abc'})
|            end
|            local clock = require"clock"
|            local S = clock.proc()
|            for i = 1, 1e6 do t[i]:update{{"=", 3, "xxx"}} end
|            return clock.proc() - S;
| ---
| - 2.357670738

Upsert: 2.334134195

Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-tuple-bless-compile

 src/box/lua/tuple.lua | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/box/lua/tuple.lua b/src/box/lua/tuple.lua
index fa76f4f7f..73446ab22 100644
--- a/src/box/lua/tuple.lua
+++ b/src/box/lua/tuple.lua
@@ -98,7 +98,14 @@ local tuple_bless = function(tuple)
     -- overflow checked by tuple_bless() in C
     builtin.box_tuple_ref(tuple)
     -- must never fail:
-    return ffi.gc(ffi.cast(const_tuple_ref_t, tuple), tuple_gc)
+    -- XXX: If we use a tail call here (which replaces the top
+    -- frame instead of creating a new one), then JIT tries to
+    -- compile the return from `ffi.gc()` to the frame below.
+    -- This aborts the trace recording with the error "NYI:
+    -- return to lower frame". So avoid the tail call and use
+    -- additional stack slots (for the local variable and the frame).
+    local tuple_ref = ffi.gc(ffi.cast(const_tuple_ref_t, tuple), tuple_gc)
+    return tuple_ref
 end
 
 local tuple_check = function(tuple, usage)
-- 
2.31.0


