[Tarantool-patches] [PATCH] tuple: make tuple_bless() compilable
Sergey Kaplun
skaplun at tarantool.org
Fri Oct 22 16:02:25 MSK 2021
tuple_bless() ends with a tail call to ffi.gc(), whose result is
returned to the caller. The tail call replaces the current (tuple_bless)
frame with the frame of the callee (ffi.gc). When the JIT tries to record
the return from `ffi.gc()` to the frame below, it aborts the trace
recording with the error "NYI: return to lower frame".
This patch replaces the tail call with a regular call: the result is
stored in an additional local variable, which is then returned to the
caller.
---
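For illustration only, here is a minimal sketch of the two call shapes
the commit message contrasts. It uses plain Lua and a hypothetical
stand-in `callee` instead of ffi.gc(); the function names are not taken
from tuple.lua. The recursion at the end only demonstrates that a tail
call reuses the caller's frame while a regular call keeps it.

```lua
-- Hypothetical stand-in for ffi.gc(); any callee shows the call shape.
local function callee(x)
    return x
end

-- Tail call: the callee's frame replaces the frame of this function.
local function with_tail_call(x)
    return callee(x)
end

-- Regular call: the result is stored in a local first, so this
-- function keeps its own frame until the callee returns.
local function without_tail_call(x)
    local result = callee(x)
    return result
end

-- Both shapes return the same value to the caller.
assert(with_tail_call(42) == without_tail_call(42))

-- Tail recursion reuses one frame, so the depth is not limited
-- by the stack size.
local function count_tail(n)
    if n == 0 then return 0 end
    return count_tail(n - 1)
end

-- Regular recursion adds a frame per level and is expected to hit
-- the stack limit at this depth.
local function count_plain(n)
    if n == 0 then return 0 end
    local r = count_plain(n - 1)
    return r
end

local tail_ok = pcall(count_tail, 1e6)
local plain_ok = pcall(count_plain, 1e6)
print(tail_ok, plain_ok)
```

The patch trades the frame reuse of the first shape for the second one,
at the cost of one extra stack slot and frame.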
This patch became possible thanks to Michael Filonenko and his
benchmarks of TDG runs with jit.dump() enabled. Analysis of that dump
showed that tuple_bless() is not compiled. This uncompiled chunk of code
prevents JIT compilation of all workflows that use tuple_bless()
(i.e. tuple:update() and tuple:upsert()). The change is trivial, but it
makes the tuple:update()/upsert() scenario about 1.8x faster (see the
benchmarks below). I hope this patch will be a stimulus for including
benchmarks of our products like TDG in routine performance runs,
together with the corresponding profiler dumps.
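A rough sketch of how such a trace abort could be looked for locally,
assuming a Tarantool console where LuaJIT's bundled `jit.v` module can
be loaded. The log file name `jit_v.log` and the loop count are
arbitrary choices for this example, and the exact log format is not
reproduced here. Outside Tarantool the block does nothing.

```lua
-- jit.v is LuaJIT's verbose mode module; it may be absent elsewhere.
local has_jit_v, jit_v = pcall(require, 'jit.v')
-- The `box` global is only defined inside Tarantool.
local in_tarantool = type(box) == 'table' and box.tuple ~= nil

local traced = false
if has_jit_v and in_tarantool then
    -- Log started, finished and aborted traces to a file
    -- (the file name is an assumption of this sketch).
    jit_v.on('jit_v.log')
    local t = box.tuple.new{'abc', 'def', 'ghi', 'abc'}
    -- Make the loop hot enough for the JIT to start recording.
    for i = 1, 1e4 do
        t:update{{'=', 3, 'xxx'}}
    end
    jit_v.off()
    traced = true
end
print(traced)
```

If the abort described above is present, the log is expected to contain
a line mentioning "NYI: return to lower frame".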
Benchmarks:
Before patch:
Update:
| Tarantool 2.10.0-beta1-90-g31594b427
| type 'help' for interactive help
| tarantool> local t = {}
| for i = 1, 1e6 do
| table.insert(t, box.tuple.new{'abc', 'def', 'ghi', 'abc'})
| end
| local clock = require"clock"
| local S = clock.proc()
| for i = 1, 1e6 do t[i]:update{{"=", 3, "xxx"}} end
| return clock.proc() - S;
| ---
| - 4.208298872
Upsert: 4.158661731
After patch:
Update:
| Tarantool 2.10.0-beta1-90-g31594b427
| type 'help' for interactive help
| tarantool> local t = {}
| for i = 1, 1e6 do
| table.insert(t, box.tuple.new{'abc', 'def', 'ghi', 'abc'})
| end
| local clock = require"clock"
| local S = clock.proc()
| for i = 1, 1e6 do t[i]:update{{"=", 3, "xxx"}} end
| return clock.proc() - S;
| ---
| - 2.357670738
Upsert: 2.334134195
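As a quick cross-check of the numbers above, the speedup can be computed
from the reported timings. The table layout and the `speedup` helper are
made up for this sketch; only the four timings come from the runs above.

```lua
-- Timings in seconds, copied from the benchmark output above.
local results = {
    update = { before = 4.208298872, after = 2.357670738 },
    upsert = { before = 4.158661731, after = 2.334134195 },
}

-- Ratio of the time before the patch to the time after it.
local function speedup(r)
    return r.before / r.after
end

for _, name in ipairs({ 'update', 'upsert' }) do
    print(string.format('%s: %.2fx', name, speedup(results[name])))
end
-- prints:
-- update: 1.78x
-- upsert: 1.78x
```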
Branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-tuple-bless-compile
src/box/lua/tuple.lua | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/src/box/lua/tuple.lua b/src/box/lua/tuple.lua
index fa76f4f7f..73446ab22 100644
--- a/src/box/lua/tuple.lua
+++ b/src/box/lua/tuple.lua
@@ -98,7 +98,14 @@ local tuple_bless = function(tuple)
-- overflow checked by tuple_bless() in C
builtin.box_tuple_ref(tuple)
-- must never fail:
- return ffi.gc(ffi.cast(const_tuple_ref_t, tuple), tuple_gc)
+ -- XXX: If we use a tail call here (which replaces the top frame
+ -- instead of creating a new one for the call), the JIT tries
+ -- to compile the return from `ffi.gc()` to the frame below. This
+ -- aborts the trace recording with the error "NYI: return to
+ -- lower frame". So avoid the tail call and use additional stack
+ -- slots (for the local variable and the frame).
+ local tuple_ref = ffi.gc(ffi.cast(const_tuple_ref_t, tuple), tuple_gc)
+ return tuple_ref
end
local tuple_check = function(tuple, usage)
--
2.31.0