[Tarantool-patches] [PATCH 00/20] Rewrite performance critical parts of net.box in C
Vladimir Davydov
vdavydov at tarantool.org
Fri Jul 23 14:07:10 MSK 2021
https://github.com/tarantool/tarantool/tree/vdavydov/net-box-optimization
This patch set rewrites performance-critical parts of net.box (response
dispatching and IO loop) in C. It shouldn't introduce any user-visible
changes or changes in the logic, because the C version was basically
created by rewriting Lua code line-by-line. The goal of this work is to
improve performance of CPU-bound applications that use net.box, such as
vshard.router.
To ensure that this patch does meet the expectations, I ran a simple
benchmark [tnt-bench.lua], which issues multiple concurrent requests in
a loop. The test measures RPS per wall time (WALL) and processor time
(PROC). Concurrency is implemented with either fibers or futures.
There are a few test cases that issue different kinds of requests:
- REPLACE({k, 'bar', i + k})
- UPDATE({k}, {{'=', 2, 'bar'}, {'=', 3, i + k}})
- SELECT({k})
- CALL('bench_func', {1, 2, 3, 'foo', 'bar'})
where i and k are integers and bench_func is defined as follows:
function bench_func(...) return {...} end
The test was run on my laptop (i5-10210U 1.60GHz) for the following
Tarantool versions built with CMAKE_BUILD_TYPE=RelWithDebInfo:
- master: 2.9.0-165-ga02cfe60cf23
- patched: master + this patch set
- poc: master + [tarantool-net-box-call-in-c.patch]
The latter is a proof-of-concept version that I created before starting
to work on this patch set.
The results are below.
                  /// USING FIBERS (SYNCHRONOUS) ///
         +-----------------------------++-----------------------------+
         |      KRPS (WALL TIME)       ||      KRPS (PROC TIME)       |
         +---------+---------+---------++---------+---------+---------+
         | master  | patched |   poc   || master  | patched |   poc   |
---------+---------+---------+---------++---------+---------+---------+
 REPLACE | 162.628 | 268.349 |   N/A   || 221.402 | 459.965 |   N/A   |
 UPDATE  | 126.905 | 195.835 |   N/A   || 173.635 | 334.609 |   N/A   |
 SELECT  | 187.742 | 353.043 |   N/A   || 207.605 | 427.147 |   N/A   |
 CALL    | 163.700 | 290.717 | 375.412 || 213.560 | 481.349 | 761.238 |
                 /// USING FUTURES (ASYNCHRONOUS) ///
         +-----------------------------++-----------------------------+
         |      KRPS (WALL TIME)       ||      KRPS (PROC TIME)       |
         +---------+---------+---------++---------+---------+---------+
         | master  | patched |   poc   || master  | patched |   poc   |
---------+---------+---------+---------++---------+---------+---------+
 REPLACE | 191.529 | 249.810 |   N/A   || 277.648 | 413.360 |   N/A   |
 UPDATE  | 155.116 | 173.850 |   N/A   || 231.603 | 273.624 |   N/A   |
 SELECT  | 238.657 | 286.699 |   N/A   || 269.040 | 333.706 |   N/A   |
 CALL    | 192.041 | 241.571 |   N/A   || 261.085 | 365.139 |   N/A   |
So the patch set increases RPS of synchronous net.box.call, which is the
primary method used by vshard.router, by about 75%. Other synchronous
methods show an improvement of between 50% and 90%. The number of
requests per processor second is roughly doubled by the patch, which
means that it also reduces CPU usage during the test: judging by the
KRPS[WALL]/KRPS[PROC] ratio, CPU usage decreased from about 75% to
about 60%.
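For readers who want to re-derive these figures, here is a minimal sketch
in Python. It is not part of the patch set or of tnt-bench.lua; the
function names are made up for illustration. It assumes that RPS[WALL] is
N / wall_time and RPS[PROC] is N / cpu_time for the same N requests, so
their quotient equals cpu_time / wall_time, that is, the CPU usage.

```python
# Synchronous results from the table above, in KRPS:
# method -> (master wall, patched wall, master proc, patched proc)
SYNC = {
    "REPLACE": (162.628, 268.349, 221.402, 459.965),
    "UPDATE":  (126.905, 195.835, 173.635, 334.609),
    "SELECT":  (187.742, 353.043, 207.605, 427.147),
    "CALL":    (163.700, 290.717, 213.560, 481.349),
}


def speedup_pct(before, after):
    """Relative throughput gain of `after` over `before`, in percent."""
    return (after / before - 1.0) * 100.0


def cpu_usage_pct(wall_krps, proc_krps):
    """CPU usage estimate: (N / wall_time) / (N / cpu_time) = cpu_time / wall_time."""
    return wall_krps / proc_krps * 100.0


for method, (m_wall, p_wall, m_proc, p_proc) in SYNC.items():
    print(
        f"{method:8s}"
        f" wall +{speedup_pct(m_wall, p_wall):.0f}%"
        f" proc x{p_proc / m_proc:.2f}"
        f" cpu {cpu_usage_pct(m_wall, m_proc):.0f}%"
        f" -> {cpu_usage_pct(p_wall, p_proc):.0f}%"
    )
```

The 75% -> 60% CPU usage figure corresponds to the CALL row; the other
rows give somewhat different ratios.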
Asynchronous calls don't show as much of an improvement as synchronous
ones, because for each asynchronous call we still have to create a
'future' object in Lua. Still, the improvement is quite noticeable:
about 30% for REPLACE, 10% for UPDATE, 20% for SELECT, and 25% for CALL.
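The gap between the two modes can be checked with a short Python sketch
over the wall-time columns of both tables. This is an illustration only;
the names SYNC_WALL, ASYNC_WALL and gain_pct are hypothetical and do not
appear in the benchmark script.

```python
# Wall-time KRPS taken from the two tables above: method -> (master, patched)
SYNC_WALL = {
    "REPLACE": (162.628, 268.349),
    "UPDATE":  (126.905, 195.835),
    "SELECT":  (187.742, 353.043),
    "CALL":    (163.700, 290.717),
}
ASYNC_WALL = {
    "REPLACE": (191.529, 249.810),
    "UPDATE":  (155.116, 173.850),
    "SELECT":  (238.657, 286.699),
    "CALL":    (192.041, 241.571),
}


def gain_pct(master, patched):
    """Throughput gain of the patched build over master, in percent."""
    return (patched / master - 1.0) * 100.0


for method in SYNC_WALL:
    sync_gain = gain_pct(*SYNC_WALL[method])
    async_gain = gain_pct(*ASYNC_WALL[method])
    print(f"{method:8s} sync +{sync_gain:.0f}%  async +{async_gain:.0f}%")
```

The figures quoted in the text are these gains rounded to the nearest 5
or 10 percent.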
What is surprising is that the PoC version still outperforms the patched
version by about 30% and shows even lower CPU usage (50% vs 60%). This
is probably caused by the IO loop implementation. I'm going to look into
that separately.
Links:
[tnt-bench.lua] https://gist.github.com/locker/7faeb39129a2421a85568c512288208f
[tarantool-net-box-call-in-c.patch] https://gist.github.com/locker/cd357f9482bfd207ffe7df610c4b2fba
For more information about net.box performance, see
- C/C++ vs Net.Box Connector Performance
https://docs.google.com/document/d/1v-d-qQ9zilOdDgDJZWTzs0cSJ9XVXLWRfoQnxNfYttc
- vshard.router.call performance analysis
https://docs.google.com/document/d/1VwMzs75Umi5IhFw-r54wj0b8s_d9WDCYFZ3lRMFzfB8
Vladimir Davydov (20):
net.box: fix console connection breakage when request is discarded
net.box: wake up wait_result callers when request is discarded
net.box: do not check worker_fiber in request:result,is_ready
net.box: remove decode_push from method_decoder table
net.box: use decode_tuple instead of decode_get
net.box: rename request.ctx to request.format
net.box: use integer id instead of method name
net.box: remove useless encode optimization
net.box: rewrite request encoder in C
lua/utils: make char ptr Lua CTIDs public
net.box: rewrite response decoder in C
net.box: rewrite error decoder in C
net.box: rewrite send_and_recv_{iproto,console} in C
net.box: rename netbox_{prepare,encode}_request to {begin,end}
net.box: rewrite request implementation in C
net.box: store next_request_id in C code
net.box: rewrite console handlers in C
net.box: rewrite iproto handlers in C
net.box: merge new_id, new_request and encode_method
net.box: do not create request object in Lua for sync requests
src/box/lua/net_box.c | 1714 ++++++++++++++---
src/box/lua/net_box.lua | 733 ++-----
src/lib/core/errinj.h | 1 +
src/lua/utils.c | 4 +-
src/lua/utils.h | 2 +
test/box/access.result | 24 +-
test/box/access.test.lua | 20 +-
test/box/errinj.result | 1 +
...net.box_console_connections_gh-2677.result | 2 +-
...t.box_console_connections_gh-2677.test.lua | 2 +-
.../net.box_discard_console_request.result | 62 +
.../net.box_discard_console_request.test.lua | 19 +
test/box/net.box_discard_gh-3107.result | 11 +
test/box/net.box_discard_gh-3107.test.lua | 3 +
.../net.box_incorrect_iterator_gh-841.result | 9 +-
...net.box_incorrect_iterator_gh-841.test.lua | 9 +-
test/box/net.box_iproto_hangs_gh-3464.result | 2 +-
.../box/net.box_iproto_hangs_gh-3464.test.lua | 2 +-
.../net.box_long-poll_input_gh-3400.result | 13 +-
.../net.box_long-poll_input_gh-3400.test.lua | 8 +-
test/box/suite.ini | 2 +-
21 files changed, 1735 insertions(+), 908 deletions(-)
create mode 100644 test/box/net.box_discard_console_request.result
create mode 100644 test/box/net.box_discard_console_request.test.lua
--
2.25.1