[PATCH v2 0/5] box: functional indexes

Kirill Shcherbatov kshcherbatov at tarantool.org
Mon Jun 24 17:26:59 MSK 2019


This patchset introduces functional indexes in memtx.
A functional index is an index that uses a user-defined function to extract
the key from the tuple being processed.

In the current implementation, only a persistent, deterministic, sandboxed Lua
function previously created with box.schema.func.create may be used in a
functional index. This leaves room to support new languages later and so to
transparently extend the set of supported extractors (e.g. an SQL extractor,
a C extractor).

When designing functional indexes, we should keep the following quality
attributes in mind:
QAS1: A functional index mustn't be able to harm data; it must be safe even
      with a malicious user.
QAS2: All customer snapshots must be bootable.

To follow QAS1, a function used by a functional index must be initialized in a
special sandbox (via setfenv), where only a limited number of functions are
available. This set of functions should be as minimal as possible. None of
them should provide access to Tarantool's data. A function initialized in a
sandbox also cannot access the global environment.
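
The allow-list idea behind the sandbox can be modelled roughly as follows.
This is only a conceptual C sketch, not the setfenv-based mechanism from the
patch: sandbox_allowed, its contents, and sandbox_resolve are hypothetical
names chosen for illustration, and the real set of permitted functions is not
listed in this letter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical allow-list. The entries are illustrative only; they are
 * not the set of functions exposed by the actual sandbox.
 */
static const char *const sandbox_allowed[] = {
	"math.floor",
	"string.sub",
	"tostring",
};

/*
 * Return 1 if a name may be resolved inside the sandbox, 0 otherwise.
 * Anything absent from the allow-list (global state, data access) is
 * simply invisible to the sandboxed function.
 */
static int
sandbox_resolve(const char *name)
{
	assert(name != NULL);
	size_t count = sizeof(sandbox_allowed) / sizeof(sandbox_allowed[0]);
	for (size_t i = 0; i < count; i++) {
		if (strcmp(sandbox_allowed[i], name) == 0)
			return 1;
	}
	return 0;
}
```

With this model, sandbox_resolve("math.floor") succeeds, while a name such as
"box.space" is rejected because it is not on the list.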

Technically, we could already use C functions from the _func space in a
functional index, but this contradicts QAS2: the ability to always boot from a
user snapshot. A C dynamic library may depend on some system libraries or on
architecture-specific features, so we cannot allow it in a functional index.

It should be mentioned that the _index space is loaded before the _func space
during recovery. Therefore, the completion of the initialization of the
functional_handle structure, which represents a functional index in the tuple
format, is delayed.
There was an alternative: introduce a new _func space with a corresponding
number, but this solution has huge architectural problems. The legacy system
space can't be dropped, which confuses the code and the environment. Both
spaces must have the same name, which conflicts with the space_by_name cache
concept and triggers many asserts.
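
The delayed initialization described above can be pictured as a two-phase
binding. The following is a minimal C sketch under stated assumptions:
struct func_entry, struct functional_handle_sketch, handle_create, and
handle_resolve are hypothetical stand-ins and do not reproduce the
structures or functions introduced by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a loaded function definition. */
struct func_entry {
	uint32_t fid;
	const char *name;
};

/* Hypothetical stand-in for the handle kept in the tuple format. */
struct functional_handle_sketch {
	/* Known as soon as the index definition is read. */
	uint32_t fid;
	/* NULL until the function definitions have been loaded. */
	const struct func_entry *func;
};

/* Phase 1: index definitions are recovered; only the id is stored. */
static void
handle_create(struct functional_handle_sketch *handle, uint32_t fid)
{
	assert(handle != NULL);
	handle->fid = fid;
	handle->func = NULL;
}

/*
 * Phase 2: function definitions are available, so the handle can be
 * completed. Returns 0 on success, -1 if no function has this id.
 */
static int
handle_resolve(struct functional_handle_sketch *handle,
	       const struct func_entry *funcs, size_t count)
{
	assert(handle != NULL);
	for (size_t i = 0; i < count; i++) {
		if (funcs[i].fid == handle->fid) {
			handle->func = &funcs[i];
			return 0;
		}
	}
	return -1;
}
```

The point of the split is that phase 1 never needs the function itself, so
the load order of the two system spaces does not have to change.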

Changes in version 2:
    1. tuple validation against the function is performed when a tuple is
       validated against a tuple_format, not when it is inserted into a memtx
       index, i.e. tuple_format "knows" about the functions its tuples are
       supposed to be compatible with;
    2. tuple_field_by_part transparently handles functional index parts by
       getting the function value from the hash or computing it if there's
       no match;
    3. all existing hints and multikey machinery are reused;
    4. the patch is based on the new reworked uniform functions.
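
The "get from the hash or compute on a miss" behaviour mentioned in item 2
can be sketched as below. This is an illustrative C model only: struct
key_cache, cached_key, extractor_f, and the use of a plain integer in place
of a tuple are assumptions made for the example, and the fixed-size
direct-mapped table is a simplification of whatever hash the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CACHE_SIZE = 8 };

/* Hypothetical extractor: maps a tuple (here just an integer) to a key. */
typedef int64_t (*extractor_f)(int64_t tuple_value);

struct key_cache_slot {
	int used;
	int64_t tuple_value;
	int64_t key;
};

struct key_cache {
	struct key_cache_slot slots[CACHE_SIZE];
	/* How many times the extractor was actually invoked. */
	unsigned computed;
};

/*
 * Return the key for tuple_value: take it from the cache on a hit,
 * otherwise call the extractor and remember the result, evicting
 * whatever occupied the slot before.
 */
static int64_t
cached_key(struct key_cache *cache, extractor_f extract, int64_t tuple_value)
{
	assert(cache != NULL && extract != NULL);
	size_t idx = (size_t)((uint64_t)tuple_value % CACHE_SIZE);
	struct key_cache_slot *slot = &cache->slots[idx];
	if (slot->used && slot->tuple_value == tuple_value)
		return slot->key;
	slot->used = 1;
	slot->tuple_value = tuple_value;
	slot->key = extract(tuple_value);
	cache->computed++;
	return slot->key;
}

/* Example extractor used only for demonstration. */
static int64_t
double_it(int64_t value)
{
	return value * 2;
}
```

Calling cached_key twice with the same value invokes double_it only once;
a colliding value evicts the slot, so the next lookup recomputes the key.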

v1: https://www.freelists.org/post/tarantool-patches/PATCH-v1-08-box-functional-indexes

http://github.com/tarantool/tarantool/tree/kshch/gh-1260-functional-index-new
https://github.com/tarantool/tarantool/issues/1260

Kirill Shcherbatov (5):
  box: introduce tuple_extra infrastructure
  box: introduce key_def->is_multikey flag
  box: refactor key_validate_parts to return key_end
  box: move the function hash to the tuple library
  box: introduce functional indexes in memtx

 src/box/CMakeLists.txt          |   4 +-
 src/box/alter.cc                |  23 +-
 src/box/box.cc                  |   3 +-
 src/box/errcode.h               |   1 +
 src/box/func.c                  |   2 +
 src/box/func.h                  |  15 +
 src/box/func_cache.c            | 123 ++++++
 src/box/func_cache.h            |  78 ++++
 src/box/functional_key.c        | 271 +++++++++++++
 src/box/functional_key.h        |  87 ++++
 src/box/index.cc                |  10 +-
 src/box/index.h                 |   3 +-
 src/box/index_def.c             |  55 ++-
 src/box/index_def.h             |  20 +
 src/box/key_def.c               |  72 +++-
 src/box/key_def.h               |  49 ++-
 src/box/lua/key_def.c           |   7 +-
 src/box/lua/schema.lua          |   2 +
 src/box/lua/space.cc            |  14 +
 src/box/memtx_bitset.c          |   5 +-
 src/box/memtx_engine.c          |  82 +++-
 src/box/memtx_rtree.c           |   5 +-
 src/box/memtx_space.c           |  34 +-
 src/box/memtx_tree.c            |   4 +-
 src/box/opt_def.c               |  11 +-
 src/box/opt_def.h               |  20 +-
 src/box/schema.cc               |  73 +---
 src/box/schema.h                |  19 +-
 src/box/space.c                 |   4 +-
 src/box/sql.c                   |   2 +-
 src/box/sql/build.c             |   2 +-
 src/box/sql/select.c            |   2 +-
 src/box/sql/where.c             |   2 +-
 src/box/sysview.c               |   4 +-
 src/box/tuple.c                 |  82 +++-
 src/box/tuple.h                 | 127 +++++-
 src/box/tuple_bloom.c           |   4 +-
 src/box/tuple_compare.cc        | 174 +++++---
 src/box/tuple_extract_key.cc    |  23 +-
 src/box/tuple_format.c          |  71 +++-
 src/box/tuple_format.h          |  51 +++
 src/box/tuple_hash.cc           |  14 +-
 src/box/vinyl.c                 |  19 +-
 src/box/vy_stmt.c               |   3 +
 src/box/vy_stmt.h               |   6 +-
 test/box/bitset.result          |  24 ++
 test/box/bitset.test.lua        |   9 +
 test/box/hash.result            |  24 ++
 test/box/hash.test.lua          |   9 +
 test/box/misc.result            |   1 +
 test/box/rtree_misc.result      |  24 ++
 test/box/rtree_misc.test.lua    |   9 +
 test/engine/engine.cfg          |   5 +-
 test/engine/functional.result   | 689 ++++++++++++++++++++++++++++++++
 test/engine/functional.test.lua | 241 +++++++++++
 test/unit/luaT_tuple_new.c      |   2 +-
 test/unit/merger.test.c         |   6 +-
 test/unit/tuple_bigref.c        |   2 +-
 test/unit/vy_iterators_helper.c |   2 +-
 test/vinyl/misc.result          |  23 ++
 test/vinyl/misc.test.lua        |   9 +
 61 files changed, 2492 insertions(+), 269 deletions(-)
 create mode 100644 src/box/func_cache.c
 create mode 100644 src/box/func_cache.h
 create mode 100644 src/box/functional_key.c
 create mode 100644 src/box/functional_key.h
 create mode 100644 test/engine/functional.result
 create mode 100644 test/engine/functional.test.lua

-- 
2.21.0



