[PATCH v1 0/8] box: functional indexes
Vladimir Davydov
vdavydov.dev at gmail.com
Mon Jun 3 18:55:33 MSK 2019
I won't be commenting on the code this time, because I have a couple of
high-level concerns that might require you to rewrite the whole patch
set from scratch, provided you agree with my suggestions, of course.
1. The code lacks a layer of abstraction that would provide an easy way
to call any persistent function, irrespective of the language it is
written in. Even though we have only two languages, C and Lua, for
now, we don't have a unified infrastructure that would encapsulate a
function call:
- box_c_call => func_call is used for calling C functions via
iproto.
- box_lua_call => execute_lua_call / execute_persistent_function
is used for calling Lua functions via iproto.
- tuple_fextract_key => execute_tuple_fextract_key is used for
calling persistent Lua functions for indexing data.
This is a mess. I doubt we will be able to add persistent SQL
procedures with such a design. We probably need to make func_call()
handle all kinds of functions, irrespective of language, both
persistent and not (make it virtual probably).
2. I don't like the way you made functional indexes so special to memtx
tree implementation, i.e.
- introduced a separate index vtab for memtx_tree
- made func a member of struct memtx_tree rather than key_def
- added tuple_fextract_key instead of using tuple_extract_key
- wrote special comparators for functional indexes
- attached (tuple => func value) hash to memtx_tree rather than
making it global
This makes it nearly impossible to support functional indexes in
Vinyl without copy-and-pasting a lot of code.
I'm inclined to think that the functional index layer should
be completely hidden behind key_def and its methods and thus
independent of engine internals, i.e.
- func should be a member of key_def
- hash (tuple+func => func value) should be global (or we should
have a separate hash per each function); nodes corresponding to
a particular tuple should be freed when the tuple is deleted
- tuple validation against func should be done when a tuple is
validated against a tuple_format, not when it is inserted into
a memtx index, i.e. tuple_format should "know" about functions
its tuples are supposed to be compatible with; we can cache
function values computed during tuple validation in the hash so
that we don't need to recompute them when we insert the tuple
into an index
- tuple_field_by_part should transparently handle functional index
parts by getting the function value from the hash or computing it
if there's no match
- tree hints should store comparison hints in case the index
function returns a single value; if the function returns an
array, tree hints should store multikey offsets, just like for
non-functional indexes; function value should be obtained from
the hash with the aid of tuple_field_by_part
By implementing this generic infrastructure, we'll get functional
index support for all kinds of indexes for free (vinyl will require
some extra work, but it'll be minimal). Although multikey functional
indexes will perform slower due to hash lookup on each comparison,
single value functional indexes will be faster thanks to comparison
hints. I think it's okay. We will also be able to disable function
value caching completely (or almost completely) to save some memory
if we want to (we might add a configuration option for this in
future).
More information about the Tarantool-patches
mailing list