[PATCH v1 0/8] box: functional indexes

Vladimir Davydov vdavydov.dev at gmail.com
Mon Jun 3 18:55:33 MSK 2019


I won't be commenting on the code this time, because I have a couple of
high-level concerns that might require you to rewrite the whole patch
set from scratch, provided you agree with my suggestions, of course.

 1. The code lacks a layer of abstraction that would provide an easy way
    to call any persistent function, irrespective of the language it is
    written in. Even though we have only two languages, C and Lua, for
    now, we don't have a unified infrastructure that would encapsulate a
    function call:

     - box_c_call => func_call is used for calling C functions via
       iproto.
     - box_lua_call => execute_lua_call / execute_persistent_function
       is used for calling Lua functions via iproto.
     - tuple_fextract_key => execute_tuple_fextract_key is used for
       calling persistent Lua functions for indexing data. 

    This is a mess. I doubt we will be able to add persistent SQL
    procedures with such a design. We probably need to make func_call()
    handle all kinds of functions, irrespective of language, both
    persistent and not (make it virtual probably).

 2. I don't like the way you made functional indexes so special to memtx
    tree implementation, i.e.

     - introduced a separate index vtab for memtx_tree
     - made func a member of struct memtx_tree rather than key_def
     - added tuple_fextract_key instead of using tuple_extract_key
     - wrote special comparators for functional indexes
     - attached (tuple => func value) hash to memtx_tree rather than
       making it global

    This makes it nearly impossible to support functional indexes in
    Vinyl without copy-and-pasting a lot of code.

    I'm inclined to think that the functional index layer should
    be completely hidden behind key_def and its methods and thus
    independent of engine internals, i.e.

     - func should be a member of key_def
     - hash (tuple+func => func value) should be global (or we should
       have a separate hash per each function); nodes corresponding to
       a particular tuple should be freed when the tuple is deleted
     - tuple validation against func should be done when a tuple is
       validated against a tuple_format, not when it is inserted into
       a memtx index, i.e. tuple_format should "know" about functions
       its tuples are supposed to be compatible with; we can cache
       function values computed during tuple validation in the hash so
       that we don't need to recompute them when we insert the tuple
       into an index
     - tuple_field_by_part should transparently handle functional index
       parts by getting the function value from the hash or computing it
       if there's no match
     - tree hints should store comparison hints in case the index
       function returns a single value; if the function returns an
       array, tree hints should store multikey offsets, just like for
       non-functional indexes; function value should be obtained from
       the hash with the aid of tuple_field_by_part

    By implementing this generic infrastructure, we'll get functional
    index support for all kinds of indexes for free (vinyl will require
    some extra work, but it'll be minimal). Although multikey functional
    indexes will perform slower due to hash lookup on each comparison,
    single value functional indexes will be faster thanks to comparison
    hints. I think it's okay. We will also be able to disable function
    value caching completely (or almost completely) to save some memory
    if we want to (we might add a configuration option for this in
    future).



More information about the Tarantool-patches mailing list