[Tarantool-patches] [PATCH v5 1/1] box: introduce _session_settings system space

Thu Dec 12 02:47:54 MSK 2019

Hi! Thanks for the patch!

Overall it is getting better and cooler each time!

See 10 comments below!

On 09/12/2019 14:23, imeevma at tarantool.org wrote:
> This patch creates the _session_settings space. This space
> contains names and values of the session settings. User can use
> this system space to view or change session settings.
> 
> Part of #4511
> 
> @TarantoolBot document
> Title: _session_settings system space
> The _session_settings system space used to view or change session
> settings.
> 
> This space uses a new engine. This allows us to create tuples on
> the fly when the get() or select() methods are called. This
> engine does not support the insert(), replace(), and delete()
> methods. The only way to change the setting value is update(),
> which can only be used with the "=" operation.
> 
> Because space creates a tuple on the fly, it allows us to get a
> tuple without saving it anywhere. But this means that every time
> we get a tuple from this system space, it is a new tuple, even if
> they look the same:
> 
> tarantool> s = box.space._session_settings
> tarantool> name = 'sql_default_engine'
> tarantool> s:get({name}) == s:get({name})
> ---
> - false
> ...
> 
> Currently, this space contains only SQL settings, since the only
> session settings are SQL settings.
> 
> List of currently available session settings:
> 
> sql_default_engine
> sql_defer_foreign_keys
> sql_full_column_names
> sql_recursive_triggers
> sql_reverse_unordered_selects
> 
> Debug build also have debug settings that could be obtained from
> this sysview:
> 
> sql_parser_trace
> sql_select_trace
> sql_trace
> sql_vdbe_addoptrace
> sql_vdbe_debug
> sql_vdbe_eqp
> sql_vdbe_listing
> sql_vdbe_trace
> sql_where_trace
> 
> Example of usage:
> tarantool> s = box.space._session_settings
> -- View session settings values.
> tarantool> s:get({'sql_default_engine'})
> ---
> - ['sql_default_engine', 'memtx']
> ...
> 
> tarantool> s:select()
> ---
> - - ['sql_default_engine', 'memtx']
>   - ['sql_defer_foreign_keys', false]
>   - ['sql_full_column_names', false]
>   - ['sql_recursive_triggers', true]
>   - ['sql_reverse_unordered_selects', false]
> ...
> 
> tarantool> s:select('sql_g', {iterator='LE'})
> ---
> - - ['sql_full_column_names', false]
>   - ['sql_defer_foreign_keys', false]
>   - ['sql_default_engine', 'memtx']
> ...
> 
> -- Change session setting value.
> tarantool> s:update('sql_default_engine', {{'=', value, 'vinyl'}})

1. Looks like the value should be in quotes, if that is a field
name of the _session_settings space.

> ---
> - ['sql_default_engine', 'vinyl']
> ...
> ---
> https://github.com/tarantool/tarantool/issues/4511
> https://github.com/tarantool/tarantool/tree/imeevma/gh-4511-new-engine
> 
> diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
> index 5cd5cba8..0f94a5b 100644
> --- a/src/box/CMakeLists.txt
> +++ b/src/box/CMakeLists.txt
> @@ -78,6 +78,7 @@ add_library(box STATIC
>      memtx_space.c
>      sysview.c
>      blackhole.c
> +    loophole.c

2. I propose to name this engine 'virtual_engine', and name
the files 'virtual_engine.c/.h'.

>      vinyl.c
>      vy_stmt.c
>      vy_mem.c
> 
> diff --git a/src/box/loophole.c b/src/box/loophole.c
> new file mode 100644
> index 0000000..775f5a1
> --- /dev/null
> +++ b/src/box/loophole.c
> @@ -0,0 +1,431 @@
> +#include "loophole.h"
> +#include "session.h"
> +#include "schema.h"
> +#include "engine.h"
> +#include "tuple.h"
> +#include "xrow.h"
> +
> +struct loophole_index {
> +	struct index base;
> +	struct tuple_format *format;

3. Please, write some comments. Especially why do you
store format in the index.

> +};
> +
> +struct loophole_iterator {
> +	/** Base iterator. Must be the first member. */
> +	struct iterator base;
> +	/** Format of the tuples this iterator returns. */
> +	struct tuple_format *format;
> +	/** ID of the option in global list of the options. */
> +	int current_id;
> +	/** Decoded key. */
> +	char *key;
> +	/** Type of iterator. */
> +	enum iterator_type iterator_type;
> +};

4. I don't like that the whole API below is hardcoded to use settings.
Engine API should not be like that. For example, loophole iterator
explicitly uses 'settings' API. loophole space, index too.

Virtual engine should be generic. You already have space vtab, so
you can keep all these methods, but rename them to
settings_space_*(), create a separate vtab for that space, and in
virtual_engine_create_space() you check space ID only once. If that
is ID of settings space, you use the settings vtab.

> +
> +static void
> +loophole_iterator_free(struct iterator *ptr)
> +{
> +	struct loophole_iterator *it = (struct loophole_iterator *)ptr;
> +	free(it->key);
> +	free(it);
> +}
> +
> +static int
> +loophole_iterator_next(struct iterator *iterator, struct tuple **ret)
> +{
> +	struct loophole_iterator *it = (struct loophole_iterator *)iterator;
> +	it->current_id = session_setting_id_by_name(it->key, it->current_id,
> +						    it->iterator_type);
> +	int id = it->current_id;
> +	it->current_id += iterator_type_is_reverse(it->iterator_type) ? -1 : 1;
> +	return session_setting_tuple(it->format, id, ret);
> +}
> +
> +static void
> +loophole_index_destroy(struct index *index)
> +{
> +	free_session_settings();
> +	free(index);
> +}
> +
> +static struct iterator *
> +loophole_index_create_iterator(struct index *base, enum iterator_type type,
> +			       const char *key, uint32_t part_count)
> +{
> +	struct loophole_index *index = (struct loophole_index *)base;
> +	char *decoded_key = NULL;
> +	if (part_count > 0) {
> +		assert(part_count == 1);
> +		assert(mp_typeof(*key) == MP_STR);
> +		uint32_t len;
> +		const char *name = mp_decode_str(&key, &len);
> +		decoded_key = (char *)malloc(len + 1);
> +		if (decoded_key == NULL) {
> +			diag_set(OutOfMemory, len + 1, "malloc", "decoded_key");
> +			return NULL;
> +		}
> +		memcpy(decoded_key, name, len);
> +		decoded_key[len] = '\0';
> +	}
> +	struct loophole_iterator *it =
> +		(struct loophole_iterator *)malloc(sizeof(*it));
> +	if (it == NULL) {
> +		diag_set(OutOfMemory, sizeof(*it), "malloc", "it");
> +		free(decoded_key);
> +		return NULL;
> +	}
> +	iterator_create(&it->base, base);
> +	it->base.next = loophole_iterator_next;
> +	it->base.free = loophole_iterator_free;
> +	it->key = decoded_key;
> +	it->iterator_type = type;
> +	it->format = index->format;
> +	it->current_id = iterator_type_is_reverse(type) ?
> +			 session_get_settings_max_id() - 1 : 0;
> +	return (struct iterator *)it;
> +	return NULL;
> +}
> +
> +static int
> +loophole_index_get(struct index *base, const char *key,
> +		   uint32_t part_count, struct tuple **result)
> +{
> +	struct loophole_index *index = (struct loophole_index *)base;
> +	(void)base;

5. Why is 'base' void? It is used. Not only in debug.

> +	assert(part_count == 1);
> +	(void)part_count;
> +	uint32_t len;
> +	const char *tmp = mp_decode_str(&key, &len);
> +	const char *decoded_key = tt_cstr(tmp, len);
> +	int id = session_setting_id_by_name(decoded_key, 0, ITER_EQ);
> +	return session_setting_tuple(index->format, id, result);
> +}
> +
> +static const struct index_vtab loophole_index_vtab = {
> +	/* .destroy = */ loophole_index_destroy,
> +	/* .commit_create = */ generic_index_commit_create,
> +	/* .abort_create = */ generic_index_abort_create,
> +	/* .commit_modify = */ generic_index_commit_modify,
> +	/* .commit_drop = */ generic_index_commit_drop,
> +	/* .update_def = */ generic_index_update_def,
> +	/* .depends_on_pk = */ generic_index_depends_on_pk,
> +	/* .def_change_requires_rebuild = */
> +		generic_index_def_change_requires_rebuild,
> +	/* .size = */ generic_index_size,
> +	/* .bsize = */ generic_index_bsize,
> +	/* .min = */ generic_index_min,
> +	/* .max = */ generic_index_max,
> +	/* .random = */ generic_index_random,
> +	/* .count = */ generic_index_count,
> +	/* .get = */ loophole_index_get,
> +	/* .replace = */ generic_index_replace,
> +	/* .create_iterator = */ loophole_index_create_iterator,
> +	/* .create_snapshot_iterator = */
> +		generic_index_create_snapshot_iterator,
> +	/* .stat = */ generic_index_stat,
> +	/* .compact = */ generic_index_compact,
> +	/* .reset_stat = */ generic_index_reset_stat,
> +	/* .begin_build = */ generic_index_begin_build,
> +	/* .reserve = */ generic_index_reserve,
> +	/* .build_next = */ generic_index_build_next,
> +	/* .end_build = */ generic_index_end_build,
> +};
> +> +static void
> +loophole_space_destroy(struct space *space)
> +{
> +	free(space);
> +}
> +
> +static int
> +loophole_space_execute_replace(struct space *space, struct txn *txn,
> +			       struct request *request, struct tuple **result)
> +{
> +	(void)space;
> +	(void)txn;
> +	(void)request;
> +	(void)result;
> +	diag_set(ClientError, ER_UNSUPPORTED, "Loophole", "replace()");
> +	return -1;
> +}
> +
> +static int
> +loophole_space_execute_delete(struct space *space, struct txn *txn,
> +			      struct request *request, struct tuple **result)
> +{
> +	(void)space;
> +	(void)txn;
> +	(void)request;
> +	(void)result;
> +	diag_set(ClientError, ER_UNSUPPORTED, "Loophole", "delete()");
> +	return -1;
> +}
> +
> +static int
> +loophole_space_execute_update(struct space *space, struct txn *txn,
> +			      struct request *request, struct tuple **result)
> +{
> +	(void)txn;

6. That is actually something arguable.

tarantool> box.space._session_settings:get({'sql_default_engine'})
---
- ['sql_default_engine', 'memtx']
...

tarantool> box.begin()
---
...

tarantool> box.space._session_settings:update({'sql_default_engine'}, {{'=', 'value', 'vinyl'}})
---
- ['sql_default_engine', 'vinyl']
...

tarantool> box.rollback()
---
...

tarantool> box.space._session_settings:get({'sql_default_engine'})
---
- ['sql_default_engine', 'vinyl']
...

This is definitely wrong. And I don't know what to do with that
now. Except that it should be documented at least.

> +	uint32_t len;
> +	const char *tmp;
> +	const char *data;

7. Lets avoid C89 declaration style.

> +
> +	data = request->key;
> +	uint32_t key_len = mp_decode_array(&data);
> +	if (key_len == 0) {
> +		diag_set(ClientError, ER_EXACT_MATCH, 1, 0);
> +		return -1;
> +	}
> +	if (key_len > 1 || mp_typeof(*data) != MP_STR) {
> +		diag_set(ClientError, ER_KEY_PART_TYPE, 0, "string");
> +		return -1;
> +	}
> +	tmp = mp_decode_str(&data, &len);
> +	const char *decoded_key = tt_cstr(tmp, len);
> +	int id = session_setting_id_by_name(decoded_key, 0, ITER_EQ);
> +	if (id == session_get_settings_max_id()) {
> +		*result = NULL;
> +		return 0;
> +	}
> +
> +	data = request->tuple;
> +	assert(mp_typeof(*data) == MP_ARRAY);
> +	uint32_t update_counter = mp_decode_array(&data);
> +	if (mp_typeof(*data) != MP_ARRAY) {
> +		diag_set(ClientError, ER_ILLEGAL_PARAMS, "update operation "
> +			 "must be an array {op,..}");
> +		return -1;
> +	}
> +	if (update_counter > 1) {
> +		diag_set(ClientError, ER_UNSUPPORTED, "Loophole", "update of "
> +			 "more than one field");
> +		return -1;
> +	}
> +	if (mp_decode_array(&data) != 3) {
> +		diag_set(ClientError, ER_UNKNOWN_UPDATE_OP);
> +		return -1;
> +	}
> +	if (mp_typeof(*data) != MP_STR) {
> +		diag_set(ClientError, ER_ILLEGAL_PARAMS, "update operation "
> +			 "name must be a string");
> +		return -1;
> +	}
> +	tmp = mp_decode_str(&data, &len);
> +	if (tmp[0] != '=' || len != 1) {
> +		diag_set(ClientError, ER_UNSUPPORTED, "Loophole", "not '='");
> +		return -1;
> +	}
> +	if (mp_typeof(*data) == MP_UINT) {
> +		uint32_t id = mp_decode_uint(&data);
> +		if (id - request->index_base == 0) {
> +			diag_set(ClientError, ER_CANT_UPDATE_PRIMARY_KEY,
> +				 space_index(space, 0)->def->name,
> +				 space_name(space));
> +			return -1;
> +		}
> +		if (id - request->index_base != 1) {
> +			diag_set(ClientError, ER_EXACT_FIELD_COUNT, id,
> +				 1 + request->index_base);
> +			return -1;
> +		}
> +	} else if (mp_typeof(*data) == MP_STR) {
> +		tmp = mp_decode_str(&data, &len);
> +		const char *field_name = tt_cstr(tmp, len);
> +		if (strcmp(field_name, "name") == 0) {
> +			diag_set(ClientError, ER_CANT_UPDATE_PRIMARY_KEY,
> +				 space_index(space, 0)->def->name,
> +				 space_name(space));
> +			return -1;
> +		}
> +		if (strcmp(field_name, "value") != 0) {
> +			diag_set(ClientError, ER_NO_SUCH_FIELD_NAME,
> +				 field_name);
> +			return -1;
> +		}
> +	} else {
> +		diag_set(ClientError, ER_ILLEGAL_PARAMS,
> +			 "field id must be a number or a string");
> +		return -1;
> +	}
> +	const char *value = data;
> +
> +	if (session_setting_new_value(id, value) != 0)
> +		return -1;
> +	return session_setting_tuple(space->format, id, result);

8. Well, most part of that function could be replaced by:

    struct tuple *old_tuple;
    uint32_t new_size = 0, bsize;
    struct tuple_format *format = space->format;
    const char *old_data = tuple_data_range(old_tuple, &bsize);
    session_setting_tuple(format, id, old_tuple);
    const char *new_tuple_data = xrow_update_execute(
            request->tuple, request->tuple_end,
            old_data, old_data + bsize, format->dict,
            &new_size, request->index_base, NULL);

And you have the updated tuple in new_tuple_data. Now you
create a new tuple:

    struct tuple *new_tuple = tuple_new(format, new_tuple_data, new_size);

>From that tuple you get the new value. And you don't need to
implement the update by yourself. That also makes possible to
use any update operations.

Please, patch schema_def.h with definition of this new space
fields. So as you could obtain the new value by:

    const char *new_value = tuple_field(
            new_tuple, BOX_SESSION_SETTINGS_FIELD_VALUE);

> +}
> +
> +static int
> +loophole_space_execute_upsert(struct space *space, struct txn *txn,
> +			      struct request *request)
> +{
> +	(void)space;
> +	(void)txn;
> +	(void)request;
> +	diag_set(ClientError, ER_UNSUPPORTED, "Loophole", "upsert()");
> +	return -1;
> +}
> +
> +static struct index *
> +loophole_space_create_index(struct space *space, struct index_def *def)
> +{
> +	if (space->def->id != BOX_SESSION_SETTINGS_ID) {
> +		diag_set(ClientError, ER_UNSUPPORTED, "Loophole",
> +			 "create_index()");
> +		return NULL;
> +	}
> +	new_session_settings();

9. That is something I really don't like. See more in the comments
to session setting API.

> diff --git a/src/box/session.cc b/src/box/session.cc
> index 461d1cf..d72d917 100644
> --- a/src/box/session.cc
> +++ b/src/box/session.cc
> @@ -36,6 +36,7 @@
>  #include "user.h"
>  #include "error.h"
>  #include "tt_static.h"
> +#include "sql.h"

10. This is where my idea was taken wrong. I am trying to
remove explicit session's dependency on SQL. This is possible
in at least two ways:

- Way 1. All the settings are defined in one namespace.
  In one enum. That enum is defined in session.h.
  Otherwise, when you will add more settings from other
  subsystems, you will get a collision.
  In session.c you have an array of triplets
  {enum setting, get_function, set_function}. The functions
  are all NULL from the beginning. Subsystems, when they
  initialize, fill their part in that array (sql_init()
  will fill get/set_function values of its SQL settings).
  That way may be bad, because session will need to know
  names of all settings of all subsystems.

- Way 2. Make session setting consist of 2 parts:
  subsystem ID and setting ID. Then you define subsystem
  IDs in session.h, and settings of each subsystem in
  any other file. In build.c, in sql.c, whatever. For
  example, you store this in session.h:

      enum session_setting_module {
              SESSION_SETTING_BOX,
              SESSION_SETTING_SQL,
              ...
      };

  And keep your current SQL settings in build.c.

  In session.c you have an array with triplets
  {enum setting_module, set_function, get_function}.
  Get/set functions are filled exactly like in the first
  way - during module initializations.

We actually already do something similar with
session_vtab_registry global array.