[tarantool-patches] Re: [PATCH v4 1/4] schema: add new system space for CHECK constraints

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun May 19 19:01:57 MSK 2019


Hi! Thanks for the patchset!

Great! See 17 comments below!

1. Nikita in the email 13.05 02:04 asked you to create
an issue "sql: add handle to disable constraints". Where
is it? - I can't find.

On 16/05/2019 16:56, Kirill Shcherbatov wrote:
> This patch introduces a new system space to persist check
> constraints. Format of the space:
> 
> _ck_constraint (space id = 357)
> [<constraint name> STR, <space id> UINT,
>  <is_deferred>BOOL, <code>STR, <language>STR]

2. As I understand, now we want to allow stored functions in Lua, C,
and in future in SQL. But how checks are different? Why there is no
an option like 'language' or something? Lua and C checks are going to
be much faster than SQL ones. Also I see, that Kostja asked you to add
a language at 14.05 23:09 email, but you just ignored it. Why?

> 
> CK constraint is local to space, so every pair <CK name, space id>
> is unique (and it is PK in _ck_constraint space).

3. Please, change the space format to [space_id, ck_name, ...] and the
primary index parts to [space_id, ck_name]. Otherwise you can't
lookup all checks of one space having space_id. On the other hand lookup
by name without a space id is useless. This is why _index system space
has such a format and a pk.

> 
> After insertion into this space, a new instance describing check
> constraint is created. Check constraint holds AST of expression
> representing it.
> While space features check constraints, it isn't allowed to
> be dropped. The :drop() space method firstly deletes all check
> constraints and then removes entry from _space.
> 
> Because space alter, index alter and space truncate operations
> cause space recreation, introduced RebuildCkConstrains object
> that compiles new ck constraint objects, replaces and removes
> existent instances atomically(when some compilation fails,
> nothing is changed). In fact, in scope of this patch we don't
> really need to recreate ck_constraint object in such situations
> (patch space_def pointer in AST like we did it before is enough
> now, but we are going to recompile VDBE that represents ck
> constraint in future that can fail).
> 
> The main motivation for these changes is the ability to support
> ADD CHECK CONSTRAINT operation in the future. CK constraints are
> are easier to manage as self-sustained objects: such change is

4. Double 'are'.

> managed with atomic insertion(unlike the current architecture).
> 
> Disabled xfer optimization when some space have ck constraints
> because in the following patches this xfer optimisation becomes
> impossible. No reason to rewrite this code.
> 
> Needed for #3691
> ---
> diff --git a/src/box/alter.cc b/src/box/alter.cc
> index 9279426d2..e65c49d14 100644
> --- a/src/box/alter.cc
> +++ b/src/box/alter.cc
> @@ -1380,6 +1370,71 @@ UpdateSchemaVersion::alter(struct alter_space *alter)
>      ++schema_version;
>  }
>  
> +/**
> + * As ck_constraint object depends on space_def we must rebuild
> + * all ck constraints on space alter.
> + *
> + * To perform it transactionally, we create a list of a new ck

5. 'a' article is never used with plural.

> + * constraints objects in ::prepare method that is fault-tolerant.
> + * Finally in ::alter or ::rollback methods we only swap thouse
> + * lists securely.
> + */
> +class RebuildCkConstraints: public AlterSpaceOp
> +{
> +public:
> +	RebuildCkConstraints(struct alter_space *alter) : AlterSpaceOp(alter),
> +		ck_constraint(RLIST_HEAD_INITIALIZER(ck_constraint)) {}
> +	struct rlist ck_constraint;
> +	virtual void prepare(struct alter_space *alter);
> +	virtual void alter(struct alter_space *alter);
> +	virtual void rollback(struct alter_space *alter);
> +	virtual ~RebuildCkConstraints();
> +};
> @@ -4035,6 +4098,176 @@ on_replace_dd_fk_constraint(struct trigger * /* trigger*/, void *event)
>  	}
>  }
>  
> +/** Create an instance of check constraint definition by tuple. */
> +static struct ck_constraint_def *
> +ck_constraint_def_new_from_tuple(struct tuple *tuple)
> +{
> +	uint32_t name_len;
> +	const char *name =
> +		tuple_field_str_xc(tuple, BOX_CK_CONSTRAINT_FIELD_NAME,
> +				   &name_len);
> +	if (name_len > BOX_NAME_MAX) {
> +		tnt_raise(ClientError, ER_CREATE_CK_CONSTRAINT,
> +			  tt_cstr(name, BOX_INVALID_NAME_MAX),
> +				  "check constraint name is too long");
> +	}
> +	identifier_check_xc(name, name_len);
> +	const char *language_str =
> +		tuple_field_cstr_xc(tuple, BOX_CK_CONSTRAINT_FIELD_LANGUAGE);
> +	enum ck_constraint_language language =
> +		STR2ENUM(ck_constraint_language, language_str);
> +	if (language == ck_constraint_language_MAX) {
> +		tnt_raise(ClientError, ER_FUNCTION_LANGUAGE, language_str,
> +			  tt_cstr(name, name_len));
> +	}

6. I do not see a language on the branch. Either the mail, or the
branch, are outdated.

> +	uint32_t expr_str_len;
> +	const char *expr_str =
> +		tuple_field_str_xc(tuple, BOX_CK_CONSTRAINT_FIELD_CODE,
> +				   &expr_str_len);
> +	uint32_t name_offset, expr_str_offset;
> +	uint32_t ck_def_sz =
> +		ck_constraint_def_sizeof(name_len, expr_str_len, &name_offset,
> +					 &expr_str_offset);
> +	struct ck_constraint_def *ck_def =
> +		(struct ck_constraint_def *)malloc(ck_def_sz);
> +	if (ck_def == NULL)
> +		tnt_raise(OutOfMemory, ck_def_sz, "malloc", "ck_def");
> +
> +	ck_def->name = (char *)ck_def + name_offset;
> +	ck_def->expr_str = (char *)ck_def + expr_str_offset;
> +	ck_def->language = language;
> +	memcpy(ck_def->expr_str, expr_str, expr_str_len);
> +	ck_def->expr_str[expr_str_len] = '\0';
> +	memcpy(ck_def->name, name, name_len);
> +	ck_def->name[name_len] = '\0';
> +
> +	return ck_def;
> +}
> +
> +/** Trigger invoked on rollback in the _ck_constraint space. */
> +static void
> +on_replace_ck_constraint_rollback(struct trigger *trigger, void *event)
> +{
> +	struct txn_stmt *stmt = txn_last_stmt((struct txn *) event);
> +	struct ck_constraint *ck = (struct ck_constraint *)trigger->data;
> +	struct space *space = NULL;
> +	if (ck != NULL)
> +		space = space_by_id(ck->space_id);
> +	if (stmt->old_tuple != NULL && stmt->new_tuple == NULL) {
> +		/* Rollback DELETE check constraint. */
> +		assert(ck != NULL);
> +		assert(space != NULL);
> +		assert(space_ck_constraint_by_name(space,
> +				ck->def->name, strlen(ck->def->name)) == NULL);
> +		rlist_add_entry(&space->ck_constraint, ck, link);
> +	}  else if (stmt->new_tuple != NULL && stmt->old_tuple == NULL) {

7. Double whitespace prior to 'else'.

> +		/* Rollback INSERT check constraint. */
> +		assert(space != NULL);
> +		assert(space_ck_constraint_by_name(space,
> +				ck->def->name, strlen(ck->def->name)) != NULL);
> +		rlist_del_entry(ck, link);
> +		ck_constraint_delete(ck);
> +	} else {
> +		/* Rollback REPLACE check constraint. */
> +		assert(space != NULL);
> +		const char *name = ck->def->name;
> +		struct ck_constraint *new_ck =
> +			space_ck_constraint_by_name(space, name, strlen(name));
> +		assert(new_ck != NULL);
> +		rlist_del_entry(new_ck, link);
> +		rlist_add_entry(&space->ck_constraint, ck, link);
> +		ck_constraint_delete(new_ck);
> +	}
> +}
> +
> +/** A trigger invoked on replace in the _ck_constraint space. */
> +static void
> +on_replace_dd_ck_constraint(struct trigger * /* trigger*/, void *event)
> +{
> +	struct txn *txn = (struct txn *) event;
> +	txn_check_singlestatement_xc(txn, "Space _ck_constraint");
> +	struct txn_stmt *stmt = txn_current_stmt(txn);
> +	struct tuple *old_tuple = stmt->old_tuple;
> +	struct tuple *new_tuple = stmt->new_tuple;
> +	uint32_t space_id =
> +		tuple_field_u32_xc(old_tuple != NULL ? old_tuple : new_tuple,
> +				   BOX_CK_CONSTRAINT_FIELD_SPACE_ID);
> +	struct space *space = space_cache_find_xc(space_id);
> +	struct trigger *on_rollback =
> +		txn_alter_trigger_new(on_replace_ck_constraint_rollback, NULL);
> +	struct trigger *on_commit =
> +		txn_alter_trigger_new(on_replace_ck_constraint_commit, NULL);
> +
> +	if (new_tuple != NULL) {
> +		bool is_deferred =
> +			tuple_field_bool_xc(new_tuple,
> +					    BOX_CK_CONSTRAINT_FIELD_DEFERRED);
> +		if (is_deferred) {
> +			tnt_raise(ClientError, ER_UNSUPPORTED, "Tarantool",
> +				  "deferred ck constraints");
> +		}
> +		/* Create or replace check constraint. */
> +		struct ck_constraint_def *ck_def =
> +			ck_constraint_def_new_from_tuple(new_tuple);
> +		auto ck_guard = make_scoped_guard([=] { free(ck_def); });
> +		/*
> +		 * FIXME: Ck constraint creation on non-empty
> +		 * space must be implemented as preparatory
> +		 * step for ALTER SPACE ADD CONSTRAINT feature.

8. We already have ADD CONSTRAINT. It works for FK, UNIQUE, PK. The
problem is that we can't call it on a non-empty space.

> +		 */
> +		struct index *pk = space_index(space, 0);
> +		if (pk != NULL && index_size(pk) > 0) {
> +			tnt_raise(ClientError, ER_CREATE_CK_CONSTRAINT,
> +				  ck_def->name,
> +				  "referencing space must be empty");
> +		}
> +		struct ck_constraint *new_ck_constraint =
> +			ck_constraint_new(ck_def, space->def);
> +		if (new_ck_constraint == NULL)
> +			diag_raise();
> +		ck_guard.is_active = false;
> +		const char *name = new_ck_constraint->def->name;
> +		struct ck_constraint *old_ck_constraint =
> +			space_ck_constraint_by_name(space, name, strlen(name));
> +		if (old_ck_constraint != NULL)
> +			rlist_del_entry(old_ck_constraint, link);
> +		rlist_add_entry(&space->ck_constraint, new_ck_constraint, link);
> +		on_commit->data = old_tuple == NULL ? new_ck_constraint :
> +						      old_ck_constraint;
> +		on_rollback->data = on_commit->data;
> +	} else {
> +		assert(new_tuple == NULL && old_tuple != NULL);
> +		/* Drop check constraint. */
> +		uint32_t name_len;
> +		const char *name =
> +			tuple_field_str_xc(old_tuple,
> +					   BOX_CK_CONSTRAINT_FIELD_NAME,
> +					   &name_len);
> +		struct ck_constraint *old_ck_constraint =
> +			space_ck_constraint_by_name(space, name, name_len);
> +		assert(old_ck_constraint != NULL);
> +		rlist_del_entry(old_ck_constraint, link);
> +		on_commit->data = old_ck_constraint;
> +		on_rollback->data = old_ck_constraint;
> +	}
> +
> +	txn_on_rollback(txn, on_rollback);
> +	txn_on_commit(txn, on_commit);
> +}
> diff --git a/src/box/ck_constraint.c b/src/box/ck_constraint.c
> new file mode 100644
> index 000000000..db012837b
> --- /dev/null
> +++ b/src/box/ck_constraint.c
> @@ -0,0 +1,117 @@
> +/*
> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + *    copyright notice, this list of conditions and the
> + *    following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + *    copyright notice, this list of conditions and the following
> + *    disclaimer in the documentation and/or other materials
> + *    provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +#include "box/session.h"
> +#include "ck_constraint.h"
> +#include "errcode.h"
> +#include "sql.h"
> +#include "sql/sqlInt.h"
> +
> +const char *ck_constraint_language_strs[] = {"SQL"};

9. I do not see that on the branch.

> +
> +/**
> + * Resolve space_def references for check constraint via AST
> + * tree traversal.
> + * @param ck_constraint Check constraint object to update.

10. There is no such parameter.

> + * @param space_def Space definition to use.
> + * @retval 0 on success.
> + * @retval -1 on error.
> + */
> +static int
> +ck_constraint_resolve_field_names(struct Expr *expr,
> +				  struct space_def *space_def)
> +{
> +	struct Parse parser;
> +	sql_parser_create(&parser, sql_get(), default_flags);
> +	parser.parse_only = true;
> +	sql_resolve_self_reference(&parser, space_def, NC_IsCheck, expr, NULL);
> +	int rc = parser.is_aborted ? -1 : 0;
> +	sql_parser_destroy(&parser);
> +	return rc;
> +}
> +
> diff --git a/src/box/ck_constraint.h b/src/box/ck_constraint.h
> new file mode 100644
> index 000000000..9c72d5977
> --- /dev/null
> +++ b/src/box/ck_constraint.h
> @@ -0,0 +1,174 @@
> +/**
> + * Find check constraint object in space by given name and
> + * name_len.
> + * @param space The space to lookup check constraint.
> + * @param name The check constraint name.
> + * @param name_len The length of the name.
> + * @retval not NULL Check constrain if exists, NULL otherwise.

11. constrain -> constraint.

> + */
> +struct ck_constraint *
> +space_ck_constraint_by_name(struct space *space, const char *name,
> +			    uint32_t name_len);
> +
> +#if defined(__cplusplus)
> +} /* extern "C" { */
> +#endif
> +
> +#endif /* INCLUDES_BOX_CK_CONSTRAINT_H */
> diff --git a/src/box/lua/upgrade.lua b/src/box/lua/upgrade.lua
> index 89d6e3d52..7f2656eb7 100644
> --- a/src/box/lua/upgrade.lua
> +++ b/src/box/lua/upgrade.lua
> @@ -737,6 +739,46 @@ local function upgrade_to_2_1_3()
>      end
>  end
>  
> +local function upgrade_to_2_2_0()
> +    -- In previous Tarantool releases check constraints were
> +    -- stored in space opts. Now we use separate space
> +    -- _ck_constraint for this purpose. Perform legacy data
> +    -- migration.
> +    local MAP = setmap({})
> +    local _space = box.space[box.schema.SPACE_ID]
> +    local _index = box.space[box.schema.INDEX_ID]
> +    local _ck_constraint = box.space[box.schema.CK_CONSTRAINT_ID]
> +    log.info("create space _ck_constraint")
> +    local format = {{name='name', type='string'},
> +                    {name='space_id', type='unsigned'},
> +                    {name='is_deferred', type='boolean'},
> +                    {name='code', type='str'}, {name='language', type='str'}}
> +    _space:insert{_ck_constraint.id, ADMIN, '_ck_constraint', 'memtx', 0, MAP, format}
> +
> +    log.info("create index primary on _ck_constraint")
> +    _index:insert{_ck_constraint.id, 0, 'primary', 'tree',
> +                  {unique = true}, {{0, 'string'}, {1, 'unsigned'}}}
> +
> +    log.info("create secondary index child_id on _ck_constraint")
> +    _index:insert{_ck_constraint.id, 1, 'space_id', 'tree',
> +                  {unique = false}, {{1, 'unsigned'}}}
> +    for _, space in _space:pairs() do
> +        local flags = space.flags
> +        if flags['checks'] ~= nil then

12. Is there any reason why you do not use field access by '.'?
You could use

    local _space = box.space._space
    local _index = box.space._index
    local _ck_constraint = box.space._ck_constraint
    ...
    if flags.checks then
    ...
    flags.checks = nil

instead of

    local _space = box.space[box.schema.SPACE_ID]
    local _index = box.space[box.schema.INDEX_ID]
    local _ck_constraint = box.space[box.schema.CK_CONSTRAINT_ID]
    ...
    if flags['checks'] ~= nil then
    ...
    flags['checks'] = nil

IMO the former looks simpler.

> +            for i, check in pairs(flags['checks']) do
> +                local expr_str = check.expr
> +                local check_name = check.name or
> +                                   "CK_CONSTRAINT_"..i.."_"..space.name
> +                _ck_constraint:insert({check_name, space.id, false, expr_str, 'SQL'})
> +            end
> +            flags['checks'] = nil
> +            _space:replace(box.tuple.new({space.id, space.owner, space.name,
> +                                          space.engine, space.field_count,
> +                                          flags, space.format}))

13. Why do you create tuple to replace? Anyway it is not inserted as
is, and :replace() creates another tuple internally.

> +        end
> +    end
> +end
> +
> diff --git a/src/box/space_def.h b/src/box/space_def.h
> index c6639a8d8..71356e2a6 100644
> --- a/src/box/space_def.h
> +++ b/src/box/space_def.h
> @@ -71,8 +71,6 @@ struct space_opts {
>  	bool is_view;
>  	/** SQL statement that produced this space. */
>  	char *sql;
> -	/** SQL Checks expressions list. */
> -	struct ExprList *checks;

14. Now you can drop 'struct ExprList;' announcement
from this header.

>  };
>  
> diff --git a/src/box/sql/build.c b/src/box/sql/build.c
> index 6051a2529..f2bbbb9c4 100644
> --- a/src/box/sql/build.c
> +++ b/src/box/sql/build.c
> @@ -638,40 +640,109 @@ primary_key_exit:
>  	return;
>  }
>  
> +/**
> + * Prepare a 0-terminated string in the wptr memory buffer that
> + * does not contain a sequence of more than one whatespace
> + * character. Routine enforces ' ' (space) as whitespace
> + * delimiter.> + * The wptr buffer is expected to have str_len + 1 bytes
> + * (this is the expected scenario where no extra whitespace
> + * characters preset in the source string).
> + * @param wptr The destination memory buffer of size
> + *             @a str_len + 1.
> + * @param str The source string to be copied.
> + * @param str_len The source string @a str length.
> + */
> +static void
> +trim_space_snprintf(char *wptr, const char *str, uint32_t str_len)

15. You can't just drop all the whitespaces, new lines, and tabs from
an SQL expression, and expect its behaviour unchanged.

	tarantool> box.execute("CREATE TABLE test (id INT PRIMARY KEY, a TEXT, CONSTRAINT CK1 CHECK(a != '   '))")
	---
	- row_count: 1
	...

	tarantool> box.execute("INSERT INTO test VALUES (1, '   ')")
	---
	- row_count: 1
	...

	tarantool> box.space._ck_constraint:select{}
	---
	- - ['CK1', 512, false, 'a != '' ''']
	...

I've forbidden to insert '   ', but you trimmed the string, and
changed the expression logic. Please, either remove
trim_space_snprintf(), or make it smarter and do not touch
space symbols inside quotes. I vote for the second option. As Nikita
fairly noticed, otherwise CHECK bodies are unreadable.

> +{
> +	const char *str_end = str + str_len;
> +	bool is_prev_chr_space = false;
> +	while (str < str_end) {
> +		if (isspace((unsigned char)*str)) {
> +			if (!is_prev_chr_space)
> +				*wptr++ = ' ';
> +			is_prev_chr_space = true;
> +			str++;
> +			continue;
> +		}
> +		is_prev_chr_space = false;
> +		*wptr++ = *str++;
> +	}
> +	*wptr = '\0';
> +}
> +
>  void
> -sql_add_check_constraint(struct Parse *parser)
> +sql_create_check_contraint(struct Parse *parser)
>  {
> -	struct create_ck_def *ck_def = &parser->create_ck_def;
> +	struct create_ck_def *create_ck_def = &parser->create_ck_def;
> +	struct ExprSpan *expr_span = create_ck_def->expr;
> +	sql_expr_delete(parser->db, expr_span->pExpr, false);
> +
>  	struct alter_entity_def *alter_def =
>  		(struct alter_entity_def *) &parser->create_ck_def;

16. Here you can cast 'create_ck_def' without taking it from
&parser->create_ck_def again.

>  	assert(alter_def->entity_type == ENTITY_TYPE_CK);
>  	(void) alter_def;
> -	struct Expr *expr = ck_def->expr->pExpr;
>  	struct space *space = parser->create_table_def.new_space;
> -	if (space != NULL) {
> -		expr->u.zToken =
> -			sqlDbStrNDup(parser->db,
> -				     (char *) ck_def->expr->zStart,
> -				     (int) (ck_def->expr->zEnd -
> -					    ck_def->expr->zStart));
> -		if (expr->u.zToken == NULL)
> -			goto release_expr;
> -		space->def->opts.checks =
> -			sql_expr_list_append(parser->db,
> -					     space->def->opts.checks, expr);
> -		if (space->def->opts.checks == NULL) {
> -			sqlDbFree(parser->db, expr->u.zToken);
> -			goto release_expr;
> -		}
> -		struct create_entity_def *entity_def = &ck_def->base.base;
> -		if (entity_def->name.n > 0) {
> -			sqlExprListSetName(parser, space->def->opts.checks,
> -					   &entity_def->name, 1);
> +	assert(space != NULL);
> +
> +	/* Prepare payload for ck constraint definition. */
> +	struct region *region = &parser->region;
> +	struct Token *name_token = &create_ck_def->base.base.name;
> +	const char *name;
> +	if (name_token->n > 0) {
> +		name = sql_normalized_name_region_new(region, name_token->z,
> +						      name_token->n);
> +		if (name == NULL) {
> +			parser->is_aborted = true;
> +			return;
>  		}
>  	} else {
> -release_expr:
> -		sql_expr_delete(parser->db, expr, false);
> +		uint32_t ck_idx = ++parser->create_table_def.check_count;
> +		name = tt_sprintf("CK_CONSTRAINT_%d_%s", ck_idx,
> +				  space->def->name);
> +	}
> +	size_t name_len = strlen(name);
> +
> +	uint32_t expr_str_len = (uint32_t)(create_ck_def->expr->zEnd -
> +					   create_ck_def->expr->zStart);

17. create_ck_def->expr is expr_span, which you obtained above.

> +	const char *expr_str = create_ck_def->expr->zStart;
> +
> +	/*
> +	 * Allocate memory for ck constraint parse structure and
> +	 * ck constraint definition as a single memory chunk on
> +	 * region:
> +	 *
> +	 *    [ck_parse][ck_def[name][expr_str]]
> +	 *         |_____^  |___^     ^
> +	 *                  |_________|
> +	 */




More information about the Tarantool-patches mailing list