From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <tarantool-patches-bounces@dev.tarantool.org>
Received: from [87.239.111.99] (localhost [127.0.0.1])
	by dev.tarantool.org (Postfix) with ESMTP id 0CCD36EC40;
	Fri, 13 Aug 2021 18:12:33 +0300 (MSK)
DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 0CCD36EC40
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev;
	t=1628867553; bh=BSm50agWIbRnm6fnjIZYl7UssaH81iaQIdo1Q1790VI=;
	h=Date:To:Cc:References:In-Reply-To:Subject:List-Id:
	 List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe:
	 From:Reply-To:From;
	b=aEt4bpZ8xFQezSjLC6UiV5pFKMs5U7B+ntkip9VQF2DUxHKmvQEjZgHyr6nFt7MOc
	 zwM96sFB0D7FCjsm0O5Z5oKRlD4OeNPhIm/IUVtxr8XPdXuqwMglxjAdTGlRwL+zFI
	 T23A/AkzivfiiK9qohgueOa3RuVqFweBLtmKOn44=
Received: from smtpng1.i.mail.ru (smtpng1.i.mail.ru [94.100.181.251])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id B5F826EC40
 for <tarantool-patches@dev.tarantool.org>;
 Fri, 13 Aug 2021 18:12:31 +0300 (MSK)
DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org B5F826EC40
Received: by smtpng1.m.smailru.net with esmtpa (envelope-from
 <vdavydov@tarantool.org>)
 id 1mEYrG-00079y-TW; Fri, 13 Aug 2021 18:12:31 +0300
Date: Fri, 13 Aug 2021 18:12:29 +0300
To: imeevma@tarantool.org
Cc: tarantool-patches@dev.tarantool.org, Mergen Imeev <imeevma@gmail.com>
Message-ID: <20210813151229.dish2wtuyu32zc72@esperanza>
References: <20eb7c0115c8b3c90386da95a0ddf64dccbd6b0f.1628824097.git.imeevma@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20eb7c0115c8b3c90386da95a0ddf64dccbd6b0f.1628824097.git.imeevma@gmail.com>
X-7564579A: 646B95376F6C166E
X-77F55803: 4F1203BC0FB41BD92087353F0EC44DD910164DC12A5633065676A9727AC27C74182A05F538085040C0393AF93D1047F181EA2DA42E7563310F101D2A5BBE6828D73156CEDFD02BFD
X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE7F4E79F226F99D4DDEA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637334E2757C55E8D4EEA1F7E6F0F101C6723150C8DA25C47586E58E00D9D99D84E1BDDB23E98D2D38BBCA57AF85F7723F2CA6C5B18F4F6D89149C3391549A7BEBBCC7F00164DA146DAFE8445B8C89999728AA50765F7900637F6B57BC7E64490618DEB871D839B7333395957E7521B51C2DFABB839C843B9C08941B15DA834481F8AA50765F7900637F924B32C592EA89F389733CBF5DBD5E9B5C8C57E37DE458B9E9CE733340B9D5F3BBE47FD9DD3FB595F5C1EE8F4F765FC72CEEB2601E22B093A03B725D353964B0B7D0EA88DDEDAC722CA9DD8327EE4930A3850AC1BE2E735D05AD665AB97B35DC4224003CC83647689D4C264860C145E
X-C1DE0DAB: 0D63561A33F958A590A0E651EC58FFCCF5A8FF3D7BE31FEB40D0C06346C8EBAED59269BC5F550898D99A6476B3ADF6B47008B74DF8BB9EF7333BD3B22AA88B938A852937E12ACA750E14360347543F58410CA545F18667F91A7EA1CDA0B5A7A0
X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D348BE83DFD8AFB1CACADE09E30B0EC8C4F45B474ACBD01D3C88444499F7DAB9B73F2F3892D1C2ED9411D7E09C32AA3244C73846519361284E43B0425AD55E1FCAEE646F07CC2D4F3D8927AC6DF5659F194
X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2bioj0dLV0c3jbkybYNMFG344Kg==
X-Mailru-Sender: 689FA8AB762F7393C37E3C1AEC41BA5DA1F0B212C42E6265E421CBB480E6F7E4274CEFED1673C562683ABF942079399BFB559BB5D741EB966A65DFF43FF7BE03240331F90058701C67EA787935ED9F1B
X-Mras: Ok
Subject: Re: [Tarantool-patches] [PATCH v1 1/1] sql: define ephemeral space
 fields
X-BeenThere: tarantool-patches@dev.tarantool.org
X-Mailman-Version: 2.1.34
Precedence: list
List-Id: Tarantool development patches <tarantool-patches.dev.tarantool.org>
List-Unsubscribe: <https://lists.tarantool.org/mailman/options/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=unsubscribe>
List-Archive: <https://lists.tarantool.org/pipermail/tarantool-patches/>
List-Post: <mailto:tarantool-patches@dev.tarantool.org>
List-Help: <mailto:tarantool-patches-request@dev.tarantool.org?subject=help>
List-Subscribe: <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=subscribe>
From: Vladimir Davydov via Tarantool-patches
 <tarantool-patches@dev.tarantool.org>
Reply-To: Vladimir Davydov <vdavydov@tarantool.org>
Errors-To: tarantool-patches-bounces@dev.tarantool.org
Sender: "Tarantool-patches" <tarantool-patches-bounces@dev.tarantool.org>

On Fri, Aug 13, 2021 at 06:09:19AM +0300, imeevma@tarantool.org wrote:
> diff --git a/src/box/sql.c b/src/box/sql.c
> index c805a1e5c..2b95aed8d 100644
> --- a/src/box/sql.c
> +++ b/src/box/sql.c
> @@ -243,114 +243,163 @@ tarantoolsqlCount(struct BtCursor *pCur)
>  	return index_count(pCur->index, pCur->iter_type, NULL, 0);
>  }
>  
> -struct space *
> -sql_ephemeral_space_create(uint32_t field_count, struct sql_key_info *key_info)
> +struct space_info *
> +sql_space_info_new(uint32_t field_count, uint32_t part_count)
>  {
> -	struct key_def *def = NULL;
> -	uint32_t part_count = field_count;
> -	if (key_info != NULL) {
> -		def = sql_key_info_to_key_def(key_info);
> -		if (def == NULL)
> -			return NULL;
> -		/*
> -		 * In case is_pk_rowid is true we can use rowid
> -		 * as the only part of the key.
> -		 */
> -		if (key_info->is_pk_rowid)
> -			part_count = 1;
> +	assert(field_count > 0);
> +	uint32_t info_size = sizeof(struct space_info);
> +	uint32_t field_size = field_count * sizeof(enum field_type);
> +	uint32_t colls_size = field_count * sizeof(uint32_t);
> +	uint32_t parts_size = part_count * sizeof(uint32_t);
> +	uint32_t size = info_size + field_size + colls_size + parts_size;
> +
> +	struct space_info *info = sqlDbMallocRawNN(sql_get(), size);
> +	if (info == NULL) {
> +		diag_set(OutOfMemory, size, "sqlDbMallocRawNN", "info");
> +		return NULL;
>  	}
> +	info->types = (enum field_type *)((char *)info + info_size);
> +	info->coll_ids = (uint32_t *)((char *)info->types + field_size);
> +	info->parts = part_count == 0 ? NULL :
> +		      (uint32_t *)((char *)info->coll_ids + colls_size);
> +	info->field_count = field_count;
> +	info->part_count = part_count;
> +	info->sort_order = SORT_ORDER_ASC;
>  
> -	struct region *region = &fiber()->gc;
> +	for (uint32_t i = 0; i < field_count; ++i) {
> +		info->types[i] = FIELD_TYPE_SCALAR;
> +		info->coll_ids[i] = COLL_NONE;
> +	}
> +	for (uint32_t i = 0; i < part_count; ++i)
> +		info->parts[i] = i;
> +	return info;
> +}
> +
> +void
> +sql_space_info_from_space_def(struct space_info *info,
> +			      const struct space_def *def)

Separating allocation from initialization only complicates the protocol:
 - The caller must ensure that the arguments passed to alloc and init
   are compatible.
 - There's a window between alloc and init when the object is unusable.

Please don't:

	struct sql_space_info *
	sql_space_info_new_from_space_def(...);

	struct sql_space_info *
	sql_space_info_new_from_index_def(...);

> +{
> +	assert(info->field_count == def->field_count + 1);
> +	assert(info->part_count == 0);
> +	for (uint32_t i = 0; i < def->field_count; ++i) {
> +		info->types[i] = def->fields[i].type;
> +		info->coll_ids[i] = def->fields[i].coll_id;
> +	}
> +	/* Add one more field for rowid. */
> +	info->types[def->field_count] = FIELD_TYPE_INTEGER;
> +	info->coll_ids[def->field_count] = COLL_NONE;
> +}
> +
> +void
> +sql_space_info_from_index_def(struct space_info *info,
> +			      const struct index_def *def)
> +{
> +	assert(info->field_count >= def->key_def->part_count);
> +	assert(info->part_count == 0);
> +	for (uint32_t i = 0; i < def->key_def->part_count; ++i) {
> +		info->types[i] = def->key_def->parts[i].type;
> +		info->coll_ids[i] = def->key_def->parts[i].coll_id;
> +	}
> +}
> +
> +enum {
>  	/*
> -	 * Name of the fields will be "_COLUMN_1", "_COLUMN_2"
> -	 * and so on. Due to this, length of each name is no more
> -	 * than strlen("_COLUMN_") plus length of UINT32_MAX
> -	 * turned to string, which is 10 and plus 1 for \0.
> +	 * Name of the fields will be "_COLUMN_1", "_COLUMN_2" and so on. Due to
> +	 * this, length of each name is no more than strlen("_COLUMN_") plus
> +	 * length of UINT32_MAX turned to string, which is 10 and plus 1 for \0.
>  	 */
> -	uint32_t name_len = strlen("_COLUMN_") + 11;
> -	uint32_t size = field_count * (sizeof(struct field_def) + name_len) +
> -			part_count * sizeof(struct key_part_def);
> +	NAME_LEN = 19,
> +};

Please move this definition to the function that uses it.

> +
> +struct space *
> +sql_ephemeral_space_new_from_info(const struct space_info *info)

struct space *
sql_ephemeral_space_new(const struct sql_space_info *info);

> +{
> +	uint32_t field_count = info->field_count;
> +	uint32_t part_count = info->parts == NULL ? field_count :
> +			      info->part_count;
> +	uint32_t parts_indent = field_count * sizeof(struct field_def);
> +	uint32_t names_indent = part_count * sizeof(struct key_part_def) +
> +				parts_indent;
> +	uint32_t size = names_indent + field_count * NAME_LEN;
> +
> +	struct region *region = &fiber()->gc;
> +	size_t svp = region_used(region);
>  	struct field_def *fields = region_aligned_alloc(region, size,
>  							alignof(fields[0]));
>  	if (fields == NULL) {
>  		diag_set(OutOfMemory, size, "region_aligned_alloc", "fields");
>  		return NULL;
>  	}
> -	struct key_part_def *ephemer_key_parts =
> -		(void *)fields + field_count * sizeof(struct field_def);
> -	static_assert(alignof(*fields) == alignof(*ephemer_key_parts),
> -		      "allocated in one block, and should have the same "
> -		      "alignment");
> -	char *names = (char *)ephemer_key_parts +
> -		       part_count * sizeof(struct key_part_def);
> -	for (uint32_t i = 0; i < field_count; ++i) {
> -		struct field_def *field = &fields[i];
> -		field->name = names;
> -		names += name_len;
> -		sprintf(field->name, "_COLUMN_%d", i);
> -		field->is_nullable = true;
> -		field->nullable_action = ON_CONFLICT_ACTION_NONE;
> -		field->default_value = NULL;
> -		field->default_value_expr = NULL;
> -		if (def != NULL && i < def->part_count) {
> -			assert(def->parts[i].type < field_type_MAX);
> -			field->type = def->parts[i].type;
> -			field->coll_id = def->parts[i].coll_id;
> -		} else {
> -			field->coll_id = COLL_NONE;
> -			field->type = FIELD_TYPE_SCALAR;
> +	struct key_part_def *parts = (struct key_part_def *)((char *)fields +
> +							     parts_indent);
> +	static_assert(alignof(*fields) == alignof(*parts), "allocated in one "
> +		      "block, and should have the same alignment");
> +	char *names = (char *)fields + names_indent;
> +
> +	for (uint32_t i = 0; i < info->field_count; ++i) {
> +		fields[i].name = &names[i * NAME_LEN];
> +		sprintf(fields[i].name, "_COLUMN_%d", i);
> +		fields[i].is_nullable = true;
> +		fields[i].nullable_action = ON_CONFLICT_ACTION_NONE;
> +		fields[i].default_value = NULL;
> +		fields[i].default_value_expr = NULL;
> +		fields[i].type = info->types[i];
> +		fields[i].coll_id = info->coll_ids[i];
> +	}
> +	if (info->parts == NULL) {
> +		for (uint32_t i = 0; i < part_count; ++i) {
> +			parts[i].fieldno = i;
> +			parts[i].nullable_action = ON_CONFLICT_ACTION_NONE;
> +			parts[i].is_nullable = true;
> +			parts[i].exclude_null = false;
> +			parts[i].sort_order = info->sort_order;
> +			parts[i].path = NULL;
> +			parts[i].type = info->types[i];
> +			parts[i].coll_id = info->coll_ids[i];
> +		}
> +	} else {
> +		for (uint32_t i = 0; i < part_count; ++i) {
> +			uint32_t j = info->parts[i];
> +			parts[i].fieldno = j;
> +			parts[i].nullable_action = ON_CONFLICT_ACTION_NONE;
> +			parts[i].is_nullable = true;
> +			parts[i].exclude_null = false;
> +			parts[i].sort_order = info->sort_order;
> +			parts[i].path = NULL;
> +			parts[i].type = info->types[j];
> +			parts[i].coll_id = info->coll_ids[j];

	for (uint32_t i = 0; i < part_count; ++i) {
		uint32_t j = info->parts != NULL ? info->parts[i] : i;
		parts[i].fieldno = j;
		...
	}

>  		}
>  	}
>  
> -	for (uint32_t i = 0; i < part_count; ++i) {
> -		struct key_part_def *part = &ephemer_key_parts[i];
> -		/*
> -		 * In case we need to save initial order of
> -		 * inserted into ephemeral space rows we use rowid
> -		 * as the only part of PK. If ephemeral space has
> -		 * a rowid, it is always the last column.
> -		 */
> -		uint32_t j = i;
> -		if (key_info != NULL && key_info->is_pk_rowid)
> -			j = field_count - 1;
> -		part->fieldno = j;
> -		part->nullable_action = ON_CONFLICT_ACTION_NONE;
> -		part->is_nullable = true;
> -		part->exclude_null = false;
> -		part->sort_order = SORT_ORDER_ASC;
> -		part->path = NULL;
> -		part->type = fields[j].type;
> -		part->coll_id = fields[j].coll_id;
> -	}
> -	struct key_def *ephemer_key_def = key_def_new(ephemer_key_parts,
> -						      part_count, false);
> -	if (ephemer_key_def == NULL)
> +	struct key_def *key_def = key_def_new(parts, part_count, false);
> +	if (key_def == NULL)
>  		return NULL;
>  
> -	struct index_def *ephemer_index_def =
> -		index_def_new(0, 0, "ephemer_idx", strlen("ephemer_idx"), TREE,
> -			      &index_opts_default, ephemer_key_def, NULL);
> -	key_def_delete(ephemer_key_def);
> -	if (ephemer_index_def == NULL)
> +	const char *name = "ephemer_idx";
> +	struct index_def *index_def = index_def_new(0, 0, name, strlen(name),
> +						    TREE, &index_opts_default,
> +						    key_def, NULL);
> +	key_def_delete(key_def);
> +	if (index_def == NULL)
>  		return NULL;
>  
>  	struct rlist key_list;
>  	rlist_create(&key_list);
> -	rlist_add_entry(&key_list, ephemer_index_def, link);
> +	rlist_add_entry(&key_list, index_def, link);
>  
> -	struct space_def *ephemer_space_def =
> -		space_def_new_ephemeral(field_count, fields);
> -	if (ephemer_space_def == NULL) {
> -		index_def_delete(ephemer_index_def);
> +	struct space_def *space_def = space_def_new_ephemeral(field_count,
> +							      fields);
> +	if (space_def == NULL) {
> +		index_def_delete(index_def);
>  		return NULL;
>  	}
>  
> -	struct space *ephemer_new_space = space_new_ephemeral(ephemer_space_def,
> -							      &key_list);
> -	index_def_delete(ephemer_index_def);
> -	space_def_delete(ephemer_space_def);
> +	struct space *space = space_new_ephemeral(space_def, &key_list);
> +	index_def_delete(index_def);
> +	space_def_delete(space_def);
> +	region_truncate(region, svp);
>  
> -	return ephemer_new_space;
> +	return space;
>  }
>  
>  int tarantoolsqlEphemeralInsert(struct space *space, const char *tuple,
> diff --git a/src/box/sql/select.c b/src/box/sql/select.c
> index b9107fccc..a3e551017 100644
> --- a/src/box/sql/select.c
> +++ b/src/box/sql/select.c
> @@ -5879,6 +6047,23 @@ sqlSelect(Parse * pParse,		/* The parser context */
>  				      sSort.reg_eph,
>  				      nCols,
>  				      0, (char *)key_info, P4_KEYINFO);
> +		/*
> +		 * We also should define space_info in case this opcode will not
> +		 * be converted to SorterOpen.
> +		 */
> +		uint32_t part_count = sSort.pOrderBy->nExpr + 1;
> +		struct space_info *info = sql_space_info_new(nCols, part_count);
> +		if (info == NULL) {
> +			pParse->is_aborted = true;
> +			goto select_end;
> +		}
> +		key_info->info = info;
> +		if (space_info_for_sorting(info, pParse, sSort.pOrderBy,
> +					   pEList) != 0) {
> +			pParse->is_aborted = true;
> +			goto select_end;
> +		}
> +

I don't understand why you can't pass sql_space_info instead of
sql_key_info. AFAIU, you don't need key_info here at all anymore.
Anyway, keeping a pointer to space_info in key_info looks bad.
Can you avoid that?

>  		sqlVdbeAddOp3(v, OP_IteratorOpen, sSort.iECursor, 0, sSort.reg_eph);
>  		VdbeComment((v, "Sort table"));
>  	} else {
> diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
> index 1f87e6823..d854562d0 100644
> --- a/src/box/sql/sqlInt.h
> +++ b/src/box/sql/sqlInt.h
> @@ -4055,6 +4057,66 @@ sql_key_info_unref(struct sql_key_info *key_info);
>  struct key_def *
>  sql_key_info_to_key_def(struct sql_key_info *key_info);
>  
> +/**
> + * Structure that is used to store information about ephemeral space field types
> + * and fieldno of key parts.
> + */
> +struct space_info {

sql_space_info

> +	/** Field types of all fields of ephemeral space. */
> +	enum field_type *types;
> +	/** Collation ids of all fields of ephemeral space. */
> +	uint32_t *coll_ids;
> +	/**
> +	 * Fieldno key parts of the ephemeral space. If NULL, then the index
> +	 * consists of all fields in sequential order.
> +	 */
> +	uint32_t *parts;
> +	/** Number of fields of ephemetal space. */
> +	uint32_t field_count;
> +	/**
> +	 * Number of parts in primary index of ephemetal space. If 0 then parts
> +	 * is also NULL.
> +	 */
> +	uint32_t part_count;
> +	/** Sort order of index. */
> +	enum sort_order sort_order;
> +};
> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index fcea9eefe..14caa6f82 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -2392,10 +2390,11 @@ open_cursor_set_hints:
>  case OP_OpenTEphemeral: {
>  	assert(pOp->p1 >= 0);
>  	assert(pOp->p2 > 0);
> -	assert(pOp->p4type != P4_KEYINFO || pOp->p4.key_info != NULL);
>  
> -	struct space *space = sql_ephemeral_space_create(pOp->p2,
> -							 pOp->p4.key_info);

You don't use p2 anymore. Do you still need to pass it?

> +	struct space_info *info = pOp->p4type == P4_KEYINFO ?
> +				  pOp->p4.key_info->info : pOp->p4.space_info;
> +	assert(info != NULL);
> +	struct space *space = sql_ephemeral_space_new_from_info(info);
>  
>  	if (space == NULL)
>  		goto abort_due_to_error;