[tarantool-patches] Re: [PATCH v5 3/6] sql: introduce tuple_fetcher class

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun May 26 15:05:26 MSK 2019


Hi! Thanks for the fixes! See 10 comments below.

> diff --git a/src/box/sql.h b/src/box/sql.h
> index 15ef74b19..3aaeb2274 100644
> --- a/src/box/sql.h
> +++ b/src/box/sql.h
> @@ -386,6 +386,52 @@ sql_src_list_entry_name(const struct SrcList *list, int i);
>  void
>  sqlSrcListDelete(struct sql *db, struct SrcList *list);
>  
> +/**
> + * Auxilary VDBE structure to speed-up tuple data field access.

1. 'Auxilary' -> 'Auxiliary'. It looks like you still have not installed
a spell checker, as I asked in some previous patchsets. Please, do it.

> + * A memory allocation that manage this structure must have
> + * trailing unused bytes that extends the last 'slots' array.
> + * The amount of reserved memory should correspond to the problem
> + * to be solved and is usually equal to the greatest number of
> + * fields in the tuple.
> + *
> + * +------------------------+
> + * |  struct tuple_fetcher  |
> + * +------------------------+
> + * |     RESERVED MEMORY    |
> + * +------------------------+
> + */
> +struct tuple_fetcher {
> +	/** Tuple pointer or NULL when undefined. */
> +	struct tuple *tuple;
> +	/** Tuple data pointer. */
> +	const char *data;
> +	/** Tuple data size. */
> +	uint32_t data_sz;
> +	/** Count of fields in tuple. */
> +	uint32_t field_count;
> +	/**
> +	 * Index of the rightmost initialized slot in slots
> +	 * array.
> +	 */
> +	uint32_t rightmost_slot;
> +	/**
> +	 * Array of offsets of tuple fields.
> +	 * Only values <= rightmost_slot are valid.
> +	 */
> +	uint32_t slots[1];
> +};
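
A side note on the layout above: a minimal sketch of how the size of
such an allocation could be computed, given the declared 'slots[1]'
tail. The helper names tuple_fetcher_alloc_size and tuple_fetcher_new
are hypothetical, and calloc() is only a stand-in for whatever
allocator VDBE actually uses; this is an illustration, not the code
from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct tuple;

/* Same fields as in the patch. */
struct tuple_fetcher {
	struct tuple *tuple;
	const char *data;
	uint32_t data_sz;
	uint32_t field_count;
	uint32_t rightmost_slot;
	uint32_t slots[1];
};

/*
 * Hypothetical helper: bytes needed for a fetcher able to cache
 * offsets of max_field_count fields. One slot is already a part
 * of the struct, so only the remaining ones are added.
 */
static size_t
tuple_fetcher_alloc_size(uint32_t max_field_count)
{
	size_t extra = max_field_count > 1 ? max_field_count - 1 : 0;
	return sizeof(struct tuple_fetcher) + extra * sizeof(uint32_t);
}

/* Hypothetical constructor; calloc() is an assumption. */
static struct tuple_fetcher *
tuple_fetcher_new(uint32_t max_field_count)
{
	assert(max_field_count > 0);
	return calloc(1, tuple_fetcher_alloc_size(max_field_count));
}
```
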
> +
> +/**
> + * Initialize a new tuple_fetcher instance with given tuple
> + * data.
> + * @param fetcher The tuple_fetcher instance to initialize.
> + * @param tuple The tuple object pointer or NULL when undefined.
> + * @param data The tuple data (is always defined).
> + * @param data_sz The size of tuple data (is always defined).
> + */
> +void
> +tuple_fetcher_create(struct tuple_fetcher *fetcher, struct tuple *tuple,
> +		     const char *data, uint32_t data_sz);

2. Great, I like this function, especially that it does not have
a single 'if' - a perfect case for the processor's instruction pipeline.

But the API is ambiguous - what should a user pass here? Always both
tuple and data, or tuple and different data, or tuple and its data,
or tuple + NULL vs NULL + data?

Please, expose two public functions: tuple_fetcher_prepare_data and
tuple_fetcher_prepare_tuple. Each takes either const char *data or
struct tuple *tuple. Internally they call tuple_fetcher_create with
both tuple and data.

Once you have done that, you can stop calling tarantoolsqlPayloadFetch in
OP_Column and call either tuple_fetcher_prepare_tuple(pCrsr->last_tuple)
for a normal cursor, or
tuple_fetcher_prepare_data(pReg->z, pReg->n) for a pseudo cursor.
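
A rough sketch of the two wrappers, to show the idea. The struct
tuple below is a simplified stand-in (the real one has a different
layout, and its data would be obtained through the tuple API), and
the fetcher is reduced to the three fields that matter here. Treat
the bodies as an assumption, not as the final code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Tarantool's struct tuple; the real layout differs. */
struct tuple {
	const char *data;
	uint32_t data_sz;
};

/* Reduced fetcher: slots and counters are omitted in this sketch. */
struct tuple_fetcher {
	struct tuple *tuple;
	const char *data;
	uint32_t data_sz;
};

/* Internal initializer, same signature as in the patch. */
static void
tuple_fetcher_create(struct tuple_fetcher *fetcher, struct tuple *tuple,
		     const char *data, uint32_t data_sz)
{
	fetcher->tuple = tuple;
	fetcher->data = data;
	fetcher->data_sz = data_sz;
}

/* Raw data only, e.g. a pseudo cursor: there is no tuple object. */
void
tuple_fetcher_prepare_data(struct tuple_fetcher *fetcher, const char *data,
			   uint32_t data_sz)
{
	assert(data != NULL);
	tuple_fetcher_create(fetcher, NULL, data, data_sz);
}

/* Tuple only: the data pointer and size are taken from the tuple. */
void
tuple_fetcher_prepare_tuple(struct tuple_fetcher *fetcher,
			    struct tuple *tuple)
{
	assert(tuple != NULL);
	tuple_fetcher_create(fetcher, tuple, tuple->data, tuple->data_sz);
}
```

With that split a caller can not pass a tuple together with
unrelated data, so the ambiguity goes away.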

>  
>  #if defined(__cplusplus)
>  } /* extern "C" { */
> diff --git a/src/box/sql/tarantoolInt.h b/src/box/sql/tarantoolInt.h
> index 2b04d961e..bb0f8f1a4 100644
> --- a/src/box/sql/tarantoolInt.h
> +++ b/src/box/sql/tarantoolInt.h
> @@ -28,7 +28,8 @@ const void *tarantoolsqlPayloadFetch(BtCursor * pCur, u32 * pAmt);
>   *         offset to @a fieldno.
>   */
>  const void *
> -tarantoolsqlTupleColumnFast(BtCursor *pCur, u32 fieldno, u32 *field_size);
> +tarantool_tuple_field_fast(struct tuple *tuple, uint32_t fieldno,
> +			   uint32_t *field_size);

3. I suggest making this function a static inline method of the fetcher,
right above tuple_fetcher_fetch(). It is not needed in any other place.

>  
>  int tarantoolsqlFirst(BtCursor * pCur, int *pRes);
>  int tarantoolsqlLast(BtCursor * pCur, int *pRes);
> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index 5d37f63fb..33c5458cd 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -612,6 +612,119 @@ mem_type_to_str(const struct Mem *p)
>  	}
>  }
>  
> +/**
> + * Fetch field by field_idx using tuple_fetcher and store result
> + * in dest_mem.
> + * @param fetcher The initialized tuple_fetcher instance to use.
> + * @param field_idx The id of the field to fetch.
> + * @param field_type The destination memory field type is used
> + *                   when fetch result type is undefined.

4. The only reason to make tuple_fetcher_fetch() a separate function
was to get rid of this parameter and the related code, keeping them in
OP_Column. Please, do that: remove the parameter and keep the related
code in OP_Column.

> + * @param default_val_mem The value to return when fetcher's data
> + *                        lacks field by given @a field_idx.

5. Why would OP_Fetch ever need that parameter? It is not a
task of a fetcher to return default fields IMO. It should only fetch
a field, not replace it with a default value.
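
To illustrate the separation meant in comments 4 and 5, here is a toy
model. It works on an int array instead of msgpack, and the names
toy_fetcher, toy_fetcher_fetch and toy_column are made up for this
example. The fetch function only reports whether the field exists;
the caller (the role OP_Column would play) decides what to do when
it does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy fetcher over an int array; the real one decodes msgpack. */
struct toy_fetcher {
	const int *fields;
	uint32_t field_count;
};

/*
 * Pure fetch: returns false when the field is absent and never
 * substitutes a value on its own.
 */
static bool
toy_fetcher_fetch(const struct toy_fetcher *fetcher, uint32_t field_idx,
		  int *out)
{
	assert(fetcher != NULL && out != NULL);
	if (field_idx >= fetcher->field_count)
		return false;
	*out = fetcher->fields[field_idx];
	return true;
}

/* Caller-side policy: the default value is applied here only. */
static int
toy_column(const struct toy_fetcher *fetcher, uint32_t field_idx,
	   int default_val)
{
	int value;
	if (!toy_fetcher_fetch(fetcher, field_idx, &value))
		return default_val;
	return value;
}
```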

> + * @param[out] dest_mem The memory variable to store result.
> + * @retval SQL_OK Status code in case of success.
> + * @retval sql_ret_code Error code otherwise.
> + */
> +static int
> +tuple_fetcher_fetch(struct tuple_fetcher *fetcher, uint32_t field_idx,
> +		    enum field_type field_type, struct Mem *default_val_mem,
> +		    struct Mem *dest_mem)
> @@ -2614,19 +2722,21 @@ case OP_Column: {
>  
> +/* Opcode: Fetch P1 P2 P3 P4 P5
> + * Synopsis: r[P3]=PX
> + *
> + * Interpret the data that P1 points as an initialized

6. "Interpret data P1 points at as an initialized ...".

> + * tuple_fetcher object.
> + *
> + * Fetch the P2-th column from their tuple. The value extracted

7. 'Their' is for plural. Use 'its'.

> + * is stored in register P3.
> + *
> + * If the column contains fewer than P2 fields, then extract
> + * a NULL. Or, if the P4 argument is a P4_MEM use the value of
> + * the P4 argument as the result.

8. You never generate OP_Fetch with a default value. Please,
drop the P4 argument, as well as P5. They are never used, even in the
last patch.

> + *
> + * Value of P5 register is an 'expected' destination value type.

9. It is not 'expected'. I can't pass FIELD_TYPE_STRING here and get
a string value regardless of the actually stored type. It is rather a
flag saying that you need to transform 'int' into 'number'. Anyway, it
should be dropped.

> + */
> +case OP_Fetch: {
> +	struct tuple_fetcher *fetcher =
> +		(struct tuple_fetcher *) p->aMem[pOp->p1].u.p;
> +	uint32_t field_idx = pOp->p2;
> +	struct Mem *dest_mem = &aMem[pOp->p3];
> +	struct Mem *default_val_mem =
> +		pOp->p4type == P4_MEM ? pOp->p4.pMem : NULL;
> +	enum field_type field_type = pOp->p5;
> +	memAboutToChange(p, dest_mem);
> +	rc = tuple_fetcher_fetch(fetcher, field_idx, field_type,
> +			         default_val_mem, dest_mem);
> +	if (rc != SQL_OK)
> +		goto abort_due_to_error;
> +	UPDATE_MAX_BLOBSIZE(dest_mem);

10. Why is this command not a part of tuple_fetcher_fetch()?

> +	REGISTER_TRACE(p, pOp->p3, dest_mem);
> +	break;
> +}



