[tarantool-patches] Re: [PATCH v1 4/4] sql: got rid of redundant bitmask helpers
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Fri Feb 15 20:05:10 MSK 2019
Thanks for the patch! See 2 comments below.
On 08/02/2019 13:52, Kirill Shcherbatov wrote:
> The original SQLite code allowed the program to be compiled using
> 32-bit masks to save disk space. Since we are not going to
> compile Tarantool in such way, we will get rid of these macros by
> replacing them(where possible) with helpers from column_mask.h
>
> Closes #3571
> ---
> src/box/sql/expr.c | 31 ++++++-----
> src/box/sql/resolve.c | 12 +----
> src/box/sql/sqliteInt.h | 24 ++-------
> src/box/sql/where.c | 110 ++++++++++++++++++----------------------
> src/box/sql/whereInt.h | 3 +-
> 5 files changed, 74 insertions(+), 106 deletions(-)
>
> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> index cf82bd366..613c29bbb 100644
> --- a/src/box/sql/expr.c
> +++ b/src/box/sql/expr.c
> @@ -2456,19 +2456,20 @@ sqlite3FindInIndex(Parse * pParse, /* Parsing context */
> eType == 0; ++k) {
> struct index *idx = space->index[k];
> Bitmask colUsed; /* Columns of the index used */
> - Bitmask mCol; /* Mask for the current column */
> uint32_t part_count =
> idx->def->key_def->part_count;
> struct key_part *parts =
> idx->def->key_def->parts;
> if ((int)part_count < nExpr)
> continue;
> - /* Maximum nColumn is BMS-2, not BMS-1, so that we can compute
> - * BITMASK(nExpr) without overflowing
> + /*
> + * Maximum nColumn is
1. There is no nColumn variable.
> + * COLUMN_MASK_SIZE-2, not
> + * COLUMN_MASK_SIZE-1, so that we
> + * can compute bitmask without
> + * overflowing.
> */
> - testcase(part_count == BMS - 2);
> - testcase(part_count == BMS - 1);
> - if (part_count >= BMS - 1)
> + if (part_count >= COLUMN_MASK_SIZE - 1)
> continue;
> if (mustBeUnique &&
> ((int)part_count > nExpr ||
> @@ -2504,19 +2505,23 @@ sqlite3FindInIndex(Parse * pParse, /* Parsing context */
> }
> if (j == nExpr)
> break;
> - mCol = MASKBIT(j);
> - if (mCol & colUsed)
> - break; /* Each column used only once */
> - colUsed |= mCol;
> + /*
> + * Each column used only
> + * once.
> + */
> + if (column_mask_fieldno_is_set(colUsed,
> + j))
> + break;
2. You broke down the behaviour. Before your patch,
mCol in case of j >= BMS was == 0 and the condition
(mCol & colUsed) == false. Now it is possible that
with j >= COLUMN_MASK_SIZE the condition is true.
Even if j was not visited earlier in fact. For example,
if you visited any of columns [63, +inf], then you will
consider visited *all* the columns [63, +inf].
Taking into account the previous patch and the problem
with 32bit VdbeOp.p1 I see a more common problem - column
mask is not scalable nor configurable. For example, here
we do not need a smart mask. We need just a simple one.
On the other hand, in the previous commit we need a 32bit
mask.
Honestly, I do not know what to advice you in order how
to fix it because my solution is likely to be rejected by
Kostja. I can just formalize and list possible options
and let you choose one. Or you can come up with a better
one. And consult Kostja.
My variants:
1. Do not introduce any additional types or macros.
Just make several hardcoded sets of functions.
One set for smart 64 bit mask:
smart_column_mask_fieldno_is_set
smart_column_mask_is_overflowed
smart_column_mask_set_range
smart_column_mask_set_fieldno
One set for simple 64 bit mask:
column_mask64_fieldno_is_set
column_mask64_set_fieldno
One set for simple 32 bit mask:
column_mask32_fieldno_is_set
column_mask32_set_fieldno
2. Turn column_mask.h into a template header, like mhash, rbtree etc.
Before calling include "column_mask.h" you define something like
CMASK_IS_SMART (1 / 0), CMASK_SIZE (32 / 64).
Then column_mask.h uses these macro values in functions names and
bodies.
Example:
#ifndef CMASK_IS_SMART
#error "Expected CMASK_IS_SMART macro"
#endif
#ifndef CMASK_SIZE
#error "Expected CMASK_SIZE macro"
#endif
#if CMASK_IS_SMART == 1
#define PREFIX smart_
#else
#define PREFIX
#endif
#define mask_t uint##CMASK_SIZE##_t
#define cmask(funcname) PREFIX##column_mask##CMASK_SIZE##funcname
static inline void
cmask(set_fieldno)(mask_t *column_mask, uint32_t fieldno)
{
if (CMASK_IS_SMART && fieldno >= COLUMN_MASK_SIZE - 1)
/*
* @sa column_mask key_def declaration for
* details.
*/
*column_mask |= COLUMN_MASK_BIT(COLUMN_MASK_SIZE - 1);
else
*column_mask |= COLUMN_MASK_BIT(fieldno);
}
/** Other functions ... */
#undef CMASK_IS_SMART
#undef CMASK_SIZE
#undef PREFIX
#undef mask_t
#undef cmask
Personally, I prefer the second one. Looks beautiful in my opinion. I will
review the rest after you fixed the present comments.
> + column_mask_set_fieldno(&colUsed, j);
> if (aiMap)
> aiMap[i] = pRhs->iColumn;
> else if (pSingleIdxCol && nExpr == 1)
> *pSingleIdxCol = pRhs->iColumn;
> }
>
> - assert(i == nExpr
> - || colUsed != (MASKBIT(nExpr) - 1));
> - if (colUsed == (MASKBIT(nExpr) - 1)) {
> + assert(i == nExpr ||
> + colUsed != (COLUMN_MASK_BIT(nExpr) - 1));
> + if (colUsed == (COLUMN_MASK_BIT(nExpr) - 1)) {
> /* If we reach this point, that means the index pIdx is usable */
> int iAddr = sqlite3VdbeAddOp0(v, OP_Once);
> VdbeCoverage(v);
More information about the Tarantool-patches
mailing list