From: Mergen Imeev via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: v.shpilevoy@tarantool.org
Cc: tarantool-patches@dev.tarantool.org
Subject: [Tarantool-patches] [PATCH v2 4/4] sql: introduce FINALIZE for custom aggregate
Date: Tue, 1 Feb 2022 16:37:27 +0300 [thread overview]
Message-ID: <162eb863bee6a45a63ee85604ce004d83e8aedfa.1643722506.git.imeevma@gmail.com> (raw)
In-Reply-To: <cover.1643722506.git.imeevma@gmail.com>
This patch introduces a way to create a FINALIZER for user-defined
aggregate functions.
Closes #2579
@TarantoolBot document
Title: User-defined aggregate functions
User-defined aggregate functions are now available. To create a
user-defined aggregate function, the 'aggregate' option should be set to
'group'. The function must have at least one argument. The last
argument is always the result of the function in the previous step, or
NULL if it is the first step. The last argument will be added
automatically.
Example:
```
box.schema.func.create("F1", {language = "Lua",
body = [[
function(x, state)
if state == nil then
state = {sum = 0, count = 0}
end
state.sum = state.sum + x
state.count = state.count + 1
return state
end
]],
param_list = {"integer", "map"},
returns = "map",
aggregate = "group",
exports = {"SQL"},
})
box.execute('CREATE TABLE t (i INT PRIMARY KEY);')
box.execute('INSERT INTO t VALUES(1), (2), (3), (4), (5);')
box.execute('SELECT f1(i) FROM t;')
```
The result will be:
```
tarantool> box.execute([[select f1(i) from t;]])
---
- metadata:
- name: COLUMN_1
type: map
rows:
- [{'sum': 15, 'count': 5}]
...
```
To create a finalizer for a function, another function should be
created. This function should follow the following rules:
1) its name should be '<function name>_finalize';
2) it must take exactly one argument;
3) it must be a non-aggregate function.
If an aggregate function has a finalizer, the result of the aggregate
function will be determined by the finalizer.
Example:
```
box.schema.func.create("F1_finalize", {
language = "Lua",
body = [[
function(state)
if state == nil then
return 0
end
return state.sum / state.count
end
]],
param_list = {"map"},
returns = "number",
exports = {'LUA', 'SQL'},
})
box.execute('SELECT f1(i) FROM t;')
```
The result will be:
```
tarantool> box.execute([[select f1(i) from t;]])
---
- metadata:
- name: COLUMN_1
type: number
rows:
- [3]
...
```
---
.../gh-2579-introduce-custom-aggregates.md | 3 +
src/box/sql/func.c | 13 ++
src/box/sql/resolve.c | 12 ++
src/box/sql/select.c | 7 ++
src/box/sql/sqlInt.h | 4 +
.../sql-tap/gh-2579-custom-aggregate.test.lua | 113 +++++++++++++++++-
6 files changed, 151 insertions(+), 1 deletion(-)
create mode 100644 changelogs/unreleased/gh-2579-introduce-custom-aggregates.md
diff --git a/changelogs/unreleased/gh-2579-introduce-custom-aggregates.md b/changelogs/unreleased/gh-2579-introduce-custom-aggregates.md
new file mode 100644
index 000000000..7975f9f08
--- /dev/null
+++ b/changelogs/unreleased/gh-2579-introduce-custom-aggregates.md
@@ -0,0 +1,3 @@
+## feature/sql
+
+* user-defined aggregate functions are now available in SQL (gh-2579).
diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index cda872194..a445dbc3a 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -2071,6 +2071,19 @@ sql_func_find(struct Expr *expr)
return func;
}
+struct func *
+sql_func_finalize(const char *name)
+{
+ const char *finalize_name = tt_sprintf("%s_finalize", name);
+ uint32_t len = strlen(finalize_name);
+ struct func *finalize = func_by_name(finalize_name, len);
+ if (finalize == NULL ||
+ finalize->def->param_count != 1 ||
+ finalize->def->aggregate == FUNC_AGGREGATE_GROUP)
+ return NULL;
+ return finalize;
+}
+
uint32_t
sql_func_flags(const char *name)
{
diff --git a/src/box/sql/resolve.c b/src/box/sql/resolve.c
index 22b4e6799..095e6d6a8 100644
--- a/src/box/sql/resolve.c
+++ b/src/box/sql/resolve.c
@@ -663,6 +663,18 @@ resolveExprStep(Walker * pWalker, Expr * pExpr)
return WRC_Abort;
}
pExpr->type = func->def->returns;
+ /*
+ * In case a user-defined aggregate function was called,
+ * the result type will be the result type of the
+ * FINALIZE part of the function.
+ */
+ if (func->def->language != FUNC_LANGUAGE_SQL_BUILTIN &&
+ func->def->aggregate == FUNC_AGGREGATE_GROUP) {
+ const char *name = pExpr->u.zToken;
+ struct func *finalize = sql_func_finalize(name);
+ if (finalize != NULL)
+ pExpr->type = finalize->def->returns;
+ }
assert(!func->def->is_deterministic ||
(pNC->ncFlags & NC_IdxExpr) == 0);
if (func->def->is_deterministic)
diff --git a/src/box/sql/select.c b/src/box/sql/select.c
index dcea48c6e..a3fe2f60f 100644
--- a/src/box/sql/select.c
+++ b/src/box/sql/select.c
@@ -5566,7 +5566,14 @@ finalize_agg_function(struct Vdbe *vdbe, const struct AggInfo_func *agg_func)
if (agg_func->func->def->language == FUNC_LANGUAGE_SQL_BUILTIN) {
sqlVdbeAddOp1(vdbe, OP_AggFinal, agg_func->iMem);
sqlVdbeAppendP4(vdbe, agg_func->func, P4_FUNC);
+ return;
}
+ if (sql_func_finalize(agg_func->pExpr->u.zToken) == NULL)
+ return;
+ const char *name = tt_sprintf("%s_finalize", agg_func->pExpr->u.zToken);
+ const char *str = sqlDbStrDup(sql_get(), name);
+ sqlVdbeAddOp4(vdbe, OP_FunctionByName, 1, agg_func->iMem,
+ agg_func->iMem, str, P4_DYNAMIC);
}
/*
diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
index f49522dc8..8d0321f56 100644
--- a/src/box/sql/sqlInt.h
+++ b/src/box/sql/sqlInt.h
@@ -4308,6 +4308,10 @@ sql_func_find(struct Expr *expr);
int
sql_emit_args_types(struct Vdbe *v, int reg, struct func *base, uint32_t argc);
+/** Return a function that is a finalizer for function with given name. */
+struct func *
+sql_func_finalize(const char *name);
+
/**
* Return the parameters of the function with the given name. If the function
* with the given name does not exist, or the function is not a built-in SQL
diff --git a/test/sql-tap/gh-2579-custom-aggregate.test.lua b/test/sql-tap/gh-2579-custom-aggregate.test.lua
index 213e2e870..fc1633b95 100755
--- a/test/sql-tap/gh-2579-custom-aggregate.test.lua
+++ b/test/sql-tap/gh-2579-custom-aggregate.test.lua
@@ -3,7 +3,7 @@ local build_path = os.getenv("BUILDDIR")
package.cpath = build_path..'/test/sql-tap/?.so;'..build_path..'/test/sql-tap/?.dylib;'..package.cpath
local test = require("sqltester")
-test:plan(5)
+test:plan(9)
test:execsql([[
CREATE TABLE t (i INT PRIMARY KEY);
@@ -105,6 +105,117 @@ test:do_test(
"least one argument"
})
+-- Make sure finalizers of aggregate functions work as intended.
+box.schema.func.create("F1_finalize", {
+ language = "Lua",
+ body = [[
+ function(state)
+ if state == nil then
+ return 0
+ end
+ return state.sum / state.count
+ end
+ ]],
+ param_list = {"map"},
+ returns = "number",
+ exports = {'LUA', 'SQL'},
+})
+
+test:do_execsql_test(
+ "gh-2579-6",
+ [[
+ SELECT f1(i), typeof(f1(i)) from t;
+ ]], {
+ 3, "integer"
+ })
+
+--
+-- Make sure that the finalizers of the aggregate function and the aggregate
+-- function can be of different types.
+--
+box.schema.func.create("F2_finalize", {
+ language = "Lua",
+ body = [[
+ function(state)
+ if state == nil then
+ return 0
+ end
+ return state[#state]
+ end
+ ]],
+ param_list = {"array"},
+ returns = "number",
+ exports = {'LUA', 'SQL'},
+})
+
+test:do_execsql_test(
+ "gh-2579-7",
+ [[
+ SELECT f2(i) from t;
+ ]], {
+ 5
+ })
+
+-- Make sure that aggregate function cannot become finalizer.
+box.schema.func.drop("F2_finalize")
+box.schema.func.create("F2_finalize", {
+ language = "Lua",
+ body = [[
+ function(state)
+ if state == nil then
+ return 0
+ end
+ sum = 0
+ for k, v in pairs(state) do
+ sum = sum + v
+ end
+ return sum
+ end
+ ]],
+ aggregate = "group",
+ param_list = {"array"},
+ returns = "number",
+ exports = {'SQL'},
+})
+
+test:do_execsql_test(
+ "gh-2579-8",
+ [[
+ SELECT f2(i), typeof(f2(i)) from t;
+ ]], {
+ {1, 2, 3, 4, 5}, "array"
+ })
+
+--
+-- Make sure that function with number of arguments not equal to 1 cannot become
+-- finalizer.
+--
+box.schema.func.drop("F2_finalize")
+box.schema.func.create("F2_finalize", {
+ language = "Lua",
+ body = [[
+ function(state, x)
+ if state == nil then
+ return 0
+ end
+ return state[#state] * x
+ end
+ ]],
+ param_list = {"map", "integer"},
+ returns = "number",
+ exports = {'LUA', 'SQL'},
+})
+
+test:do_execsql_test(
+ "gh-2579-9",
+ [[
+ SELECT f2(i), typeof(f2(i)), "F2_finalize"(f2(i), 2) from t;
+ ]], {
+ {1, 2, 3, 4, 5}, "array", 10
+ })
+
+box.schema.func.drop('F2_finalize')
+box.schema.func.drop('F1_finalize')
box.schema.func.drop('gh-2579-custom-aggregate.f3')
box.schema.func.drop('F2')
box.schema.func.drop('F1')
--
2.25.1
next prev parent reply other threads:[~2022-02-01 13:39 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-01 13:37 [Tarantool-patches] [PATCH v2 0/4] Introduce custom aggregate function Mergen Imeev via Tarantool-patches
2022-02-01 13:37 ` [Tarantool-patches] [PATCH v2 1/4] sql: fix COUNT() optimization conditions Mergen Imeev via Tarantool-patches
2022-02-01 13:37 ` [Tarantool-patches] [PATCH v2 2/4] sql: drop unnecessary P2 register for OP_AggFinal Mergen Imeev via Tarantool-patches
2022-02-01 13:37 ` [Tarantool-patches] [PATCH v2 3/4] sql: introduce custom aggregate functions Mergen Imeev via Tarantool-patches
2022-02-03 23:29 ` Vladislav Shpilevoy via Tarantool-patches
2022-02-01 13:37 ` Mergen Imeev via Tarantool-patches [this message]
2022-02-03 23:30 ` [Tarantool-patches] [PATCH v2 4/4] sql: introduce FINALIZE for custom aggregate Vladislav Shpilevoy via Tarantool-patches
2022-02-10 9:21 ` Mergen Imeev via Tarantool-patches
2022-02-10 9:14 [Tarantool-patches] [PATCH v2 0/4] Introduce custom aggregate functions Mergen Imeev via Tarantool-patches
2022-02-10 9:14 ` [Tarantool-patches] [PATCH v2 4/4] sql: introduce FINALIZE for custom aggregate Mergen Imeev via Tarantool-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=162eb863bee6a45a63ee85604ce004d83e8aedfa.1643722506.git.imeevma@gmail.com \
--to=tarantool-patches@dev.tarantool.org \
--cc=imeevma@tarantool.org \
--cc=v.shpilevoy@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH v2 4/4] sql: introduce FINALIZE for custom aggregate' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox