* [Tarantool-patches] [PATCH v2 0/3] Fix working with temporary files in VDBE
@ 2020-12-11 14:49 Leonid Vasiliev
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Leonid Vasiliev @ 2020-12-11 14:49 UTC (permalink / raw)
To: v.shpilevoy, imeevma, korablev, sergos, m.semkin; +Cc: tarantool-patches
https://github.com/tarantool/tarantool/issues/5537
https://github.com/tarantool/tarantool/tree/lvasilev/gh-5537-sql-vdbe-sort-troubles
@ChangeLog Fixed possible crash in SQL. A crash could have occurred
when multiple instances of the tarantool were runs on the same node
and executed a `SELECT` statement with `ORDER BY`, `GROUP BY` clauses
with many results.(#5537)
Changes in v2:
-add common diag_set if sql_execute is failure.
Leonid Vasiliev (2):
sql: set an error to diag in sql_execute() on failure
sql: update temporary file name format
Mergen Imeev (1):
sql: add missing diag_set on failure when working with files inside
SQL module
src/box/execute.c | 12 +++++++++++-
src/box/sql/os_unix.c | 13 +++++++++----
2 files changed, 20 insertions(+), 5 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module
2020-12-11 14:49 [Tarantool-patches] [PATCH v2 0/3] Fix working with temporary files in VDBE Leonid Vasiliev
@ 2020-12-11 14:49 ` Leonid Vasiliev
2020-12-13 18:30 ` Vladislav Shpilevoy
2020-12-15 22:12 ` Vladislav Shpilevoy
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure Leonid Vasiliev
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 3/3] sql: update temporary file name format Leonid Vasiliev
2 siblings, 2 replies; 18+ messages in thread
From: Leonid Vasiliev @ 2020-12-11 14:49 UTC (permalink / raw)
To: v.shpilevoy, imeevma, korablev, sergos, m.semkin
Cc: Mergen Imeev, tarantool-patches
From: Mergen Imeev <imeevma@gmail.com>
SQL module didn't set an error in the diagnostics area on a file
operation failure. This could lead to a crash like in #5537.
Part of #5537
---
src/box/sql/os_unix.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
index b351c55..557d709 100644
--- a/src/box/sql/os_unix.c
+++ b/src/box/sql/os_unix.c
@@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m)
if (fd < 0) {
if (errno == EINTR)
continue;
+ diag_set(SystemError, strerror(errno));
break;
}
if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR)
break;
close(fd);
fd = -1;
- if (open("/dev/null", f, m) < 0)
+ if (open("/dev/null", f, m) < 0) {
+ diag_set(SystemError, strerror(errno));
break;
+ }
}
if (fd >= 0) {
if (m != 0) {
@@ -831,8 +834,10 @@ seekAndWriteFd(int fd, /* File descriptor to write to */
rc = write(fd, pBuf, nBuf);
} while (rc < 0 && errno == EINTR);
- if (rc < 0)
+ if (rc < 0) {
+ diag_set(SystemError, strerror(errno));
*piErrno = errno;
+ }
return rc;
}
--
2.7.4
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure
2020-12-11 14:49 [Tarantool-patches] [PATCH v2 0/3] Fix working with temporary files in VDBE Leonid Vasiliev
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
@ 2020-12-11 14:49 ` Leonid Vasiliev
2020-12-11 15:03 ` Nikita Pettik
2020-12-13 18:30 ` Vladislav Shpilevoy
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 3/3] sql: update temporary file name format Leonid Vasiliev
2 siblings, 2 replies; 18+ messages in thread
From: Leonid Vasiliev @ 2020-12-11 14:49 UTC (permalink / raw)
To: v.shpilevoy, imeevma, korablev, sergos, m.semkin; +Cc: tarantool-patches
In SQL, on failure sometimes an error sets to the diag, sometimes not.
And this can dived to situation as in #5537(SEGFAULT).
So, let's set an error to the diag if the result of `sql_execute()`
is a failure and there is no error in the diag.
Part of #5537
---
After some discussion with Sergos, I added a common diag_set
when sql_execute() fails.
I wanted to add such a common error by `diag_add()` if diag
is not empty, but such a change would entail additional correction in tests.
But this patch should be included in the next release, and I want it to
be as small as possible. This patchset is about fixes a crash, not about
refactoring and improvements. For this I will create separate tasks.
src/box/execute.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/box/execute.c b/src/box/execute.c
index e14da20..87ebb44 100644
--- a/src/box/execute.c
+++ b/src/box/execute.c
@@ -687,8 +687,18 @@ sql_execute(struct sql_stmt *stmt, struct port *port, struct region *region)
rc = sql_step(stmt);
assert(rc != SQL_ROW && rc != 0);
}
- if (rc != SQL_DONE)
+ if (rc != SQL_DONE) {
+ /*
+ * In SQL, on failure sometimes an error sets to the diag,
+ * sometimes not. So, let's set an error to the diag if
+ * the status is a failure and there is no error in the diag.
+ */
+ if (diag_is_empty(diag_get())) {
+ diag_set(ClientError, ER_SQL_EXECUTE,
+ "something went wrong");
+ }
return -1;
+ }
return 0;
}
--
2.7.4
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Tarantool-patches] [PATCH v2 3/3] sql: update temporary file name format
2020-12-11 14:49 [Tarantool-patches] [PATCH v2 0/3] Fix working with temporary files in VDBE Leonid Vasiliev
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure Leonid Vasiliev
@ 2020-12-11 14:49 ` Leonid Vasiliev
2020-12-13 18:30 ` Vladislav Shpilevoy
2 siblings, 1 reply; 18+ messages in thread
From: Leonid Vasiliev @ 2020-12-11 14:49 UTC (permalink / raw)
To: v.shpilevoy, imeevma, korablev, sergos, m.semkin; +Cc: tarantool-patches
The bug was consisted in fail when working with temporary files
created by VDBE to sort large result of a `SELECT` statement with
`ORDER BY`, `GROUP BY` clauses.
Whats happen (step by step):
- We have two instances on one node (sharded cluster).
- A query is created that executes on both.
- The first instance creates the name of the temporary file and
checks a file with such name on existence.
- The second instance creates the name of the temporary file
(the same as in first instance) and checks a file with such name
on existence.
- The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
flag.
- The second instance opens(try to open) the same file.
- The first instance closes (and removes) the temporary file.
- The second instance tries to work with the file and fails.
Why did it happen:
The temporary file name format has a random part, but the random
generator uses a fixed seed.
When it was decided to use a fixed seed:
32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
How the patch fixes the problem:
The patch injects the PID in the temporary file name format.
The generated name is unique for a single process (due to a random part)
and unique between processes (due to the PID part).
Alternatives:
1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
temporary files). In both cases, we need to update a significant
part of the code, and some degradation can be added. It's hard to
review.
2) Return a random seed for the generator. As far as I understand,
we want to have good reproducible system behavior, in which case
it's good to use a fixed seed.
3) Add reopening file with the flags `O_CREAT | O_EXCL` until we
win the fight. Now we set such flags when opening a temporary file,
but after that we try to open the file in `READONLY` mode and
if ok - return the descriptor. This is strange logic for me and I
don't want to add any aditional logic here. Also, such solution will
add additional attempts to open the file.
So, it look like such minimal changes will work fine and are simple
to review.
Co-authored-by: Mergen Imeev<imeevma@gmail.com>
Fixes #5537
---
src/box/sql/os_unix.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
index 557d709..ce415cb 100644
--- a/src/box/sql/os_unix.c
+++ b/src/box/sql/os_unix.c
@@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
assert(nBuf > 2);
zBuf[nBuf - 2] = 0;
sql_snprintf(nBuf, zBuf,
- "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
- r, 0);
+ "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
+ (long)randomnessPid, r, 0);
if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
return -1;
} while (access(zBuf, 0) == 0);
--
2.7.4
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure Leonid Vasiliev
@ 2020-12-11 15:03 ` Nikita Pettik
2020-12-11 15:40 ` Leonid Vasiliev
2020-12-13 18:30 ` Vladislav Shpilevoy
1 sibling, 1 reply; 18+ messages in thread
From: Nikita Pettik @ 2020-12-11 15:03 UTC (permalink / raw)
To: Leonid Vasiliev; +Cc: tarantool-patches, v.shpilevoy, m.semkin
On 11 Dec 17:49, Leonid Vasiliev wrote:
> In SQL, on failure sometimes an error sets to the diag, sometimes not.
> And this can dived to situation as in #5537(SEGFAULT).
> So, let's set an error to the diag if the result of `sql_execute()`
> is a failure and there is no error in the diag.
Personally I am against this patch. In case of error SQL submodule
must always set diagnostic error; if it does no set - something goes
really wrong. In this case crash is better option. Firsly, it provides
coredump which is useful in accident investigation. Secondly, unset
diagnostics may leave SQL in inconsistent state - nobody knows how
severe error happened.
> Part of #5537
> ---
>
> After some discussion with Sergos, I added a common diag_set
> when sql_execute() fails.
> I wanted to add such a common error by `diag_add()` if diag
> is not empty, but such a change would entail additional correction in tests.
> But this patch should be included in the next release, and I want it to
> be as small as possible. This patchset is about fixes a crash, not about
> refactoring and improvements. For this I will create separate tasks.
>
> src/box/execute.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/src/box/execute.c b/src/box/execute.c
> index e14da20..87ebb44 100644
> --- a/src/box/execute.c
> +++ b/src/box/execute.c
> @@ -687,8 +687,18 @@ sql_execute(struct sql_stmt *stmt, struct port *port, struct region *region)
> rc = sql_step(stmt);
> assert(rc != SQL_ROW && rc != 0);
> }
> - if (rc != SQL_DONE)
> + if (rc != SQL_DONE) {
> + /*
> + * In SQL, on failure sometimes an error sets to the diag,
> + * sometimes not. So, let's set an error to the diag if
> + * the status is a failure and there is no error in the diag.
> + */
> + if (diag_is_empty(diag_get())) {
> + diag_set(ClientError, ER_SQL_EXECUTE,
> + "something went wrong");
> + }
> return -1;
> + }
> return 0;
> }
>
> --
> 2.7.4
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure
2020-12-11 15:03 ` Nikita Pettik
@ 2020-12-11 15:40 ` Leonid Vasiliev
0 siblings, 0 replies; 18+ messages in thread
From: Leonid Vasiliev @ 2020-12-11 15:40 UTC (permalink / raw)
To: Nikita Pettik; +Cc: tarantool-patches, v.shpilevoy, m.semkin
Hi! Thank you for the review.
On 11.12.2020 18:03, Nikita Pettik wrote:
> On 11 Dec 17:49, Leonid Vasiliev wrote:
>> In SQL, on failure sometimes an error sets to the diag, sometimes not.
>> And this can dived to situation as in #5537(SEGFAULT).
>> So, let's set an error to the diag if the result of `sql_execute()`
>> is a failure and there is no error in the diag.
>
> Personally I am against this patch. In case of error SQL submodule
> must always set diagnostic error; if it does no set - something goes
> really wrong. In this case crash is better option. Firsly, it provides
> coredump which is useful in accident investigation. Secondly, unset
> diagnostics may leave SQL in inconsistent state - nobody knows how
> severe error happened.
>
Maybe you are right. But it looks like SEGFAULT indicates a problem, but
coredump is not always helpful in this case.
Anyway, I agree to throw out this patch.
>> Part of #5537
>> ---
>>
>> After some discussion with Sergos, I added a common diag_set
>> when sql_execute() fails.
>> I wanted to add such a common error by `diag_add()` if diag
>> is not empty, but such a change would entail additional correction in tests.
>> But this patch should be included in the next release, and I want it to
>> be as small as possible. This patchset is about fixes a crash, not about
>> refactoring and improvements. For this I will create separate tasks.
>>
>> src/box/execute.c | 12 +++++++++++-
>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/box/execute.c b/src/box/execute.c
>> index e14da20..87ebb44 100644
>> --- a/src/box/execute.c
>> +++ b/src/box/execute.c
>> @@ -687,8 +687,18 @@ sql_execute(struct sql_stmt *stmt, struct port *port, struct region *region)
>> rc = sql_step(stmt);
>> assert(rc != SQL_ROW && rc != 0);
>> }
>> - if (rc != SQL_DONE)
>> + if (rc != SQL_DONE) {
>> + /*
>> + * In SQL, on failure sometimes an error sets to the diag,
>> + * sometimes not. So, let's set an error to the diag if
>> + * the status is a failure and there is no error in the diag.
>> + */
>> + if (diag_is_empty(diag_get())) {
>> + diag_set(ClientError, ER_SQL_EXECUTE,
>> + "something went wrong");
>> + }
>> return -1;
>> + }
>> return 0;
>> }
>>
>> --
>> 2.7.4
>>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
@ 2020-12-13 18:30 ` Vladislav Shpilevoy
2020-12-14 15:49 ` Leonid Vasiliev
2020-12-15 22:12 ` Vladislav Shpilevoy
1 sibling, 1 reply; 18+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-13 18:30 UTC (permalink / raw)
To: Leonid Vasiliev, imeevma, korablev, sergos, m.semkin
Cc: Mergen Imeev, tarantool-patches
Hi! Thanks for the patchset!
On 11.12.2020 15:49, Leonid Vasiliev wrote:
> From: Mergen Imeev <imeevma@gmail.com>
>
> SQL module didn't set an error in the diagnostics area on a file
> operation failure. This could lead to a crash like in #5537.
>
> Part of #5537
> ---
> src/box/sql/os_unix.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
> index b351c55..557d709 100644
> --- a/src/box/sql/os_unix.c
> +++ b/src/box/sql/os_unix.c
> @@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m)
> if (fd < 0) {
> if (errno == EINTR)
> continue;
> + diag_set(SystemError, strerror(errno));
> break;
SystemError already reports strerror() in ::log() method.
I don't think you need to duplicate it. Better add a proper
description. For example, the path. Grep "diag_set(SystemError"
in our project and see other usages which try to provide
more info. The same in other places.
> }
> if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR)
> break;
> close(fd);
> fd = -1;
> - if (open("/dev/null", f, m) < 0)
> + if (open("/dev/null", f, m) < 0) {
> + diag_set(SystemError, strerror(errno));
> break;
> + }
> }
> if (fd >= 0) {
> if (m != 0) {
> @@ -831,8 +834,10 @@ seekAndWriteFd(int fd, /* File descriptor to write to */
> rc = write(fd, pBuf, nBuf);
> } while (rc < 0 && errno == EINTR);
>
> - if (rc < 0)
> + if (rc < 0) {
> + diag_set(SystemError, strerror(errno));
> *piErrno = errno;
> + }
> return rc;
> }
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure Leonid Vasiliev
2020-12-11 15:03 ` Nikita Pettik
@ 2020-12-13 18:30 ` Vladislav Shpilevoy
2020-12-14 15:52 ` Leonid Vasiliev
1 sibling, 1 reply; 18+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-13 18:30 UTC (permalink / raw)
To: Leonid Vasiliev, imeevma, korablev, sergos, m.semkin; +Cc: tarantool-patches
Thanks for the patch!
I agree with Nikita here. The change is dangerous. If there is
no a diag, but the query failed, it means something is very wrong,
and it is not safe to continue execution. A panic() would be
better here.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 3/3] sql: update temporary file name format
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 3/3] sql: update temporary file name format Leonid Vasiliev
@ 2020-12-13 18:30 ` Vladislav Shpilevoy
2020-12-14 15:54 ` Leonid Vasiliev
0 siblings, 1 reply; 18+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-13 18:30 UTC (permalink / raw)
To: Leonid Vasiliev, imeevma, korablev, sergos, m.semkin; +Cc: tarantool-patches
Thanks for the patch!
On 11.12.2020 15:49, Leonid Vasiliev via Tarantool-patches wrote:
> The bug was consisted in fail when working with temporary files
> created by VDBE to sort large result of a `SELECT` statement with
> `ORDER BY`, `GROUP BY` clauses.
>
> Whats happen (step by step):
> - We have two instances on one node (sharded cluster).
> - A query is created that executes on both.
> - The first instance creates the name of the temporary file and
> checks a file with such name on existence.
> - The second instance creates the name of the temporary file
> (the same as in first instance) and checks a file with such name
> on existence.
> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
> flag.
> - The second instance opens(try to open) the same file.
> - The first instance closes (and removes) the temporary file.
> - The second instance tries to work with the file and fails.
>
> Why did it happen:
> The temporary file name format has a random part, but the random
> generator uses a fixed seed.
>
> When it was decided to use a fixed seed:
> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
To reference a commit we also usually include commit title in
("...") after the hash.
The patch itself looks good.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module
2020-12-13 18:30 ` Vladislav Shpilevoy
@ 2020-12-14 15:49 ` Leonid Vasiliev
2020-12-15 20:29 ` Sergey Ostanevich
0 siblings, 1 reply; 18+ messages in thread
From: Leonid Vasiliev @ 2020-12-14 15:49 UTC (permalink / raw)
To: Vladislav Shpilevoy, imeevma, korablev, sergos, m.semkin
Cc: Mergen Imeev, tarantool-patches
Hi! Thank you for the review.
On 13.12.2020 21:30, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patchset!
>
> On 11.12.2020 15:49, Leonid Vasiliev wrote:
>> From: Mergen Imeev <imeevma@gmail.com>
>>
>> SQL module didn't set an error in the diagnostics area on a file
>> operation failure. This could lead to a crash like in #5537.
>>
>> Part of #5537
>> ---
>> src/box/sql/os_unix.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
>> index b351c55..557d709 100644
>> --- a/src/box/sql/os_unix.c
>> +++ b/src/box/sql/os_unix.c
>> @@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m)
>> if (fd < 0) {
>> if (errno == EINTR)
>> continue;
>> + diag_set(SystemError, strerror(errno));
>> break;
>
> SystemError already reports strerror() in ::log() method.
> I don't think you need to duplicate it. Better add a proper
> description. For example, the path. Grep "diag_set(SystemError"
> in our project and see other usages which try to provide
> more info. The same in other places.
I thought about adding more info to the error, but it seems to me that
only "trace" is useful here. I do not mind your proposal. See full diff
at the end of the letter
>
>> }
>> if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR)
>> break;
>> close(fd);
>> fd = -1;
>> - if (open("/dev/null", f, m) < 0)
>> + if (open("/dev/null", f, m) < 0) {
>> + diag_set(SystemError, strerror(errno));
>> break;
>> + }
>> }
>> if (fd >= 0) {
>> if (m != 0) {
>> @@ -831,8 +834,10 @@ seekAndWriteFd(int fd, /* File descriptor to write to */
>> rc = write(fd, pBuf, nBuf);
>> } while (rc < 0 && errno == EINTR);
>>
>> - if (rc < 0)
>> + if (rc < 0) {
>> + diag_set(SystemError, strerror(errno));
>> *piErrno = errno;
>> + }
>> return rc;
>> }
>>
>>
src/box/sql/os_unix.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
index b351c55..ea7d3cc 100644
--- a/src/box/sql/os_unix.c
+++ b/src/box/sql/os_unix.c
@@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m)
if (fd < 0) {
if (errno == EINTR)
continue;
+ diag_set(SystemError, "failed to open file '%s'", z);
break;
}
if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR)
break;
close(fd);
fd = -1;
- if (open("/dev/null", f, m) < 0)
+ if (open("/dev/null", f, m) < 0) {
+ diag_set(SystemError, "failed to open '/dev/null'");
break;
+ }
}
if (fd >= 0) {
if (m != 0) {
@@ -831,8 +834,10 @@ seekAndWriteFd(int fd, /* File descriptor to write
to */
rc = write(fd, pBuf, nBuf);
} while (rc < 0 && errno == EINTR);
- if (rc < 0)
+ if (rc < 0) {
+ diag_set(SystemError, "failed to write %i bytes to file", nBuf);
*piErrno = errno;
+ }
return rc;
}
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure
2020-12-13 18:30 ` Vladislav Shpilevoy
@ 2020-12-14 15:52 ` Leonid Vasiliev
2020-12-15 21:05 ` Sergey Ostanevich
0 siblings, 1 reply; 18+ messages in thread
From: Leonid Vasiliev @ 2020-12-14 15:52 UTC (permalink / raw)
To: Vladislav Shpilevoy, imeevma, korablev, sergos, m.semkin
Cc: tarantool-patches
Hi! Thank you for the review.
On 13.12.2020 21:30, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> I agree with Nikita here. The change is dangerous. If there is
> no a diag, but the query failed, it means something is very wrong,
> and it is not safe to continue execution. A panic() would be
> better here.
>
OK. I don't mind.
New patch:
sql: add panic() call in sql_execute() on complete failure
In SQL, on failure sometimes an error sets to the diag, sometimes not.
And this can dived to situation as in #5537(SEGFAULT).
So, let's call `panic()` in that case, because something is very wrong,
and it is not safe to continue execution.
Part of #5537
---
src/box/execute.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/box/execute.c b/src/box/execute.c
index e14da20..a424349 100644
--- a/src/box/execute.c
+++ b/src/box/execute.c
@@ -687,8 +687,18 @@ sql_execute(struct sql_stmt *stmt, struct port
*port, struct region *region)
rc = sql_step(stmt);
assert(rc != SQL_ROW && rc != 0);
}
- if (rc != SQL_DONE)
+ if (rc != SQL_DONE) {
+ /*
+ * In SQL, on failure sometimes an error sets to the diag,
+ * sometimes not. So, let's call `panic()` in that case, because
+ * something is very wrong, and it is not safe to continue
+ * execution.
+ */
+ if (diag_is_empty(diag_get()))
+ panic("failed to execute SQL statement");
+
return -1;
+ }
return 0;
}
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 3/3] sql: update temporary file name format
2020-12-13 18:30 ` Vladislav Shpilevoy
@ 2020-12-14 15:54 ` Leonid Vasiliev
2020-12-15 21:07 ` Sergey Ostanevich
2020-12-16 14:47 ` Nikita Pettik
0 siblings, 2 replies; 18+ messages in thread
From: Leonid Vasiliev @ 2020-12-14 15:54 UTC (permalink / raw)
To: Vladislav Shpilevoy, imeevma, korablev, sergos, m.semkin
Cc: tarantool-patches
Hi! Thank you for the review.
On 13.12.2020 21:30, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> On 11.12.2020 15:49, Leonid Vasiliev via Tarantool-patches wrote:
>> The bug was consisted in fail when working with temporary files
>> created by VDBE to sort large result of a `SELECT` statement with
>> `ORDER BY`, `GROUP BY` clauses.
>>
>> Whats happen (step by step):
>> - We have two instances on one node (sharded cluster).
>> - A query is created that executes on both.
>> - The first instance creates the name of the temporary file and
>> checks a file with such name on existence.
>> - The second instance creates the name of the temporary file
>> (the same as in first instance) and checks a file with such name
>> on existence.
>> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
>> flag.
>> - The second instance opens(try to open) the same file.
>> - The first instance closes (and removes) the temporary file.
>> - The second instance tries to work with the file and fails.
>>
>> Why did it happen:
>> The temporary file name format has a random part, but the random
>> generator uses a fixed seed.
>>
>> When it was decided to use a fixed seed:
>> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
>
> To reference a commit we also usually include commit title in
> ("...") after the hash.
Changed to:
When it was decided to use a fixed seed:
32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
("sql: drop useless code from os_unix.c")
>
> The patch itself looks good.
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module
2020-12-14 15:49 ` Leonid Vasiliev
@ 2020-12-15 20:29 ` Sergey Ostanevich
0 siblings, 0 replies; 18+ messages in thread
From: Sergey Ostanevich @ 2020-12-15 20:29 UTC (permalink / raw)
To: Leonid Vasiliev
Cc: m.semkin, Vladislav Shpilevoy, Mergen Imeev, tarantool-patches
Thanks for the patch, LGTM.
Sergos
> On 14 Dec 2020, at 18:49, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
>
> Hi! Thank you for the review.
>
> On 13.12.2020 21:30, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the patchset!
>> On 11.12.2020 15:49, Leonid Vasiliev wrote:
>>> From: Mergen Imeev <imeevma@gmail.com>
>>>
>>> SQL module didn't set an error in the diagnostics area on a file
>>> operation failure. This could lead to a crash like in #5537.
>>>
>>> Part of #5537
>>> ---
>>> src/box/sql/os_unix.c | 9 +++++++--
>>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
>>> index b351c55..557d709 100644
>>> --- a/src/box/sql/os_unix.c
>>> +++ b/src/box/sql/os_unix.c
>>> @@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m)
>>> if (fd < 0) {
>>> if (errno == EINTR)
>>> continue;
>>> + diag_set(SystemError, strerror(errno));
>>> break;
>> SystemError already reports strerror() in ::log() method.
>> I don't think you need to duplicate it. Better add a proper
>> description. For example, the path. Grep "diag_set(SystemError"
>> in our project and see other usages which try to provide
>> more info. The same in other places.
>
> I thought about adding more info to the error, but it seems to me that
> only "trace" is useful here. I do not mind your proposal. See full diff
> at the end of the letter
>
>>> }
>>> if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR)
>>> break;
>>> close(fd);
>>> fd = -1;
>>> - if (open("/dev/null", f, m) < 0)
>>> + if (open("/dev/null", f, m) < 0) {
>>> + diag_set(SystemError, strerror(errno));
>>> break;
>>> + }
>>> }
>>> if (fd >= 0) {
>>> if (m != 0) {
>>> @@ -831,8 +834,10 @@ seekAndWriteFd(int fd, /* File descriptor to write to */
>>> rc = write(fd, pBuf, nBuf);
>>> } while (rc < 0 && errno == EINTR);
>>> - if (rc < 0)
>>> + if (rc < 0) {
>>> + diag_set(SystemError, strerror(errno));
>>> *piErrno = errno;
>>> + }
>>> return rc;
>>> }
>>>
>
>
> src/box/sql/os_unix.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
> index b351c55..ea7d3cc 100644
> --- a/src/box/sql/os_unix.c
> +++ b/src/box/sql/os_unix.c
> @@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m)
> if (fd < 0) {
> if (errno == EINTR)
> continue;
> + diag_set(SystemError, "failed to open file '%s'", z);
> break;
> }
> if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR)
> break;
> close(fd);
> fd = -1;
> - if (open("/dev/null", f, m) < 0)
> + if (open("/dev/null", f, m) < 0) {
> + diag_set(SystemError, "failed to open '/dev/null'");
> break;
> + }
> }
> if (fd >= 0) {
> if (m != 0) {
> @@ -831,8 +834,10 @@ seekAndWriteFd(int fd, /* File descriptor to write to */
> rc = write(fd, pBuf, nBuf);
> } while (rc < 0 && errno == EINTR);
>
> - if (rc < 0)
> + if (rc < 0) {
> + diag_set(SystemError, "failed to write %i bytes to file", nBuf);
> *piErrno = errno;
> + }
> return rc;
> }
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure
2020-12-14 15:52 ` Leonid Vasiliev
@ 2020-12-15 21:05 ` Sergey Ostanevich
0 siblings, 0 replies; 18+ messages in thread
From: Sergey Ostanevich @ 2020-12-15 21:05 UTC (permalink / raw)
To: Leonid Vasiliev; +Cc: m.semkin, Vladislav Shpilevoy, tarantool-patches
Thanks for the patch!
Effectively we just cover SIGSEGV that inevitably appear in iproto either
in box.execute - both access diag as if it set in case of non-zero result.
LGTM.
Sergos
> On 14 Dec 2020, at 18:52, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
>
> Hi! Thank you for the review.
>
> On 13.12.2020 21:30, Vladislav Shpilevoy wrote:
>> Thanks for the patch!
>> I agree with Nikita here. The change is dangerous. If there is
>> no a diag, but the query failed, it means something is very wrong,
>> and it is not safe to continue execution. A panic() would be
>> better here.
> OK. I don't mind.
>
> New patch:
>
> sql: add panic() call in sql_execute() on complete failure
>
> In SQL, on failure sometimes an error sets to the diag, sometimes not.
> And this can dived to situation as in #5537(SEGFAULT).
> So, let's call `panic()` in that case, because something is very wrong,
> and it is not safe to continue execution.
>
> Part of #5537
> ---
> src/box/execute.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/src/box/execute.c b/src/box/execute.c
> index e14da20..a424349 100644
> --- a/src/box/execute.c
> +++ b/src/box/execute.c
> @@ -687,8 +687,18 @@ sql_execute(struct sql_stmt *stmt, struct port *port, struct region *region)
> rc = sql_step(stmt);
> assert(rc != SQL_ROW && rc != 0);
> }
> - if (rc != SQL_DONE)
> + if (rc != SQL_DONE) {
> + /*
> + * In SQL, on failure sometimes an error sets to the diag,
> + * sometimes not. So, let's call `panic()` in that case, because
> + * something is very wrong, and it is not safe to continue
> + * execution.
> + */
> + if (diag_is_empty(diag_get()))
> + panic("failed to execute SQL statement");
> +
> return -1;
> + }
> return 0;
> }
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 3/3] sql: update temporary file name format
2020-12-14 15:54 ` Leonid Vasiliev
@ 2020-12-15 21:07 ` Sergey Ostanevich
2020-12-16 14:47 ` Nikita Pettik
1 sibling, 0 replies; 18+ messages in thread
From: Sergey Ostanevich @ 2020-12-15 21:07 UTC (permalink / raw)
To: Leonid Vasiliev; +Cc: m.semkin, Vladislav Shpilevoy, tarantool-patches
Thanks for the patch!
As I mentioned in another thread - this one is LGTM.
Sergos
> On 14 Dec 2020, at 18:54, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
>
> Hi! Thank you for the review.
>
> On 13.12.2020 21:30, Vladislav Shpilevoy wrote:
>> Thanks for the patch!
>> On 11.12.2020 15:49, Leonid Vasiliev via Tarantool-patches wrote:
>>> The bug was consisted in fail when working with temporary files
>>> created by VDBE to sort large result of a `SELECT` statement with
>>> `ORDER BY`, `GROUP BY` clauses.
>>>
>>> Whats happen (step by step):
>>> - We have two instances on one node (sharded cluster).
>>> - A query is created that executes on both.
>>> - The first instance creates the name of the temporary file and
>>> checks a file with such name on existence.
>>> - The second instance creates the name of the temporary file
>>> (the same as in first instance) and checks a file with such name
>>> on existence.
>>> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
>>> flag.
>>> - The second instance opens(try to open) the same file.
>>> - The first instance closes (and removes) the temporary file.
>>> - The second instance tries to work with the file and fails.
>>>
>>> Why did it happen:
>>> The temporary file name format has a random part, but the random
>>> generator uses a fixed seed.
>>>
>>> When it was decided to use a fixed seed:
>>> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
>> To reference a commit we also usually include commit title in
>> ("...") after the hash.
>
> Changed to:
>
> When it was decided to use a fixed seed:
> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
> ("sql: drop useless code from os_unix.c")
>
>> The patch itself looks good.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
2020-12-13 18:30 ` Vladislav Shpilevoy
@ 2020-12-15 22:12 ` Vladislav Shpilevoy
2020-12-16 23:17 ` Leonid Vasiliev
1 sibling, 1 reply; 18+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-15 22:12 UTC (permalink / raw)
To: Leonid Vasiliev, imeevma, korablev, sergos, m.semkin
Cc: Mergen Imeev, tarantool-patches
Thanks for the fixes!
What about missing diag in robust_ftruncate()?
In findInodeInfo() you can get -1 from fstat().
unixFileLock() can return -1 from fcntl().
seekAndRead() and seekAndWriteFd() can return -1
from lseek() and read().
fcntlSizeHint() and unixMapfile() can return -1
from fstat().
unixGetTempname() can return -1, but I don't see if
it even sets errno.
getFileMode() can return -1 from stat().
unixDelete() can return -1 from unlink(), fstat()
I suggest to fix everything. It is all in one file and all
is related.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 3/3] sql: update temporary file name format
2020-12-14 15:54 ` Leonid Vasiliev
2020-12-15 21:07 ` Sergey Ostanevich
@ 2020-12-16 14:47 ` Nikita Pettik
1 sibling, 0 replies; 18+ messages in thread
From: Nikita Pettik @ 2020-12-16 14:47 UTC (permalink / raw)
To: Leonid Vasiliev; +Cc: tarantool-patches, Vladislav Shpilevoy, m.semkin
Pushed this patch (not the whole patch-set!) to master, 2.6 and 2.5 branches
(CI status was green on the branch). Corresponding changelogs are updated.
As a follow-ups I hope to see:
- test case (mb with error injections);
- #5625;
- patch introducing missing diags in SQL OS internals.
> Hi! Thank you for the review.
>
> On 13.12.2020 21:30, Vladislav Shpilevoy wrote:
> > Thanks for the patch!
> >
> > On 11.12.2020 15:49, Leonid Vasiliev via Tarantool-patches wrote:
> > > The bug was consisted in fail when working with temporary files
> > > created by VDBE to sort large result of a `SELECT` statement with
> > > `ORDER BY`, `GROUP BY` clauses.
> > >
> > > Whats happen (step by step):
> > > - We have two instances on one node (sharded cluster).
> > > - A query is created that executes on both.
> > > - The first instance creates the name of the temporary file and
> > > checks a file with such name on existence.
> > > - The second instance creates the name of the temporary file
> > > (the same as in first instance) and checks a file with such name
> > > on existence.
> > > - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
> > > flag.
> > > - The second instance opens(try to open) the same file.
> > > - The first instance closes (and removes) the temporary file.
> > > - The second instance tries to work with the file and fails.
> > >
> > > Why did it happen:
> > > The temporary file name format has a random part, but the random
> > > generator uses a fixed seed.
> > >
> > > When it was decided to use a fixed seed:
> > > 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
> >
> > To reference a commit we also usually include commit title in
> > ("...") after the hash.
>
> Changed to:
>
> When it was decided to use a fixed seed:
> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
> ("sql: drop useless code from os_unix.c")
>
> >
> > The patch itself looks good.
> >
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module
2020-12-15 22:12 ` Vladislav Shpilevoy
@ 2020-12-16 23:17 ` Leonid Vasiliev
0 siblings, 0 replies; 18+ messages in thread
From: Leonid Vasiliev @ 2020-12-16 23:17 UTC (permalink / raw)
To: Vladislav Shpilevoy, imeevma, korablev, sergos, m.semkin
Cc: Mergen Imeev, tarantool-patches
Hi! Thank you for the review.
On 16.12.2020 01:12, Vladislav Shpilevoy wrote:
> Thanks for the fixes!
>
> What about missing diag in robust_ftruncate()?
>
> In findInodeInfo() you can get -1 from fstat().
>
> unixFileLock() can return -1 from fcntl().
>
> seekAndRead() and seekAndWriteFd() can return -1
> from lseek() and read().
>
> fcntlSizeHint() and unixMapfile() can return -1
> from fstat().
>
> unixGetTempname() can return -1, but I don't see if
> it even sets errno.
errno can be set inside `unixTempFileDir()`. This will cause
`unixGetTempname()` return -1.
>
> getFileMode() can return -1 from stat().
>
> unixDelete() can return -1 from unlink(), fstat()
>
> I suggest to fix everything. It is all in one file and all
> is related.
Ok. Fix everything. I don't mind.
See PATCH v3
>
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2020-12-16 23:18 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-11 14:49 [Tarantool-patches] [PATCH v2 0/3] Fix working with temporary files in VDBE Leonid Vasiliev
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 1/3] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
2020-12-13 18:30 ` Vladislav Shpilevoy
2020-12-14 15:49 ` Leonid Vasiliev
2020-12-15 20:29 ` Sergey Ostanevich
2020-12-15 22:12 ` Vladislav Shpilevoy
2020-12-16 23:17 ` Leonid Vasiliev
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 2/3] sql: set an error to diag in sql_execute() on failure Leonid Vasiliev
2020-12-11 15:03 ` Nikita Pettik
2020-12-11 15:40 ` Leonid Vasiliev
2020-12-13 18:30 ` Vladislav Shpilevoy
2020-12-14 15:52 ` Leonid Vasiliev
2020-12-15 21:05 ` Sergey Ostanevich
2020-12-11 14:49 ` [Tarantool-patches] [PATCH v2 3/3] sql: update temporary file name format Leonid Vasiliev
2020-12-13 18:30 ` Vladislav Shpilevoy
2020-12-14 15:54 ` Leonid Vasiliev
2020-12-15 21:07 ` Sergey Ostanevich
2020-12-16 14:47 ` Nikita Pettik
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox