Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 0/2] Fix working with temporary files in VDBE
@ 2020-12-08 19:59 Leonid Vasiliev
  2020-12-08 19:59 ` [Tarantool-patches] [PATCH 1/2] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
  2020-12-08 19:59 ` [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format Leonid Vasiliev
  0 siblings, 2 replies; 10+ messages in thread
From: Leonid Vasiliev @ 2020-12-08 19:59 UTC (permalink / raw)
  To: tarantool-patches, v.shpilevoy, imeevma, korablev, sergos, m.semkin

https://github.com/tarantool/tarantool/issues/5537
https://github.com/tarantool/tarantool/tree/lvasilev/gh-5537-sql-vdbe-sort-troubles

@ChangeLog Fixed possible crash in SQL. A crash could have occurred
when multiple instances of the tarantool were runs on the same node
and executed a `SELECT` statement with `ORDER BY`, `GROUP BY` clauses
with many results.(#5537)

Leonid Vasiliev (1):
  sql: update temporary file name format

Mergen Imeev (1):
  sql: add missing diag_set on failure when working with files inside
    SQL module

 src/box/sql/os_unix.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Tarantool-patches] [PATCH 1/2] sql: add missing diag_set on failure when working with files inside SQL module
  2020-12-08 19:59 [Tarantool-patches] [PATCH 0/2] Fix working with temporary files in VDBE Leonid Vasiliev
@ 2020-12-08 19:59 ` Leonid Vasiliev
  2020-12-10 14:15   ` Sergey Ostanevich
  2020-12-08 19:59 ` [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format Leonid Vasiliev
  1 sibling, 1 reply; 10+ messages in thread
From: Leonid Vasiliev @ 2020-12-08 19:59 UTC (permalink / raw)
  To: tarantool-patches, v.shpilevoy, imeevma, korablev, sergos, m.semkin
  Cc: Mergen Imeev

From: Mergen Imeev <imeevma@gmail.com>

SQL module didn't set an error in the diagnostics area on a file
operation failure. This could lead to a crash like in #5537.

Part of #5537
---
 src/box/sql/os_unix.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
index b351c55..557d709 100644
--- a/src/box/sql/os_unix.c
+++ b/src/box/sql/os_unix.c
@@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m)
 		if (fd < 0) {
 			if (errno == EINTR)
 				continue;
+			diag_set(SystemError, strerror(errno));
 			break;
 		}
 		if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR)
 			break;
 		close(fd);
 		fd = -1;
-		if (open("/dev/null", f, m) < 0)
+		if (open("/dev/null", f, m) < 0) {
+			diag_set(SystemError, strerror(errno));
 			break;
+		}
 	}
 	if (fd >= 0) {
 		if (m != 0) {
@@ -831,8 +834,10 @@ seekAndWriteFd(int fd,		/* File descriptor to write to */
 		rc = write(fd, pBuf, nBuf);
 	} while (rc < 0 && errno == EINTR);
 
-	if (rc < 0)
+	if (rc < 0) {
+		diag_set(SystemError, strerror(errno));
 		*piErrno = errno;
+	}
 	return rc;
 }
 
-- 
2.7.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format
  2020-12-08 19:59 [Tarantool-patches] [PATCH 0/2] Fix working with temporary files in VDBE Leonid Vasiliev
  2020-12-08 19:59 ` [Tarantool-patches] [PATCH 1/2] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
@ 2020-12-08 19:59 ` Leonid Vasiliev
  2020-12-10 16:39   ` Sergey Ostanevich
  1 sibling, 1 reply; 10+ messages in thread
From: Leonid Vasiliev @ 2020-12-08 19:59 UTC (permalink / raw)
  To: tarantool-patches, v.shpilevoy, imeevma, korablev, sergos, m.semkin

The bug was consisted in fail when working with temporary files
created by VDBE to sort large result of a `SELECT` statement with
`ORDER BY`, `GROUP BY` clauses.

Whats happen (step by step):
- We have two instances on one node (sharded cluster).
- A query is created that executes on both.
- The first instance creates the name of the temporary file and
  checks a file with such name on existence.
- The second instance creates the name of the temporary file
  (the same as in  first instance) and checks a file with such name
  on existence.
- The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
  flag.
- The second instance opens(try to open) the same file.
- The first instance closes (and removes) the temporary file.
- The second instance tries to work with the file and fails.

Why did it happen:
The temporary file name format has a random part, but the random
generator uses a fixed seed.

When it was decided to use a fixed seed:
32cb1ad298b2b55d8536a85bdfb3827c8c8739e1

How the patch fixes the problem:
The patch injects the PID in the temporary file name format.
The generated name is unique for a single process (due to a random part)
and unique between processes (due to the PID part).

Alternatives:
1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
  temporary files). In both cases, we need to update a significant
  part of the code, and some degradation can be added. It's hard to
  review.
2) Return a random seed for the generator. As far as I understand,
  we want to have good reproducible system behavior, in which case
  it's good to use a fixed seed.
3) Add reopening file with the flags `O_CREAT | O_EXCL` until we
  win the fight. Now we set such flags when opening a temporary file,
  but after that we try to open the file in `READONLY` mode and
  if ok - return the descriptor. This is strange logic for me and I
  don't want to add any aditional logic here. Also, such solution will
  add additional attempts to open the file.

So, it look like such minimal changes will work fine and are simple
to review.

Co-authored-by: Mergen Imeev<imeevma@gmail.com>

Fixes #5537
---
 src/box/sql/os_unix.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
index 557d709..ce415cb 100644
--- a/src/box/sql/os_unix.c
+++ b/src/box/sql/os_unix.c
@@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
 		assert(nBuf > 2);
 		zBuf[nBuf - 2] = 0;
 		sql_snprintf(nBuf, zBuf,
-				 "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
-				 r, 0);
+				 "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
+				 (long)randomnessPid, r, 0);
 		if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
 			return -1;
 	} while (access(zBuf, 0) == 0);
-- 
2.7.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] sql: add missing diag_set on failure when working with files inside SQL module
  2020-12-08 19:59 ` [Tarantool-patches] [PATCH 1/2] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
@ 2020-12-10 14:15   ` Sergey Ostanevich
  2020-12-11 15:05     ` Leonid Vasiliev
  0 siblings, 1 reply; 10+ messages in thread
From: Sergey Ostanevich @ 2020-12-10 14:15 UTC (permalink / raw)
  To: Leonid Vasiliev
  Cc: m.semkin, tarantool-patches, Vladislav Shpilevoy, Mergen Imeev

Thanks for the patch!

I have a question if those diags are too low in the call stack?
Apparently, the unixOpen() is a single space the robust_open()
is called, so we’d have only one diag set instead of two?

Following this path - we should just cover UNIXVFS and sql_io_methods
members with diag set, given the errno is preserved.

By now the solution is partial, so can be applied only if we’re in a
rush.

Sergos

> On 8 Dec 2020, at 22:59, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
> 
> From: Mergen Imeev <imeevma@gmail.com>
> 
> SQL module didn't set an error in the diagnostics area on a file
> operation failure. This could lead to a crash like in #5537.
> 
> Part of #5537
> ---
> src/box/sql/os_unix.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
> index b351c55..557d709 100644
> --- a/src/box/sql/os_unix.c
> +++ b/src/box/sql/os_unix.c
> @@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m)
> 		if (fd < 0) {
> 			if (errno == EINTR)
> 				continue;
> +			diag_set(SystemError, strerror(errno));
> 			break;
> 		}
> 		if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR)
> 			break;
> 		close(fd);
> 		fd = -1;
> -		if (open("/dev/null", f, m) < 0)
> +		if (open("/dev/null", f, m) < 0) {
> +			diag_set(SystemError, strerror(errno));
> 			break;
> +		}
> 	}
> 	if (fd >= 0) {
> 		if (m != 0) {
> @@ -831,8 +834,10 @@ seekAndWriteFd(int fd,		/* File descriptor to write to */
> 		rc = write(fd, pBuf, nBuf);
> 	} while (rc < 0 && errno == EINTR);
> 
> -	if (rc < 0)
> +	if (rc < 0) {
> +		diag_set(SystemError, strerror(errno));
> 		*piErrno = errno;
> +	}
> 	return rc;
> }
> 
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format
  2020-12-08 19:59 ` [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format Leonid Vasiliev
@ 2020-12-10 16:39   ` Sergey Ostanevich
  2020-12-10 23:08     ` Leonid Vasiliev
  0 siblings, 1 reply; 10+ messages in thread
From: Sergey Ostanevich @ 2020-12-10 16:39 UTC (permalink / raw)
  To: Leonid Vasiliev; +Cc: m.semkin, tarantool-patches, Vladislav Shpilevoy

Thanks for the patch!

I tend to the 1st alternative, although the code using the name generated
is hairy. I believe the same resolution as for the first part: if we’re
in a rush - LGTM, better solution is desirable otherwise.

Sergos
 


> On 8 Dec 2020, at 22:59, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
> 
> The bug was consisted in fail when working with temporary files
> created by VDBE to sort large result of a `SELECT` statement with
> `ORDER BY`, `GROUP BY` clauses.
> 
> Whats happen (step by step):
> - We have two instances on one node (sharded cluster).
> - A query is created that executes on both.
> - The first instance creates the name of the temporary file and
>  checks a file with such name on existence.
> - The second instance creates the name of the temporary file
>  (the same as in  first instance) and checks a file with such name
>  on existence.
> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
>  flag.
> - The second instance opens(try to open) the same file.
> - The first instance closes (and removes) the temporary file.
> - The second instance tries to work with the file and fails.
> 
> Why did it happen:
> The temporary file name format has a random part, but the random
> generator uses a fixed seed.
> 
> When it was decided to use a fixed seed:
> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
> 
> How the patch fixes the problem:
> The patch injects the PID in the temporary file name format.
> The generated name is unique for a single process (due to a random part)
> and unique between processes (due to the PID part).
> 
> Alternatives:
> 1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
>  temporary files). In both cases, we need to update a significant
>  part of the code, and some degradation can be added. It's hard to
>  review.
> 2) Return a random seed for the generator. As far as I understand,
>  we want to have good reproducible system behavior, in which case
>  it's good to use a fixed seed.
> 3) Add reopening file with the flags `O_CREAT | O_EXCL` until we
>  win the fight. Now we set such flags when opening a temporary file,
>  but after that we try to open the file in `READONLY` mode and
>  if ok - return the descriptor. This is strange logic for me and I
>  don't want to add any aditional logic here. Also, such solution will
 
>  add additional attempts to open the file.
> 
> So, it look like such minimal changes will work fine and are simple
> to review.
> 
> Co-authored-by: Mergen Imeev<imeevma@gmail.com>
> 
> Fixes #5537
> ---
> src/box/sql/os_unix.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
> index 557d709..ce415cb 100644
> --- a/src/box/sql/os_unix.c
> +++ b/src/box/sql/os_unix.c
> @@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
> 		assert(nBuf > 2);
> 		zBuf[nBuf - 2] = 0;
> 		sql_snprintf(nBuf, zBuf,
> -				 "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
> -				 r, 0);
> +				 "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
> +				 (long)randomnessPid, r, 0);
> 		if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
> 			return -1;
> 	} while (access(zBuf, 0) == 0);
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format
  2020-12-10 16:39   ` Sergey Ostanevich
@ 2020-12-10 23:08     ` Leonid Vasiliev
  2020-12-11 12:51       ` Nikita Pettik
  0 siblings, 1 reply; 10+ messages in thread
From: Leonid Vasiliev @ 2020-12-10 23:08 UTC (permalink / raw)
  To: Sergey Ostanevich; +Cc: m.semkin, tarantool-patches, Vladislav Shpilevoy

Hi! Thank you for review.

On 10.12.2020 19:39, Sergey Ostanevich wrote:
> Thanks for the patch!
> 
> I tend to the 1st alternative, although the code using the name generated
> is hairy. I believe the same resolution as for the first part: if we’re
> in a rush - LGTM, better solution is desirable otherwise.

As I wrote in the commit message, I think using `O_TMPFILE` or
`tmpfile()` is the best solution to work with temporary files.
But we already have the code in which the filename is passed from one 
function to another, and some logic over all this. So, replacing the use 
of a named file with a real temporary file will lead to refactoring,
changing some logic and increasing the number of differences with
SQLite. This can lead to degradation and complication of the review.
Instead, I propose a simple one line change. Could you please describe
the reproduction of the probable problem step by step?

> Sergos
>   
> 
> 
>> On 8 Dec 2020, at 22:59, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
>>
>> The bug was consisted in fail when working with temporary files
>> created by VDBE to sort large result of a `SELECT` statement with
>> `ORDER BY`, `GROUP BY` clauses.
>>
>> Whats happen (step by step):
>> - We have two instances on one node (sharded cluster).
>> - A query is created that executes on both.
>> - The first instance creates the name of the temporary file and
>>   checks a file with such name on existence.
>> - The second instance creates the name of the temporary file
>>   (the same as in  first instance) and checks a file with such name
>>   on existence.
>> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
>>   flag.
>> - The second instance opens(try to open) the same file.
>> - The first instance closes (and removes) the temporary file.
>> - The second instance tries to work with the file and fails.
>>
>> Why did it happen:
>> The temporary file name format has a random part, but the random
>> generator uses a fixed seed.
>>
>> When it was decided to use a fixed seed:
>> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
>>
>> How the patch fixes the problem:
>> The patch injects the PID in the temporary file name format.
>> The generated name is unique for a single process (due to a random part)
>> and unique between processes (due to the PID part).
>>
>> Alternatives:
>> 1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
>>   temporary files). In both cases, we need to update a significant
>>   part of the code, and some degradation can be added. It's hard to
>>   review.
>> 2) Return a random seed for the generator. As far as I understand,
>>   we want to have good reproducible system behavior, in which case
>>   it's good to use a fixed seed.
>> 3) Add reopening file with the flags `O_CREAT | O_EXCL` until we
>>   win the fight. Now we set such flags when opening a temporary file,
>>   but after that we try to open the file in `READONLY` mode and
>>   if ok - return the descriptor. This is strange logic for me and I
>>   don't want to add any aditional logic here. Also, such solution will
>   
>>   add additional attempts to open the file.
>>
>> So, it look like such minimal changes will work fine and are simple
>> to review.
>>
>> Co-authored-by: Mergen Imeev<imeevma@gmail.com>
>>
>> Fixes #5537
>> ---
>> src/box/sql/os_unix.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
>> index 557d709..ce415cb 100644
>> --- a/src/box/sql/os_unix.c
>> +++ b/src/box/sql/os_unix.c
>> @@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
>> 		assert(nBuf > 2);
>> 		zBuf[nBuf - 2] = 0;
>> 		sql_snprintf(nBuf, zBuf,
>> -				 "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
>> -				 r, 0);
>> +				 "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
>> +				 (long)randomnessPid, r, 0);
>> 		if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
>> 			return -1;
>> 	} while (access(zBuf, 0) == 0);
>> -- 
>> 2.7.4
>>
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format
  2020-12-10 23:08     ` Leonid Vasiliev
@ 2020-12-11 12:51       ` Nikita Pettik
  2020-12-11 14:52         ` m.semkin
  2020-12-11 15:19         ` Leonid Vasiliev
  0 siblings, 2 replies; 10+ messages in thread
From: Nikita Pettik @ 2020-12-11 12:51 UTC (permalink / raw)
  To: Leonid Vasiliev; +Cc: tarantool-patches, Vladislav Shpilevoy, m.semkin

On 11 Dec 02:08, Leonid Vasiliev wrote:
> Hi! Thank you for review.
> 
> On 10.12.2020 19:39, Sergey Ostanevich wrote:
> > Thanks for the patch!
> > 
> > I tend to the 1st alternative, although the code using the name generated
> > is hairy. I believe the same resolution as for the first part: if we’re
> > in a rush - LGTM, better solution is desirable otherwise.
> 
> As I wrote in the commit message, I think using `O_TMPFILE` or
> `tmpfile()` is the best solution to work with temporary files.
> But we already have the code in which the filename is passed from one
> function to another, and some logic over all this. So, replacing the use of
> a named file with a real temporary file will lead to refactoring,
> changing some logic and increasing the number of differences with
> SQLite. This can lead to degradation and complication of the review.

Anyway, there's already huge difference between our codebases, so
don't worry about incompatibility between SQLite and Tarantool.
If you don't want fix this problem now, could you please open issue
and describe problem there (copy-paste alternatives you suggested
in the previous letter)?

> Instead, I propose a simple one line change. Could you please describe
> the reproduction of the probable problem step by step?
> 
> > Sergos
> > 
> > 
> > > On 8 Dec 2020, at 22:59, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
> > > 
> > > The bug was consisted in fail when working with temporary files
> > > created by VDBE to sort large result of a `SELECT` statement with
> > > `ORDER BY`, `GROUP BY` clauses.
> > > 
> > > Whats happen (step by step):
> > > - We have two instances on one node (sharded cluster).
> > > - A query is created that executes on both.
> > > - The first instance creates the name of the temporary file and
> > >   checks a file with such name on existence.
> > > - The second instance creates the name of the temporary file
> > >   (the same as in  first instance) and checks a file with such name
> > >   on existence.
> > > - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
> > >   flag.
> > > - The second instance opens(try to open) the same file.
> > > - The first instance closes (and removes) the temporary file.
> > > - The second instance tries to work with the file and fails.
> > > 
> > > Why did it happen:
> > > The temporary file name format has a random part, but the random
> > > generator uses a fixed seed.
> > > 
> > > When it was decided to use a fixed seed:
> > > 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
> > > 
> > > How the patch fixes the problem:
> > > The patch injects the PID in the temporary file name format.
> > > The generated name is unique for a single process (due to a random part)
> > > and unique between processes (due to the PID part).
> > > 
> > > Alternatives:
> > > 1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
> > >   temporary files). In both cases, we need to update a significant
> > >   part of the code, and some degradation can be added. It's hard to
> > >   review.
> > > 2) Return a random seed for the generator. As far as I understand,
> > >   we want to have good reproducible system behavior, in which case
> > >   it's good to use a fixed seed.
> > > 3) Add reopening file with the flags `O_CREAT | O_EXCL` until we
> > >   win the fight. Now we set such flags when opening a temporary file,
> > >   but after that we try to open the file in `READONLY` mode and
> > >   if ok - return the descriptor. This is strange logic for me and I
> > >   don't want to add any aditional logic here. Also, such solution will
> > >   add additional attempts to open the file.
> > > 
> > > So, it look like such minimal changes will work fine and are simple
> > > to review.
> > > 
> > > Co-authored-by: Mergen Imeev<imeevma@gmail.com>
> > > 
> > > Fixes #5537
> > > ---
> > > src/box/sql/os_unix.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
> > > index 557d709..ce415cb 100644
> > > --- a/src/box/sql/os_unix.c
> > > +++ b/src/box/sql/os_unix.c
> > > @@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
> > > 		assert(nBuf > 2);
> > > 		zBuf[nBuf - 2] = 0;
> > > 		sql_snprintf(nBuf, zBuf,
> > > -				 "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
> > > -				 r, 0);
> > > +				 "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
> > > +				 (long)randomnessPid, r, 0);
> > > 		if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
> > > 			return -1;
> > > 	} while (access(zBuf, 0) == 0);
> > > -- 
> > > 2.7.4
> > > 
> > 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format
  2020-12-11 12:51       ` Nikita Pettik
@ 2020-12-11 14:52         ` m.semkin
  2020-12-11 15:19         ` Leonid Vasiliev
  1 sibling, 0 replies; 10+ messages in thread
From: m.semkin @ 2020-12-11 14:52 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, Vladislav Shpilevoy

[-- Attachment #1: Type: text/plain, Size: 4907 bytes --]

>> Instead, I propose a simple one line change. Could you please describe
>> the reproduction of the probable problem step by step?

And what’s the problem with the current Leonid’s solution after all?

> On 11 Dec 2020, at 15:51, Nikita Pettik <korablev@tarantool.org> wrote:
> 
> On 11 Dec 02:08, Leonid Vasiliev wrote:
>> Hi! Thank you for review.
>> 
>> On 10.12.2020 19:39, Sergey Ostanevich wrote:
>>> Thanks for the patch!
>>> 
>>> I tend to the 1st alternative, although the code using the name generated
>>> is hairy. I believe the same resolution as for the first part: if we’re
>>> in a rush - LGTM, better solution is desirable otherwise.
>> 
>> As I wrote in the commit message, I think using `O_TMPFILE` or
>> `tmpfile()` is the best solution to work with temporary files.
>> But we already have the code in which the filename is passed from one
>> function to another, and some logic over all this. So, replacing the use of
>> a named file with a real temporary file will lead to refactoring,
>> changing some logic and increasing the number of differences with
>> SQLite. This can lead to degradation and complication of the review.
> 
> Anyway, there's already huge difference between our codebases, so
> don't worry about incompatibility between SQLite and Tarantool.
> If you don't want fix this problem now, could you please open issue
> and describe problem there (copy-paste alternatives you suggested
> in the previous letter)?
> 
>> Instead, I propose a simple one line change. Could you please describe
>> the reproduction of the probable problem step by step?
>> 
>>> Sergos
>>> 
>>> 
>>>> On 8 Dec 2020, at 22:59, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
>>>> 
>>>> The bug was consisted in fail when working with temporary files
>>>> created by VDBE to sort large result of a `SELECT` statement with
>>>> `ORDER BY`, `GROUP BY` clauses.
>>>> 
>>>> Whats happen (step by step):
>>>> - We have two instances on one node (sharded cluster).
>>>> - A query is created that executes on both.
>>>> - The first instance creates the name of the temporary file and
>>>>  checks a file with such name on existence.
>>>> - The second instance creates the name of the temporary file
>>>>  (the same as in  first instance) and checks a file with such name
>>>>  on existence.
>>>> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
>>>>  flag.
>>>> - The second instance opens(try to open) the same file.
>>>> - The first instance closes (and removes) the temporary file.
>>>> - The second instance tries to work with the file and fails.
>>>> 
>>>> Why did it happen:
>>>> The temporary file name format has a random part, but the random
>>>> generator uses a fixed seed.
>>>> 
>>>> When it was decided to use a fixed seed:
>>>> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
>>>> 
>>>> How the patch fixes the problem:
>>>> The patch injects the PID in the temporary file name format.
>>>> The generated name is unique for a single process (due to a random part)
>>>> and unique between processes (due to the PID part).
>>>> 
>>>> Alternatives:
>>>> 1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
>>>>  temporary files). In both cases, we need to update a significant
>>>>  part of the code, and some degradation can be added. It's hard to
>>>>  review.
>>>> 2) Return a random seed for the generator. As far as I understand,
>>>>  we want to have good reproducible system behavior, in which case
>>>>  it's good to use a fixed seed.
>>>> 3) Add reopening file with the flags `O_CREAT | O_EXCL` until we
>>>>  win the fight. Now we set such flags when opening a temporary file,
>>>>  but after that we try to open the file in `READONLY` mode and
>>>>  if ok - return the descriptor. This is strange logic for me and I
>>>>  don't want to add any aditional logic here. Also, such solution will
>>>>  add additional attempts to open the file.
>>>> 
>>>> So, it look like such minimal changes will work fine and are simple
>>>> to review.
>>>> 
>>>> Co-authored-by: Mergen Imeev<imeevma@gmail.com>
>>>> 
>>>> Fixes #5537
>>>> ---
>>>> src/box/sql/os_unix.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>> 
>>>> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
>>>> index 557d709..ce415cb 100644
>>>> --- a/src/box/sql/os_unix.c
>>>> +++ b/src/box/sql/os_unix.c
>>>> @@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
>>>> 		assert(nBuf > 2);
>>>> 		zBuf[nBuf - 2] = 0;
>>>> 		sql_snprintf(nBuf, zBuf,
>>>> -				 "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
>>>> -				 r, 0);
>>>> +				 "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
>>>> +				 (long)randomnessPid, r, 0);
>>>> 		if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
>>>> 			return -1;
>>>> 	} while (access(zBuf, 0) == 0);
>>>> -- 
>>>> 2.7.4


[-- Attachment #2: Type: text/html, Size: 14125 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] sql: add missing diag_set on failure when working with files inside SQL module
  2020-12-10 14:15   ` Sergey Ostanevich
@ 2020-12-11 15:05     ` Leonid Vasiliev
  0 siblings, 0 replies; 10+ messages in thread
From: Leonid Vasiliev @ 2020-12-11 15:05 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: m.semkin, tarantool-patches, Vladislav Shpilevoy, Mergen Imeev

Hi! Thank you for the review.

On 10.12.2020 17:15, Sergey Ostanevich wrote:
> Thanks for the patch!
> 
> I have a question if those diags are too low in the call stack?

In the new patchset, I added common `diag_set()` if `sql_execute()`
fails without setting an error to diag.

> Apparently, the unixOpen() is a single space the robust_open()
> is called, so we’d have only one diag set instead of two?

In this case, we will not know exactly where the error occurred.
I think the best way to use the stack diag for such purposes but that is
out of the scope of this problem. I will create a separate task about
these improvements (or find existing ones).

> 
> Following this path - we should just cover UNIXVFS and sql_io_methods
> members with diag set, given the errno is preserved.
> 
> By now the solution is partial, so can be applied only if we’re in a
> rush.
This solution is minimalistic and solves a specific problem (and only
this problem). Because now we have a crash and fixes should be included
in the next release.

> 
> Sergos
> 
>> On 8 Dec 2020, at 22:59, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
>>
>> From: Mergen Imeev <imeevma@gmail.com>
>>
>> SQL module didn't set an error in the diagnostics area on a file
>> operation failure. This could lead to a crash like in #5537.
>>
>> Part of #5537
>> ---
>> src/box/sql/os_unix.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
>> index b351c55..557d709 100644
>> --- a/src/box/sql/os_unix.c
>> +++ b/src/box/sql/os_unix.c
>> @@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m)
>> 		if (fd < 0) {
>> 			if (errno == EINTR)
>> 				continue;
>> +			diag_set(SystemError, strerror(errno));
>> 			break;
>> 		}
>> 		if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR)
>> 			break;
>> 		close(fd);
>> 		fd = -1;
>> -		if (open("/dev/null", f, m) < 0)
>> +		if (open("/dev/null", f, m) < 0) {
>> +			diag_set(SystemError, strerror(errno));
>> 			break;
>> +		}
>> 	}
>> 	if (fd >= 0) {
>> 		if (m != 0) {
>> @@ -831,8 +834,10 @@ seekAndWriteFd(int fd,		/* File descriptor to write to */
>> 		rc = write(fd, pBuf, nBuf);
>> 	} while (rc < 0 && errno == EINTR);
>>
>> -	if (rc < 0)
>> +	if (rc < 0) {
>> +		diag_set(SystemError, strerror(errno));
>> 		*piErrno = errno;
>> +	}
>> 	return rc;
>> }
>>
>> -- 
>> 2.7.4
>>
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format
  2020-12-11 12:51       ` Nikita Pettik
  2020-12-11 14:52         ` m.semkin
@ 2020-12-11 15:19         ` Leonid Vasiliev
  1 sibling, 0 replies; 10+ messages in thread
From: Leonid Vasiliev @ 2020-12-11 15:19 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches, Vladislav Shpilevoy, m.semkin

Hi!

On 11.12.2020 15:51, Nikita Pettik wrote:
> On 11 Dec 02:08, Leonid Vasiliev wrote:
>> Hi! Thank you for review.
>>
>> On 10.12.2020 19:39, Sergey Ostanevich wrote:
>>> Thanks for the patch!
>>>
>>> I tend to the 1st alternative, although the code using the name generated
>>> is hairy. I believe the same resolution as for the first part: if we’re
>>> in a rush - LGTM, better solution is desirable otherwise.
>>
>> As I wrote in the commit message, I think using `O_TMPFILE` or
>> `tmpfile()` is the best solution to work with temporary files.
>> But we already have the code in which the filename is passed from one
>> function to another, and some logic over all this. So, replacing the use of
>> a named file with a real temporary file will lead to refactoring,
>> changing some logic and increasing the number of differences with
>> SQLite. This can lead to degradation and complication of the review.
> 
> Anyway, there's already huge difference between our codebases, so
> don't worry about incompatibility between SQLite and Tarantool.
> If you don't want fix this problem now, could you please open issue
> and describe problem there (copy-paste alternatives you suggested
> in the previous letter)?

We have a discussion with Sergos, and I will create several tasks as a
result of the discussion (or find existing ones).
AFAIK, this patch will completely fix the problem. Going to a true temp
files is more about how to do it from the start. If you see potential
issues with this solution please share.

> 
>> Instead, I propose a simple one line change. Could you please describe
>> the reproduction of the probable problem step by step?
>>
>>> Sergos
>>>
>>>
>>>> On 8 Dec 2020, at 22:59, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
>>>>
>>>> The bug was consisted in fail when working with temporary files
>>>> created by VDBE to sort large result of a `SELECT` statement with
>>>> `ORDER BY`, `GROUP BY` clauses.
>>>>
>>>> Whats happen (step by step):
>>>> - We have two instances on one node (sharded cluster).
>>>> - A query is created that executes on both.
>>>> - The first instance creates the name of the temporary file and
>>>>    checks a file with such name on existence.
>>>> - The second instance creates the name of the temporary file
>>>>    (the same as in  first instance) and checks a file with such name
>>>>    on existence.
>>>> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
>>>>    flag.
>>>> - The second instance opens(try to open) the same file.
>>>> - The first instance closes (and removes) the temporary file.
>>>> - The second instance tries to work with the file and fails.
>>>>
>>>> Why did it happen:
>>>> The temporary file name format has a random part, but the random
>>>> generator uses a fixed seed.
>>>>
>>>> When it was decided to use a fixed seed:
>>>> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
>>>>
>>>> How the patch fixes the problem:
>>>> The patch injects the PID in the temporary file name format.
>>>> The generated name is unique for a single process (due to a random part)
>>>> and unique between processes (due to the PID part).
>>>>
>>>> Alternatives:
>>>> 1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
>>>>    temporary files). In both cases, we need to update a significant
>>>>    part of the code, and some degradation can be added. It's hard to
>>>>    review.
>>>> 2) Return a random seed for the generator. As far as I understand,
>>>>    we want to have good reproducible system behavior, in which case
>>>>    it's good to use a fixed seed.
>>>> 3) Add reopening file with the flags `O_CREAT | O_EXCL` until we
>>>>    win the fight. Now we set such flags when opening a temporary file,
>>>>    but after that we try to open the file in `READONLY` mode and
>>>>    if ok - return the descriptor. This is strange logic for me and I
>>>>    don't want to add any aditional logic here. Also, such solution will
>>>>    add additional attempts to open the file.
>>>>
>>>> So, it look like such minimal changes will work fine and are simple
>>>> to review.
>>>>
>>>> Co-authored-by: Mergen Imeev<imeevma@gmail.com>
>>>>
>>>> Fixes #5537
>>>> ---
>>>> src/box/sql/os_unix.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
>>>> index 557d709..ce415cb 100644
>>>> --- a/src/box/sql/os_unix.c
>>>> +++ b/src/box/sql/os_unix.c
>>>> @@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
>>>> 		assert(nBuf > 2);
>>>> 		zBuf[nBuf - 2] = 0;
>>>> 		sql_snprintf(nBuf, zBuf,
>>>> -				 "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
>>>> -				 r, 0);
>>>> +				 "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
>>>> +				 (long)randomnessPid, r, 0);
>>>> 		if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
>>>> 			return -1;
>>>> 	} while (access(zBuf, 0) == 0);
>>>> -- 
>>>> 2.7.4
>>>>
>>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-12-11 15:20 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-08 19:59 [Tarantool-patches] [PATCH 0/2] Fix working with temporary files in VDBE Leonid Vasiliev
2020-12-08 19:59 ` [Tarantool-patches] [PATCH 1/2] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
2020-12-10 14:15   ` Sergey Ostanevich
2020-12-11 15:05     ` Leonid Vasiliev
2020-12-08 19:59 ` [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format Leonid Vasiliev
2020-12-10 16:39   ` Sergey Ostanevich
2020-12-10 23:08     ` Leonid Vasiliev
2020-12-11 12:51       ` Nikita Pettik
2020-12-11 14:52         ` m.semkin
2020-12-11 15:19         ` Leonid Vasiliev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox