From: "m.semkin"
Message-Id: <2C021FA8-0E97-475F-8362-066F40174FD6@corp.mail.ru>
Date: Fri, 11 Dec 2020 17:52:57 +0300
In-Reply-To: <20201211125130.GC12730@tarantool.org>
References: <2554D7DF-A952-4238-8096-C4618FD2C938@tarantool.org>
 <236e0556-b6a1-0b67-58f9-617dd1fca21b@tarantool.org>
 <20201211125130.GC12730@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format
List-Id: Tarantool development patches
To: Nikita Pettik
Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy

>> Instead, I propose a simple one line change. Could you please describe
>> the reproduction of the probable problem step by step?

And what’s the problem with Leonid’s current solution after all?

> On 11 Dec 2020, at 15:51, Nikita Pettik <korablev@tarantool.org> wrote:
>
> On 11 Dec 02:08, Leonid Vasiliev wrote:
>> Hi! Thank you for the review.
>>
>> On 10.12.2020 19:39, Sergey Ostanevich wrote:
>>> Thanks for the patch!
>>>
>>> I tend to the 1st alternative, although the code using the name generated
>>> is hairy. I believe the same resolution as for the first part: if we’re
>>> in a rush - LGTM, better solution is desirable otherwise.
>>
>> As I wrote in the commit message, I think using `O_TMPFILE` or
>> `tmpfile()` is the best solution to work with temporary files.
>> But we already have the code in which the filename is passed from one
>> function to another, and some logic over all this. So, replacing the use of
>> a named file with a real temporary file will lead to refactoring,
>> changing some logic and increasing the number of differences with
>> SQLite. This can lead to degradation and complication of the review.
>
> Anyway, there's already a huge difference between our codebases, so
> don't worry about incompatibility between SQLite and Tarantool.
> If you don't want to fix this problem now, could you please open an issue
> and describe the problem there (copy-paste the alternatives you suggested
> in the previous letter)?
>
>> Instead, I propose a simple one line change. Could you please describe
>> the reproduction of the probable problem step by step?
>>
>>> Sergos
>>>
>>>
>>>> On 8 Dec 2020, at 22:59, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
>>>>
>>>> The bug was a failure when working with temporary files
>>>> created by VDBE to sort the large result of a `SELECT` statement
>>>> with `ORDER BY` or `GROUP BY` clauses.
>>>>
>>>> What happens (step by step):
>>>> - We have two instances on one node (sharded cluster).
>>>> - A query is created that executes on both.
>>>> - The first instance creates the name of the temporary file and
>>>>   checks whether a file with that name exists.
>>>> - The second instance creates the name of the temporary file
>>>>   (the same as in the first instance) and checks whether a file
>>>>   with that name exists.
>>>> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
>>>>   flag.
>>>> - The second instance opens (tries to open) the same file.
>>>> - The first instance closes (and removes) the temporary file.
>>>> - The second instance tries to work with the file and fails.
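
The steps above hinge on both instances producing the same name. A minimal sketch of how that can happen when two generators start from the same seed. This is not Tarantool's generator: `toy_rng`, `toy_next()`, `toy_old_tempname()`, the LCG constants and the `/tmp/toy_sql_` prefix are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy 64-bit LCG standing in for a fixed-seed generator (hypothetical). */
struct toy_rng {
	uint64_t state;
};

static uint64_t
toy_next(struct toy_rng *rng)
{
	rng->state = rng->state * 6364136223846793005ULL +
		     1442695040888963407ULL;
	return rng->state;
}

/* Name with a random part only, similar in shape to the old format. */
static void
toy_old_tempname(struct toy_rng *rng, char *buf, size_t size)
{
	assert(size > 0);
	snprintf(buf, size, "/tmp/toy_sql_%llx",
		 (unsigned long long)toy_next(rng));
}

/* Returns 1 if two "instances" with these seeds pick the same first name. */
int
same_seed_collides(uint64_t seed_a, uint64_t seed_b)
{
	struct toy_rng a = { seed_a };
	struct toy_rng b = { seed_b };
	char name_a[64];
	char name_b[64];

	toy_old_tempname(&a, name_a, sizeof(name_a));
	toy_old_tempname(&b, name_b, sizeof(name_b));
	return strcmp(name_a, name_b) == 0;
}
```

Calling `same_seed_collides(42, 42)` models two instances sharing a seed. With equal seeds, a separate existence check does not help, because both instances test the same name before either creates the file.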
>>>>
>>>> Why did it happen:
>>>> The temporary file name format has a random part, but the random
>>>> generator uses a fixed seed.
>>>>
>>>> When it was decided to use a fixed seed:
>>>> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
>>>>
>>>> How the patch fixes the problem:
>>>> The patch injects the PID in the temporary file name format.
>>>> The generated name is unique within a single process (due to the
>>>> random part) and unique between processes (due to the PID part).
>>>>
>>>> Alternatives:
>>>> 1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
>>>>    temporary files). In both cases, we need to update a significant
>>>>    part of the code, and some degradation can be added. It's hard to
>>>>    review.
>>>> 2) Go back to a random seed for the generator. As far as I understand,
>>>>    we want to have good reproducible system behavior, in which case
>>>>    it's good to use a fixed seed.
>>>> 3) Add reopening the file with the flags `O_CREAT | O_EXCL` until we
>>>>    win the fight. Now we set such flags when opening a temporary file,
>>>>    but after that we try to open the file in `READONLY` mode and
>>>>    if ok - return the descriptor. This logic looks strange to me and I
>>>>    don't want to add any additional logic here. Also, such a solution
>>>>    will add additional attempts to open the file.
>>>>
>>>> So, it looks like this minimal change will work fine and is simple
>>>> to review.
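
Alternative 3 relies on `O_CREAT | O_EXCL` making the existence check and the creation a single atomic step. A minimal sketch of that idea, assuming POSIX `open()`. `create_exclusive()`, `create_unique()` and the `toy_` naming scheme are hypothetical and are not taken from `os_unix.c`.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Create `path` only if it does not exist yet. The check and the creation
 * happen inside one open() call, so two processes cannot both succeed.
 * Returns a file descriptor, or -1 with errno set (EEXIST if the name is
 * already taken).
 */
int
create_exclusive(const char *path)
{
	assert(path != NULL);
	return open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
}

/*
 * Try candidate names "<dir>/toy_<pid>_<n>" (hypothetical scheme) until one
 * is created exclusively or max_tries is exhausted. On success the chosen
 * name is left in buf and the descriptor is returned.
 */
int
create_unique(const char *dir, char *buf, size_t size, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		int len = snprintf(buf, size, "%s/toy_%ld_%d", dir,
				   (long)getpid(), i);
		if (len < 0 || (size_t)len >= size)
			return -1;
		int fd = create_exclusive(buf);
		/* Stop on success or on any error other than "name taken". */
		if (fd >= 0 || errno != EEXIST)
			return fd;
	}
	return -1;
}
```

The caller owns the returned descriptor and is expected to `close()` it and `unlink()` the name when done. The retry loop is the "additional attempts to open the file" cost mentioned in the alternative.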
>>>>
>>>> Co-authored-by: Mergen Imeev <imeevma@gmail.com>
>>>>
>>>> Fixes #5537
>>>> ---
>>>> src/box/sql/os_unix.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
>>>> index 557d709..ce415cb 100644
>>>> --- a/src/box/sql/os_unix.c
>>>> +++ b/src/box/sql/os_unix.c
>>>> @@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
>>>>          assert(nBuf > 2);
>>>>          zBuf[nBuf - 2] = 0;
>>>>          sql_snprintf(nBuf, zBuf,
>>>> -                     "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
>>>> -                     r, 0);
>>>> +                     "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
>>>> +                     (long)randomnessPid, r, 0);
>>>>          if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
>>>>                  return -1;
>>>>      } while (access(zBuf, 0) == 0);
>>>> --
>>>> 2.7.4
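
The new name format and the sentinel check around it can be tried outside Tarantool. A minimal sketch, assuming standard `snprintf()` in place of `sql_snprintf()` and a made-up prefix; `toy_tempname()` and `TOY_TEMP_FILE_PREFIX` are hypothetical names. Unlike the patched format, it does not append the extra `%c` NUL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical prefix; the real SQL_TEMP_FILE_PREFIX lives in Tarantool. */
#define TOY_TEMP_FILE_PREFIX "toy_sql_"

/*
 * Build "<dir>/<prefix><pid>_<random hex>" in buf.
 * Returns 0 on success, -1 if the name reached the end of the buffer.
 */
int
toy_tempname(const char *dir, long pid, unsigned long long r,
	     char *buf, size_t size)
{
	assert(dir != NULL && buf != NULL);
	if (size <= 2)
		return -1;
	/*
	 * Sentinel: this byte stays 0 unless the formatted name fills the
	 * buffer. The check is conservative: a name that fits exactly
	 * (size - 1 characters) is rejected too.
	 */
	buf[size - 2] = 0;
	snprintf(buf, size, "%s/" TOY_TEMP_FILE_PREFIX "%ld_%llx",
		 dir, pid, r);
	if (buf[size - 2] != 0)
		return -1;
	return 0;
}
```

Passing `(long)getpid()` as `pid` gives two processes different names even when their random parts are equal, which is the property the patch relies on. Checking the return value of `snprintf()` would be the more usual way to detect truncation; the sentinel is kept here only to mirror the shape of the patched code.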