From: Sergey Ostanevich
Date: Thu, 10 Dec 2020 19:39:45 +0300
Subject: Re: [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format
To: Leonid Vasiliev
Cc: m.semkin@corp.mail.ru, tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy
Message-Id: <2554D7DF-A952-4238-8096-C4618FD2C938@tarantool.org>

Thanks for the patch!

I lean toward the 1st alternative, although the code that uses the
generated name is hairy. I believe the same resolution applies as for the
first part: if we're in a rush - LGTM; otherwise a better solution is
desirable.

Sergos

> On 8 Dec 2020, at 22:59, Leonid Vasiliev wrote:
>
> The bug was a failure when working with temporary files
> created by VDBE to sort a large result of a `SELECT` statement with
> `ORDER BY`, `GROUP BY` clauses.
>
> What happens (step by step):
> - We have two instances on one node (sharded cluster).
> - A query is created that executes on both.
> - The first instance creates the name of the temporary file and
>   checks whether a file with that name exists.
> - The second instance creates the name of the temporary file
>   (the same as in the first instance) and checks whether a file with
>   that name exists.
> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
>   flag.
> - The second instance opens (tries to open) the same file.
> - The first instance closes (and removes) the temporary file.
> - The second instance tries to work with the file and fails.
>
> Why it happens:
> The temporary file name format has a random part, but the random
> generator uses a fixed seed.
>
> The commit in which it was decided to use a fixed seed:
> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
>
> How the patch fixes the problem:
> The patch injects the PID into the temporary file name format.
> The generated name is unique within a single process (due to the random
> part) and unique between processes (due to the PID part).
>
> Alternatives:
> 1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
>    temporary files). In both cases, we need to update a significant
>    part of the code, and some degradation may be introduced. It's hard
>    to review.
> 2) Bring back a random seed for the generator. As far as I understand,
>    we want reproducible system behavior, in which case a fixed seed
>    is the better choice.
> 3) Retry opening the file with the flags `O_CREAT | O_EXCL` until we
>    win the race. We already set these flags when opening a temporary
>    file, but after that we try to open the file in `READONLY` mode and,
>    if that succeeds, return the descriptor. This logic looks strange to
>    me and I don't want to add any additional logic here. Also, such a
>    solution would add extra attempts to open the file.
>
> So, it looks like these minimal changes will work fine and are simple
> to review.
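
For reference, the failure in the steps above comes from checking the name with `access()` and creating the file as a separate step. Below is a minimal sketch of the idea behind alternative 3, assuming plain POSIX `open()`. The helper `open_temp_exclusive`, its parameters, and the `sketch_` name pattern are hypothetical and are not Tarantool code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of os_unix.c: try candidate names until
 * open() with O_CREAT | O_EXCL succeeds. O_EXCL turns "check that the
 * name is free" and "create the file" into one atomic step, so two
 * processes cannot both end up owning the same file.
 */
static int
open_temp_exclusive(const char *dir, unsigned long long first_suffix,
		    int max_attempts, char *path, size_t path_size)
{
	assert(path_size > 2);
	for (int i = 0; i < max_attempts; i++) {
		int n = snprintf(path, path_size, "%s/sketch_%llx", dir,
				 first_suffix + (unsigned long long)i);
		if (n < 0 || (size_t)n >= path_size)
			return -1;	/* the name does not fit the buffer */
		int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
		if (fd >= 0)
			return fd;	/* we created the file, so we own it */
		if (errno != EEXIST)
			return -1;	/* a real error, do not retry */
		/* Another process owns this name: try the next one. */
	}
	return -1;
}
```

With a fixed-seed generator, both processes would walk the same sequence of suffixes, but only one of them can win each name. The loser moves on to the next candidate instead of opening a file that is about to be deleted.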
>
> Co-authored-by: Mergen Imeev
>
> Fixes #5537
> ---
>  src/box/sql/os_unix.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
> index 557d709..ce415cb 100644
> --- a/src/box/sql/os_unix.c
> +++ b/src/box/sql/os_unix.c
> @@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
>          assert(nBuf > 2);
>          zBuf[nBuf - 2] = 0;
>          sql_snprintf(nBuf, zBuf,
> -                     "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
> -                     r, 0);
> +                     "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
> +                     (long)randomnessPid, r, 0);
>          if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
>              return -1;
>      } while (access(zBuf, 0) == 0);
> --
> 2.7.4
>
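
The format change in the diff can be tried in isolation. Below is a minimal sketch, assuming standard `snprintf()` in place of `sql_snprintf()` and an explicit `pid` argument in place of `randomnessPid`. The function `make_temp_name` and the `"tmp_"` prefix are hypothetical stand-ins and do not reflect the real value of `SQL_TEMP_FILE_PREFIX`.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical stand-in for SQL_TEMP_FILE_PREFIX. */
#define SKETCH_TEMP_PREFIX "tmp_"

/*
 * Build "<dir>/<prefix><pid>_<random>" in buf, following the format
 * string from the patch ("%ld_%llx").
 * Returns 0 on success, -1 if the name does not fit into buf.
 */
static int
make_temp_name(char *buf, size_t buf_size, const char *dir, pid_t pid,
	       unsigned long long r)
{
	assert(buf_size > 2);
	int n = snprintf(buf, buf_size, "%s/" SKETCH_TEMP_PREFIX "%ld_%llx",
			 dir, (long)pid, r);
	return (n < 0 || (size_t)n >= buf_size) ? -1 : 0;
}
```

A caller would pass `getpid()` and the generator output. Two processes that draw the same `r` from a fixed-seed generator still get names that differ in the PID part, for example `/tmp/tmp_100_abc` and `/tmp/tmp_200_abc`.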