Tarantool development patches archive
 help / color / mirror / Atom feed
From: Leonid Vasiliev <lvasiliev@tarantool.org>
To: Nikita Pettik <korablev@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org,
	Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	m.semkin@corp.mail.ru
Subject: Re: [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format
Date: Fri, 11 Dec 2020 18:19:51 +0300	[thread overview]
Message-ID: <5513066d-5d5b-2012-29e9-99a220a7b0a7@tarantool.org> (raw)
In-Reply-To: <20201211125130.GC12730@tarantool.org>

Hi!

On 11.12.2020 15:51, Nikita Pettik wrote:
> On 11 Dec 02:08, Leonid Vasiliev wrote:
>> Hi! Thank you for review.
>>
>> On 10.12.2020 19:39, Sergey Ostanevich wrote:
>>> Thanks for the patch!
>>>
>>> I tend to the 1st alternative, although the code using the name generated
>>> is hairy. I believe the same resolution as for the first part: if we’re
>>> in a rush - LGTM, better solution is desirable otherwise.
>>
>> As I wrote in the commit message, I think using `O_TMPFILE` or
>> `tmpfile()` is the best solution to work with temporary files.
>> But we already have the code in which the filename is passed from one
>> function to another, and some logic over all this. So, replacing the use of
>> a named file with a real temporary file will lead to refactoring,
>> changing some logic and increasing the number of differences with
>> SQLite. This can lead to degradation and complication of the review.
> 
> Anyway, there's already huge difference between our codebases, so
> don't worry about incompatibility between SQLite and Tarantool.
> If you don't want fix this problem now, could you please open issue
> and describe problem there (copy-paste alternatives you suggested
> in the previous letter)?

We have a discussion with Sergos, and I will create several tasks as a
result of the discussion (or find existing ones).
AFAIK, this patch will completely fix the problem. Going to a true temp
files is more about how to do it from the start. If you see potential
issues with this solution please share.

> 
>> Instead, I propose a simple one line change. Could you please describe
>> the reproduction of the probable problem step by step?
>>
>>> Sergos
>>>
>>>
>>>> On 8 Dec 2020, at 22:59, Leonid Vasiliev <lvasiliev@tarantool.org> wrote:
>>>>
>>>> The bug was consisted in fail when working with temporary files
>>>> created by VDBE to sort large result of a `SELECT` statement with
>>>> `ORDER BY`, `GROUP BY` clauses.
>>>>
>>>> Whats happen (step by step):
>>>> - We have two instances on one node (sharded cluster).
>>>> - A query is created that executes on both.
>>>> - The first instance creates the name of the temporary file and
>>>>    checks a file with such name on existence.
>>>> - The second instance creates the name of the temporary file
>>>>    (the same as in  first instance) and checks a file with such name
>>>>    on existence.
>>>> - The first instance creates a file with the `SQL_OPEN_DELETEONCLOSE`
>>>>    flag.
>>>> - The second instance opens(try to open) the same file.
>>>> - The first instance closes (and removes) the temporary file.
>>>> - The second instance tries to work with the file and fails.
>>>>
>>>> Why did it happen:
>>>> The temporary file name format has a random part, but the random
>>>> generator uses a fixed seed.
>>>>
>>>> When it was decided to use a fixed seed:
>>>> 32cb1ad298b2b55d8536a85bdfb3827c8c8739e1
>>>>
>>>> How the patch fixes the problem:
>>>> The patch injects the PID in the temporary file name format.
>>>> The generated name is unique for a single process (due to a random part)
>>>> and unique between processes (due to the PID part).
>>>>
>>>> Alternatives:
>>>> 1) Use `O_TMPFILE` or `tmpfile()` (IMHO the best way to work with
>>>>    temporary files). In both cases, we need to update a significant
>>>>    part of the code, and some degradation can be added. It's hard to
>>>>    review.
>>>> 2) Return a random seed for the generator. As far as I understand,
>>>>    we want to have good reproducible system behavior, in which case
>>>>    it's good to use a fixed seed.
>>>> 3) Add reopening file with the flags `O_CREAT | O_EXCL` until we
>>>>    win the fight. Now we set such flags when opening a temporary file,
>>>>    but after that we try to open the file in `READONLY` mode and
>>>>    if ok - return the descriptor. This is strange logic for me and I
>>>>    don't want to add any aditional logic here. Also, such solution will
>>>>    add additional attempts to open the file.
>>>>
>>>> So, it look like such minimal changes will work fine and are simple
>>>> to review.
>>>>
>>>> Co-authored-by: Mergen Imeev<imeevma@gmail.com>
>>>>
>>>> Fixes #5537
>>>> ---
>>>> src/box/sql/os_unix.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
>>>> index 557d709..ce415cb 100644
>>>> --- a/src/box/sql/os_unix.c
>>>> +++ b/src/box/sql/os_unix.c
>>>> @@ -1483,8 +1483,8 @@ unixGetTempname(int nBuf, char *zBuf)
>>>> 		assert(nBuf > 2);
>>>> 		zBuf[nBuf - 2] = 0;
>>>> 		sql_snprintf(nBuf, zBuf,
>>>> -				 "%s/" SQL_TEMP_FILE_PREFIX "%llx%c", zDir,
>>>> -				 r, 0);
>>>> +				 "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir,
>>>> +				 (long)randomnessPid, r, 0);
>>>> 		if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10)
>>>> 			return -1;
>>>> 	} while (access(zBuf, 0) == 0);
>>>> -- 
>>>> 2.7.4
>>>>
>>>

      parent reply	other threads:[~2020-12-11 15:20 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-08 19:59 [Tarantool-patches] [PATCH 0/2] Fix working with temporary files in VDBE Leonid Vasiliev
2020-12-08 19:59 ` [Tarantool-patches] [PATCH 1/2] sql: add missing diag_set on failure when working with files inside SQL module Leonid Vasiliev
2020-12-10 14:15   ` Sergey Ostanevich
2020-12-11 15:05     ` Leonid Vasiliev
2020-12-08 19:59 ` [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format Leonid Vasiliev
2020-12-10 16:39   ` Sergey Ostanevich
2020-12-10 23:08     ` Leonid Vasiliev
2020-12-11 12:51       ` Nikita Pettik
2020-12-11 14:52         ` m.semkin
2020-12-11 15:19         ` Leonid Vasiliev [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5513066d-5d5b-2012-29e9-99a220a7b0a7@tarantool.org \
    --to=lvasiliev@tarantool.org \
    --cc=korablev@tarantool.org \
    --cc=m.semkin@corp.mail.ru \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH 2/2] sql: update temporary file name format' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox