From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id 613A345C304
	for ; Sun, 20 Dec 2020 19:02:32 +0300 (MSK)
References: <54ef9ccd2a7bee1f5f53a811c7edea1ba034c4ca.1608159414.git.lvasiliev@tarantool.org>
From: Vladislav Shpilevoy
Message-ID: <8e66938a-e950-1b80-e664-13af405dfaeb@tarantool.org>
Date: Sun, 20 Dec 2020 17:02:29 +0100
MIME-Version: 1.0
In-Reply-To: <54ef9ccd2a7bee1f5f53a811c7edea1ba034c4ca.1608159414.git.lvasiliev@tarantool.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [Tarantool-patches] [PATCH v3 1/2] sql: add missing diag_set on failure when working inside os_unix.c
List-Id: Tarantool development patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Leonid Vasiliev , tarantool-patches@dev.tarantool.org, imeevma@tarantool.org, korablev@tarantool.org, sergos@tarantool.org

Hi! Thanks for the patch!

On 17.12.2020 00:09, Leonid Vasiliev via Tarantool-patches wrote:
> SQL module didn't set an error in the diagnostics area on failure
> inside unix.c. This could lead to a crash like in #5537.

unix.c -> os_unix.c.

I see os_unix.c already has some kind of diagnostics area filled by
storeLastErrno(). Did you think about syncing diag_set with this one?

See 8 comments below.

>  src/box/sql/os_unix.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 58 insertions(+), 5 deletions(-)
>
> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c
> index d64f1bd..4f59767 100644
> --- a/src/box/sql/os_unix.c
> +++ b/src/box/sql/os_unix.c
> @@ -433,7 +443,13 @@ static int
>  fileHasMoved(unixFile * pFile)
>  {
>  	struct stat buf;
> -	return pFile->pInode != NULL && (stat(pFile->zPath, &buf) != 0 ||
> +	int rc = stat(pFile->zPath, &buf);

1. stat() wasn't called if pInode was NULL. Now it is. Let's not
change the behaviour.

> +	if (rc < 0) {

2. Why do you use < 0 here and != 0 just 5 lines below?

But the more important question is why do you even set diag here?
This function (fileHasMoved) can't "fail". It just returns true or
false - moved or not moved. There is no way to return an error from
it. Any issues with 'stat' were treated as 'file has moved'. I assume
it was even the purpose, because usually stat() fails if there is no
such file.

> +		diag_set(SystemError,
> +			 "failed to retrive information about the file '%s'",
> +			 pFile->zPath);
> +	}
> +	return pFile->pInode != NULL && (rc != 0 ||
>  					 (u64) buf.st_ino !=
>  					 pFile->pInode->fileId.ino);
>  }
> @@ -482,6 +502,11 @@ unixFileLock(unixFile * pFile, struct flock *pLock)
>  		}
>  	} else {
>  		rc = fcntl(pFile->h, F_SETLK, pLock);
> +		if (rc < 0) {
> +			diag_set(SystemError,
> +				 "failed to acquire a lock on the file '%s'",
> +				 pFile->zPath);

3. This function is not only about acquiring a lock. It is also used
to remove the lock. For instance, by posixUnlock().

> +		}
>  	}
>  	return rc;
>  }
> @@ -740,6 +768,8 @@ seekAndRead(unixFile * id, sql_int64 offset, void *pBuf, int cnt)
>  			got = 1;
>  			continue;
>  		}
> +		diag_set(SystemError, "failed to read from file '%s'",
> +			 id->zPath);
>  		prior = 0;
>  		storeLastErrno((unixFile *) id, errno);
>  		break;

4. unixRead() returns -1 in case seekAndRead() returned 0. But there
is no diag_set() anywhere for this.
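
A minimal sketch of the behaviour comments 1 and 2 above ask to keep:
stat() is skipped when there is no inode info, and a stat() failure is
reported as "moved" rather than as an error. The struct and function
names here are hypothetical, trimmed-down stand-ins for unixFile and
fileHasMoved(), not the real definitions from os_unix.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-ins for unixInodeInfo and unixFile. */
struct inode_info {
	ino_t ino;
};

struct file {
	const char *path;
	struct inode_info *inode;
};

/* Return true if the file behind f->path is no longer the one opened. */
static bool
file_has_moved(const struct file *f)
{
	struct stat buf;
	/* Without inode info there is nothing to compare: do not call stat(). */
	if (f->inode == NULL)
		return false;
	/* A failing stat() usually means the path is gone: treat as moved. */
	if (stat(f->path, &buf) != 0)
		return true;
	return buf.st_ino != f->inode->ino;
}
```

Since the function only answers "moved or not", nothing is written to
any diagnostics area in this sketch.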

> @@ -825,10 +855,16 @@ seekAndWriteFd(int fd,	/* File descriptor to write to */
>  	do {
>  		i64 iSeek = lseek(fd, iOff, SEEK_SET);
>  		if (iSeek < 0) {
> +			diag_set(SystemError,
> +				 "failed to reposition file offset");
>  			rc = -1;
>  			break;
>  		}
>  		rc = write(fd, pBuf, nBuf);
> +		if (rc < 0) {
> +			diag_set(SystemError,
> +				 "failed to write %i bytes to file", nBuf);

5. The error is supposed to be ignored if errno is EINTR.

> +		}
>  	} while (rc < 0 && errno == EINTR);
>
>  	if (rc < 0)

6. unixWrite() returns -1 if not everything was written, it seems.
Here:

	if (amt > wrote) {
		if (wrote < 0 && pFile->lastErrno != ENOSPC) {
			/* lastErrno set by seekAndWrite */
			return -1;
		} else {
			storeLastErrno(pFile, 0);	/* not a system error */
			return -1;
		}
	}

The 'else' branch. But I don't understand how it can happen.

> @@ -940,8 +976,12 @@ fcntlSizeHint(unixFile * pFile, i64 nByte)
>  		i64 nSize;	/* Required file size */
>  		struct stat buf;	/* Used to hold return values of fstat() */
>
> -		if (fstat(pFile->h, &buf))
> +		if (fstat(pFile->h, &buf)) {

7. I suggest using != 0 explicitly, according to the code style
rules for the new code. The same in the hunk below.

> +			diag_set(SystemError,
> +				 "failed to retrive information about the"
> +				 " file '%s'", pFile->zPath);
>  			return -1;
> +		}
>
>  		nSize =
>  		    ((nByte + pFile->szChunk -
> @@ -1165,8 +1205,12 @@ unixMapfile(unixFile * pFd, i64 nMap)
>
>  	if (nMap < 0) {
>  		struct stat statbuf;	/* Low-level file information */
> -		if (fstat(pFd->h, &statbuf))
> +		if (fstat(pFd->h, &statbuf)) {
> +			diag_set(SystemError,
> +				 "failed to retrive information about the"
> +				 " file '%s'", pFd->zPath);
>  			return -1;
> +		}
>  		nMap = statbuf.st_size;
>  	}
>  	if (nMap > pFd->mmapSizeMax) {
> @@ -1449,6 +1493,8 @@ unixTempFileDir(void)
>  			break;
>  		zDir = azDirs[i++];
>  	}
> +	diag_set(ClientError, ER_SYSTEM,
> +		 "No access to any temporary directory");
>  	return 0;
>  }

8. unixGetTempname() does not set diag in one case of -1 return.
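
One possible shape for comment 5 above, as a sketch only: the error is
recorded after the retry loop, so a write() interrupted by EINTR is
retried without leaving a stale message behind. report_error() and
last_error are hypothetical stand-ins for diag_set() and the
diagnostics area, and seek_and_write_fd() is a simplified stand-in for
seekAndWriteFd(); none of them are the real Tarantool definitions.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the diagnostics area: keeps the last message. */
static char last_error[160];

/* Hypothetical stand-in for diag_set(SystemError, ...). */
static void
report_error(const char *msg)
{
	snprintf(last_error, sizeof(last_error), "%s: %s", msg,
		 strerror(errno));
}

/*
 * Seek to off and write n bytes from buf, retrying when write() is
 * interrupted by a signal. Returns the number of bytes written or -1.
 */
static ssize_t
seek_and_write_fd(int fd, off_t off, const void *buf, size_t n)
{
	ssize_t rc;
	do {
		if (lseek(fd, off, SEEK_SET) < 0) {
			report_error("failed to reposition file offset");
			return -1;
		}
		rc = write(fd, buf, n);
	} while (rc < 0 && errno == EINTR);
	/* Only a failure that survived the EINTR retries is reported. */
	if (rc < 0)
		report_error("failed to write to file");
	return rc;
}
```

A short write (0 <= rc < n) is still returned to the caller as is,
which is the case comment 6 asks about.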