From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp41.i.mail.ru (smtp41.i.mail.ru [94.100.177.101]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 793DB4765E0 for ; Thu, 24 Dec 2020 02:27:49 +0300 (MSK) References: <54ef9ccd2a7bee1f5f53a811c7edea1ba034c4ca.1608159414.git.lvasiliev@tarantool.org> <8e66938a-e950-1b80-e664-13af405dfaeb@tarantool.org> From: Leonid Vasiliev Message-ID: <200e5cd6-6c8c-d784-42ae-0a941a7b9e1b@tarantool.org> Date: Thu, 24 Dec 2020 02:27:43 +0300 MIME-Version: 1.0 In-Reply-To: <8e66938a-e950-1b80-e664-13af405dfaeb@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH v3 1/2] sql: add missing diag_set on failure when working inside os_unix.c List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladislav Shpilevoy , tarantool-patches@dev.tarantool.org, imeevma@tarantool.org, korablev@tarantool.org, sergos@tarantool.org Hi! Thank you for the review. The updated patch is at the end of the letter. On 20.12.2020 19:02, Vladislav Shpilevoy wrote: > Hi! Thanks for the patch! > > On 17.12.2020 00:09, Leonid Vasiliev via Tarantool-patches wrote: >> SQL module didn't set an error in the diagnostics area on failure >> inside unix.c. This could lead to a crash like in #5537. > > unix.c -> os_unix.c. Updated. > > I see os_unix.c already has some kind of diagnostics area filled > by storeLastErrno(). Did you think about syncing diag_set with this > one? Yep, right after your question: "Did you think about syncing diag_set with this one?":) As far as I can see, `diag_set()` now has more coverage than `storeLastErrno()`. Seems like some `diag_set` can be moved to `storeLastErrno()`, but will need to add a `message` argument to `storeLastErrno()` for `diag_set ()`. I think such refactoring can be done in a separate ticket if it is useful, because it isn't related with a crash. Do you have other thoughts? > > See 8 comments below. > >> src/box/sql/os_unix.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++---- >> 1 file changed, 58 insertions(+), 5 deletions(-) >> >> diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c >> index d64f1bd..4f59767 100644 >> --- a/src/box/sql/os_unix.c >> +++ b/src/box/sql/os_unix.c> @@ -433,7 +443,13 @@ static int >> fileHasMoved(unixFile * pFile) >> { >> struct stat buf; >> - return pFile->pInode != NULL && (stat(pFile->zPath, &buf) != 0 || >> + int rc = stat(pFile->zPath, &buf); > > 1. stat() wasn't called if pInode was NULL. Now it is. Lets not > change the behaviour. After your second note, this one is deprecated. > >> + if (rc < 0) { > > 2. Why do you use < 0 here and != just 5 lines below? > > But the more important question is why do you even set diag > here? This function (fileHasMoved) can't "fail". It just > returns true or false - moved or not moved. There is no a way > to return an error from it. Any issues with 'stat' were treated > as 'file has moved'. I assume it was even the purpose, because > usually stat() fails if there is no such file. OMG! That's my mistake. Thank you. Changes reverted. > >> + diag_set(SystemError, >> + "failed to retrive information about the file '%s'", >> + pFile->zPath); >> + } >> + return pFile->pInode != NULL && (rc != 0 || >> (u64) buf.st_ino != >> pFile->pInode->fileId.ino); >> } >> @@ -482,6 +502,11 @@ unixFileLock(unixFile * pFile, struct flock *pLock) >> } >> } else { >> rc = fcntl(pFile->h, F_SETLK, pLock); >> + if (rc < 0) { >> + diag_set(SystemError, >> + "failed to acquire a lock on the file '%s'", >> + pFile->zPath); > > 3. This function is not only about lock acquire. It is also used > to remove the lock. For instance, by posixUnlock(). Updated the message. > >> + } >> } >> return rc; >> } >> @@ -740,6 +768,8 @@ seekAndRead(unixFile * id, sql_int64 offset, void *pBuf, int cnt) >> got = 1; >> continue; >> } >> + diag_set(SystemError, "failed to read from file '%s'", >> + id->zPath); >> prior = 0; >> storeLastErrno((unixFile *) id, errno); >> break; > > 4. unixRead() returns -1 in case seekAndRead() returned 0. But > there is no a diag_set() anywhere for this. AFAIU you talking about `else` case. Updated. > >> @@ -825,10 +855,16 @@ seekAndWriteFd(int fd, /* File descriptor to write to */ >> do { >> i64 iSeek = lseek(fd, iOff, SEEK_SET); >> if (iSeek < 0) { >> + diag_set(SystemError, >> + "failed to reposition file offset"); >> rc = -1; >> break; >> } >> rc = write(fd, pBuf, nBuf); >> + if (rc < 0) { >> + diag_set(SystemError, >> + "failed to write %i bytes to file", nBuf); > > 5. The error is supposed to be ignored if errno is EINTR. Yep. It's true. > >> + } >> } while (rc < 0 && errno == EINTR); >> >> if (rc < 0) > > 6. unixWrite() returns -1 if not everything was written, it > seems. Here: > > if (amt > wrote) { > if (wrote < 0 && pFile->lastErrno != ENOSPC) { > /* lastErrno set by seekAndWrite */ > return -1; > } else { > storeLastErrno(pFile, 0); /* not a system error */ > return -1; > } > } > > The 'else' branch. But I don't understand how it can happen. Updated. > >> @@ -940,8 +976,12 @@ fcntlSizeHint(unixFile * pFile, i64 nByte) >> i64 nSize; /* Required file size */ >> struct stat buf; /* Used to hold return values of fstat() */ >> >> - if (fstat(pFile->h, &buf)) >> + if (fstat(pFile->h, &buf)) { > > 7. I suggest to use != 0 explicitly, according to the code > style rules for the new code. The same in the hunk below. Ok. I don't mind. > >> + diag_set(SystemError, >> + "failed to retrive information about the" >> + " file '%s'", pFile->zPath); >> return -1; >> + } >> >> nSize = >> ((nByte + pFile->szChunk - >> @@ -1165,8 +1205,12 @@ unixMapfile(unixFile * pFd, i64 nMap) >> >> if (nMap < 0) { >> struct stat statbuf; /* Low-level file information */ >> - if (fstat(pFd->h, &statbuf)) >> + if (fstat(pFd->h, &statbuf)) { >> + diag_set(SystemError, >> + "failed to retrive information about the" >> + " file '%s'", pFd->zPath); >> return -1; >> + } >> nMap = statbuf.st_size; >> } >> if (nMap > pFd->mmapSizeMax) { >> @@ -1449,6 +1493,8 @@ unixTempFileDir(void) >> break; >> zDir = azDirs[i++]; >> } >> + diag_set(ClientError, ER_SYSTEM, >> + "No access to any temporary directory"); >> return 0; >> } > > 8. unixGetTempname() does not set diag in one case of -1 return. Updated. > Updated patch: SQL module didn't set an error in the diagnostics area on failure inside os_unix.c. This could lead to a crash like in #5537. Co-authored-by: Mergen Imeev Follow-up #5537 --- src/box/sql/os_unix.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 67 insertions(+), 6 deletions(-) diff --git a/src/box/sql/os_unix.c b/src/box/sql/os_unix.c index d64f1bd..07461f5 100644 --- a/src/box/sql/os_unix.c +++ b/src/box/sql/os_unix.c @@ -159,14 +159,17 @@ robust_open(const char *z, int f, mode_t m) if (fd < 0) { if (errno == EINTR) continue; + diag_set(SystemError, "failed to open file '%s'", z); break; } if (fd >= SQL_MINIMUM_FILE_DESCRIPTOR) break; close(fd); fd = -1; - if (open("/dev/null", f, m) < 0) + if (open("/dev/null", f, m) < 0) { + diag_set(SystemError, "failed to open '/dev/null'"); break; + } } if (fd >= 0) { if (m != 0) { @@ -193,6 +196,10 @@ robust_ftruncate(int h, sql_int64 sz) do { rc = ftruncate(h, sz); } while (rc < 0 && errno == EINTR); + + if (rc < 0) + diag_set(SystemError, "failed to truncate file"); + return rc; } @@ -395,6 +402,9 @@ findInodeInfo(unixFile * pFile, /* Unix file with file desc used in the key */ fd = pFile->h; rc = fstat(fd, &statbuf); if (rc != 0) { + diag_set(SystemError, + "failed to retrieve information about the file '%s'", + pFile->zPath); storeLastErrno(pFile, errno); return -1; } @@ -473,8 +483,12 @@ unixFileLock(unixFile * pFile, struct flock *pLock) lock.l_len = SHARED_SIZE; lock.l_type = F_WRLCK; rc = fcntl(pFile->h, F_SETLK, &lock); - if (rc < 0) + if (rc < 0) { + diag_set(SystemError, + "failed to acquire / release a lock on" + " the file '%s'", pFile->zPath); return rc; + } pInode->bProcessLock = 1; pInode->nLock++; } else { @@ -482,6 +496,11 @@ unixFileLock(unixFile * pFile, struct flock *pLock) } } else { rc = fcntl(pFile->h, F_SETLK, pLock); + if (rc < 0) { + diag_set(SystemError, + "failed to acquire / release a lock on" + " the file '%s'", pFile->zPath); + } } return rc; } @@ -729,6 +748,9 @@ seekAndRead(unixFile * id, sql_int64 offset, void *pBuf, int cnt) do { newOffset = lseek(id->h, offset, SEEK_SET); if (newOffset < 0) { + diag_set(SystemError, + "failed to reposition the offset of '%s' file", + id->zPath); storeLastErrno((unixFile *) id, errno); return -1; } @@ -740,6 +762,8 @@ seekAndRead(unixFile * id, sql_int64 offset, void *pBuf, int cnt) got = 1; continue; } + diag_set(SystemError, "failed to read from file '%s'", + id->zPath); prior = 0; storeLastErrno((unixFile *) id, errno); break; @@ -794,6 +818,11 @@ unixRead(sql_file * id, void *pBuf, int amt, sql_int64 offset) /* lastErrno set by seekAndRead */ return -1; } else { + /* Read less than planned from file. */ + const char *err = tt_sprintf("read less (%i bytes) than " + "planned (%i bytes) from file: %s", + got, amt, pFile->zPath); + diag_set(ClientError, ER_SQL_EXECUTE, err); storeLastErrno(pFile, 0); /* not a system error */ /* Unread parts of the buffer must be zero-filled */ memset(&((char *)pBuf)[got], 0, amt - got); @@ -825,10 +854,16 @@ seekAndWriteFd(int fd, /* File descriptor to write to */ do { i64 iSeek = lseek(fd, iOff, SEEK_SET); if (iSeek < 0) { + diag_set(SystemError, + "failed to reposition file offset"); rc = -1; break; } rc = write(fd, pBuf, nBuf); + if (rc < 0 && errno != EINTR) { + diag_set(SystemError, + "failed to write %i bytes to file", nBuf); + } } while (rc < 0 && errno == EINTR); if (rc < 0) @@ -873,6 +908,12 @@ unixWrite(sql_file * id, const void *pBuf, int amt, sql_int64 offset) /* lastErrno set by seekAndWrite */ return -1; } else { + /* Wrote to file less than planned. */ + const char *err = tt_sprintf("wrote less (%i bytes) " + "than planned (%i bytes) " + "to file: %s", + wrote, amt, pFile->zPath); + diag_set(ClientError, ER_SQL_EXECUTE, err); storeLastErrno(pFile, 0); /* not a system error */ return -1; } @@ -940,8 +981,12 @@ fcntlSizeHint(unixFile * pFile, i64 nByte) i64 nSize; /* Required file size */ struct stat buf; /* Used to hold return values of fstat() */ - if (fstat(pFile->h, &buf)) + if (fstat(pFile->h, &buf) != 0) { + diag_set(SystemError, + "failed to retrieve information about the" + " file '%s'", pFile->zPath); return -1; + } nSize = ((nByte + pFile->szChunk - @@ -1165,8 +1210,12 @@ unixMapfile(unixFile * pFd, i64 nMap) if (nMap < 0) { struct stat statbuf; /* Low-level file information */ - if (fstat(pFd->h, &statbuf)) + if (fstat(pFd->h, &statbuf) != 0) { + diag_set(SystemError, + "failed to retrieve information about the" + " file '%s'", pFd->zPath); return -1; + } nMap = statbuf.st_size; } if (nMap > pFd->mmapSizeMax) { @@ -1449,6 +1498,8 @@ unixTempFileDir(void) break; zDir = azDirs[i++]; } + diag_set(ClientError, ER_SYSTEM, + "No access to any temporary directory"); return 0; } @@ -1480,8 +1531,11 @@ unixGetTempname(int nBuf, char *zBuf) sql_snprintf(nBuf, zBuf, "%s/" SQL_TEMP_FILE_PREFIX "%ld_%llx%c", zDir, (long)randomnessPid, r, 0); - if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10) + if (zBuf[nBuf - 2] != 0 || (iLimit++) > 10) { + diag_set(ClientError, ER_SQL_EXECUTE, + "can't create temporary file"); return -1; + } } while (access(zBuf, 0) == 0); return 0; } @@ -1558,6 +1612,9 @@ getFileMode(const char *zFile, /* File name */ *pUid = sStat.st_uid; *pGid = sStat.st_gid; } else { + diag_set(SystemError, + "failed to retrieve information about the file '%s'", + zFile); rc = -1; } return rc; @@ -1813,6 +1870,7 @@ unixDelete(sql_vfs * NotUsed, /* VFS containing this as the xDelete method */ int rc = 0; UNUSED_PARAMETER(NotUsed); if (unlink(zPath) == (-1)) { + diag_set(SystemError, "failed to unlink the file '%s'", zPath); return -1; } if ((dirSync & 1) != 0) { @@ -1820,7 +1878,10 @@ unixDelete(sql_vfs * NotUsed, /* VFS containing this as the xDelete method */ rc = openDirectory(zPath, &fd); if (rc == 0) { struct stat buf; - if (fstat(fd, &buf)) { + if (fstat(fd, &buf) != 0) { + diag_set(SystemError, + "failed to retrieve information about" + " the file '%s'", zPath); rc = -1; } close(fd); -- 2.7.4