Hi! Thanks for your review! Please find my answers below, together with an incremental diff. I’ll send v3 shortly. > 16 февр. 2020 г., в 19:15, Vladislav Shpilevoy написал(а): > > Hi! Thanks for the patch! I will review other commits when > Kostja is fine with them. > > Since he finished with this one, here is my nit: lets keep > assertion for the debug build. If not this assertion, we > probably wouldn't notice this bug, and may miss future bugs > without it. During running the tests we won't notice a > warning. > > Also we probably should not call vclock_follow() at all, if > lsn is broken. Just keep it as it. It does not look right to Ok > decrease it. And make it panic() in vclock_follow() to catch > other bugs related to it. Kostja was against panicking on the previous review iteration, the assertion is still in `vclock_follow()` > > On 13/02/2020 22:52, sergepetrenko wrote: >> From: Serge Petrenko >> >> There is an assertion in vclock_follow `lsn > prev_lsn`, which doesn't >> fire in release builds, of course. Let's at least warn the user on an >> attemt to write a record with a duplicate or otherwise broken lsn. >> >> Follow-up #4739 >> --- >> src/box/wal.c | 15 ++++++++++++--- >> 1 file changed, 12 insertions(+), 3 deletions(-) >> >> diff --git a/src/box/wal.c b/src/box/wal.c >> index 0ae66ff32..f8ee2b7d8 100644 >> --- a/src/box/wal.c >> +++ b/src/box/wal.c >> @@ -951,9 +951,18 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base, >> (*row)->tsn = tsn; >> (*row)->is_commit = row == end - 1; >> } else { >> - vclock_follow(vclock_diff, (*row)->replica_id, >> - (*row)->lsn - vclock_get(base, >> - (*row)->replica_id)); >> + int64_t diff = (*row)->lsn - vclock_get(base, (*row)->replica_id); >> + if (diff <= vclock_get(vclock_diff, >> + (*row)->replica_id)) { >> + say_crit("Attempt to write a broken LSN to WAL:" >> + " replica id: %d, committed lsn: %d," >> + " new lsn %d", (*row)->replica_id, >> + vclock_get(base, (*row)->replica_id) + >> + vclock_get(vclock_diff, >> + (*row)->replica_id), >> + (*row)->lsn); >> + } >> + vclock_follow(vclock_diff, (*row)->replica_id, diff); > > On the summary, lets call follow() in 'else' branch, and add unreachable() > after crit log. I believe `unreachable` doesn’t fit here, since it implies that the code is truly unreachable, while we are trying to catch something that «shouldn’t happen». I’ve even seen a ticket in our repo regarding this misuse. Let’s just leave an `assert(0)` in this branch, unreachable is defined like that anyway. Here are my changes: diff --git a/src/box/wal.c b/src/box/wal.c index f8ee2b7d8..a87aedf1d 100644 --- a/src/box/wal.c +++ b/src/box/wal.c @@ -955,14 +955,16 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base, if (diff <= vclock_get(vclock_diff, (*row)->replica_id)) { say_crit("Attempt to write a broken LSN to WAL:" - " replica id: %d, committed lsn: %d," + " replica id: %d, confirmed lsn: %d," " new lsn %d", (*row)->replica_id, vclock_get(base, (*row)->replica_id) + vclock_get(vclock_diff, (*row)->replica_id), (*row)->lsn); + assert(0); + } else { + vclock_follow(vclock_diff, (*row)->replica_id, diff); } - vclock_follow(vclock_diff, (*row)->replica_id, diff); } } } -- Serge Petrenko sergepetrenko@tarantool.org