[tarantool-patches] Re: [PATCH 2/2] wal: drop rows_per_wal option

Wed Oct 9 19:05:33 MSK 2019

The patch itself LGTM.

The motivation why it is needed is not quite clear. Sergey, Kirill,
please, see the first cited discussion below.

WBR, Alexander Turenko.

----

> >> Rows_per_wal option was deprecated because it can be covered by
> >> wal_max_size. In order not to complicate WAL code with that
> >> option's support this commit drops it completely.
> > 
> > It is just one 'if' subcondition in the code and one field for storing
> > yields frequency. (And the field can be replaced with a static variable
> > w/o removing the option.) I don't see how this commit simplifies the
> > code.
> 
> AFAIU this complicates not only the present code, but also future code,
> according to Kostja's comment in the issue description:
> 
>     "We will have trouble with automatic size management if we
>      continue using wal_max_rows"
> 
> But honestly I don't really care that much about this option. I just
> decided to fix the unassigned issue, marked as breaking change, as
> soon as possible, to break less users. So I can't provide any serious
> evidence how rows_per_wal was going to complicate auto size management.

CCed Sergos and Kirill Yu.

> > The option was convenient for tests at least. Now additional
> > calculations are necessary to dig into cases that verifies xlog files
> > count.
> 
> If that is needed for testing only, then it should not be exposed to
> public API.

Good point.

> 
> > 
> > However I don't think that those cons and pros are important, so I don't
> > want to block the patch. The patch itself looks good except minor
> > comments below.
> > 
> >>
> >> In some tests the option was used to create several small xlog
> >> files. Now the same is done via wal_max_size. Where it was
> >> needed, number of rows per wal is estimated as wal_max_size / 50.
> >> Because struct xrow_header size ~= 50 not counting paddings and
> >> body.
> >>
> >> Closes #3762
> >> ---
> >>  src/box/box.cc                       |  30 +-------
> >>  src/box/lua/load_cfg.lua             |   5 --
> >>  src/box/wal.c                        |  14 +---
> >>  src/box/wal.h                        |  13 +++-
> >>  test/app-tap/init_script.result      |  39 +++++-----
> >>  test/app-tap/snapshot.test.lua       |   2 +-
> >>  test/app/app.lua                     |   2 +-
> >>  test/app/fiber.result                |   6 +-
> >>  test/app/fiber.test.lua              |   4 +-
> >>  test/box-py/box.lua                  |   2 +-
> >>  test/box-tap/cfg.test.lua            |   3 +-
> >>  test/box/admin.result                |   2 -
> >>  test/box/cfg.result                  |   4 -
> >>  test/box/configuration.result        | 107 ---------------------------
> >>  test/box/proxy.lua                   |   2 +-
> >>  test/box/tiny.lua                    |   1 -
> >>  test/engine/box.lua                  |   2 +-
> >>  test/engine_long/box.lua             |   1 -
> >>  test/long_run-py/box.lua             |   1 -
> >>  test/vinyl/vinyl.lua                 |   1 -
> >>  test/xlog-py/box.lua                 |   1 -
> >>  test/xlog/checkpoint_daemon.result   |  11 ++-
> >>  test/xlog/checkpoint_daemon.test.lua |   9 ++-
> >>  test/xlog/errinj.result              |   7 +-
> >>  test/xlog/errinj.test.lua            |   5 +-
> >>  test/xlog/panic.lua                  |   1 -
> >>  test/xlog/upgrade/fill.lua           |   2 +-
> >>  test/xlog/xlog.lua                   |   2 +-
> >>  28 files changed, 72 insertions(+), 207 deletions(-)
> >>  delete mode 100644 test/box/configuration.result
> > 
> >>  static void
> >> -wal_stream_create(struct wal_stream *ctx, size_t wal_max_rows)
> >> +wal_stream_create(struct wal_stream *ctx)
> >>  {
> >>  	xstream_create(&ctx->base, apply_wal_row);
> >>  	ctx->rows = 0;
> >> -	/**
> >> -	 * Make the yield logic covered by the functional test
> >> -	 * suite, which has a small setting for rows_per_wal.
> >> -	 * Each yield can take up to 1ms if there are no events,
> >> -	 * so we can't afford many of them during recovery.
> >> -	 */
> >> -	ctx->yield = (wal_max_rows >> 4)  + 1;
> > 
> > Yield frequency is not more depend on wal file size / rows count, so we
> > we'll not test this scenario anymore. Look:
> > 
> > https://coveralls.io/builds/26111576/source?filename=src/box/box.cc#L345
> > https://coveralls.io/builds/25591832/source?filename=src/box/box.cc#L343
> > 
> > Don't sure whether it is really important, but it seems that someone
> > paid special attention to this and even leave the comment about.
> > 
> > Can we, say, store 'rows per yield' in a static variable with 32000 as
> > the default value and allow tests to set it with errinj?
> 
> There are too many tests which had small rows_per_wal setting. I don't
> think it is a good idea to make them all errinj dependent. That would
> require to disable them in release build, and to move to separate files.
> Instead, I added a new test:

Okay.

> 
> ===========================================================================
> 
> diff --git a/test/engine/recover_snapshot_wal.result b/test/engine/recover_snapshot_wal.result
> index d2667559d..6b7b05362 100644
> --- a/test/engine/recover_snapshot_wal.result
> +++ b/test/engine/recover_snapshot_wal.result
> @@ -23,9 +23,9 @@ box.snapshot()
>  ---
>  - ok
>  ...
> -space:insert({1001})
> +space:insert({33001})
>  ---
> -- [1001]
> +- [33001]
>  ...
>  test_run:cmd("restart server default")
>  space = box.space['test']
> @@ -37,7 +37,7 @@ index = space.index['primary']
>  index:select({}, {iterator = box.index.ALL})
>  ---
>  - - [0]
> -  - [1001]
> +  - [33001]
>  ...
>  for key = 1, 351 do space:insert({key}) end
>  ---
> @@ -46,7 +46,18 @@ box.snapshot()
>  ---
>  - ok
>  ...
> -for key = 352, 1000 do space:insert({key}) end
> +-- Insert so many tuples, that recovery would need to yield
> +-- periodically to allow other fibers do something. At the moment
> +-- of writing this the yield period was 32k tuples.
> +box.begin()                                                     \
> +for key = 352, 33000 do                                         \
> +    space:insert({key})                                         \
> +    if key % 10000 == 0 then                                    \
> +        box.commit()                                            \
> +        box.begin()                                             \
> +    end                                                         \
> +end                                                             \
> +box.commit()
>  ---
>  ...
>  test_run:cmd("restart server default")
> @@ -56,1010 +67,24 @@ space = box.space['test']
>  index = space.index['primary']
>  ---
>  ...
> -index:select({}, {iterator = box.index.ALL})
> +i = 0
>  ---
> -- - [0]
> -  - [1]
> ...           -- lots of deletions
> -  - [999]
> -  - [1000]
> -  - [1001]
> +...
> +err = nil
> +---
> +...
> +for _, t in space:pairs() do                                    \
> +    if t[1] ~= i then                                           \
> +        err = {i, t}                                            \
> +        break                                                   \
> +    end                                                         \
> +    i = i + 1                                                   \
> +end
>  ---
>  ...
> +i
> +---
> +- 33002
> +...
> +err
> +---
> +- null
>  ...
>  space:drop()
>  ---
> diff --git a/test/engine/recover_snapshot_wal.test.lua b/test/engine/recover_snapshot_wal.test.lua
> index 1f133dd72..ff286bc4f 100644
> --- a/test/engine/recover_snapshot_wal.test.lua
> +++ b/test/engine/recover_snapshot_wal.test.lua
> @@ -10,7 +10,7 @@ index = space:create_index('primary')
>  
>  space:insert({0})
>  box.snapshot()
> -space:insert({1001})
> +space:insert({33001})
>  
>  test_run:cmd("restart server default")
>  
> @@ -20,13 +20,33 @@ index:select({}, {iterator = box.index.ALL})
>  
>  for key = 1, 351 do space:insert({key}) end
>  box.snapshot()
> -for key = 352, 1000 do space:insert({key}) end
> +-- Insert so many tuples, that recovery would need to yield
> +-- periodically to allow other fibers do something. At the moment
> +-- of writing this the yield period was 32k tuples.
> +box.begin()                                                     \
> +for key = 352, 33000 do                                         \
> +    space:insert({key})                                         \
> +    if key % 10000 == 0 then                                    \
> +        box.commit()                                            \
> +        box.begin()                                             \
> +    end                                                         \
> +end                                                             \
> +box.commit()
>  
>  test_run:cmd("restart server default")
>  
>  space = box.space['test']
>  index = space.index['primary']
> -index:select({}, {iterator = box.index.ALL})
> +i = 0
> +err = nil
> +for _, t in space:pairs() do                                    \
> +    if t[1] ~= i then                                           \
> +        err = {i, t}                                            \
> +        break                                                   \
> +    end                                                         \
> +    i = i + 1                                                   \
> +end
> +i
> +err
>  
>  space:drop()
>  test_run:cmd("restart server default with cleanup=1")
> 
> ===========================================================================
> 
> But I don't think it was necessary. According to the comment (and
> to the patch introduced it), this signal-handling-sleep(0) was never
> properly tested. And even though this line was covered by our tests,
> its purpose was not tested.

I saw an issue when a yield leads to segfault:
https://github.com/tarantool/tarantool/issues/4110

It is good to have a regression test: at least it verifies that we'll
continue reading an xlog file after yield.

> >> diff --git a/test/xlog/errinj.test.lua b/test/xlog/errinj.test.lua
> >> index de0e2e7e8..3d72dc4e4 100644
> >> --- a/test/xlog/errinj.test.lua
> >> +++ b/test/xlog/errinj.test.lua
> >> @@ -30,12 +30,13 @@ _ = test:create_index('primary')
> >>  
> >>  box.schema.user.grant('guest', 'write', 'space', 'test')
> >>  
> >> -for i=1, box.cfg.rows_per_wal do test:insert{i, 'test'} end
> >> +row_count_per_wal = box.cfg.wal_max_size / 50 + 10
> >> +for i=1, row_count_per_wal do test:insert{i, 'test'} end
> > 
> > Is +10 here because tuples are a bit larger than in other tests?
> > 
> 
> No. Rather because in other tests it is too +10, and here it wasn't.
> Size of one wal record I estimated as >= 50 (size of xrow_header +
> msgpack headers + body size), so perhaps in theory / 50 was already
> enough. +10 is just in case, to ensure that row_count_per_wal 100%
> will create at least one new xlog file.

I misread this, because other tests do not add 10 rows to the variable,
but add them in a loop range:

 | row_count_per_wal = box.cfg.wal_max_size / 50
 | for i = 1, row_count_per_wal + 10 do <...> end

Anyway, it is ok.