[tarantool-patches] Re: [PATCH] test: enable parallel for python tests and long

Alexander Turenko alexander.turenko at tarantool.org
Sun Dec 23 14:04:13 MSK 2018


Hi!

See comments below and fixups on the
sergw/enable-parallel-test-py-long-fixups branch.

I tested it with the command below, and after the fixups it passes.

```
$ ./test-run.py --long -- $(for i in $(seq 1 10); do echo -n "-py/ "; done)
```

WBR, Alexander Turenko.

On Tue, Dec 18, 2018 at 11:43:00AM +0300, Sergei Voronezhskii wrote:
> Fixed cleanup for python tests:
> - box-py/iproto.test.py need to cleanup created cluster
>   reproduce:
>   - [box-py/iproto.test.py, null]
>   - [box-py/bootstrap.test.py, null]
> - box-py/boostrap.test.py should not restart server because of next
>   box-py/call.test.py got error:
>    `NetworkError: (104, 'Connection reset by peer')`
>    at `iproto.authenticate('test', 'test')`
>   reproduce:
>   - [box-py/iproto.test.py, null]
>   - [box-py/bootstrap.test.py, null]
>   - [box-py/call.test.py, null]
> - replication-py/multi.test.py should not relay on hardcoded server.id

relay -> rely

Why shouldn't it? You cleaned up _cluster in box-py/iproto.test.py.
Why not clean it up in replication-py/cluster.test.py too?
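
For the record, here is a sketch of the kind of cleanup I mean at the end
of replication-py/cluster.test.py. It is only a sketch: it assumes the
usual test-run `server` handle and the admin/yaml pattern already used in
these tests, and I have not checked whether the replicas must be stopped
before their _cluster rows can be deleted.

```python
import yaml

# Unregister every replica this test added, so that the next test starts
# with a _cluster that contains only the master (id == 1).
rows = yaml.load(server.admin("box.space._cluster:select{}", silent=True))[0]
for row in rows:
    if row[0] != 1:
        server.admin("box.space._cluster:delete(%d)" % row[0])
```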

>   because previous test can create some and `id` will autoincremented,
>   instead this we need to calculate vclock_diff

You use the hardcoded server id 1. 'previous test can create some' -- some
what? I suppose you mean using the vclock difference instead of the
absolute value, but the message states something different.

>   reproduce:
>   - [replication-py/cluster.test.py, null]
>   - [replication-py/multi.test.py, null]
> 
> Part of: #3232
> ---
> BRANCH: https://github.com/tarantool/tarantool/tree/sergw/enable-parallel-test-py-long
>  test/box-py/bootstrap.test.py     |  4 +---
>  test/box-py/iproto.result         |  4 ++++
>  test/box-py/iproto.test.py        |  1 +
>  test/box-py/suite.ini             |  2 +-
>  test/engine_long/suite.ini        |  2 +-
>  test/long_run-py/suite.ini        |  2 +-
>  test/luajit-tap/suite.ini         |  2 +-
>  test/replication-py/multi.result  | 19 +++++--------------
>  test/replication-py/multi.test.py | 28 +++++++++++++++++-----------
>  test/replication-py/suite.ini     |  2 +-
>  test/xlog-py/suite.ini            |  2 +-
>  11 files changed, 34 insertions(+), 34 deletions(-)
> 
> diff --git a/test/box-py/bootstrap.test.py b/test/box-py/bootstrap.test.py
> index 9d690b03f..dba6f5ae9 100644
> --- a/test/box-py/bootstrap.test.py
> +++ b/test/box-py/bootstrap.test.py
> @@ -9,8 +9,6 @@ cluster_uuid = yaml.load(server.admin('box.space._schema:get("cluster")',
>  sys.stdout.push_filter(cluster_uuid, '<cluster uuid>')
>  
>  server.admin('box.internal.bootstrap()')
> -server.restart()
> -

I suppose the test case will no longer test what it was intended to test.
Please implement waiting for the server startup instead of dropping the
restart; see the sketch below.
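
Something along these lines would keep the restart and still be safe (a
sketch only; it reuses the admin/yaml pattern from the test and assumes
box.info.status reports 'running' once the server is up again):

```python
import time
import yaml

server.admin('box.internal.bootstrap()')
server.restart()

# Poll the admin console until the restarted server reports that it is
# up, so the selects below (and the next test) talk to a live server.
while True:
    try:
        status = yaml.load(server.admin('box.info.status', silent=True))[0]
        if status == 'running':
            break
    except Exception:
        pass  # the admin port may not be reachable right after restart
    time.sleep(0.1)
```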

You should have a strong opinion on why this does not break the test case
before proposing such a change in a patch. Once you have that opinion, it
is easy to write it down in the commit message.

I am not sure whether it really breaks the test case, but I don't see any
justification in the commit message. So I have to investigate the problem
from scratch to prove whether it is good or bad. What do we gain from your
work then?

Without the investigation the code itself costs nothing, and the result of
the investigation was not shared. I don't understand why you keep working
this way, again and again.

>  server.admin('box.space._schema:select{}')
>  server.admin('box.space._cluster:select{}')
>  server.admin('box.space._space:select{}')
> @@ -20,4 +18,4 @@ server.admin('box.space._func:select{}')
>  server.admin('box.space._priv:select{}')
>  
>  # Cleanup
> -sys.stdout.pop_filter()
> +sys.stdout.clear_all_filters()

Two filters were pushed, so two should be popped, I think.
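
That is, something like this (assuming the second push_filter() call
earlier in the test that I refer to):

```python
# Cleanup: pop exactly the two filters that were pushed above.
sys.stdout.pop_filter()
sys.stdout.pop_filter()
```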

> diff --git a/test/replication-py/multi.test.py b/test/replication-py/multi.test.py
> index 224332266..3394d47be 100644
> --- a/test/replication-py/multi.test.py
> +++ b/test/replication-py/multi.test.py
> @@ -10,10 +10,14 @@ ROW_N = REPLICA_N * 20
>  
>  # master server
>  master = server
> +
>  master.admin("fiber = require('fiber')")
>  master.admin("box.schema.user.grant('guest', 'replication')")
>  master.admin("box.schema.user.grant('guest', 'execute', 'universe')")
>  
> +# Get vclock on master
> +vclock_cnt = yaml.load(master.admin("box.info.vclock[1]", silent = True))[0]
> +
>  print '----------------------------------------------------------------------'
>  print 'Bootstrap replicas'
>  print '----------------------------------------------------------------------'
> @@ -48,14 +52,16 @@ for server in cluster:
>      server.iproto.py_con.eval("box.cfg { replication = ... }", [sources])
>  
>  # Wait connections to establish
> -for server in cluster:
> +for sid, server in enumerate(cluster, 1):
>      for server2 in cluster:
>          server.iproto.py_con.eval("""
>              while #box.info.vclock[...] ~= nil do
>                  fiber.sleep(0.01)
>              end;""", server2.id)
> -        print 'server', server.id, "connected"
> -    server.admin("box.info.vclock")
> +        print 'server', sid, "connected"
> +    vclock_new = yaml.load(server.admin("box.info.vclock[1]", silent = True))[0]
> +    print "vclock_diff: {}".format(vclock_new - vclock_cnt)
> +
>  

Re server.id -> sid: please clean up _cluster in
replication-py/cluster.test.py instead. The same goes for all such changes
below.

Re vclocks: why is this vector clock difference a scalar value? Why is it
always taken for server id 1? Please describe the problem first. Is it
that cluster.test.py moves the vclock? I think we can simply not print it
at all, since it was only informational.
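
In other words, once _cluster is cleaned up in cluster.test.py, server.id
is stable again and the loop can stay almost as it was, just without the
vclock output (a sketch; I wrote the wait condition the way I read its
intent, waiting until server2 shows up in the vclock):

```python
# Wait for connections to establish.
for server in cluster:
    for server2 in cluster:
        server.iproto.py_con.eval("""
            while box.info.vclock[...] == nil do
                fiber.sleep(0.01)
            end;""", server2.id)
    # Keep the 'connected' line, but drop the box.info.vclock print:
    # it was purely informational and depends on preceding tests.
    print 'server', server.id, "connected"
```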



