[Tarantool-patches] [PATCH v1 1/2] test: remove obvious part in rpm spec for Travis

Sat Dec 26 06:30:18 MSK 2020

On Fri, Dec 25, 2020 at 11:44:16PM +0300, Alexander V. Tikhonov wrote:
> Removed the obvious part in the rpm spec for Travis-CI, since it is no
> longer in use.
> ---
> 
> Github: https://github.com/tarantool/tarantool/tree/avtikhon/rpm-spec-timeouts
> 
>  rpm/tarantool.spec | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/rpm/tarantool.spec b/rpm/tarantool.spec
> index 489b94df8..aae60f8dc 100644
> --- a/rpm/tarantool.spec
> +++ b/rpm/tarantool.spec
> @@ -170,11 +170,6 @@ make %{?_smp_mflags}
>  rm -rf %{buildroot}%{_datarootdir}/doc/tarantool/
>  
>  %check
> -%if "%{_ci}" == "travis"
> -%if (0%{?fedora} >= 22 || 0%{?rhel} >= 7 || 0%{?sle_version} >= 1500)
> -cd test && ./test-run.py --force -j 1 unit/ app/ app-tap/ box/ box-tap/ engine/ vinyl/
> -%endif
> -%else
>  %if 0%{?rhel} != 6
>  # Run all available test suites except 'replication'
>  # which is not currently ready for this testing and
> @@ -182,7 +177,6 @@ cd test && ./test-run.py --force -j 1 unit/ app/ app-tap/ box/ box-tap/ engine/
>  # https://github.com/tarantool/tarantool/issues/4798
>  TEST_RUN_EXCLUDE='replication/' make test-force
>  %endif
> -%endif
>  
>  %pre
>  /usr/sbin/groupadd -r tarantool > /dev/null 2>&1 || :
> -- 

Thanks!

This change is obvious, so I'll push it out of order.

I have added several comments to the commit message; see the end of
the email.

Pushed to master, 2.6, 2.5 and 1.10.

CCed Kirill.

WBR, Alexander Turenko.

----

commit d9c25b7a8991ef56c71d5bd9296881b2068afb79
Author: Alexander V. Tikhonov <avtikhon at tarantool.org>
Date:   Fri Dec 25 15:00:40 2020 +0300

    test: remove obvious part in rpm spec for Travis

    Removed the obvious part in the rpm spec for Travis-CI, since it is no
    longer in use.

    ---- Comments from @Totktonada ----

    This change is a kind of revert of the commit
    d48406d58c54231b7a53c8f8ed4ba9c1d5275a59 ('test: add more tests to
    packaging testing'), which closed #4599.

    Here I describe the story: why the change was made and why it is
    reverted now.

    We run tests during an RPM package build: this may catch
    distribution-specific problems. We used a reduced set of tests and
    single-threaded test execution to keep the testing stable and avoid
    breaking package builds and deployment due to known fragile tests.

    Our CI used to run on Travis CI, but we were in transition to GitLab CI
    to use our own machines and to avoid hitting the Travis CI limit of five
    jobs running in parallel.

    We moved package builds to GitLab CI, but kept build+deploy jobs on
    Travis CI for a while: GitLab CI was new for us and we wanted to make
    this transition smooth for users of our APT / YUM repositories.

    After enabling package builds on GitLab CI, we wanted to enable more
    tests (to catch more problems) and to enable parallel execution of
    tests to speed up testing (and reduce the amount of time a developer
    waits for results).

    We observed that if we enabled more tests and parallel execution on
    Travis CI, the testing results would become much less stable, and so we
    would often have holes in deployed packages and red CI.

    So, we decided to keep the old way of testing on Travis CI and make all
    changes (more tests, more parallelism) only for GitLab CI.

    We guessed that we had enough machine resources and would be able to do
    some load balancing to overcome flaky failures on our own machines, but
    in fact we picked another approach later (see below).

    That's the whole story behind #4599. What has changed since those days?

    We moved deployment jobs to GitLab CI[^1] and have now completely
    disabled Travis CI (see #4410 and #4894). All jobs were moved either to
    GitLab CI or directly to GitHub Actions[^2].

    We revisited our approach to improving the stability of testing.
    Attempts to do some load balancing while keeping the execution time
    reasonably short failed. We would have to increase parallelism for
    speed, but decrease it for stability at the same time. There is no
    optimal balance.

    So we decided to track flaky failures in the issue tracker and restart a
    test after a known failure (see details in [1]). This way we don't need
    to exclude tests or disable parallelism in order to get stable and fast
    testing[^3]. At least in theory. We're in the process of verifying this
    guess, but hopefully we'll settle on some adequate defaults that will
    work everywhere[^4].

    To sum up, there are several reasons to remove the old workaround, which
    was implemented in the scope of #4599: no Travis CI, no foreseeable
    reasons to exclude tests and reduce parallelism depending on a CI
    provider.

    Footnotes:

    [^1]: This is a simplification. Travis CI deployment jobs were not moved
          as is. GitLab CI jobs push packages to the new repositories
          backend (#3380). Travis CI jobs were disabled later (as part of
          #4947), after it was proven that the new infrastructure works
          fine. However, this is another story.

    [^2]: Now we're going to use GitHub Actions for all jobs, mainly because
          GitLab CI is poorly integrated with GitHub pull requests (when
          the source branch is in a forked repository).

    [^3]: Some work in this direction is still to be done:

          First, the 'replication' test suite is still excluded from the
          testing under RPM package build. It seems we should just
          re-enable it; this is tracked by #4798.

          Second, there is an issue [2] about getting rid of ancient traces
          of the old attempts to keep the testing stable (on the test-run
          side). It'll give us more parallelism in testing.

    [^4]: Of course, we investigate flaky failures and fix the code and
          testing problems they reveal. However, this appears to be a
          long-running activity.

    References:

    [1]: https://github.com/tarantool/test-run/pull/217
    [2]: https://github.com/tarantool/test-run/issues/251
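
The idea of restarting a test after a known failure, as described in the
commit message, can be illustrated with a minimal Python sketch. This is
not the test-run implementation from [1]: the KNOWN_FLAKY registry, the
failure signatures, the test name and the run_with_retries() helper are
all hypothetical and exist only for illustration. The assumption is that
a failure is retried only when it matches one already tracked in the
issue tracker, while any unknown failure is reported at once.

```python
from typing import Callable, Dict, Optional, Set

# Hypothetical registry: test name -> failure signatures that are already
# tracked in the issue tracker and are therefore allowed to be retried.
KNOWN_FLAKY: Dict[str, Set[str]] = {
    "replication/example.test.lua": {"timeout waiting for replica"},
}


def run_with_retries(
    name: str,
    run_once: Callable[[], Optional[str]],
    known_flaky: Dict[str, Set[str]],
    max_retries: int = 3,
) -> bool:
    """Run a test, restarting it only after a known flaky failure.

    run_once() is assumed to return None on success or a short failure
    signature string on failure. Returns True if the test passed.
    """
    for _attempt in range(1 + max_retries):
        failure = run_once()
        if failure is None:
            return True
        if failure not in known_flaky.get(name, set()):
            # Unknown failure: do not hide it behind a retry.
            return False
    # The known failure kept reproducing; give up and report a failure.
    return False
```

Retrying only the failures listed in the registry keeps new regressions
visible: a test that fails in a way nobody has filed yet still turns the
CI red on the first attempt.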