Sergey, thanks a lot for the review.
1. sync() can be asynchonous operation so we should wait it or somehow make
sure it has finished.
7. sync() can be asynchonous operation, we should wait it to make sure
it has finished.
17. the same comment as above regarding sync()
The «sync» behaviour can be asynchronous, however, modern Linux has been waiting for the recording to finish.
Any way I will add checking exit status from sync.
2. What does '500' mean? Let's move magic numbers to constants.
Sure. Will fix it.
3. As far as I understand cbench_runner.lua wrote two files
cbench_output_memtx_write.txt and cbench_output_vinyl_write.txt, you
pass filenames as arguments. On next step we should extract interested
numbers from these files. But cbench_output_memtx_write.txt is unused
here. And how other files were created: cbench_output_memtx.txt and
cbench_output_vinyl_fsync.txt?
There are 3 runs for cbench with different parameters which produce 3 output files:
cbench_output_memtx.txt
cbench_output_vinyl_fsync.txt
cbench_output_vinyl_write.txt
which are parsed to 4 results files.
4. Lines with grep/sed looks absolutely unreadable. Please split lines, like below:
   grep "^?tab" cbench_output_memtx.txt |
   sed "s/.*name=//"| sed "s/&param=/:/"| sed "s/cb\./cb\.memtx\./" |
   tee -a cbench_result.txt
   grep "^?tab" cbench_output_vinyl_fsync.txt |
   sed "s/.*name=//"| sed "s/&param=/:/"| sed "s/cb\./cb\.vinyl\.fsync\./"|
   tee -a cbench_result.txt
6. set -u
9. It is not clear why filemask includes zero at the beginning.
10. please remove commented lines or uncomment them
12. you can implement the same body of function much simpler:
   with open(filename) as raw_data:
   return raw_data.readlines()
13. you can implement the same body of function much simpler:
   with open(filename) as raw_data:
   return raw_data.readlines()[-1][0]
14. Deviations
16. set -u
17(b). filemask for *.xlog and *.snap
Refactoring. I’ve added changes you suggested.
5. Running of publish.py will broke if we will move repo to another
   directory. Let's use relative path to publish.py.
8. Running of publish.py will broke if we will move repo to another
   directory. Let's use relative path to publish.py.
11. Running of publish.py will broke if we will move repo to another
   directory. Let's use relative path to publish.py.
15. Running of publish.py will broke if we will move repo to another
   directory. Let's use relative path to publish.py.
18. Running of publish.py will broke if we will move repo to another
   directory. Let's use relative path to publish.py.
21. Running of publish.py will broke if we will move repo to another
directory. Let's use relative path to publish.py.
This absolute path from container root. As we agreed it will be kept.
19. cbench script uses physcpubind equal to 1, why we set exact these
numbers and why physcpubind has not the same set of numbers?
For performance testing we are using servers with 2 processors
and each processor have few (6) cores. For stable results we bind tasks
to single proc or even single core. The difference can be because
of bench requirement (as Tarantool can run multithread execution).
So this is still open question how to do correctly binding, but for now
I would suggest to keep this parameters unchanged. 
20. What does "l" variable mean?
This variables changes in range (for l in a b c d e f ; do) and runs different workloads:
 -rw-r--r-- 1 root root  3014 Apr 24 07:24 workloada
 -rw-r--r-- 1 root root  3050 Apr 24 07:24 workloadb
 -rw-r--r-- 1 root root  3036 Apr 24 07:24 workloadc
 -rw-r--r-- 1 root root  3399 Apr 24 07:24 workloadd
 -rw-r--r-- 1 root root  3558 Apr 24 07:24 workloade
 -rw-r--r-- 1 root root  3149 Apr 24 07:24 workloadf
22. Why do we need gperftools here?
Possible artefact from debugging phase a long time ago. Doesn't need for perf testing anymore. Removing.
 
--
Oleg Piskunov