[tarantool-patches] Re: [PATCH 00/12] vinyl: statistics improvements
Vladimir Davydov
vdavydov.dev at gmail.com
Thu Jan 17 15:06:42 MSK 2019
On Thu, Jan 17, 2019 at 02:32:36PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev at gmail.com> [19/01/15 17:20]:
> > This patch set adds a few metrics necessary for implementing compaction
> > randomization and transaction throttling, but it's useful on its own,
> > because it makes box.stat.vinyl() a little bit more useful when it comes
> > to performance analysis. Here's an example of box.stat.vinyl() output
> > with this patch set applied:
>
> Please write a documentation request which explains the meaning of
> these variables. AFAIK these stats are still not described in the
> manual. Please try to explain why these statistics are useful, and
> how they can be used.
There are a lot of changes and they are done in separate patches, so
I'm planning to file a documentation request manually after this patch
set is pushed.
>
> > ---
> > - tx:
> >     conflict: 0
> >     commit: 1979052
> >     rollback: 0
> >     statements: 2
> >     transactions: 1
> >     gap_locks: 0
> >     read_views: 0
> >   regulator:
>
> let's rename it to rate_limit or rate_limits? Regulator is not
> specific enough. What does it regulate?
Transaction rate. I guess the name 'rate_limit' would be somewhat more
straightforward, but the component is called vy_regulator in the code
and I'd like to keep the name 'regulator', because it'd be consistent
with other box.stat.vinyl() sections:
  scheduler - schedules dumps and compaction tasks
  regulator - regulates transaction rate based on scheduler progress
  iterator - here we will account cumulative iterator statistics
    (cache hits/misses, read amplification, etc); this one hasn't been
    implemented yet.
Besides, I'm planning to add a 'rate_limit' member to this table and
regulator.rate_limit looks better than rate_limit.rate_limit or
rate_limit.value IMO.
>
> >     dump_bandwidth: 10485760
> Without comments even I forget the meaning of these.
So we need documentation for it.
>
> >     dump_watermark: 20023725
> >     write_rate: 7085581
>
> >   memory:
> >     tuple_cache: 0
> >     tx: 2388
> >     level0: 19394239
> >     page_index: 4422529
> >     bloom_filter: 1517177
>
> Good.
>
> >   disk:
> >     data_compacted: 500330587
>
> What's this?
The disk space (without compression) the database would take if all
spaces were compacted. (data + index) / data_compacted can be used to
estimate space amplification. The value is estimated as the size of the
last LSM tree level. I don't know how to name it better: data_unique
maybe, or data_stripped, or simply last_level. IMO data_compacted sounds
better.
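
For illustration, here is a minimal Python sketch of that estimate, using
the numbers from the output quoted in this mail. The function name is made
up for the example, and the values are copied by hand rather than fetched
from a running Tarantool instance:

```python
def space_amplification(disk):
    """Estimate space amplification from the 'disk' section of box.stat.vinyl().

    Implements (data + index) / data_compacted as described above.
    The helper name is hypothetical; it is not part of Tarantool.
    """
    return (disk["data"] + disk["index"]) / disk["data_compacted"]


# Values copied from the example output quoted in this thread.
disk = {"data_compacted": 500330587, "data": 762493299, "index": 41814873}

print(f"space amplification: {space_amplification(disk):.2f}")
```

For this sample the ratio comes out at about 1.6, i.e. the database takes
roughly 60% more disk space than it would if fully compacted.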
>
> >     data: 762493299
> >     index: 41814873
> >   scheduler:
> >     dump_time: 186.61679973663
>
> It's total dump time, the name can be confused with
> last dump time.
Yep, but reporting the last dump time wouldn't make any sense. I'd like
to avoid total_ prefixes as the names are already long enough.
>
> >     tasks_inprogress: 3
> >     dump_output: 2115930554
> >     compaction_queue: 213022513
> >     compaction_output: 4130054964
> >     compaction_time: 737.99443827965
> >     dump_count: 136
> >     tasks_failed: 0
> >     tasks_completed: 1839
> >     dump_input: 2061676471
> >     compaction_input: 5646476938
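
Since dump_time and compaction_time are cumulative, one possible use is to
divide the matching output counters by them to get average rates. Below is
a rough Python sketch of that idea. It assumes sizes are in bytes and times
in seconds, which this thread does not state, and the helper name is made
up for the example:

```python
def average_rate(output_size, total_time):
    """Average output rate over the whole accounted period.

    Assumes output_size is in bytes and total_time in seconds;
    returns 0.0 if no time has been accounted yet.
    """
    return output_size / total_time if total_time > 0 else 0.0


# Values copied from the 'scheduler' section quoted in this thread.
scheduler = {
    "dump_output": 2115930554,
    "dump_time": 186.61679973663,
    "compaction_output": 4130054964,
    "compaction_time": 737.99443827965,
}

dump_rate = average_rate(scheduler["dump_output"], scheduler["dump_time"])
compaction_rate = average_rate(scheduler["compaction_output"],
                               scheduler["compaction_time"])

print(f"average dump rate:       {dump_rate / 1e6:.1f} MB/s")
print(f"average compaction rate: {compaction_rate / 1e6:.1f} MB/s")
```

If the times are summed over tasks running in parallel, these figures are
per-task averages rather than aggregate throughput.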