[tarantool-patches] Re: [PATCH 3/3] vinyl: force major compaction if there are too many DELETEs

Konstantin Osipov kostja at tarantool.org
Tue Oct 23 12:03:46 MSK 2018

* Vladimir Davydov <vdavydov.dev at gmail.com> [18/10/20 23:19]:
> Even a perfectly shaped LSM tree can accumulate a huge number of DELETE
> statements over time in case indexed fields are frequently updated. This
> can significantly increase read and space amplification, especially for
> secondary indexes.
> One way to deal with it is to propagate read amplification back to the
> scheduler so that it can raise compaction priority accordingly. Although
> this would probably make sense, it wouldn't be enough, because it
> wouldn't deal with space amplification growth in case the workload is
> write-mostly.

I disagree with the reasoning. We need a weighted norm of all
parameters of the LSM tree when calculating compaction priority.
It's pretty easy to do. Imagine a multi-dimensional space whose
dimensions are write amplification, read amplification, and space
amplification. We need to scale each dimension and calculate the
distance to the center of the space, which stands for a perfectly
shaped LSM tree.
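
A rough sketch of the weighted-norm idea, not vinyl code. The
metric names, the ideal values, the weights, and the choice to
ignore metrics that are better than ideal are all assumptions made
for illustration only:

```python
import math

# Hypothetical "perfectly shaped" point and per-dimension scale factors.
# None of these numbers come from vinyl; they only illustrate the idea.
IDEAL = {"write_amp": 1.0, "read_amp": 1.0, "space_amp": 1.0}
WEIGHTS = {"write_amp": 1.0, "read_amp": 2.0, "space_amp": 4.0}


def compaction_priority(metrics, ideal=IDEAL, weights=WEIGHTS):
    """Weighted Euclidean distance of an LSM tree from the ideal point.

    Each dimension is shifted so the ideal tree sits at the origin,
    scaled by its weight, and combined into a single number. A larger
    result means the tree is further from the ideal shape.
    """
    total = 0.0
    for name, best in ideal.items():
        # Assumption: a metric that beats the ideal contributes nothing.
        excess = max(metrics[name] - best, 0.0)
        total += (weights[name] * excess) ** 2
    return math.sqrt(total)


# Example: a tree with high read and space amplification.
tree = {"write_amp": 1.0, "read_amp": 4.0, "space_amp": 5.0}
priority = compaction_priority(tree)
```

With such a norm, a write-mostly workload whose space amplification
grows still raises the priority, even if read amplification stays
flat, because each dimension contributes on its own.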

In any case, reducing read amplification and reducing space
amplification address independent concerns: we need to keep space
amplification within bounds so that we do not run out of disk space.

Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
