From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Thu, 25 Apr 2019 15:53:10 +0300 From: Alexander Turenko Subject: Re: [tarantool-patches] [PATCH v3 7/7] Add merger for tuple streams (Lua part) Message-ID: <20190425125310.ah2rdqdguvfvlzoz@tkn_work_nb> References: <9af4b8f1311537ef696d71a1b09bc1721bde8ef0.1554906327.git.alexander.turenko@tarantool.org> <20190425114659.GN29257@atlas> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20190425114659.GN29257@atlas> To: Konstantin Osipov Cc: tarantool-patches@freelists.org, Vladimir Davydov List-ID: On Thu, Apr 25, 2019 at 02:46:59PM +0300, Konstantin Osipov wrote: > * Alexander Turenko [19/04/10 18:23]: > > > The api is generally LGTM, one comment below: > > > A merger is a special kind of a source, which is created from a key_def > > object and a set of sources. It performs a kind of the merge sort: > > chooses a source with a minimal / maximal tuple on each step, consumes > > a tuple from this source and repeats. The API to create a merger is the > > following: > > > > ```lua > > local ctx = merger.context.new(key_def.new(<...>)) > > local sources = {<...>} > > local merger_inst = merger.new(ctx, sources, { > > Why do you need a separate object used only to construct a > merger? Why not pass all parameters into merger.new? A user may want to cache key_def + format creation when a schema changes rarely. The original merger allows it: https://github.com/tarantool/shard/blob/180948e99148973e89f75f8e4784315e183e3fa2/shard/init.lua#L1215-L1224 Even if one doesn't cache a merger context, but runs several merges over one data stream (see the multiplexed cases in examples) and a schema is the same, it worth to reuse the context.