From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Ilya Konyukhov <runsfor@gmail.com>, tarantool-patches@dev.tarantool.org
Cc: alexander.turenko@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH 2/2] feedback: collect db engines and index features
Date: Sun, 7 Jun 2020 18:45:42 +0200
Message-ID: <67c75c01-8503-2355-e1f7-9644def2179c@tarantool.org>
In-Reply-To: <ef0528997c2d120730f40af9f6d70c7bd2db08c1.1591345474.git.runsfor@gmail.com>

Thanks for the patch!

Generally, I don't like having so much Lua code in the daemon, or the
full scans of system spaces, because they are slow and produce Lua
garbage. Besides, this approach can't collect some internal things
anyway, such as whether SQL is used (it is not exposed in any system
space), popen, swim, etc. These things don't register themselves in
any global place.

I was rather thinking about keeping track of all these modules and
their statistics in C, so that the statistics would be collected right
when they change, in a set of int counters. The statistics dump would
then cost O(1) in time and go right into a JSON string, without Lua
participation except that Lua would call this C dumper and put its
result into an http request.

In other words, I am not sure this commit is needed at all until we
understand how to collect all the other features too.

See 6 comments below.

On 05/06/2020 10:35, Ilya Konyukhov wrote:
> This patch adds basic db features to feedback report.
> It collects info about what engine and which types of
> indexes are setup by the user.
>
> Here is how report may look like if all the features used:
>
> ```json
> {
>   "arch": "x64",
>   "features": {
>     "has_bitset_index": true,
>     "has_jsonpath_index": true,
>     "vinyl": true,
>     "has_tree_index": true,
>     "has_primary_index": true,
>     "has_hash_index": true,
>     "memtx": true,
>     "has_temporary_spaces": true,
>     "has_local_spaces": true,
>     "has_rtree_index": true,
>     "has_secondary_index": true,
>     "has_functional_index": true
>   },
>   "server_id": "7c8490f7-61c5-4e12-a7ff-d9fed05ad8ac",
>   "is_docker": false,
>   "os": "OSX",
>   "feedback_type": "version",
>   "cluster_id": "1eb7d98e-3344-4f15-a439-c287464f09e7",
>   "tarantool_version": "2.5.0-90-g27fbe6ecd",
>   "feedback_version": 1
> }
> ```
>
> Part of #4943
> ---
>  src/box/lua/feedback_daemon.lua       | 65 +++++++++++++++++++++++++++
>  test/box-tap/feedback_daemon.test.lua | 42 ++++++++++++++++-
>  2 files changed, 106 insertions(+), 1 deletion(-)
>
> diff --git a/src/box/lua/feedback_daemon.lua b/src/box/lua/feedback_daemon.lua
> index 2ce49fb22..0fcd8ed87 100644
> --- a/src/box/lua/feedback_daemon.lua
> +++ b/src/box/lua/feedback_daemon.lua
> @@ -41,6 +41,15 @@ local function detect_docker_environment()
>      return true
>  end
>
> +local function is_system_space(sp)
> +    local sp_id = sp.id
> +    if box.schema.SYSTEM_ID_MIN <= sp_id and sp_id <= box.schema.SYSTEM_ID_MAX then
> +        return true
> +    end

1. Please, keep code lines within the 80-symbol border. Also, this
function's return can be simplified to

    return box.schema.SYSTEM_ID_MIN <= sp_id and
           sp_id <= box.schema.SYSTEM_ID_MAX

> +
> +    return false
> +end
> +
>  local function fill_in_base_info(feedback)
>      if box.info.status ~= "running" then
>          return nil, "not running"
> @@ -56,9 +65,65 @@ local function fill_in_platform_info(feedback)
>      feedback.is_docker = detect_docker_environment()
>  end
>
> +local function fill_in_space_indices(feedback, sp)
> +    if not sp.index[0] then return end
> +
> +    feedback.features.has_primary_index = true

2. What is the purpose of this field?
Zero-index spaces always exist, at least because indexes are created
in a separate DDL statement.

Besides, this function and the iteration over spaces may be really
heavy if there are thousands of spaces, or even hundreds but with many
indexes. And there is no yield.

In addition to yields, I ask you to add caching of this function's
results using the schema version counter. The schema changes very
rarely, so caching would make this function practically free almost
always.

> +    local idx_count = 0
> +    for _, idx in pairs(sp.index) do
> +        for _, part in pairs(idx.parts) do
> +            if part.path ~= nil then
> +                feedback.features.has_jsonpath_index = true
> +                break
> +            end
> +        end
> +        if idx.func ~= nil then
> +            feedback.features.has_functional_index = true
> +        end
> +        if idx.type == 'TREE' then
> +            feedback.features.has_tree_index = true
> +        elseif idx.type == 'HASH' then
> +            feedback.features.has_hash_index = true
> +        elseif idx.type == 'RTREE' then
> +            feedback.features.has_rtree_index = true
> +        elseif idx.type == 'BITSET' then
> +            feedback.features.has_bitset_index = true
> +        end
> +        idx_count = idx_count + 1
> +    end
> +
> +    if idx_count > 1 then
> +        feedback.features.has_secondary_index = true

3. This does not look really useful. What is this flag going to tell
us? Secondary indexes exist almost always.

Besides, I agree with Dmitry's comment about counters instead of flags.
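
To illustrate the caching asked for in comment 2: the idea is to redo the full scan only when the schema version differs from the one the cached result was built for. Below is a minimal sketch in C, not a proposed implementation. All names in it (`feature_report`, `feature_cache`, `feature_cache_get`, `demo_scan`) are hypothetical, and the `schema_version` argument merely stands in for whatever counter the real code would read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical feature report; the real one would hold more fields. */
struct feature_report {
    int index_count;
};

/* Hypothetical callback that performs the expensive full scan. */
typedef void (*collect_fn)(struct feature_report *out);

/* Cached report plus the schema version it was computed for. */
struct feature_cache {
    bool is_valid;
    uint64_t schema_version;
    struct feature_report report;
};

/*
 * Return the cached report. The scan is repeated only when the cache
 * is empty or was built for a different schema version.
 */
static const struct feature_report *
feature_cache_get(struct feature_cache *cache, uint64_t schema_version,
                  collect_fn collect)
{
    assert(cache != NULL && collect != NULL);
    if (!cache->is_valid || cache->schema_version != schema_version) {
        collect(&cache->report);
        cache->schema_version = schema_version;
        cache->is_valid = true;
    }
    return &cache->report;
}

/* Demo stand-in for the scan: counts how many times it has run. */
static int scan_calls;

static void
demo_scan(struct feature_report *out)
{
    scan_calls++;
    out->index_count = 3;
}
```

With a zero-initialized `struct feature_cache`, two calls with the same version run `demo_scan` once, and a call with a new version runs it again.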
> +    end
> +end
> +
> +local function fill_in_features(feedback)
> +    feedback.features = feedback.features or {}
> +
> +    local is_memtx, is_vinyl, is_temporary, is_local
> +    for _, sp in pairs(box.space) do
> +        local is_system = is_system_space(sp)
> +        if not is_system then
> +            if sp.engine == 'vinyl' then is_vinyl = true end
> +            if sp.engine == 'memtx' then
> +                if sp.temporary ~= nil then is_temporary = true end
> +                is_memtx = true
> +            end
> +            if sp.is_local ~= nil then is_local = true end
> +            fill_in_space_indices(feedback, sp)
> +        end
> +    end
> +
> +    feedback.features.has_temporary_spaces = is_temporary
> +    feedback.features.has_local_spaces = is_local
> +    feedback.features.memtx = is_memtx
> +    feedback.features.vinyl = is_vinyl

4. Why do some flags have the prefix 'has_', some 'is_', and some are
just nouns like 'memtx' and 'vinyl'? Let's be consistent and use one
naming template. For that type of flag in C we would use 'has_'.

> +end
> diff --git a/test/box-tap/feedback_daemon.test.lua b/test/box-tap/feedback_daemon.test.lua
> index c36b2a694..e382af8e8 100755
> --- a/test/box-tap/feedback_daemon.test.lua
> +++ b/test/box-tap/feedback_daemon.test.lua
> @@ -113,6 +113,46 @@ check("feedback after start")
>  daemon.send_test()
>  check("feedback after feedback send_test")
>
> +local feedback_json = json.decode(feedback_save)

5. When writing a test for an issue, please mention the issue in a
comment and describe it briefly. Like this:

    --
    -- gh-####: description.
    --

> +test:is(type(feedback_json.features), 'table', 'features field is present')
> +test:isnil(next(feedback_json.features), 'features are empty at the moment')
> +
> +box.schema.create_space('features_vinyl', {engine = 'vinyl'})
> +box.schema.create_space('features_memtx', {engine = 'memtx', is_local = true, temporary = true})
> +box.space.features_memtx:create_index('vinyl_pk', {type = 'tree'})
> +box.space.features_memtx:create_index('memtx_pk', {type = 'hash'})
> +box.space.features_memtx:create_index('memtx_bitset', {type = 'bitset'})
> +box.space.features_memtx:create_index('memtx_rtree', {type = 'rtree', parts = {3, 'array'}})
> +box.space.features_memtx:create_index('memtx_jpath',
> +        {parts = {{field=4, type='str', path='data.name'}}})

6. Please, be consistent in the code style. Surround '=' with
whitespaces, add a whitespace after ',' (see your code below).

> +box.schema.func.create('features_func', {
> +        body = "function(tuple) return {string.sub(tuple[2],1,1)} end",
> +        is_deterministic = true,
> +        is_sandboxed = true})
> +box.space.features_memtx:create_index('j',
> +        {parts={{field = 1, type = 'number'}},func = 'features_func'})
> +
> +check('old feedback received')
> +feedback_reset()
> +check('feedback with db features received')
> +
> +feedback_json = json.decode(feedback_save)
> +test:test('features', function(t)
> +    t:plan(12)
> +    t:ok(feedback_json.features.memtx, 'memtx engine usage gathered')
> +    t:ok(feedback_json.features.vinyl, 'vinyl engine usage gathered')
> +    t:ok(feedback_json.features.has_temporary_spaces, 'temporary space usage gathered')
> +    t:ok(feedback_json.features.has_local_spaces, 'local space usage gathered')
> +    t:ok(feedback_json.features.has_primary_index, 'primary index gathered')
> +    t:ok(feedback_json.features.has_secondary_index, 'secondary index gathered')
> +    t:ok(feedback_json.features.has_tree_index, 'tree index gathered')
> +    t:ok(feedback_json.features.has_hash_index, 'hash index gathered')
> +    t:ok(feedback_json.features.has_rtree_index, 'rtree index gathered')
> +    t:ok(feedback_json.features.has_bitset_index, 'bitset index gathered')
> +    t:ok(feedback_json.features.has_jsonpath_index, 'jsonpath index gathered')
> +    t:ok(feedback_json.features.has_functional_index, 'functional index gathered')
> +end)
> +
>  daemon.stop()
>
>  box.feedback.save("feedback.json")
>
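
To make the C-side proposal from the top of this review more concrete, here is a rough sketch of a set of int counters that are updated when a feature appears or disappears, plus a dumper that writes them into a JSON string. It is only an illustration under assumptions: the feature list, the names (`feature_id`, `feature_update`, `feature_dump_json`) and the JSON keys are all hypothetical, and nothing here is wired into real create/drop hooks.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical feature identifiers; the real set is not fixed yet. */
enum feature_id {
    FEATURE_MEMTX_SPACE,
    FEATURE_VINYL_SPACE,
    FEATURE_TREE_INDEX,
    FEATURE_HASH_INDEX,
    feature_id_MAX
};

/* Hypothetical JSON keys, one per feature. */
static const char *feature_names[feature_id_MAX] = {
    "memtx_spaces", "vinyl_spaces", "tree_indexes", "hash_indexes",
};

/* One int counter per feature, changed right when the feature changes. */
static int feature_counters[feature_id_MAX];

/* Would be called from create/drop paths with delta +1 or -1. */
static void
feature_update(enum feature_id id, int delta)
{
    assert((int)id >= 0 && (int)id < feature_id_MAX);
    feature_counters[id] += delta;
    assert(feature_counters[id] >= 0);
}

/*
 * Dump all counters as one JSON object. The cost depends only on the
 * fixed number of counters, not on how many spaces or indexes exist.
 * Returns the number of bytes written, or -1 if the buffer is too small.
 */
static int
feature_dump_json(char *buf, size_t size)
{
    size_t used = 0;
    for (int i = 0; i < feature_id_MAX; i++) {
        int rc = snprintf(buf + used, size - used, "%s\"%s\": %d",
                          i == 0 ? "{" : ", ", feature_names[i],
                          feature_counters[i]);
        if (rc < 0 || (size_t)rc >= size - used)
            return -1;
        used += (size_t)rc;
    }
    /* Room is needed for the closing brace and the terminating zero. */
    if (size - used < 2)
        return -1;
    buf[used++] = '}';
    buf[used] = '\0';
    return (int)used;
}
```

Lua would then only need to call the dumper and put the resulting string into the http request, with no space or index iteration at report time.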