From: "Илья Конюхов" <runsfor@gmail.com> To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH 2/2] feedback: collect db engines and index features Date: Wed, 17 Jun 2020 11:59:35 +0300 [thread overview] Message-ID: <6D02837E-BBEB-4CB9-806F-D5D8E66014B1@gmail.com> (raw) In-Reply-To: <6b8b0f63-0a3f-d3ae-7596-2611ea433d44@tarantool.org> Hi, I’ve worked on your comments and fixed lapses. See comments below. Besides that, are you ok with pushing this (Lua) patch forward? I understand your thoughts on implementing more robust solution in C, which would also cover some statistics gathering goals. I agree with it, but I think we need to design it first. Let me branch out an issue on this topic. > On 11 Jun 2020, at 22:32, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote: > > Thanks for the fixes! > > Generally fine. As a first set of simple stats. See a few > more comments below. > > First is about why I wanted to collect the stat from C, in a > reactive way, rather than in a proactive way. > > The main reason - C is cheaper. Also the collected stats could be > useful for users, if we could design it in a good way. And then it > could be collected more often than once in an hour. > > The other reason - the schema cache should help in most of the > cases, but I once saw a customer, who used spaces as temporary > objects. From what I saw, it was code of a game, and spaces were > lobbies. In such installation the space count may be huge, and > schema change rate as well. So the cache becomes useless. > > A reactive C implementation could just increment a set of int counters > in a global struct in alter.cc right when a space/index is > created/changed/dropped. It would cost nothing for everyone. And would > be a good base for wider set of statistics. Which we clearly need. > For example, Cyrill was struggling with lack of internal statistics > when tried to bench relay threads performance. > > See 4 comments below. > >> diff --git a/src/box/lua/feedback_daemon.lua b/src/box/lua/feedback_daemon.lua >> index 21e69d511..1f177a204 100644 >> --- a/src/box/lua/feedback_daemon.lua >> +++ b/src/box/lua/feedback_daemon.lua >> @@ -50,6 +50,25 @@ local function detect_docker_environment() >> return cached_detect_docker_env >> end >> >> +local function is_system_space(sp) >> + return box.schema.SYSTEM_ID_MIN <= sp.id and >> + sp.id <= box.schema.SYSTEM_ID_MAX > > 1. Nit: please, align the second line by 'box' in the previous > line. Like this: > > return box.schema.SYSTEM_ID_MIN <= sp.id and > sp.id <= box.schema.SYSTEM_ID_MAX done. > >> +end >> + >> +local function is_jsonpath_index(idx) >> + for _, part in pairs(idx.parts) do >> + if part.path ~= nil then >> + return true >> + end >> + end >> + >> + return false >> +end >> + >> +local function is_functional_index(idx) >> + return idx.func ~= nil >> +end >> + >> local function fill_in_base_info(feedback) >> if box.info.status ~= "running" then >> return nil, "not running" >> @@ -65,9 +84,98 @@ local function fill_in_platform_info(feedback) >> feedback.is_docker = detect_docker_environment() >> end >> >> +local function fill_in_indices_stats(space, stats) >> + if not space.index[0] then return end > > 2. Looks like this condition is not necessary. The cycle anyway > shouldn't do any iterations if the table is empty. agree. fixed. > >> + >> + for name, idx in pairs(space.index) do >> + if type(name) == 'number' then >> + local idx_type = idx.type >> + if idx_type == 'TREE' then >> + if is_functional_index(idx) then >> + stats.functional = stats.functional + 1 >> + elseif is_jsonpath_index(idx) then >> + stats.jsonpath = stats.jsonpath + 1 >> + end >> + stats.tree = stats.tree + 1 >> + elseif idx_type == 'HASH' then >> + stats.hash = stats.hash + 1 >> + elseif idx_type == 'RTREE' then >> + stats.rtree = stats.rtree + 1 >> + elseif idx_type == 'BITSET' then >> + stats.bitset = stats.bitset + 1 >> + end >> + end >> + end >> +end >> + >> +local function fill_in_space_stats(features) >> + local spaces = { >> + memtx = 0, >> + vinyl = 0, >> + temporary = 0, >> + ['local'] = 0, >> + } >> + >> + local indices = { >> + hash = 0, >> + tree = 0, >> + rtree = 0, >> + bitset = 0, >> + jsonpath = 0, >> + functional = 0, >> + } >> + >> + for name, space in pairs(box.space) do >> + local is_system = is_system_space(space) >> + if not is_system and type(name) == 'number' then >> + if space.engine == 'vinyl' then >> + spaces.vinyl = spaces.vinyl + 1 >> + elseif space.engine == 'memtx' then >> + if space.temporary ~= nil then >> + spaces.temporary = spaces.temporary + 1 >> + end >> + spaces.memtx = spaces.memtx + 1 >> + end >> + if space.is_local == false then > > 3. Are you sure? Looks like a typo. Perhaps you meant: > > if space.is_local then > > Without '== false'. Otherwise you count non-local spaces. > How does it pass the tests now? My bad. Thanks, fixed. Test happen to pass, because there were 2 spaces. 1 local and 1 non-local. Adjusted it. > >> +local function fill_in_features_impl(features) >> + fill_in_space_stats(features) >> +end >> + >> +local cached_schema_version = 0 >> +local cached_feedback_features = {} > > 4. I would better move the cache handling into fill_in_space_stats(). > Because when you will add more features, not related to the schema, > they won't relate to schema version. > > fill_in_features() should not use any caches. Does not depend on schema. > fill_in_space_stats() can use the cache. Because fetches info from the > schema. > > I mean, it works now, but we would need to rewrite that mostly, when > more features will be collected. Agree. Extracted out caching into fill_in_space_stats(), but kept it in fill_in_features(), since space_stats is semantically features. diff --git a/src/box/lua/feedback_daemon.lua b/src/box/lua/feedback_daemon.lua index 95130d696..111a70fca 100644 --- a/src/box/lua/feedback_daemon.lua +++ b/src/box/lua/feedback_daemon.lua @@ -26,13 +26,154 @@ local function get_fiber_id(f) return fid end -local function fill_in_feedback(feedback) +local function detect_docker_environment_impl() + local fh = fio.open('/proc/1/cgroup', {'O_RDONLY'}) + if not fh then + return false + end + + -- fh:read() doesn't read empty "proc" files + local big_enough_chunk = 4096 + local s = fh:read(big_enough_chunk) + fh:close() + + return s and s:find('docker') and true or false +end + +local cached_detect_docker_env + +local function detect_docker_environment() + if cached_detect_docker_env == nil then + cached_detect_docker_env = detect_docker_environment_impl() + end + + return cached_detect_docker_env +end + +local function is_system_space(sp) + return box.schema.SYSTEM_ID_MIN <= sp.id and + sp.id <= box.schema.SYSTEM_ID_MAX +end + +local function is_jsonpath_index(idx) + for _, part in pairs(idx.parts) do + if part.path ~= nil then + return true + end + end + + return false +end + +local function is_functional_index(idx) + return idx.func ~= nil +end + +local function fill_in_base_info(feedback) if box.info.status ~= "running" then return nil, "not running" end feedback.tarantool_version = box.info.version feedback.server_id = box.info.uuid feedback.cluster_id = box.info.cluster.uuid +end + +local function fill_in_platform_info(feedback) + feedback.os = jit.os + feedback.arch = jit.arch + feedback.is_docker = detect_docker_environment() +end + +local function fill_in_indices_stats(space, stats) + for name, idx in pairs(space.index) do + if type(name) == 'number' then + local idx_type = idx.type + if idx_type == 'TREE' then + if is_functional_index(idx) then + stats.functional = stats.functional + 1 + elseif is_jsonpath_index(idx) then + stats.jsonpath = stats.jsonpath + 1 + end + stats.tree = stats.tree + 1 + elseif idx_type == 'HASH' then + stats.hash = stats.hash + 1 + elseif idx_type == 'RTREE' then + stats.rtree = stats.rtree + 1 + elseif idx_type == 'BITSET' then + stats.bitset = stats.bitset + 1 + end + end + end +end + +local function fill_in_space_stats_impl(features) + local spaces = { + memtx = 0, + vinyl = 0, + temporary = 0, + ['local'] = 0, + } + + local indices = { + hash = 0, + tree = 0, + rtree = 0, + bitset = 0, + jsonpath = 0, + functional = 0, + } + + for name, space in pairs(box.space) do + local is_system = is_system_space(space) + if not is_system and type(name) == 'number' then + if space.engine == 'vinyl' then + spaces.vinyl = spaces.vinyl + 1 + elseif space.engine == 'memtx' then + if space.temporary then + spaces.temporary = spaces.temporary + 1 + end + spaces.memtx = spaces.memtx + 1 + end + if space.is_local then + spaces['local'] = spaces['local'] + 1 + end + fill_in_indices_stats(space, indices) + end + fiber.yield() + end + + for k, v in pairs(spaces) do + features[k..'_spaces'] = v + end + + for k, v in pairs(indices) do + features[k..'_indices'] = v + end +end + +local cached_schema_version = 0 +local cached_feedback_features = {} + +local function fill_in_space_stats(feedback) + local schema_version = box.internal.schema_version() + if cached_schema_version < schema_version then + local features = {} + fill_in_space_stats_impl(features) + cached_schema_version = schema_version + cached_feedback_features = features + end + feedback.features = cached_feedback_features +end + +local function fill_in_features(feedback) + fill_in_space_stats(feedback) +end + +local function fill_in_feedback(feedback) + fill_in_base_info(feedback) + fill_in_platform_info(feedback) + fill_in_features(feedback) + return feedback end
next prev parent reply other threads:[~2020-06-17 8:59 UTC|newest] Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-06-05 8:35 [Tarantool-patches] [PATCH 0/2] Extend feedback module report Ilya Konyukhov 2020-06-05 8:35 ` [Tarantool-patches] [PATCH 1/2] feedback: determine runtime platform info Ilya Konyukhov 2020-06-07 16:45 ` Vladislav Shpilevoy 2020-06-09 23:05 ` Илья Конюхов 2020-06-11 19:32 ` Vladislav Shpilevoy 2020-07-01 0:16 ` Alexander Turenko 2020-07-05 2:14 ` Alexander Turenko 2020-06-05 8:35 ` [Tarantool-patches] [PATCH 2/2] feedback: collect db engines and index features Ilya Konyukhov 2020-06-07 16:45 ` Vladislav Shpilevoy 2020-06-09 23:06 ` Илья Конюхов 2020-06-11 19:32 ` Vladislav Shpilevoy 2020-06-17 8:59 ` Илья Конюхов [this message] 2020-06-17 22:53 ` Vladislav Shpilevoy 2020-06-18 15:42 ` Илья Конюхов 2020-06-18 23:02 ` Vladislav Shpilevoy 2020-06-19 14:01 ` Илья Конюхов 2020-06-19 23:49 ` Vladislav Shpilevoy 2020-06-22 8:55 ` Илья Конюхов 2020-07-01 0:15 ` Alexander Turenko 2020-07-03 12:05 ` Илья Конюхов 2020-07-05 2:10 ` Alexander Turenko 2020-06-23 21:23 ` [Tarantool-patches] [PATCH 0/2] Extend feedback module report Vladislav Shpilevoy 2020-07-13 13:47 ` Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=6D02837E-BBEB-4CB9-806F-D5D8E66014B1@gmail.com \ --to=runsfor@gmail.com \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 2/2] feedback: collect db engines and index features' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox