From: Oleg Babin via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
    tarantool-patches@dev.tarantool.org, yaroslav.dynnikov@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH vshard 09/11] ref: introduce vshard.storage.ref module
Date: Fri, 5 Mar 2021 00:22:39 +0300
Message-ID: <0522c6f4-7c8a-b424-eeb1-0bb0b10084d7@tarantool.org>
In-Reply-To: <f1bf851b5497b6d9a7216e9a2299a68905847f30.1614039039.git.v.shpilevoy@tarantool.org>

Hi! I've looked again. See 3 comments/questions below.

On 23.02.2021 03:15, Vladislav Shpilevoy wrote:
> +local function ref_session_new(sid)
> +    -- Session object does not store its internal hot attributes in a table.
> +    -- Because it would mean access to any session attribute would cost at
> +    -- least one table indexing operation. Instead, all internal fields are
> +    -- stored as upvalues referenced by the methods defined as closures.
> +    --
> +    -- This means session creation may not be very suitable for jitting, but
> +    -- it is very rare and attempts to optimize the most common case.
> +    --
> +    -- Still the public functions take 'self' object to make it look normally.
> +    -- They even use it a bit.
> +
> +    -- Ref map to get ref object by its ID.
> +    local ref_map = {}
> +    -- Ref heap sorted by their deadlines.
> +    local ref_heap = lheap.new(heap_min_deadline_cmp)
> +    -- Total number of refs of the session. Is used to drop the session without
> +    -- fullscan of the ref map. Heap size can't be used because not all refs are
> +    -- stored here. See more on that below.
> +    local count = 0

Maybe it's better to rename it to "global_count". Sometimes it's quite
confusing to see `M.count +=` near `count +=`. Also you have "global_map"
and "global_heap", so there is no reason to call it just "count".

> +    -- Cache global session storages as upvalues to save on M indexing.
> +    local global_heap = M.session_heap
> +    local global_map = M.session_map
> +
> +    local function ref_session_discount(self, del_count)
> +        local new_count = M.count - del_count
> +        assert(new_count >= 0)
> +        M.count = new_count
> +
> +        new_count = count - del_count
> +        assert(new_count >= 0)
> +        count = new_count
> +    end
> +
> +    local function ref_session_update_deadline(self)
> +        local ref = ref_heap:top()
> +        if not ref then
> +            self.deadline = DEADLINE_INFINITY
> +            global_heap:update(self)
> +        else
> +            local deadline = ref.deadline
> +            if deadline ~= self.deadline then
> +                self.deadline = deadline
> +                global_heap:update(self)
> +            end
> +        end
> +    end
> +
> +    --
> +    -- Garbage collect at most 2 expired refs. The idea is that there is no
> +    -- dedicated fiber for expired refs collection. It would be too expensive to
> +    -- wakeup a fiber on each added or removed or updated ref.
> +    --
> +    -- Instead, ref GC is mostly incremental and works by the principle "remove
> +    -- more than add". On each new ref added, two old refs try to expire. This
> +    -- way refs don't stack infinitely, and the expired refs are eventually
> +    -- removed. Because removal is faster than addition: -2 for each +1.
> +    --
> +    local function ref_session_gc_step(self, now)
> +        -- This is inlined 2 iterations of the more general GC procedure. The
> +        -- latter is not called in order to save on not having a loop,
> +        -- additional branches and variables.
> +        if self.deadline > now then
> +            return
> +        end
> +        local top = ref_heap:top()
> +        ref_heap:remove_top()
> +        ref_map[top.id] = nil
> +        top = ref_heap:top()
> +        if not top then
> +            self.deadline = DEADLINE_INFINITY
> +            global_heap:update(self)
> +            ref_session_discount(self, 1)
> +            return
> +        end
> +        local deadline = top.deadline
> +        if deadline >= now then
> +            self.deadline = deadline
> +            global_heap:update(self)
> +            ref_session_discount(self, 1)
> +            return
> +        end
> +        ref_heap:remove_top()
> +        ref_map[top.id] = nil
> +        top = ref_heap:top()
> +        if not top then
> +            self.deadline = DEADLINE_INFINITY
> +        else
> +            self.deadline = top.deadline
> +        end
> +        global_heap:update(self)
> +        ref_session_discount(self, 2)
> +    end
> +
> +    --
> +    -- GC expired refs until they end or the limit on the number of iterations
> +    -- is exhausted. The limit is supposed to prevent too long GC which would
> +    -- occupy TX thread unfairly.
> +    --
> +    -- Returns false if nothing to GC, or number of iterations left from the
> +    -- limit. The caller is supposed to yield when 0 is returned, and retry GC
> +    -- until it returns false.
> +    -- The function itself does not yield, because it is used from a more
> +    -- generic function GCing all sessions. It would not ever yield if all
> +    -- sessions would have less than limit refs, even if total ref count would
> +    -- be much bigger.
> +    --
> +    -- Besides, the session might be killed during general GC. There must not be
> +    -- any yields in session methods so as not to introduce a support of dead
> +    -- sessions.
> +    --
> +    local function ref_session_gc(self, limit, now)
> +        if self.deadline >= now then
> +            return false
> +        end

Here you mix "booleans" and "numbers" as return values. Maybe it's better to
return "nil" here?
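
To illustrate the suggestion, here is a minimal standalone sketch of a
"number or nil" contract. It is not the patch's code: session_gc() and the
plain sorted array of deadlines are hypothetical stand-ins for
ref_session_gc() and the ref heap, assumed only for this example.

```lua
-- Hypothetical stand-in for ref_session_gc(); not the code from the patch.
-- `deadlines` is a plain array sorted in ascending order (assumption made
-- for the sketch, the patch uses a heap).
local function session_gc(deadlines, limit, now)
    if #deadlines == 0 or deadlines[1] >= now then
        -- Nothing has expired yet: return nil rather than false, so the
        -- result is always "a number or nothing".
        return nil
    end
    local del = 0
    while del < limit and #deadlines > 0 and deadlines[1] < now do
        table.remove(deadlines, 1)
        del = del + 1
    end
    -- Iterations left from the limit. 0 tells the caller to yield and retry.
    return limit - del
end

local deadlines = {1, 2, 3, 10}
local rest1 = session_gc(deadlines, 2, 5) -- drops 1 and 2, limit exhausted
local rest2 = session_gc(deadlines, 2, 5) -- drops 3, one iteration left
local rest3 = session_gc(deadlines, 2, 5) -- 10 is not expired
```

Both false and nil are falsy in Lua, so a caller written as
`if not rest then ... end` would behave the same either way. The difference
is only in the documented contract of the function.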
> +        local top = ref_heap:top()
> +        local del = 1
> +        local rest = 0
> +        local deadline
> +        repeat
> +            ref_heap:remove_top()
> +            ref_map[top.id] = nil
> +            top = ref_heap:top()
> +            if not top then
> +                self.deadline = DEADLINE_INFINITY
> +                rest = limit - del
> +                break
> +            end
> +            deadline = top.deadline
> +            if deadline >= now then
> +                self.deadline = deadline
> +                rest = limit - del
> +                break
> +            end
> +            del = del + 1
> +        until del >= limit
> +        ref_session_discount(self, del)
> +        global_heap:update(self)
> +        return rest
> +    end
> +
> +    local function ref_session_add(self, rid, deadline, now)
> +        if ref_map[rid] then
> +            return nil, lerror.vshard(lerror.code.STORAGE_REF_ADD,
> +                                      'duplicate ref')
> +        end
> +        local ref = {
> +            deadline = deadline,
> +            id = rid,
> +            -- Used by the heap.
> +            index = -1,
> +        }
> +        ref_session_gc_step(self, now)
> +        ref_map[rid] = ref
> +        ref_heap:push(ref)
> +        if deadline < self.deadline then
> +            self.deadline = deadline
> +            global_heap:update(self)
> +        end
> +        count = count + 1
> +        M.count = M.count + 1
> +        return true
> +    end
> +
> +    --
> +    -- Ref use means it can't be expired until deleted explicitly. Should be
> +    -- done when the request affecting the whole storage starts. After use it is
> +    -- important to call del afterwards - GC won't delete it automatically now.
> +    -- Unless the entire session is killed.
> +    --
> +    local function ref_session_use(self, rid)
> +        local ref = ref_map[rid]
> +        if not ref then
> +            return nil, lerror.vshard(lerror.code.STORAGE_REF_USE, 'no ref')
> +        end
> +        ref_heap:remove(ref)
> +        ref_session_update_deadline(self)
> +        return true
> +    end
> +
> +    local function ref_session_del(self, rid)
> +        local ref = ref_map[rid]
> +        if not ref then
> +            return nil, lerror.vshard(lerror.code.STORAGE_REF_DEL, 'no ref')
> +        end
> +        ref_heap:remove_try(ref)
> +        ref_map[rid] = nil
> +        ref_session_update_deadline(self)
> +        ref_session_discount(self, 1)
> +        return true
> +    end
> +
> +    local function ref_session_kill(self)
> +        global_map[sid] = nil
> +        global_heap:remove(self)
> +        ref_session_discount(self, count)
> +    end
> +
> +    -- Don't use __index. It is useless since all sessions use closures as
> +    -- methods. Also it is probably slower because on each method call would
> +    -- need to get the metatable, get __index, find the method here. While now
> +    -- it is only an index operation on the session object.

Side note: for the heap you still use "__index", even though the heap also
uses closures as methods.
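
For readers less familiar with the two styles being compared, here is a small
self-contained sketch. The counter objects and every name in it are
hypothetical and only illustrate the lookup difference; they are not taken
from the vshard heap or ref modules.

```lua
-- Variant A: methods are closures stored directly in the object table and
-- the state lives in an upvalue. A method call is one index operation on
-- the object itself.
local function counter_new_closures()
    local value = 0
    local self = {}
    function self.inc(_)
        value = value + 1
        return value
    end
    return self
end

-- Variant B: methods live in a shared table reached through the metatable's
-- __index field. The object itself holds only the state.
local counter_mt = {__index = {}}
function counter_mt.__index.inc(self)
    self.value = self.value + 1
    return self.value
end
local function counter_new_index()
    return setmetatable({value = 0}, counter_mt)
end

local a = counter_new_closures()
local b = counter_new_index()
local a1, a2 = a:inc(), a:inc()
local b1, b2 = b:inc(), b:inc()
-- rawget() bypasses metatables, so it shows where the method really lives.
local a_has_own_method = rawget(a, 'inc') ~= nil
local b_has_own_method = rawget(b, 'inc') ~= nil
```

Both variants are called the same way. The trade-off is that variant A
creates a new closure per object, while variant B shares one function but
goes through the metatable on every call.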