[Tarantool-patches] [PATCH vshard 09/11] ref: introduce vshard.storage.ref module
Oleg Babin
olegrok at tarantool.org
Fri Mar 5 00:22:39 MSK 2021
Hi! I've looked again. See 3 comments/questions below.
On 23.02.2021 03:15, Vladislav Shpilevoy wrote:
> +local function ref_session_new(sid)
> + -- Session object does not store its internal hot attributes in a table, because
> + -- it would mean access to any session attribute would cost at least one
> + -- table indexing operation. Instead, all internal fields are stored as
> + -- upvalues referenced by the methods defined as closures.
> + --
> + -- This means session creation may not be very suitable for jitting, but it is
> + -- very rare and attempts to optimize the most common case.
> + --
> + -- Still the public functions take 'self' object to make it look normally.
> + -- They even use it a bit.
> +
> + -- Ref map to get ref object by its ID.
> + local ref_map = {}
> + -- Ref heap sorted by their deadlines.
> + local ref_heap = lheap.new(heap_min_deadline_cmp)
> + -- Total number of refs of the session. Is used to drop the session without
> + -- fullscan of the ref map. Heap size can't be used because not all refs are
> + -- stored here. See more on that below.
> + local count = 0
Maybe it's better to rename it to "global_count". It is sometimes quite
confusing to see `M.count +=` next to `count +=`.
Also, you already have "global_map" and "global_heap", so there is no
reason to call it just "count".
> + -- Cache global session storages as upvalues to save on M indexing.
> + local global_heap = M.session_heap
> + local global_map = M.session_map
> +
> + local function ref_session_discount(self, del_count)
> + local new_count = M.count - del_count
> + assert(new_count >= 0)
> + M.count = new_count
> +
> + new_count = count - del_count
> + assert(new_count >= 0)
> + count = new_count
> + end
> +
> + local function ref_session_update_deadline(self)
> + local ref = ref_heap:top()
> + if not ref then
> + self.deadline = DEADLINE_INFINITY
> + global_heap:update(self)
> + else
> + local deadline = ref.deadline
> + if deadline ~= self.deadline then
> + self.deadline = deadline
> + global_heap:update(self)
> + end
> + end
> + end
> +
> + --
> + -- Garbage collect at most 2 expired refs. The idea is that there is no
> + -- dedicated fiber for expired refs collection. It would be too expensive to
> + -- wakeup a fiber on each added or removed or updated ref.
> + --
> + -- Instead, ref GC is mostly incremental and works by the principle "remove
> + -- more than add". On each new ref added, two old refs try to expire. This
> + -- way refs don't stack infinitely, and the expired refs are eventually
> + -- removed. Because removal is faster than addition: -2 for each +1.
> + --
> + local function ref_session_gc_step(self, now)
> + -- This is inlined 2 iterations of the more general GC procedure. The
> + -- latter is not called in order to save on not having a loop,
> + -- additional branches and variables.
> + if self.deadline > now then
> + return
> + end
> + local top = ref_heap:top()
> + ref_heap:remove_top()
> + ref_map[top.id] = nil
> + top = ref_heap:top()
> + if not top then
> + self.deadline = DEADLINE_INFINITY
> + global_heap:update(self)
> + ref_session_discount(self, 1)
> + return
> + end
> + local deadline = top.deadline
> + if deadline >= now then
> + self.deadline = deadline
> + global_heap:update(self)
> + ref_session_discount(self, 1)
> + return
> + end
> + ref_heap:remove_top()
> + ref_map[top.id] = nil
> + top = ref_heap:top()
> + if not top then
> + self.deadline = DEADLINE_INFINITY
> + else
> + self.deadline = top.deadline
> + end
> + global_heap:update(self)
> + ref_session_discount(self, 2)
> + end
> +
> + --
> + -- GC expired refs until they end or the limit on the number of iterations
> + -- is exhausted. The limit is supposed to prevent too long GC which would
> + -- occupy TX thread unfairly.
> + --
> + -- Returns false if nothing to GC, or number of iterations left from the
> + -- limit. The caller is supposed to yield when 0 is returned, and retry GC
> + -- until it returns false.
> + -- The function itself does not yield, because it is used from a more
> + -- generic function GCing all sessions. It would not ever yield if all
> + -- sessions would have less than limit refs, even if total ref count would
> + -- be much bigger.
> + --
> + -- Besides, the session might be killed during general GC. There must not be
> + -- any yields in session methods so as not to introduce a support of dead
> + -- sessions.
> + --
> + local function ref_session_gc(self, limit, now)
> + if self.deadline >= now then
> + return false
> + end
Here you mix "booleans" and "numbers" as return values. Maybe it's
better to return "nil" here?
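To show what I mean, here is a reduced sketch. It is not the patch's code: `gc_sketch`, `gc_all`, the `expired` array, and the `yield` callback are hypothetical stand-ins, and the GC logic is simplified to popping array items. The point is only that the result becomes "number or nil", while the caller's truthiness check stays the same as with `false`.

```lua
-- Hypothetical stand-in for ref_session_gc(): returns nil when there is
-- nothing to collect, otherwise the number of iterations left from the limit.
local function gc_sketch(expired, limit)
    if #expired == 0 then
        return nil
    end
    local del = 0
    repeat
        table.remove(expired)
        del = del + 1
    until del >= limit or #expired == 0
    return limit - del
end

-- Hypothetical caller: yield when the limit is exhausted (0 is returned),
-- retry until nil is returned. Note that 0 is truthy in Lua, so the loop
-- condition only stops on nil.
local function gc_all(expired, limit, yield)
    local rest = gc_sketch(expired, limit)
    while rest do
        if rest == 0 then
            yield()
        end
        rest = gc_sketch(expired, limit)
    end
end
```

With this shape the function never returns a boolean, so its contract can be documented as a single numeric type plus "no value".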
> + local top = ref_heap:top()
> + local del = 1
> + local rest = 0
> + local deadline
> + repeat
> + ref_heap:remove_top()
> + ref_map[top.id] = nil
> + top = ref_heap:top()
> + if not top then
> + self.deadline = DEADLINE_INFINITY
> + rest = limit - del
> + break
> + end
> + deadline = top.deadline
> + if deadline >= now then
> + self.deadline = deadline
> + rest = limit - del
> + break
> + end
> + del = del + 1
> + until del >= limit
> + ref_session_discount(self, del)
> + global_heap:update(self)
> + return rest
> + end
> +
> + local function ref_session_add(self, rid, deadline, now)
> + if ref_map[rid] then
> + return nil, lerror.vshard(lerror.code.STORAGE_REF_ADD,
> + 'duplicate ref')
> + end
> + local ref = {
> + deadline = deadline,
> + id = rid,
> + -- Used by the heap.
> + index = -1,
> + }
> + ref_session_gc_step(self, now)
> + ref_map[rid] = ref
> + ref_heap:push(ref)
> + if deadline < self.deadline then
> + self.deadline = deadline
> + global_heap:update(self)
> + end
> + count = count + 1
> + M.count = M.count + 1
> + return true
> + end
> +
> + --
> + -- Ref use means it can't be expired until deleted explicitly. Should be
> + -- done when the request affecting the whole storage starts. After use it is
> + -- important to call del afterwards - GC won't delete it automatically now.
> + -- Unless the entire session is killed.
> + --
> + local function ref_session_use(self, rid)
> + local ref = ref_map[rid]
> + if not ref then
> + return nil, lerror.vshard(lerror.code.STORAGE_REF_USE, 'no ref')
> + end
> + ref_heap:remove(ref)
> + ref_session_update_deadline(self)
> + return true
> + end
> +
> + local function ref_session_del(self, rid)
> + local ref = ref_map[rid]
> + if not ref then
> + return nil, lerror.vshard(lerror.code.STORAGE_REF_DEL, 'no ref')
> + end
> + ref_heap:remove_try(ref)
> + ref_map[rid] = nil
> + ref_session_update_deadline(self)
> + ref_session_discount(self, 1)
> + return true
> + end
> +
> + local function ref_session_kill(self)
> + global_map[sid] = nil
> + global_heap:remove(self)
> + ref_session_discount(self, count)
> + end
> +
> + -- Don't use __index. It is useless since all sessions use closures as
> + -- methods. Also it is probably slower because on each method call would
> + -- need to get the metatable, get __index, find the method here. While now
> + -- it is only an index operation on the session object.
Side note: for the heap you still use "__index", even though the heap
also uses closures as methods.
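A toy comparison of the two styles, in case it helps. Both constructors below are hypothetical (`counter_new_direct`, `counter_new_index`), and the second one is only my assumption about the general "closures behind __index" pattern, not a copy of the heap module's layout.

```lua
-- Variant A: closures stored directly on the object, as the session does.
-- A method call is a single index operation on the object itself.
local function counter_new_direct()
    local n = 0
    local self = {}
    self.inc = function()
        n = n + 1
        return n
    end
    return self
end

-- Variant B: the same closure, but reached through a metatable's __index.
-- The lookup misses on the object first and only then goes to __index.
local function counter_new_index()
    local n = 0
    local methods = {
        inc = function()
            n = n + 1
            return n
        end,
    }
    return setmetatable({}, {__index = methods})
end
```

Both variants behave the same from the caller's side; the difference is only in where the method lookup ends up.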