Tarantool development patches archive
From: Oleg Babin via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	tarantool-patches@dev.tarantool.org,
	yaroslav.dynnikov@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH vshard 09/11] ref: introduce vshard.storage.ref module
Date: Fri, 5 Mar 2021 00:22:39 +0300
Message-ID: <0522c6f4-7c8a-b424-eeb1-0bb0b10084d7@tarantool.org>
In-Reply-To: <f1bf851b5497b6d9a7216e9a2299a68905847f30.1614039039.git.v.shpilevoy@tarantool.org>

Hi! I've looked again. See 3 comments/questions below.

On 23.02.2021 03:15, Vladislav Shpilevoy wrote:
> +local function ref_session_new(sid)
> +    -- Session object does not store its internal hot attributes in a table. Because
> +    -- it would mean access to any session attribute would cost at least one
> +    -- table indexing operation. Instead, all internal fields are stored as
> +    -- upvalues referenced by the methods defined as closures.
> +    --
> +    -- This means session creation may not be very suitable for jitting, but it is
> +    -- very rare and attempts to optimize the most common case.
> +    --
> +    -- Still the public functions take 'self' object to make it look normal.
> +    -- They even use it a bit.
> +
> +    -- Ref map to get ref object by its ID.
> +    local ref_map = {}
> +    -- Ref heap sorted by their deadlines.
> +    local ref_heap = lheap.new(heap_min_deadline_cmp)
> +    -- Total number of refs of the session. Is used to drop the session without
> +    -- fullscan of the ref map. Heap size can't be used because not all refs are
> +    -- stored here. See more on that below.
> +    local count = 0

Maybe it's better to rename it to "global_count". Sometimes it's quite
confusing to see `M.count +=` near `count +=`.

Also, you have "global_map" and "global_heap", so there is no reason to call
it just "count".

> +    -- Cache global session storages as upvalues to save on M indexing.
> +    local global_heap = M.session_heap
> +    local global_map = M.session_map
> +
> +    local function ref_session_discount(self, del_count)
> +        local new_count = M.count - del_count
> +        assert(new_count >= 0)
> +        M.count = new_count
> +
> +        new_count = count - del_count
> +        assert(new_count >= 0)
> +        count = new_count
> +    end
> +
> +    local function ref_session_update_deadline(self)
> +        local ref = ref_heap:top()
> +        if not ref then
> +            self.deadline = DEADLINE_INFINITY
> +            global_heap:update(self)
> +        else
> +            local deadline = ref.deadline
> +            if deadline ~= self.deadline then
> +                self.deadline = deadline
> +                global_heap:update(self)
> +            end
> +        end
> +    end
> +
> +    --
> +    -- Garbage collect at most 2 expired refs. The idea is that there is no
> +    -- dedicated fiber for expired refs collection. It would be too expensive to
> +    -- wakeup a fiber on each added or removed or updated ref.
> +    --
> +    -- Instead, ref GC is mostly incremental and works by the principle "remove
> +    -- more than add". On each new ref added, two old refs try to expire. This
> +    -- way refs don't stack infinitely, and the expired refs are eventually
> +    -- removed. Because removal is faster than addition: -2 for each +1.
> +    --
> +    local function ref_session_gc_step(self, now)
> +        -- This is inlined 2 iterations of the more general GC procedure. The
> +        -- latter is not called in order to save on not having a loop,
> +        -- additional branches and variables.
> +        if self.deadline > now then
> +            return
> +        end
> +        local top = ref_heap:top()
> +        ref_heap:remove_top()
> +        ref_map[top.id] = nil
> +        top = ref_heap:top()
> +        if not top then
> +            self.deadline = DEADLINE_INFINITY
> +            global_heap:update(self)
> +            ref_session_discount(self, 1)
> +            return
> +        end
> +        local deadline = top.deadline
> +        if deadline >= now then
> +            self.deadline = deadline
> +            global_heap:update(self)
> +            ref_session_discount(self, 1)
> +            return
> +        end
> +        ref_heap:remove_top()
> +        ref_map[top.id] = nil
> +        top = ref_heap:top()
> +        if not top then
> +            self.deadline = DEADLINE_INFINITY
> +        else
> +            self.deadline = top.deadline
> +        end
> +        global_heap:update(self)
> +        ref_session_discount(self, 2)
> +    end
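
For anyone following the thread, the "-2 for each +1" rule can be looked at in isolation. Below is a rough sketch under my own assumptions, not the patch's code: `store_new`, `gc_step`, `add` and `count` are hypothetical names, the refs live in a sorted array instead of a heap, the step is a plain loop rather than the inlined version above, and a ref is treated as expired when its deadline is not greater than `now`.

```lua
-- Hypothetical simplified model: refs are kept in an array sorted by
-- deadline (a stand-in for the real heap).
local function store_new()
    local refs = {}
    local store = {}

    -- Try to expire at most two of the oldest refs.
    local function gc_step(now)
        for _ = 1, 2 do
            local top = refs[1]
            if not top or top.deadline > now then
                return
            end
            table.remove(refs, 1)
        end
    end

    -- Every addition first pays for up to two removals.
    function store.add(id, deadline, now)
        gc_step(now)
        local pos = #refs + 1
        for i, ref in ipairs(refs) do
            if deadline < ref.deadline then
                pos = i
                break
            end
        end
        table.insert(refs, pos, {id = id, deadline = deadline})
    end

    function store.count()
        return #refs
    end

    return store
end
```

With three refs that expire at 10, adding a fourth one at `now = 50` should drop two of them before the push, so the store shrinks even though a ref was added.
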
> +
> +    --
> +    -- GC expired refs until they end or the limit on the number of iterations
> +    -- is exhausted. The limit is supposed to prevent too long GC which would
> +    -- occupy TX thread unfairly.
> +    --
> +    -- Returns false if nothing to GC, or number of iterations left from the
> +    -- limit. The caller is supposed to yield when 0 is returned, and retry GC
> +    -- until it returns false.
> +    -- The function itself does not yield, because it is used from a more
> +    -- generic function GCing all sessions. It would not ever yield if all
> +    -- sessions would have less than limit refs, even if total ref count would
> +    -- be much bigger.
> +    --
> +    -- Besides, the session might be killed during general GC. There must not be
> +    -- any yields in session methods so as not to introduce a support of dead
> +    -- sessions.
> +    --
> +    local function ref_session_gc(self, limit, now)
> +        if self.deadline >= now then
> +            return false
> +        end

Here you mix "booleans" and "numbers" as return values. Maybe it's 
better to return "nil" here?
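
To show what I mean, here is a minimal sketch of the convention. `gc_limited` and `gc_all` are hypothetical names, and `expired` is just a number standing in for the heap. The only point is that "nothing to do" is reported as nil, while a number always means "iterations left".

```lua
-- Hypothetical stand-in for ref_session_gc(): 'expired' is the number of
-- expired items, not a real heap.
local function gc_limited(expired, limit)
    if expired == 0 then
        -- Nothing to collect. Returning nil instead of false keeps the
        -- result "a number or nothing".
        return nil
    end
    local del = math.min(expired, limit)
    -- Iterations left from the limit; 0 means the caller should yield.
    return limit - del, expired - del
end

-- Caller side: retry until nil is returned, count how often it had to yield.
local function gc_all(expired, limit)
    local yields = 0
    while true do
        local rest, left = gc_limited(expired, limit)
        if rest == nil then
            break
        end
        expired = left
        if rest == 0 then
            -- A real caller would yield here (fiber.yield() in Tarantool).
            yields = yields + 1
        end
    end
    return yields
end
```

The caller then needs only a nil check and a comparison with 0, with no boolean in between.
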


> +        local top = ref_heap:top()
> +        local del = 1
> +        local rest = 0
> +        local deadline
> +        repeat
> +            ref_heap:remove_top()
> +            ref_map[top.id] = nil
> +            top = ref_heap:top()
> +            if not top then
> +                self.deadline = DEADLINE_INFINITY
> +                rest = limit - del
> +                break
> +            end
> +            deadline = top.deadline
> +            if deadline >= now then
> +                self.deadline = deadline
> +                rest = limit - del
> +                break
> +            end
> +            del = del + 1
> +        until del >= limit
> +        ref_session_discount(self, del)
> +        global_heap:update(self)
> +        return rest
> +    end
> +
> +    local function ref_session_add(self, rid, deadline, now)
> +        if ref_map[rid] then
> +            return nil, lerror.vshard(lerror.code.STORAGE_REF_ADD,
> +                                      'duplicate ref')
> +        end
> +        local ref = {
> +            deadline = deadline,
> +            id = rid,
> +            -- Used by the heap.
> +            index = -1,
> +        }
> +        ref_session_gc_step(self, now)
> +        ref_map[rid] = ref
> +        ref_heap:push(ref)
> +        if deadline < self.deadline then
> +            self.deadline = deadline
> +            global_heap:update(self)
> +        end
> +        count = count + 1
> +        M.count = M.count + 1
> +        return true
> +    end
> +
> +    --
> +    -- Ref use means it can't be expired until deleted explicitly. Should be
> +    -- done when the request affecting the whole storage starts. After use it is
> +    -- important to call del afterwards - GC won't delete it automatically now.
> +    -- Unless the entire session is killed.
> +    --
> +    local function ref_session_use(self, rid)
> +        local ref = ref_map[rid]
> +        if not ref then
> +            return nil, lerror.vshard(lerror.code.STORAGE_REF_USE, 'no ref')
> +        end
> +        ref_heap:remove(ref)
> +        ref_session_update_deadline(self)
> +        return true
> +    end
> +
> +    local function ref_session_del(self, rid)
> +        local ref = ref_map[rid]
> +        if not ref then
> +            return nil, lerror.vshard(lerror.code.STORAGE_REF_DEL, 'no ref')
> +        end
> +        ref_heap:remove_try(ref)
> +        ref_map[rid] = nil
> +        ref_session_update_deadline(self)
> +        ref_session_discount(self, 1)
> +        return true
> +    end
> +
> +    local function ref_session_kill(self)
> +        global_map[sid] = nil
> +        global_heap:remove(self)
> +        ref_session_discount(self, count)
> +    end
> +
> +    -- Don't use __index. It is useless since all sessions use closures as
> +    -- methods. Also it is probably slower because on each method call would
> +    -- need to get the metatable, get __index, find the method here. While now
> +    -- it is only an index operation on the session object.

Side note: for the heap you still use "__index", even though the heap uses
closures as methods.
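
A small sketch of the two styles I am comparing. The `counter_new_plain` and `counter_new_index` constructors are hypothetical and only illustrate the lookup path; I am not claiming this is how the heap module is actually written.

```lua
-- Variant A: closures are stored directly in the object, so a method call
-- is a single table lookup and no metatable is involved.
local function counter_new_plain()
    local value = 0
    local self = {}
    function self.inc()
        value = value + 1
        return value
    end
    return self
end

-- Variant B: the same kind of closure, but reached through __index. The
-- object itself is empty, so every call goes through the metatable first.
local function counter_new_index()
    local value = 0
    local methods = {
        inc = function()
            value = value + 1
            return value
        end,
    }
    return setmetatable({}, {__index = methods})
end
```

Both variants behave the same from the outside. The difference is only that in variant B the method is not a field of the object, which is the extra indirection the comment above argues against.
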

