From: Sergey Ostanevich via Tarantool-discussions
Reply-To: Sergey Ostanevich
Date: Sat, 27 Feb 2021 17:57:06 +0300
To: tarantool-discussions@dev.tarantool.org
Subject: [Tarantool-discussions] [RFC] describe an inter-fiber debugger

An RFC on bringing a debugger facility into Tarantool.

Part of #5857
---
doc/rfc/inter-fiber-debugger.md | 204 ++++++++++++++++++++++++++++++++
1 file changed, 204 insertions(+)
create mode 100644 doc/rfc/inter-fiber-debugger.md

diff --git a/doc/rfc/inter-fiber-debugger.md b/doc/rfc/inter-fiber-debugger.md
new file mode 100644
index 000000000..e4b64490c
--- /dev/null
+++ b/doc/rfc/inter-fiber-debugger.md
@@ -0,0 +1,204 @@
+# Inter-fiber Debugger for Tarantool
+* **Status**: In progress
+* **Start date**: 20-01-2021
+* **Authors**: Sergey Ostanevich @sergos sergos@tarantool.org,
+               Igor Munkin @imun imun@tarantool.org
+* **Discussion**: https://github.com/tarantool/tarantool/discussions/5857
+
+[TOC]
+
+### Rationale
+
+To make the Tarantool platform developer-friendly we should provide a set of
+basic developer tools. One such tool is a debugger. There are a number of
+debuggers available for Lua environments, but all of them lack a feature
+critical for the Tarantool platform: they should not cause a full stop of the
+debugged program during the debug session.
+
+In this RFC I propose to overcome the problem with a solution that stops only
+the fiber being debugged. It will allow developers to debug their application
+while Tarantool keeps processing requests, performing replication and so on.
+
+### Approach
+
+To avoid reinventing debugger techniques we may borrow an existing Lua
+debugger and add rules about fiber use, data manipulation tweaks and so on.
+
+Every fiber can be considered either a 'debuggee' or a regular fiber, and can
+switch from one state to the other. To control the status we can either patch
+the fiber machinery - which seems excessive as fibers can serve pure C tasks -
+or tweak the breakpoint hook to employ the fiber yield. The fiber will enter a
+state in which it waits for commands from the debugger, and will set the
+LuaJIT hooks so that they are prepared for the next fiber to be scheduled.
+
+### Debug techniques
+
+Regular debuggers interrupt all threads at once, hence they do not
+distinguish in which thread a breakpoint was hit - they just stop execution.
+In our case we have to introduce some specifics so that the debugger aligns
+with the fiber nature of the server. Let's consider some techniques we can
+offer to the user.
+
+#### 1) Break first fiber met
+
+The user sets a breakpoint that triggers once, stopping the first fiber that
+hits it. Once the breakpoint is hit, the fiber reports its status to the
+debugger server, puts itself into a wait state, clears the breakpoint and
+yields. As soon as the server issues a command, the debuggee will reset the
+breakpoint, handle the command and proceed with execution or yield again.
+
+#### 2) Regular breakpoint
+
+This mode starts the same way as the previous one, but keeps the breakpoint
+set before the yield, so that the breakpoint can still trigger in another
+fiber. As the server may run a huge number of fibers, we have to provide a
+user-configurable limit on the number of fibers that can be debuggees at
+once. As soon as the limit is reached, the debuggee fiber starts to behave
+exactly as in the previous mode, clearing the breakpoint before it yields.
+
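The difference between modes 1) and 2) can be sketched in plain Lua. This is only an illustration under assumptions, not the proposed implementation: `on_breakpoint_hit`, the `one_shot` flag and the `debuggee_count` field are hypothetical names that do not appear in the RFC, and a fiber is represented by its ID alone.

```lua
-- Hypothetical helper: decide what happens to a global breakpoint when a
-- regular fiber hits it. Returns true if the fiber became a debuggee.
local function on_breakpoint_hit(dbg, bp_id, fiber_id)
    local bp = dbg.global_breakpoints[bp_id]
    if bp == nil or dbg.debuggee_count >= dbg.max_debuggee then
        return false
    end
    dbg.fibers[fiber_id] = { state = 'pending', current_breakpoint = bp_id }
    dbg.debuggee_count = dbg.debuggee_count + 1
    -- Mode 1 (one_shot) always clears the breakpoint before the yield;
    -- mode 2 clears it only when the debuggee limit has just been reached.
    if bp.one_shot or dbg.debuggee_count >= dbg.max_debuggee then
        dbg.global_breakpoints[bp_id] = nil
    end
    return true
end
```

With `max_debuggee = 2` and a regular breakpoint, the first fiber that hits it becomes a debuggee and the breakpoint stays; the second fiber becomes a debuggee and the breakpoint is cleared; a third fiber passes through untouched.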
+#### 3) Run a function under debug session
+
+This is the most straightforward way to debug a function: perform a call
+through the debug interface. A new fiber will be created and a break will
+occur at the function entry. The limit of debuggee fibers should be increased
+and the fiber will behave similarly to the modes above.
+
+#### 4) Attach debugger to a fiber by ID
+
+Every fiber has a numerical ID, so the debugger can provide an interface to
+start debugging a particular fiber. The fiber will be put into a wait state as
+soon as it resumes execution after the debugger is attached.
+
+### Basic mechanism
+
+The Tarantool side of the debugger will consist of a dedicated fiber named
+DebugSRV, which handles requests from the developer and keeps track of
+debuggee fibers and their breakpoints, and a Lua function DebugHook set as a
+hook via the Lua debug library [https://www.lua.org/pil/23.html]. Users should
+not use the debug hook for the duration of a debug session to avoid
+interference. The external interface can be organized over an arbitrary
+protocol, be it a socket connection, console or even IPROTO (using
+IPROTO_CALL).
+
+A debuggee fiber will be controlled by the debug hook function DebugHook. It
+is the responsibility of DebugHook to set the debuggee fiber status, check
+whether a breakpoint is hit, evaluate its condition including the ignore
+count, and update hit_count. As soon as a breakpoint is hit, DebugHook has to
+set the fiber state to 'pending' and wait for a command from DebugSRV.
+
+Communication between DebugSRV and the debuggee fiber can be done via the
+fiber.channel mechanism. It will simplify the wait-for semantics.
+
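The wait-for semantics can be illustrated without Tarantool. The sketch below is a plain-Lua stand-in, not the proposed implementation: a coroutine plays the debuggee fiber, and `coroutine.yield` / `coroutine.resume` play the role that `channel:get()` and `channel:put()` would have with fiber.channel. The name `make_debuggee` and the 'continue' command are hypothetical.

```lua
-- Stand-in for a debuggee fiber: it "hits a breakpoint", reports 'pending'
-- and blocks until the server sends a command that lets it proceed.
local function make_debuggee(log)
    return coroutine.create(function()
        local state = 'pending'          -- breakpoint hit
        while true do
            -- Blocks here, as channel:get() would in a real fiber.
            local command = coroutine.yield(state)
            log[#log + 1] = command
            if command == 'continue' then
                break
            end
        end
        return 'operation'               -- back to normal execution
    end)
end

-- Stand-in for DebugSRV: each resume delivers one command.
local log = {}
local debuggee = make_debuggee(log)
local _, state = coroutine.resume(debuggee)            -- runs to the breakpoint
_, state = coroutine.resume(debuggee, 'break_info')    -- handled, still waiting
_, state = coroutine.resume(debuggee, 'continue')      -- leaves the wait loop
```

With real fibers DebugSRV would presumably not resume the debuggee directly; it would put the command into the channel and let the scheduler wake the waiting fiber.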
+#### Data structure
+
+Every debuggee fiber has an entry in the corresponding table kept by the
+DebugSRV fiber. The table has the following format:
+
+```
+debuggees = {
+    max_debuggee = number,
+    preserved_hook = {
+        [1] = function,
+        [2] = type,
+        [3] = number
+    },
+    fibers = {
+        [<fiber_id>] = {
+            state = ['pending'|'operation'],
+            current_breakpoint = <breakpoint_id>,
+            channel = fiber.channel,
+            breakpoints = {
+                [<breakpoint_id>] = {
+                    type = ['l'|'c'|'r'|'i'],
+                    value = [number|string],
+                    condition = function,
+                    hit_count = number,
+                    ignore_count = number
+                }
+            }
+        }
+    },
+    global_breakpoints = {
+        [<breakpoint_id>] = {
+            type = ['l'|'c'|'r'|'i'],
+            value = [number|string],
+            condition = function,
+            hit_count = number,
+            ignore_count = number
+        }
+    }
+}
+```
+As DebugSRV receives commands it updates the debuggees structure and forces a
+wakeup of the fiber so that it resets its hook state. The state of a debuggee
+is one of the following:
+
+- 'operation': the fiber is already in the debuggees list, but it yielded
+  without hitting any breakpoint
+- 'pending': DebugHook waits for a new command from the channel stored in
+  debuggees.fibers under its own fiber ID
+
+
+#### DebugHook behavior
+
+For techniques 3) and 4) the fiber appears in debuggees.fibers first, with its
+state set to 'operation' and its list of breakpoints set.
+
+For techniques 1) and 2) there is a list of global_breakpoints that should
+be checked by every fiber.
+
+When a fiber receives control from the debug machinery it should check whether
+it is present in ```debuggees.fibers[ID]```. If it is, it should check whether
+its current position matches any breakpoint from
+```debuggees.fibers[ID].breakpoints``` or ```debuggees.global_breakpoints```.
+If a breakpoint is hit, the fiber sets its state to 'pending' and waits for a
+command from ```debuggees.fibers[ID].channel```.
+
+If a fiber is not present in ```debuggees.fibers[ID]``` it should check that
+the number of entries in ```debuggees.fibers``` is less than max_debuggee. If
+so, it checks whether it hit any of the ```global_breakpoints``` and, if it
+did, puts itself into the fibers list, updating the array size
+[https://www.lua.org/pil/19.1.html]. It should also open a channel to
+DebugSRV and put itself into the 'pending' state.
+
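The checks described in this section can be sketched as a pure Lua function. It is a minimal sketch under stated assumptions: `match_breakpoint`, `find_hit` and `hook_decision` are hypothetical names; a breakpoint is assumed to match when its `type` and `value` equal those of the current event; ignored hits are assumed not to increase `hit_count`; the number of debuggees is tracked in a hypothetical `fiber_count` field because fiber IDs are sparse keys; and opening the channel and the actual wait are left out since they need Tarantool's fiber module.

```lua
-- Returns true if breakpoint `bp` fires for `event` ({type=..., value=...}).
-- Updates ignore_count and hit_count as a side effect.
local function match_breakpoint(bp, event)
    if bp.type ~= event.type or bp.value ~= event.value then
        return false
    end
    if bp.condition and not bp.condition() then
        return false
    end
    if bp.ignore_count > 0 then
        bp.ignore_count = bp.ignore_count - 1
        return false
    end
    bp.hit_count = bp.hit_count + 1
    return true
end

-- Returns the ID of a breakpoint in `list` that fires, or nil.
local function find_hit(list, event)
    for id, bp in pairs(list) do
        if match_breakpoint(bp, event) then
            return id
        end
    end
    return nil
end

-- Returns the new state of fiber `id`, or nil if it stays a regular fiber.
local function hook_decision(dbg, id, event)
    local entry = dbg.fibers[id]
    if entry then
        -- Already a debuggee: own breakpoints first, then the global ones.
        local hit = find_hit(entry.breakpoints, event)
            or find_hit(dbg.global_breakpoints, event)
        if hit then
            entry.state, entry.current_breakpoint = 'pending', hit
        end
        return entry.state
    end
    if dbg.fiber_count < dbg.max_debuggee then
        local hit = find_hit(dbg.global_breakpoints, event)
        if hit then
            -- A regular fiber becomes a debuggee.
            dbg.fibers[id] = { state = 'pending', current_breakpoint = hit,
                               breakpoints = {} }
            dbg.fiber_count = dbg.fiber_count + 1
            return 'pending'
        end
    end
    return nil
end
```

A real DebugHook would call something like `hook_decision` on every hook event and, on 'pending', block on the fiber's channel until DebugSRV sends a command.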
+#### DebugSRV behavior
+
+DebugSRV handles the input from the user and supports the following list of
+commands (as mentioned, it can be used from any interface, so in the general
+case commands are function calls):
+
+- ```break_info([fiber ID])``` - list all breakpoints with counts and
+  conditions; if a fiber ID is given, limit the output to that fiber
+- ```break_cond(<breakpoint id>, <condition>)``` - set a condition for the
+  breakpoint; the condition should be Lua code evaluating to a boolean value
+- ```break_ignore(<breakpoint id>, <count>)``` - ignore the given number of
+  breakpoint hits
+- ```break_delete(<breakpoint id>)``` - remove a breakpoint
+- ```step(<fiber ID>)``` - continue execution, stepping into the call
+- ```step_over(<fiber ID>)``` - continue execution until the next source line,
+  skipping calls
+- ```step_out(<fiber ID>)``` - continue execution until return from the
+  current function
+
+The functions above are common to many debuggers, with some tweaks to
+accommodate fibers. The functions below are more specific, so let's get into
+some details:
+
+- ```set_max_debuggee(number)``` - set the number of fibers that can be
+  debugged simultaneously. It modifies ```debuggees.max_debuggee``` so that
+  new fibers respect the limit. For example, if at some point of debugging
+  there are 5 debuggee fibers, the user can set this value to 3 - it will not
+  cause any problem, just a new fiber will not become a debuggee if it hits
+  some global breakpoint.
+- ```debug_eval(<fiber ID>, <code>)``` - evaluate the code in the context of
+  the debuggee fiber if it is in the 'pending' state. The user can issue
+  ```debug_eval(113, function() return fiber.id() end)``` to receive 113 as a
+  result
+- ```break(<breakpoint description>, [fiber ID])``` - add a new breakpoint to
+  the fiber's breakpoint list, or to the global list if no fiber ID is provided
+- ```debug_start()``` - start a debug session: create the debuggees structure,
+  preserve the current debug hook in ```debuggees.preserved_hook``` and set
+  DebugHook as the current hook
+- ```debug_stop()``` - quit the debug session: reset the debug hook, clear the
+  debuggees structure
+
+
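The `debug_start()` / `debug_stop()` pair can be sketched with the standard `debug.gethook` and `debug.sethook` functions. This is a minimal sketch, not the proposed implementation: it assumes that "resets the debug hook" means restoring the hook saved in `preserved_hook`, the `DebugHook` body is an empty placeholder, and the default `max_debuggee = 1` and the 'crl' hook mask are arbitrary choices that the RFC does not specify.

```lua
local debuggees = nil   -- nil while no debug session is active

-- Placeholder for the real hook, which would check breakpoints.
local function DebugHook(event, line)
end

local function debug_start()
    -- debug.gethook() returns the current hook, its mask and its count.
    local hook, mask, count = debug.gethook()
    debuggees = {
        max_debuggee = 1,  -- arbitrary default, an assumption
        preserved_hook = { hook, mask, count },
        fibers = {},
        global_breakpoints = {},
    }
    debug.sethook(DebugHook, 'crl')
end

local function debug_stop()
    local saved = debuggees.preserved_hook
    if saved[1] then
        debug.sethook(saved[1], saved[2], saved[3])
    else
        debug.sethook()  -- no hook was installed before the session
    end
    debuggees = nil
end
```

Storing the three values returned by `debug.gethook` matches the three slots of `preserved_hook` in the data structure above, so a hook installed by the user before the session survives it.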
--
2.24.3 (Apple Git-128)