[Tarantool-discussions] [RFC] describe an inter-fiber debugger

Sergey Ostanevich sergos at tarantool.org
Sat Feb 27 17:57:06 MSK 2021


Subject: 
An RFC on bringing debugger facility into Tarantool.

Part of #5857
---
doc/rfc/inter-fiber-debugger.md | 204 ++++++++++++++++++++++++++++++++
1 file changed, 204 insertions(+)
create mode 100644 doc/rfc/inter-fiber-debugger.md

diff --git a/doc/rfc/inter-fiber-debugger.md b/doc/rfc/inter-fiber-debugger.md
new file mode 100644
index 000000000..e4b64490c
--- /dev/null
+++ b/doc/rfc/inter-fiber-debugger.md
@@ -0,0 +1,204 @@
+# Inter-fiber Debugger for Tarantool
+* **Status**: In progress
+* **Start date**: 20-01-2021
+* **Authors**: Sergey Ostanevich @sergos <sergos at tarantool.org>,
+               Igor Munkin @imun <imun at tarantool.org>
+* **Discussion**: https://github.com/tarantool/tarantool/discussions/5857
+
+[TOC]
+
+### Rationale
+
+To make the Tarantool platform developer-friendly we should provide a set of
+basic developer tools. One such tool is a debugger. There are a number of
+debuggers available for Lua environments, but all of them lack a feature
+critical for the Tarantool platform: they should not cause a full stop of the
+debugged program during the debug session.
+
+In this RFC I propose to overcome the problem with a solution that will stop
+only the fiber to be debugged. It will allow developers to debug their
+application, while Tarantool can keep processing requests, perform replication
+and so on.
+
+### Approach
+
+To avoid reinventing debugger techniques we may borrow an existing Lua
+debugger and add rules about fiber use, data manipulation tweaks and so on.
+
+Every fiber can be considered either a 'debuggee' or a regular fiber, and can
+switch from one state to the other. To control the status we can either patch
+the fiber machinery - which seems excessive as fibers can serve pure C tasks -
+or tweak the breakpoint hook to employ the fiber yield. The fiber then enters a
+state in which it waits for commands from the debugger, and sets the LuaJIT
+hooks so that they are ready for the next fiber to be scheduled.
+
+### Debug techniques
+
+Regular debuggers interrupt all threads at once, hence they don't distinguish
+in which thread a breakpoint is hit - they just stop execution. In our case we
+have to introduce some specifics so that the debugger aligns with the
+fiber-based nature of the server. Let's consider some techniques we can offer
+to the user.
+
+#### 1) Break first fiber met
+
+The user sets a breakpoint that triggers once, stopping the first fiber that
+hits it. After the breakpoint is met, the fiber reports its status to the
+debugger server, puts itself in a wait state, clears the breakpoint and yields.
+As soon as the server issues a command, the debuggee will reset the breakpoint,
+handle the command and proceed with execution or yield again.
+
+#### 2) Regular breakpoint
+
+This mode starts the same way as the previous one, but keeps the breakpoint
+before the yield, so that the breakpoint can still trigger in another fiber. As
+the server may run a huge number of fibers during its operation, we have to set
+up a user-configurable limit on the number of fibers that can be debuggees at
+once. As soon as the limit is reached, the debuggee fiber starts to behave
+exactly as in the previous mode, clearing the breakpoint before the yield.
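+
+The difference between modes 1) and 2) comes down to whether the breakpoint
+survives a hit. Below is a minimal plain-Lua sketch of that decision. The
+helper `on_breakpoint_hit`, the `one_shot` flag and the flat `session` table
+are hypothetical names used only for illustration; they are not part of the
+proposed data structure.
+
+```lua
+-- Hypothetical helper: decide what happens to a breakpoint once a fiber hits it.
+-- `one_shot` stands for technique 1), `max_debuggee` for the limit in 2).
+local function on_breakpoint_hit(session, bp_id, fiber_id)
+    local bp = session.breakpoints[bp_id]
+    session.debuggees[fiber_id] = true
+    session.count = session.count + 1
+    -- Technique 1) always clears; technique 2) clears once the limit is reached.
+    if bp.one_shot or session.count >= session.max_debuggee then
+        session.breakpoints[bp_id] = nil
+        return 'cleared'
+    end
+    return 'kept'
+end
+
+local session = {
+    max_debuggee = 2,
+    count = 0,
+    debuggees = {},
+    breakpoints = { [1] = { one_shot = false } },
+}
+local first = on_breakpoint_hit(session, 1, 101)   -- 'kept'
+local second = on_breakpoint_hit(session, 1, 102)  -- 'cleared'
+```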
+
+#### 3) Run a function under debug session
+
+This is the most straightforward way to debug a function: perform a call
+through the debug interface. A new fiber will be created and a break will occur
+at the function entry. The limit of debuggee fibers should be increased and
+the fiber will behave similarly to the modes above.
+
+#### 4) Attach debugger to a fiber by ID
+
+Every fiber has a numerical ID, so the debugger can provide an interface to
+start debugging a particular fiber. The fiber will be put in a wait state as
+soon as it starts execution after the debugger is attached.
+
+### Basic mechanism
+
+The Tarantool side of the debugger will consist of two parts: a dedicated fiber
+named DebugSRV that handles requests from the developer and keeps track of
+debuggee fibers and their breakpoints, and a Lua function DebugHook set as a
+hook in the Lua debug [https://www.lua.org/pil/23.html] library. Users should
+not use this hook while debugging to avoid interference. The external
+interface can be organized over an arbitrary protocol, be it a socket
+connection, console or even IPROTO (using IPROTO_CALL).
+
+Each debuggee fiber will be controlled by the DebugHook function. It is the
+responsibility of DebugHook to set the debuggee fiber status, check whether a
+breakpoint is met, evaluate its condition and ignore count, and update
+hit_count. As soon as a breakpoint is met, DebugHook has to set the fiber
+state to 'pending' and wait for a command from DebugSRV.
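+
+The per-breakpoint check can be sketched in plain Lua as follows. This is only
+an illustration under stated assumptions: `should_stop` is a hypothetical
+name, the condition is taken to be a function without arguments, and counting
+ignored hits in hit_count (as gdb does) is a guess, since the RFC does not
+fix these details.
+
+```lua
+-- Hypothetical helper: returns true when the fiber should stop at `bp`.
+local function should_stop(bp)
+    -- A failing condition means the breakpoint is not considered hit at all.
+    if bp.condition and not bp.condition() then
+        return false
+    end
+    bp.hit_count = bp.hit_count + 1
+    -- Skip the first `ignore_count` hits, consuming one per hit.
+    if bp.ignore_count > 0 then
+        bp.ignore_count = bp.ignore_count - 1
+        return false
+    end
+    return true
+end
+
+local bp = { condition = nil, hit_count = 0, ignore_count = 2 }
+local results = {}
+for i = 1, 3 do
+    results[i] = should_stop(bp)
+end
+-- results is { false, false, true } and bp.hit_count is 3
+```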
+
+Communication between DebugSRV and the debuggee fiber can be done via the
+fiber.channel mechanism, which simplifies the wait-for semantics.
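+
+A rough sketch of the wait-for exchange, to be run inside Tarantool. It uses
+the existing `fiber.create`, `fiber.channel`, `channel:put` and `channel:get`
+calls. Using two channels (one per direction) and the shape of the reply table
+are assumptions made for the example; the RFC keeps a single channel per
+debuggee.
+
+```lua
+-- Requires Tarantool: `fiber` is its built-in module.
+local fiber = require('fiber')
+
+local commands = fiber.channel(1)  -- DebugSRV -> debuggee
+local replies = fiber.channel(1)   -- debuggee -> DebugSRV
+
+-- Stand-in for a debuggee that has just hit a breakpoint.
+fiber.create(function()
+    local state = 'pending'
+    local cmd = commands:get()     -- blocks this fiber only
+    state = 'operation'
+    replies:put({ id = fiber.self():id(), cmd = cmd, state = state })
+end)
+
+-- Stand-in for DebugSRV: send a command and wait for the acknowledgement.
+commands:put('step')
+local reply = replies:get()
+-- reply.cmd is 'step' and reply.state is 'operation'
+```
+
+While the debuggee is blocked in `commands:get()`, other fibers keep running,
+which is the property the RFC relies on.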
+
+#### Data structure
+
+Every debuggee fiber is present in the corresponding table in the DebugSRV
+fiber. The table has the following format:
+
+```
+debuggees = {
+    max_debuggee = number,
+    preserved_hook = {
+        [1] = function,
+        [2] = type,
+        [3] = number
+    },
+    fibers = {
+        [<fiber_id>] = {
+            state = ['pending'|'operation'],
+            current_breakpoint = <breakpoint_id>,
+            channel = fiber.channel,
+            breakpoints = {
+                [<breakpoint_id>] = {
+                    type = ['l'|'c'|'r'|'i'],
+                    value = [number|string],
+                    condition = function,
+                    hit_count = number,
+                    ignore_count = number
+                }
+            }
+        }
+    },
+    global_breakpoints = {
+        [<breakpoint_id>] = {
+            type = ['l'|'c'|'r'|'i'],
+            value = [number|string],
+            condition = function,
+            hit_count = number,
+            ignore_count = number
+        }
+    }
+}
+```
+As DebugSRV receives commands it updates the debuggees structure and forces a
+wakeup of the fiber so that it resets its hook state. The state of a debuggee
+is one of the following:
+
+- 'operation': the fiber is already in the debuggees list, but it yielded
+  without hitting any breakpoint
+- 'pending': DebugHook waits for a new command from the channel stored in the
+  debuggees.fibers entry for its own ID
+
+
+#### DebugHook behavior
+
+For techniques 3) and 4) the fiber appears in the debuggees.fibers list first,
+with its state set to 'operation' and its list of breakpoints filled in.
+
+For techniques 1) and 2) there is a list of global_breakpoints that should be
+checked by every fiber.
+
+When a fiber receives control from the debug machinery it should check whether
+it is present in ```debuggees.fibers[ID]```. If it is, it should check whether
+its current position meets any breakpoint from
+```debuggees.fibers[ID].breakpoints``` or ```debuggees.global_breakpoints```.
+If a breakpoint is met, the fiber sets its state to 'pending' and waits for a
+command from ```debuggees.fibers[ID].channel```.
+
+If a fiber is not present in ```debuggees.fibers[ID]``` it should check that
+the number of fiber entries in the debuggees structure is less than
+max_debuggee. In that case it checks whether it has met any of the
+```global_breakpoints``` and, if so, puts itself into the fibers list, updating
+the array size [https://www.lua.org/pil/19.1.html]. It should also open a
+channel to the DebugSRV and put itself into the 'pending' state.
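+
+The two cases above can be sketched in plain Lua as shown below. The helper
+names (`count_entries`, `matches`, `check_position`) are hypothetical, and so
+is the reading of breakpoint type 'l' as "line number", since the RFC does not
+define the type codes. Source file matching, conditions and the channel are
+left out to keep the sketch short.
+
+```lua
+-- Hypothetical helpers; the field names follow the debuggees table above.
+local function count_entries(t)
+    local n = 0
+    for _ in pairs(t) do n = n + 1 end
+    return n
+end
+
+local function matches(breakpoints, line)
+    for id, bp in pairs(breakpoints) do
+        -- Assumption: type 'l' is a line breakpoint, `value` is the line.
+        if bp.type == 'l' and bp.value == line then
+            return id
+        end
+    end
+    return nil
+end
+
+-- Returns the ID of the breakpoint that was hit, or nil to keep running.
+local function check_position(debuggees, fiber_id, line)
+    local entry = debuggees.fibers[fiber_id]
+    if entry then
+        local id = matches(entry.breakpoints, line)
+            or matches(debuggees.global_breakpoints, line)
+        if id then
+            entry.state = 'pending'
+            entry.current_breakpoint = id
+        end
+        return id
+    end
+    -- Not a debuggee yet: respect the limit before checking global breakpoints.
+    if count_entries(debuggees.fibers) >= debuggees.max_debuggee then
+        return nil
+    end
+    local id = matches(debuggees.global_breakpoints, line)
+    if id then
+        debuggees.fibers[fiber_id] = {
+            state = 'pending', current_breakpoint = id, breakpoints = {},
+        }
+    end
+    return id
+end
+
+local debuggees = {
+    max_debuggee = 1,
+    fibers = {},
+    global_breakpoints = { [7] = { type = 'l', value = 42 } },
+}
+local a = check_position(debuggees, 101, 42)  -- 7: fiber 101 becomes a debuggee
+local b = check_position(debuggees, 102, 42)  -- nil: the limit is reached
+```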
+
+#### DebugSRV behavior
+
+DebugSRV handles the input from the user and supports the following list of
+commands (as mentioned, it can be used from any interface, so in the general
+case commands are function calls):
+
+- ```break_info([fiber ID])``` - list all breakpoints with counts and
+  conditions; if a fiber ID is given, limit the output to that fiber
+- ```break_cond(<breakpoint id>, <condition>)``` - set a condition for the
+  breakpoint; the condition should be Lua code evaluating to a boolean value
+- ```break_ignore(<breakpoint id>, <count>)``` - ignore the given number of
+  breakpoint hits
+- ```break_delete(<breakpoint id>)``` - remove a breakpoint
+- ```step(<fiber ID>)``` - continue execution, stepping into the call
+- ```step_over(<fiber ID>)``` - continue execution until the next source line,
+  skipping calls
+- ```step_out(<fiber ID>)``` - continue execution until return from the current
+  function
+
+The functions above are common to many debuggers, with some tweaks to adapt
+them to fibers. The functions below are more specific, so let's get into some
+details:
+
+- ```set_max_debuggee(number)``` - set the number of fibers that can be
+  debugged simultaneously. It modifies ```debuggees.max_debuggee``` so that new
+  fibers will respect the limit. For example, if at some point of debugging
+  there are 5 debuggee fibers, the user can set this value to 3 - it will not
+  cause any problem, a new fiber will simply not become a debuggee if it meets
+  a global breakpoint.
+- ```debug_eval(<fiber ID>, <code>)``` - evaluate the code in the context of
+  the debuggee fiber if it is in the 'pending' state. The user can issue
+  ```debug_eval(113, function() return fiber.id() end)``` to receive 113 as a
+  result
+- ```break(<breakpoint description>, [fiber ID])``` - add a new breakpoint to
+  the fiber's breakpoint list, or to the global list if no fiber ID is provided
+- ```debug_start()``` - start a debug session: create the debuggees structure,
+  preserve the current debug hook in ```debuggees.preserved_hook``` and set
+  DebugHook as the current hook
+- ```debug_stop()``` - quit the debug session: reset the debug hook and clear
+  the debuggees structure
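+
+Preserving and restoring the hook in ```debug_start()``` and
+```debug_stop()``` can be done with the standard `debug.gethook` and
+`debug.sethook` functions. The sketch below is one possible shape, not the
+final implementation: the empty DebugHook body, the default `max_debuggee`
+value and the 'l' (line) hook mask are assumptions.
+
+```lua
+local debuggees  -- nil while no debug session is active
+
+-- Placeholder for the real DebugHook.
+local function DebugHook(event, line) end
+
+local function debug_start()
+    debuggees = {
+        max_debuggee = 1,
+        -- debug.gethook() returns the hook function, its mask and its count.
+        preserved_hook = { debug.gethook() },
+        fibers = {},
+        global_breakpoints = {},
+    }
+    debug.sethook(DebugHook, 'l')
+end
+
+local function debug_stop()
+    local saved = debuggees.preserved_hook
+    debuggees = nil
+    -- Restore whatever was installed before the session (possibly nothing).
+    debug.sethook(saved[1], saved[2], saved[3])
+end
+
+debug_start()
+local during = debug.gethook()
+debug_stop()
+local after = debug.gethook()
+-- during is DebugHook; after is the hook that was set before debug_start()
+```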
+
+
--
2.24.3 (Apple Git-128)