<div dir="ltr">I forgot to provide the GitHub link to the branch:<div><a href="https://github.com/tarantool/tarantool/tree/fckxorg/rfc-platform-profiler">https://github.com/tarantool/tarantool/tree/fckxorg/rfc-platform-profiler</a><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 12 Jul 2021 at 15:25, Maxim Kokryashkin <<a href="mailto:max.kokryashkin@gmail.com">max.kokryashkin@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">From: Maxim Kokryashkin <<a href="mailto:m.kokryashkin@tarantool.org" target="_blank">m.kokryashkin@tarantool.org</a>><br>
<br>
---<br>
doc/rfc/781-luajit-platform-profiler.md | 13 ++++++++++++-<br>
1 file changed, 12 insertions(+), 1 deletion(-)<br>
<br>
diff --git a/doc/rfc/781-luajit-platform-profiler.md b/doc/rfc/781-luajit-platform-profiler.md<br>
index fda3d535b..74132c2d4 100644<br>
--- a/doc/rfc/781-luajit-platform-profiler.md<br>
+++ b/doc/rfc/781-luajit-platform-profiler.md<br>
@@ -14,6 +14,17 @@ Currently, available options for profiling LuaJIT are not fine enough to get an<br>
<br>
To get a detailed perspective of platform performance, a more advanced profiler is needed. The desired profiler must be able to capture both guest and host stacks simultaneously, along with virtual machine states.<br>
<br>
+To see the difference, take a look at the flamegraphs below, generated by jit.p, perf, and the PoC of the proposed profiler.<br>
+### jit.p<br>
+![jit.p](<a href="https://i.imgur.com/sDZZDZx.png" rel="noreferrer" target="_blank">https://i.imgur.com/sDZZDZx.png</a>)<br>
+<br>
+### perf<br>
+![perf](<a href="https://i.imgur.com/DlKbFpo.png" rel="noreferrer" target="_blank">https://i.imgur.com/DlKbFpo.png</a>)<br>
+<br>
+### sysprof<br>
+![sysprof](<a href="https://i.imgur.com/Yf80MDE.png" rel="noreferrer" target="_blank">https://i.imgur.com/Yf80MDE.png</a>)<br>
+<br>
+<br>
## Detailed design<br>
<br>
The proposed approach is to extend existing profiler embedded into LuaJIT, so it will be able to capture host stack too. <br>
@@ -69,4 +80,4 @@ Another way to implement such a thing is to make perf to see guest stack. To do<br>
Stack unwinding from outside of the LuaJIT is the problem we didn’t manage to solve for today. There are different approaches to do this:<br>
- *Save rsp register value to rbp and preserve rbp.* However, LuaJIT uses rbp as a general-purpose register, and it is hard not to break everything trying to use it only for stack frames.<br>
- *Coordinated work of `jit.p` and perf.* This approach requires modifying perf the way it will send LuaJIT suspension signal, and after getting info about the host stack, it will receive information about the guest stack and join them. This solution is quite possible, but modified perf doesn't seem like a production-ready solution.<br>
-- *Dwarf unwinding*<br>
\ No newline at end of file<br>
+- *Dwarf unwinding*<br>
-- <br>
2.32.0<br>
<br>
</blockquote></div>