From: Maxim Kokryashkin via Tarantool-discussions
Reply-To: Maxim Kokryashkin
To: tarantool-discussions@dev.tarantool.org, imun@tarantool.org, skaplun@tarantool.org, sergos@tarantool.org
Cc: Maxim Kokryashkin
Date: Mon, 5 Jul 2021 12:01:47 +0300
Message-Id: <20210705090147.467734-1-max.kokryashkin@gmail.com>
Subject: [Tarantool-discussions] [PATCH] Add RFC for platform profiler

From: Maxim Kokryashkin

It has been proposed to implement a platform performance profiler several times by now, so this commit adds the document, which describes one of the possible implementations.

Github branch: https://github.com/tarantool/tarantool/tree/fckxorg/rfc-platform-profiler
Needed for: #781
See also: #4001
---
 doc/rfc/781-luajit-platform-profiler.md | 72 +++++++++++++++++++++++++
 1 file changed, 72 insertions(+)
 create mode 100644 doc/rfc/781-luajit-platform-profiler.md

diff --git a/doc/rfc/781-luajit-platform-profiler.md b/doc/rfc/781-luajit-platform-profiler.md
new file mode 100644
index 000000000..fda3d535b
--- /dev/null
+++ b/doc/rfc/781-luajit-platform-profiler.md
@@ -0,0 +1,72 @@
+# lua: system-wide profiler
+
+* **Status**: In progress
+* **Start date**: 02-07-2021
+* **Authors**: Mikhail Shishatskiy @Shishqa m.shishatskiy@tarantool.org, Maxim Kokryashkin @fckxorg m.kokryashkin@tarantool.org
+* **Issues**: [#781](https://github.com/tarantool/tarantool/issues/781)
+
+## Summary
+This document describes a platform profiler for LuaJIT, which is needed to obtain a complete view of platform performance. The existing LuaJIT profiler can only report virtual machine states and the guest stack. Hence, the document proposes extending the existing LuaJIT profiler so that it can gather stack traces from both C and Lua.
+
+## Background and motivation
+
+The currently available options for profiling LuaJIT are not detailed enough to understand platform performance. For example, perf can only show the host stack, so all Lua calls are seen as a single pcall. Conversely, the jit.p module provided with LuaJIT gives no information about the host stack.
+
+To get a detailed perspective of platform performance, a more advanced profiler is needed. The desired profiler must be able to capture both guest and host stacks simultaneously, along with virtual machine states.
+
+## Detailed design
+
+The proposed approach is to extend the existing profiler embedded into LuaJIT so that it can capture the host stack too.
+
+### Host stack
+
+The default sampling profiler implementation in LuaJIT, which can be seen [here](https://github.com/tarantool/luajit/blob/tarantool/src/lj_profile.c), follows this flow:
+```
+luaJIT_profile_start --> profile_timer_start
+
+...
+                                               |lock VM state
+[signal emitted] --> profile_signal_trigger: __|prepare args for a callback
+                                               |schedule callback execution
+                                               |unlock VM state
+...
+
+luaJIT_profile_stop --> profile_timer_stop
+```
+
+The callback scheduled by `profile_signal_trigger` can be used to dump the needed information, including the VM stack. However, even though the guest stack is still the same by the time the callback is executed, the host stack has already changed, so the final stack dump cannot be considered valid.
+
+Hence, to get a valid snapshot of both stacks, the dump should be done right when the signal is delivered, as is done [here](https://github.com/Shishqa/luajit/blob/c0da971640512696f5c166e8f2dc1ed982a8f451/src/profile/sysprof.c#L63).
+
+The host stack can be dumped with `backtrace(void**, int)`.
+
+### VM stack
+To dump the guest stack, we use an implementation similar to the one in [lj_debug_dumpstack](https://github.com/tarantool/luajit/blob/af889e4608e6eca495dd85e6161d8bcd7d3628e6/src/lj_debug.c#L580). However, the VM stack can sometimes be invalid because of this [bug](https://github.com/tarantool/luajit/blob/af889e4608e6eca495dd85e6161d8bcd7d3628e6/src/vm_x64.dasc#L4594). As the link shows, the VM state changes to LFUNC, and only after that does the stack reallocation take place. If the signal arrives in between, we get a segmentation fault. That issue is easy to fix, so the approach is suitable.
+
+### Symbol table
+
+Dumping function names with every stack dump is expensive, so instead we will dump a symbol table at the beginning. After that, it is sufficient to dump only a function's address.
However, some functions can be loaded and unloaded several times, and their addresses will differ each time. Hence, we will update the symbol table accordingly: to carry out the update, we will write a new symtab record into the file where the profiler stores its data.
+
+A symbol table looks like this (the same format as symtab in memprof):
+```
+ 1 byte        8 bytes                              8 bytes
+ _______________________________________________________________
+| type | address of function | function name | first line number|
+ ---------------------------------------------------------------
+```
+
+
+
+### Traces
+
+Traces are the real problem here because LuaJIT has no mechanism to unwind them, so we need to introduce our own. The basic idea is to place markers into the bytecode of a trace to indicate the start and the end of each function call, and to use them to unwind the whole call stack of a trace.
+
+A more specific description is needed.
+
+## Rationale and alternatives
+
+Another way to implement this is to make perf see the guest stack. To do so, we need to map virtual machine symbols (that functionality is already present in LuaJIT, see [link](https://github.com/tarantool/luajit/blob/d4e12d7ac28e3bc857d30971dd77deec66a67297/src/lj_trace.c#L96)) and make perf able to unwind the virtual machine stack.
+Unwinding the stack from outside LuaJIT is a problem we have not managed to solve so far. There are several approaches:
+- *Save rsp register value to rbp and preserve rbp.* However, LuaJIT uses rbp as a general-purpose register, and it is hard to reserve it for stack frames only without breaking everything.
+- *Coordinated work of `jit.p` and perf.* This approach requires modifying perf so that it sends LuaJIT a suspension signal and, after getting the host stack, receives the guest stack and joins the two.
This is feasible, but a modified perf does not look like a production-ready solution.
+- *DWARF unwinding*
\ No newline at end of file
-- 
2.31.1
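
The "Host stack" section of the RFC above proposes dumping the host stack right at the signal with `backtrace(void**, int)`. A minimal sketch of that idea, assuming glibc's `<execinfo.h>` and a `SIGPROF` timer signal, might look as follows. The names `on_sigprof`, `sampler_install`, `sampler_set_interval` and `MAX_HOST_FRAMES` are hypothetical and are not taken from LuaJIT or from the linked sysprof.c. Since `backtrace()` is not documented as async-signal-safe, the sketch calls it once up front so that any lazy library loading happens outside the handler.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <execinfo.h>   /* backtrace(): glibc extension (assumed available) */
#include <signal.h>
#include <string.h>
#include <sys/time.h>

#define MAX_HOST_FRAMES 64  /* hypothetical limit, not from the RFC */

/* Last captured host stack; a real profiler would stream it to its file. */
static void *host_frames[MAX_HOST_FRAMES];
static volatile sig_atomic_t host_depth;
static volatile sig_atomic_t samples_taken;

/* Runs at signal time, so the host stack is still the interrupted one. */
static void on_sigprof(int sig)
{
    (void)sig;
    host_depth = backtrace(host_frames, MAX_HOST_FRAMES);
    samples_taken++;
}

/* Install the SIGPROF handler; returns 0 on success, -1 on failure. */
int sampler_install(void)
{
    struct sigaction sa;
    void *warmup[1];

    /* The first backtrace() call may load libgcc; do it outside the handler. */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigprof;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPROF, &sa, NULL);
}

/* Deliver SIGPROF every interval_usec of consumed CPU time; 0 disarms. */
int sampler_set_interval(long interval_usec)
{
    struct itimerval tv;

    tv.it_interval.tv_sec = interval_usec / 1000000;
    tv.it_interval.tv_usec = interval_usec % 1000000;
    tv.it_value = tv.it_interval;
    return setitimer(ITIMER_PROF, &tv, NULL);
}
```

Calling `sampler_install()` followed by `sampler_set_interval(10000)` would request a sample for roughly every 10 ms of consumed CPU time, and `sampler_set_interval(0)` disarms the timer. The sketch only keeps the raw return addresses; resolving them to names is left to a symbol table, as the RFC describes.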