From: Maxim Kokryashkin via Tarantool-patches
To: tarantool-patches@dev.tarantool.org, imun@tarantool.org, skaplun@tarantool.org
Date: Fri, 4 Mar 2022 22:22:18 +0300
Message-Id: <20220304192219.1266071-2-m.kokryashkin@tarantool.org>
In-Reply-To: <20220304192219.1266071-1-m.kokryashkin@tarantool.org>
References: <20220304192219.1266071-1-m.kokryashkin@tarantool.org>
Subject: [Tarantool-patches] [PATCH luajit v5 1/2] memprof: extend symtab with C-symbols

This commit enriches memprof's symbol table with information about
C-symbols. The parser is updated correspondingly.

If there is a .symtab section or at least a .dynsym section in a shared
library, then the following data is stored in symtab for each symbol:

| SYMTAB_CFUNC | symbol address | symbol name |
     1 byte         8 bytes
     magic
     number

If none of those are present, then instead of a symbol name and its
address there will be the name and address of the shared library
containing that symbol.
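For reference, a minimal sketch of how one such record could be serialized. It assumes the 64-bit address goes on the wire as ULEB128 (the parser in this patch reads it with read_uleb128), and that the name is a ULEB128 length followed by the raw bytes; the latter is an assumption about the string encoding, not something this patch shows. All sketch_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag value, mirroring SYMTAB_CFUNC from this patch. */
#define SKETCH_SYMTAB_CFUNC ((uint8_t)1)

/* Write n as ULEB128 into out; return the number of bytes written. */
static size_t sketch_write_uleb128(uint8_t *out, uint64_t n)
{
  size_t i = 0;
  do {
    uint8_t byte = (uint8_t)(n & 0x7f);
    n >>= 7;
    if (n != 0)
      byte |= 0x80; /* More bytes follow. */
    out[i++] = byte;
  } while (n != 0);
  return i;
}

/*
** Serialize one C-symbol record: tag byte, address, name.
** Assumption: the name is length-prefixed and not NUL-terminated.
** The caller provides at least 1 + 10 + 10 + strlen(name) bytes.
*/
static size_t sketch_write_cfunc(uint8_t *out, uint64_t addr,
                                 const char *name)
{
  size_t len = strlen(name);
  size_t pos = 0;
  out[pos++] = SKETCH_SYMTAB_CFUNC;
  pos += sketch_write_uleb128(out + pos, addr);
  pos += sketch_write_uleb128(out + pos, (uint64_t)len);
  memcpy(out + pos, name, len);
  return pos + len;
}
```

With this layout a record for address 300 and name "foo" takes 7 bytes: one tag byte, two address bytes, one length byte and three name bytes.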
Part of tarantool/tarantool#5813
---

>> The following data is stored in event for each newly loaded symbol:
>> | (AEVENT_SYMTAB | ASOURCE_CFUNC) | symbol address | symbol name |
>>               1 byte                    8 bytes
>>               magic
>>               number
>
> What do you think of dumping so name too for convenience?
> For example, dump an empty so name stands for Tarantool|LuaJIT sources,
> but the other one is related to some other .so library that is loaded by
> a user.
> CC-ed Igor here.

I don't think it's helpful, since one of the key ideas is to be independent
from .so libs on client-side. However, I agree that some kind of a flag
marking .so files, which are not related to Tarantool sources, may provide
some help during debugging. Still, let's wait for Igor's opinion.

>> static const unsigned char ljs_header[] = {'l', 'j', 's', LJS_CURRENT_VERSION,
>> 0x0, 0x0, 0x0};
>
> Should we update LJS header due to the new format of the symtab stream?

I see no reason for that -- LJS version bump is sufficient.

 Makefile.original                             |   2 +-
 src/lj_memprof.c                              | 330 ++++++++++++++++++
 src/lj_memprof.h                              |   7 +-
 test/tarantool-tests/tools-utils-avl.test.lua |  59 ++++
 tools/CMakeLists.txt                          |   2 +
 tools/utils/avl.lua                           | 118 +++++++
 tools/utils/symtab.lua                        |  24 +-
 7 files changed, 537 insertions(+), 5 deletions(-)
 create mode 100644 test/tarantool-tests/tools-utils-avl.test.lua
 create mode 100644 tools/utils/avl.lua

diff --git a/Makefile.original b/Makefile.original
index 33dc2ed5..0c92df9e 100644
--- a/Makefile.original
+++ b/Makefile.original
@@ -100,7 +100,7 @@ FILES_JITLIB= bc.lua bcsave.lua dump.lua p.lua v.lua zone.lua \
 	      dis_x86.lua dis_x64.lua dis_arm.lua dis_arm64.lua \
 	      dis_arm64be.lua dis_ppc.lua dis_mips.lua dis_mipsel.lua \
 	      dis_mips64.lua dis_mips64el.lua vmdef.lua
-FILES_UTILSLIB= bufread.lua symtab.lua
+FILES_UTILSLIB= avl.lua bufread.lua symtab.lua
 FILES_MEMPROFLIB= parse.lua humanize.lua
 FILES_TOOLSLIB= memprof.lua
 FILE_TMEMPROF= luajit-parse-memprof
diff --git a/src/lj_memprof.c b/src/lj_memprof.c
index 2d779983..71c1da7f 100644
--- a/src/lj_memprof.c
+++ b/src/lj_memprof.c
@@ -8,11 +8,22 @@
 #define lj_memprof_c
 #define LUA_CORE
 
+#define _GNU_SOURCE
+
+#include
 #include
+#include
+#include
+#include
 
 #include "lj_arch.h"
 #include "lj_memprof.h"
 
+#if LUAJIT_OS != LUAJIT_OS_OSX
+#include
+#include
+#include
+#endif
 #if LJ_HASMEMPROF
 
 #include "lj_obj.h"
@@ -66,12 +77,327 @@ static void dump_symtab_trace(struct lj_wbuf *out, const GCtrace *trace)
 
 #endif
 
+#if LUAJIT_OS != LUAJIT_OS_OSX
+
+struct ghashtab_header {
+  uint32_t nbuckets;
+  uint32_t symoffset;
+  uint32_t bloom_size;
+  uint32_t bloom_shift;
+};
+
+uint32_t ghashtab_size(ElfW(Addr) ghashtab)
+{
+  /*
+  ** There is no easy way to get count of symbols in GNU hashtable, so the
+  ** only way to do this is to take highest possible non-empty bucket and
+  ** iterate through its symbols until the last chain is over.
+  */
+  uint32_t last_entry = 0;
+  uint32_t *cur_bucket = NULL;
+
+  const uint32_t *chain = NULL;
+  struct ghashtab_header *header = (struct ghashtab_header*)ghashtab;
+  /*
+  ** sizeof(size_t) returns 8, if compiled with 64-bit compiler, and 4 if
+  ** compiled with 32-bit compiler. It is the best option to determine which
+  ** kind of CPU we are running on.
+  */
+  const char *buckets = (char*)ghashtab + sizeof(struct ghashtab_header) +
+                        sizeof(size_t) * header->bloom_size;
+
+  cur_bucket = (uint32_t*)buckets;
+  for (uint32_t i = 0; i < header->nbuckets; ++i) {
+    if (last_entry < *cur_bucket)
+      last_entry = *cur_bucket;
+    cur_bucket++;
+  }
+
+  if (last_entry < header->symoffset)
+    return header->symoffset;
+
+  chain = (uint32_t*)(buckets + sizeof(uint32_t) * header->nbuckets);
+  /* The chain ends with the lowest bit set to 1. */
+  while (!(chain[last_entry - header->symoffset] & 1)) {
+    last_entry++;
+  }
+
+  return ++last_entry;
+}
+
+struct symbol_resolver_conf {
+  struct lj_wbuf *buf;
+  const uint8_t header;
+};
+
+void write_c_symtab(ElfW(Sym*) sym, char *strtab, ElfW(Addr) so_addr,
+                    size_t sym_cnt, const uint8_t header, struct lj_wbuf *buf) {
+  char *sym_name = NULL;
+
+  /*
+  ** Index 0 in ELF symtab is used to
+  ** represent undefined symbols. Hence, we can just
+  ** start with index 1.
+  **
+  ** For more information, see:
+  ** https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-79797.html
+  */
+
+  for (ElfW(Word) sym_index = 1; sym_index < sym_cnt; sym_index++) {
+    /*
+    ** ELF32_ST_TYPE and ELF64_ST_TYPE are the same, so we can use
+    ** ELF32_ST_TYPE for both 64-bit and 32-bit ELFs.
+    **
+    ** For more, see https://github.com/torvalds/linux/blob/9137eda53752ef73148e42b0d7640a00f1bc96b1/include/uapi/linux/elf.h#L135
+    */
+    if (ELF32_ST_TYPE(sym[sym_index].st_info) == STT_FUNC) {
+      if (sym[sym_index].st_name == 0)
+        /* Symbol has no name. */
+        continue;
+      sym_name = &strtab[sym[sym_index].st_name];
+      lj_wbuf_addbyte(buf, header);
+      lj_wbuf_addu64(buf, sym[sym_index].st_value + so_addr);
+      lj_wbuf_addstring(buf, sym_name);
+    }
+  }
+}
+
+int dump_sht_symtab(const char *elf_name, struct lj_wbuf *buf,
+                    const uint8_t header, const ElfW(Addr) so_addr) {
+  int status = 0;
+
+  char *strtab = NULL;
+  ElfW(Shdr*) section_headers = NULL;
+  ElfW(Sym*) sym = NULL;
+  ElfW(Ehdr) elf_header = {};
+
+  ElfW(Off) sym_off = 0;
+  ElfW(Off) strtab_off = 0;
+
+  size_t sym_cnt = 0;
+  size_t symtab_size = 0;
+  size_t strtab_size = 0;
+  size_t strtab_index = 0;
+
+  size_t shoff = 0; /* Section headers offset. */
+  size_t shnum = 0; /* Section headers number. */
+  size_t shentsize = 0; /* Section header entry size. */
+
+  FILE *elf_file = fopen(elf_name, "rb");
+
+  if (elf_file == NULL)
+    return -1;
+
+  fread(&elf_header, sizeof(elf_header), 1, elf_file);
+  if (ferror(elf_file) != 0)
+    goto error;
+  if (memcmp(elf_header.e_ident, ELFMAG, SELFMAG) != 0)
+    /* Not a valid ELF file. */
+    goto error;
+
+  shoff = elf_header.e_shoff;
+  shnum = elf_header.e_shnum;
+  shentsize = elf_header.e_shentsize;
+
+  if (shoff == 0 || shnum == 0 || shentsize == 0)
+    /* No sections in ELF. */
+    goto error;
+
+  /*
+  ** Memory occupied by section headers is unlikely to be more than 160B, but
+  ** 32-bit and 64-bit ELF files may have sections of different sizes and some
+  ** of the sections may be duplicated, so we need to take that into account.
+  */
+  section_headers = calloc(shnum, shentsize);
+  if (section_headers == NULL)
+    goto error;
+
+  if (fseek(elf_file, shoff, SEEK_SET) != 0)
+    goto error;
+
+  fread(section_headers, shentsize, shnum, elf_file);
+  if (ferror(elf_file) != 0)
+    goto error;
+
+  for (size_t header_index = 0; header_index < shnum; ++header_index) {
+    if(section_headers[header_index].sh_type == SHT_SYMTAB) {
+      sym_off = section_headers[header_index].sh_offset;
+      symtab_size = section_headers[header_index].sh_size;
+      sym_cnt = symtab_size / section_headers[header_index].sh_entsize;
+
+      strtab_index = section_headers[header_index].sh_link;
+
+      strtab_off = section_headers[strtab_index].sh_offset;
+      strtab_size = section_headers[strtab_index].sh_size;
+      break;
+    }
+  }
+
+  if (sym_off == 0 || strtab_off == 0 || sym_cnt == 0)
+    goto error;
+
+  /* Load strtab and symtab into memory. */
+  sym = calloc(sym_cnt, sizeof(ElfW(Sym)));
+  if (sym == NULL)
+    goto error;
+
+  strtab = calloc(strtab_size, sizeof(char));
+  if (strtab == NULL)
+    goto error;
+
+  if (fseek(elf_file, sym_off, SEEK_SET) != 0)
+    goto error;
+
+  fread(sym, sizeof(ElfW(Sym)), sym_cnt, elf_file);
+  if (ferror(elf_file) != 0)
+    goto error;
+
+  if (fseek(elf_file, strtab_off, SEEK_SET) != 0)
+    goto error;
+
+  fread(strtab, sizeof(char), strtab_size, elf_file);
+  if (ferror(elf_file) != 0)
+    goto error;
+
+  write_c_symtab(sym, strtab, so_addr, sym_cnt, header, buf);
+
+  goto end;
+
+error:
+  status = -1;
+
+end:
+  free(sym);
+  free(strtab);
+  free(section_headers);
+  fclose(elf_file);
+
+  return status;
+}
+
+int dump_dyn_symtab(struct dl_phdr_info *info, const uint8_t header,
+                    struct lj_wbuf *buf) {
+  for (size_t header_index = 0; header_index < info->dlpi_phnum; ++header_index) {
+    if (info->dlpi_phdr[header_index].p_type == PT_DYNAMIC) {
+      ElfW(Dyn*) dyn = NULL;
+      ElfW(Sym*) sym = NULL;
+      ElfW(Word*) hashtab = NULL;
+      ElfW(Addr) ghashtab = 0;
+      ElfW(Word) sym_cnt = 0;
+
+      char *strtab = 0;
+
+      dyn = (ElfW(Dyn)*)(info->dlpi_addr + info->dlpi_phdr[header_index].p_vaddr);
+
+      for(; dyn->d_tag != DT_NULL; dyn++) {
+        switch(dyn->d_tag) {
+          case DT_HASH:
+            hashtab = (ElfW(Word*))dyn->d_un.d_ptr;
+            break;
+          case DT_GNU_HASH:
+            ghashtab = dyn->d_un.d_ptr;
+            break;
+          case DT_STRTAB:
+            strtab = (char*)dyn->d_un.d_ptr;
+            break;
+          case DT_SYMTAB:
+            sym = (ElfW(Sym*))dyn->d_un.d_ptr;
+            break;
+          default:
+            break;
+        }
+      }
+
+      if ((hashtab == NULL && ghashtab == 0)
+          || strtab == NULL || sym == NULL)
+        /* Not enough data to resolve symbols. */
+        return 1;
+
+      /*
+      ** A hash table consists of Elf32_Word or Elf64_Word objects that provide for
+      ** symbol table access. Hash table has the following organization:
+      ** +-------------------+
+      ** |      nbucket      |
+      ** +-------------------+
+      ** |      nchain       |
+      ** +-------------------+
+      ** |     bucket[0]     |
+      ** |        ...        |
+      ** | bucket[nbucket-1] |
+      ** +-------------------+
+      ** |     chain[0]      |
+      ** |        ...        |
+      ** |  chain[nchain-1]  |
+      ** +-------------------+
+      ** Chain table entries parallel the symbol table. The number of symbol
+      ** table entries should equal nchain, so symbol table indexes also select
+      ** chain table entries. Since the chain array values are indexes for not only
+      ** the chain array itself, but also for the symbol table, the chain array must
+      ** be the same size as the symbol table. This makes nchain equal to the length
+      ** of the symbol table.
+      **
+      ** For more, see https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-48031.html
+      */
+      sym_cnt = ghashtab == 0 ? hashtab[1] : ghashtab_size(ghashtab);
+      write_c_symtab(sym, strtab, info->dlpi_addr, sym_cnt, header, buf);
+      return 0;
+    }
+  }
+
+  return 1;
+}
+
+int resolve_symbolnames(struct dl_phdr_info *info, size_t info_size, void *data)
+{
+  struct symbol_resolver_conf *conf = data;
+  const uint8_t header = conf->header;
+  struct lj_wbuf *buf = conf->buf;
+
+  UNUSED(info_size);
+
+  /* Skip vDSO library. */
+  if (info->dlpi_addr == getauxval(AT_SYSINFO_EHDR))
+    return 0;
+
+  /*
+  ** Main way: try to open ELF and read SHT_SYMTAB, SHT_STRTAB and SHT_HASH
+  ** sections from it.
+  */
+  if (dump_sht_symtab(info->dlpi_name, buf, header, info->dlpi_addr) == 0) {
+    return 0;
+  }
+
+  /* First fallback: dump functions only from PT_DYNAMIC segment. */
+  if(dump_dyn_symtab(info, header, buf) == 0) {
+    return 0;
+  }
+
+  /*
+  ** Last resort: dump ELF name and address to show .so name for its functions
+  ** in memprof output.
+  */
+  lj_wbuf_addbyte(buf, header);
+  lj_wbuf_addu64(buf, info->dlpi_addr);
+  lj_wbuf_addstring(buf, info->dlpi_name);
+
+  return 0;
+}
+
+#endif
+
 static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
 {
   const GCRef *iter = &g->gc.root;
   const GCobj *o;
   const size_t ljs_header_len = sizeof(ljs_header) / sizeof(ljs_header[0]);
 
+#if LUAJIT_OS != LUAJIT_OS_OSX
+  struct symbol_resolver_conf conf = {
+    out,
+    SYMTAB_CFUNC,
+  };
+#endif
+
   /* Write prologue. */
   lj_wbuf_addn(out, ljs_header, ljs_header_len);
 
@@ -95,6 +421,10 @@ static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
     iter = &o->gch.nextgc;
   }
 
+#if LUAJIT_OS != LUAJIT_OS_OSX
+  /* Write symbols. */
+  dl_iterate_phdr(resolve_symbolnames, &conf);
+#endif
   lj_wbuf_addbyte(out, SYMTAB_FINAL);
 }
 
diff --git a/src/lj_memprof.h b/src/lj_memprof.h
index 395fb429..0327a205 100644
--- a/src/lj_memprof.h
+++ b/src/lj_memprof.h
@@ -25,13 +25,15 @@
 ** prologue := 'l' 'j' 's' version reserved
 ** version :=
 ** reserved :=
-** sym := sym-lua | sym-trace | sym-final
+** sym := sym-lua | sym-cfunc | sym-trace | sym-final
 ** sym-lua := sym-header sym-addr sym-chunk sym-line
 ** sym-trace := sym-header trace-no trace-addr sym-addr sym-line
 ** sym-header :=
 ** sym-addr :=
 ** sym-chunk := string
 ** sym-line :=
+** sym-cfunc := sym-header sym-addr sym-name
+** sym-name := string
 ** sym-final := sym-header
 ** trace-no :=
 ** trace-addr :=
@@ -54,7 +56,8 @@
 */
 
 #define SYMTAB_LFUNC ((uint8_t)0)
-#define SYMTAB_TRACE ((uint8_t)1)
+#define SYMTAB_CFUNC ((uint8_t)1)
+#define SYMTAB_TRACE ((uint8_t)2)
 #define SYMTAB_FINAL ((uint8_t)0x80)
 
 #define LJM_CURRENT_FORMAT_VERSION 0x02
diff --git a/test/tarantool-tests/tools-utils-avl.test.lua b/test/tarantool-tests/tools-utils-avl.test.lua
new file mode 100644
index 00000000..17cc7a85
--- /dev/null
+++ b/test/tarantool-tests/tools-utils-avl.test.lua
@@ -0,0 +1,59 @@
+local avl = require "utils.avl"
+local tap = require("tap")
+
+local test = tap.test("tools-utils-avl")
+test:plan(7)
+
+local function traverse(node, result)
+  if node ~= nil then
+    table.insert(result, node.key)
+    traverse(node.left, result)
+    traverse(node.right, result)
+  end
+  return result
+end
+
+local function batch_insert(root, values)
+  for i = 1, #values do
+    root = avl.insert(root, values[i])
+  end
+
+  return root
+end
+
+local function compare(arr1, arr2)
+  for i, v in pairs(arr1) do
+    if v ~= arr2[i] then
+      return false
+    end
+  end
+  return true
+end
+
+-- 1L rotation test.
+local root = batch_insert(nil, {1, 2, 3})
+test:ok(compare(traverse(root, {}), {2, 1, 3}))
+
+-- 1R rotation test.
+root = batch_insert(nil, {3, 2, 1})
+test:ok(compare(traverse(root, {}), {2, 1, 3}))
+
+-- 2L rotation test.
+root = batch_insert(nil, {1, 3, 2})
+test:ok(compare(traverse(root, {}), {2, 1, 3}))
+
+-- 2R rotation test.
+root = batch_insert(nil, {3, 1, 2})
+test:ok(compare(traverse(root, {}), {2, 1, 3}))
+
+-- Exact upper bound.
+test:ok(avl.upper_bound(root, 1) == 1)
+
+-- No upper bound.
+test:ok(avl.upper_bound(root, -10) == nil)
+
+-- Not exact upper bound.
+test:ok(avl.upper_bound(root, 2.75) == 2)
+
+
+
diff --git a/tools/CMakeLists.txt b/tools/CMakeLists.txt
index 61830e44..c6803d00 100644
--- a/tools/CMakeLists.txt
+++ b/tools/CMakeLists.txt
@@ -30,6 +30,7 @@ else()
     memprof/humanize.lua
     memprof/parse.lua
     memprof.lua
+    utils/avl.lua
     utils/bufread.lua
     utils/symtab.lua
   )
@@ -46,6 +47,7 @@ else()
     COMPONENT tools-parse-memprof
   )
   install(FILES
+      ${CMAKE_CURRENT_SOURCE_DIR}/utils/avl.lua
       ${CMAKE_CURRENT_SOURCE_DIR}/utils/bufread.lua
       ${CMAKE_CURRENT_SOURCE_DIR}/utils/symtab.lua
     DESTINATION ${LUAJIT_DATAROOTDIR}/utils
diff --git a/tools/utils/avl.lua b/tools/utils/avl.lua
new file mode 100644
index 00000000..98c15bd7
--- /dev/null
+++ b/tools/utils/avl.lua
@@ -0,0 +1,118 @@
+local math = require'math'
+
+local M = {}
+local max = math.max
+
+local function create_node(key, value)
+  return {
+    key = key,
+    value = value,
+    left = nil,
+    right = nil,
+    height = 1,
+  }
+end
+
+local function height(node)
+  if node == nil then
+    return 0
+  end
+  return node.height
+end
+
+local function update_height(node)
+  node.height = 1 + max(height(node.left), height(node.right))
+end
+
+local function get_balance(node)
+  if node == nil then
+    return 0
+  end
+  return height(node.left) - height(node.right)
+end
+
+local function rotate_left(node)
+  local r_subtree = node.right;
+  local rl_subtree = r_subtree.left;
+
+  r_subtree.left = node;
+  node.right = rl_subtree;
+
+  update_height(node)
+  update_height(r_subtree)
+
+  return r_subtree;
+end
+
+local function rotate_right(node)
+  local l_subtree = node.left
+  local lr_subtree = l_subtree.right;
+
+  l_subtree.right = node;
+  node.left = lr_subtree;
+
+  update_height(node)
+  update_height(l_subtree)
+
+  return l_subtree;
+end
+
+local function rebalance(node, key)
+  local balance = get_balance(node)
+
+  if -1 <= balance and balance <=1 then
+    return node
+  end
+
+  if balance > 1 and key < node.left.key then
+    return rotate_right(node)
+  elseif balance < -1 and key > node.right.key then
+    return rotate_left(node)
+  elseif balance > 1 and key > node.left.key then
+    node.left = rotate_left(node.left)
+    return rotate_right(node)
+  elseif balance < -1 and key < node.right.key then
+    node.right = rotate_right(node.right)
+    return rotate_left(node)
+  end
+end
+
+function M.insert(node, key, value)
+  if node == nil then
+    return create_node(key, value)
+  end
+
+  if key < node.key then
+    node.left = M.insert(node.left, key, value)
+  elseif key > node.key then
+    node.right = M.insert(node.right, key, value)
+  else
+    node.key = key
+    node.value = value
+  end
+
+  update_height(node)
+  return rebalance(node, key)
+end
+
+function M.upper_bound(node, key)
+  if node == nil then
+    return nil, nil
+  end
+  -- Explicit match.
+  if key == node.key then
+    return node.key, node.value
+  elseif key < node.key then
+    return M.upper_bound(node.left, key)
+  elseif key > node.key then
+    local right_key, value = M.upper_bound(node.right, key)
+    right_key = right_key or node.key
+    value = value or node.value
+
+    return right_key, value
+  end
+end
+
+
+return M
+
diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
index c7fcf77c..aa66269c 100644
--- a/tools/utils/symtab.lua
+++ b/tools/utils/symtab.lua
@@ -6,6 +6,8 @@
 
 local bit = require "bit"
 
+local avl = require "utils.avl"
+
 local band = bit.band
 local string_format = string.format
 
@@ -15,7 +17,8 @@ local LJS_EPILOGUE_HEADER = 0x80
 local LJS_SYMTYPE_MASK = 0x03
 
 local SYMTAB_LFUNC = 0
-local SYMTAB_TRACE = 1
+local SYMTAB_CFUNC = 1
+local SYMTAB_TRACE = 2
 
 local M = {}
 
@@ -50,15 +53,27 @@ local function parse_sym_trace(reader, symtab)
   }
 end
 
+-- Parse a single entry in a symtab: .so library
+local function parse_sym_cfunc(reader, symtab)
+  local addr = reader:read_uleb128()
+  local name = reader:read_string()
+
+  symtab.cfunc = avl.insert(symtab.cfunc, addr, {
+    name = name
+  })
+end
+
 local parsers = {
   [SYMTAB_LFUNC] = parse_sym_lfunc,
   [SYMTAB_TRACE] = parse_sym_trace,
+  [SYMTAB_CFUNC] = parse_sym_cfunc
 }
 
 function M.parse(reader)
   local symtab = {
     lfunc = {},
     trace = {},
+    cfunc = nil,
   }
   local magic = reader:read_octets(3)
   local version = reader:read_octets(1)
@@ -93,7 +108,6 @@ function M.parse(reader)
       parsers[sym_type](reader, symtab)
     end
   end
-
   return symtab
 end
 
@@ -135,6 +149,12 @@ function M.demangle(symtab, loc)
     return string_format("%s:%d", symtab.lfunc[addr].source, loc.line)
   end
 
+  local key, value = avl.upper_bound(symtab.cfunc, addr)
+
+  if key then
+    return string_format("%s:%#x", value.name, key)
+  end
+
   return string_format("CFUNC %#x", addr)
 end
 
--
2.35.1
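
For reference, a note on the lookup M.demangle relies on. As the tests in this patch exercise it, avl.upper_bound returns the greatest key that does not exceed the query (upper_bound(root, 2.75) yields 2, upper_bound(root, -10) yields nil), so an address inside a function resolves to that function's start address. Below is a minimal C sketch of the same lookup over a sorted array rather than an AVL tree; the sketch_* names are hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol entry: start address and name. */
struct sketch_sym {
  uint64_t addr;
  const char *name;
};

/*
** Return the entry with the greatest addr that is <= key, or NULL if
** every entry starts above key. syms must be sorted by addr, ascending.
*/
static const struct sketch_sym *sketch_lookup(const struct sketch_sym *syms,
                                              size_t n, uint64_t key)
{
  size_t lo = 0;
  size_t hi = n;
  /* Binary search for the first entry whose addr is > key. */
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (syms[mid].addr <= key)
      lo = mid + 1;
    else
      hi = mid;
  }
  /* The entry just before it, if any, is the answer. */
  return lo == 0 ? NULL : &syms[lo - 1];
}
```

The tree in the patch gives the same answer while also allowing cheap incremental inserts as symtab entries are parsed; a sorted array would only fit if all entries were collected first.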