From mboxrd@z Thu Jan 1 00:00:00 1970
To: tarantool-patches@dev.tarantool.org, Sergey Kaplun , max.kokryashkin@gmail.com
Date: Tue, 25 Jul 2023 19:36:24 +0300
Message-Id: <6426e58a9a72691ccffc84001c21e363c8da6312.1690300762.git.sergeyb@tarantool.org>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH 1/2] Fix embedded bytecode loader.
List-Id: Tarantool development patches
From: Sergey Bronnikov via Tarantool-patches
Reply-To: Sergey Bronnikov

From: sergeyb@tarantool.org

(cherry-picked from commit 820339960123dc78a7ce03edf53fcf4fdae0e55d)

The original problem is specific to x32 and is as follows: when a chunk
with a bytecode library is loaded into memory at an address higher than
0x80000100, LexState->pe, which holds the address of the end of the
bytecode chunk in memory, wraps around and becomes smaller than
LexState->p, which holds the address of the beginning of the bytecode
chunk in memory.

In bcread_fill(), called by bcread_want(), memcpy() is then called with
a very large size, which causes a bus error on x86 and a segmentation
fault on ARM Android.

The problem cannot be reproduced on the platforms supported by Tarantool
(ARM64, x86_64), so the test does not fail without the patch and covers
the patch only partially.

Sergey Bronnikov:
* added the description
---
 src/lj_bcread.c | 10 +++++-----
 src/lj_lex.c    |  6 ++++++
 src/lj_lex.h    |  1 +
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/src/lj_bcread.c b/src/lj_bcread.c
index f6c7ad25..315ad4d8 100644
--- a/src/lj_bcread.c
+++ b/src/lj_bcread.c
@@ -79,6 +79,7 @@ static LJ_NOINLINE void bcread_fill(LexState *ls, MSize len, int need)
       ls->c = -1; /* Only bad if we get called again. */
       break;
     }
+    if (sz >= LJ_MAX_BUF - n) lj_err_mem(ls->L);
     if (n) { /* Append to buffer. */
       n += (MSize)sz;
       p = lj_buf_need(&ls->sb, n < len ? len : n);
@@ -90,20 +91,20 @@ static LJ_NOINLINE void bcread_fill(LexState *ls, MSize len, int need)
       ls->p = buf;
       ls->pe = buf + sz;
     }
-  } while (ls->p + len > ls->pe);
+  } while ((MSize)(ls->pe - ls->p) < len);
 }
 
 /* Need a certain number of bytes. */
 static LJ_AINLINE void bcread_need(LexState *ls, MSize len)
 {
-  if (LJ_UNLIKELY(ls->p + len > ls->pe))
+  if (LJ_UNLIKELY((MSize)(ls->pe - ls->p) < len))
     bcread_fill(ls, len, 1);
 }
 
 /* Want to read up to a certain number of bytes, but may need less. */
 static LJ_AINLINE void bcread_want(LexState *ls, MSize len)
 {
-  if (LJ_UNLIKELY(ls->p + len > ls->pe))
+  if (LJ_UNLIKELY((MSize)(ls->pe - ls->p) < len))
     bcread_fill(ls, len, 0);
 }
 
@@ -463,8 +464,7 @@ GCproto *lj_bcread(LexState *ls)
     setprotoV(L, L->top, pt);
     incr_top(L);
   }
-  if ((int32_t)(2*(uint32_t)(ls->pe - ls->p)) > 0 ||
-      L->top-1 != bcread_oldtop(L, ls))
+  if ((ls->pe != ls->p && !ls->endmark) || L->top-1 != bcread_oldtop(L, ls))
     bcread_error(ls, LJ_ERR_BCBAD);
   /* Pop off last prototype. */
   L->top--;
diff --git a/src/lj_lex.c b/src/lj_lex.c
index 52856912..82e4ba6f 100644
--- a/src/lj_lex.c
+++ b/src/lj_lex.c
@@ -48,6 +48,11 @@ static LJ_NOINLINE LexChar lex_more(LexState *ls)
   size_t sz;
   const char *p = ls->rfunc(ls->L, ls->rdata, &sz);
   if (p == NULL || sz == 0) return LEX_EOF;
+  if (sz >= LJ_MAX_BUF) {
+    if (sz != ~(size_t)0) lj_err_mem(ls->L);
+    sz = ~(uintptr_t)0 - (uintptr_t)p;
+    ls->endmark = 1;
+  }
   ls->pe = p + sz;
   ls->p = p + 1;
   return (LexChar)(uint8_t)p[0];
@@ -406,6 +411,7 @@ int lj_lex_setup(lua_State *L, LexState *ls)
   ls->lookahead = TK_eof; /* No look-ahead token. */
   ls->linenumber = 1;
   ls->lastline = 1;
+  ls->endmark = 0;
   lex_next(ls); /* Read-ahead first char. */
   if (ls->c == 0xef && ls->p + 2 <= ls->pe && (uint8_t)ls->p[0] == 0xbb &&
       (uint8_t)ls->p[1] == 0xbf) { /* Skip UTF-8 BOM (if buffered). */
diff --git a/src/lj_lex.h b/src/lj_lex.h
index 33fa8657..38d28533 100644
--- a/src/lj_lex.h
+++ b/src/lj_lex.h
@@ -73,6 +73,7 @@ typedef struct LexState {
   BCInsLine *bcstack; /* Stack for bytecode instructions/line numbers. */
   MSize sizebcstack; /* Size of bytecode stack. */
   uint32_t level; /* Syntactical nesting level. */
+  int endmark; /* Trust bytecode end marker, even if not at EOF. */
 } LexState;
 
 LJ_FUNC int lj_lex_setup(lua_State *L, LexState *ls);
-- 
2.34.1