From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp60.i.mail.ru (smtp60.i.mail.ru [217.69.128.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 3173B45C304 for ; Thu, 17 Dec 2020 12:42:51 +0300 (MSK) References: <20201215142527.560937-1-void@tarantool.org> From: Leonid Vasiliev Message-ID: Date: Thu, 17 Dec 2020 12:41:48 +0300 MIME-Version: 1.0 In-Reply-To: <20201215142527.560937-1-void@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sergey Nikiforov , tarantool-patches@dev.tarantool.org Cc: Vladislav Shpilevoy Hi! Thank you for the patch. Generally LGTM. See some comments below: According to https://github.com/tarantool/tarantool/wiki/Code-review-procedure#commit-message "Description is what the patch does, started from lowercase letter, without a dot in the end, in the imperative mood." ("properly..."). I could be wrong, but it seems like is not written in an imperative mood. On 15.12.2020 17:25, Sergey Nikiforov wrote: > Not all invalid characters were ignored by base64 decoder > causing data corruption and reads beyond decode table > (faults under ASAN). > > Added corresponding check into base64 unit test. > > Fixes: #5627 > --- > > Branch: https://github.com/tarantool/tarantool/tree/void234/gh-5627-fix-base64-invalid-chars-processing > Issue: https://github.com/tarantool/tarantool/issues/5627 > > test/unit/base64.c | 23 ++++++++++++++++++++++- > test/unit/base64.result | 5 ++++- > third_party/base64.c | 3 ++- > 3 files changed, 28 insertions(+), 3 deletions(-) > > diff --git a/test/unit/base64.c b/test/unit/base64.c > index ada497adf..c0f53a5e1 100644 > --- a/test/unit/base64.c > +++ b/test/unit/base64.c > @@ -58,9 +58,28 @@ base64_nowrap_test(const char *str) > base64_test(str, BASE64_NOWRAP, symbols, lengthof(symbols)); > } > > +static void > +base64_invalid_chars_test(void) > +{ > + /* Upper bit must be cleared */ > + const char invalid_data[] = { '\x7b', '\x7c', '\x7d', '\x7e' }; > + char outbuf[8]; > + > + plan(1); Usually `plan ()` is called as the first call in a function. It's just easier to see how many checks there will be. I don't know any rule about this. So, it's up to you. > + > + /* Invalid chars should be ignored, not decoded into garbage */ > + is(base64_decode(invalid_data, sizeof(invalid_data), > + outbuf, sizeof(outbuf)), > + 0, "ignoring invalid chars"); > + > + check_plan(); > +} > + > int main(int argc, char *argv[]) > { > - plan(28); > + plan(28 > + + 1 /* invalid chars test */ > + ); I agree with Vlad. Why `+ 1` and not just 29? > header(); > > const char *option_tests[] = { > @@ -78,6 +97,8 @@ int main(int argc, char *argv[]) > base64_nowrap_test(option_tests[i]); > } > > + base64_invalid_chars_test(); > + > footer(); > return check_plan(); > } > diff --git a/test/unit/base64.result b/test/unit/base64.result > index cd1f2b3f6..3bc2c2275 100644 > --- a/test/unit/base64.result > +++ b/test/unit/base64.result > @@ -1,4 +1,4 @@ > -1..28 > +1..29 > *** main *** > 1..3 > ok 1 - length > @@ -175,4 +175,7 @@ ok 27 - subtests > ok 3 - decode length ok > ok 4 - encode/decode > ok 28 - subtests > + 1..1 > + ok 1 - ignoring invalid chars > +ok 29 - subtests > *** main: done *** > diff --git a/third_party/base64.c b/third_party/base64.c > index 8ecab23eb..7c69315ea 100644 > --- a/third_party/base64.c > +++ b/third_party/base64.c > @@ -222,7 +222,8 @@ base64_decode_value(int value) > 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, > 44, 45, 46, 47, 48, 49, 50, 51 > }; > - static const int decoding_size = sizeof(decoding); > + static const int decoding_size = > + sizeof(decoding) / sizeof(decoding[0]); > int codepos = value; > codepos -= 43; > if (codepos < 0 || codepos >= decoding_size) >