From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 67FB045C305 for ; Fri, 4 Dec 2020 00:35:46 +0300 (MSK) References: <20201201163048.95601-1-void@tarantool.org> From: Vladislav Shpilevoy Message-ID: <3b5d3306-2756-f831-af15-7227f411e196@tarantool.org> Date: Thu, 3 Dec 2020 22:35:44 +0100 MIME-Version: 1.0 In-Reply-To: <20201201163048.95601-1-void@tarantool.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH] Fix base64 decoder output buffer overrun (reads) List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sergey Nikiforov , tarantool-patches@dev.tarantool.org Hi! Thanks for the patch! I recommend you to read this document: https://github.com/tarantool/tarantool/wiki/Code-review-procedure See 5 comments below. 1. Please, use subsystem prefix in the commit title. In your case it should be 'base64: ...'. On 01.12.2020 17:30, Sergey Nikiforov wrote: > It also caused data corruption. 2. What do you mean 'also'? What did it cause besides data corruption? > Also: > Fixed read access beyond decode table (noticed along the way). > Minimized number of condition checks in internal loops (performance). 3. Please, never mix unrelated changed into one commit. That complicates the review; makes it harder to cherry-pick things to the older branches; ruins git history; and you can introduce a new bug while doing 'refactoring'. Btw, can you prove your optimizations actually do any notable impact on the performance? Do you have numbers showing that it is worth optimizing? > Fixes: #3069 > --- 4. See https://github.com/tarantool/tarantool/wiki/Code-review-procedure#sending-the-patch. Please, put here issue and branch links. I don't know the branch name, and I don't see it in `git branch -a | grep 3069`. So I can't check if the patch works. > third_party/base64.c | 57 ++++++++++++++++++++++++++++++-------------- > 1 file changed, 39 insertions(+), 18 deletions(-) > > diff --git a/third_party/base64.c b/third_party/base64.c > index 8ecab23eb..ab644df22 100644 > --- a/third_party/base64.c > +++ b/third_party/base64.c > @@ -222,7 +222,8 @@ base64_decode_value(int value) > 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, > 44, 45, 46, 47, 48, 49, 50, 51 > }; > - static const int decoding_size = sizeof(decoding); > + static const int decoding_size = > + sizeof(decoding) / sizeof(decoding[0]); 5. Since this is a bugfix, it should be covered with a test when possible. Here it looks like it is possible. If 'value' was big enough, it would access some memory out of the array, and would return garbage instead of an error. I believe it shouldn't be hard to create a proper unit test on that. Everything below is just refactoring, hopefully without new bugs. But I recommend to remove it, since it is not related to the bug anyhow, and hardly makes performance any notable better. Unless you have numbers. If it really makes difference, please, extract these optimizations into a new commit on a different branch so as we could handle it out of the 3069 bug context.