From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 1514045C304 for ; Tue, 15 Dec 2020 17:20:03 +0300 (MSK) References: <20201201163048.95601-1-void@tarantool.org> <3b5d3306-2756-f831-af15-7227f411e196@tarantool.org> From: Sergey Nikiforov Message-ID: <179f6987-6cfc-8a16-e420-08e56149480e@tarantool.org> Date: Tue, 15 Dec 2020 17:20:01 +0300 MIME-Version: 1.0 In-Reply-To: <3b5d3306-2756-f831-af15-7227f411e196@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH] Fix base64 decoder output buffer overrun (reads) List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladislav Shpilevoy , tarantool-patches@dev.tarantool.org Hi! I would submit new patches in separate e-mail threads. Please see comments below. On 04.12.2020 0:35, Vladislav Shpilevoy wrote: > Hi! Thanks for the patch! > > I recommend you to read this document: > https://github.com/tarantool/tarantool/wiki/Code-review-procedure > > See 5 comments below. > > 1. Please, use subsystem prefix in the commit title. In your case it > should be 'base64: ...'. ok. I have used http://www.tarantool.io/en/doc/latest/dev_guide/developer_guidelines/ which does not say that "subsystem" can be anything (there is a list). > On 01.12.2020 17:30, Sergey Nikiforov wrote: >> It also caused data corruption. > > 2. What do you mean 'also'? What did it cause besides data corruption? ASAN faults (see #3069). I have made commit description more clear. >> Also: >> Fixed read access beyond decode table (noticed along the way). >> Minimized number of condition checks in internal loops (performance). > > 3. Please, never mix unrelated changed into one commit. That > complicates the review; makes it harder to cherry-pick things to > the older branches; ruins git history; and you can introduce a new > bug while doing 'refactoring'. While I agree with you in general, in this particular case I had to create "intermediate" version containing only this specific fix w/o optimization (there was no such thing before - I was fixing and cleaning up logic in single pass) and test it. > Btw, can you prove your optimizations actually do any notable > impact on the performance? Do you have numbers showing that it is > worth optimizing? release, old: It took 6369332219 ns to decode 7087960000 bytes, speed is 1112826236 bps release, optimized base64 decoder: It took 5550868992 ns to decode 7087960000 bytes, speed is 1276909977 bps ~1.15 times faster (Intel Core I7-9700K, single thread) Where can I commit performance testing code? >> Fixes: #3069 >> --- > > 4. See > https://github.com/tarantool/tarantool/wiki/Code-review-procedure#sending-the-patch. > > Please, put here issue and branch links. I don't know the branch name, > and I don't see it in `git branch -a | grep 3069`. So I can't check if > the patch works. ok >> third_party/base64.c | 57 ++++++++++++++++++++++++++++++-------------- >> 1 file changed, 39 insertions(+), 18 deletions(-) >> >> diff --git a/third_party/base64.c b/third_party/base64.c >> index 8ecab23eb..ab644df22 100644 >> --- a/third_party/base64.c >> +++ b/third_party/base64.c >> @@ -222,7 +222,8 @@ base64_decode_value(int value) >> 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, >> 44, 45, 46, 47, 48, 49, 50, 51 >> }; >> - static const int decoding_size = sizeof(decoding); >> + static const int decoding_size = >> + sizeof(decoding) / sizeof(decoding[0]); > > 5. Since this is a bugfix, it should be covered with a test when > possible. Here it looks like it is possible. If 'value' was > big enough, it would access some memory out of the array, and > would return garbage instead of an error. I believe it shouldn't > be hard to create a proper unit test on that. Done. I have created #5627 for this bug and fix + test would go into separate e-mail thread. > Everything below is just refactoring, hopefully without new bugs. > But I recommend to remove it, since it is not related to the bug > anyhow, and hardly makes performance any notable better. Unless > you have numbers. If it really makes difference, please, extract > these optimizations into a new commit on a different branch so as > we could handle it out of the 3069 bug context. It would be second patch in #3069 series because of the dependency on the fix. Or we could merge it later after fix for #3069 is merged.