Tarantool development patches archive
 help / color / mirror / Atom feed
From: Sergey Nikiforov <void@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] Fix base64 decoder output buffer overrun (reads)
Date: Tue, 15 Dec 2020 17:20:01 +0300	[thread overview]
Message-ID: <179f6987-6cfc-8a16-e420-08e56149480e@tarantool.org> (raw)
In-Reply-To: <3b5d3306-2756-f831-af15-7227f411e196@tarantool.org>

Hi!

I would submit new patches in separate e-mail threads. Please see 
comments below.

On 04.12.2020 0:35, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
> 
> I recommend you to read this document:
> https://github.com/tarantool/tarantool/wiki/Code-review-procedure
> 
> See 5 comments below.
> 
> 1. Please, use subsystem prefix in the commit title. In your case it
> should be 'base64: ...'.

ok. I have used 
http://www.tarantool.io/en/doc/latest/dev_guide/developer_guidelines/ 
which does not say that "subsystem" can be anything (there is a list).

> On 01.12.2020 17:30, Sergey Nikiforov wrote:
>> It also caused data corruption.
> 
> 2. What do you mean 'also'? What did it cause besides data corruption?

ASAN faults (see #3069). I have made commit description more clear.

>> Also:
>> Fixed read access beyond decode table (noticed along the way).
>> Minimized number of condition checks in internal loops (performance).
> 
> 3. Please, never mix unrelated changed into one commit. That
> complicates the review; makes it harder to cherry-pick things to
> the older branches; ruins git history; and you can introduce a new
> bug while doing 'refactoring'.

While I agree with you in general, in this particular case I had to 
create "intermediate" version containing only this specific fix w/o 
optimization (there was no such thing before - I was fixing and cleaning 
up logic in single pass) and test it.

> Btw, can you prove your optimizations actually do any notable
> impact on the performance? Do you have numbers showing that it is
> worth optimizing?

release, old:
It took 6369332219 ns to decode 7087960000 bytes, speed is 1112826236 bps

release, optimized base64 decoder:
It took 5550868992 ns to decode 7087960000 bytes, speed is 1276909977 bps

~1.15 times faster (Intel Core I7-9700K, single thread)

Where can I commit performance testing code?

>> Fixes: #3069
>> ---
> 
> 4. See
> https://github.com/tarantool/tarantool/wiki/Code-review-procedure#sending-the-patch.
> 
> Please, put here issue and branch links. I don't know the branch name,
> and I don't see it in `git branch -a | grep 3069`. So I can't check if
> the patch works.

ok

>>   third_party/base64.c | 57 ++++++++++++++++++++++++++++++--------------
>>   1 file changed, 39 insertions(+), 18 deletions(-)
>>
>> diff --git a/third_party/base64.c b/third_party/base64.c
>> index 8ecab23eb..ab644df22 100644
>> --- a/third_party/base64.c
>> +++ b/third_party/base64.c
>> @@ -222,7 +222,8 @@ base64_decode_value(int value)
>>   		32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
>>   		44, 45, 46, 47, 48, 49, 50, 51
>>   	};
>> -	static const int decoding_size = sizeof(decoding);
>> +	static const int decoding_size =
>> +		sizeof(decoding) / sizeof(decoding[0]);
> 
> 5. Since this is a bugfix, it should be covered with a test when
> possible. Here it looks like it is possible. If 'value' was
> big enough, it would access some memory out of the array, and
> would return garbage instead of an error. I believe it shouldn't
> be hard to create a proper unit test on that.

Done. I have created #5627 for this bug and fix + test would go into 
separate e-mail thread.

> Everything below is just refactoring, hopefully without new bugs.
> But I recommend to remove it, since it is not related to the bug
> anyhow, and hardly makes performance any notable better. Unless
> you have numbers. If it really makes difference, please, extract
> these optimizations into a new commit on a different branch so as
> we could handle it out of the 3069 bug context.

It would be second patch in #3069 series because of the dependency on 
the fix. Or we could merge it later after fix for #3069 is merged.

  reply	other threads:[~2020-12-15 14:20 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-01 16:30 Sergey Nikiforov
2020-12-03 21:35 ` Vladislav Shpilevoy
2020-12-15 14:20   ` Sergey Nikiforov [this message]
2020-12-16 23:22     ` Vladislav Shpilevoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=179f6987-6cfc-8a16-e420-08e56149480e@tarantool.org \
    --to=void@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH] Fix base64 decoder output buffer overrun (reads)' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox