[Tarantool-patches] [PATCH] Fix base64 decoder output buffer overrun (reads)

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Dec 17 02:22:13 MSK 2020


> On 04.12.2020 0:35, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the patch!
>>
>> I recommend reading this document:
>> https://github.com/tarantool/tarantool/wiki/Code-review-procedure
>>
>> See 5 comments below.
>>
>> 1. Please use a subsystem prefix in the commit title. In your case it
>> should be 'base64: ...'.
> 
> ok. I have used http://www.tarantool.io/en/doc/latest/dev_guide/developer_guidelines/ which does not say that "subsystem" can be anything (there is a list).

It also does not say that the subsystem must be one of the
items in the list. Usually we try to reuse a prefix that already
appears in the commit titles. Sometimes we invent a new one.

That said, I would like to have such a strict list. It would make
things a bit simpler.

>> On 01.12.2020 17:30, Sergey Nikiforov wrote:
>>> It also caused data corruption.
>>
>> 2. What do you mean by 'also'? What did it cause besides data corruption?
> 
> ASAN faults (see #3069). I have made the commit description clearer.

Now, with the separate commits, I understand everything, thanks.

>>> Also:
>>> Fixed read access beyond decode table (noticed along the way).
>>> Minimized number of condition checks in internal loops (performance).
>>
>> 3. Please never mix unrelated changes into one commit. That
>> complicates the review; makes it harder to cherry-pick things to
>> the older branches; ruins git history; and you can introduce a new
>> bug while doing 'refactoring'.
> 
> While I agree with you in general, in this particular case I had to create "intermediate" version containing only this specific fix w/o optimization (there was no such thing before - I was fixing and cleaning up logic in single pass) and test it.

And this is a good thing. I understood the problem much more easily.
Other people will too. And I found a bug in the optimization patch,
which I probably wouldn't have noticed if it had been one big commit.
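
For readers without the patch at hand, the "read access beyond decode
table" item quoted above refers to a class of bug that can be sketched
as follows. This is only an illustration under stated assumptions: the
table, its contents, and the names example_table, EXAMPLE_TABLE_LEN and
example_decode_char are hypothetical and are not taken from the
Tarantool sources or from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical 4-entry table, not the real base64 one: it maps the
 * characters '0'..'3' to the values 0..3.
 */
static const int example_table[] = {0, 1, 2, 3};

/* Number of elements, not the size of the array in bytes. */
#define EXAMPLE_TABLE_LEN \
	(sizeof(example_table) / sizeof(example_table[0]))

/* Return the decoded value, or -1 if c has no entry in the table. */
static int
example_decode_char(unsigned char c)
{
	int pos = (int)c - '0';
	/* Valid indexes are 0..len-1, so pos == len must be rejected too. */
	if (pos < 0 || (size_t)pos >= EXAMPLE_TABLE_LEN)
		return -1;
	return example_table[pos];
}
```

With this check example_decode_char('3') yields 3, while '4' and '/'
both yield -1 instead of reading outside the table.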

>> Btw, can you prove your optimizations actually have any notable
>> impact on the performance? Do you have numbers showing that it is
>> worth optimizing?
> 
> release, old:
> It took 6369332219 ns to decode 7087960000 bytes, speed is 1112826236 bps
> 
> release, optimized base64 decoder:
> It took 5550868992 ns to decode 7087960000 bytes, speed is 1276909977 bps
> 
> ~1.15 times faster (Intel Core I7-9700K, single thread)

Looks nice, worth doing.
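
The figures quoted above can be rechecked with a few lines of C. This
is a minimal sketch, not the bench itself (which is not shown in this
thread); the helper names bench_bytes_per_sec and bench_speedup are
made up for illustration. Note that the reported "bps" values match
bytes per second, not bits.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in bytes per second from a byte count and elapsed time. */
static double
bench_bytes_per_sec(uint64_t bytes, uint64_t elapsed_ns)
{
	assert(elapsed_ns > 0);
	return (double)bytes * 1e9 / (double)elapsed_ns;
}

/*
 * Speedup of the new run over the old one. Both runs decode the same
 * number of bytes, so the speedup is just the ratio of elapsed times.
 */
static double
bench_speedup(uint64_t old_ns, uint64_t new_ns)
{
	assert(new_ns > 0);
	return (double)old_ns / (double)new_ns;
}
```

Calling bench_speedup(6369332219, 5550868992) gives a value slightly
below 1.15, which agrees with the "~1.15 times faster" estimate.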

> Where can I commit performance testing code?

I don't know anything about this. You can ask Alexander L. if you want
your bench committed. I know that we have our microbenches somewhere,
and Alexander took the lead on them recently.

>> Everything below is just refactoring, hopefully without new bugs.
>> But I recommend removing it, since it is not related to the bug
>> in any way, and hardly makes performance notably better. Unless
>> you have numbers. If it really makes a difference, please extract
>> these optimizations into a new commit on a different branch so
>> that we can handle it outside of the 3069 bug context.
> 
> It would be the second patch in the #3069 series because of the dependency on the fix. Or we could merge it later, after the fix for #3069 is merged.

Yes, exactly. It is a separate patch and this is great. It is ok
if it is done on top of another patch.

