[Tarantool-patches] [PATCH] Fix base64 decoder output buffer overrun (reads)

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Fri Dec 4 00:35:44 MSK 2020


Hi! Thanks for the patch!

I recommend you to read this document:
https://github.com/tarantool/tarantool/wiki/Code-review-procedure

See 5 comments below.

1. Please, use subsystem prefix in the commit title. In your case it
should be 'base64: ...'.

On 01.12.2020 17:30, Sergey Nikiforov wrote:
> It also caused data corruption.

2. What do you mean 'also'? What did it cause besides data corruption?

> Also:
> Fixed read access beyond decode table (noticed along the way).
> Minimized number of condition checks in internal loops (performance).

3. Please, never mix unrelated changed into one commit. That
complicates the review; makes it harder to cherry-pick things to
the older branches; ruins git history; and you can introduce a new
bug while doing 'refactoring'.

Btw, can you prove your optimizations actually do any notable
impact on the performance? Do you have numbers showing that it is
worth optimizing?

> Fixes: #3069
> ---

4. See
https://github.com/tarantool/tarantool/wiki/Code-review-procedure#sending-the-patch.

Please, put here issue and branch links. I don't know the branch name,
and I don't see it in `git branch -a | grep 3069`. So I can't check if
the patch works.

>  third_party/base64.c | 57 ++++++++++++++++++++++++++++++--------------
>  1 file changed, 39 insertions(+), 18 deletions(-)
> 
> diff --git a/third_party/base64.c b/third_party/base64.c
> index 8ecab23eb..ab644df22 100644
> --- a/third_party/base64.c
> +++ b/third_party/base64.c
> @@ -222,7 +222,8 @@ base64_decode_value(int value)
>  		32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
>  		44, 45, 46, 47, 48, 49, 50, 51
>  	};
> -	static const int decoding_size = sizeof(decoding);
> +	static const int decoding_size =
> +		sizeof(decoding) / sizeof(decoding[0]);

5. Since this is a bugfix, it should be covered with a test when
possible. Here it looks like it is possible. If 'value' was
big enough, it would access some memory out of the array, and
would return garbage instead of an error. I believe it shouldn't
be hard to create a proper unit test on that.

Everything below is just refactoring, hopefully without new bugs.
But I recommend to remove it, since it is not related to the bug
anyhow, and hardly makes performance any notable better. Unless
you have numbers. If it really makes difference, please, extract
these optimizations into a new commit on a different branch so as
we could handle it out of the 3069 bug context.


More information about the Tarantool-patches mailing list