[Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters

Tarantool development patches archive
 help / color / mirror / Atom feed

* [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
@ 2020-12-15 14:25 Sergey Nikiforov
  2020-12-16 23:28 ` Vladislav Shpilevoy
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Sergey Nikiforov @ 2020-12-15 14:25 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Vladislav Shpilevoy

Not all invalid characters were ignored by base64 decoder
causing data corruption and reads beyond decode table
(faults under ASAN).

Added corresponding check into base64 unit test.

Fixes: #5627
---

Branch: https://github.com/tarantool/tarantool/tree/void234/gh-5627-fix-base64-invalid-chars-processing
Issue: https://github.com/tarantool/tarantool/issues/5627

 test/unit/base64.c      | 23 ++++++++++++++++++++++-
 test/unit/base64.result |  5 ++++-
 third_party/base64.c    |  3 ++-
 3 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/test/unit/base64.c b/test/unit/base64.c
index ada497adf..c0f53a5e1 100644
--- a/test/unit/base64.c
+++ b/test/unit/base64.c
@@ -58,9 +58,28 @@ base64_nowrap_test(const char *str)
 	base64_test(str, BASE64_NOWRAP, symbols, lengthof(symbols));
 }
 
+static void
+base64_invalid_chars_test(void)
+{
+	/* Upper bit must be cleared */
+	const char invalid_data[] = { '\x7b', '\x7c', '\x7d', '\x7e' };
+	char outbuf[8];
+
+	plan(1);
+
+	/* Invalid chars should be ignored, not decoded into garbage */
+	is(base64_decode(invalid_data, sizeof(invalid_data),
+	                 outbuf, sizeof(outbuf)),
+	   0, "ignoring invalid chars");
+
+	check_plan();
+}
+
 int main(int argc, char *argv[])
 {
-	plan(28);
+	plan(28
+	     + 1 /* invalid chars test */
+	     );
 	header();
 
 	const char *option_tests[] = {
@@ -78,6 +97,8 @@ int main(int argc, char *argv[])
 		base64_nowrap_test(option_tests[i]);
 	}
 
+	base64_invalid_chars_test();
+
 	footer();
 	return check_plan();
 }
diff --git a/test/unit/base64.result b/test/unit/base64.result
index cd1f2b3f6..3bc2c2275 100644
--- a/test/unit/base64.result
+++ b/test/unit/base64.result
@@ -1,4 +1,4 @@
-1..28
+1..29
 	*** main ***
     1..3
     ok 1 - length
@@ -175,4 +175,7 @@ ok 27 - subtests
     ok 3 - decode length ok
     ok 4 - encode/decode
 ok 28 - subtests
+    1..1
+    ok 1 - ignoring invalid chars
+ok 29 - subtests
 	*** main: done ***
diff --git a/third_party/base64.c b/third_party/base64.c
index 8ecab23eb..7c69315ea 100644
--- a/third_party/base64.c
+++ b/third_party/base64.c
@@ -222,7 +222,8 @@ base64_decode_value(int value)
 		32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
 		44, 45, 46, 47, 48, 49, 50, 51
 	};
-	static const int decoding_size = sizeof(decoding);
+	static const int decoding_size =
+		sizeof(decoding) / sizeof(decoding[0]);
 	int codepos = value;
 	codepos -= 43;
 	if (codepos < 0 || codepos >= decoding_size)
-- 
2.25.1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
  2020-12-15 14:25 [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters Sergey Nikiforov
@ 2020-12-16 23:28 ` Vladislav Shpilevoy
  2020-12-17  9:41 ` Leonid Vasiliev
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-16 23:28 UTC (permalink / raw)
  To: Sergey Nikiforov, tarantool-patches

Thanks for the patch!

> diff --git a/test/unit/base64.c b/test/unit/base64.c
> index ada497adf..c0f53a5e1 100644
> --- a/test/unit/base64.c
> +++ b/test/unit/base64.c
> @@ -58,9 +58,28 @@ base64_nowrap_test(const char *str)
>  	base64_test(str, BASE64_NOWRAP, symbols, lengthof(symbols));
>  }
>  
> +static void
> +base64_invalid_chars_test(void)
> +{
> +	/* Upper bit must be cleared */
> +	const char invalid_data[] = { '\x7b', '\x7c', '\x7d', '\x7e' };
> +	char outbuf[8];
> +
> +	plan(1);
> +
> +	/* Invalid chars should be ignored, not decoded into garbage */
> +	is(base64_decode(invalid_data, sizeof(invalid_data),
> +	                 outbuf, sizeof(outbuf)),
> +	   0, "ignoring invalid chars");
> +
> +	check_plan();
> +}
> +
>  int main(int argc, char *argv[])
>  {
> -	plan(28);
> +	plan(28
> +	     + 1 /* invalid chars test */
> +	     );

Why not just 29? It looks kind of weird now.

The patch looks good. Get a second review from somebody, and I
will commit this.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
  2020-12-15 14:25 [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters Sergey Nikiforov
  2020-12-16 23:28 ` Vladislav Shpilevoy
@ 2020-12-17  9:41 ` Leonid Vasiliev
  2020-12-17 12:41   ` Alexander Turenko
  2020-12-17 13:04   ` Sergey Nikiforov
  2020-12-23 15:17 ` Vladislav Shpilevoy
  2020-12-30 11:28 ` Alexander V. Tikhonov
  3 siblings, 2 replies; 10+ messages in thread
From: Leonid Vasiliev @ 2020-12-17  9:41 UTC (permalink / raw)
  To: Sergey Nikiforov, tarantool-patches; +Cc: Vladislav Shpilevoy

Hi! Thank you for the patch.
Generally LGTM.
See some comments below:

According to 
https://github.com/tarantool/tarantool/wiki/Code-review-procedure#commit-message
"Description is what the patch does, started from lowercase letter,
without a dot in the end, in the imperative mood."
("properly...").
I could be wrong, but it seems like <description> is not written in an
imperative mood.

On 15.12.2020 17:25, Sergey Nikiforov wrote:
> Not all invalid characters were ignored by base64 decoder
> causing data corruption and reads beyond decode table
> (faults under ASAN).
> 
> Added corresponding check into base64 unit test.
> 
> Fixes: #5627
> ---
> 
> Branch: https://github.com/tarantool/tarantool/tree/void234/gh-5627-fix-base64-invalid-chars-processing
> Issue: https://github.com/tarantool/tarantool/issues/5627
> 
>   test/unit/base64.c      | 23 ++++++++++++++++++++++-
>   test/unit/base64.result |  5 ++++-
>   third_party/base64.c    |  3 ++-
>   3 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/test/unit/base64.c b/test/unit/base64.c
> index ada497adf..c0f53a5e1 100644
> --- a/test/unit/base64.c
> +++ b/test/unit/base64.c
> @@ -58,9 +58,28 @@ base64_nowrap_test(const char *str)
>   	base64_test(str, BASE64_NOWRAP, symbols, lengthof(symbols));
>   }
>   
> +static void
> +base64_invalid_chars_test(void)
> +{
> +	/* Upper bit must be cleared */
> +	const char invalid_data[] = { '\x7b', '\x7c', '\x7d', '\x7e' };
> +	char outbuf[8];
> +
> +	plan(1);

Usually `plan ()` is called as the first call in a function. It's just
easier to see how many checks there will be. I don't know any rule about
this. So, it's up to you.

> +
> +	/* Invalid chars should be ignored, not decoded into garbage */
> +	is(base64_decode(invalid_data, sizeof(invalid_data),
> +	                 outbuf, sizeof(outbuf)),
> +	   0, "ignoring invalid chars");
> +
> +	check_plan();
> +}
> +
>   int main(int argc, char *argv[])
>   {
> -	plan(28);
> +	plan(28
> +	     + 1 /* invalid chars test */
> +	     );

I agree with Vlad. Why `+ 1` and not just 29?

>   	header();
>   
>   	const char *option_tests[] = {
> @@ -78,6 +97,8 @@ int main(int argc, char *argv[])
>   		base64_nowrap_test(option_tests[i]);
>   	}
>   
> +	base64_invalid_chars_test();
> +
>   	footer();
>   	return check_plan();
>   }
> diff --git a/test/unit/base64.result b/test/unit/base64.result
> index cd1f2b3f6..3bc2c2275 100644
> --- a/test/unit/base64.result
> +++ b/test/unit/base64.result
> @@ -1,4 +1,4 @@
> -1..28
> +1..29
>   	*** main ***
>       1..3
>       ok 1 - length
> @@ -175,4 +175,7 @@ ok 27 - subtests
>       ok 3 - decode length ok
>       ok 4 - encode/decode
>   ok 28 - subtests
> +    1..1
> +    ok 1 - ignoring invalid chars
> +ok 29 - subtests
>   	*** main: done ***
> diff --git a/third_party/base64.c b/third_party/base64.c
> index 8ecab23eb..7c69315ea 100644
> --- a/third_party/base64.c
> +++ b/third_party/base64.c
> @@ -222,7 +222,8 @@ base64_decode_value(int value)
>   		32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
>   		44, 45, 46, 47, 48, 49, 50, 51
>   	};
> -	static const int decoding_size = sizeof(decoding);
> +	static const int decoding_size =
> +		sizeof(decoding) / sizeof(decoding[0]);
>   	int codepos = value;
>   	codepos -= 43;
>   	if (codepos < 0 || codepos >= decoding_size)
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
  2020-12-17  9:41 ` Leonid Vasiliev
@ 2020-12-17 12:41   ` Alexander Turenko
  2020-12-17 13:04   ` Sergey Nikiforov
  1 sibling, 0 replies; 10+ messages in thread
From: Alexander Turenko @ 2020-12-17 12:41 UTC (permalink / raw)
  To: Leonid Vasiliev; +Cc: tarantool-patches, Vladislav Shpilevoy

On Thu, Dec 17, 2020 at 12:41:48PM +0300, Leonid Vasiliev via Tarantool-patches wrote:
> Hi! Thank you for the patch.
> Generally LGTM.
> See some comments below:
> 
> According to https://github.com/tarantool/tarantool/wiki/Code-review-procedure#commit-message
> "Description is what the patch does, started from lowercase letter,
> without a dot in the end, in the imperative mood."
> ("properly...").
> I could be wrong, but it seems like <description> is not written in an
> imperative mood.

A commit subject should be in the imperative mood. But there is more
freedom for a commit message body.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
  2020-12-17  9:41 ` Leonid Vasiliev
  2020-12-17 12:41   ` Alexander Turenko
@ 2020-12-17 13:04   ` Sergey Nikiforov
  2020-12-17 14:52     ` Leonid Vasiliev
  1 sibling, 1 reply; 10+ messages in thread
From: Sergey Nikiforov @ 2020-12-17 13:04 UTC (permalink / raw)
  To: Leonid Vasiliev, tarantool-patches; +Cc: Vladislav Shpilevoy

Hi!

On 17.12.2020 12:41, Leonid Vasiliev wrote:
> Hi! Thank you for the patch.
> Generally LGTM.
> See some comments below:
> 
> According to 
> https://github.com/tarantool/tarantool/wiki/Code-review-procedure#commit-message 
> 
> "Description is what the patch does, started from lowercase letter,
> without a dot in the end, in the imperative mood."
> ("properly...").
> I could be wrong, but it seems like <description> is not written in an
> imperative mood.

Mmm. I am not that great in English, but how "Properly ignore..." is not 
"imperative mood"? What would you suggest?
I should, however, use lowercase (alas, force of habit).

> On 15.12.2020 17:25, Sergey Nikiforov wrote:
>> Not all invalid characters were ignored by base64 decoder
>> causing data corruption and reads beyond decode table
>> (faults under ASAN).
>>
>> Added corresponding check into base64 unit test.
>>
>> Fixes: #5627
>> ---
>>
>> Branch: 
>> https://github.com/tarantool/tarantool/tree/void234/gh-5627-fix-base64-invalid-chars-processing 
>>
>> Issue: https://github.com/tarantool/tarantool/issues/5627
>>
>>   test/unit/base64.c      | 23 ++++++++++++++++++++++-
>>   test/unit/base64.result |  5 ++++-
>>   third_party/base64.c    |  3 ++-
>>   3 files changed, 28 insertions(+), 3 deletions(-)
>>
>> diff --git a/test/unit/base64.c b/test/unit/base64.c
>> index ada497adf..c0f53a5e1 100644
>> --- a/test/unit/base64.c
>> +++ b/test/unit/base64.c
>> @@ -58,9 +58,28 @@ base64_nowrap_test(const char *str)
>>       base64_test(str, BASE64_NOWRAP, symbols, lengthof(symbols));
>>   }
>> +static void
>> +base64_invalid_chars_test(void)
>> +{
>> +    /* Upper bit must be cleared */
>> +    const char invalid_data[] = { '\x7b', '\x7c', '\x7d', '\x7e' };
>> +    char outbuf[8];
>> +
>> +    plan(1);
> 
> Usually `plan ()` is called as the first call in a function. It's just
> easier to see how many checks there will be. I don't know any rule about
> this. So, it's up to you.

I would move it if you like (if there would be another patch revision). 
I have just tried to be C89-friendly. Force of habit (useful one).

>> +
>> +    /* Invalid chars should be ignored, not decoded into garbage */
>> +    is(base64_decode(invalid_data, sizeof(invalid_data),
>> +                     outbuf, sizeof(outbuf)),
>> +       0, "ignoring invalid chars");
>> +
>> +    check_plan();
>> +}
>> +
>>   int main(int argc, char *argv[])
>>   {
>> -    plan(28);
>> +    plan(28
>> +         + 1 /* invalid chars test */
>> +         );
> 
> I agree with Vlad. Why `+ 1` and not just 29?

Using "magic" values without explanation is a bad idea for readability. 
Should I "decode" how 28 was calculated as well (using constants where 
appropriate) or no one bothers so much with tests?

>>       header();
>>       const char *option_tests[] = {
>> @@ -78,6 +97,8 @@ int main(int argc, char *argv[])
>>           base64_nowrap_test(option_tests[i]);
>>       }
>> +    base64_invalid_chars_test();
>> +
>>       footer();
>>       return check_plan();
>>   }
>> diff --git a/test/unit/base64.result b/test/unit/base64.result
>> index cd1f2b3f6..3bc2c2275 100644
>> --- a/test/unit/base64.result
>> +++ b/test/unit/base64.result
>> @@ -1,4 +1,4 @@
>> -1..28
>> +1..29
>>       *** main ***
>>       1..3
>>       ok 1 - length
>> @@ -175,4 +175,7 @@ ok 27 - subtests
>>       ok 3 - decode length ok
>>       ok 4 - encode/decode
>>   ok 28 - subtests
>> +    1..1
>> +    ok 1 - ignoring invalid chars
>> +ok 29 - subtests
>>       *** main: done ***
>> diff --git a/third_party/base64.c b/third_party/base64.c
>> index 8ecab23eb..7c69315ea 100644
>> --- a/third_party/base64.c
>> +++ b/third_party/base64.c
>> @@ -222,7 +222,8 @@ base64_decode_value(int value)
>>           32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
>>           44, 45, 46, 47, 48, 49, 50, 51
>>       };
>> -    static const int decoding_size = sizeof(decoding);
>> +    static const int decoding_size =
>> +        sizeof(decoding) / sizeof(decoding[0]);
>>       int codepos = value;
>>       codepos -= 43;
>>       if (codepos < 0 || codepos >= decoding_size)
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
  2020-12-17 13:04   ` Sergey Nikiforov
@ 2020-12-17 14:52     ` Leonid Vasiliev
  2020-12-23 12:17       ` Leonid Vasiliev
  0 siblings, 1 reply; 10+ messages in thread
From: Leonid Vasiliev @ 2020-12-17 14:52 UTC (permalink / raw)
  To: Sergey Nikiforov, tarantool-patches; +Cc: Vladislav Shpilevoy

Hi!

On 17.12.2020 16:04, Sergey Nikiforov wrote:
> Hi!
> 
> On 17.12.2020 12:41, Leonid Vasiliev wrote:
>> Hi! Thank you for the patch.
>> Generally LGTM.
>> See some comments below:
>>
>> According to 
>> https://github.com/tarantool/tarantool/wiki/Code-review-procedure#commit-message 
>>
>> "Description is what the patch does, started from lowercase letter,
>> without a dot in the end, in the imperative mood."
>> ("properly...").
>> I could be wrong, but it seems like <description> is not written in an
>> imperative mood.
> 
> Mmm. I am not that great in English, but how "Properly ignore..." is not 
> "imperative mood"? What would you suggest?
> I should, however, use lowercase (alas, force of habit).

I will refer to my phrase:"I could be wrong") Leave it as is.

> 
>> On 15.12.2020 17:25, Sergey Nikiforov wrote:
>>> Not all invalid characters were ignored by base64 decoder
>>> causing data corruption and reads beyond decode table
>>> (faults under ASAN).
>>>
>>> Added corresponding check into base64 unit test.
>>>
>>> Fixes: #5627
>>> ---
>>>
>>> Branch: 
>>> https://github.com/tarantool/tarantool/tree/void234/gh-5627-fix-base64-invalid-chars-processing 
>>>
>>> Issue: https://github.com/tarantool/tarantool/issues/5627
>>>
>>>   test/unit/base64.c      | 23 ++++++++++++++++++++++-
>>>   test/unit/base64.result |  5 ++++-
>>>   third_party/base64.c    |  3 ++-
>>>   3 files changed, 28 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/test/unit/base64.c b/test/unit/base64.c
>>> index ada497adf..c0f53a5e1 100644
>>> --- a/test/unit/base64.c
>>> +++ b/test/unit/base64.c
>>> @@ -58,9 +58,28 @@ base64_nowrap_test(const char *str)
>>>       base64_test(str, BASE64_NOWRAP, symbols, lengthof(symbols));
>>>   }
>>> +static void
>>> +base64_invalid_chars_test(void)
>>> +{
>>> +    /* Upper bit must be cleared */
>>> +    const char invalid_data[] = { '\x7b', '\x7c', '\x7d', '\x7e' };
>>> +    char outbuf[8];
>>> +
>>> +    plan(1);
>>
>> Usually `plan ()` is called as the first call in a function. It's just
>> easier to see how many checks there will be. I don't know any rule about
>> this. So, it's up to you.
> 
> I would move it if you like (if there would be another patch revision). 
> I have just tried to be C89-friendly. Force of habit (useful one).
> 

Ok.

>>> +
>>> +    /* Invalid chars should be ignored, not decoded into garbage */
>>> +    is(base64_decode(invalid_data, sizeof(invalid_data),
>>> +                     outbuf, sizeof(outbuf)),
>>> +       0, "ignoring invalid chars");
>>> +
>>> +    check_plan();
>>> +}
>>> +
>>>   int main(int argc, char *argv[])
>>>   {
>>> -    plan(28);
>>> +    plan(28
>>> +         + 1 /* invalid chars test */
>>> +         );
>>
>> I agree with Vlad. Why `+ 1` and not just 29?
> 
> Using "magic" values without explanation is a bad idea for readability. 
> Should I "decode" how 28 was calculated as well (using constants where 
> appropriate) or no one bothers so much with tests?

AFAIK in `plan()` we always just write the "total" number of checks.

> 
>>>       header();
>>>       const char *option_tests[] = {
>>> @@ -78,6 +97,8 @@ int main(int argc, char *argv[])
>>>           base64_nowrap_test(option_tests[i]);
>>>       }
>>> +    base64_invalid_chars_test();
>>> +
>>>       footer();
>>>       return check_plan();
>>>   }
>>> diff --git a/test/unit/base64.result b/test/unit/base64.result
>>> index cd1f2b3f6..3bc2c2275 100644
>>> --- a/test/unit/base64.result
>>> +++ b/test/unit/base64.result
>>> @@ -1,4 +1,4 @@
>>> -1..28
>>> +1..29
>>>       *** main ***
>>>       1..3
>>>       ok 1 - length
>>> @@ -175,4 +175,7 @@ ok 27 - subtests
>>>       ok 3 - decode length ok
>>>       ok 4 - encode/decode
>>>   ok 28 - subtests
>>> +    1..1
>>> +    ok 1 - ignoring invalid chars
>>> +ok 29 - subtests
>>>       *** main: done ***
>>> diff --git a/third_party/base64.c b/third_party/base64.c
>>> index 8ecab23eb..7c69315ea 100644
>>> --- a/third_party/base64.c
>>> +++ b/third_party/base64.c
>>> @@ -222,7 +222,8 @@ base64_decode_value(int value)
>>>           32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
>>>           44, 45, 46, 47, 48, 49, 50, 51
>>>       };
>>> -    static const int decoding_size = sizeof(decoding);
>>> +    static const int decoding_size =
>>> +        sizeof(decoding) / sizeof(decoding[0]);
>>>       int codepos = value;
>>>       codepos -= 43;
>>>       if (codepos < 0 || codepos >= decoding_size)
>>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
  2020-12-17 14:52     ` Leonid Vasiliev
@ 2020-12-23 12:17       ` Leonid Vasiliev
  0 siblings, 0 replies; 10+ messages in thread
From: Leonid Vasiliev @ 2020-12-23 12:17 UTC (permalink / raw)
  To: Sergey Nikiforov, tarantool-patches; +Cc: Vladislav Shpilevoy

Hi!
Just to be clear. Perhaps before I have not clearly expressed my
opinion.
If you ok with 29 instead
```
+    plan(28
+         + 1 /* invalid chars test */
+         );
```
then LGTM.

On 17.12.2020 17:52, Leonid Vasiliev via Tarantool-patches wrote:
> Hi!
> 
> On 17.12.2020 16:04, Sergey Nikiforov wrote:
>> Hi!
>>
>> On 17.12.2020 12:41, Leonid Vasiliev wrote:
>>> Hi! Thank you for the patch.
>>> Generally LGTM.
>>> See some comments below:
>>>
>>> According to 
>>> https://github.com/tarantool/tarantool/wiki/Code-review-procedure#commit-message 
>>>
>>> "Description is what the patch does, started from lowercase letter,
>>> without a dot in the end, in the imperative mood."
>>> ("properly...").
>>> I could be wrong, but it seems like <description> is not written in an
>>> imperative mood.
>>
>> Mmm. I am not that great in English, but how "Properly ignore..." is 
>> not "imperative mood"? What would you suggest?
>> I should, however, use lowercase (alas, force of habit).
> 
> I will refer to my phrase:"I could be wrong") Leave it as is.
> 
>>
>>> On 15.12.2020 17:25, Sergey Nikiforov wrote:
>>>> Not all invalid characters were ignored by base64 decoder
>>>> causing data corruption and reads beyond decode table
>>>> (faults under ASAN).
>>>>
>>>> Added corresponding check into base64 unit test.
>>>>
>>>> Fixes: #5627
>>>> ---
>>>>
>>>> Branch: 
>>>> https://github.com/tarantool/tarantool/tree/void234/gh-5627-fix-base64-invalid-chars-processing 
>>>>
>>>> Issue: https://github.com/tarantool/tarantool/issues/5627
>>>>
>>>>   test/unit/base64.c      | 23 ++++++++++++++++++++++-
>>>>   test/unit/base64.result |  5 ++++-
>>>>   third_party/base64.c    |  3 ++-
>>>>   3 files changed, 28 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/test/unit/base64.c b/test/unit/base64.c
>>>> index ada497adf..c0f53a5e1 100644
>>>> --- a/test/unit/base64.c
>>>> +++ b/test/unit/base64.c
>>>> @@ -58,9 +58,28 @@ base64_nowrap_test(const char *str)
>>>>       base64_test(str, BASE64_NOWRAP, symbols, lengthof(symbols));
>>>>   }
>>>> +static void
>>>> +base64_invalid_chars_test(void)
>>>> +{
>>>> +    /* Upper bit must be cleared */
>>>> +    const char invalid_data[] = { '\x7b', '\x7c', '\x7d', '\x7e' };
>>>> +    char outbuf[8];
>>>> +
>>>> +    plan(1);
>>>
>>> Usually `plan ()` is called as the first call in a function. It's just
>>> easier to see how many checks there will be. I don't know any rule about
>>> this. So, it's up to you.
>>
>> I would move it if you like (if there would be another patch 
>> revision). I have just tried to be C89-friendly. Force of habit 
>> (useful one).
>>
> 
> Ok.
> 
>>>> +
>>>> +    /* Invalid chars should be ignored, not decoded into garbage */
>>>> +    is(base64_decode(invalid_data, sizeof(invalid_data),
>>>> +                     outbuf, sizeof(outbuf)),
>>>> +       0, "ignoring invalid chars");
>>>> +
>>>> +    check_plan();
>>>> +}
>>>> +
>>>>   int main(int argc, char *argv[])
>>>>   {
>>>> -    plan(28);
>>>> +    plan(28
>>>> +         + 1 /* invalid chars test */
>>>> +         );
>>>
>>> I agree with Vlad. Why `+ 1` and not just 29?
>>
>> Using "magic" values without explanation is a bad idea for 
>> readability. Should I "decode" how 28 was calculated as well (using 
>> constants where appropriate) or no one bothers so much with tests?
> 
> AFAIK in `plan()` we always just write the "total" number of checks.
> 
>>
>>>>       header();
>>>>       const char *option_tests[] = {
>>>> @@ -78,6 +97,8 @@ int main(int argc, char *argv[])
>>>>           base64_nowrap_test(option_tests[i]);
>>>>       }
>>>> +    base64_invalid_chars_test();
>>>> +
>>>>       footer();
>>>>       return check_plan();
>>>>   }
>>>> diff --git a/test/unit/base64.result b/test/unit/base64.result
>>>> index cd1f2b3f6..3bc2c2275 100644
>>>> --- a/test/unit/base64.result
>>>> +++ b/test/unit/base64.result
>>>> @@ -1,4 +1,4 @@
>>>> -1..28
>>>> +1..29
>>>>       *** main ***
>>>>       1..3
>>>>       ok 1 - length
>>>> @@ -175,4 +175,7 @@ ok 27 - subtests
>>>>       ok 3 - decode length ok
>>>>       ok 4 - encode/decode
>>>>   ok 28 - subtests
>>>> +    1..1
>>>> +    ok 1 - ignoring invalid chars
>>>> +ok 29 - subtests
>>>>       *** main: done ***
>>>> diff --git a/third_party/base64.c b/third_party/base64.c
>>>> index 8ecab23eb..7c69315ea 100644
>>>> --- a/third_party/base64.c
>>>> +++ b/third_party/base64.c
>>>> @@ -222,7 +222,8 @@ base64_decode_value(int value)
>>>>           32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
>>>>           44, 45, 46, 47, 48, 49, 50, 51
>>>>       };
>>>> -    static const int decoding_size = sizeof(decoding);
>>>> +    static const int decoding_size =
>>>> +        sizeof(decoding) / sizeof(decoding[0]);
>>>>       int codepos = value;
>>>>       codepos -= 43;
>>>>       if (codepos < 0 || codepos >= decoding_size)
>>>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
  2020-12-15 14:25 [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters Sergey Nikiforov
  2020-12-16 23:28 ` Vladislav Shpilevoy
  2020-12-17  9:41 ` Leonid Vasiliev
@ 2020-12-23 15:17 ` Vladislav Shpilevoy
  2020-12-30 11:59   ` Alexander Turenko
  2020-12-30 11:28 ` Alexander V. Tikhonov
  3 siblings, 1 reply; 10+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-23 15:17 UTC (permalink / raw)
  To: Sergey Nikiforov, tarantool-patches, Alexander V. Tikhonov

Alexander Tikh., please, tell if we can merge branch
void234/gh-5627-fix-base64-invalid-chars-processing-v3.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
  2020-12-23 15:17 ` Vladislav Shpilevoy
@ 2020-12-30 11:59   ` Alexander Turenko
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Turenko @ 2020-12-30 11:59 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Wed, Dec 23, 2020 at 04:17:29PM +0100, Vladislav Shpilevoy via Tarantool-patches wrote:
> Alexander Tikh., please, tell if we can merge branch
> void234/gh-5627-fix-base64-invalid-chars-processing-v3.

Pushed to master, 2.6, 2.5 and 1.10.

WBR, Alexander Turenko.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters
  2020-12-15 14:25 [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters Sergey Nikiforov
                   ` (2 preceding siblings ...)
  2020-12-23 15:17 ` Vladislav Shpilevoy
@ 2020-12-30 11:28 ` Alexander V. Tikhonov
  3 siblings, 0 replies; 10+ messages in thread
From: Alexander V. Tikhonov @ 2020-12-30 11:28 UTC (permalink / raw)
  To: Sergey Nikiforov; +Cc: tarantool-patches

Hi Sergey, thanks for the patch, as I see no new degradation found in
gitlab-ci testing commit criteria pipeline [1], patch LGTM.

[1] - https://gitlab.com/tarantool/tarantool/-/pipelines/230156990

On Tue, Dec 15, 2020 at 05:25:27PM +0300, Sergey Nikiforov via Tarantool-patches wrote:
> Not all invalid characters were ignored by base64 decoder
> causing data corruption and reads beyond decode table
> (faults under ASAN).
> 
> Added corresponding check into base64 unit test.
> 
> Fixes: #5627
> ---
> 
> Branch: https://github.com/tarantool/tarantool/tree/void234/gh-5627-fix-base64-invalid-chars-processing
> Issue: https://github.com/tarantool/tarantool/issues/5627
> 
>  test/unit/base64.c      | 23 ++++++++++++++++++++++-
>  test/unit/base64.result |  5 ++++-
>  third_party/base64.c    |  3 ++-
>  3 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/test/unit/base64.c b/test/unit/base64.c
> index ada497adf..c0f53a5e1 100644
> --- a/test/unit/base64.c
> +++ b/test/unit/base64.c
> @@ -58,9 +58,28 @@ base64_nowrap_test(const char *str)
>  	base64_test(str, BASE64_NOWRAP, symbols, lengthof(symbols));
>  }
>  
> +static void
> +base64_invalid_chars_test(void)
> +{
> +	/* Upper bit must be cleared */
> +	const char invalid_data[] = { '\x7b', '\x7c', '\x7d', '\x7e' };
> +	char outbuf[8];
> +
> +	plan(1);
> +
> +	/* Invalid chars should be ignored, not decoded into garbage */
> +	is(base64_decode(invalid_data, sizeof(invalid_data),
> +	                 outbuf, sizeof(outbuf)),
> +	   0, "ignoring invalid chars");
> +
> +	check_plan();
> +}
> +
>  int main(int argc, char *argv[])
>  {
> -	plan(28);
> +	plan(28
> +	     + 1 /* invalid chars test */
> +	     );
>  	header();
>  
>  	const char *option_tests[] = {
> @@ -78,6 +97,8 @@ int main(int argc, char *argv[])
>  		base64_nowrap_test(option_tests[i]);
>  	}
>  
> +	base64_invalid_chars_test();
> +
>  	footer();
>  	return check_plan();
>  }
> diff --git a/test/unit/base64.result b/test/unit/base64.result
> index cd1f2b3f6..3bc2c2275 100644
> --- a/test/unit/base64.result
> +++ b/test/unit/base64.result
> @@ -1,4 +1,4 @@
> -1..28
> +1..29
>  	*** main ***
>      1..3
>      ok 1 - length
> @@ -175,4 +175,7 @@ ok 27 - subtests
>      ok 3 - decode length ok
>      ok 4 - encode/decode
>  ok 28 - subtests
> +    1..1
> +    ok 1 - ignoring invalid chars
> +ok 29 - subtests
>  	*** main: done ***
> diff --git a/third_party/base64.c b/third_party/base64.c
> index 8ecab23eb..7c69315ea 100644
> --- a/third_party/base64.c
> +++ b/third_party/base64.c
> @@ -222,7 +222,8 @@ base64_decode_value(int value)
>  		32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
>  		44, 45, 46, 47, 48, 49, 50, 51
>  	};
> -	static const int decoding_size = sizeof(decoding);
> +	static const int decoding_size =
> +		sizeof(decoding) / sizeof(decoding[0]);
>  	int codepos = value;
>  	codepos -= 43;
>  	if (codepos < 0 || codepos >= decoding_size)
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-12-30 11:59 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-15 14:25 [Tarantool-patches] [PATCH v2] base64: Properly ignore invalid characters Sergey Nikiforov
2020-12-16 23:28 ` Vladislav Shpilevoy
2020-12-17  9:41 ` Leonid Vasiliev
2020-12-17 12:41   ` Alexander Turenko
2020-12-17 13:04   ` Sergey Nikiforov
2020-12-17 14:52     ` Leonid Vasiliev
2020-12-23 12:17       ` Leonid Vasiliev
2020-12-23 15:17 ` Vladislav Shpilevoy
2020-12-30 11:59   ` Alexander Turenko
2020-12-30 11:28 ` Alexander V. Tikhonov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox