[Tarantool-patches] [PATCH 1/4] test: add infrastructure for fuzzing testing and fuzzers

Sergey Bronnikov sergeyb at tarantool.org
Thu Dec 24 13:18:28 MSK 2020


Hi,

On 20.12.2020 16:31, Igor Munkin wrote:
> Sergey,
>
> Thanks for the changes!
>
> On 13.12.20, Sergey Bronnikov wrote:
>> Igor, many thanks for review!
>>
>> I've fixed patches and pushed to the branch (but left them as a separate
>> commits with prefix [TO SQUASH]).
> Everything is fixed in scope of the other patches in the series, but I
> have more questions for your answers and updates for this one, please
> consider them below.
>
>> On 07.12.2020 20:24, Igor Munkin wrote:
>>> Sergey,
>>>
>>> Thanks for the patch! Please consider the remaining comments below.
>>>
>>> On 30.11.20, sergeyb at tarantool.org wrote:
> <snipped>
>
>>>> and add fuzzers based on LibFuzzer [1] to csv, http_parser and uri modules.
> Typo: Still exceeds 72 chars...
Fixed. Also splitted cmake command-line in example to fit in 72 chars too.
>
>>>> NOTE: LibFuzzer requires Clang compiler.
> <snipped>
>
>> Message with assert is definitely not ok. LibFuzzer documentation says
>> that all fuzzers must return 0 only [1].
>>
> Neat, now everything works fine. However, considering your comment, I
> have a newbie question (since I'm not an expert in fuzzing testing): how
> do we need to check whether parsing finishes right or not?

libfuzzer has a number of settings and one of them is flag that controls 
time of single unit execution.

./test/fuzz/http_parser_fuzzer -help=1

<snipped>

timeout                        1200    Timeout in seconds (if positive). 
If one unit runs more than this number of seconds the process will abort.
<snipped>

>
> Anyway, you can simply add asserts to check rc is 0, can't you?
> Otherwise these tests look kinda smoke ones to me.
>
>> --- a/test/fuzz/csv_fuzzer.c
>> +++ b/test/fuzz/csv_fuzzer.c
>> @@ -9,15 +9,14 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t
>> size) {
>>            csv_create(&csv);
>>            char *buf = calloc(size, sizeof(char*));
> Typo: s/sizeof(char*)/sizeof(char)/.
Fixed.
> This is why I failed to prove the out of boundary access below for a
> while. Surprisingly, everything is fine in HTTP parser test.
>
>>            if (buf == NULL)
>> -                return -1;
>> +                return 0;
> I believe the testing is not OK if <calloc> yields NULL, but the code
> returns 0. This is odd, IMHO. What about adding either assert or abort
> to handle this branch? To make asserts work all time simply undefine
> NDEBUG at the beginning of the test. Same for other cases.

Igor, I think you get everything wrong ;) Let me explain.

We don't write a highly reliable and safety code here. Everything we 
need is just to properly pass a junk to a function under test.

The goal of fuzzing testing is to find errors like buffer-overflows, 
use-after-free and so on.

Lack of memory during testing is rare case and I think we don't need to 
catch such cases here.

Because triggered assert due to lack of memory is useless information 
from test,

I don't know how we can improve Tarantool with such information. 
Gracefully exit is more than enough.

Moreover I have took a look on source code of tests for other opensource 
projects that were already used in OSS-Fuzz.

They don't care about return codes from calloc(), malloc() functions at 
all. See for example [1].


>
>>            memcpy(buf, data, size);
>>            buf[size] = '\0';
> This write is out of boundaries. In fact it's not, since you
> overallocate a chunk above (consider typo: sizeof(char *) instead of
> sizeof(char)). You want to allocate the <size> bytes via <calloc> and
> accessing <size> byte, but the "last slot" is addressed by (<size> - 1).
>
> Moreover, these manipulations are excessive: if data passed as the first
> argument is NUL-terminated, then this assignment is not necessary
> (everything is done by <memcpy>). Otherwise, you strips the last byte of
> the first argument and you need to allocate (<size> + 1) bytes.
>
> Same for URI parser.

Thanks for catching this.

--- a/test/fuzz/csv_fuzzer.c
+++ b/test/fuzz/csv_fuzzer.c
@@ -8,7 +8,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
  {
         struct csv csv;
         csv_create(&csv);
-       char *buf = calloc(size, sizeof(char));
+       char *buf = calloc(size + 1, sizeof(char));
         if (buf == NULL)
                 return 0;
         memcpy(buf, data, size);
diff --git a/test/fuzz/uri_fuzzer.c b/test/fuzz/uri_fuzzer.c
index 0060bee9b..b4661aea1 100644
--- a/test/fuzz/uri_fuzzer.c
+++ b/test/fuzz/uri_fuzzer.c
@@ -6,10 +6,10 @@

  int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
  {
-       char *buf = calloc(size, sizeof(char *));
+       char *buf = calloc(size + 1, sizeof(char));
         if (!buf)
                 return 0;
-       strncpy(buf, (char *)data, size);
+       memcpy(buf, data, size);
         buf[size] = '\0';
         struct uri uri;
         uri_parse(&uri, buf);


>
>>            char *end = buf + size;
>>            csv_parse_chunk(&csv, buf, end);
>>            csv_finish_parsing(&csv);
>> -        int rc = csv_get_error_status(&csv) == CSV_ER_INVALID ? 1 : 0;
>>            csv_destroy(&csv);
>>            free(buf);
>>
>> -        return rc;
>> +        return 0;
>>    }
>> diff --git a/test/fuzz/http_parser_fuzzer.c b/test/fuzz/http_parser_fuzzer.c
>> index a0aaf6786..f2dd7d09a 100644
>> --- a/test/fuzz/http_parser_fuzzer.c
>> +++ b/test/fuzz/http_parser_fuzzer.c
>> @@ -9,10 +9,10 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
>>           http_parser_create(&parser);
>>           parser.hdr_name = (char *)calloc((int)size, sizeof(char));
> Minor: why do you explicitly cast <size> argument here, but omit such
> cast in other tests?

calloc accepts size with size_t type, so removed cast here.


>>           if (parser.hdr_name == NULL)
>> -                return -1;
>> +                return 0;
>>           char *end_buf = buf + size;
>> -        int rc = http_parse_header_line(&parser, &buf, end_buf, size);
>> +        http_parse_header_line(&parser, &buf, end_buf, size);
>>           free(parser.hdr_name);
>>   
>> -        return rc;
>> +        return 0;
>>   }
> <snipped>
>
>> For the rest I believe the reason is
>>
>> | NOTE: libFuzzer has rudimentary signal handlers.
>> |       Combine libFuzzer with AddressSanitizer or similar for better crash reports.
> As I mentioned in the previous reply, I tried exactly your recipe, so
> ASAN was enabled.
Finally :)
>
>>
>>>> Part of #1809
>>>> ---
>>>>    CMakeLists.txt                 |  2 +-
>>>>    cmake/profile.cmake            | 13 ++++++++++
>>>>    test/CMakeLists.txt            |  3 +++
>>>>    test/fuzz/CMakeLists.txt       | 45 ++++++++++++++++++++++++++++++++++
>>>>    test/fuzz/csv_fuzzer.c         | 23 +++++++++++++++++
>>>>    test/fuzz/http_parser_fuzzer.c | 18 ++++++++++++++
>>>>    test/fuzz/uri_fuzzer.c         | 19 ++++++++++++++
>>>>    7 files changed, 122 insertions(+), 1 deletion(-)
>>>>    create mode 100644 test/fuzz/CMakeLists.txt
>>>>    create mode 100644 test/fuzz/csv_fuzzer.c
>>>>    create mode 100644 test/fuzz/http_parser_fuzzer.c
>>>>    create mode 100644 test/fuzz/uri_fuzzer.c
> <snipped>
>
>>>> diff --git a/test/fuzz/CMakeLists.txt b/test/fuzz/CMakeLists.txt
>>>> new file mode 100644
>>>> index 000000000..142d38f67
>>>> --- /dev/null
>>>> +++ b/test/fuzz/CMakeLists.txt
>>>> @@ -0,0 +1,45 @@
> <snipped>
>
>>> 1. I'm totally not an expert, but quite confused with the fact the
>>> libcsv is build w/o <fuzzer> flag, but csv_fuzzer is build with it.
>> You are right. Project source code should be instrumented too and I
>> enable it:
>>
>> diff --git a/cmake/profile.cmake b/cmake/profile.cmake
>> index 45e3d112c..308d1b0fb 100644
>> --- a/cmake/profile.cmake
>> +++ b/cmake/profile.cmake
>> @@ -53,6 +53,9 @@ if(ENABLE_FUZZER)
>>                " $ CC=clang CXX=clang++ cmake . <...> -DENABLE_FUZZER=ON
>> && make -j\n"
>>                "\n")
>>        endif()
>> +    if (NOT OSS_FUZZ)
>> +        add_compile_flags("C;CXX" -fsanitize=fuzzer-no-link)
>> +    endif()
> Why these compile flags are added under this particular condition?

Because when OSS Fuzz is enabled compiler and link flags passed

from outside. See description how to integrate project to OSS Fuzz in [2].

>
>>    endif()
>>
>>    option(ENABLE_ASAN "Enable AddressSanitizer, a fast memory error
>> detector based on compiler instrumentation" OFF)
> <snipped>
>
>> You may not understand why options with "-fsanitize=fuzzer" two times
>> (in cmake/profile.cmake and test/fuzz/CMakeLists.txt). I'll clarify it
>> in advance:
>>
>> - cmake/profile.cmake is for project source files,
>> -fsanitize=fuzzer-no-link option allows to instrument project source
>> files for fuzzing, but LibFuzzer will not replace main() in these files.
>>
>> - test/fuzz/CMakeLists.txt uses -fsanitize=fuzzer and not
>> -fsanitize=fuzzer-no-link because we want to add automatically generated
>> main() for each fuzzer.
> This is a nice wording to comment the corresponding changes.
Added.
>
>>> 2. Do you need to specify <address> flag once more, when ASAN is
>>> enabled? If not the hunk above looks excess, doesn't it?
>> Agree, it was a bad idea to manage UBSan and ASAN flags in yet another
>> place.
> Side note: You can oblige one to enable ASAN/UBSAN the same way, you
> restrict building via clang. Of course if it makes sense :)
Looks like it is no sense. GCC also has support of sanitizers, [3].
> <snipped>
>
>> Code style also recommend to use goto(), but I believe that LibFuzzer
>> someday will start to accept different exit codes
>>
>> and probably it is better to keep code as is without using goto().
> Agree here.
>
> By the way, there is one nit left: please remove the space between * and
> the parameter name in the function signatures.

Fixed.

diff --git a/test/fuzz/csv_fuzzer.c b/test/fuzz/csv_fuzzer.c
index 5e470c492..ffa917ad6 100644
--- a/test/fuzz/csv_fuzzer.c
+++ b/test/fuzz/csv_fuzzer.c
@@ -4,11 +4,11 @@
  #include <string.h>
  #include "csv/csv.h"

-int LLVMFuzzerTestOneInput(const uint8_t * data, size_t size)
+int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
  {
         struct csv csv;
         csv_create(&csv);
-       char *buf = calloc(size, sizeof(char *));
+       char *buf = calloc(size, sizeof(char));
         if (buf == NULL)
                 return 0;
         memcpy(buf, data, size);
diff --git a/test/fuzz/http_parser_fuzzer.c b/test/fuzz/http_parser_fuzzer.c
index 1d78450f0..737c89617 100644
--- a/test/fuzz/http_parser_fuzzer.c
+++ b/test/fuzz/http_parser_fuzzer.c
@@ -3,7 +3,7 @@
  #include <stddef.h>
  #include "http_parser/http_parser.h"

-int LLVMFuzzerTestOneInput(const uint8_t * data, size_t size)
+int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
  {
         struct http_parser parser;
         char *buf = (char *)data;
diff --git a/test/fuzz/uri_fuzzer.c b/test/fuzz/uri_fuzzer.c
index 6e047bde5..0060bee9b 100644
--- a/test/fuzz/uri_fuzzer.c
+++ b/test/fuzz/uri_fuzzer.c
@@ -4,7 +4,7 @@
  #include <string.h>
  #include "uri/uri.h"

-int LLVMFuzzerTestOneInput(const uint8_t * data, size_t size)
+int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
  {
         char *buf = calloc(size, sizeof(char *));
         if (!buf)

>
> <snipped>
>
>> Also added a warning that triggered when someone use ENABLE_FUZZER and
>> OSS_FUZZ without
>>
>> environment variable LIB_FUZZING_ENGINE:
> Nice.
>
>> --- a/cmake/profile.cmake
>> +++ b/cmake/profile.cmake
>> @@ -53,6 +53,13 @@ if(ENABLE_FUZZER)
>>                " $ CC=clang CXX=clang++ cmake . <...> -DENABLE_FUZZER=ON
>> && make -j\n"
>>                "\n")
>>        endif()
>> +    if(OSS_FUZZ AND NOT DEFINED ENV{LIB_FUZZING_ENGINE})
>> +        message(SEND_ERROR
> Minor: Why do you use SEND_ERROR here? I guess one can't proceed with
> the desired testing in this case, so FATAL_ERROR prevents one from the
> further misuse.
I don't mind, let's replace it to FATAL_ERROR.
>
>> +            "OSS-Fuzz builds require the environment variable "
>> +            "LIB_FUZZING_ENGINE to be set. If you are seeing this "
>> +            "warning, it points to a deeper problem in the ossfuzz "
>> +            "build setup.")
>> +    endif()
> Side note: the mess with whitespace is only in this patch. Everything is
> OK on the branch in the corresponding commit.
>
>>        if (NOT OSS_FUZZ)
>>            add_compile_flags("C;CXX" -fsanitize=fuzzer-no-link)
>>        endif()
>>
>>
>> 1. http://llvm.org/docs/LibFuzzer.html#id22
>>
>> 2.
>> https://github.com/tarantool/tarantool/blob/master/cmake/compiler.cmake#L290-L320
>>

1. 
https://github.com/google/oss-fuzz/blob/master/projects/unbound/fuzz_2.c#L16

2. 
https://google.github.io/oss-fuzz/getting-started/new-project-guide/#buildsh

3. https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html



More information about the Tarantool-patches mailing list