[Tarantool-patches] [PATCH] raft: more precise verification of incoming request state

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun Jun 27 17:12:01 MSK 2021


On 26.06.2021 15:34, Cyrill Gorcunov wrote:
> On Fri, Jun 25, 2021 at 11:49:10PM +0200, Vladislav Shpilevoy wrote:
>>> Closes #6067
>>
>> Unfortunately, the patch does not fix much. Firstly,
>> the state is decoded from MessagePack as mp_decode_uint().
>> It can't be negative originally by design.
> 
> No, Vlad, this is not how signed\unsigned numbers are processed.

Yeah, I know how the numbers are processed. But right here it
does not matter. The state initially appears from mp_decode_uint()
which returns uint64_t. Hence it is positive initially. Always.
You can't make mp_decode_uint() < 0 true without any casts.
It becomes truncated later.

>> Secondly, and most importantly, the state is decoded as
>> uint64_t raft_request.state, but it is saved without checks
>> into enum struct raft_msg.state in box_raft_request_to_msg().
> 
> I suppose this is done because box_raft_request_to_msg a kind of
> general wrapper which doesn't care much about contents validation
> but simply represent the data in a new structure?

Yes.

> When stage
> of processing starts (ie raft_process_msg helper) we validate
> each member, so I think this is a good architecture to investigate
> contents in raft_process_msg,

You can't validate it during decoding properly, because the decoder
does not and should not depend on the raft library. It does not
know which states are valid. It only decodes the fields into numbers.

> but if you prefer to use
> box_raft_request_to_msg instead I would make it so.

It would be good to keep box_raft_request_to_msg() simple. But it is
not a requirement.

>> So the current patch does not change almost anything I
>> suppose? Because if I send ((UINT32_MAX << 31) | 0x01), it
>> will work. Because the upper 32 bits would be truncated
>> (assuming the enum is 32 bits), and your checks are done too
>> late to notice it.
>>
>> Correct? If so, then I would propose to fix the issue entirely,
>> if you want to work on that.
> 
> I think I didnt put more technical details in commit message,
> let me try to explain in a more verbose way: the enum is stored
> as integer type but standart doesn't require it to be signed
> or unsigned, so it is implementation defined. Currently on my
> gcc-11 series the if statement consider @state member as
> _unsigned_ integer thus basically my patch does absolutely
> nothing, the code remains the same on asm level. Still a compiler
> has all rights to use a signed "state" and in this case our
> `if` won't hit leading to undefined behaviour in case if the
> "state" is in form (0x80000000 | x) and what is worse we won't
> even able to figure out what has happened.
> 
> We can emulate this behaviour by using
> 
> raft_process_msg
> 	...
> 	if ((int)req->state < 0) {
> 		panic("RAFT: req->state > 0 -> %d", (int)req->state >= (int)raft_state_MAX);
> 		assert(req->state > 0);
> 	}
> 
> in result if we pass -1u as state in the test we will have
> [001] +2021-06-26 15:52:58.610 [57492] main/101/main F> RAFT: req->state > 0 -> 0
> 
> thus if compiler use signed int type as internal representation then
> (int)req->state >= (int)raft_state_MAX won't take the branch because
> it is a negative value.

The very fact you needed to put so much effort into the explanation
of a simple negative number check means it is a huge overkill. Please,
compare with < 0 explicitly. If the compiler will store the enum as uint,
it will drop your condition as unreachable anyway.

> Surely the patch won't catch integer overflows since we trim the
> data as you noted above because, gimme some time to think about,
> I really like how our box_raft_request_to_msg helpers are done
> and if it possible would love to not change them much: they are
> simple and short now.

You can try to fix it by changing state to uint64_t in raft_request (now
it is uint32_t), then you need to check for the valid values in
box_raft_request_to_msg(). Or you can patch raft_msg to store the state as
uint64_t. So the value wouldn't be truncated anywhere and raft_process_msg()
could check its correctness properly.

>>> diff --git a/src/lib/raft/raft.c b/src/lib/raft/raft.c
>>> index eacdddb7e..409e983f0 100644
>>> --- a/src/lib/raft/raft.c
>>> +++ b/src/lib/raft/raft.c
>>> @@ -309,7 +309,9 @@ raft_process_msg(struct raft *raft, const struct raft_msg *req, uint32_t source)
>>>  	say_info("RAFT: message %s from %u", raft_msg_to_string(req), source);
>>>  	assert(source > 0);
>>>  	assert(source != raft->self);
>>> -	if (req->term == 0 || req->state == 0 || req->state >= raft_state_MAX) {
>>> +
>>> +	if (req->term == 0 || req->state == 0 ||
>>> +	    (unsigned)req->state >= raft_state_MAX) {
>>
>> Please, perform a normal < 0 comparison. No need for unsigned cast
>> cast.
> 
> We need an explicit cast here because as I said the internal
> implementation is compiler defined.

It has nothing to do with the compiler-specific implementation.
It is simply over-engineering. You cast value to unsigned to
check if it is negative. WTF? Lets do a normal comparison.

> On the other hands I played with a number of gcc and llvm compilers
> and for this code they all does correct comparision (as unsigned
> integers) so maybe we could just leave it as is, simply close
> the bug as won't implement and spend time on more important things?
> I don't mind actually, so up to you just say me what you think.

I don't mind fixing the issue. But now your patch does not fix it,
unfortunately. If you want to fix it - go on. But it must be truly
fixed then.


More information about the Tarantool-patches mailing list