[Tarantool-patches] [PATCH 3/4] box: add MsgPack encoding/decoding for UUID

Serge Petrenko sergepetrenko at tarantool.org
Fri Apr 10 02:46:43 MSK 2020


> 6 апр. 2020 г., в 00:26, Vladislav Shpilevoy <v.shpilevoy at tarantool.org> написал(а):
> 
> Thanks for the patch!
> 

Hi! Thanks for the review!

> See 10 comments below.
> 
>> diff --git a/src/lib/core/mp_uuid.c b/src/lib/core/mp_uuid.c
>> new file mode 100644
>> index 000000000..5b0cc3f1a
>> --- /dev/null
>> +++ b/src/lib/core/mp_uuid.c
>> @@ -0,0 +1,75 @@
>> +
>> +#include "mp_uuid.h"
>> +#include "msgpuck.h"
>> +#include "mp_extension_types.h"
>> +#include "lib/uuid/tt_uuid.h"
> 
> 1. uuid is a separate library depending on core. Now core
> depends on uuid too. This is a cyclic dependency. It does not
> explode probably only because all needed functions are in the
> tt_uuid.h header. So mp_uuid.c/.h should be moved to lib/uuid.
> Is it possible?
> 
> I see we add MP_UUID to src/lib/core/mp_extension_types.h.
> That is also a part of core lib. Maybe we should just merge core
> and uuid libs. I remember we split them only because Kostja
> insisted, but personally I don't see any reason in keeping then
> separated.
> 
> However I am afraid this won't help forever. In future we may
> want to add new mp_ext for something in box/ such as MP_TUPLE
> or something. We already want something like that for custom
> errors. Not related to this patch anyway.

Thanks for pointing this out. I merged mp_uuid.h/.c into uuid lib,
and also extracted mpstream, which depends on mp_uuid now, into a
separate library. So now the dependencies are
                  
mpstream —> uuid -> core
    \________->_______/

> 
>> +
>> +inline uint32_t
>> +mp_sizeof_uuid()
>> +{
>> +	return mp_sizeof_ext(sizeof(struct tt_uuid));
>> +}
>> +
>> +struct tt_uuid *
>> +uuid_unpack(const char **data, uint32_t len, struct tt_uuid *uuid)
>> +{
>> +	if (len != sizeof(*uuid))
>> +		return NULL;
>> +	memcpy(uuid, *data, sizeof(*uuid));
>> +	if (tt_uuid_validate(uuid) != 0)
>> +		return NULL;
>> +	*data += sizeof(*uuid);
>> +	return uuid;
>> +}
>> +
>> +struct tt_uuid *
>> +mp_decode_uuid(const char **data, struct tt_uuid *uuid)
>> +{
>> +	if (mp_typeof(**data) != MP_EXT)
>> +		return NULL;
>> +	int8_t type;
>> +	const char *const svp = *data;
>> +
>> +	uint32_t len = mp_decode_extl(data, &type);
>> +	if (type != MP_UUID || uuid_unpack(data, len, uuid) == NULL) {
>> +		*data = svp;
>> +		return NULL;
>> +	}
>> +	return uuid;
>> +}
>> +
>> +char *
>> +mp_encode_uuid(char *data, const struct tt_uuid *uuid)
>> +{
>> +	return mp_encode_ext(data, MP_UUID, (char *)uuid, sizeof(*uuid));
> 
> 2. Can we just copy it as is? Shouldn't it be big endian or
> little endian or whatever the standard says? If the standard
> says nothing, we should use big-endian (because all other msgpack
> types use it), and we need to either convert every field of
> uuid to big endian before encoding, or ensure it is already
> big endian always, and copy as is.
> 
> In case tt_uuid fields are little endian now, I think no big
> problem in storing them as big endian. Just keep tostring/fromstring
> and other public functions doing the same as before. If we store
> uuid in big-endian, we can use memcmp() to compare two 16byte binaries.
> Without conversions.
> 

Good point, thanks! It isn’t specified whether uuid fields should be big or
little endian. The standard does specify though that the fields must be
encoded as big endian. So instead of memcpy I do mp_store_u32, mp_store_u16, …
and so on. (mp_load_u32(), … for unpack).
mp_load/store_8/16/32 convert the integers to big endian for us, if needed.

>> +}
>> diff --git a/src/lib/core/mp_uuid.h b/src/lib/core/mp_uuid.h
>> new file mode 100644
>> index 000000000..fda0f3aed
>> --- /dev/null
>> +++ b/src/lib/core/mp_uuid.h
>> @@ -0,0 +1,90 @@
>> +#ifndef TARANTOOL_LIB_CORE_MP_UUID_INCLUDED
>> +#define TARANTOOL_LIB_CORE_MP_UUID_INCLUDED
> 
> 3. We use #pragma once now, no?
> 

Ok, fixed.

>> +#include <stdint.h>
>> +
>> +#if defined(__cplusplus)
>> +extern "C" {
>> +#endif /* defined(__cplusplus) */
>> +
>> +struct tt_uuid;
>> +
>> +/**
>> + * \brief Return the number of bytes an encoded uuid value takes.
>> + */
>> +uint32_t
>> +mp_sizeof_uuid();
> 
> 4. Please, use 'void' when a function does not take any arguments.
> Cyrill was saing that it is bad to leave empty (), because then
> the compiler considers it a function having any number of arguments,
> and generates some code for that.

Done.

> 
>> +
>> +/**
>> + * Copy a uuid value from a buffer. Can be used in a combination
>> + * with mp_decode_extl() instead of mp_decode_uuid() when multiple
>> + * extension types are possible.
>> + *
>> + * \param data A buffer.
>> + * \param len Length returned by mp_decode_extl, has to be equal
>> + *            to sizeof(struct tt_uuid), otherwise an error is
>> + *            returned.
>> + * \param[out] uuid Uuid to be decoded.
>> + * \return A pointer to the decoded uuid.
>> + *         NULL in case of an error.
>> + * \post *data = *data + sizeof(struct tt_uuid).
>> + */
>> +struct tt_uuid *
>> +uuid_unpack(const char **data, uint32_t len, struct tt_uuid *uuid);
> 
> 5. Why is it in mp_uuid.h/.c? It does not depend on msgpack at all.
> The similar function decimal_unpack(), for example, is in decimal.c.
> Not in mp_decimal.c.

After considering your comments re uuid_unpack
I pack and unpack uuid fields in big endian style,
using mp_store/load_u32/u16/u8. So let it be in mp_uuid.h/.c now.

> 
>> +
>> +/**
>> + * \brief Decode a uuid from MsgPack \a data.
>> + * \param data A buffer.
>> + * \param[out] uuid Uuid to be decoded.
>> + * \return A pointer to the decoded uuid.
>> + *         NULL in case of an error.
>> + * \post *data = *data + mp_sizeof_uuid().
>> + */
>> +struct tt_uuid *
>> +mp_decode_uuid(const char **data, struct tt_uuid *uuid);
>> +
>> diff --git a/src/lib/msgpuck b/src/lib/msgpuck
>> index 8ae606a16..0d273f95f 160000
>> --- a/src/lib/msgpuck
>> +++ b/src/lib/msgpuck
>> @@ -1 +1 @@
>> -Subproject commit 8ae606a1636dd89b2d61b154e5a1db03dce91657
>> +Subproject commit 0d273f95f8de64aeed698c354ca3bc7cb5ea461a
> 
> 6. I didn't see a patch for msgpuck repo. Did you send it?

Yeah, you even LGTM’ed it ([PATCH] Add const qualifier to mp_encode_ext argument)
Alexander Turenko pushed it and I updated the reference in v2 of this patchset.

> 
>> diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c
>> index edbc15b72..fb49b1547 100644
>> --- a/src/lua/msgpack.c
>> +++ b/src/lua/msgpack.c
>> @@ -325,6 +336,10 @@ luamp_decode(struct lua_State *L, struct luaL_serializer *cfg,
>> 		break;
>> 	}
>> 	}
>> +return;
> 
> 7. Bad indentation for 'return;'.
> 

Yep, fixed, ty.

>> +ext_decode_err:
>> +	lua_pop(L, -1);
>> +	luaL_error(L, "msgpack.decode: invalid MsgPack");
>> }
>>> diff --git a/test/app/uuid.result b/test/app/uuid.result
>> index 0713614c6..013c51282 100644
>> --- a/test/app/uuid.result
>> +++ b/test/app/uuid.result
>> @@ -106,7 +106,7 @@ uu.node[5]
>> -- invalid values
>> uuid.fromstr(nil)
>> ---
>> -- error: 'builtin/uuid.lua:47: fromstr(str)'
>> +- error: 'builtin/uuid.lua:38: fromstr(str)'
> 
> 8. Better add a test_run filter to avoid that change in
> future when anything will be updated in uuid.lua again.
> 

Ok, done.

>> ...
>> uuid.fromstr('')
>> ---
>> diff --git a/test/unit/uuid.c b/test/unit/uuid.c
>> index c43d93b4f..4f6b963ba 100644
>> --- a/test/unit/uuid.c
>> +++ b/test/unit/uuid.c
>> @@ -27,10 +29,28 @@ uuid_test(struct tt_uuid a, struct tt_uuid b, int expected_result)
>>            "%s %s %s", a_str, sign, b_str);
>> }
>> 
>> +static void
>> +mp_uuid_test()
>> +{
>> +        plan(4);
>> +        char buf[18];
>> +        char *data = buf;
>> +        const char *data1 = buf;
>> +        struct tt_uuid uu, ret;
>> +        random_init();
>> +        tt_uuid_create(&uu);
>> +        char *end = mp_encode_uuid(data, &uu);
>> +        is(end - data, mp_sizeof_uuid(), "mp_sizeof_uuid() == encoded length");
>> +        struct tt_uuid *rc = mp_decode_uuid(&data1, &ret);
>> +        is(rc, &ret, "mp_decode_uuid() return code");
>> +        is(data1, end, "mp_sizeof_uuid() == decoded length");
>> +        is(tt_uuid_compare(&uu, &ret), 0, "mp_decode_uuid(mp_encode_uuid(uu)) == uu");
> 
> 9. When you create a subset of tests, it should have at least
> plan() + check_plan(). Also we usually add header() footer(), but
> seems like it is too late for this test file.

Fixed, thanks.

> 
>> +}
>> +
>> int
>> main(void)
>> {
>> -        plan(2);
>> +        plan(3);
>> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
>> index af4f2f5d5..411c56f71 100644
>> --- a/third_party/lua-yaml/lyaml.cc
>> +++ b/third_party/lua-yaml/lyaml.cc
>> @@ -697,10 +700,18 @@ static int dump_node(struct lua_yaml_dumper *dumper)
>>       switch (field.ext_type) {
>>       case MP_DECIMAL:
>>          str = decimal_to_string(field.decval);
>> -	 len = strlen(str);
>> -	 break;
>> +         len = strlen(str);
>> +         break;
>> +      case MP_UUID:
>> +      {
>> +         char *buf = tt_static_buf();
>> +         tt_uuid_to_string(field.uuidval, buf);
>> +         str = buf;
>> +         len = UUID_STR_LEN;
>> +         break;
>> +      }
>>       default:
>> -	 assert(0); /* checked by luaL_checkfield() */
>> +         assert(0); /* checked by luaL_checkfield() */
> 
> 10. Better avoid the unnecessary diff.
> 
>>       }
>>       break;
>>     }
>> 


--
Serge Petrenko
sergepetrenko at tarantool.org


More information about the Tarantool-patches mailing list