[tarantool-patches] Re: [PATCH v1 1/1] implement mp_stack class

Alexander Turenko alexander.turenko at tarantool.org
Mon Jan 21 14:25:42 MSK 2019


Minor comments are below, they are on code comments and the test.

If you doubt on some wording comments ask me and engage Vladimir into
discussion.

WBR, Alexander Turenko.

On Fri, Jan 18, 2019 at 01:32:28PM +0300, Kirill Shcherbatov wrote:
> Introduced a new mp_stack class to make complex msgpack parsing.
> The structure is needed in order to facilitate the decoding
> nested MP_ARRAY and MP_MAP msgpack containers without recursion.
> This allows you to determine that the parsing of the current
> container is complete.
> 
> Needed for #1012
> ---
>  msgpuck.h      | 136 +++++++++++++++++++++++++++++++++++++++++++++++++
>  test/msgpuck.c |  79 +++++++++++++++++++++++++++-
>  2 files changed, 214 insertions(+), 1 deletion(-)
> 
> diff --git a/msgpuck.h b/msgpuck.h
> index b0d4967..b9bbe78 100644
> --- a/msgpuck.h
> +++ b/msgpuck.h
> @@ -359,6 +359,45 @@ enum mp_type {
>  	MP_EXT
>  };
>  
> +/** Message pack MP_MAP or MP_ARRAY container descriptor. */
> +struct mp_frame {
> +	/** MP frame type calculated with mp_typeof(). */
> +	enum mp_type type;
> +	/**
> +	 * Total items in MP_MAP or MP_ARRAY container
> +	 * calculated with mp_decode_map() or mp_decode_array().
> +	 */
> +	uint32_t size;
> +	/**
> +	 * Count of items already processed. Must be less or
> +	 * equal to mp_frame::size member.
> +	 */
> +	uint32_t curr;
> +};
> +
> +/**
> + * Stack of service contexts mp_frame to do complex msgpack parse.

Here we introduce the term 'service context', but don't describe it
anywhere. I think it would be better to say 'Stack of map/array
descriptors mp_frame' -- reuse the previously introduced term.

> + * The structure is needed in order to facilitate the decoding
> + * nested MP_ARRAY and MP_MAP msgpack containers without
> + * recursion. This allows you to determine that the parsing of
> + * the current container is complete.
> +*/
> +struct mp_stack {
> +	/** The maximum stack depth. */
> +	uint32_t size;
> +	/**
> +	 * Count of used stack frames. Corresponds to the index
> +	 * in the array to perform the push operation. Must be
> +	 * less or equal to mp_stack::size member.
> +	 */
> +	uint32_t used;
> +	/**
> +	 * Array of mp_frames descriptors having at least
> +	 * mp_stack::size items.

I would simplify the clause like so:

> Array of size mp_stack::size of mp_frames.

> +	 */
> +	struct mp_frame *frames;
> +};
> +
>  /**
>   * \brief Determine MsgPack type by a first byte \a c of encoded data.
>   *
> @@ -1184,6 +1223,103 @@ mp_check(const char **data, const char *end);
>   * {{{ Implementation
>   */
>  
> +/**
> + * \brief Initialize mp_stack \a stack of specified size \a size
> + * and user-allocated array \a frames.
> + * The \a frames allocation must have at least \a size mp_frame
> + * items.
> + * \param stack - the pointer to a mp_stack to initialize.
> + * \param size - stack size, count of stack::frames to use.
> + * \param frames - mp_frame service context allocation to use,
> + *                 must have at least \a size items.

I would use the 'descriptor' term you introduce above or some other, but
not two different terms for one object. I'm about using 'service
context' here.

Also I thought the term 'allocation' is a process of obtaining memory,
not a result. I would formulate it as: 'preallocated array of size \a
size of struct mp_frame items'.

> + */
> +MP_PROTO void
> +mp_stack_init(struct mp_stack *stack, uint32_t size, struct mp_frame *frames)
> +{
> +	stack->frames = frames;
> +	stack->size = size;
> +	stack->used = 0;
> +}
> +
> +/**
> + * \brief Test if mp_stack \a stack is empty.
> + * \param stack - the pointer to a mp_stack to a stack to test.
> + */
> +MP_PROTO bool
> +mp_stack_is_empty(struct mp_stack *stack)
> +{
> +	return stack->used == 0;
> +}
> +
> +/**
> + * \brief Test if mp_stack \a stack is full.
> + * \param stack - the pointer to a mp_stack to a stack to test.
> + */
> +MP_PROTO bool
> +mp_stack_is_full(struct mp_stack *stack)
> +{
> +	return stack->used >= stack->size;
> +}
> +
> +/**
> + * \brief Pop the top mp_stack \a stack frame.
> + * The \a stack must not be empty.
> + * \param stack - the pointer to a mp_stack to operate with.
> + */
> +MP_PROTO void
> +mp_stack_pop(struct mp_stack *stack)
> +{
> +	assert(!mp_stack_is_empty(stack));
> +	--stack->used;
> +}
> +
> +/**
> + * \brief Get a pointer to a top mp_stack \a stack frame.
> + * \param stack - the pointer to a mp_stack to operate with.
> + * The \a stack must not be empty.
> + * \retval a pointer to a top stack frame.
> + */
> +MP_PROTO struct mp_frame *
> +mp_stack_top(struct mp_stack *stack)
> +{
> +	assert(!mp_stack_is_empty(stack));
> +	return &stack->frames[stack->used - 1];
> +}
> +
> +/**
> + * \brief Construct a new mp_frame and push it on to mp_stack
> + * \a stack. The \a stack must not be full.
> + * \param stack - the pointer to a stack to operate with.
> + * \param type - the type of mp_frame to create.
> + * \param size - size size of mp_frame to create.
> + * \retval a pointer to a top stack frame.
> + */
> +MP_PROTO void
> +mp_stack_push(struct mp_stack *stack, enum mp_type type, uint32_t size)
> +{

Retval is void.

> +	char tuple[1024];
> +	char *wptr = tuple;
> +	const char *descr = "{\"first\", {\"key1\": \"one\", \"key2\": {2}}, {3}}";

1. The line is too long.
2. Now it is changed: { -> [ and } -> ], but it still is not correct.
   JSON encodes maps with {} and arrays with [].

> +	wptr = mp_encode_array(wptr, 3);
> +	wptr = mp_encode_str(wptr, "first", 5);
> +	wptr = mp_encode_map(wptr, 2);
> +	wptr = mp_encode_str(wptr, "key1", 4);
> +	wptr = mp_encode_str(wptr, "one", 3);
> +	wptr = mp_encode_str(wptr, "key2", 4);
> +	wptr = mp_encode_array(wptr, 1);
> +	wptr = mp_encode_uint(wptr, 2);
> +	wptr = mp_encode_array(wptr, 1);
> +	wptr = mp_encode_uint(wptr, 3);
> +	const char *rptr = tuple;
> +	char buff[1024];
> +	wptr = buff;
> +
> +	struct mp_frame mp_frames[3];
> +	struct mp_stack mp_stack;
> +	mp_stack_init(&mp_stack, 3, mp_frames);

I would move the converter into a separate function and use it here.

> +		enum mp_type type = mp_typeof(*rptr);
> +		if (type == MP_ARRAY || type == MP_MAP) {
> +			uint32_t size = type == MP_ARRAY ?
> +					mp_decode_array(&rptr) :
> +					mp_decode_map(&rptr);
> +			mp_stack_push(&mp_stack, type, size);
> +			wptr += sprintf(wptr, "{");
> +		} else {
> +			if (type == MP_STR) {
> +				const char *str = mp_decode_str(&rptr, &len);
> +				wptr += sprintf(wptr, "\"%.*s\"", len, str);
> +			} else if (type == MP_UINT) {
> +				uint64_t num = mp_decode_uint(&rptr);
> +				wptr += sprintf(wptr, "%u", (unsigned)num);
> +			} else {
> +				mp_unreachable();
> +			}
> +			if (mp_stack_top_is_internal_item(&mp_stack))
> +				wptr += sprintf(wptr, ", ");
> +		}

I think this block would look better if we'll rewrite it as switch.



More information about the Tarantool-patches mailing list