From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Mon, 29 Apr 2019 18:41:14 +0300 From: Vladimir Davydov Subject: Re: [PATCH v4 1/3] box: introduce tuple_parse_iterator class Message-ID: <20190429154114.y3wbgg3ieliasigm@esperanza> References: <57c7d98b69f64abdb1bec67aa837e6c2245f02c2.1555682707.git.kshcherbatov@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <57c7d98b69f64abdb1bec67aa837e6c2245f02c2.1555682707.git.kshcherbatov@tarantool.org> To: Kirill Shcherbatov Cc: tarantool-patches@freelists.org, kostja@tarantool.org List-ID: The subject is stale: the iterator is called tuple_format_iterator now. Anyway, I'm not quite satisfied with the API: - I don't like that you pass some parameters in argument list while others (multikey_frame, stack) are accessed directly. This looks confusing. - Also, sometimes you use frame->type while sometimes field->token.type to determine if a field is array/map or a leaf. Again, inconsistent and hence confusing. - Returning MULTIKEY_POP to let the caller know that the iterator's done parsing a multikey entry looks kinda unnatural to me :-/ - There's no need to pass key_part_only to each next() invocation - you could as well pass it once on iterator creation. Here's what I'd try to do: - Fold validation in the iterator. After all, we can't reliably iterate over a malformed tuple, right? I'd allow to pass flags to the iterator constructor: VALIDATE // ensure the tuple conforms to the format KEY_PARTS_ONLY // skip fields that aren't indexed If VALIDATE was set, the iterator would allocate field bitmaps and check that all required fields were present in the tuple as it walked over it. It would check field types as well in this case. This would allow us to get rid of the awkward MULTIKEY_POP flag. - Rather than returning field + pos/end_pos and accessing other fields directly, I'd fill an auxiliary struct instead with a bit of commentary to make the iterator protocol more transparent: struct tuple_format_iterator_entry { // Pointer to the field data. NULL if EOF. void *data; // For leaf fields only: pointer to the data end. void *data_end; // For intermediate fields only (array/map): // number of child entries. int count; // Field metadata. May be NULL if the field isn't // present in the format. // (We return all child entries of an array present // in the format, no matter formatted or not.) struct tuple_field *field; // Parent field metadata. Cannot be NULL. struct tuple_field *parent; // Number of multikey items. int multikey_count; // Index in the multikey array. int multikey_idx; }; int tuple_format_iterator_next(struct tuple_format_iterator *it, struct tuple_format_iterator_entry *entry); Looking at this struct, a reader would quickly figure out what fields are used and what not. Currently, it's impossible without diving deep into tuple_format_iterator_next implementation. - Please avoid branching at top levels, because it will lead to code duplication between vy_stmt_new_surrogate_delete and tuple_field_map_create. For instance, there should be the only method for setting a field map offset in a builder: field_map_builder_set_slot(builder, offset_slot, offset, multikey_idx, multikey_count); Simply pass multikey_idx = -1 to it for non-multikey indexes and let it decide what to do. You'll need to patch tuple_format_iterator_next so that it conforms to this protocol. I guess we need to make vy_stmt_new_surrogate_delete() use field map builder as well in the scope of this patch - it looks like without fixing it we won't be able to come up with a good tuple_format_iterator API. BTW tuple_format_iterator_next is soo huuge. Why not move it from the header to the C file? Also, please look through your comments with a spell checker enabled: there are a few typos that need to be fixed.