[Tarantool-patches] [PATCH 4/5] error: provide MP_ERROR extension serializer

Tue May 12 23:38:25 MSK 2020

Thanks for the review!

On 12/05/2020 19:52, Cyrill Gorcunov wrote:
> On Tue, May 12, 2020 at 01:45:51AM +0200, Vladislav Shpilevoy wrote:
>> Msgpuck functions mp_snprint() and mp_fprint() now support
>> customizable MP_EXT serializer. This patch adds such for MP_ERROR.
>> All extension serializers will be activated in a separate commit.
>>
>> Part of #4719
>> ---
>>  src/box/mp_error.cc       | 161 ++++++++++++++++++++++-
>>  src/box/mp_error.h        |  29 ++++
>>  test/unit/mp_error.cc     | 270 +++++++++++++++++++++++++++++++++++++-
>>  test/unit/mp_error.result |  72 +++++++++-
>>  4 files changed, 529 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/box/mp_error.cc b/src/box/mp_error.cc
>> index 0491a7a50..fed2ce288 100644
>> --- a/src/box/mp_error.cc
>> +++ b/src/box/mp_error.cc
>> @@ -552,3 +555,159 @@ error_unpack_unsafe(const char **data)
>>  	}
>>  	return err;
>>  }
>> +
>> +#define MP_ERROR_PRINT_DEFINITION
>> +#define MP_PRINT_FUNC snprintf
>> +#define MP_PRINT_SUFFIX snprint
>> +#define MP_PRINT_2(total, func, ...)						\
>> +	SNPRINT(total, func, buf, size, __VA_ARGS__)
>> +#define MP_PRINT_ARGS_DECL char *buf, int size
>> +#include __FILE__
> 
> Don't get it -- include current file?! This a good sign of
> some kind of problem in the code structure... If we need
> a template then it should be some .cc/.c file explicily
> included instead of __FILE__, no?

I don't want to introduce a new file just for these 2 functions
(mp_fprint_error() and mp_snprint_error()). This would be even
worse than it is now. So I used the same file. I wrote __FILE__
instead of mp_error.cc, so as to minimize diff, when mp_error.cc
will be converted to C and moved to lib/core (that is planned).
Also #include __FILE__ is more clear, when you want to say
"include self", IMO.

Alternative is to make mp_print_error_one(), mp_print_error_stack(),
mp_print_error() as huge macros, but I don't like writing big macros.
It is hard to edit, and much much harder to debug, since this all
becomes just one infinite line, and 'n' in gdb/lldb skips the whole
function.

If you don't like the whole template idea, there is one alternative
I described in the cover letter for the msgpuck patchset - virtual
printer. But it will be really slow. We could probably try to make it
buffered, but then it will be double-copying, also slow. I tried really
hard to find other solutions, but failed. Even looked at fmemopen(),
so as we would need to implement only mp_fprint(), but msgpuck library
is supposed to work on Windows too. Also fmemopen() is basically a
virtual printer as well (i.e. is slower, than direct snprint()).

So if you have anything in mind except templates, or a virtual printer,
or you know something about fmemopen() what I don't know, please, tell.
Because I don't like these solutions, but just couldn't find anything
better, with bearable complexity worth the feature. Which I thought would
be just one-day patch, but I was very very wrong here.

I could switch to the virtual printer in case you think its slowness is
worth the simplification.

As unbearable level of complexity I mean the non-binary tree solution
I proposed in the msgpuck patchset's cover letter.

>> +#define MP_ERROR_PRINT_DEFINITION
>> +#define MP_PRINT_FUNC fprintf
>> +#define MP_PRINT_SUFFIX fprint
>> +#define MP_PRINT_2(total, func, ...) do {							\
>> +	int bytes = func(file, __VA_ARGS__);					\
>> +	if (bytes < 0)								\
>> +		return -1;							\
>> +	total += bytes;								\
>> +} while (0)
>> +#define MP_PRINT_ARGS_DECL FILE *file
>> +#include __FILE__
>> +
>> +/* !defined(MP_ERROR_PRINT_DEFINITION) */
>> +#else
>> +/* defined(MP_ERROR_PRINT_DEFINITION) */
>> +
>> +/**
>> + * MP_ERROR extension string serializer.
>> + * There are two applications for string serialization - into a
>> + * buffer, and into a file. Structure of both is exactly the same
>> + * except for the copying/writing itself. To avoid code
>> + * duplication the code is templated and expects some macros to do
>> + * the actual output.
>> + */
>> +
>> +#define MP_CONCAT4_R(a, b, c, d) a##b##c##d
>> +#define MP_CONCAT4(a, b, c, d) MP_CONCAT4_R(a, b, c, d)
>> +#define MP_PRINT(total, ...) MP_PRINT_2(total, MP_PRINT_FUNC, __VA_ARGS__)
>> +
>> +#define mp_func_name(name) MP_CONCAT4(mp_, MP_PRINT_SUFFIX, _, name)
>> +#define mp_print_error_one mp_func_name(error_one)
>> +#define mp_print_error_stack mp_func_name(error_stack)
>> +#define mp_print_error mp_func_name(error)
>> +#define mp_print_common mp_func_name(recursion)
> 
> Maybe we should align the assignments to be able to read it, like

I am pretty much able to read it as is, and moreover our code style
does not contain any kinds of such alignments. I explained that all
already in public chats, in private, in emails. Not going to describe
pros and cons again.

I applied the alignment, and faced the problems right away - with equal
alignment for MP_CONCAT4_R, MP_CONCAT4, and MP_PRINT one of the lines
becomes too big, out of 80 symbols. On the other hand with not equal
alignments it would not look better than now.

The mp_func_name() definitions with +2 tabs fall out of 80 as well, but
with +1 the space is too short - any a bit longer name will violate
the alignment in future causing either big unnecessary diff or alignment
violation.

> #define MP_CONCAT4_R(a, b, c, d)	a##b##c##d
> #define MP_CONCAT4(a, b, c, d)		MP_CONCAT4_R(a, b, c, d)
> #define MP_PRINT(total, ...)		MP_PRINT_2(total, MP_PRINT_FUNC, __VA_ARGS__)
> 
> #define mp_func_name(name)	MP_CONCAT4(mp_, MP_PRINT_SUFFIX, _, name)
> #define mp_print_error_one	mp_func_name(error_one)
> #define mp_print_error_stack	mp_func_name(error_stack)
> #define mp_print_error		mp_func_name(error)
> #define mp_print_common		mp_func_name(recursion)
> 
>> +
>> +static int
>> +mp_print_error_one(MP_PRINT_ARGS_DECL, const char **data, int depth)
>> +{
>> +	int total = 0;
>> +	MP_PRINT(total, "{");
>> +	if (depth <= 0) {
>> +		MP_PRINT(total, "...}");
>> +		return total;
>> +	}
>> +	const char *field_to_key[MP_ERROR_MAX] = {
>> +		/* MP_ERROR_TYPE = */ "\"type\": ",
>> +		/* MP_ERROR_FILE = */ "\"file\": ",
>> +		/* MP_ERROR_LINE = */ "\"line\": ",
>> +		/* MP_ERROR_MESSAGE = */ "\"message\": ",
>> +		/* MP_ERROR_ERRNO = */ "\"errno\": ",
>> +		/* MP_ERROR_CODE = */ "\"code\": ",
>> +		/* MP_ERROR_FIELDS = */ "\"fields\": ",
>> +	};
> 
> Please use designited initializers in a readable way
> instead of this bloody mess
> 
> 	const char *field_to_key[] = {
> 		[MP_ERROR_TYPE]		= "\"type\": ",
> 		...
> 		[MP_ERROR_FILE]		= ...,
> 		...
> 		[MP_ERROR_FIELDS]	= "\"fields\": ",
> 	}

Once again - this is not a mess, this is our codestyle. If you
don't like it, you should talk to somebody responsible for our
SOP to change it. We can't write our code in individual code
styles, and silently hope to push the old style away eventually.
If you want changes, they need to go through our documentation
and the responsible person. This is Kirill Y., as I remember.

Talking of [<number>] = <value> initializer - I didn't know you
can do that in C. Cool, thanks. Now it should be safer.

I applied the alignments, but I won't push it into the master until,
of course, the branch gets all ACKs, *and* until we agree to either
align all new code, or don't align such places at all. In the latter
case I will return the old version.

====================

diff --git a/src/box/mp_error.cc b/src/box/mp_error.cc
index fed2ce288..0b5e6bc96 100644
--- a/src/box/mp_error.cc
+++ b/src/box/mp_error.cc
@@ -589,15 +589,16 @@ error_unpack_unsafe(const char **data)
  * the actual output.
  */
 
-#define MP_CONCAT4_R(a, b, c, d) a##b##c##d
-#define MP_CONCAT4(a, b, c, d) MP_CONCAT4_R(a, b, c, d)
-#define MP_PRINT(total, ...) MP_PRINT_2(total, MP_PRINT_FUNC, __VA_ARGS__)
+#define MP_CONCAT4_R(a, b, c, d)	a##b##c##d
+#define MP_CONCAT4(a, b, c, d)		MP_CONCAT4_R(a, b, c, d)
+#define MP_PRINT(total, ...)		MP_PRINT_2(total, MP_PRINT_FUNC,	\
+						   __VA_ARGS__)
 
-#define mp_func_name(name) MP_CONCAT4(mp_, MP_PRINT_SUFFIX, _, name)
-#define mp_print_error_one mp_func_name(error_one)
-#define mp_print_error_stack mp_func_name(error_stack)
-#define mp_print_error mp_func_name(error)
-#define mp_print_common mp_func_name(recursion)
+#define mp_func_name(name)	MP_CONCAT4(mp_, MP_PRINT_SUFFIX, _, name)
+#define mp_print_error_one	mp_func_name(error_one)
+#define mp_print_error_stack	mp_func_name(error_stack)
+#define mp_print_error		mp_func_name(error)
+#define mp_print_common		mp_func_name(recursion)
====================

I don't know why the line above is screwed. In source it looks ok.
Probably due to '+' in the beginning, added by git. One another
reason against alignments - they will look shitty in git diff, show,
etc.

====================
 
 static int
 mp_print_error_one(MP_PRINT_ARGS_DECL, const char **data, int depth)
@@ -609,13 +610,13 @@ mp_print_error_one(MP_PRINT_ARGS_DECL, const char **data, int depth)
 		return total;
 	}
 	const char *field_to_key[MP_ERROR_MAX] = {
-		/* MP_ERROR_TYPE = */ "\"type\": ",
-		/* MP_ERROR_FILE = */ "\"file\": ",
-		/* MP_ERROR_LINE = */ "\"line\": ",
-		/* MP_ERROR_MESSAGE = */ "\"message\": ",
-		/* MP_ERROR_ERRNO = */ "\"errno\": ",
-		/* MP_ERROR_CODE = */ "\"code\": ",
-		/* MP_ERROR_FIELDS = */ "\"fields\": ",
+		[MP_ERROR_TYPE] =	"\"type\": ",
+		[MP_ERROR_FILE] =	"\"file\": ",
+		[MP_ERROR_LINE] =	"\"line\": ",
+		[MP_ERROR_MESSAGE] =	"\"message\": ",
+		[MP_ERROR_ERRNO] =	"\"errno\": ",
+		[MP_ERROR_CODE] =	"\"code\": ",
+		[MP_ERROR_FIELDS] =	"\"fields\": ",
 	};
 	--depth;
 	if (mp_typeof(**data) != MP_MAP)