[tarantool-patches] Re: [PATCH v2] add optional 'chars' param to string.strip functions

Alexander Turenko alexander.turenko at tarantool.org
Mon Jan 21 21:14:50 MSK 2019


Hi!

Add the 'lua: ' prefix to the commit message header and try to fit it in
50 symbols, maybe like so:

> lua: add 'chars' param to string.strip functions

The Lua part looks okay, the test looks okay. See comments re C part
below.

WBR, Alexander Turenko.

On Thu, Jan 17, 2019 at 09:35:20PM +0000, Michał Durak wrote:
> Add optional 'chars' parameter to string.strip, string.lstrip
> and string.rstrip for specifying the unwanted characters.
> Behavior modeled after the equivalent Python built-ins.
> 
> Needed for: #2977

Please, change to 'Fixes #2977' or 'Closes #2977' to auto-close the
commit.

Also it would be great if you would write docbot comment to allow the
tarantool documentation team to catch the new task when your commit will
land into 2.1 branch. Docs and example:

https://github.com/tarantool/docbot/blob/master/README.md
https://github.com/tarantool/tarantool/commit/dfa7a61d53fd8ad1d5aabb8138028eb121a40f61

> ---
> 
> FreeLists seems to redact the email addresses of senders using ProtonMail.
> Please direct any replies to gdrbyko1[at]protonmail[dot]com
> branch: https://github.com/gdrbyKo1/tarantool/tree/gdrbyko1/gh-2977
> issue:  https://github.com/tarantool/tarantool/issues/2977

> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index 04de5ad04..5ff4094ee 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -109,6 +109,7 @@ set (core_sources
>       coll_def.c
>       mpstream.c
>       port.c
> +     tt_string.c

Move it to src/lua/string.[ch].

> diff --git a/src/lua/string.lua b/src/lua/string.lua
> index cbce26b35..35bd95343 100644
> --- a/src/lua/string.lua
> +++ b/src/lua/string.lua
> @@ -6,15 +6,23 @@ ffi.cdef[[
>             const char *needle,   size_t needle_len);
>      int memcmp(const char *mem1, const char *mem2, size_t num);
>      int isspace(int c);
> +    void
> +    string_strip_helper(const char *inp, unsigned long inp_len,
> +                        const char *chars, unsigned long chars_len,
> +                        bool lstrip, bool rstrip,
> +                        unsigned long *newstart, unsigned long *newlen);
>  ]]

Don't forget to change definition here after size_t changes (see below).

> diff --git a/src/tt_string.c b/src/tt_string.c
> new file mode 100644
> index 000000000..a3cd4ff94
> --- /dev/null
> +++ b/src/tt_string.c

Please, use tabs for indent. Consider out C style guide:
https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/

> +unsigned long
> +string_lstrip_helper(const char *inp, unsigned long inp_len, uint8_t *arr)
> +{

1. It seems that it is usual to use size_t for a string length. At least
   lua_tolstring() and strlen() use them. It seems that we should use it
   too.

2. We can make this and the following function static (and also remove
   them from a header).

3. I think a comment would be helpful, at least what is a return value.

4. I think we can name it like lstrip_helper() to highlight the fact
   that it a) is file-level (static); b) has different API from
   string_strip_helper().

> +        unsigned long i;
> +        for (i = 0; i < inp_len; ++i) {
> +                unsigned c = inp[i];

Maybe this should be unsigned char?

> +                if (arr[c] == 0)
> +                        break;
> +        }
> +        return i;
> +}
> +
> +unsigned long
> +string_rstrip_helper(const char *inp, unsigned long inp_len, uint8_t *arr)
> +{
> +        unsigned long i;
> +        for (i = inp_len - 1; i != (unsigned long)(-1); --i) {
> +                unsigned c = inp[i];
> +                if (arr[c] == 0)
> +                        break;
> +        }
> +        return inp_len - i - 1;
> +}

The same questions as for string_lstrip_helper are applicable here.

> +void
> +string_strip_helper(const char *inp, unsigned long inp_len, const char *chars,
> +		    unsigned long chars_len, bool lstrip, bool rstrip,
> +                    unsigned long *newstart, unsigned long *newlen)
> +{
> +        unsigned long skipped_from_left = 0;
> +        unsigned long skipped_from_right = 0;
> +        uint8_t arr[256] = {0};
> +        for (unsigned long i = 0; i < chars_len; ++i) {
> +                unsigned c = chars[i];
> +                arr[c] = 1;
> +        }
> +
> +        if (lstrip)
> +                skipped_from_left = string_lstrip_helper(inp, inp_len, arr);
> +        if (rstrip && skipped_from_left < inp_len - 1)

This condition is not quite obvious. Can you please comment it?

> +                skipped_from_right = string_rstrip_helper(inp, inp_len, arr);
> +
> +        *newstart = skipped_from_left;
> +        *newlen = inp_len - skipped_from_left - skipped_from_right;
> +}

> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#if defined(__cplusplus)
> +extern "C" {
> +#endif /* defined(__cplusplus) */
> +
> +unsigned long
> +string_lstrip_helper(const char *inp, unsigned long inp_len, uint8_t *arr);
> +
> +unsigned long
> +string_rstrip_helper(const char *inp, unsigned long inp_len, uint8_t *arr);
> +

These function can be removed.

> +/** \cond public */
> +
> +void
> +string_strip_helper(const char *inp, unsigned long inp_len, const char *chars,
> +		    unsigned long chars_len, bool lstrip, bool rstrip,
> +                    unsigned long *newstart, unsigned long *newlen);

Tabs correctly used at line 51, but spaces are at line 52. Please,
follow our C style guide (it is very similar to Linux Kernel style).

Please add doxygen-style comments (what a function do, what are
parameters and return values). Note: comments should fit into 66
symbols.




More information about the Tarantool-patches mailing list