From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 1B2AF247CA for ; Mon, 21 Jan 2019 13:14:52 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id QEAF5QmJ9taM for ; Mon, 21 Jan 2019 13:14:51 -0500 (EST) Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 6C8B7247BF for ; Mon, 21 Jan 2019 13:14:51 -0500 (EST) Date: Mon, 21 Jan 2019 21:14:50 +0300 From: Alexander Turenko Subject: [tarantool-patches] Re: [PATCH v2] add optional 'chars' param to string.strip functions Message-ID: <20190121181448.hh4t7kvv45tc7ldu@tkn_work_nb> References: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: =?utf-8?Q?Micha=C5=82?= Durak Cc: "tarantool-patches@freelists.org" Hi! Add the 'lua: ' prefix to the commit message header and try to fit it in 50 symbols, maybe like so: > lua: add 'chars' param to string.strip functions The Lua part looks okay, the test looks okay. See comments re C part below. WBR, Alexander Turenko. On Thu, Jan 17, 2019 at 09:35:20PM +0000, MichaƂ Durak wrote: > Add optional 'chars' parameter to string.strip, string.lstrip > and string.rstrip for specifying the unwanted characters. > Behavior modeled after the equivalent Python built-ins. > > Needed for: #2977 Please, change to 'Fixes #2977' or 'Closes #2977' to auto-close the commit. Also it would be great if you would write docbot comment to allow the tarantool documentation team to catch the new task when your commit will land into 2.1 branch. Docs and example: https://github.com/tarantool/docbot/blob/master/README.md https://github.com/tarantool/tarantool/commit/dfa7a61d53fd8ad1d5aabb8138028eb121a40f61 > --- > > FreeLists seems to redact the email addresses of senders using ProtonMail. > Please direct any replies to gdrbyko1[at]protonmail[dot]com > branch: https://github.com/gdrbyKo1/tarantool/tree/gdrbyko1/gh-2977 > issue: https://github.com/tarantool/tarantool/issues/2977 > diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt > index 04de5ad04..5ff4094ee 100644 > --- a/src/CMakeLists.txt > +++ b/src/CMakeLists.txt > @@ -109,6 +109,7 @@ set (core_sources > coll_def.c > mpstream.c > port.c > + tt_string.c Move it to src/lua/string.[ch]. > diff --git a/src/lua/string.lua b/src/lua/string.lua > index cbce26b35..35bd95343 100644 > --- a/src/lua/string.lua > +++ b/src/lua/string.lua > @@ -6,15 +6,23 @@ ffi.cdef[[ > const char *needle, size_t needle_len); > int memcmp(const char *mem1, const char *mem2, size_t num); > int isspace(int c); > + void > + string_strip_helper(const char *inp, unsigned long inp_len, > + const char *chars, unsigned long chars_len, > + bool lstrip, bool rstrip, > + unsigned long *newstart, unsigned long *newlen); > ]] Don't forget to change definition here after size_t changes (see below). > diff --git a/src/tt_string.c b/src/tt_string.c > new file mode 100644 > index 000000000..a3cd4ff94 > --- /dev/null > +++ b/src/tt_string.c Please, use tabs for indent. Consider out C style guide: https://www.tarantool.io/en/doc/1.10/dev_guide/c_style_guide/ > +unsigned long > +string_lstrip_helper(const char *inp, unsigned long inp_len, uint8_t *arr) > +{ 1. It seems that it is usual to use size_t for a string length. At least lua_tolstring() and strlen() use them. It seems that we should use it too. 2. We can make this and the following function static (and also remove them from a header). 3. I think a comment would be helpful, at least what is a return value. 4. I think we can name it like lstrip_helper() to highlight the fact that it a) is file-level (static); b) has different API from string_strip_helper(). > + unsigned long i; > + for (i = 0; i < inp_len; ++i) { > + unsigned c = inp[i]; Maybe this should be unsigned char? > + if (arr[c] == 0) > + break; > + } > + return i; > +} > + > +unsigned long > +string_rstrip_helper(const char *inp, unsigned long inp_len, uint8_t *arr) > +{ > + unsigned long i; > + for (i = inp_len - 1; i != (unsigned long)(-1); --i) { > + unsigned c = inp[i]; > + if (arr[c] == 0) > + break; > + } > + return inp_len - i - 1; > +} The same questions as for string_lstrip_helper are applicable here. > +void > +string_strip_helper(const char *inp, unsigned long inp_len, const char *chars, > + unsigned long chars_len, bool lstrip, bool rstrip, > + unsigned long *newstart, unsigned long *newlen) > +{ > + unsigned long skipped_from_left = 0; > + unsigned long skipped_from_right = 0; > + uint8_t arr[256] = {0}; > + for (unsigned long i = 0; i < chars_len; ++i) { > + unsigned c = chars[i]; > + arr[c] = 1; > + } > + > + if (lstrip) > + skipped_from_left = string_lstrip_helper(inp, inp_len, arr); > + if (rstrip && skipped_from_left < inp_len - 1) This condition is not quite obvious. Can you please comment it? > + skipped_from_right = string_rstrip_helper(inp, inp_len, arr); > + > + *newstart = skipped_from_left; > + *newlen = inp_len - skipped_from_left - skipped_from_right; > +} > +#include > +#include > + > +#if defined(__cplusplus) > +extern "C" { > +#endif /* defined(__cplusplus) */ > + > +unsigned long > +string_lstrip_helper(const char *inp, unsigned long inp_len, uint8_t *arr); > + > +unsigned long > +string_rstrip_helper(const char *inp, unsigned long inp_len, uint8_t *arr); > + These function can be removed. > +/** \cond public */ > + > +void > +string_strip_helper(const char *inp, unsigned long inp_len, const char *chars, > + unsigned long chars_len, bool lstrip, bool rstrip, > + unsigned long *newstart, unsigned long *newlen); Tabs correctly used at line 51, but spaces are at line 52. Please, follow our C style guide (it is very similar to Linux Kernel style). Please add doxygen-style comments (what a function do, what are parameters and return values). Note: comments should fit into 66 symbols.