* [tarantool-patches] [PATCH v3] lua: add 'chars' param to string.strip functions
@ 2019-01-22 18:51 Michał Durak
  2019-02-04  4:47 ` [tarantool-patches] " Alexander Turenko
  0 siblings, 1 reply; 5+ messages in thread
From: Michał Durak @ 2019-01-22 18:51 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches
Subject: [PATCH] lua: add 'chars' param to string.strip functions
Add optional 'chars' parameter to string.strip, string.lstrip
and string.rstrip for specifying the unwanted characters.
Behavior modeled after the equivalent Python built-ins.
Closes: #2977
@TarantoolBot document
Title: string.strip(inp, chars)
Update the documentation for string.strip,
string.lstrip and string.rstrip to reflect
the addition of the optional  param.
---
Thank you for the review. I have fixed all of the issues you pointed out.
I'm sorry about the indentation issues, not sure how I managed to mess up
this badly. I did read the C style guide earlier and I actually did change
it all to tabs but maybe I forgot to save the changes or something.
When you asked about the rstrip condition, I noticed that was wrong in the end,
and it would make string.rstrip("#", "#") incorrectly return "#" rather than
an empty string. I fixed that issue and added relevant test case.
Below follows the whole patch once again, vs the commit fixing my previous
mistakes, but that commit can be found in my github branch if needed.
branch: https://github.com/gdrbyKo1/tarantool/tree/gdrbyko1/gh-2977
issue:  https://github.com/tarantool/tarantool/issues/2977
                 |   1 +
 src/CMakeLists.txt           |   2 +
 src/lua/string.c             | 119 +++++++++++++++++++++++++++++++++++++++++++
 src/lua/string.h             |  55 ++++++++++++++++++++
 src/lua/string.lua           |  63 ++++++++++++++++++++---
 test/app-tap/string.test.lua |  76 +++++++++++++++++++++++----
 6 files changed, 300 insertions(+), 16 deletions(-)
 create mode 100644 src/lua/string.c
 create mode 100644 src/lua/string.h
 --git a/extra/exports b/extra/exports
index 5f69e0730..0b304e4b1 100644
--- a/extra/exports
+++ b/extra/exports
@@ -209,6 +209,7 @@ clock_realtime64
 clock_monotonic64
 clock_process64
 clock_thread64
+string_strip_helper
 # Lua / LuaJIT
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 04de5ad04..905ebcc4e 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -183,6 +183,7 @@ set (server_sources
      lua/httpc.c
      lua/utf8.c
      lua/info.c
+     lua/string.c
      ${lua_sources}
      ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/lyaml.cc
      ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/b64.c
@@ -200,6 +201,7 @@ set(api_headers
     ${CMAKE_SOURCE_DIR}/src/coio_task.h
     ${CMAKE_SOURCE_DIR}/src/lua/utils.h
     ${CMAKE_SOURCE_DIR}/src/lua/error.h
+    ${CMAKE_SOURCE_DIR}/src/lua/string.h
     ${CMAKE_SOURCE_DIR}/src/box/txn.h
     ${CMAKE_SOURCE_DIR}/src/box/key_def.h
     ${CMAKE_SOURCE_DIR}/src/box/field_def.h
diff --git a/src/lua/string.c b/src/lua/string.c
new file mode 100644
index 000000000..03c36c262
--- /dev/null
+++ b/src/lua/string.c
@@ -0,0 +1,119 @@
+/*
+ * Copyright 2010-2019 Tarantool AUTHORS: please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "string.h"
+
+#include <stdint.h>
+#include <stdbool.h>
+#include <stddef.h>
+
+/*
+ * Implements the lstrip functionality of string_strip_helper.
+ *
+ * @param inp     The input string to perform lstrip on.
+ * @param inp_len The length of the input string.
+ * @param arr     An array of 256 values defining
+ *                the set of characters to strip.
+ * @return the number of characters to strip from the left.
+ */
+static size_t
+lstrip_helper(const char *inp, size_t inp_len, uint8_t *arr)
+{
+	size_t i;
+	for (i = 0; i < inp_len; ++i) {
+		unsigned char c = inp[i];
+		if (arr[c] == 0)
+			break;
+	}
+	return i;
+}
+
+/*
+ * Same as lstrip_helper, but for rstrip.
+ *
+ * @param inp     The input string to perform rstrip on.
+ * @param inp_len The length of the input string.
+ * @param arr     An array of 256 values defining
+ *                the set of characters to strip.
+ * @return the number of characters to strip from the right
+ */
+static size_t
+rstrip_helper(const char *inp, size_t inp_len, uint8_t *arr)
+{
+	size_t i;
+	for (i = inp_len - 1; i != (size_t)(-1); --i) {
+		unsigned char c = inp[i];
+		if (arr[c] == 0)
+			break;
+	}
+	return inp_len - i - 1;
+}
+
+/*
+ * Perform a combination of lstrip and rstrip on the input string,
+ * and return the position and length of the resulting substring.
+ *
+ * @param[in]  inp       The input string to perform stripping on.
+ * @param[in]  inp_len   The length of the input string.
+ * @param[in]  chars     A string representing the unwanted chars.
+ * @param[in]  chars_len The length of chars
+ * @param[in]  lstrip    Whether to perform lstrip or not.
+ * @param[in]  rstrip    Whether to perform rstrip or not.
+ * @param[out] newstart  The position of the resulting substring
+ *                       in the input string.
+ * @param[out] newlen    The length of the resulting substring.
+ */
+void
+string_strip_helper(const char *inp, size_t inp_len, const char *chars,
+		    size_t chars_len, bool lstrip, bool rstrip,
+		    size_t *newstart, size_t *newlen)
+{
+	size_t skipped_from_left = 0;
+	size_t skipped_from_right = 0;
+	uint8_t arr[256] = {0};
+	for (size_t i = 0; i < chars_len; ++i) {
+		unsigned char c = chars[i];
+		arr[c] = 1;
+	}
+
+	if (lstrip)
+		skipped_from_left = lstrip_helper(inp, inp_len, arr);
+	/*
+	 * perform rstrip if the caller so wishes, but only if there are
+	 * still some characters not yet stripped by lstrip
+	 * for example, with chars "#" and input "###", lstrip and rstrip
+	 * helpers would both return 3, breaking the newlen calculation
+	 */
+	if (rstrip && skipped_from_left < inp_len)
+		skipped_from_right = rstrip_helper(inp, inp_len, arr);
+
+	*newstart = skipped_from_left;
+	*newlen = inp_len - skipped_from_left - skipped_from_right;
+}
diff --git a/src/lua/string.h b/src/lua/string.h
new file mode 100644
index 000000000..464c48031
--- /dev/null
+++ b/src/lua/string.h
@@ -0,0 +1,55 @@
+#ifndef TARANTOOL_STRING_H_INCLUDED
+#define TARANTOOL_STRING_H_INCLUDED
+/*
+ * Copyright 2010-2019 Tarantool AUTHORS: please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+#include <stddef.h>
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+/** \cond public */
+
+void
+string_strip_helper(const char *inp, size_t inp_len, const char *chars,
+		    size_t chars_len, bool lstrip, bool rstrip,
+		    size_t *newstart, size_t *newlen);
+
+/** \endcond public */
+
+#if defined(__cplusplus)
+} /* extern "C" */
+#endif /* defined(__cplusplus) */
+
+#endif /* TARANTOOL_STRING_H_INCLUDED */
diff --git a/src/lua/string.lua b/src/lua/string.lua
index cbce26b35..8216ace6a 100644
--- a/src/lua/string.lua
+++ b/src/lua/string.lua
@@ -6,15 +6,22 @@ ffi.cdef[[
            const char *needle,   size_t needle_len);
     int memcmp(const char *mem1, const char *mem2, size_t num);
     int isspace(int c);
+    void
+    string_strip_helper(const char *inp, size_t inp_len, const char *chars,
+                        size_t chars_len, bool lstrip, bool rstrip,
+                        size_t *newstart, size_t *newlen);
 ]]
-local c_char_ptr = ffi.typeof('const char *')
+local c_char_ptr     = ffi.typeof('const char *')
+local strip_newstart = ffi.new("unsigned long[1]")
+local strip_newlen   = ffi.new("unsigned long[1]")
 local memcmp  = ffi.C.memcmp
 local memmem  = ffi.C.memmem
 local isspace = ffi.C.isspace
 local err_string_arg = "bad argument #%d to '%s' (%s expected, got %s)"
+local space_chars    = ' \t\n\v\f\r'
 local function string_split_empty(inp, maxsplit)
     local p = c_char_ptr(inp)
@@ -339,25 +346,67 @@ local function string_fromhex(inp)
     return ffi.string(res, len)
 end
-local function string_strip(inp)
+local function string_strip(inp, chars)
     if type(inp) ~= 'string' then
         error(err_string_arg:format(1, "string.strip", 'string', type(inp)), 2)
     end
-    return (string.gsub(inp, "^%s*(.-)%s*$", "%1"))
+    if inp == '' then
+        return inp
+    end
+    if chars == nil then
+        chars = space_chars
+    elseif type(chars) ~= 'string' then
+        error(err_string_arg:format(2, "string.strip", 'string', type(chars)), 2)
+    elseif chars == '' then
+        return inp
+    end
+
+    local casted_inp = c_char_ptr(inp)
+    ffi.C.string_strip_helper(inp, #inp, chars, #chars, true, true,
+                              strip_newstart, strip_newlen)
+    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
 end
-local function string_lstrip(inp)
+local function string_lstrip(inp, chars)
     if type(inp) ~= 'string' then
         error(err_string_arg:format(1, "string.lstrip", 'string', type(inp)), 2)
     end
-    return (string.gsub(inp, "^%s*(.-)", "%1"))
+    if inp == '' then
+        return inp
+    end
+    if chars == nil then
+        chars = space_chars
+    elseif type(chars) ~= 'string' then
+        error(err_string_arg:format(2, "string.lstrip", 'string', type(chars)), 2)
+    elseif chars == '' then
+        return inp
+    end
+
+    local casted_inp = c_char_ptr(inp)
+    ffi.C.string_strip_helper(inp, #inp, chars, #chars, true, false,
+                              strip_newstart, strip_newlen)
+    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
 end
-local function string_rstrip(inp)
+local function string_rstrip(inp, chars)
     if type(inp) ~= 'string' then
         error(err_string_arg:format(1, "string.rstrip", 'string', type(inp)), 2)
     end
-    return (string.gsub(inp, "(.-)%s*$", "%1"))
+    if inp == '' then
+        return inp
+    end
+    if chars == nil then
+        chars = space_chars
+    elseif type(chars) ~= 'string' then
+        error(err_string_arg:format(2, "string.rstrip", 'string', type(chars)), 2)
+    elseif chars == '' then
+        return inp
+    end
+
+    local casted_inp = c_char_ptr(inp)
+    ffi.C.string_strip_helper(inp, #inp, chars, #chars, false, true,
+                              strip_newstart, strip_newlen)
+    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
 end
diff --git a/test/app-tap/string.test.lua b/test/app-tap/string.test.lua
index 7203fcd36..02a1a84d7 100755
--- a/test/app-tap/string.test.lua
+++ b/test/app-tap/string.test.lua
@@ -134,18 +134,76 @@ test:test("fromhex", function(test)
 end)
 test:test("strip", function(test)
-    test:plan(6)
-    local str = "  hello hello "
-    test:is(string.len(string.strip(str)), 11, "strip")
-    test:is(string.len(string.lstrip(str)), 12, "lstrip")
-    test:is(string.len(string.rstrip(str)), 13, "rstrip")
+    test:plan(45)
+    local str = "  Hello world! "
+    test:is(string.strip(str), "Hello world!", "strip (without chars)")
+    test:is(string.lstrip(str), "Hello world! ", "lstrip (without chars)")
+    test:is(string.rstrip(str), "  Hello world!", "rstrip (without chars)")
+    str = ""
+    test:is(string.strip(str), str, "strip (0-len inp without chars)")
+    test:is(string.lstrip(str), str, "lstrip (0-len inp without chars)")
+    test:is(string.rstrip(str), str, "rstrip (0-len inp without chars)")
+    str = " "
+    test:is(string.strip(str), "", "strip (1-len inp without chars)")
+    test:is(string.lstrip(str), "", "lstrip (1-len inp without chars)")
+    test:is(string.rstrip(str), "", "rstrip (1-len inp without chars)")
+    str = "\t\v"
+    test:is(string.strip(str), "", "strip (strip everything without chars)")
+    test:is(string.lstrip(str), "", "lstrip (strip everything without chars)")
+    test:is(string.rstrip(str), "", "rstrip (strip everything without chars)")
+    str = "hello"
+    test:is(string.strip(str), str, "strip (strip nothing without chars)")
+    test:is(string.lstrip(str), str, "lstrip (strip nothing without chars)")
+    test:is(string.rstrip(str), str, "rstrip (strip nothing without chars)")
+    str = " \t\n\v\f\rTEST \t\n\v\f\r"
+    test:is(string.strip(str), "TEST", "strip (all space characters without chars)")
+    test:is(string.lstrip(str), "TEST \t\n\v\f\r", "lstrip (all space characters without chars)")
+    test:is(string.rstrip(str), " \t\n\v\f\rTEST", "rstrip (all space characters without chars)")
+
+    local chars = "#\0"
+    str = "##Hello world!#"
+    test:is(string.strip(str, chars), "Hello world!", "strip (with chars)")
+    test:is(string.lstrip(str, chars), "Hello world!#", "lstrip (with chars)")
+    test:is(string.rstrip(str, chars), "##Hello world!", "rstrip (with chars)")
+    str = ""
+    test:is(string.strip(str, chars), str, "strip (0-len inp with chars)")
+    test:is(string.lstrip(str, chars), str, "lstrip (0-len inp with chars)")
+    test:is(string.rstrip(str, chars), str, "rstrip (0-len inp with chars)")
+    str = "#"
+    test:is(string.strip(str, chars), "", "strip (1-len inp with chars)")
+    test:is(string.lstrip(str, chars), "", "lstrip (1-len inp with chars)")
+    test:is(string.rstrip(str, chars), "", "rstrip (1-len inp with chars)")
+    str = "##"
+    test:is(string.strip(str, chars), "", "strip (strip everything with chars)")
+    test:is(string.lstrip(str, chars), "", "lstrip (strip everything with chars)")
+    test:is(string.rstrip(str, chars), "", "rstrip (strip everything with chars)")
+    str = "hello"
+    test:is(string.strip(str, chars), str, "strip (strip nothing with chars)")
+    test:is(string.lstrip(str, chars), str, "lstrip (strip nothing with chars)")
+    test:is(string.rstrip(str, chars), str, "rstrip (strip nothing with chars)")
+    str = "\0\0\0TEST\0"
+    test:is(string.strip(str, chars), "TEST", "strip (embedded 0s with chars)")
+    test:is(string.lstrip(str, chars), "TEST\0", "lstrip (embedded 0s with chars)")
+    test:is(string.rstrip(str, chars), "\0\0\0TEST", "rstrip (embedded 0s with chars)")
+    chars = ""
+    test:is(string.strip(str, chars), str, "strip (0-len chars)")
+    test:is(string.lstrip(str, chars), str, "lstrip (0-len chars)")
+    test:is(string.rstrip(str, chars), str, "rstrip (0-len chars)")
+
     local _, err = pcall(string.strip, 12)
-    test:ok(err and err:match("%(string expected, got number%)"))
+    test:ok(err and err:match("#1 to '.-%.strip' %(string expected, got number%)"), "strip err 1")
     _, err = pcall(string.lstrip, 12)
-    test:ok(err and err:match("%(string expected, got number%)"))
+    test:ok(err and err:match("#1 to '.-%.lstrip' %(string expected, got number%)"), "lstrip err 1")
     _, err = pcall(string.rstrip, 12)
-    test:ok(err and err:match("%(string expected, got number%)"))
-end )
+    test:ok(err and err:match("#1 to '.-%.rstrip' %(string expected, got number%)"), "rstrip err 1")
+
+    _, err = pcall(string.strip, "foo", 12)
+    test:ok(err and err:match("#2 to '.-%.strip' %(string expected, got number%)"), "strip err 2")
+    _, err = pcall(string.lstrip, "foo", 12)
+    test:ok(err and err:match("#2 to '.-%.lstrip' %(string expected, got number%)"), "lstrip err 2")
+    _, err = pcall(string.rstrip, "foo", 12)
+    test:ok(err and err:match("#2 to '.-%.rstrip' %(string expected, got number%)"), "rstrip err 2")
+end)
 test:test("unicode", function(test)
     test:plan(104)
--
2.11.0
^ permalink raw reply	[flat|nested] 5+ messages in thread
* [tarantool-patches] Re: [PATCH v3] lua: add 'chars' param to string.strip functions
  2019-01-22 18:51 [tarantool-patches] [PATCH v3] lua: add 'chars' param to string.strip functions Michał Durak
@ 2019-02-04  4:47 ` Alexander Turenko
  2019-02-11 19:20   ` Michał Durak
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Turenko @ 2019-02-04  4:47 UTC (permalink / raw)
  To: Michał Durak; +Cc: tarantool-patches
Hi!
Sorry for the late answer.
Some really minor comments are below.
Also I attached a patch at end of the email. I changed the string helper
a bit and I think it looks simpler in that way. Are there any reason
(performance?) why we should not apply this change?
WBR, Alexander Turenko.
> branch: https://github.com/gdrbyKo1/tarantool/tree/gdrbyko1/gh-2977
> issue:  https://github.com/tarantool/tarantool/issues/2977
Please, squash your two commits.
> +/*
> + * Implements the lstrip functionality of string_strip_helper.
> + *
> + * @param inp     The input string to perform lstrip on.
> + * @param inp_len The length of the input string.
> + * @param arr     An array of 256 values defining
> + *                the set of characters to strip.
> + * @return the number of characters to strip from the left.
> + */
Nit: The return value description starts from a lower-case letter. The
same for rstrip_helper().
> +/*
> + * Same as lstrip_helper, but for rstrip.
> + *
> + * @param inp     The input string to perform rstrip on.
> + * @param inp_len The length of the input string.
> + * @param arr     An array of 256 values defining
> + *                the set of characters to strip.
> + * @return the number of characters to strip from the right
Nit: no period at the end.
> +void
> +string_strip_helper(const char *inp, size_t inp_len, const char *chars,
> +		    size_t chars_len, bool lstrip, bool rstrip,
> +		    size_t *newstart, size_t *newlen)
> +{
> +	size_t skipped_from_left = 0;
> +	size_t skipped_from_right = 0;
> +	uint8_t arr[256] = {0};
> +	for (size_t i = 0; i < chars_len; ++i) {
> +		unsigned char c = chars[i];
> +		arr[c] = 1;
> +	}
> +
> +	if (lstrip)
> +		skipped_from_left = lstrip_helper(inp, inp_len, arr);
> +	/*
> +	 * perform rstrip if the caller so wishes, but only if there are
Nit: start from a capital letter.
> +	 * still some characters not yet stripped by lstrip
> +	 * for example, with chars "#" and input "###", lstrip and rstrip
> +	 * helpers would both return 3, breaking the newlen calculation
> +	 */
> +	if (rstrip && skipped_from_left < inp_len)
> +		skipped_from_right = rstrip_helper(inp, inp_len, arr);
> +
> +	*newstart = skipped_from_left;
> +	*newlen = inp_len - skipped_from_left - skipped_from_right;
> +}
> diff --git a/src/lua/string.h b/src/lua/string.h
> new file mode 100644
> index 000000000..464c48031
> --- /dev/null
> +++ b/src/lua/string.h
> @@ -0,0 +1,55 @@
> +#ifndef TARANTOOL_STRING_H_INCLUDED
> +#define TARANTOOL_STRING_H_INCLUDED
Nit: We usually use full path to a file to form a sentry macro name:
TARANTOOL_LUA_STRING_H_INCLUDED.
----
diff --git a/src/lua/string.c b/src/lua/string.c
index 03c36c262..f117f2936 100644
--- a/src/lua/string.c
+++ b/src/lua/string.c
@@ -86,34 +86,31 @@ rstrip_helper(const char *inp, size_t inp_len, uint8_t *arr)
  * @param[in]  chars_len The length of chars
  * @param[in]  lstrip    Whether to perform lstrip or not.
  * @param[in]  rstrip    Whether to perform rstrip or not.
- * @param[out] newstart  The position of the resulting substring
- *                       in the input string.
- * @param[out] newlen    The length of the resulting substring.
+ * @param[out] newlen    The length of the resulting string.
+ *
+ * @return The resulting string.
  */
-void
+const char *
 string_strip_helper(const char *inp, size_t inp_len, const char *chars,
-		    size_t chars_len, bool lstrip, bool rstrip,
-		    size_t *newstart, size_t *newlen)
+		    size_t chars_len, bool lstrip, bool rstrip, size_t *newlen)
 {
-	size_t skipped_from_left = 0;
-	size_t skipped_from_right = 0;
 	uint8_t arr[256] = {0};
 	for (size_t i = 0; i < chars_len; ++i) {
 		unsigned char c = chars[i];
 		arr[c] = 1;
 	}
 
-	if (lstrip)
-		skipped_from_left = lstrip_helper(inp, inp_len, arr);
-	/*
-	 * perform rstrip if the caller so wishes, but only if there are
-	 * still some characters not yet stripped by lstrip
-	 * for example, with chars "#" and input "###", lstrip and rstrip
-	 * helpers would both return 3, breaking the newlen calculation
-	 */
-	if (rstrip && skipped_from_left < inp_len)
-		skipped_from_right = rstrip_helper(inp, inp_len, arr);
+	if (lstrip) {
+		size_t skipped_from_left = lstrip_helper(inp, inp_len, arr);
+		inp = inp + skipped_from_left;
+		inp_len = inp_len - skipped_from_left;
+	}
+
+	if (rstrip) {
+		size_t skipped_from_right = rstrip_helper(inp, inp_len, arr);
+		inp_len = inp_len - skipped_from_right;
+	}
 
-	*newstart = skipped_from_left;
-	*newlen = inp_len - skipped_from_left - skipped_from_right;
+	*newlen = inp_len;
+	return inp;
 }
diff --git a/src/lua/string.h b/src/lua/string.h
index 464c48031..448f551f6 100644
--- a/src/lua/string.h
+++ b/src/lua/string.h
@@ -41,10 +41,9 @@ extern "C" {
 
 /** \cond public */
 
-void
+const char *
 string_strip_helper(const char *inp, size_t inp_len, const char *chars,
-		    size_t chars_len, bool lstrip, bool rstrip,
-		    size_t *newstart, size_t *newlen);
+		    size_t chars_len, bool lstrip, bool rstrip,	size_t *newlen);
 
 /** \endcond public */
 
diff --git a/src/lua/string.lua b/src/lua/string.lua
index 8216ace6a..0c47030f1 100644
--- a/src/lua/string.lua
+++ b/src/lua/string.lua
@@ -6,14 +6,13 @@ ffi.cdef[[
            const char *needle,   size_t needle_len);
     int memcmp(const char *mem1, const char *mem2, size_t num);
     int isspace(int c);
-    void
+    const char *
     string_strip_helper(const char *inp, size_t inp_len, const char *chars,
                         size_t chars_len, bool lstrip, bool rstrip,
-                        size_t *newstart, size_t *newlen);
+                        size_t *newlen);
 ]]
 
 local c_char_ptr     = ffi.typeof('const char *')
-local strip_newstart = ffi.new("unsigned long[1]")
 local strip_newlen   = ffi.new("unsigned long[1]")
 
 local memcmp  = ffi.C.memcmp
@@ -362,9 +361,9 @@ local function string_strip(inp, chars)
     end
 
     local casted_inp = c_char_ptr(inp)
-    ffi.C.string_strip_helper(inp, #inp, chars, #chars, true, true,
-                              strip_newstart, strip_newlen)
-    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
+    local newstr = ffi.C.string_strip_helper(inp, #inp, chars, #chars, true,
+                                             true, strip_newlen)
+    return ffi.string(newstr, strip_newlen[0])
 end
 
 local function string_lstrip(inp, chars)
@@ -383,9 +382,9 @@ local function string_lstrip(inp, chars)
     end
 
     local casted_inp = c_char_ptr(inp)
-    ffi.C.string_strip_helper(inp, #inp, chars, #chars, true, false,
-                              strip_newstart, strip_newlen)
-    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
+    local newstr = ffi.C.string_strip_helper(inp, #inp, chars, #chars, true,
+                                             false, strip_newlen)
+    return ffi.string(newstr, strip_newlen[0])
 end
 
 local function string_rstrip(inp, chars)
@@ -404,9 +403,9 @@ local function string_rstrip(inp, chars)
     end
 
     local casted_inp = c_char_ptr(inp)
-    ffi.C.string_strip_helper(inp, #inp, chars, #chars, false, true,
-                              strip_newstart, strip_newlen)
-    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
+    local newstr = ffi.C.string_strip_helper(inp, #inp, chars, #chars, false,
+                                             true, strip_newlen)
+    return ffi.string(newstr, strip_newlen[0])
 end
^ permalink raw reply	[flat|nested] 5+ messages in thread
* [tarantool-patches] Re: [PATCH v3] lua: add 'chars' param to string.strip functions
  2019-02-04  4:47 ` [tarantool-patches] " Alexander Turenko
@ 2019-02-11 19:20   ` Michał Durak
  2019-02-18 20:42     ` Alexander Turenko
  0 siblings, 1 reply; 5+ messages in thread
From: Michał Durak @ 2019-02-11 19:20 UTC (permalink / raw)
  To: Alexander Turenko, tarantool-patches
Add optional 'chars' parameter to string.strip, string.lstrip
and string.rstrip for specifying the unwanted characters.
Behavior modeled after the equivalent Python built-ins.
Closes: tarantool#2977
@TarantoolBot document
Title: string.strip(inp, chars)
Update the documentation for string.strip,
string.lstrip and string.rstrip to reflect
the addition of the optional  param.
---
Hi. I considered your approach, and indeed, it does look simpler.
I also benchmarked it, and it's a bit slower than what I proposed earlier.
I don't think it's because of any particular flaw in your implementation,
rather it's how LuaJIT behaves with functions that return something.
There seems to be some kind of overhead, compared to calling void functions
making use of output parameters.
Keeping this in mind, I came up with another implementation, which also looks
simpler than my previous one, but at the same time offers equal performance.
In the benchmark results (link below), "string_lstrip_gdrbyko1" refers to the
implementation found in my previous patch, "string_lstrip_totktonada" is the
one you suggested, and "string_lstrip_a" is the one contained in the most recent
commit of my branch and also the patch attached below to this e-mail.
Also, I fixed all the nits you pointed out, thanks.
benchmark results: https://gist.github.com/gdrbyKo1/8d070a189efc3b065f937edbf35a1912
branch: https://github.com/gdrbyKo1/tarantool/tree/gdrbyko1/gh-2977
issue: https://github.com/tarantool/tarantool/issues/2977
Regards, Michał
                 |   1 +
 src/CMakeLists.txt           |   2 +
 src/lua/string.c             | 114 +++++++++++++++++++++++++++++++++++++++++++
 src/lua/string.h             |  55 +++++++++++++++++++++
 src/lua/string.lua           |  63 +++++++++++++++++++++---
 test/app-tap/string.test.lua |  76 +++++++++++++++++++++++++----
 6 files changed, 295 insertions(+), 16 deletions(-)
 create mode 100644 src/lua/string.c
 create mode 100644 src/lua/string.h
 --git a/extra/exports b/extra/exports
index 5f69e0730..0b304e4b1 100644
--- a/extra/exports
+++ b/extra/exports
@@ -209,6 +209,7 @@ clock_realtime64
 clock_monotonic64
 clock_process64
 clock_thread64
+string_strip_helper
 # Lua / LuaJIT
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 04de5ad04..905ebcc4e 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -183,6 +183,7 @@ set (server_sources
      lua/httpc.c
      lua/utf8.c
      lua/info.c
+     lua/string.c
      ${lua_sources}
      ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/lyaml.cc
      ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/b64.c
@@ -200,6 +201,7 @@ set(api_headers
     ${CMAKE_SOURCE_DIR}/src/coio_task.h
     ${CMAKE_SOURCE_DIR}/src/lua/utils.h
     ${CMAKE_SOURCE_DIR}/src/lua/error.h
+    ${CMAKE_SOURCE_DIR}/src/lua/string.h
     ${CMAKE_SOURCE_DIR}/src/box/txn.h
     ${CMAKE_SOURCE_DIR}/src/box/key_def.h
     ${CMAKE_SOURCE_DIR}/src/box/field_def.h
diff --git a/src/lua/string.c b/src/lua/string.c
new file mode 100644
index 000000000..8c8869473
--- /dev/null
+++ b/src/lua/string.c
@@ -0,0 +1,114 @@
+/*
+ * Copyright 2010-2019 Tarantool AUTHORS: please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "string.h"
+
+#include <stdint.h>
+#include <stdbool.h>
+#include <stddef.h>
+
+/*
+ * Implements the lstrip functionality of string_strip_helper.
+ *
+ * @param inp     The input string to perform lstrip on.
+ * @param inp_len The length of the input string.
+ * @param arr     An array of 256 values defining
+ *                the set of characters to strip.
+ *
+ * @return The number of characters to strip from the left.
+ */
+static size_t
+lstrip_helper(const char *inp, size_t inp_len, uint8_t *arr)
+{
+	size_t i;
+	for (i = 0; i < inp_len; ++i) {
+		unsigned char c = inp[i];
+		if (arr[c] == 0)
+			break;
+	}
+	return i;
+}
+
+/*
+ * Same as lstrip_helper, but for rstrip.
+ *
+ * @param inp     The input string to perform rstrip on.
+ * @param inp_len The length of the input string.
+ * @param arr     An array of 256 values defining
+ *                the set of characters to strip.
+ *
+ * @return The number of characters to strip from the right.
+ */
+static size_t
+rstrip_helper(const char *inp, size_t inp_len, uint8_t *arr)
+{
+	size_t i;
+	for (i = inp_len - 1; i != (size_t)(-1); --i) {
+		unsigned char c = inp[i];
+		if (arr[c] == 0)
+			break;
+	}
+	return inp_len - i - 1;
+}
+
+/*
+ * Perform a combination of lstrip and rstrip on the input string,
+ * and return the position and length of the resulting substring.
+ *
+ * @param[in]  inp       The input string to perform stripping on.
+ * @param[in]  inp_len   The length of the input string.
+ * @param[in]  chars     A string representing the unwanted chars.
+ * @param[in]  chars_len The length of chars
+ * @param[in]  lstrip    Whether to perform lstrip or not.
+ * @param[in]  rstrip    Whether to perform rstrip or not.
+ * @param[out] newstart  The position of the resulting substring
+ *                       in the input string.
+ * @param[out] newlen    The length of the resulting substring.
+ */
+void
+string_strip_helper(const char *inp, size_t inp_len, const char *chars,
+		    size_t chars_len, bool lstrip, bool rstrip,
+		    size_t *newstart, size_t *newlen)
+{
+	size_t skipped;
+	uint8_t arr[256] = {0};
+
+	for (size_t i = 0; i < chars_len; ++i) {
+		unsigned char c = chars[i];
+		arr[c] = 1;
+	}
+
+	*newstart = skipped = lstrip ? lstrip_helper(inp, inp_len, arr) : 0;
+
+	if (rstrip)
+		skipped += rstrip_helper(inp + skipped, inp_len - skipped, arr);
+
+	*newlen = inp_len - skipped;
+}
diff --git a/src/lua/string.h b/src/lua/string.h
new file mode 100644
index 000000000..340077619
--- /dev/null
+++ b/src/lua/string.h
@@ -0,0 +1,55 @@
+#ifndef TARANTOOL_LUA_STRING_H_INCLUDED
+#define TARANTOOL_LUA_STRING_H_INCLUDED
+/*
+ * Copyright 2010-2019 Tarantool AUTHORS: please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+#include <stddef.h>
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+/** \cond public */
+
+void
+string_strip_helper(const char *inp, size_t inp_len, const char *chars,
+		    size_t chars_len, bool lstrip, bool rstrip,
+		    size_t *newstart, size_t *newlen);
+
+/** \endcond public */
+
+#if defined(__cplusplus)
+} /* extern "C" */
+#endif /* defined(__cplusplus) */
+
+#endif /* TARANTOOL_LUA_STRING_H_INCLUDED */
diff --git a/src/lua/string.lua b/src/lua/string.lua
index cbce26b35..8216ace6a 100644
--- a/src/lua/string.lua
+++ b/src/lua/string.lua
@@ -6,15 +6,22 @@ ffi.cdef[[
            const char *needle,   size_t needle_len);
     int memcmp(const char *mem1, const char *mem2, size_t num);
     int isspace(int c);
+    void
+    string_strip_helper(const char *inp, size_t inp_len, const char *chars,
+                        size_t chars_len, bool lstrip, bool rstrip,
+                        size_t *newstart, size_t *newlen);
 ]]
-local c_char_ptr = ffi.typeof('const char *')
+local c_char_ptr     = ffi.typeof('const char *')
+local strip_newstart = ffi.new("unsigned long[1]")
+local strip_newlen   = ffi.new("unsigned long[1]")
 local memcmp  = ffi.C.memcmp
 local memmem  = ffi.C.memmem
 local isspace = ffi.C.isspace
 local err_string_arg = "bad argument #%d to '%s' (%s expected, got %s)"
+local space_chars    = ' \t\n\v\f\r'
 local function string_split_empty(inp, maxsplit)
     local p = c_char_ptr(inp)
@@ -339,25 +346,67 @@ local function string_fromhex(inp)
     return ffi.string(res, len)
 end
-local function string_strip(inp)
+local function string_strip(inp, chars)
     if type(inp) ~= 'string' then
         error(err_string_arg:format(1, "string.strip", 'string', type(inp)), 2)
     end
-    return (string.gsub(inp, "^%s*(.-)%s*$", "%1"))
+    if inp == '' then
+        return inp
+    end
+    if chars == nil then
+        chars = space_chars
+    elseif type(chars) ~= 'string' then
+        error(err_string_arg:format(2, "string.strip", 'string', type(chars)), 2)
+    elseif chars == '' then
+        return inp
+    end
+
+    local casted_inp = c_char_ptr(inp)
+    ffi.C.string_strip_helper(inp, #inp, chars, #chars, true, true,
+                              strip_newstart, strip_newlen)
+    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
 end
-local function string_lstrip(inp)
+local function string_lstrip(inp, chars)
     if type(inp) ~= 'string' then
         error(err_string_arg:format(1, "string.lstrip", 'string', type(inp)), 2)
     end
-    return (string.gsub(inp, "^%s*(.-)", "%1"))
+    if inp == '' then
+        return inp
+    end
+    if chars == nil then
+        chars = space_chars
+    elseif type(chars) ~= 'string' then
+        error(err_string_arg:format(2, "string.lstrip", 'string', type(chars)), 2)
+    elseif chars == '' then
+        return inp
+    end
+
+    local casted_inp = c_char_ptr(inp)
+    ffi.C.string_strip_helper(inp, #inp, chars, #chars, true, false,
+                              strip_newstart, strip_newlen)
+    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
 end
-local function string_rstrip(inp)
+local function string_rstrip(inp, chars)
     if type(inp) ~= 'string' then
         error(err_string_arg:format(1, "string.rstrip", 'string', type(inp)), 2)
     end
-    return (string.gsub(inp, "(.-)%s*$", "%1"))
+    if inp == '' then
+        return inp
+    end
+    if chars == nil then
+        chars = space_chars
+    elseif type(chars) ~= 'string' then
+        error(err_string_arg:format(2, "string.rstrip", 'string', type(chars)), 2)
+    elseif chars == '' then
+        return inp
+    end
+
+    local casted_inp = c_char_ptr(inp)
+    ffi.C.string_strip_helper(inp, #inp, chars, #chars, false, true,
+                              strip_newstart, strip_newlen)
+    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
 end
diff --git a/test/app-tap/string.test.lua b/test/app-tap/string.test.lua
index 7203fcd36..02a1a84d7 100755
--- a/test/app-tap/string.test.lua
+++ b/test/app-tap/string.test.lua
@@ -134,18 +134,76 @@ test:test("fromhex", function(test)
 end)
 test:test("strip", function(test)
-    test:plan(6)
-    local str = "  hello hello "
-    test:is(string.len(string.strip(str)), 11, "strip")
-    test:is(string.len(string.lstrip(str)), 12, "lstrip")
-    test:is(string.len(string.rstrip(str)), 13, "rstrip")
+    test:plan(45)
+    local str = "  Hello world! "
+    test:is(string.strip(str), "Hello world!", "strip (without chars)")
+    test:is(string.lstrip(str), "Hello world! ", "lstrip (without chars)")
+    test:is(string.rstrip(str), "  Hello world!", "rstrip (without chars)")
+    str = ""
+    test:is(string.strip(str), str, "strip (0-len inp without chars)")
+    test:is(string.lstrip(str), str, "lstrip (0-len inp without chars)")
+    test:is(string.rstrip(str), str, "rstrip (0-len inp without chars)")
+    str = " "
+    test:is(string.strip(str), "", "strip (1-len inp without chars)")
+    test:is(string.lstrip(str), "", "lstrip (1-len inp without chars)")
+    test:is(string.rstrip(str), "", "rstrip (1-len inp without chars)")
+    str = "\t\v"
+    test:is(string.strip(str), "", "strip (strip everything without chars)")
+    test:is(string.lstrip(str), "", "lstrip (strip everything without chars)")
+    test:is(string.rstrip(str), "", "rstrip (strip everything without chars)")
+    str = "hello"
+    test:is(string.strip(str), str, "strip (strip nothing without chars)")
+    test:is(string.lstrip(str), str, "lstrip (strip nothing without chars)")
+    test:is(string.rstrip(str), str, "rstrip (strip nothing without chars)")
+    str = " \t\n\v\f\rTEST \t\n\v\f\r"
+    test:is(string.strip(str), "TEST", "strip (all space characters without chars)")
+    test:is(string.lstrip(str), "TEST \t\n\v\f\r", "lstrip (all space characters without chars)")
+    test:is(string.rstrip(str), " \t\n\v\f\rTEST", "rstrip (all space characters without chars)")
+
+    local chars = "#\0"
+    str = "##Hello world!#"
+    test:is(string.strip(str, chars), "Hello world!", "strip (with chars)")
+    test:is(string.lstrip(str, chars), "Hello world!#", "lstrip (with chars)")
+    test:is(string.rstrip(str, chars), "##Hello world!", "rstrip (with chars)")
+    str = ""
+    test:is(string.strip(str, chars), str, "strip (0-len inp with chars)")
+    test:is(string.lstrip(str, chars), str, "lstrip (0-len inp with chars)")
+    test:is(string.rstrip(str, chars), str, "rstrip (0-len inp with chars)")
+    str = "#"
+    test:is(string.strip(str, chars), "", "strip (1-len inp with chars)")
+    test:is(string.lstrip(str, chars), "", "lstrip (1-len inp with chars)")
+    test:is(string.rstrip(str, chars), "", "rstrip (1-len inp with chars)")
+    str = "##"
+    test:is(string.strip(str, chars), "", "strip (strip everything with chars)")
+    test:is(string.lstrip(str, chars), "", "lstrip (strip everything with chars)")
+    test:is(string.rstrip(str, chars), "", "rstrip (strip everything with chars)")
+    str = "hello"
+    test:is(string.strip(str, chars), str, "strip (strip nothing with chars)")
+    test:is(string.lstrip(str, chars), str, "lstrip (strip nothing with chars)")
+    test:is(string.rstrip(str, chars), str, "rstrip (strip nothing with chars)")
+    str = "\0\0\0TEST\0"
+    test:is(string.strip(str, chars), "TEST", "strip (embedded 0s with chars)")
+    test:is(string.lstrip(str, chars), "TEST\0", "lstrip (embedded 0s with chars)")
+    test:is(string.rstrip(str, chars), "\0\0\0TEST", "rstrip (embedded 0s with chars)")
+    chars = ""
+    test:is(string.strip(str, chars), str, "strip (0-len chars)")
+    test:is(string.lstrip(str, chars), str, "lstrip (0-len chars)")
+    test:is(string.rstrip(str, chars), str, "rstrip (0-len chars)")
+
     local _, err = pcall(string.strip, 12)
-    test:ok(err and err:match("%(string expected, got number%)"))
+    test:ok(err and err:match("#1 to '.-%.strip' %(string expected, got number%)"), "strip err 1")
     _, err = pcall(string.lstrip, 12)
-    test:ok(err and err:match("%(string expected, got number%)"))
+    test:ok(err and err:match("#1 to '.-%.lstrip' %(string expected, got number%)"), "lstrip err 1")
     _, err = pcall(string.rstrip, 12)
-    test:ok(err and err:match("%(string expected, got number%)"))
-end )
+    test:ok(err and err:match("#1 to '.-%.rstrip' %(string expected, got number%)"), "rstrip err 1")
+
+    _, err = pcall(string.strip, "foo", 12)
+    test:ok(err and err:match("#2 to '.-%.strip' %(string expected, got number%)"), "strip err 2")
+    _, err = pcall(string.lstrip, "foo", 12)
+    test:ok(err and err:match("#2 to '.-%.lstrip' %(string expected, got number%)"), "lstrip err 2")
+    _, err = pcall(string.rstrip, "foo", 12)
+    test:ok(err and err:match("#2 to '.-%.rstrip' %(string expected, got number%)"), "rstrip err 2")
+end)
 test:test("unicode", function(test)
     test:plan(104)
--
2.11.0
^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH v3] lua: add 'chars' param to string.strip functions
  2019-02-11 19:20   ` Michał Durak
@ 2019-02-18 20:42     ` Alexander Turenko
  2019-02-21 13:01       ` Vladimir Davydov
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Turenko @ 2019-02-18 20:42 UTC (permalink / raw)
  To: Michał Durak, Vladimir Davydov; +Cc: tarantool-patches
Thank you.
I don't have objections in general. I posted the proposal to make the
code of a helpers looks more symmetric and that is all. I don't think it
worth to do one more review iteration between us or do perf. testing
again. This is formal LGTM.
Vladimir, can you, please, look at the code and also at the proposed
fragment below?
WBR, Alexander Turenko.
On Mon, Feb 11, 2019 at 07:20:38PM +0000, Michał Durak wrote:
> Add optional 'chars' parameter to string.strip, string.lstrip
> and string.rstrip for specifying the unwanted characters.
> Behavior modeled after the equivalent Python built-ins.
> 
> Closes: tarantool#2977
Does it work in that way? I know that `Closes: #2977` do, but don't sure
about your form. The documentation (the link is below) doesn't mention
this form as far as I see.
https://help.github.com/articles/closing-issues-using-keywords/
> 
> @TarantoolBot document
> Title: string.strip(inp, chars)
> Update the documentation for string.strip,
> string.lstrip and string.rstrip to reflect
> the addition of the optional  param.
> ---
> Hi. I considered your approach, and indeed, it does look simpler.
> I also benchmarked it, and it's a bit slower than what I proposed earlier.
> I don't think it's because of any particular flaw in your implementation,
> rather it's how LuaJIT behaves with functions that return something.
> There seems to be some kind of overhead, compared to calling void functions
> making use of output parameters.
> Keeping this in mind, I came up with another implementation, which also looks
> simpler than my previous one, but at the same time offers equal performance.
> In the benchmark results (link below), "string_lstrip_gdrbyko1" refers to the
> implementation found in my previous patch, "string_lstrip_totktonada" is the
> one you suggested, and "string_lstrip_a" is the one contained in the most recent
> commit of my branch and also the patch attached below to this e-mail.
> Also, I fixed all the nits you pointed out, thanks.
> benchmark results: https://gist.github.com/gdrbyKo1/8d070a189efc3b065f937edbf35a1912
> branch: https://github.com/gdrbyKo1/tarantool/tree/gdrbyko1/gh-2977
> issue: https://github.com/tarantool/tarantool/issues/2977
> Regards, Michał
> 
>  extra/exports                |   1 +
>  src/CMakeLists.txt           |   2 +
>  src/lua/string.c             | 114 +++++++++++++++++++++++++++++++++++++++++++
>  src/lua/string.h             |  55 +++++++++++++++++++++
>  src/lua/string.lua           |  63 +++++++++++++++++++++---
>  test/app-tap/string.test.lua |  76 +++++++++++++++++++++++++----
>  6 files changed, 295 insertions(+), 16 deletions(-)
>  create mode 100644 src/lua/string.c
>  create mode 100644 src/lua/string.h
> 
> diff --git a/extra/exports b/extra/exports
> index 5f69e0730..0b304e4b1 100644
> --- a/extra/exports
> +++ b/extra/exports
> @@ -209,6 +209,7 @@ clock_realtime64
>  clock_monotonic64
>  clock_process64
>  clock_thread64
> +string_strip_helper
> 
>  # Lua / LuaJIT
> 
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index 04de5ad04..905ebcc4e 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -183,6 +183,7 @@ set (server_sources
>       lua/httpc.c
>       lua/utf8.c
>       lua/info.c
> +     lua/string.c
>       ${lua_sources}
>       ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/lyaml.cc
>       ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/b64.c
> @@ -200,6 +201,7 @@ set(api_headers
>      ${CMAKE_SOURCE_DIR}/src/coio_task.h
>      ${CMAKE_SOURCE_DIR}/src/lua/utils.h
>      ${CMAKE_SOURCE_DIR}/src/lua/error.h
> +    ${CMAKE_SOURCE_DIR}/src/lua/string.h
>      ${CMAKE_SOURCE_DIR}/src/box/txn.h
>      ${CMAKE_SOURCE_DIR}/src/box/key_def.h
>      ${CMAKE_SOURCE_DIR}/src/box/field_def.h
> diff --git a/src/lua/string.c b/src/lua/string.c
> new file mode 100644
> index 000000000..8c8869473
> --- /dev/null
> +++ b/src/lua/string.c
> @@ -0,0 +1,114 @@
> +/*
> + * Copyright 2010-2019 Tarantool AUTHORS: please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + *    copyright notice, this list of conditions and the
> + *    following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + *    copyright notice, this list of conditions and the following
> + *    disclaimer in the documentation and/or other materials
> + *    provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +#include "string.h"
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +#include <stddef.h>
> +
> +/*
> + * Implements the lstrip functionality of string_strip_helper.
> + *
> + * @param inp     The input string to perform lstrip on.
> + * @param inp_len The length of the input string.
> + * @param arr     An array of 256 values defining
> + *                the set of characters to strip.
> + *
> + * @return The number of characters to strip from the left.
> + */
> +static size_t
> +lstrip_helper(const char *inp, size_t inp_len, uint8_t *arr)
> +{
> +	size_t i;
> +	for (i = 0; i < inp_len; ++i) {
> +		unsigned char c = inp[i];
> +		if (arr[c] == 0)
> +			break;
> +	}
> +	return i;
> +}
> +
> +/*
> + * Same as lstrip_helper, but for rstrip.
> + *
> + * @param inp     The input string to perform rstrip on.
> + * @param inp_len The length of the input string.
> + * @param arr     An array of 256 values defining
> + *                the set of characters to strip.
> + *
> + * @return The number of characters to strip from the right.
> + */
> +static size_t
> +rstrip_helper(const char *inp, size_t inp_len, uint8_t *arr)
> +{
> +	size_t i;
> +	for (i = inp_len - 1; i != (size_t)(-1); --i) {
> +		unsigned char c = inp[i];
> +		if (arr[c] == 0)
> +			break;
> +	}
> +	return inp_len - i - 1;
> +}
> +
> +/*
> + * Perform a combination of lstrip and rstrip on the input string,
> + * and return the position and length of the resulting substring.
> + *
> + * @param[in]  inp       The input string to perform stripping on.
> + * @param[in]  inp_len   The length of the input string.
> + * @param[in]  chars     A string representing the unwanted chars.
> + * @param[in]  chars_len The length of chars
> + * @param[in]  lstrip    Whether to perform lstrip or not.
> + * @param[in]  rstrip    Whether to perform rstrip or not.
> + * @param[out] newstart  The position of the resulting substring
> + *                       in the input string.
> + * @param[out] newlen    The length of the resulting substring.
> + */
> +void
> +string_strip_helper(const char *inp, size_t inp_len, const char *chars,
> +		    size_t chars_len, bool lstrip, bool rstrip,
> +		    size_t *newstart, size_t *newlen)
> +{
> +	size_t skipped;
> +	uint8_t arr[256] = {0};
> +
> +	for (size_t i = 0; i < chars_len; ++i) {
> +		unsigned char c = chars[i];
> +		arr[c] = 1;
> +	}
> +
> +	*newstart = skipped = lstrip ? lstrip_helper(inp, inp_len, arr) : 0;
> +
> +	if (rstrip)
> +		skipped += rstrip_helper(inp + skipped, inp_len - skipped, arr);
> +
> +	*newlen = inp_len - skipped;
> +}
I think that the following way to write it looks simpler (marked changed
lines with !!):
```
void
string_strip_helper(const char *inp, size_t inp_len, const char *chars,
		    size_t chars_len, bool lstrip, bool rstrip,
		    size_t *newstart, size_t *newlen)
{
!!	size_t skipped = 0;
	uint8_t arr[256] = {0};
	for (size_t i = 0; i < chars_len; ++i) {
		unsigned char c = chars[i];
		arr[c] = 1;
	}
!!	if (lstrip)
!!		skipped += lstrip_helper(inp, inp_len, arr);
!!	*newstart = skipped;
	if (rstrip)
		skipped += rstrip_helper(inp + skipped, inp_len - skipped, arr);
	*newlen = inp_len - skipped;
}
```
Hope the performance will be quite same, because we don't change the
function input / result types at all and the code is more or less same.
I don't think it is needed to re-measure the performance again.
> diff --git a/src/lua/string.h b/src/lua/string.h
> new file mode 100644
> index 000000000..340077619
> --- /dev/null
> +++ b/src/lua/string.h
> @@ -0,0 +1,55 @@
> +#ifndef TARANTOOL_LUA_STRING_H_INCLUDED
> +#define TARANTOOL_LUA_STRING_H_INCLUDED
> +/*
> + * Copyright 2010-2019 Tarantool AUTHORS: please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + *    copyright notice, this list of conditions and the
> + *    following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + *    copyright notice, this list of conditions and the following
> + *    disclaimer in the documentation and/or other materials
> + *    provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +#include <stddef.h>
> +
> +#if defined(__cplusplus)
> +extern "C" {
> +#endif /* defined(__cplusplus) */
> +
> +/** \cond public */
> +
> +void
> +string_strip_helper(const char *inp, size_t inp_len, const char *chars,
> +		    size_t chars_len, bool lstrip, bool rstrip,
> +		    size_t *newstart, size_t *newlen);
> +
> +/** \endcond public */
> +
> +#if defined(__cplusplus)
> +} /* extern "C" */
> +#endif /* defined(__cplusplus) */
> +
> +#endif /* TARANTOOL_LUA_STRING_H_INCLUDED */
> diff --git a/src/lua/string.lua b/src/lua/string.lua
> index cbce26b35..8216ace6a 100644
> --- a/src/lua/string.lua
> +++ b/src/lua/string.lua
> @@ -6,15 +6,22 @@ ffi.cdef[[
>             const char *needle,   size_t needle_len);
>      int memcmp(const char *mem1, const char *mem2, size_t num);
>      int isspace(int c);
> +    void
> +    string_strip_helper(const char *inp, size_t inp_len, const char *chars,
> +                        size_t chars_len, bool lstrip, bool rstrip,
> +                        size_t *newstart, size_t *newlen);
>  ]]
> 
> -local c_char_ptr = ffi.typeof('const char *')
> +local c_char_ptr     = ffi.typeof('const char *')
> +local strip_newstart = ffi.new("unsigned long[1]")
> +local strip_newlen   = ffi.new("unsigned long[1]")
> 
>  local memcmp  = ffi.C.memcmp
>  local memmem  = ffi.C.memmem
>  local isspace = ffi.C.isspace
> 
>  local err_string_arg = "bad argument #%d to '%s' (%s expected, got %s)"
> +local space_chars    = ' \t\n\v\f\r'
> 
>  local function string_split_empty(inp, maxsplit)
>      local p = c_char_ptr(inp)
> @@ -339,25 +346,67 @@ local function string_fromhex(inp)
>      return ffi.string(res, len)
>  end
> 
> -local function string_strip(inp)
> +local function string_strip(inp, chars)
>      if type(inp) ~= 'string' then
>          error(err_string_arg:format(1, "string.strip", 'string', type(inp)), 2)
>      end
> -    return (string.gsub(inp, "^%s*(.-)%s*$", "%1"))
> +    if inp == '' then
> +        return inp
> +    end
> +    if chars == nil then
> +        chars = space_chars
> +    elseif type(chars) ~= 'string' then
> +        error(err_string_arg:format(2, "string.strip", 'string', type(chars)), 2)
> +    elseif chars == '' then
> +        return inp
> +    end
> +
> +    local casted_inp = c_char_ptr(inp)
> +    ffi.C.string_strip_helper(inp, #inp, chars, #chars, true, true,
> +                              strip_newstart, strip_newlen)
> +    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
>  end
> 
> -local function string_lstrip(inp)
> +local function string_lstrip(inp, chars)
>      if type(inp) ~= 'string' then
>          error(err_string_arg:format(1, "string.lstrip", 'string', type(inp)), 2)
>      end
> -    return (string.gsub(inp, "^%s*(.-)", "%1"))
> +    if inp == '' then
> +        return inp
> +    end
> +    if chars == nil then
> +        chars = space_chars
> +    elseif type(chars) ~= 'string' then
> +        error(err_string_arg:format(2, "string.lstrip", 'string', type(chars)), 2)
> +    elseif chars == '' then
> +        return inp
> +    end
> +
> +    local casted_inp = c_char_ptr(inp)
> +    ffi.C.string_strip_helper(inp, #inp, chars, #chars, true, false,
> +                              strip_newstart, strip_newlen)
> +    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
>  end
> 
> -local function string_rstrip(inp)
> +local function string_rstrip(inp, chars)
>      if type(inp) ~= 'string' then
>          error(err_string_arg:format(1, "string.rstrip", 'string', type(inp)), 2)
>      end
> -    return (string.gsub(inp, "(.-)%s*$", "%1"))
> +    if inp == '' then
> +        return inp
> +    end
> +    if chars == nil then
> +        chars = space_chars
> +    elseif type(chars) ~= 'string' then
> +        error(err_string_arg:format(2, "string.rstrip", 'string', type(chars)), 2)
> +    elseif chars == '' then
> +        return inp
> +    end
> +
> +    local casted_inp = c_char_ptr(inp)
> +    ffi.C.string_strip_helper(inp, #inp, chars, #chars, false, true,
> +                              strip_newstart, strip_newlen)
> +    return ffi.string(casted_inp + strip_newstart[0], strip_newlen[0])
>  end
> 
> 
> diff --git a/test/app-tap/string.test.lua b/test/app-tap/string.test.lua
> index 7203fcd36..02a1a84d7 100755
> --- a/test/app-tap/string.test.lua
> +++ b/test/app-tap/string.test.lua
> @@ -134,18 +134,76 @@ test:test("fromhex", function(test)
>  end)
> 
>  test:test("strip", function(test)
> -    test:plan(6)
> -    local str = "  hello hello "
> -    test:is(string.len(string.strip(str)), 11, "strip")
> -    test:is(string.len(string.lstrip(str)), 12, "lstrip")
> -    test:is(string.len(string.rstrip(str)), 13, "rstrip")
> +    test:plan(45)
> +    local str = "  Hello world! "
> +    test:is(string.strip(str), "Hello world!", "strip (without chars)")
> +    test:is(string.lstrip(str), "Hello world! ", "lstrip (without chars)")
> +    test:is(string.rstrip(str), "  Hello world!", "rstrip (without chars)")
> +    str = ""
> +    test:is(string.strip(str), str, "strip (0-len inp without chars)")
> +    test:is(string.lstrip(str), str, "lstrip (0-len inp without chars)")
> +    test:is(string.rstrip(str), str, "rstrip (0-len inp without chars)")
> +    str = " "
> +    test:is(string.strip(str), "", "strip (1-len inp without chars)")
> +    test:is(string.lstrip(str), "", "lstrip (1-len inp without chars)")
> +    test:is(string.rstrip(str), "", "rstrip (1-len inp without chars)")
> +    str = "\t\v"
> +    test:is(string.strip(str), "", "strip (strip everything without chars)")
> +    test:is(string.lstrip(str), "", "lstrip (strip everything without chars)")
> +    test:is(string.rstrip(str), "", "rstrip (strip everything without chars)")
> +    str = "hello"
> +    test:is(string.strip(str), str, "strip (strip nothing without chars)")
> +    test:is(string.lstrip(str), str, "lstrip (strip nothing without chars)")
> +    test:is(string.rstrip(str), str, "rstrip (strip nothing without chars)")
> +    str = " \t\n\v\f\rTEST \t\n\v\f\r"
> +    test:is(string.strip(str), "TEST", "strip (all space characters without chars)")
> +    test:is(string.lstrip(str), "TEST \t\n\v\f\r", "lstrip (all space characters without chars)")
> +    test:is(string.rstrip(str), " \t\n\v\f\rTEST", "rstrip (all space characters without chars)")
> +
> +    local chars = "#\0"
> +    str = "##Hello world!#"
> +    test:is(string.strip(str, chars), "Hello world!", "strip (with chars)")
> +    test:is(string.lstrip(str, chars), "Hello world!#", "lstrip (with chars)")
> +    test:is(string.rstrip(str, chars), "##Hello world!", "rstrip (with chars)")
> +    str = ""
> +    test:is(string.strip(str, chars), str, "strip (0-len inp with chars)")
> +    test:is(string.lstrip(str, chars), str, "lstrip (0-len inp with chars)")
> +    test:is(string.rstrip(str, chars), str, "rstrip (0-len inp with chars)")
> +    str = "#"
> +    test:is(string.strip(str, chars), "", "strip (1-len inp with chars)")
> +    test:is(string.lstrip(str, chars), "", "lstrip (1-len inp with chars)")
> +    test:is(string.rstrip(str, chars), "", "rstrip (1-len inp with chars)")
> +    str = "##"
> +    test:is(string.strip(str, chars), "", "strip (strip everything with chars)")
> +    test:is(string.lstrip(str, chars), "", "lstrip (strip everything with chars)")
> +    test:is(string.rstrip(str, chars), "", "rstrip (strip everything with chars)")
> +    str = "hello"
> +    test:is(string.strip(str, chars), str, "strip (strip nothing with chars)")
> +    test:is(string.lstrip(str, chars), str, "lstrip (strip nothing with chars)")
> +    test:is(string.rstrip(str, chars), str, "rstrip (strip nothing with chars)")
> +    str = "\0\0\0TEST\0"
> +    test:is(string.strip(str, chars), "TEST", "strip (embedded 0s with chars)")
> +    test:is(string.lstrip(str, chars), "TEST\0", "lstrip (embedded 0s with chars)")
> +    test:is(string.rstrip(str, chars), "\0\0\0TEST", "rstrip (embedded 0s with chars)")
> +    chars = ""
> +    test:is(string.strip(str, chars), str, "strip (0-len chars)")
> +    test:is(string.lstrip(str, chars), str, "lstrip (0-len chars)")
> +    test:is(string.rstrip(str, chars), str, "rstrip (0-len chars)")
> +
>      local _, err = pcall(string.strip, 12)
> -    test:ok(err and err:match("%(string expected, got number%)"))
> +    test:ok(err and err:match("#1 to '.-%.strip' %(string expected, got number%)"), "strip err 1")
>      _, err = pcall(string.lstrip, 12)
> -    test:ok(err and err:match("%(string expected, got number%)"))
> +    test:ok(err and err:match("#1 to '.-%.lstrip' %(string expected, got number%)"), "lstrip err 1")
>      _, err = pcall(string.rstrip, 12)
> -    test:ok(err and err:match("%(string expected, got number%)"))
> -end )
> +    test:ok(err and err:match("#1 to '.-%.rstrip' %(string expected, got number%)"), "rstrip err 1")
> +
> +    _, err = pcall(string.strip, "foo", 12)
> +    test:ok(err and err:match("#2 to '.-%.strip' %(string expected, got number%)"), "strip err 2")
> +    _, err = pcall(string.lstrip, "foo", 12)
> +    test:ok(err and err:match("#2 to '.-%.lstrip' %(string expected, got number%)"), "lstrip err 2")
> +    _, err = pcall(string.rstrip, "foo", 12)
> +    test:ok(err and err:match("#2 to '.-%.rstrip' %(string expected, got number%)"), "rstrip err 2")
> +end)
> 
>  test:test("unicode", function(test)
>      test:plan(104)
> --
> 2.11.0
> 
> 
^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH v3] lua: add 'chars' param to string.strip functions
  2019-02-18 20:42     ` Alexander Turenko
@ 2019-02-21 13:01       ` Vladimir Davydov
  0 siblings, 0 replies; 5+ messages in thread
From: Vladimir Davydov @ 2019-02-21 13:01 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: Michał Durak, tarantool-patches
On Mon, Feb 18, 2019 at 11:42:32PM +0300, Alexander Turenko wrote:
> On Mon, Feb 11, 2019 at 07:20:38PM +0000, Michał Durak wrote:
> > Add optional 'chars' parameter to string.strip, string.lstrip
> > and string.rstrip for specifying the unwanted characters.
> > Behavior modeled after the equivalent Python built-ins.
> > 
> > Closes: tarantool#2977
> 
> Does it work in that way? I know that `Closes: #2977` do, but don't sure
> about your form. The documentation (the link is below) doesn't mention
> this form as far as I see.
Fixed it.
> > +void
> > +string_strip_helper(const char *inp, size_t inp_len, const char *chars,
> > +		    size_t chars_len, bool lstrip, bool rstrip,
> > +		    size_t *newstart, size_t *newlen)
> > +{
> > +	size_t skipped;
> > +	uint8_t arr[256] = {0};
> > +
> > +	for (size_t i = 0; i < chars_len; ++i) {
> > +		unsigned char c = chars[i];
> > +		arr[c] = 1;
> > +	}
> > +
> > +	*newstart = skipped = lstrip ? lstrip_helper(inp, inp_len, arr) : 0;
> > +
> > +	if (rstrip)
> > +		skipped += rstrip_helper(inp + skipped, inp_len - skipped, arr);
> > +
> > +	*newlen = inp_len - skipped;
> > +}
> 
> I think that the following way to write it looks simpler (marked changed
> lines with !!):
> 
> ```
> void
> string_strip_helper(const char *inp, size_t inp_len, const char *chars,
> 		    size_t chars_len, bool lstrip, bool rstrip,
> 		    size_t *newstart, size_t *newlen)
> {
> !!	size_t skipped = 0;
> 	uint8_t arr[256] = {0};
> 
> 	for (size_t i = 0; i < chars_len; ++i) {
> 		unsigned char c = chars[i];
> 		arr[c] = 1;
> 	}
> 
> !!	if (lstrip)
> !!		skipped += lstrip_helper(inp, inp_len, arr);
> 
> !!	*newstart = skipped;
> 
> 	if (rstrip)
> 		skipped += rstrip_helper(inp + skipped, inp_len - skipped, arr);
> 
> 	*newlen = inp_len - skipped;
> }
> ```
Agree. Updated the patch.
> > +#if defined(__cplusplus)
> > +extern "C" {
> > +#endif /* defined(__cplusplus) */
> > +
> > +/** \cond public */
> > +
> > +void
> > +string_strip_helper(const char *inp, size_t inp_len, const char *chars,
> > +		    size_t chars_len, bool lstrip, bool rstrip,
> > +		    size_t *newstart, size_t *newlen);
> > +
> > +/** \endcond public */
cond public isn't needed here. Removed.
Pushed to 2.1.
^ permalink raw reply	[flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-02-21 13:01 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-22 18:51 [tarantool-patches] [PATCH v3] lua: add 'chars' param to string.strip functions Michał Durak
2019-02-04  4:47 ` [tarantool-patches] " Alexander Turenko
2019-02-11 19:20   ` Michał Durak
2019-02-18 20:42     ` Alexander Turenko
2019-02-21 13:01       ` Vladimir Davydov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox