[tarantool-patches] Re: [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module
Alexander Turenko
alexander.turenko at tarantool.org
Sat May 5 03:18:15 MSK 2018
Vlad,
Thanks for the fixes. You rock!
I want to clarify two things, please see below.
WBR, Alexander Turenko.
On Sat, May 05, 2018 at 02:32:27AM +0300, Vladislav Shpilevoy wrote:
> Hello. Thanks for review.
>
> On 05/05/2018 01:33, Alexander Turenko wrote:
> > Vlad,
> >
> > Did you try to run the tests from utf8.lua in [1]?
> >
> > [1]: https://www.lua.org/tests/lua-5.3.4-tests.tar.gz
> >
Do you think such testing would be redundant? I don't insist, I just
want to know your explicit position.
> > > +
> > > +/**
> > > + * Calculate length of a UTF8 string. Length here is symbol count.
> > > + * Works like utf8.len in Lua 5.3.
> > > + * @param String to get length.
> > > + * @param Start byte offset. Must point to the start of symbol. On
> > > + * invalid symbol an error is returned. Can be negative.
> >
> > Can be 1 <= |start| <= #str + 1, right? Is it worth documenting? Such
> > offset acrobatics are not very intuitive (at least for me).
>
> No, start can be any value, as well as end.
>
It does not look that way:
tarantool> print(utf8.len('abc', 0))
nil position is out of string
tarantool> print(utf8.len('abc', 5))
nil position is out of string
tarantool> print(utf8.len('abc', 1, 4))
nil position is out of string
That matches Lua 5.3 behaviour, but contradicts your words above.
So the question is about a proper doxygen-style comment.
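
To make the ranges concrete, here is a minimal sketch of the offset
checks as I understand them from the Lua 5.3 reference behaviour. It is
not the code from the patch; the helper names (offset_normalize,
start_is_valid, end_is_valid) are hypothetical and exist only for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Convert a Lua-style 1-based byte offset, possibly negative, into a
 * non-negative 1-based position. Negative offsets count from the end
 * of the string, so -1 is the last byte. A negative offset that
 * points before the first byte maps to 0.
 */
static long
offset_normalize(long pos, size_t len)
{
	if (pos >= 0)
		return pos;
	/* Unsigned negation avoids overflow on the minimal long. */
	if ((size_t)0 - (size_t)pos > len)
		return 0;
	return (long)len + pos + 1;
}

/* Start is accepted when 1 <= start <= len + 1 after normalization. */
static bool
start_is_valid(long start, size_t len)
{
	long pos = offset_normalize(start, len);
	return pos >= 1 && pos <= (long)len + 1;
}

/*
 * End is accepted when end <= len after normalization. The range is
 * inclusive, so an end of 0 denotes an empty range.
 */
static bool
end_is_valid(long end, size_t len)
{
	return offset_normalize(end, len) <= (long)len;
}
```

Under these assumptions the three calls above map to
start_is_valid(0, 3), start_is_valid(5, 3) and end_is_valid(4, 3), and
all three are rejected, while start 4 (one past the last byte) and
end -1 (the last byte) are accepted for a 3-byte string.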
> >
> > > + * @param End byte offset, can be negative. Can point to the
> > > + * middle of symbol.
> >
> > We need to clarify that a symbol under the end offset is included
> > in the resulting count (inclusive range).
> >
> > I would also explicitly state that -1 is the end byte.
> >
> > Is it worth documenting the allowed range (0 <= |end| <= #str, right?)?
>