[tarantool-patches] Re: [PATCH v2] core: Non-blocking io.popen

Wed Jun 5 01:39:49 MSK 2019

Hi all!

I looked at the questions that were passed directly to me. See my
thoughts below. Hope it'll help.

WBR, Alexander Turenko.

On Tue, Jun 04, 2019 at 02:14:01PM +0300, Vladimir Davydov wrote:
> On Mon, Jun 03, 2019 at 06:52:35PM +0300, Stanislav Zudin wrote:
> > > > rc,src,err = handle:read(buffer,size)
> > > > rc,src,err = handle:read2(buffer,size,seconds)
> > > > 
> > > > read stdout & stderr of the process started by fio.popen
> > > > read() -> str, source
> > > > read(buf) -> len, source
> > > > read(size) -> str, source
> > > > read(buf, size) -> len, source
> > > > read2(seconds) -> str, source
> > > > read2(buf,seconds) -> len, source
> > > > read2(size,seconds) -> str, source
> > > > read2(buf, size,seconds) -> len, source
> > > 
> > > Please use the same function name for both variants - read() - as
> > > you can figure out what to do by looking at function arguments, no?
> > > 
> > 
> > Well, actually there is an obstacle.
> > The size and the seconds are both 'number' so we can't distinguish them.
> > The possible way is to pass the fixed number of arguments, e.g.
> > read(nil,size) instead of read(size),
> > read(buf, nil, seconds) instead of read2(buf,seconds)
> > and so on.
> > If this approach is acceptable then i'll make the changes.
> 
> Dunno. Need to think. May be, pass seconds in a table? AKA options?
> I'll talk to Alexander T. - may be he knows a better way.

I'm sure we should not name the function that consider a timeout as
read2(). The possible solution is to name it read_timeout(). This
naming approach is used in coio for example (but it is C).

Another way is to add 'opts' argument for the function: it will be
distinguishable from 'size', because it is a Lua table. There are
questions however:

* Whether this argument should be always on third
  position? So a user will need to call it like :read(nil, nil, {timeout
  = <...>}. Looks complex. For example I never remember how much nils I
  should add for box.schema.user.grant() before pass {if_not_exists =
  true}.
* If not, the position can be 1st, 2nd or 3rd, but always last.
  Personally I found such convention complex to understand too.
* The reason why all this position-related problems appear is that we
  don't wrap optional arguments into 'opts'. Usually we (at least me) do
  and there are no such problems. However here we want to provide an API
  that looks similar to fio-handle:read().

There is another option:

* read([size])
* read(buf[, size])
* read(opts), where opts are:
  * buf
  * size
  * timeout

So we support all optional arguments as options, but keep fio-like API.

I also suggest to look into the old discussion about net.box, where
similar questions were arisen:
https://github.com/tarantool/tarantool/issues/2195

> > > > diff --git a/test/box-tap/fio_popen.test.lua b/test/box-tap/fio_popen.test.lua
> > > 
> > > Should be app-tap? Anyway, why tap test? Normal tests are easier to
> > > write and understand IMO.
> > > 
> > 
> > Ok, let it be app-tap.
> > 
> > What a 'Normal' test? The ones in e.g. test/app directory?
> > It's hard to find the error when result files are quite big.
> > Have to use external tools, e.g. 'diff' etc.
> 
> Yes, but I find it more convenient than analyzing a tap test result
> file. Please ask Alexander T. about our test policy.

Personally I like 'core = app' tests and using 'tap' module in them. My
points here the following:

* I'm able to check necessary parts of a result as I wish.
* Results are under my eyes when I read a test. No need to move around
  two files.
* I'm able to write a test in declarative manner (when there are many
  similar checks for different inputs).
* I'm able to run a test w/o test-run. It is just Lua file, nothing
  special.

Vladimir likes 'core = tarantool' tests. Points I had listen here from
different persons are the following:

* It is easy to write.
* It can be copy-pasted to console to check.

So there is a thing of preference and I hope one should not push another
developer to use specific approach.

I took glance at your test and suggest to consider `luacheck -r
test/box-tap/fio_popen.test.lua` output (-r is to allow redefinition of
a variable, it is often convenient for 'ok', 'rc' and such variables).
This will make the test look better for ones who write many code on Lua.

Luacheck often reports unknown tarantool-specific globals. You can
ignore those warnings or write .luacheckrc file like this one:
https://github.com/tarantool/graphql/blob/c26aa1e8f65129a938f5af12cc644b8001f517d7/.luacheckrc

> > >> diff --git a/src/lua/fio.c b/src/lua/fio.c
> > >> +/**
> > >> + * A wrapper applicable for using with lua's GC.
> > >> + * */
> > >> +struct fio_popen_wrap {
> > >
> > > Why do you need a wrapper at all?
> > 
> > Right now it contains only a "handle" (a pointer to
> > struct popen_data *).
> > Later we may need to keep something else in the managed memory.
> > It's more readable than just a pointer to pointer.
> 
> AFAIK we don't usually add such trivial wrappers to export objects to
> Lua. Please ask Alexander T. as he's an expert in Lua-C interaction.

What is the problem with GC? I don't sure about userdata, but we use
cdata of <struct foo *> for structures that should be garbage collected
with LuaJIT GC. See lbox_merge_source_new() for example: it set a
callback for GC using luaL_setcdatagc(). buffer.lua however use cdata of
<struct ibuf> (non-pointer type) and it seems it allows to avoid extra
memory allocation for the structure. Using a pointer type however allows
to hide fields from Lua (don't define a structure in ffi.cdef).

I don't have an opinion about using userdata or cdata. They looks quite
similar except that latter provides automatic getters / setters for a
structure fields, comparison operators and so on.

As I see we use userdata across the code, but use cdata more often. I
suggest to discuss this with Nick Z. or Roman T.: I guess they have a
way more context about LuaJIT usage and internals. Ask me for contacts
privately.

I see you already have <struct popen_data>, but cast a pointer to it to
<void *>. Why not to use a pointer to the structure as a handle (you can
just declare it in a header file, but don't define) and expose it to Lua
as cdata<struct popen_data *>? This way looks more clear for me than the
current approach.

Naming is the question too. I would name the handle structure
<struct popen> or <struct popen_handle>.

> > >> +	void* handle;
> > >> +};
> > >> +static const char* fio_popen_typename = "fio.popen.data";
> > >> +
> > >> +static struct fio_popen_wrap *
> > >> +fio_popen_get_handle_wrap(lua_State *L, int index)
> > >> +{
> > >> +	luaL_checktype(L, index, LUA_TUSERDATA);
> > >> +	struct fio_popen_wrap *wrap = (struct fio_popen_wrap *)
> > >> +		luaL_checkudata(L, index, fio_popen_typename);
> > >
> > > Hmm, why do you use udata? Why not plain cdata? I keep forgetting
> > > the difference, to tell the truth.
> > 
> > The Lua finalizer deals only with "managed" memory.
> > It's not aware of cdata and can't release it.
> 
> Hmm, but we use finalizers with cdata - take a look at luaT_pushtuble
> and how it uses luaL_setcdatagc. Again, it's worth talking to Alex T.
> re this.

It seems I already cover this question above. Yep, it is possible to set
a finalizer with cdata. I think userdata allows this too (at least by
setting a metatable with __gc metamethod).