[Tarantool-patches] [PATCH luajit v1 03/11] profile: introduce profiler writing module

Thu Dec 24 09:46:56 MSK 2020

Hi, Igor!

Thanks for the review!

On 21.12.20, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the patch! Please consider the comments below. I expected
> that we agreed we need an MVP for now, so the module API is fine except
> the <write_buffer_sys> part: it is overcomplicated for the current uses.

We can try this API for now. This may lead to thoughts about improving C
API.

> 
> On 16.12.20, Sergey Kaplun wrote:
> > This patch introduces module for writing profile data.
> > Its usage will be added at the next patches.
> > 
> > It can be used for memory profiler or for signal-based
> > cpu profiler.
> 
> I see nothing strongly related to the profiler, so it's simply an
> internal write buffer. Then let's settle for the naming at first. I
> guess <lj_wbuf*> is a good prefix for this subsystem.

OK, it looks better.

> 
> So, I propose the following wording for the commit message:
> | core: introduce write buffer module
> |
> | This patch introduces the standalone module for writing data to the
> | file via the special buffer. The module provides the API for buffer
> | initial setup and its convenient usage.
> 
> Feel free to adjust the description on you own and even describe the
> introduced API in a more verbose way.

Thanks, sounds well!

> 
> > 
> > Part of tarantool/tarantool#5442
> > ---
> > 
> > Custom memcpy function (see below) makes sense if this module will be
> > used for cpu/sample profiler based on a signal-based timer. Else it can
> > be easily redefined.
> 
> I suggest to not overcomplicate all this machinery added for an a single
> profiler. Let's drop all memcpy-related hacks. By the way we can simply
> incorporate it when it is necessary.

Sure.

> 
> > 
> >  src/Makefile            |   5 +-
> >  src/Makefile.dep        |   2 +
> >  src/profile/ljp_write.c | 195 ++++++++++++++++++++++++++++++++++++++++
> >  src/profile/ljp_write.h |  84 +++++++++++++++++
> >  4 files changed, 284 insertions(+), 2 deletions(-)
> >  create mode 100644 src/profile/ljp_write.c
> >  create mode 100644 src/profile/ljp_write.h
> > 
> > diff --git a/src/Makefile b/src/Makefile
> > index be7ed95..4b1d937 100644
> > --- a/src/Makefile
> > +++ b/src/Makefile
> 
> Please, adjust these changes considering the comments to the first
> patch. You can find the proposed naming above.

Done.

> 
> > @@ -469,6 +469,7 @@ DASM_FLAGS= $(DASM_XFLAGS) $(DASM_AFLAGS)
> 
> <snipped>
> 
> > diff --git a/src/profile/ljp_write.c b/src/profile/ljp_write.c
> > new file mode 100644
> > index 0000000..de7202d
> > --- /dev/null
> > +++ b/src/profile/ljp_write.c
> > @@ -0,0 +1,195 @@
> > +/*
> > +** Low-level writer for LuaJIT Profiler.
> > +**
> > +** Major portions taken verbatim or adapted from the LuaVela.
> > +** Copyright (C) 2015-2019 IPONWEB Ltd.
> > +*/
> > +
> > +#include <unistd.h>
> > +#include <errno.h>
> > +
> > +#include "profile/ljp_write.h"
> > +#include "utils/leb128.h"
> > +#include "lj_def.h"
> 
> <snipped>
> 
> > +/* Wraps a write syscall ensuring all data have been written. */
> 
> I see no syscall wrapped here. Anyway, IIRC we discussed that we don't
> need such complex interfaces in scope of MVP. So you can just use
> <write> or <fwrite> right here for now and redesign the wbuf API later
> if necessary (it's internal, so I see no problem).

It *may* be wrapped here (encapsulated in writer function). As I say
above, it can lead us to some thoughts about C API. It is internal
interface and we can change it whenever we want.

Thoughts?

> 
> > +static void write_buffer_sys(struct ljp_buffer *buffer, const void **data,
> > +			     size_t len)
> 
> Why do you need a separate function for this? I guess this should be
> moved right to the <ljp_write_flush_buffer> (hell these names).

I suppose it not necessary. I'll merge it with flush.

> 
> > +{
> > +  void *ctx = buffer->ctx;
> > +  size_t written;
> > +
> > +  lua_assert(!ljp_write_test_flag(buffer, STREAM_STOP));
> > +
> > +  written = buffer->writer(data, len, ctx);
> 
> Well, I believe you can use buffer->ctx instead of the additional
> variable here. Trust me, you can!

Yep :)

> 
> > +
> > +  if (LJ_UNLIKELY(written < len)) {
> > +    write_set_flag(buffer, STREAM_ERR_IO);
> > +    write_save_errno(buffer);
> > +  }
> > +  if (LJ_UNLIKELY(*data == NULL)) {
> > +    write_set_flag(buffer, STREAM_STOP);
> > +    write_save_errno(buffer);
> > +  }
> > +}
> > +
> > +static LJ_AINLINE size_t write_bytes_buffered(const struct ljp_buffer *buf)
> 
> I propose s/write_bytes_buffered/lj_wbuf_len/ (consider sbuflen macro).

Fixed.

> 
> <snipped>
> 
> > +static LJ_AINLINE int write_buffer_has(const struct ljp_buffer *buf, size_t n)
> 
> I propose s/write_buffer_has/lj_wbuf_left/ (consider sbufleft macro).

Fixed.

> 

<snipped>

> > +/* Writes n bytes from an arbitrary buffer src to the output. */
> > +static void write_buffer(struct ljp_buffer *buf, const void *src, size_t n)
> > +{
> > +  if (LJ_UNLIKELY(ljp_write_test_flag(buf, STREAM_STOP)))
> > +    return;
> > +  /*
> > +  ** Very unlikely: We are told to write a large buffer at once.
> > +  ** Buffer not belong to us so we must to pump data
> > +  ** through buffer.
> > +  */
> > +  while (LJ_UNLIKELY(n > buf->size)) {
> > +    ljp_write_flush_buffer(buf);
> 
> Why do you need to flush the buffer on start? I guess you can fill the
> buffer till it becomes full and only then flush.

Thanks! Good point!

> 
> > +    write_memcpy(buf->pos, src, buf->size);
> > +    buf->pos += (ptrdiff_t)buf->size;
> > +    n -= buf->size;
> > +  }
> > +
> > +  write_reserve(buf, n);
> > +  write_memcpy(buf->pos, src, n);
> > +  buf->pos += (ptrdiff_t)n;
> > +}
> > +
> > +/* Writes a \0-terminated C string to the output buffer. */
> > +void ljp_write_string(struct ljp_buffer *buf, const char *s)
> > +{
> > +  const size_t l = strlen(s);
> > +
> > +  ljp_write_u64(buf, (uint64_t)l);
> 
> This is unclear that the check that profiling is still active is made in
> scope of the callee.

I'll add the comment here.

> 
> > +  write_buffer(buf, s, l);
> > +}
> > +
> 
> <snipped>
> 
> > diff --git a/src/profile/ljp_write.h b/src/profile/ljp_write.h
> > new file mode 100644
> > index 0000000..29c1669
> > --- /dev/null
> > +++ b/src/profile/ljp_write.h
> > @@ -0,0 +1,84 @@
> > +/*
> > +** Low-level event streaming for LuaJIT Profiler.
> > +** NB! Please note that all events may be streamed inside a signal handler.
> > +** This means effectively that only async-signal-safe library functions and
> > +** syscalls MUST be used for streaming. Check with `man 7 signal` when in
> > +** doubt.
> > +** Major portions taken verbatim or adapted from the LuaVela.
> > +** Copyright (C) 2015-2019 IPONWEB Ltd.
> > +*/
> > +
> > +#ifndef _LJP_WRITE_H
> > +#define _LJP_WRITE_H
> > +
> > +#include <stdint.h>
> > +
> > +/*
> > +** Data format for strings:
> > +**
> > +** string         := string-len string-payload
> > +** string-len     := <ULEB128>
> > +** string-payload := <BYTE> {string-len}
> > +**
> > +** Note.
> > +** For strings shorter than 128 bytes (most likely scenario in our case)
> > +** we write the same amount of data (1-byte ULEB128 + actual payload) as we
> > +** would have written with straightforward serialization (actual payload + \0),
> > +** but make parsing easier.
> > +*/
> > +
> > +/* Stream errors. */
> > +#define STREAM_ERR_IO 0x1
> > +#define STREAM_STOP   0x2
> > +
> > +typedef size_t (*ljp_writer)(const void **data, size_t len, void *opt);
> > +
> > +/* Write buffer for profilers. */
> > +struct ljp_buffer {
> > +  /*
> > +  ** Buffer writer which will called at buffer write.
> > +  ** Should return amount of written bytes on success or zero in case of error.
> > +  ** *data should contain new buffer of size greater or equal to len.
> > +  ** If *data == NULL stream stops.
> > +  */
> > +  ljp_writer writer;
> > +  /* Context to writer function. */
> 
> Typo: s/Context to/Context for/.

Thanks.

> 
> > +  void *ctx;
> > +  /* Buffer size. */
> > +  size_t size;
> > +  /* Saved errno in case of error. */
> > +  int saved_errno;
> > +  /* Start of buffer. */
> > +  uint8_t *buf;
> > +  /* Current position in buffer. */
> > +  uint8_t *pos;
> > +  /* Internal flags. */
> > +  volatile uint8_t flags;
> > +};
> 
> Well, I don't get why the functions are called <ljp_write_*>, but the
> first parameter (i.e. "self") is ljp_buffer. As a result such names as
> <ljp_write_errno> looks confusing, since errno is actually written
> nowhere. I suggest to name everything with <lj_wbuf_*> prefix, so the
> names <lj_wbuf_errno> and <lj_wbuf_test_flag> fits the resulting value.
> Furthermore, the routines appending the data to the buffer can be
> renamed the following way: ljp_write_<type> -> lj_wbuf_add<type>
> (consider Lua standart buffer API). Thoughts?

Yes, it's more consistent with LuaJIT internals too.

> 
> > +
> > +/* Write string. */
> > +void ljp_write_string(struct ljp_buffer *buf, const char *s);
> > +
> > +/* Write single byte. */
> > +void ljp_write_byte(struct ljp_buffer *buf, uint8_t b);
> > +
> > +/* Write uint64_t in uleb128 format. */
> > +void ljp_write_u64(struct ljp_buffer *buf, uint64_t n);
> > +
> > +/* Immediatly flush buffer. */
> > +void ljp_write_flush_buffer(struct ljp_buffer *buf);
> > +
> > +/* Init buffer. */
> > +void ljp_write_init(struct ljp_buffer *buf, ljp_writer writer, void *ctx,
> > +		    uint8_t *mem, size_t size);
> > +
> > +/* Check flags. */
> > +int ljp_write_test_flag(const struct ljp_buffer *buf, uint8_t flag);
> > +
> > +/* Return saved errno. */
> > +int ljp_write_errno(const struct ljp_buffer *buf);
> > +
> > +/* Set pointers to NULL and reset flags. */
> > +void ljp_write_terminate(struct ljp_buffer *buf);
> > +
> > +#endif
> > -- 
> > 2.28.0
> > 
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun