[Tarantool-discussions] Multithreading in Tarantool

From: Alexander Turenko <alexander.turenko@tarantool.org>
To: tarantool-discussions@dev.tarantool.org
Cc: Nick Karlov <karlovn@tarantool.org>,
	Konstantin Nazarov <racktear@tarantool.org>
Subject: [Tarantool-discussions] Multithreading in Tarantool
Date: Sun, 24 Nov 2019 07:02:04 +0300
Message-ID: <20191124040203.ai3al2j4sywx3ljz@tkn_work_nb>

We are asked for multithreaded request processing from time to time. I want
to share the cases we have heard from users and the key points of the
discussions around this feature as a basis for future work.

Feel free to extend the discussion with any cases, thoughts or
suggestions.

Case 1
------

Analytical SQL queries on a read view that do not block the tx thread. The
proposal is to add an option to box.execute() that creates a read view, spawns
a thread (or uses a prespawned one), performs the query in this thread using
the read view handle and then pushes the results back to tx. The read view is
destroyed in tx before the results are returned.
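
The flow described above can be modeled in a few lines. This is only a sketch
in Python, not Tarantool code: `ReadView`, `run_on_read_view` and the
list-based "space" are hypothetical names made up for illustration, and a deep
copy stands in for a real read view.

```python
import copy
import queue
import threading


class ReadView:
    """Hypothetical stand-in for a read view: a frozen copy of the data."""

    def __init__(self, data):
        self.rows = copy.deepcopy(data)

    def close(self):
        self.rows = None


def run_on_read_view(data, query):
    """Model of the proposed box.execute() option (all names are assumptions).

    Called from the 'tx' thread: creates the view, hands it to a worker
    thread, waits for the pushed result and destroys the view before
    returning.
    """
    view = ReadView(data)      # created in tx
    results = queue.Queue()    # channel used to push the result back to tx

    def worker():
        # The worker touches only the view, never the live data.
        try:
            results.put((True, query(view.rows)))
        except Exception as exc:
            results.put((False, exc))

    thread = threading.Thread(target=worker)
    thread.start()
    try:
        # A real tx thread would yield to other fibers here, not block.
        ok, value = results.get()
    finally:
        thread.join()
        view.close()           # destroyed in tx before the result is returned
    if not ok:
        raise value
    return value


space = [{"id": 1, "price": 10}, {"id": 2, "price": 32}]
total = run_on_read_view(space, lambda rows: sum(r["price"] for r in rows))
print(total)
```

The point of the sketch is the ownership rule: the view is created and
destroyed by the caller, while the worker only reads from it.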

Case 2
------

I'm not sure about this case, to be honest. It is rather vague. However, it is
quite similar to the one described above, except that the read view is
acquired / released from a spawned thread and the results don't have to be
pushed to tx.

A long-lived thread acts as a TCP server that implements a specific protocol.
It needs box's data to process requests. Low response latency is
mission-critical.

There are several ways to provide an implementation of the protocol:

* Implement the protocol right in tarantool using the built-in `socket`
  module.
* Run a TCP server in a separate process and acquire data from tarantool using
  the binary protocol via a Unix socket.
* Run a TCP server in a long-lived coio thread, which can acquire a read view,
  call box's functions with it and release the read view.

In some circumstances the third way may pay off in terms of development time
(no need to separate CPU-intensive and database-dependent code) and latency
(no need to copy data between processes). However, as I said, I'm not certain
here.
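
To make the third way more concrete, here is a rough Python model of a
long-lived worker that acquires and releases a read view by itself for every
request. Nothing here is a Tarantool API: `Store`, `read_view()` and `serve()`
are hypothetical, a queue replaces the TCP connection, and a copy taken under
a lock replaces a real read view.

```python
import copy
import queue
import threading
from contextlib import contextmanager


class Store:
    """Hypothetical data store that is written from the 'tx' thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    @contextmanager
    def read_view(self):
        # Acquire: take a consistent copy under the lock.
        with self._lock:
            view = copy.deepcopy(self._data)
        try:
            yield view
        finally:
            view.clear()  # Release: drop the snapshot.


def serve(store, requests, responses):
    """Long-lived worker: one read view per request, owned by this thread."""
    while True:
        key = requests.get()
        if key is None:  # shutdown marker
            break
        with store.read_view() as view:
            responses.put(view.get(key))


store = Store()
store.put("a", 1)
requests, responses = queue.Queue(), queue.Queue()
server = threading.Thread(target=serve, args=(store, requests, responses))
server.start()

requests.put("a")
first = responses.get()
store.put("a", 2)       # a write between two requests
requests.put("a")
second = responses.get()
requests.put(None)
server.join()
print(first, second)
```

Unlike case 1, the responses never pass through tx: the worker answers its
client directly.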

Let's consider this case as a second-tier wish.

Discussion
----------

The cases can be divided into separate problems:

1. We should implement the ability to create and destroy read views and the
   ability to use them to perform box's requests.
2. We should allow a coio thread to:
   - Obtain a read view handle (for case 2).
   - Call box's functions with a read view handle.
   - Push data to tx.
     - NB: Chunked transfer to tx may be important.
3. We should enhance the SQL executor:
   - Support SQL queries on a read view.
   - Forbid DML / DDL inside queries on a read view.
     - NB: The SQL engine can write temporary tables while executing a read
       query.
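
The note about chunked transfer to tx can be illustrated with a bounded queue.
This is a sketch under assumptions, written in Python rather than against any
Tarantool API: `produce_chunks`, `consume_chunks` and the chunk size are
invented for the example.

```python
import queue
import threading

END = object()  # sentinel that marks the end of the result stream


def produce_chunks(rows, chunk_size, channel):
    """Worker side: push rows to tx in fixed-size chunks."""
    for start in range(0, len(rows), chunk_size):
        # put() blocks while the queue is full, which throttles the worker.
        channel.put(rows[start:start + chunk_size])
    channel.put(END)


def consume_chunks(channel):
    """tx side: yield chunks as they arrive."""
    while True:
        chunk = channel.get()
        if chunk is END:
            return
        yield chunk


rows = list(range(10))
channel = queue.Queue(maxsize=2)  # bounded: the worker cannot run far ahead
worker = threading.Thread(target=produce_chunks, args=(rows, 4, channel))
worker.start()
chunks = list(consume_chunks(channel))
worker.join()
print(chunks)
```

With a bounded channel, at most a couple of chunks are held in memory at a
time, so a large result set is never materialized twice.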

Out of scope: several Lua states working in different threads, and data
transfer / RPC between them. This looks like a really significant amount of
work.

There is a problem. If we create a read view and then perform a DDL request,
the read view will become broken. It seems we should add more assertions
around those cases. How do we shield a user from this kind of mistake? Forbid
any DDL when there is at least one read view?
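
One way to read the last question is as a simple counter of open read views
that DDL has to check. The Python sketch below models only that idea.
`ReadViewGuard` and its methods are hypothetical, and it assumes that views
are opened, closed and checked from a single thread (tx), so no locking is
shown.

```python
class ReadViewGuard:
    """Hypothetical guard that rejects DDL while any read view is open."""

    def __init__(self):
        self.open_views = 0
        self.schema_version = 0

    def open_read_view(self):
        self.open_views += 1

    def close_read_view(self):
        assert self.open_views > 0, "no read view to close"
        self.open_views -= 1

    def ddl(self):
        if self.open_views > 0:
            raise RuntimeError("DDL is forbidden while a read view is open")
        self.schema_version += 1


guard = ReadViewGuard()
guard.open_read_view()
try:
    guard.ddl()
    rejected = False
except RuntimeError:
    rejected = True
guard.close_read_view()
guard.ddl()  # succeeds once the last read view is closed
print(rejected, guard.schema_version)
```

The obvious cost of this approach is that a long-running analytical query
would hold off every schema change until it finishes.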

Our requirement for a future implementation: performance and memory footprint
should be the same as now, or negligibly worse, if a user does not use read
views.

WBR, Alexander Turenko.
