From: Alexander Turenko
Date: Sun, 24 Nov 2019 07:02:04 +0300
Subject: [Tarantool-discussions] Multithreading in Tarantool
To: tarantool-discussions@dev.tarantool.org
Cc: Nick Karlov, Konstantin Nazarov
Message-ID: <20191124040203.ai3al2j4sywx3ljz@tkn_work_nb>

We are asked for multithreaded request processing from time to time. I
want to share the cases we heard from users and the key points of the
discussions around this feature as a basis for future work.

Feel free to extend the discussion with any cases, thoughts or
suggestions.

Case 1
------

Analytical SQL queries on a read view that do not block the tx thread.

It is proposed to add an option to box.execute() that will create a read
view, spawn a thread (or use a prespawned one), perform the query in this
thread using the read view handle and then push the results back to tx.
The read view is destroyed in tx before the results are returned.

Case 2
------

I'm not sure about this case, to be honest. It is kind of vague. However,
it is quite similar to the one described above, except that a read view
is acquired / released from a spawned thread and the results don't have
to be pushed to tx.

A long-lived thread acts as a TCP server that implements a specific
protocol. It needs box's data to process requests. Low response latency
is mission-critical.

There are several ways to provide an implementation of the protocol:

* Implement the protocol right in tarantool using the built-in `socket`
  module.
* Run a TCP server in a separate process and acquire data from tarantool
  using the binary protocol over a Unix socket.
* Run a TCP server in a long-lived coio thread, which can acquire a read
  view, call box functions with it and release the read view.

In some circumstances the third way may pay off in terms of development
time (no need to separate CPU-intensive and database-dependent code) and
latency (no need to copy data between processes).

However, as I said, I'm tentative here. Let's consider this case a
second-tier wish.

Discussion
----------

The cases can be divided into separate problems:

1. We should implement the ability to create and destroy read views and
   the ability to use them to perform box requests.
2. We should allow a coio thread to:
   - Obtain a read view handle (for case 2).
   - Call box functions with a read view handle.
   - Push data to tx.
   - NB: Chunked transfer to tx may be important.
3. Enhance the SQL executor:
   - Support SQL queries on a read view.
   - Forbid DML / DDL inside queries on a read view.
   - NB: The SQL engine can write temporary tables while executing a
     read query.

Out of scope: several Lua states working in different threads, and data
transfer / RPC between them. This looks like a really significant amount
of work.

There is a problem. If we create a read view and then perform a DDL
request, the read view will become broken. It seems we should add more
assertions around those cases. How do we shield a user from this kind of
mistake? Forbid any DDL while at least one read view exists?

Our requirement for a future implementation: performance and memory
footprint should be the same as now, or negligibly worse, if a user does
not use read views.

WBR, Alexander Turenko.
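
To make the case 1 flow more concrete, here is a toy model of it in
Python. It is not Tarantool code and uses no Tarantool API: `ToyStore`,
`create_read_view`, `destroy_read_view`, `run_on_read_view` and
`on_chunk` are all hypothetical names, and the read view is modelled as a
plain copy of a dict, which a real engine would surely avoid. The sketch
only illustrates the ordering: the view is created in tx, the query runs
in another thread, results come back in chunks, and the view is destroyed
in tx before the results are returned.

```python
import queue
import threading


class ToyStore:
    """Toy stand-in for box data; only the tx thread touches it."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.open_views = 0

    def create_read_view(self):
        # Assumption: a frozen copy models a read view. A real engine
        # would rather keep old tuple versions than copy everything.
        self.open_views += 1
        return dict(self.rows)

    def destroy_read_view(self):
        self.open_views -= 1


def run_on_read_view(store, query, on_chunk=None, chunk_size=2):
    """Call from tx: run query(view) in a worker thread, gather results."""
    view = store.create_read_view()       # created in tx
    chunks = queue.Queue()                # worker -> tx channel

    def worker():
        try:
            result = query(view)          # sees only the frozen view
            for i in range(0, len(result), chunk_size):
                chunks.put(result[i:i + chunk_size])
        finally:
            chunks.put(None)              # end-of-stream marker

    thread = threading.Thread(target=worker)
    thread.start()
    collected = []
    try:
        # A real tx thread would yield the fiber here, not block.
        while (chunk := chunks.get()) is not None:
            collected.extend(chunk)
            if on_chunk is not None:
                on_chunk(chunk)           # tx may keep writing meanwhile
    finally:
        thread.join()
        store.destroy_read_view()         # destroyed in tx before returning
    return collected
```

The `open_views` counter is the kind of hook that could back a "forbid
DDL while a read view exists" check, though whether that is the right
policy is exactly the open question above.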