* [PATCH 0/8] Follow-up on async memtx index cleanup
@ 2018-05-22 11:46 Vladimir Davydov
2018-05-22 11:46 ` [PATCH 1/8] memtx: init index extent allocator in engine constructor Vladimir Davydov
` (7 more replies)
0 siblings, 8 replies; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 11:46 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
It turned out that after commit 2a1482f33fa0 ("memtx: free tuples
asynchronously when primary index is dropped") we can't merge
1.10 to 2.0 without breaking tests, because one of the tests
(namely sql-tap/gh-3083-ephemeral-unref-tuples) executes a lot
of SQL SELECT statements without yielding. Those statements
in turn create a lot of ephemeral spaces, which consume all
available memory because memtx garbage collection never gets
a chance to run. To fix that, this patch set improves the
garbage collection procedure: it is now invoked not only by
a background fiber, but also on demand, i.e. whenever a memtx
function fails to allocate memory.
https://github.com/tarantool/tarantool/issues/3408
https://github.com/tarantool/tarantool/commits/memtx-run-gc-on-demand
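In other words, every memtx allocation path gets a fallback that runs
a few steps of garbage collection and retries, roughly like the sketch
below. All helper names in the sketch are hypothetical and used only
for illustration; the real hooks are added by the last two patches of
the series.

static void *
memtx_alloc_or_collect(struct memtx_engine *memtx, size_t size)
{
	void *ptr = memtx_try_alloc(memtx, size);
	while (ptr == NULL && memtx_engine_has_garbage(memtx)) {
		/* Free a batch of tuples left over from dropped indexes. */
		memtx_engine_run_gc_step(memtx);
		ptr = memtx_try_alloc(memtx, size);
	}
	return ptr; /* still NULL if there was nothing left to collect */
}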
Vladimir Davydov (8):
memtx: init index extent allocator in engine constructor
memtx: fold memtx_tuple.cc into memtx_engine.c
memtx: pass engine to memory allocation functions
memtx: move all global variables to engine
memtx: destroy slab arena on engine shutdown
memtx: embed light hash into memtx_hash_index
memtx: rework background garbage collection procedure
memtx: run garbage collection on demand
src/box/CMakeLists.txt | 1 -
src/box/alter.cc | 1 -
src/box/index.cc | 7 +-
src/box/lua/slab.c | 35 ++++---
src/box/lua/tuple.c | 1 -
src/box/lua/xlog.c | 1 -
src/box/memtx_bitset.c | 4 +-
src/box/memtx_engine.c | 257 ++++++++++++++++++++++++++++++++++---------------
src/box/memtx_engine.h | 67 ++++++++++---
src/box/memtx_hash.c | 93 ++++++------------
src/box/memtx_hash.h | 35 ++++++-
src/box/memtx_rtree.c | 4 +-
src/box/memtx_space.c | 7 +-
src/box/memtx_tree.c | 33 +++----
src/box/memtx_tree.h | 1 +
src/box/memtx_tuple.cc | 193 -------------------------------------
src/box/memtx_tuple.h | 100 -------------------
src/box/tuple_format.c | 1 +
src/box/tuple_format.h | 2 +
19 files changed, 349 insertions(+), 494 deletions(-)
delete mode 100644 src/box/memtx_tuple.cc
delete mode 100644 src/box/memtx_tuple.h
--
2.11.0
* [PATCH 1/8] memtx: init index extent allocator in engine constructor
2018-05-22 11:46 [PATCH 0/8] Follow-up on async memtx index cleanup Vladimir Davydov
@ 2018-05-22 11:46 ` Vladimir Davydov
2018-05-22 13:43 ` Konstantin Osipov
2018-05-22 11:46 ` [PATCH 2/8] memtx: fold memtx_tuple.cc into memtx_engine.c Vladimir Davydov
` (6 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 11:46 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
Postponing extent allocator initialization until a memtx index is
created for the first time saves us neither memory nor CPU; it only
makes the code more difficult to follow.
---
src/box/memtx_bitset.c | 2 --
src/box/memtx_engine.c | 34 +++++++---------------------------
src/box/memtx_engine.h | 9 ---------
src/box/memtx_hash.c | 2 --
src/box/memtx_rtree.c | 2 --
src/box/memtx_tree.c | 2 --
6 files changed, 7 insertions(+), 44 deletions(-)
diff --git a/src/box/memtx_bitset.c b/src/box/memtx_bitset.c
index 7d8ee11a..93b4a00a 100644
--- a/src/box/memtx_bitset.c
+++ b/src/box/memtx_bitset.c
@@ -490,8 +490,6 @@ memtx_bitset_index_new(struct memtx_engine *memtx, struct index_def *def)
assert(def->iid > 0);
assert(!def->opts.is_unique);
- memtx_index_arena_init();
-
if (!mempool_is_initialized(&memtx->bitset_iterator_pool)) {
mempool_create(&memtx->bitset_iterator_pool, cord_slab_cache(),
sizeof(struct bitset_index_iterator));
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index ee3afaec..fd93491e 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -55,7 +55,6 @@
* box.slab.info(), @sa lua/slab.cc
*/
extern struct quota memtx_quota;
-static bool memtx_index_arena_initialized = false;
struct slab_arena memtx_arena; /* used by memtx_tuple.cc */
static struct slab_cache memtx_index_slab_cache;
struct mempool memtx_index_extent_pool;
@@ -971,6 +970,13 @@ memtx_engine_new(const char *snap_dirname, bool force_recovery,
if (memtx->gc_fiber == NULL)
goto fail;
+ /* Initialize index extent allocator. */
+ slab_cache_create(&memtx_index_slab_cache, &memtx_arena);
+ mempool_create(&memtx_index_extent_pool, &memtx_index_slab_cache,
+ MEMTX_EXTENT_SIZE);
+ memtx_index_num_reserved_extents = 0;
+ memtx_index_reserved_extents = NULL;
+
memtx->state = MEMTX_INITIALIZED;
memtx->force_recovery = force_recovery;
@@ -1007,32 +1013,6 @@ memtx_engine_set_max_tuple_size(struct memtx_engine *memtx, size_t max_size)
}
/**
- * Initialize arena for indexes.
- * The arena is used for memtx_index_extent_alloc
- * and memtx_index_extent_free.
- * Can be called several times, only first call do the work.
- */
-void
-memtx_index_arena_init(void)
-{
- if (memtx_index_arena_initialized) {
- /* already done.. */
- return;
- }
- /* Creating slab cache */
- slab_cache_create(&memtx_index_slab_cache, &memtx_arena);
- /* Creating mempool */
- mempool_create(&memtx_index_extent_pool,
- &memtx_index_slab_cache,
- MEMTX_EXTENT_SIZE);
- /* Empty reserved list */
- memtx_index_num_reserved_extents = 0;
- memtx_index_reserved_extents = 0;
- /* Done */
- memtx_index_arena_initialized = true;
-}
-
-/**
* Allocate a block of size MEMTX_EXTENT_SIZE for memtx index
*/
void *
diff --git a/src/box/memtx_engine.h b/src/box/memtx_engine.h
index 2ce597b4..72f40528 100644
--- a/src/box/memtx_engine.h
+++ b/src/box/memtx_engine.h
@@ -154,15 +154,6 @@ enum {
};
/**
- * Initialize arena for indexes.
- * The arena is used for memtx_index_extent_alloc
- * and memtx_index_extent_free.
- * Can be called several times, only first call do the work.
- */
-void
-memtx_index_arena_init(void);
-
-/**
* Allocate a block of size MEMTX_EXTENT_SIZE for memtx index
*/
void *
diff --git a/src/box/memtx_hash.c b/src/box/memtx_hash.c
index 20d6ac5d..08ba84ad 100644
--- a/src/box/memtx_hash.c
+++ b/src/box/memtx_hash.c
@@ -461,8 +461,6 @@ static const struct index_vtab memtx_hash_index_vtab = {
struct memtx_hash_index *
memtx_hash_index_new(struct memtx_engine *memtx, struct index_def *def)
{
- memtx_index_arena_init();
-
if (!mempool_is_initialized(&memtx->hash_iterator_pool)) {
mempool_create(&memtx->hash_iterator_pool, cord_slab_cache(),
sizeof(struct hash_iterator));
diff --git a/src/box/memtx_rtree.c b/src/box/memtx_rtree.c
index 6afd5521..baeffa26 100644
--- a/src/box/memtx_rtree.c
+++ b/src/box/memtx_rtree.c
@@ -350,8 +350,6 @@ memtx_rtree_index_new(struct memtx_engine *memtx, struct index_def *def)
enum rtree_distance_type distance_type =
(enum rtree_distance_type)def->opts.distance;
- memtx_index_arena_init();
-
if (!mempool_is_initialized(&memtx->rtree_iterator_pool)) {
mempool_create(&memtx->rtree_iterator_pool, cord_slab_cache(),
sizeof(struct index_rtree_iterator));
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index 6b76e538..0911af52 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -669,8 +669,6 @@ static const struct index_vtab memtx_tree_index_vtab = {
struct memtx_tree_index *
memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def)
{
- memtx_index_arena_init();
-
if (!mempool_is_initialized(&memtx->tree_iterator_pool)) {
mempool_create(&memtx->tree_iterator_pool, cord_slab_cache(),
sizeof(struct tree_iterator));
--
2.11.0
* [PATCH 2/8] memtx: fold memtx_tuple.cc into memtx_engine.c
2018-05-22 11:46 [PATCH 0/8] Follow-up on async memtx index cleanup Vladimir Davydov
2018-05-22 11:46 ` [PATCH 1/8] memtx: init index extent allocator in engine constructor Vladimir Davydov
@ 2018-05-22 11:46 ` Vladimir Davydov
2018-05-22 13:45 ` Konstantin Osipov
2018-05-22 11:46 ` [PATCH 3/8] memtx: pass engine to memory allocation functions Vladimir Davydov
` (5 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 11:46 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
The two files are too closely related: memtx_arena is defined and
used in memtx_engine.c, but initialized in memtx_tuple.cc. Since
memtx_tuple.cc is small, let's fold it into memtx_engine.c.
---
src/box/CMakeLists.txt | 1 -
src/box/alter.cc | 1 -
src/box/lua/tuple.c | 1 -
src/box/lua/xlog.c | 1 -
src/box/memtx_engine.c | 152 ++++++++++++++++++++++++++++++++++----
src/box/memtx_engine.h | 13 ++++
src/box/memtx_space.c | 3 +-
src/box/memtx_tuple.cc | 193 -------------------------------------------------
src/box/memtx_tuple.h | 100 -------------------------
9 files changed, 152 insertions(+), 313 deletions(-)
delete mode 100644 src/box/memtx_tuple.cc
delete mode 100644 src/box/memtx_tuple.h
diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 0bbc857a..6b1ae3e8 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -68,7 +68,6 @@ add_library(box STATIC
engine.c
memtx_engine.c
memtx_space.c
- memtx_tuple.cc
sysview_engine.c
sysview_index.c
vinyl.c
diff --git a/src/box/alter.cc b/src/box/alter.cc
index 518f515b..f0315ff7 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -50,7 +50,6 @@
#include "xrow.h"
#include "iproto_constants.h"
#include "identifier.h"
-#include "memtx_tuple.h"
#include "version.h"
#include "sequence.h"
diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
index dc98e967..80573313 100644
--- a/src/box/lua/tuple.c
+++ b/src/box/lua/tuple.c
@@ -41,7 +41,6 @@
#include "box/tuple.h"
#include "box/tuple_convert.h"
#include "box/errcode.h"
-#include "box/memtx_tuple.h"
#include "json/path.h"
/** {{{ box.tuple Lua library
diff --git a/src/box/lua/xlog.c b/src/box/lua/xlog.c
index 030f5c2d..70384c1d 100644
--- a/src/box/lua/xlog.c
+++ b/src/box/lua/xlog.c
@@ -43,7 +43,6 @@
#include <box/lua/tuple.h>
#include <lua/msgpack.h>
#include <lua/utils.h>
-#include "box/memtx_tuple.h"
/* {{{ Helpers */
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index fd93491e..0c5136cf 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -30,12 +30,14 @@
*/
#include "memtx_engine.h"
#include "memtx_space.h"
-#include "memtx_tuple.h"
+#include <small/quota.h>
#include <small/small.h>
#include <small/mempool.h>
#include "fiber.h"
+#include "clock.h"
+#include "errinj.h"
#include "coio_file.h"
#include "tuple.h"
#include "txn.h"
@@ -48,16 +50,25 @@
#include "schema.h"
#include "gc.h"
-/** For all memory used by all indexes.
- * If you decide to use memtx_index_arena or
- * memtx_index_slab_cache for anything other than
- * memtx_index_extent_pool, make sure this is reflected in
- * box.slab.info(), @sa lua/slab.cc
+/*
+ * If you decide to use memtx_arena for anything other than
+ * memtx_alloc or memtx_index_extent_pool, make sure this
+ * is reflected in box.slab.info(), @sa lua/slab.c.
*/
-extern struct quota memtx_quota;
-struct slab_arena memtx_arena; /* used by memtx_tuple.cc */
+
+/** Common quota for memtx tuples and indexes. */
+static struct quota memtx_quota;
+/** Common slab arena for memtx tuples and indexes. */
+static struct slab_arena memtx_arena;
+/** Slab cache for allocating memtx tuples. */
+static struct slab_cache memtx_slab_cache;
+/** Memtx tuple allocator. */
+struct small_alloc memtx_alloc; /* used by box.slab.info() */
+/** Slab cache for allocating memtx index extents. */
static struct slab_cache memtx_index_slab_cache;
-struct mempool memtx_index_extent_pool;
+/** Memtx index extent allocator. */
+struct mempool memtx_index_extent_pool; /* used by box.slab.info() */
+
/**
* To ensure proper statement-level rollback in case
* of out of memory conditions, we maintain a number
@@ -68,6 +79,12 @@ struct mempool memtx_index_extent_pool;
static int memtx_index_num_reserved_extents;
static void *memtx_index_reserved_extents;
+/** Maximal allowed tuple size, box.cfg.memtx_max_tuple_size. */
+static size_t memtx_max_tuple_size = 1 * 1024 * 1024;
+
+/** Incremented with each next snapshot. */
+uint32_t snapshot_version;
+
static void
txn_on_yield_or_stop(struct trigger *trigger, void *event)
{
@@ -76,6 +93,23 @@ txn_on_yield_or_stop(struct trigger *trigger, void *event)
txn_rollback(); /* doesn't throw */
}
+struct memtx_tuple {
+ /*
+ * sic: the header of the tuple is used
+ * to store a free list pointer in smfree_delayed.
+ * Please don't change it without understanding
+ * how smfree_delayed and snapshotting COW works.
+ */
+ /** Snapshot generation version. */
+ uint32_t version;
+ struct tuple base;
+};
+
+enum {
+ OBJSIZE_MIN = 16,
+ SLAB_SIZE = 16 * 1024 * 1024,
+};
+
static int
memtx_end_build_primary_key(struct space *space, void *param)
{
@@ -140,7 +174,6 @@ memtx_engine_shutdown(struct engine *engine)
mempool_destroy(&memtx->bitset_iterator_pool);
xdir_destroy(&memtx->snap_dir);
free(memtx);
- memtx_tuple_free();
}
static int
@@ -668,7 +701,8 @@ memtx_engine_begin_checkpoint(struct engine *engine)
}
/* increment snapshot version; set tuple deletion to delayed mode */
- memtx_tuple_begin_snapshot();
+ snapshot_version++;
+ small_alloc_setopt(&memtx_alloc, SMALL_DELAYED_FREE_MODE, true);
return 0;
}
@@ -714,7 +748,7 @@ memtx_engine_commit_checkpoint(struct engine *engine, struct vclock *vclock)
/* waitCheckpoint() must have been done. */
assert(!memtx->checkpoint->waiting_for_snap_thread);
- memtx_tuple_end_snapshot();
+ small_alloc_setopt(&memtx_alloc, SMALL_DELAYED_FREE_MODE, false);
if (!memtx->checkpoint->touch) {
int64_t lsn = vclock_sum(memtx->checkpoint->vclock);
@@ -757,7 +791,7 @@ memtx_engine_abort_checkpoint(struct engine *engine)
memtx->checkpoint->waiting_for_snap_thread = false;
}
- memtx_tuple_end_snapshot();
+ small_alloc_setopt(&memtx_alloc, SMALL_DELAYED_FREE_MODE, false);
/** Remove garbage .inprogress file. */
char *filename =
@@ -950,8 +984,6 @@ memtx_engine_new(const char *snap_dirname, bool force_recovery,
uint64_t tuple_arena_max_size, uint32_t objsize_min,
float alloc_factor)
{
- memtx_tuple_init(tuple_arena_max_size, objsize_min, alloc_factor);
-
struct memtx_engine *memtx = calloc(1, sizeof(*memtx));
if (memtx == NULL) {
diag_set(OutOfMemory, sizeof(*memtx),
@@ -970,6 +1002,18 @@ memtx_engine_new(const char *snap_dirname, bool force_recovery,
if (memtx->gc_fiber == NULL)
goto fail;
+ /* Apply lowest allowed objsize bound. */
+ if (objsize_min < OBJSIZE_MIN)
+ objsize_min = OBJSIZE_MIN;
+
+ /* Initialize tuple allocator. */
+ quota_init(&memtx_quota, tuple_arena_max_size);
+ tuple_arena_create(&memtx_arena, &memtx_quota, tuple_arena_max_size,
+ SLAB_SIZE, "memtx");
+ slab_cache_create(&memtx_slab_cache, &memtx_arena);
+ small_alloc_create(&memtx_alloc, &memtx_slab_cache,
+ objsize_min, alloc_factor);
+
/* Initialize index extent allocator. */
slab_cache_create(&memtx_index_slab_cache, &memtx_arena);
mempool_create(&memtx_index_extent_pool, &memtx_index_slab_cache,
@@ -1012,6 +1056,84 @@ memtx_engine_set_max_tuple_size(struct memtx_engine *memtx, size_t max_size)
memtx_max_tuple_size = max_size;
}
+struct tuple *
+memtx_tuple_new(struct tuple_format *format, const char *data, const char *end)
+{
+ assert(mp_typeof(*data) == MP_ARRAY);
+ size_t tuple_len = end - data;
+ size_t meta_size = tuple_format_meta_size(format);
+ size_t total = sizeof(struct memtx_tuple) + meta_size + tuple_len;
+
+ ERROR_INJECT(ERRINJ_TUPLE_ALLOC, {
+ diag_set(OutOfMemory, total, "slab allocator", "memtx_tuple");
+ return NULL;
+ });
+ if (unlikely(total > memtx_max_tuple_size)) {
+ diag_set(ClientError, ER_MEMTX_MAX_TUPLE_SIZE, total);
+ error_log(diag_last_error(diag_get()));
+ return NULL;
+ }
+
+ struct memtx_tuple *memtx_tuple = smalloc(&memtx_alloc, total);
+ if (memtx_tuple == NULL) {
+ diag_set(OutOfMemory, total, "slab allocator", "memtx_tuple");
+ return NULL;
+ }
+ struct tuple *tuple = &memtx_tuple->base;
+ tuple->refs = 0;
+ memtx_tuple->version = snapshot_version;
+ assert(tuple_len <= UINT32_MAX); /* bsize is UINT32_MAX */
+ tuple->bsize = tuple_len;
+ tuple->format_id = tuple_format_id(format);
+ tuple_format_ref(format);
+ /*
+ * Data offset is calculated from the begin of the struct
+ * tuple base, not from memtx_tuple, because the struct
+ * tuple is not the first field of the memtx_tuple.
+ */
+ tuple->data_offset = sizeof(struct tuple) + meta_size;
+ char *raw = (char *) tuple + tuple->data_offset;
+ uint32_t *field_map = (uint32_t *) raw;
+ memcpy(raw, data, tuple_len);
+ if (tuple_init_field_map(format, field_map, raw)) {
+ memtx_tuple_delete(format, tuple);
+ return NULL;
+ }
+ say_debug("%s(%zu) = %p", __func__, tuple_len, memtx_tuple);
+ return tuple;
+}
+
+void
+memtx_tuple_delete(struct tuple_format *format, struct tuple *tuple)
+{
+ say_debug("%s(%p)", __func__, tuple);
+ assert(tuple->refs == 0);
+#ifndef NDEBUG
+ struct errinj *delay = errinj(ERRINJ_MEMTX_TUPLE_DELETE_DELAY,
+ ERRINJ_DOUBLE);
+ if (delay != NULL && delay->dparam > 0) {
+ double start = clock_monotonic();
+ while (clock_monotonic() < start + delay->dparam)
+ /* busy loop */ ;
+ }
+#endif
+ size_t total = sizeof(struct memtx_tuple) +
+ tuple_format_meta_size(format) + tuple->bsize;
+ tuple_format_unref(format);
+ struct memtx_tuple *memtx_tuple =
+ container_of(tuple, struct memtx_tuple, base);
+ if (memtx_alloc.free_mode != SMALL_DELAYED_FREE ||
+ memtx_tuple->version == snapshot_version)
+ smfree(&memtx_alloc, memtx_tuple, total);
+ else
+ smfree_delayed(&memtx_alloc, memtx_tuple, total);
+}
+
+struct tuple_format_vtab memtx_tuple_format_vtab = {
+ memtx_tuple_delete,
+ memtx_tuple_new,
+};
+
/**
* Allocate a block of size MEMTX_EXTENT_SIZE for memtx index
*/
diff --git a/src/box/memtx_engine.h b/src/box/memtx_engine.h
index 72f40528..9f28c268 100644
--- a/src/box/memtx_engine.h
+++ b/src/box/memtx_engine.h
@@ -45,6 +45,8 @@ extern "C" {
struct index;
struct fiber;
+struct tuple;
+struct tuple_format;
/**
* The state of memtx recovery process.
@@ -148,6 +150,17 @@ memtx_engine_set_snap_io_rate_limit(struct memtx_engine *memtx, double limit);
void
memtx_engine_set_max_tuple_size(struct memtx_engine *memtx, size_t max_size);
+/** Allocate a memtx tuple. @sa tuple_new(). */
+struct tuple *
+memtx_tuple_new(struct tuple_format *format, const char *data, const char *end);
+
+/** Free a memtx tuple. @sa tuple_delete(). */
+void
+memtx_tuple_delete(struct tuple_format *format, struct tuple *tuple);
+
+/** Tuple format vtab for memtx engine. */
+extern struct tuple_format_vtab memtx_tuple_format_vtab;
+
enum {
MEMTX_EXTENT_SIZE = 16 * 1024,
MEMTX_SLAB_SIZE = 4 * 1024 * 1024
diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c
index 73df4c7d..7c795b99 100644
--- a/src/box/memtx_space.c
+++ b/src/box/memtx_space.c
@@ -32,13 +32,14 @@
#include "space.h"
#include "iproto_constants.h"
#include "txn.h"
+#include "tuple.h"
#include "tuple_update.h"
#include "xrow.h"
#include "memtx_hash.h"
#include "memtx_tree.h"
#include "memtx_rtree.h"
#include "memtx_bitset.h"
-#include "memtx_tuple.h"
+#include "memtx_engine.h"
#include "column_mask.h"
#include "sequence.h"
diff --git a/src/box/memtx_tuple.cc b/src/box/memtx_tuple.cc
deleted file mode 100644
index 60564e3c..00000000
--- a/src/box/memtx_tuple.cc
+++ /dev/null
@@ -1,193 +0,0 @@
-/*
- * Copyright 2010-2016, Tarantool AUTHORS, please see AUTHORS file.
- *
- * Redistribution and use in source and binary forms, with or
- * without modification, are permitted provided that the following
- * conditions are met:
- *
- * 1. Redistributions of source code must retain the above
- * copyright notice, this list of conditions and the
- * following disclaimer.
- *
- * 2. Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following
- * disclaimer in the documentation and/or other materials
- * provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
- * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
- * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
- * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
- * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
- * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
- * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
- * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
- * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- * SUCH DAMAGE.
- */
-
-#include "memtx_tuple.h"
-
-#include "small/small.h"
-#include "small/region.h"
-#include "small/quota.h"
-#include "fiber.h"
-#include "clock.h"
-#include "errinj.h"
-#include "box.h"
-
-struct memtx_tuple {
- /*
- * sic: the header of the tuple is used
- * to store a free list pointer in smfree_delayed.
- * Please don't change it without understanding
- * how smfree_delayed and snapshotting COW works.
- */
- /** Snapshot generation version. */
- uint32_t version;
- struct tuple base;
-};
-
-/** Memtx slab arena */
-extern struct slab_arena memtx_arena; /* defined in memtx_engine.cc */
-/* Memtx slab_cache for tuples */
-static struct slab_cache memtx_slab_cache;
-/** Common quota for memtx tuples and indexes */
-static struct quota memtx_quota;
-/** Memtx tuple allocator */
-struct small_alloc memtx_alloc; /* used box box.slab.info() */
-/* The maximal allowed tuple size, box.cfg.memtx_max_tuple_size */
-size_t memtx_max_tuple_size = 1 * 1024 * 1024; /* set dynamically */
-uint32_t snapshot_version;
-
-enum {
- /** Lowest allowed slab_alloc_minimal */
- OBJSIZE_MIN = 16,
- SLAB_SIZE = 16 * 1024 * 1024,
-};
-
-void
-memtx_tuple_init(uint64_t tuple_arena_max_size, uint32_t objsize_min,
- float alloc_factor)
-{
- /* Apply lowest allowed objsize bounds */
- if (objsize_min < OBJSIZE_MIN)
- objsize_min = OBJSIZE_MIN;
- /** Preallocate entire quota. */
- quota_init(&memtx_quota, tuple_arena_max_size);
- tuple_arena_create(&memtx_arena, &memtx_quota, tuple_arena_max_size,
- SLAB_SIZE, "memtx");
- slab_cache_create(&memtx_slab_cache, &memtx_arena);
- small_alloc_create(&memtx_alloc, &memtx_slab_cache,
- objsize_min, alloc_factor);
-}
-
-void
-memtx_tuple_free(void)
-{
-}
-
-struct tuple_format_vtab memtx_tuple_format_vtab = {
- memtx_tuple_delete,
- memtx_tuple_new,
-};
-
-struct tuple *
-memtx_tuple_new(struct tuple_format *format, const char *data, const char *end)
-{
- assert(mp_typeof(*data) == MP_ARRAY);
- size_t tuple_len = end - data;
- size_t meta_size = tuple_format_meta_size(format);
- size_t total = sizeof(struct memtx_tuple) + meta_size + tuple_len;
-
- ERROR_INJECT(ERRINJ_TUPLE_ALLOC,
- do { diag_set(OutOfMemory, (unsigned) total,
- "slab allocator", "memtx_tuple"); return NULL; }
- while(false); );
- if (unlikely(total > memtx_max_tuple_size)) {
- diag_set(ClientError, ER_MEMTX_MAX_TUPLE_SIZE,
- (unsigned) total);
- error_log(diag_last_error(diag_get()));
- return NULL;
- }
-
- struct memtx_tuple *memtx_tuple =
- (struct memtx_tuple *) smalloc(&memtx_alloc, total);
- /**
- * Use a nothrow version and throw an exception here,
- * to throw an instance of ClientError. Apart from being
- * more nice to the user, ClientErrors are ignored in
- * force_recovery=true mode, allowing us to start
- * with lower arena than necessary in the circumstances
- * of disaster recovery.
- */
- if (memtx_tuple == NULL) {
- diag_set(OutOfMemory, (unsigned) total,
- "slab allocator", "memtx_tuple");
- return NULL;
- }
- struct tuple *tuple = &memtx_tuple->base;
- tuple->refs = 0;
- memtx_tuple->version = snapshot_version;
- assert(tuple_len <= UINT32_MAX); /* bsize is UINT32_MAX */
- tuple->bsize = tuple_len;
- tuple->format_id = tuple_format_id(format);
- tuple_format_ref(format);
- /*
- * Data offset is calculated from the begin of the struct
- * tuple base, not from memtx_tuple, because the struct
- * tuple is not the first field of the memtx_tuple.
- */
- tuple->data_offset = sizeof(struct tuple) + meta_size;
- char *raw = (char *) tuple + tuple->data_offset;
- uint32_t *field_map = (uint32_t *) raw;
- memcpy(raw, data, tuple_len);
- if (tuple_init_field_map(format, field_map, raw)) {
- memtx_tuple_delete(format, tuple);
- return NULL;
- }
- say_debug("%s(%zu) = %p", __func__, tuple_len, memtx_tuple);
- return tuple;
-}
-
-void
-memtx_tuple_delete(struct tuple_format *format, struct tuple *tuple)
-{
- say_debug("%s(%p)", __func__, tuple);
- assert(tuple->refs == 0);
-#ifndef NDEBUG
- struct errinj *delay = errinj(ERRINJ_MEMTX_TUPLE_DELETE_DELAY,
- ERRINJ_DOUBLE);
- if (delay != NULL && delay->dparam > 0) {
- double start = clock_monotonic();
- while (clock_monotonic() < start + delay->dparam)
- /* busy loop */ ;
- }
-#endif
- size_t total = sizeof(struct memtx_tuple) +
- tuple_format_meta_size(format) + tuple->bsize;
- tuple_format_unref(format);
- struct memtx_tuple *memtx_tuple =
- container_of(tuple, struct memtx_tuple, base);
- if (memtx_alloc.free_mode != SMALL_DELAYED_FREE ||
- memtx_tuple->version == snapshot_version)
- smfree(&memtx_alloc, memtx_tuple, total);
- else
- smfree_delayed(&memtx_alloc, memtx_tuple, total);
-}
-
-void
-memtx_tuple_begin_snapshot()
-{
- snapshot_version++;
- small_alloc_setopt(&memtx_alloc, SMALL_DELAYED_FREE_MODE, true);
-}
-
-void
-memtx_tuple_end_snapshot()
-{
- small_alloc_setopt(&memtx_alloc, SMALL_DELAYED_FREE_MODE, false);
-}
diff --git a/src/box/memtx_tuple.h b/src/box/memtx_tuple.h
deleted file mode 100644
index e06fb63d..00000000
--- a/src/box/memtx_tuple.h
+++ /dev/null
@@ -1,100 +0,0 @@
-#ifndef INCLUDES_TARANTOOL_BOX_MEMTX_TUPLE_H
-#define INCLUDES_TARANTOOL_BOX_MEMTX_TUPLE_H
-/*
- * Copyright 2010-2016, Tarantool AUTHORS, please see AUTHORS file.
- *
- * Redistribution and use in source and binary forms, with or
- * without modification, are permitted provided that the following
- * conditions are met:
- *
- * 1. Redistributions of source code must retain the above
- * copyright notice, this list of conditions and the
- * following disclaimer.
- *
- * 2. Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following
- * disclaimer in the documentation and/or other materials
- * provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
- * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
- * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
- * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
- * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
- * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
- * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
- * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
- * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- * SUCH DAMAGE.
- */
-
-#include "diag.h"
-#include "tuple_format.h"
-#include "tuple.h"
-
-#if defined(__cplusplus)
-extern "C" {
-#endif /* defined(__cplusplus) */
-
-/** Memtx tuple allocator, available to statistics. */
-extern struct small_alloc memtx_alloc;
-
-/**
- * Initialize memtx_tuple library
- */
-void
-memtx_tuple_init(uint64_t tuple_arena_max_size, uint32_t objsize_min,
- float alloc_factor);
-
-/**
- * Cleanup memtx_tuple library
- */
-void
-memtx_tuple_free(void);
-
-/** Create a tuple in the memtx engine format. @sa tuple_new(). */
-struct tuple *
-memtx_tuple_new(struct tuple_format *format, const char *data, const char *end);
-
-/**
- * Free the tuple of a memtx space.
- * @pre tuple->refs == 0
- */
-void
-memtx_tuple_delete(struct tuple_format *format, struct tuple *tuple);
-
-/** Maximal allowed tuple size (box.cfg.memtx_max_tuple_size) */
-extern size_t memtx_max_tuple_size;
-
-/** tuple format vtab for memtx engine. */
-extern struct tuple_format_vtab memtx_tuple_format_vtab;
-
-void
-memtx_tuple_begin_snapshot();
-
-void
-memtx_tuple_end_snapshot();
-
-#if defined(__cplusplus)
-}
-
-/**
- * Create a tuple in the memtx engine format. Throw an exception
- * if an error occured. @sa memtx_tuple_new().
- */
-static inline struct tuple *
-memtx_tuple_new_xc(struct tuple_format *format, const char *data,
- const char *end)
-{
- struct tuple *res = memtx_tuple_new(format, data, end);
- if (res == NULL)
- diag_raise();
- return res;
-}
-
-#endif /* defined(__cplusplus) */
-
-#endif
--
2.11.0
* [PATCH 3/8] memtx: pass engine to memory allocation functions
2018-05-22 11:46 [PATCH 0/8] Follow-up on async memtx index cleanup Vladimir Davydov
2018-05-22 11:46 ` [PATCH 1/8] memtx: init index extent allocator in engine constructor Vladimir Davydov
2018-05-22 11:46 ` [PATCH 2/8] memtx: fold memtx_tuple.cc into memtx_engine.c Vladimir Davydov
@ 2018-05-22 11:46 ` Vladimir Davydov
2018-05-22 13:47 ` Konstantin Osipov
2018-05-22 11:46 ` [PATCH 4/8] memtx: move all global variables to engine Vladimir Davydov
` (4 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 11:46 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
We need this so that we can force garbage collection when we are short
on memory. There are two such functions: one is used for allocating
index extents, the other for allocating tuples. The index allocation
function already takes an opaque context, so we simply reuse it for
passing the memtx engine. To pass the memtx engine to the tuple
allocation function, we add an opaque engine-specific pointer to
tuple_format (engine_data) and set it to the memtx engine for memtx
spaces.
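For illustration, after this change an index allocation callback can
recover the engine from its opaque context along these lines (a sketch
only, not part of the diff below; the function name is made up):

static void *
sketch_extent_alloc(void *ctx)
{
	/* The opaque ctx now carries the engine instead of NULL. */
	struct memtx_engine *memtx = (struct memtx_engine *)ctx;
	(void)memtx; /* later patches use it to run GC on alloc failure */
	return mempool_alloc(&memtx_index_extent_pool);
}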
---
src/box/memtx_bitset.c | 2 +-
src/box/memtx_engine.c | 13 ++++++++++---
src/box/memtx_engine.h | 4 +++-
src/box/memtx_hash.c | 2 +-
src/box/memtx_rtree.c | 2 +-
src/box/memtx_space.c | 4 +++-
src/box/memtx_tree.c | 5 ++---
src/box/tuple_format.c | 1 +
src/box/tuple_format.h | 2 ++
9 files changed, 24 insertions(+), 11 deletions(-)
diff --git a/src/box/memtx_bitset.c b/src/box/memtx_bitset.c
index 93b4a00a..e2252169 100644
--- a/src/box/memtx_bitset.c
+++ b/src/box/memtx_bitset.c
@@ -514,7 +514,7 @@ memtx_bitset_index_new(struct memtx_engine *memtx, struct index_def *def)
if (index->id_to_tuple == NULL)
panic("failed to allocate memtx bitset index");
matras_create(index->id_to_tuple, MEMTX_EXTENT_SIZE, sizeof(struct tuple *),
- memtx_index_extent_alloc, memtx_index_extent_free, NULL);
+ memtx_index_extent_alloc, memtx_index_extent_free, memtx);
index->tuple_to_id = mh_bitset_index_new();
if (index->tuple_to_id == NULL)
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 0c5136cf..70478d75 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -1059,6 +1059,8 @@ memtx_engine_set_max_tuple_size(struct memtx_engine *memtx, size_t max_size)
struct tuple *
memtx_tuple_new(struct tuple_format *format, const char *data, const char *end)
{
+ struct memtx_engine *memtx = (struct memtx_engine *)format->engine_data;
+ (void)memtx;
assert(mp_typeof(*data) == MP_ARRAY);
size_t tuple_len = end - data;
size_t meta_size = tuple_format_meta_size(format);
@@ -1106,6 +1108,8 @@ memtx_tuple_new(struct tuple_format *format, const char *data, const char *end)
void
memtx_tuple_delete(struct tuple_format *format, struct tuple *tuple)
{
+ struct memtx_engine *memtx = (struct memtx_engine *)format->engine_data;
+ (void)memtx;
say_debug("%s(%p)", __func__, tuple);
assert(tuple->refs == 0);
#ifndef NDEBUG
@@ -1140,7 +1144,8 @@ struct tuple_format_vtab memtx_tuple_format_vtab = {
void *
memtx_index_extent_alloc(void *ctx)
{
- (void)ctx;
+ struct memtx_engine *memtx = (struct memtx_engine *)ctx;
+ (void)memtx;
if (memtx_index_reserved_extents) {
assert(memtx_index_num_reserved_extents > 0);
memtx_index_num_reserved_extents--;
@@ -1168,7 +1173,8 @@ memtx_index_extent_alloc(void *ctx)
void
memtx_index_extent_free(void *ctx, void *extent)
{
- (void)ctx;
+ struct memtx_engine *memtx = (struct memtx_engine *)ctx;
+ (void)memtx;
return mempool_free(&memtx_index_extent_pool, extent);
}
@@ -1177,8 +1183,9 @@ memtx_index_extent_free(void *ctx, void *extent)
* Ensure that next num extent_alloc will succeed w/o an error
*/
int
-memtx_index_extent_reserve(int num)
+memtx_index_extent_reserve(struct memtx_engine *memtx, int num)
{
+ (void)memtx;
ERROR_INJECT(ERRINJ_INDEX_ALLOC, {
/* same error as in mempool_alloc */
diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
diff --git a/src/box/memtx_engine.h b/src/box/memtx_engine.h
index 9f28c268..389314ba 100644
--- a/src/box/memtx_engine.h
+++ b/src/box/memtx_engine.h
@@ -168,12 +168,14 @@ enum {
/**
* Allocate a block of size MEMTX_EXTENT_SIZE for memtx index
+ * @ctx must point to memtx engine
*/
void *
memtx_index_extent_alloc(void *ctx);
/**
* Free a block previously allocated by memtx_index_extent_alloc
+ * @ctx must point to memtx engine
*/
void
memtx_index_extent_free(void *ctx, void *extent);
@@ -183,7 +185,7 @@ memtx_index_extent_free(void *ctx, void *extent);
* Ensure that next num extent_alloc will succeed w/o an error
*/
int
-memtx_index_extent_reserve(int num);
+memtx_index_extent_reserve(struct memtx_engine *memtx, int num);
/**
* Generic implementation of index_vtab::def_change_requires_rebuild,
diff --git a/src/box/memtx_hash.c b/src/box/memtx_hash.c
index 08ba84ad..d672cafe 100644
--- a/src/box/memtx_hash.c
+++ b/src/box/memtx_hash.c
@@ -490,7 +490,7 @@ memtx_hash_index_new(struct memtx_engine *memtx, struct index_def *def)
light_index_create(hash_table, HASH_INDEX_EXTENT_SIZE,
memtx_index_extent_alloc, memtx_index_extent_free,
- NULL, index->base.def->key_def);
+ memtx, index->base.def->key_def);
index->hash_table = hash_table;
if (def->iid == 0)
diff --git a/src/box/memtx_rtree.c b/src/box/memtx_rtree.c
index baeffa26..d751a9d8 100644
--- a/src/box/memtx_rtree.c
+++ b/src/box/memtx_rtree.c
@@ -370,7 +370,7 @@ memtx_rtree_index_new(struct memtx_engine *memtx, struct index_def *def)
index->dimension = def->opts.dimension;
rtree_init(&index->tree, index->dimension, MEMTX_EXTENT_SIZE,
- memtx_index_extent_alloc, memtx_index_extent_free, NULL,
+ memtx_index_extent_alloc, memtx_index_extent_free, memtx,
distance_type);
return index;
}
diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c
index 7c795b99..3dd97ddf 100644
--- a/src/box/memtx_space.c
+++ b/src/box/memtx_space.c
@@ -229,13 +229,14 @@ int
memtx_space_replace_all_keys(struct space *space, struct txn_stmt *stmt,
enum dup_replace_mode mode)
{
+ struct memtx_engine *memtx = (struct memtx_engine *)space->engine;
struct tuple *old_tuple = stmt->old_tuple;
struct tuple *new_tuple = stmt->new_tuple;
/*
* Ensure we have enough slack memory to guarantee
* successful statement-level rollback.
*/
- if (memtx_index_extent_reserve(new_tuple ?
+ if (memtx_index_extent_reserve(memtx, new_tuple != NULL ?
RESERVE_EXTENTS_BEFORE_REPLACE :
RESERVE_EXTENTS_BEFORE_DELETE) != 0)
return -1;
@@ -894,6 +895,7 @@ memtx_space_new(struct memtx_engine *memtx,
free(memtx_space);
return NULL;
}
+ format->engine_data = memtx;
format->exact_field_count = def->exact_field_count;
tuple_format_ref(format);
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index 0911af52..8814b13b 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -688,9 +688,8 @@ memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def)
}
struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
- memtx_tree_create(&index->tree, cmp_def,
- memtx_index_extent_alloc,
- memtx_index_extent_free, NULL);
+ memtx_tree_create(&index->tree, cmp_def, memtx_index_extent_alloc,
+ memtx_index_extent_free, memtx);
if (def->iid == 0)
index->gc_task.func = memtx_tree_index_destroy_f;
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index d21477f8..bee71e4f 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -268,6 +268,7 @@ tuple_format_new(struct tuple_format_vtab *vtab, struct key_def * const *keys,
if (format == NULL)
return NULL;
format->vtab = *vtab;
+ format->engine_data = NULL;
format->extra_size = extra_size;
if (tuple_format_register(format) < 0) {
tuple_format_destroy(format);
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index 08c91a1a..3775b484 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -117,6 +117,8 @@ struct tuple_field {
struct tuple_format {
/** Virtual function table */
struct tuple_format_vtab vtab;
+ /** Pointer to engine-specific data. */
+ void *engine_data;
/** Identifier */
uint16_t id;
/** Reference counter */
--
2.11.0
* [PATCH 4/8] memtx: move all global variables to engine
2018-05-22 11:46 [PATCH 0/8] Follow-up on async memtx index cleanup Vladimir Davydov
` (2 preceding siblings ...)
2018-05-22 11:46 ` [PATCH 3/8] memtx: pass engine to memory allocation functions Vladimir Davydov
@ 2018-05-22 11:46 ` Vladimir Davydov
2018-05-22 13:48 ` Konstantin Osipov
2018-05-22 11:46 ` [PATCH 5/8] memtx: destroy slab arena on engine shutdown Vladimir Davydov
` (3 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 11:46 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
All functions that need them are now explicitly passed the engine, so
we can consolidate all variables related to memtx engine state in one
place.
---
src/box/lua/slab.c | 35 ++++++++-------
src/box/memtx_engine.c | 114 ++++++++++++++++---------------------------------
src/box/memtx_engine.h | 32 ++++++++++++++
3 files changed, 89 insertions(+), 92 deletions(-)
diff --git a/src/box/lua/slab.c b/src/box/lua/slab.c
index 1367a4eb..9f5e7e95 100644
--- a/src/box/lua/slab.c
+++ b/src/box/lua/slab.c
@@ -41,9 +41,8 @@
#include "small/small.h"
#include "small/quota.h"
#include "memory.h"
-
-extern struct small_alloc memtx_alloc;
-extern struct mempool memtx_index_extent_pool;
+#include "box/engine.h"
+#include "box/memtx_engine.h"
static int
small_stats_noop_cb(const struct mempool_stats *stats, void *cb_ctx)
@@ -106,15 +105,18 @@ small_stats_lua_cb(const struct mempool_stats *stats, void *cb_ctx)
static int
lbox_slab_stats(struct lua_State *L)
{
+ struct memtx_engine *memtx;
+ memtx = (struct memtx_engine *)engine_by_name("memtx");
+
struct small_stats totals;
lua_newtable(L);
/*
* List all slabs used for tuples and slabs used for
* indexes, with their stats.
*/
- small_stats(&memtx_alloc, &totals, small_stats_lua_cb, L);
+ small_stats(&memtx->alloc, &totals, small_stats_lua_cb, L);
struct mempool_stats index_stats;
- mempool_stats(&memtx_index_extent_pool, &index_stats);
+ mempool_stats(&memtx->index_extent_pool, &index_stats);
small_stats_lua_cb(&index_stats, L);
return 1;
@@ -123,6 +125,9 @@ lbox_slab_stats(struct lua_State *L)
static int
lbox_slab_info(struct lua_State *L)
{
+ struct memtx_engine *memtx;
+ memtx = (struct memtx_engine *)engine_by_name("memtx");
+
struct small_stats totals;
/*
@@ -130,12 +135,10 @@ lbox_slab_info(struct lua_State *L)
* indexes, with their stats.
*/
lua_newtable(L);
- small_stats(&memtx_alloc, &totals, small_stats_noop_cb, L);
+ small_stats(&memtx->alloc, &totals, small_stats_noop_cb, L);
struct mempool_stats index_stats;
- mempool_stats(&memtx_index_extent_pool, &index_stats);
+ mempool_stats(&memtx->index_extent_pool, &index_stats);
- struct slab_arena *tuple_arena = memtx_alloc.cache->arena;
- struct quota *memtx_quota = tuple_arena->quota;
double ratio;
char ratio_buf[32];
@@ -176,7 +179,7 @@ lbox_slab_info(struct lua_State *L)
* quota_used_ratio > 0.9 work as an indicator
* for reaching Tarantool memory limit.
*/
- size_t arena_size = tuple_arena->used;
+ size_t arena_size = memtx->arena.used;
luaL_pushuint64(L, arena_size);
lua_settable(L, -3);
/**
@@ -200,7 +203,7 @@ lbox_slab_info(struct lua_State *L)
* box.cfg.slab_alloc_arena, but in bytes
*/
lua_pushstring(L, "quota_size");
- luaL_pushuint64(L, quota_total(memtx_quota));
+ luaL_pushuint64(L, quota_total(&memtx->quota));
lua_settable(L, -3);
/*
@@ -208,7 +211,7 @@ lbox_slab_info(struct lua_State *L)
* size of slabs in various slab caches.
*/
lua_pushstring(L, "quota_used");
- luaL_pushuint64(L, quota_used(memtx_quota));
+ luaL_pushuint64(L, quota_used(&memtx->quota));
lua_settable(L, -3);
/**
@@ -217,8 +220,8 @@ lbox_slab_info(struct lua_State *L)
* factor, it's the quota that give you OOM error in the
* end of the day.
*/
- ratio = 100 * ((double) quota_used(memtx_quota) /
- ((double) quota_total(memtx_quota) + 0.0001));
+ ratio = 100 * ((double) quota_used(&memtx->quota) /
+ ((double) quota_total(&memtx->quota) + 0.0001));
snprintf(ratio_buf, sizeof(ratio_buf), "%0.2lf%%", ratio);
lua_pushstring(L, "quota_used_ratio");
@@ -254,7 +257,9 @@ lbox_runtime_info(struct lua_State *L)
static int
lbox_slab_check(MAYBE_UNUSED struct lua_State *L)
{
- slab_cache_check(memtx_alloc.cache);
+ struct memtx_engine *memtx;
+ memtx = (struct memtx_engine *)engine_by_name("memtx");
+ slab_cache_check(memtx->alloc.cache);
return 0;
}
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 70478d75..b1a9b157 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -50,41 +50,6 @@
#include "schema.h"
#include "gc.h"
-/*
- * If you decide to use memtx_arena for anything other than
- * memtx_alloc or memtx_index_extent_pool, make sure this
- * is reflected in box.slab.info(), @sa lua/slab.c.
- */
-
-/** Common quota for memtx tuples and indexes. */
-static struct quota memtx_quota;
-/** Common slab arena for memtx tuples and indexes. */
-static struct slab_arena memtx_arena;
-/** Slab cache for allocating memtx tuples. */
-static struct slab_cache memtx_slab_cache;
-/** Memtx tuple allocator. */
-struct small_alloc memtx_alloc; /* used by box.slab.info() */
-/** Slab cache for allocating memtx index extents. */
-static struct slab_cache memtx_index_slab_cache;
-/** Memtx index extent allocator. */
-struct mempool memtx_index_extent_pool; /* used by box.slab.info() */
-
-/**
- * To ensure proper statement-level rollback in case
- * of out of memory conditions, we maintain a number
- * of slack memory extents reserved before a statement
- * is begun. If there isn't enough slack memory,
- * we don't begin the statement.
- */
-static int memtx_index_num_reserved_extents;
-static void *memtx_index_reserved_extents;
-
-/** Maximal allowed tuple size, box.cfg.memtx_max_tuple_size. */
-static size_t memtx_max_tuple_size = 1 * 1024 * 1024;
-
-/** Incremented with each next snapshot. */
-uint32_t snapshot_version;
-
static void
txn_on_yield_or_stop(struct trigger *trigger, void *event)
{
@@ -108,6 +73,7 @@ struct memtx_tuple {
enum {
OBJSIZE_MIN = 16,
SLAB_SIZE = 16 * 1024 * 1024,
+ MAX_TUPLE_SIZE = 1 * 1024 * 1024,
};
static int
@@ -701,8 +667,8 @@ memtx_engine_begin_checkpoint(struct engine *engine)
}
/* increment snapshot version; set tuple deletion to delayed mode */
- snapshot_version++;
- small_alloc_setopt(&memtx_alloc, SMALL_DELAYED_FREE_MODE, true);
+ memtx->snapshot_version++;
+ small_alloc_setopt(&memtx->alloc, SMALL_DELAYED_FREE_MODE, true);
return 0;
}
@@ -748,7 +714,7 @@ memtx_engine_commit_checkpoint(struct engine *engine, struct vclock *vclock)
/* waitCheckpoint() must have been done. */
assert(!memtx->checkpoint->waiting_for_snap_thread);
- small_alloc_setopt(&memtx_alloc, SMALL_DELAYED_FREE_MODE, false);
+ small_alloc_setopt(&memtx->alloc, SMALL_DELAYED_FREE_MODE, false);
if (!memtx->checkpoint->touch) {
int64_t lsn = vclock_sum(memtx->checkpoint->vclock);
@@ -791,7 +757,7 @@ memtx_engine_abort_checkpoint(struct engine *engine)
memtx->checkpoint->waiting_for_snap_thread = false;
}
- small_alloc_setopt(&memtx_alloc, SMALL_DELAYED_FREE_MODE, false);
+ small_alloc_setopt(&memtx->alloc, SMALL_DELAYED_FREE_MODE, false);
/** Remove garbage .inprogress file. */
char *filename =
@@ -915,11 +881,11 @@ small_stats_noop_cb(const struct mempool_stats *stats, void *cb_ctx)
static void
memtx_engine_memory_stat(struct engine *engine, struct engine_memory_stat *stat)
{
- (void)engine;
+ struct memtx_engine *memtx = (struct memtx_engine *)engine;
struct small_stats data_stats;
struct mempool_stats index_stats;
- mempool_stats(&memtx_index_extent_pool, &index_stats);
- small_stats(&memtx_alloc, &data_stats, small_stats_noop_cb, NULL);
+ mempool_stats(&memtx->index_extent_pool, &index_stats);
+ small_stats(&memtx->alloc, &data_stats, small_stats_noop_cb, NULL);
stat->data += data_stats.used;
stat->index += index_stats.totals.used;
}
@@ -1007,21 +973,22 @@ memtx_engine_new(const char *snap_dirname, bool force_recovery,
objsize_min = OBJSIZE_MIN;
/* Initialize tuple allocator. */
- quota_init(&memtx_quota, tuple_arena_max_size);
- tuple_arena_create(&memtx_arena, &memtx_quota, tuple_arena_max_size,
+ quota_init(&memtx->quota, tuple_arena_max_size);
+ tuple_arena_create(&memtx->arena, &memtx->quota, tuple_arena_max_size,
SLAB_SIZE, "memtx");
- slab_cache_create(&memtx_slab_cache, &memtx_arena);
- small_alloc_create(&memtx_alloc, &memtx_slab_cache,
+ slab_cache_create(&memtx->slab_cache, &memtx->arena);
+ small_alloc_create(&memtx->alloc, &memtx->slab_cache,
objsize_min, alloc_factor);
/* Initialize index extent allocator. */
- slab_cache_create(&memtx_index_slab_cache, &memtx_arena);
- mempool_create(&memtx_index_extent_pool, &memtx_index_slab_cache,
+ slab_cache_create(&memtx->index_slab_cache, &memtx->arena);
+ mempool_create(&memtx->index_extent_pool, &memtx->index_slab_cache,
MEMTX_EXTENT_SIZE);
- memtx_index_num_reserved_extents = 0;
- memtx_index_reserved_extents = NULL;
+ memtx->num_reserved_extents = 0;
+ memtx->reserved_extents = NULL;
memtx->state = MEMTX_INITIALIZED;
+ memtx->max_tuple_size = MAX_TUPLE_SIZE;
memtx->force_recovery = force_recovery;
memtx->base.vtab = &memtx_engine_vtab;
@@ -1052,15 +1019,13 @@ memtx_engine_set_snap_io_rate_limit(struct memtx_engine *memtx, double limit)
void
memtx_engine_set_max_tuple_size(struct memtx_engine *memtx, size_t max_size)
{
- (void)memtx;
- memtx_max_tuple_size = max_size;
+ memtx->max_tuple_size = max_size;
}
struct tuple *
memtx_tuple_new(struct tuple_format *format, const char *data, const char *end)
{
struct memtx_engine *memtx = (struct memtx_engine *)format->engine_data;
- (void)memtx;
assert(mp_typeof(*data) == MP_ARRAY);
size_t tuple_len = end - data;
size_t meta_size = tuple_format_meta_size(format);
@@ -1070,20 +1035,20 @@ memtx_tuple_new(struct tuple_format *format, const char *data, const char *end)
diag_set(OutOfMemory, total, "slab allocator", "memtx_tuple");
return NULL;
});
- if (unlikely(total > memtx_max_tuple_size)) {
+ if (unlikely(total > memtx->max_tuple_size)) {
diag_set(ClientError, ER_MEMTX_MAX_TUPLE_SIZE, total);
error_log(diag_last_error(diag_get()));
return NULL;
}
- struct memtx_tuple *memtx_tuple = smalloc(&memtx_alloc, total);
+ struct memtx_tuple *memtx_tuple = smalloc(&memtx->alloc, total);
if (memtx_tuple == NULL) {
diag_set(OutOfMemory, total, "slab allocator", "memtx_tuple");
return NULL;
}
struct tuple *tuple = &memtx_tuple->base;
tuple->refs = 0;
- memtx_tuple->version = snapshot_version;
+ memtx_tuple->version = memtx->snapshot_version;
assert(tuple_len <= UINT32_MAX); /* bsize is UINT32_MAX */
tuple->bsize = tuple_len;
tuple->format_id = tuple_format_id(format);
@@ -1109,7 +1074,6 @@ void
memtx_tuple_delete(struct tuple_format *format, struct tuple *tuple)
{
struct memtx_engine *memtx = (struct memtx_engine *)format->engine_data;
- (void)memtx;
say_debug("%s(%p)", __func__, tuple);
assert(tuple->refs == 0);
#ifndef NDEBUG
@@ -1126,11 +1090,11 @@ memtx_tuple_delete(struct tuple_format *format, struct tuple *tuple)
tuple_format_unref(format);
struct memtx_tuple *memtx_tuple =
container_of(tuple, struct memtx_tuple, base);
- if (memtx_alloc.free_mode != SMALL_DELAYED_FREE ||
- memtx_tuple->version == snapshot_version)
- smfree(&memtx_alloc, memtx_tuple, total);
+ if (memtx->alloc.free_mode != SMALL_DELAYED_FREE ||
+ memtx_tuple->version == memtx->snapshot_version)
+ smfree(&memtx->alloc, memtx_tuple, total);
else
- smfree_delayed(&memtx_alloc, memtx_tuple, total);
+ smfree_delayed(&memtx->alloc, memtx_tuple, total);
}
struct tuple_format_vtab memtx_tuple_format_vtab = {
@@ -1145,13 +1109,11 @@ void *
memtx_index_extent_alloc(void *ctx)
{
struct memtx_engine *memtx = (struct memtx_engine *)ctx;
- (void)memtx;
- if (memtx_index_reserved_extents) {
- assert(memtx_index_num_reserved_extents > 0);
- memtx_index_num_reserved_extents--;
- void *result = memtx_index_reserved_extents;
- memtx_index_reserved_extents = *(void **)
- memtx_index_reserved_extents;
+ if (memtx->reserved_extents) {
+ assert(memtx->num_reserved_extents > 0);
+ memtx->num_reserved_extents--;
+ void *result = memtx->reserved_extents;
+ memtx->reserved_extents = *(void **)memtx->reserved_extents;
return result;
}
ERROR_INJECT(ERRINJ_INDEX_ALLOC, {
@@ -1160,7 +1122,7 @@ memtx_index_extent_alloc(void *ctx)
"mempool", "new slab");
return NULL;
});
- void *ret = mempool_alloc(&memtx_index_extent_pool);
+ void *ret = mempool_alloc(&memtx->index_extent_pool);
if (ret == NULL)
diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
"mempool", "new slab");
@@ -1174,8 +1136,7 @@ void
memtx_index_extent_free(void *ctx, void *extent)
{
struct memtx_engine *memtx = (struct memtx_engine *)ctx;
- (void)memtx;
- return mempool_free(&memtx_index_extent_pool, extent);
+ return mempool_free(&memtx->index_extent_pool, extent);
}
/**
@@ -1185,23 +1146,22 @@ memtx_index_extent_free(void *ctx, void *extent)
int
memtx_index_extent_reserve(struct memtx_engine *memtx, int num)
{
- (void)memtx;
ERROR_INJECT(ERRINJ_INDEX_ALLOC, {
/* same error as in mempool_alloc */
diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
"mempool", "new slab");
return -1;
});
- while (memtx_index_num_reserved_extents < num) {
- void *ext = mempool_alloc(&memtx_index_extent_pool);
+ while (memtx->num_reserved_extents < num) {
+ void *ext = mempool_alloc(&memtx->index_extent_pool);
if (ext == NULL) {
diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
"mempool", "new slab");
return -1;
}
- *(void **)ext = memtx_index_reserved_extents;
- memtx_index_reserved_extents = ext;
- memtx_index_num_reserved_extents++;
+ *(void **)ext = memtx->reserved_extents;
+ memtx->reserved_extents = ext;
+ memtx->num_reserved_extents++;
}
return 0;
}
diff --git a/src/box/memtx_engine.h b/src/box/memtx_engine.h
index 389314ba..0bcd24ac 100644
--- a/src/box/memtx_engine.h
+++ b/src/box/memtx_engine.h
@@ -33,6 +33,8 @@
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
+#include <small/quota.h>
+#include <small/small.h>
#include <small/mempool.h>
#include "engine.h"
@@ -97,6 +99,36 @@ struct memtx_engine {
uint64_t snap_io_rate_limit;
/** Skip invalid snapshot records if this flag is set. */
bool force_recovery;
+ /** Common quota for tuples and indexes. */
+ struct quota quota;
+ /**
+ * Common slab arena for tuples and indexes.
+ * If you decide to use it for anything other than
+ * tuple_alloc or index_extent_pool, make sure this
+ * is reflected in box.slab.info(), @sa lua/slab.c.
+ */
+ struct slab_arena arena;
+ /** Slab cache for allocating tuples. */
+ struct slab_cache slab_cache;
+ /** Tuple allocator. */
+ struct small_alloc alloc;
+ /** Slab cache for allocating index extents. */
+ struct slab_cache index_slab_cache;
+ /** Index extent allocator. */
+ struct mempool index_extent_pool;
+ /**
+ * To ensure proper statement-level rollback in case
+ * of out of memory conditions, we maintain a number
+ * of slack memory extents reserved before a statement
+ * is begun. If there isn't enough slack memory,
+ * we don't begin the statement.
+ */
+ int num_reserved_extents;
+ void *reserved_extents;
+ /** Maximal allowed tuple size, box.cfg.memtx_max_tuple_size. */
+ size_t max_tuple_size;
+ /** Incremented with each next snapshot. */
+ uint32_t snapshot_version;
/** Memory pool for tree index iterator. */
struct mempool tree_iterator_pool;
/** Memory pool for rtree index iterator. */
--
2.11.0
* [PATCH 5/8] memtx: destroy slab arena on engine shutdown
2018-05-22 11:46 [PATCH 0/8] Follow-up on async memtx index cleanup Vladimir Davydov
` (3 preceding siblings ...)
2018-05-22 11:46 ` [PATCH 4/8] memtx: move all global variables to engine Vladimir Davydov
@ 2018-05-22 11:46 ` Vladimir Davydov
2018-05-22 13:50 ` Konstantin Osipov
2018-05-22 11:46 ` [PATCH 6/8] memtx: embed light hash into memtx_hash_index Vladimir Davydov
` (2 subsequent siblings)
7 siblings, 1 reply; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 11:46 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
Since the slab arena is created when the memtx engine is initialized,
it should be destroyed on engine shutdown.
---
src/box/memtx_engine.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index b1a9b157..51a47eff 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -138,6 +138,11 @@ memtx_engine_shutdown(struct engine *engine)
mempool_destroy(&memtx->hash_iterator_pool);
if (mempool_is_initialized(&memtx->bitset_iterator_pool))
mempool_destroy(&memtx->bitset_iterator_pool);
+ mempool_destroy(&memtx->index_extent_pool);
+ slab_cache_destroy(&memtx->index_slab_cache);
+ small_alloc_destroy(&memtx->alloc);
+ slab_cache_destroy(&memtx->slab_cache);
+ tuple_arena_destroy(&memtx->arena);
xdir_destroy(&memtx->snap_dir);
free(memtx);
}
--
2.11.0
* [PATCH 6/8] memtx: embed light hash into memtx_hash_index
2018-05-22 11:46 [PATCH 0/8] Follow-up on async memtx index cleanup Vladimir Davydov
` (4 preceding siblings ...)
2018-05-22 11:46 ` [PATCH 5/8] memtx: destroy slab arena on engine shutdown Vladimir Davydov
@ 2018-05-22 11:46 ` Vladimir Davydov
2018-05-22 13:51 ` Konstantin Osipov
2018-05-22 11:46 ` [PATCH 7/8] memtx: rework background garbage collection procedure Vladimir Davydov
2018-05-22 11:46 ` [PATCH 8/8] memtx: run garbage collection on demand Vladimir Davydov
7 siblings, 1 reply; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 11:46 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
There is no point in this level of indirection. We already embed the
BPS tree implementation into memtx_tree_index, so let's do the same
for the hash index. A good side effect is that we can now define
iterators in headers for both memtx_tree_index and memtx_hash_index,
which is required to improve the memtx garbage collection mechanism.
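Schematically, the change boils down to the following (field lists are
trimmed; the real struct also has other members such as gc_task):

/* Before: the light hash table is a separately malloc'ed object. */
struct memtx_hash_index {
	struct index base;
	struct light_index_core *hash_table;
};

/* After: the hash table is embedded, so there is no extra allocation
 * and no pointer chase, and its type is visible from the header. */
struct memtx_hash_index {
	struct index base;
	struct light_index_core hash_table;
};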
---
src/box/memtx_hash.c | 66 ++++++++++++----------------------------------------
src/box/memtx_hash.h | 34 ++++++++++++++++++++++++---
2 files changed, 46 insertions(+), 54 deletions(-)
diff --git a/src/box/memtx_hash.c b/src/box/memtx_hash.c
index d672cafe..e1fdad54 100644
--- a/src/box/memtx_hash.c
+++ b/src/box/memtx_hash.c
@@ -38,34 +38,8 @@
#include "schema.h" /* space_cache_find() */
#include "errinj.h"
-#include "third_party/PMurHash.h"
#include <small/mempool.h>
-static inline bool
-equal(struct tuple *tuple_a, struct tuple *tuple_b,
- const struct key_def *key_def)
-{
- return tuple_compare(tuple_a, tuple_b, key_def) == 0;
-}
-
-static inline bool
-equal_key(struct tuple *tuple, const char *key,
- const struct key_def *key_def)
-{
- return tuple_compare_with_key(tuple, key, key_def->part_count,
- key_def) == 0;
-}
-
-#define LIGHT_NAME _index
-#define LIGHT_DATA_TYPE struct tuple *
-#define LIGHT_KEY_TYPE const char *
-#define LIGHT_CMP_ARG_TYPE struct key_def *
-#define LIGHT_EQUAL(a, b, c) equal(a, b, c)
-#define LIGHT_EQUAL_KEY(a, b, c) equal_key(a, b, c)
-#define HASH_INDEX_EXTENT_SIZE MEMTX_EXTENT_SIZE
-typedef uint32_t hash_t;
-#include "salad/light.h"
-
/* {{{ MemtxHash Iterators ****************************************/
struct hash_iterator {
@@ -131,8 +105,7 @@ hash_iterator_eq(struct iterator *it, struct tuple **ret)
static void
memtx_hash_index_free(struct memtx_hash_index *index)
{
- light_index_destroy(index->hash_table);
- free(index->hash_table);
+ light_index_destroy(&index->hash_table);
free(index);
}
@@ -151,7 +124,7 @@ memtx_hash_index_destroy_f(struct memtx_gc_task *task)
struct memtx_hash_index *index = container_of(task,
struct memtx_hash_index, gc_task);
- struct light_index_core *hash = index->hash_table;
+ struct light_index_core *hash = &index->hash_table;
struct light_index_iterator itr;
light_index_iterator_begin(hash, &itr);
@@ -191,29 +164,29 @@ static void
memtx_hash_index_update_def(struct index *base)
{
struct memtx_hash_index *index = (struct memtx_hash_index *)base;
- index->hash_table->arg = index->base.def->key_def;
+ index->hash_table.arg = index->base.def->key_def;
}
static ssize_t
memtx_hash_index_size(struct index *base)
{
struct memtx_hash_index *index = (struct memtx_hash_index *)base;
- return index->hash_table->count;
+ return index->hash_table.count;
}
static ssize_t
memtx_hash_index_bsize(struct index *base)
{
struct memtx_hash_index *index = (struct memtx_hash_index *)base;
- return matras_extent_count(&index->hash_table->mtable) *
- HASH_INDEX_EXTENT_SIZE;
+ return matras_extent_count(&index->hash_table.mtable) *
+ MEMTX_EXTENT_SIZE;
}
static int
memtx_hash_index_random(struct index *base, uint32_t rnd, struct tuple **result)
{
struct memtx_hash_index *index = (struct memtx_hash_index *)base;
- struct light_index_core *hash_table = index->hash_table;
+ struct light_index_core *hash_table = &index->hash_table;
*result = NULL;
if (hash_table->count == 0)
@@ -248,9 +221,9 @@ memtx_hash_index_get(struct index *base, const char *key,
*result = NULL;
uint32_t h = key_hash(key, base->def->key_def);
- uint32_t k = light_index_find_key(index->hash_table, h, key);
+ uint32_t k = light_index_find_key(&index->hash_table, h, key);
if (k != light_index_end)
- *result = light_index_get(index->hash_table, k);
+ *result = light_index_get(&index->hash_table, k);
return 0;
}
@@ -260,12 +233,13 @@ memtx_hash_index_replace(struct index *base, struct tuple *old_tuple,
struct tuple **result)
{
struct memtx_hash_index *index = (struct memtx_hash_index *)base;
- struct light_index_core *hash_table = index->hash_table;
+ struct light_index_core *hash_table = &index->hash_table;
if (new_tuple) {
uint32_t h = tuple_hash(new_tuple, base->def->key_def);
struct tuple *dup_tuple = NULL;
- hash_t pos = light_index_replace(hash_table, h, new_tuple, &dup_tuple);
+ uint32_t pos = light_index_replace(hash_table, h, new_tuple,
+ &dup_tuple);
if (pos == light_index_end)
pos = light_index_insert(hash_table, h, new_tuple);
@@ -331,7 +305,7 @@ memtx_hash_index_create_iterator(struct index *base, enum iterator_type type,
iterator_create(&it->base, base);
it->pool = &memtx->hash_iterator_pool;
it->base.free = hash_iterator_free;
- it->hash_table = index->hash_table;
+ it->hash_table = &index->hash_table;
light_index_iterator_begin(it->hash_table, &it->iterator);
switch (type) {
@@ -422,7 +396,7 @@ memtx_hash_index_create_snapshot_iterator(struct index *base)
it->base.next = hash_snapshot_iterator_next;
it->base.free = hash_snapshot_iterator_free;
- it->hash_table = index->hash_table;
+ it->hash_table = &index->hash_table;
light_index_iterator_begin(it->hash_table, &it->iterator);
light_index_iterator_freeze(it->hash_table, &it->iterator);
return (struct snapshot_iterator *) it;
@@ -473,25 +447,15 @@ memtx_hash_index_new(struct memtx_engine *memtx, struct index_def *def)
"malloc", "struct memtx_hash_index");
return NULL;
}
- struct light_index_core *hash_table =
- (struct light_index_core *)malloc(sizeof(*hash_table));
- if (hash_table == NULL) {
- free(index);
- diag_set(OutOfMemory, sizeof(*hash_table),
- "malloc", "struct light_index_core");
- return NULL;
- }
if (index_create(&index->base, (struct engine *)memtx,
&memtx_hash_index_vtab, def) != 0) {
- free(hash_table);
free(index);
return NULL;
}
- light_index_create(hash_table, HASH_INDEX_EXTENT_SIZE,
+ light_index_create(&index->hash_table, MEMTX_EXTENT_SIZE,
memtx_index_extent_alloc, memtx_index_extent_free,
memtx, index->base.def->key_def);
- index->hash_table = hash_table;
if (def->iid == 0)
index->gc_task.func = memtx_hash_index_destroy_f;
diff --git a/src/box/memtx_hash.h b/src/box/memtx_hash.h
index 99001079..7a08dc87 100644
--- a/src/box/memtx_hash.h
+++ b/src/box/memtx_hash.h
@@ -37,12 +37,40 @@
extern "C" {
#endif /* defined(__cplusplus) */
-struct memtx_engine;
-struct light_index_core;
+static inline bool
+memtx_hash_equal(struct tuple *tuple_a, struct tuple *tuple_b,
+ const struct key_def *key_def)
+{
+ return tuple_compare(tuple_a, tuple_b, key_def) == 0;
+}
+
+static inline bool
+memtx_hash_equal_key(struct tuple *tuple, const char *key,
+ const struct key_def *key_def)
+{
+ return tuple_compare_with_key(tuple, key, key_def->part_count,
+ key_def) == 0;
+}
+
+#define LIGHT_NAME _index
+#define LIGHT_DATA_TYPE struct tuple *
+#define LIGHT_KEY_TYPE const char *
+#define LIGHT_CMP_ARG_TYPE struct key_def *
+#define LIGHT_EQUAL(a, b, c) memtx_hash_equal(a, b, c)
+#define LIGHT_EQUAL_KEY(a, b, c) memtx_hash_equal_key(a, b, c)
+
+#include "salad/light.h"
+
+#undef LIGHT_NAME
+#undef LIGHT_DATA_TYPE
+#undef LIGHT_KEY_TYPE
+#undef LIGHT_CMP_ARG_TYPE
+#undef LIGHT_EQUAL
+#undef LIGHT_EQUAL_KEY
struct memtx_hash_index {
struct index base;
- struct light_index_core *hash_table;
+ struct light_index_core hash_table;
struct memtx_gc_task gc_task;
};
--
2.11.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 7/8] memtx: rework background garbage collection procedure
2018-05-22 11:46 [PATCH 0/8] Follow-up on async memtx index cleanup Vladimir Davydov
` (5 preceding siblings ...)
2018-05-22 11:46 ` [PATCH 6/8] memtx: embed light hash into memtx_hash_index Vladimir Davydov
@ 2018-05-22 11:46 ` Vladimir Davydov
2018-05-22 13:56 ` Konstantin Osipov
2018-05-22 11:46 ` [PATCH 8/8] memtx: run garbage collection on demand Vladimir Davydov
7 siblings, 1 reply; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 11:46 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
Currently, the engine has no control over yields issued during
asynchronous index destruction. As a result, it can't force gc when
there isn't enough memory. To fix that, let's make the gc callback
stateful: it is now supposed to free a batch of objects and return true
if there are still objects left to free, or false otherwise. Yields are
now done by the memtx engine itself after each gc callback invocation.
---
src/box/index.cc | 7 ++++++-
src/box/memtx_engine.c | 33 ++++++++++++++++++++++++++++-----
src/box/memtx_engine.h | 9 ++++++++-
src/box/memtx_hash.c | 23 +++++++++++------------
src/box/memtx_hash.h | 1 +
src/box/memtx_tree.c | 26 +++++++++++++-------------
src/box/memtx_tree.h | 1 +
7 files changed, 68 insertions(+), 32 deletions(-)
diff --git a/src/box/index.cc b/src/box/index.cc
index 978935b3..b67f8bc6 100644
--- a/src/box/index.cc
+++ b/src/box/index.cc
@@ -496,8 +496,13 @@ index_create(struct index *index, struct engine *engine,
void
index_delete(struct index *index)
{
- index_def_delete(index->def);
+ /*
+ * Free index_def after destroying the index as
+ * engine might still need it.
+ */
+ struct index_def *def = index->def;
index->vtab->destroy(index);
+ index_def_delete(def);
}
int
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 51a47eff..3b21bcaa 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -933,19 +933,42 @@ static const struct engine_vtab memtx_engine_vtab = {
/* .check_space_def = */ memtx_engine_check_space_def,
};
+/**
+ * Run one iteration of garbage collection. Return true if
+ * there still may be objects to collect, false otherwise.
+ */
+static bool
+memtx_engine_run_gc(struct memtx_engine *memtx)
+{
+ if (stailq_empty(&memtx->gc_queue))
+ return false;
+
+ struct memtx_gc_task *task = stailq_shift_entry(&memtx->gc_queue,
+ struct memtx_gc_task, link);
+ if (task->func(task)) {
+ /*
+ * The task still has objects to collect,
+ * put it back.
+ */
+ stailq_add_entry(&memtx->gc_queue, task, link);
+ }
+ return true;
+}
+
static int
memtx_engine_gc_f(va_list va)
{
struct memtx_engine *memtx = va_arg(va, struct memtx_engine *);
while (!fiber_is_cancelled()) {
- if (stailq_empty(&memtx->gc_queue)) {
+ if (!memtx_engine_run_gc(memtx)) {
fiber_yield_timeout(TIMEOUT_INFINITY);
continue;
}
- struct memtx_gc_task *task;
- task = stailq_shift_entry(&memtx->gc_queue,
- struct memtx_gc_task, link);
- task->func(task);
+ /*
+ * Yield after each iteration so as not to block
+ * tx thread for too long.
+ */
+ fiber_sleep(0);
}
return 0;
}
diff --git a/src/box/memtx_engine.h b/src/box/memtx_engine.h
index 0bcd24ac..934bc4e0 100644
--- a/src/box/memtx_engine.h
+++ b/src/box/memtx_engine.h
@@ -150,7 +150,14 @@ struct memtx_engine {
};
struct memtx_gc_task;
-typedef void (*memtx_gc_func)(struct memtx_gc_task *);
+
+/**
+ * Garbage collection callback. It is supposed to free some memory
+ * and return true if it still has more or false if it has freed
+ * everything it had. In the latter case, it must also free the
+ * task object.
+ */
+typedef bool (*memtx_gc_func)(struct memtx_gc_task *);
/** Garbage collection task. */
struct memtx_gc_task {
diff --git a/src/box/memtx_hash.c b/src/box/memtx_hash.c
index e1fdad54..67979b76 100644
--- a/src/box/memtx_hash.c
+++ b/src/box/memtx_hash.c
@@ -109,8 +109,8 @@ memtx_hash_index_free(struct memtx_hash_index *index)
free(index);
}
-static void
-memtx_hash_index_destroy_f(struct memtx_gc_task *task)
+static bool
+memtx_hash_index_gc_f(struct memtx_gc_task *task)
{
/*
* Yield every 1K tuples to keep latency < 0.1 ms.
@@ -125,18 +125,17 @@ memtx_hash_index_destroy_f(struct memtx_gc_task *task)
struct memtx_hash_index *index = container_of(task,
struct memtx_hash_index, gc_task);
struct light_index_core *hash = &index->hash_table;
-
- struct light_index_iterator itr;
- light_index_iterator_begin(hash, &itr);
+ struct light_index_iterator *itr = &index->gc_iterator;
struct tuple **res;
unsigned int loops = 0;
- while ((res = light_index_iterator_get_and_next(hash, &itr)) != NULL) {
+ while ((res = light_index_iterator_get_and_next(hash, itr)) != NULL) {
tuple_unref(*res);
- if (++loops % YIELD_LOOPS == 0)
- fiber_sleep(0);
+ if (++loops >= YIELD_LOOPS)
+ return true;
}
memtx_hash_index_free(index);
+ return false;
}
static void
@@ -144,12 +143,15 @@ memtx_hash_index_destroy(struct index *base)
{
struct memtx_hash_index *index = (struct memtx_hash_index *)base;
struct memtx_engine *memtx = (struct memtx_engine *)base->engine;
- if (index->gc_task.func != NULL) {
+ if (base->def->iid == 0) {
/*
* Primary index. We need to free all tuples stored
* in the index, which may take a while. Schedule a
* background task in order not to block tx thread.
*/
+ index->gc_task.func = memtx_hash_index_gc_f;
+ light_index_iterator_begin(&index->hash_table,
+ &index->gc_iterator);
memtx_engine_schedule_gc(memtx, &index->gc_task);
} else {
/*
@@ -456,9 +458,6 @@ memtx_hash_index_new(struct memtx_engine *memtx, struct index_def *def)
light_index_create(&index->hash_table, MEMTX_EXTENT_SIZE,
memtx_index_extent_alloc, memtx_index_extent_free,
memtx, index->base.def->key_def);
-
- if (def->iid == 0)
- index->gc_task.func = memtx_hash_index_destroy_f;
return index;
}
diff --git a/src/box/memtx_hash.h b/src/box/memtx_hash.h
index 7a08dc87..a3b48052 100644
--- a/src/box/memtx_hash.h
+++ b/src/box/memtx_hash.h
@@ -72,6 +72,7 @@ struct memtx_hash_index {
struct index base;
struct light_index_core hash_table;
struct memtx_gc_task gc_task;
+ struct light_index_iterator gc_iterator;
};
struct memtx_hash_index *
diff --git a/src/box/memtx_tree.c b/src/box/memtx_tree.c
index 8814b13b..5a5d7fe1 100644
--- a/src/box/memtx_tree.c
+++ b/src/box/memtx_tree.c
@@ -308,8 +308,8 @@ memtx_tree_index_free(struct memtx_tree_index *index)
free(index);
}
-static void
-memtx_tree_index_destroy_f(struct memtx_gc_task *task)
+static bool
+memtx_tree_index_gc_f(struct memtx_gc_task *task)
{
/*
* Yield every 1K tuples to keep latency < 0.1 ms.
@@ -320,21 +320,22 @@ memtx_tree_index_destroy_f(struct memtx_gc_task *task)
#else
enum { YIELD_LOOPS = 10 };
#endif
+
struct memtx_tree_index *index = container_of(task,
struct memtx_tree_index, gc_task);
struct memtx_tree *tree = &index->tree;
+ struct memtx_tree_iterator *itr = &index->gc_iterator;
unsigned int loops = 0;
- struct memtx_tree_iterator itr;
- for (itr = memtx_tree_iterator_first(tree);
- !memtx_tree_iterator_is_invalid(&itr);
- memtx_tree_iterator_next(tree, &itr)) {
- struct tuple *tuple = *memtx_tree_iterator_get_elem(tree, &itr);
+ while (!memtx_tree_iterator_is_invalid(itr)) {
+ struct tuple *tuple = *memtx_tree_iterator_get_elem(tree, itr);
+ memtx_tree_iterator_next(tree, itr);
tuple_unref(tuple);
- if (++loops % YIELD_LOOPS == 0)
- fiber_sleep(0);
+ if (++loops >= YIELD_LOOPS)
+ return true;
}
memtx_tree_index_free(index);
+ return false;
}
static void
@@ -342,12 +343,14 @@ memtx_tree_index_destroy(struct index *base)
{
struct memtx_tree_index *index = (struct memtx_tree_index *)base;
struct memtx_engine *memtx = (struct memtx_engine *)base->engine;
- if (index->gc_task.func != NULL) {
+ if (base->def->iid == 0) {
/*
* Primary index. We need to free all tuples stored
* in the index, which may take a while. Schedule a
* background task in order not to block tx thread.
*/
+ index->gc_task.func = memtx_tree_index_gc_f;
+ index->gc_iterator = memtx_tree_iterator_first(&index->tree);
memtx_engine_schedule_gc(memtx, &index->gc_task);
} else {
/*
@@ -690,8 +693,5 @@ memtx_tree_index_new(struct memtx_engine *memtx, struct index_def *def)
struct key_def *cmp_def = memtx_tree_index_cmp_def(index);
memtx_tree_create(&index->tree, cmp_def, memtx_index_extent_alloc,
memtx_index_extent_free, memtx);
-
- if (def->iid == 0)
- index->gc_task.func = memtx_tree_index_destroy_f;
return index;
}
diff --git a/src/box/memtx_tree.h b/src/box/memtx_tree.h
index 2e7acde2..bbc8b241 100644
--- a/src/box/memtx_tree.h
+++ b/src/box/memtx_tree.h
@@ -98,6 +98,7 @@ struct memtx_tree_index {
struct tuple **build_array;
size_t build_array_size, build_array_alloc_size;
struct memtx_gc_task gc_task;
+ struct memtx_tree_iterator gc_iterator;
};
struct memtx_tree_index *
--
2.11.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 8/8] memtx: run garbage collection on demand
2018-05-22 11:46 [PATCH 0/8] Follow-up on async memtx index cleanup Vladimir Davydov
` (6 preceding siblings ...)
2018-05-22 11:46 ` [PATCH 7/8] memtx: rework background garbage collection procedure Vladimir Davydov
@ 2018-05-22 11:46 ` Vladimir Davydov
2018-05-22 14:00 ` Konstantin Osipov
7 siblings, 1 reply; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 11:46 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
When a memtx space is dropped or truncated, we delegate freeing the
tuples stored in it to a background fiber so as not to block the caller
(and the tx thread) for too long. It turns out this doesn't work well for
ephemeral spaces, which share the destruction code with normal spaces:
the user may issue a lot of complex SQL SELECT statements that create
many ephemeral spaces without ever yielding, so the garbage collection
fiber never gets a chance to clean up. There's a test that emulates this,
2.0:test/sql-tap/gh-3083-ephemeral-unref-tuples.test.lua.
For this test to pass, let's run the garbage collection procedure on
demand, i.e. whenever any of the memtx allocation functions fails to
allocate memory.
Follow-up #3408
---
src/box/memtx_engine.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 3b21bcaa..afb453f9 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -1069,7 +1069,11 @@ memtx_tuple_new(struct tuple_format *format, const char *data, const char *end)
return NULL;
}
- struct memtx_tuple *memtx_tuple = smalloc(&memtx->alloc, total);
+ struct memtx_tuple *memtx_tuple;
+ while ((memtx_tuple = smalloc(&memtx->alloc, total)) == NULL) {
+ if (!memtx_engine_run_gc(memtx))
+ break;
+ }
if (memtx_tuple == NULL) {
diag_set(OutOfMemory, total, "slab allocator", "memtx_tuple");
return NULL;
@@ -1150,7 +1154,11 @@ memtx_index_extent_alloc(void *ctx)
"mempool", "new slab");
return NULL;
});
- void *ret = mempool_alloc(&memtx->index_extent_pool);
+ void *ret;
+ while ((ret = mempool_alloc(&memtx->index_extent_pool)) == NULL) {
+ if (!memtx_engine_run_gc(memtx))
+ break;
+ }
if (ret == NULL)
diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
"mempool", "new slab");
@@ -1183,6 +1191,8 @@ memtx_index_extent_reserve(struct memtx_engine *memtx, int num)
while (memtx->num_reserved_extents < num) {
void *ext = mempool_alloc(&memtx->index_extent_pool);
if (ext == NULL) {
+ if (memtx_engine_run_gc(memtx))
+ continue;
diag_set(OutOfMemory, MEMTX_EXTENT_SIZE,
"mempool", "new slab");
return -1;
--
2.11.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/8] memtx: init index extent allocator in engine constructor
2018-05-22 11:46 ` [PATCH 1/8] memtx: init index extent allocator in engine constructor Vladimir Davydov
@ 2018-05-22 13:43 ` Konstantin Osipov
0 siblings, 0 replies; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-22 13:43 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> Postponing it until a memtx index is created for the first time saves us
> no memory or cpu, it only makes the code more difficult to follow.
> ---
> src/box/memtx_bitset.c | 2 --
> src/box/memtx_engine.c | 34 +++++++---------------------------
> src/box/memtx_engine.h | 9 ---------
> src/box/memtx_hash.c | 2 --
> src/box/memtx_rtree.c | 2 --
> src/box/memtx_tree.c | 2 --
OK to push.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/8] memtx: fold memtx_tuple.cc into memtx_engine.c
2018-05-22 11:46 ` [PATCH 2/8] memtx: fold memtx_tuple.cc into memtx_engine.c Vladimir Davydov
@ 2018-05-22 13:45 ` Konstantin Osipov
0 siblings, 0 replies; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-22 13:45 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> The two files are too closely related: memtx_arena is defined and
> used in memtx_engine.c, but initialized in memtx_tuple.cc. Since
> memtx_tuple.cc is small, let's fold it into memtx_engine.c.
OK to push.
I assume you simply move the code around.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/8] memtx: pass engine to memory allocation functions
2018-05-22 11:46 ` [PATCH 3/8] memtx: pass engine to memory allocation functions Vladimir Davydov
@ 2018-05-22 13:47 ` Konstantin Osipov
2018-05-22 14:39 ` Vladimir Davydov
0 siblings, 1 reply; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-22 13:47 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> We need this so that we can force garbage collection when we are short
> on memory. There are two such functions: one is used for allocating
> index extents, another for allocating tuples. Index allocating function
> has an opaque context so we simply reuse it for passing memtx engine to
> it. To pass memtx engine to tuple allocating function, we add an opaque
> engine specific pointer to tuple_format (engine_data) and set it to
> memtx_engine for memtx spaces.
Please rename engine_data to engine. I would not use an opaque
pointer either; reference the engine explicitly.
Otherwise OK to push.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 4/8] memtx: move all global variables to engine
2018-05-22 11:46 ` [PATCH 4/8] memtx: move all global variables to engine Vladimir Davydov
@ 2018-05-22 13:48 ` Konstantin Osipov
0 siblings, 0 replies; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-22 13:48 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> src/box/lua/slab.c | 35 ++++++++-------
> src/box/memtx_engine.c | 114 ++++++++++++++++---------------------------------
> src/box/memtx_engine.h | 32 ++++++++++++++
> 3 files changed, 89 insertions(+), 92 deletions(-)
>
OK to push.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 5/8] memtx: destroy slab arena on engine shutdown
2018-05-22 11:46 ` [PATCH 5/8] memtx: destroy slab arena on engine shutdown Vladimir Davydov
@ 2018-05-22 13:50 ` Konstantin Osipov
2018-05-22 16:26 ` Vladimir Davydov
0 siblings, 1 reply; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-22 13:50 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> Since it is created when the memtx engine is initialized, we should
> destroy it on engine shutdown.
> ---
> src/box/memtx_engine.c | 5 +++++
> 1 file changed, 5 insertions(+)
OK to push.
This needs a unit test. Please file a ticket for a unit test if
you can't make one at once.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 6/8] memtx: embed light hash into memtx_hash_index
2018-05-22 11:46 ` [PATCH 6/8] memtx: embed light hash into memtx_hash_index Vladimir Davydov
@ 2018-05-22 13:51 ` Konstantin Osipov
0 siblings, 0 replies; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-22 13:51 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> No point in this level of indirection. We embed bps tree implementation
> into memtx_tree_index, why don't we do the same in case of hash index.
> A good side effect is that we can now define iterators in headers for
> both memtx_tree_index and memtx_hash_index, which is required to improve
> memtx garbage collection mechanism.
OK to push.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 7/8] memtx: rework background garbage collection procedure
2018-05-22 11:46 ` [PATCH 7/8] memtx: rework background garbage collection procedure Vladimir Davydov
@ 2018-05-22 13:56 ` Konstantin Osipov
2018-05-22 14:49 ` Vladimir Davydov
0 siblings, 1 reply; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-22 13:56 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> +/**
> + * Run one iteration of garbage collection. Return true if
> + * there still may be objects to collect, false otherwise.
> + */
> +static bool
> +memtx_engine_run_gc(struct memtx_engine *memtx)
> +{
> + if (stailq_empty(&memtx->gc_queue))
> + return false;
> +
> + struct memtx_gc_task *task = stailq_shift_entry(&memtx->gc_queue,
> + struct memtx_gc_task, link);
> + if (task->func(task)) {
> + /*
> + * The task still has objects to collect,
> + * put it back.
> + */
> + stailq_add_entry(&memtx->gc_queue, task, link);
> + }
> + return true;
> +}
Please rework this to avoid adding back tasks to the queue.
Peek the task in the queue and pop it if it's complete.
Using boolean for task completion status makes the code hard to
follow. I would use an out parameter.
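For illustration, a rough sketch of what the out-parameter variant could
look like (not part of the patch; the stop/task_done names are made up
here, only the stailq helpers and memtx_gc_task come from the series):
/*
 * Hypothetical variant: report "nothing left to do" via an out
 * parameter instead of the return value. Same logic as the patch,
 * only the signatures differ (task->func would take an out
 * parameter as well).
 */
static void
memtx_engine_run_gc(struct memtx_engine *memtx, bool *stop)
{
	*stop = stailq_empty(&memtx->gc_queue);
	if (*stop)
		return;
	struct memtx_gc_task *task = stailq_shift_entry(&memtx->gc_queue,
					struct memtx_gc_task, link);
	bool task_done;
	task->func(task, &task_done);
	if (!task_done) {
		/* The task still has objects to collect, put it back. */
		stailq_add_entry(&memtx->gc_queue, task, link);
	}
}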
> memtx_engine_gc_f(va_list va)
> {
> struct memtx_engine *memtx = va_arg(va, struct memtx_engine *);
> while (!fiber_is_cancelled()) {
> - if (stailq_empty(&memtx->gc_queue)) {
> + if (!memtx_engine_run_gc(memtx)) {
This also runs counter to our typical use of return values.
Otherwise OK to push.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 8/8] memtx: run garbage collection on demand
2018-05-22 11:46 ` [PATCH 8/8] memtx: run garbage collection on demand Vladimir Davydov
@ 2018-05-22 14:00 ` Konstantin Osipov
0 siblings, 0 replies; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-22 14:00 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> When a memtx space is dropped or truncated, we delegate freeing tuples
> stored in it to a background fiber so as not to block the caller (and tx
> thread) for too long. Turns out it doesn't work out well for ephemeral
> spaces, which share the destruction code with normal spaces: the problem
> is the user might issue a lot of complex SQL SELECT statements that
> create a lot of ephemeral spaces and do not yield and hence don't give
> the garbage collection fiber a chance to clean up. There's a test that
> emulates this, 2.0:test/sql-tap/gh-3083-ephemeral-unref-tuples.test.lua.
> For this test to pass, let's run garbage collection procedure on demand,
> i.e. when any of memtx allocation functions fails to allocate memory.
The patch looks good to me, except for the signature of run_gc, which
should be changed starting from the previous patch.
We're getting really close to capped collections with this patch,
but a log-structured allocator would fit in really nicely before
that.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 3/8] memtx: pass engine to memory allocation functions
2018-05-22 13:47 ` Konstantin Osipov
@ 2018-05-22 14:39 ` Vladimir Davydov
0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 14:39 UTC (permalink / raw)
To: Konstantin Osipov; +Cc: tarantool-patches
On Tue, May 22, 2018 at 04:47:28PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> > We need this so that we can force garbage collection when we are short
> > on memory. There are two such functions: one is used for allocating
> > index extents, another for allocating tuples. Index allocating function
> > has an opaque context so we simply reuse it for passing memtx engine to
> > it. To pass memtx engine to tuple allocating function, we add an opaque
> > engine specific pointer to tuple_format (engine_data) and set it to
> > memtx_engine for memtx spaces.
>
> Please rename engine_data to engine.
OK.
> I would not use an opaque pointer either, reference the engine
> explicitly.
The reason I decided to make it an opaque pointer is that we might want
to reuse it in vinyl to make vy_max_tuple_size, which is currently a
global variable, a part of vy_env. The problem is that vy_max_tuple_size
is used in vy_stmt.c, which knows nothing about vy_env, so to avoid
cross-dependencies we will probably have to introduce vy_stmt_env to
store this variable and make tuple_format reference not the whole vy_env,
but only vy_stmt_env, hence the opaque pointer.
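Purely as an illustration of that idea (nothing below exists yet, all
names are tentative):
/* Hypothetical: a tiny env owned by vy_stmt.c and embedded in vy_env. */
struct vy_stmt_env {
	/* What is currently the global vy_max_tuple_size. */
	size_t max_tuple_size;
};
struct vy_env {
	/* ... the rest of the vinyl engine state ... */
	struct vy_stmt_env stmt_env;
};
/*
 * tuple_format->engine_data (or ->engine after the rename) would point
 * to &env->stmt_env for vinyl formats, so vy_stmt.c could reach the
 * limit without knowing anything about the full vy_env.
 */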
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 7/8] memtx: rework background garbage collection procedure
2018-05-22 13:56 ` Konstantin Osipov
@ 2018-05-22 14:49 ` Vladimir Davydov
2018-05-22 16:42 ` Konstantin Osipov
0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 14:49 UTC (permalink / raw)
To: Konstantin Osipov; +Cc: tarantool-patches
On Tue, May 22, 2018 at 04:56:34PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> > +/**
> > + * Run one iteration of garbage collection. Return true if
> > + * there still may be objects to collect, false otherwise.
> > + */
> > +static bool
> > +memtx_engine_run_gc(struct memtx_engine *memtx)
> > +{
> > + if (stailq_empty(&memtx->gc_queue))
> > + return false;
> > +
> > + struct memtx_gc_task *task = stailq_shift_entry(&memtx->gc_queue,
> > + struct memtx_gc_task, link);
> > + if (task->func(task)) {
> > + /*
> > + * The task still has objects to collect,
> > + * put it back.
> > + */
> > + stailq_add_entry(&memtx->gc_queue, task, link);
> > + }
> > + return true;
> > +}
>
> Please rework this to avoid adding back tasks to the queue.
>
> Peek the task in the queue and pop it if it's complete.
Can't do that, unfortunately: in this implementation the garbage
collection callback is supposed to free the task on the last iteration
(the task is embedded in the index), so I can't remove a task from the
queue after its callback has reported there are no more objects to free.
I can add a comment explaining this. If you have a better idea, please
share.
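To make the lifetime issue explicit, here is the pattern in question,
condensed from the hash index part of the patch (abridged, comments
mine):
static bool
memtx_hash_index_gc_f(struct memtx_gc_task *task)
{
	struct memtx_hash_index *index = container_of(task,
			struct memtx_hash_index, gc_task);
	/* ... unref up to YIELD_LOOPS tuples, return true if more remain ... */

	/*
	 * Last iteration: freeing the index also frees gc_task, because
	 * the task is embedded in the index. Once the callback has
	 * returned false, the task pointer must not be touched again,
	 * which is why the task is shifted off the queue before the
	 * call rather than popped afterwards.
	 */
	memtx_hash_index_free(index);
	return false;
}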
>
> Using boolean for task completion status makes the code hard to
> follow. I would use an out parameter.
OK.
>
> > memtx_engine_gc_f(va_list va)
> > {
> > struct memtx_engine *memtx = va_arg(va, struct memtx_engine *);
> > while (!fiber_is_cancelled()) {
> > - if (stailq_empty(&memtx->gc_queue)) {
> > + if (!memtx_engine_run_gc(memtx)) {
>
> This is also counter to our typical use of return values.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 5/8] memtx: destroy slab arena on engine shutdown
2018-05-22 13:50 ` Konstantin Osipov
@ 2018-05-22 16:26 ` Vladimir Davydov
0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Davydov @ 2018-05-22 16:26 UTC (permalink / raw)
To: Konstantin Osipov; +Cc: tarantool-patches
On Tue, May 22, 2018 at 04:50:16PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 15:10]:
> > Since it is created when the memtx engine is initialized, we should
> > destroy it on engine shutdown.
> > ---
> > src/box/memtx_engine.c | 5 +++++
> > 1 file changed, 5 insertions(+)
>
> OK to push.
>
> This needs a unit test. Please file a ticket for a unit test if
> you can't make one at once.
https://github.com/tarantool/tarantool/issues/3419
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 7/8] memtx: rework background garbage collection procedure
2018-05-22 14:49 ` Vladimir Davydov
@ 2018-05-22 16:42 ` Konstantin Osipov
0 siblings, 0 replies; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-22 16:42 UTC (permalink / raw)
To: Vladimir Davydov; +Cc: tarantool-patches
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/05/22 17:53]:
> Can't do that, unfortunately: in this implementation a garbage
> collection callback is supposed to free the task on the last iteration
> (as a task embedded in an index) so I can't remove a task from the queue
> after its callback reported there was no more objects to free.
>
> I can add a comment explaining this. If you have a better idea, please
> share.
Implement a vtab for tasks, with a free() virtual method.
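One possible shape for such a vtab, strictly as a sketch (the names and
the exact split between run() and free() are mine, not an agreed design):
struct memtx_gc_task;
struct memtx_gc_task_vtab {
	/* Free a batch of objects; set *done when nothing is left. */
	void (*run)(struct memtx_gc_task *task, bool *done);
	/* Destroy the task itself, e.g. free the index embedding it. */
	void (*free)(struct memtx_gc_task *task);
};
struct memtx_gc_task {
	const struct memtx_gc_task_vtab *vtab;
	struct stailq_entry link;
};
With run() and free() separated, the gc loop can peek the task, call
run(), and only pop it from the queue and call free() once it is done,
so nothing ever has to be added back.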
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
^ permalink raw reply [flat|nested] 21+ messages in thread