* [PATCH v2 0/5] Prepare vylog for space alter
@ 2018-03-20 11:29 Vladimir Davydov
From: Vladimir Davydov @ 2018-03-20 11:29 UTC
To: kostja; +Cc: tarantool-patches
The vinyl metadata log (vylog) has two idiosyncrasies that thwart
implementation of ALTER of vinyl spaces. The first one is the
callback-based recovery procedure (see vy_recovery_iterate): an
attempt to purge incomplete indexes from vylog after restart using the
recovery callback would result in extremely ugly and obfuscated code.
The second problem is the way indexes are identified in vylog: we
currently use LSN as a unique identifier, but it won't be unique if
more than one index is altered in one operation, which is the case
when the primary key definition is changed.
The patch set solves both these problems as follows:
- Patch 1 refactors the recovery procedure so that it uses plain
iterators instead of callbacks.
- Patches 2 and 3 do some renames in preparation for patch 4.
- Patch 4 stops using the LSN as the unique index identifier. Instead,
it introduces a dedicated unique index id, similar to the range id and
slice id we already have.
- Patch 5 does some cleanup in ALTER code that becomes possible
after patch 4 is applied.
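To illustrate the identifier problem that patch 4 addresses, here is a minimal standalone sketch. It is not code from this series: `index_rec`, `log_indexes`, `lsn_is_unique`, `id_is_unique` and `next_id` are hypothetical names, and the sketch simply assumes that all indexes created by one operation are logged with the same LSN, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an index record in a metadata log. */
struct index_rec {
	int64_t lsn;	/* LSN of the operation that created the index */
	int64_t id;	/* identifier drawn from a dedicated counter */
};

/* Hypothetical counter, bumped once per logged index. */
static int64_t next_id;

/* Log `count` indexes created by a single operation with one LSN. */
void
log_indexes(struct index_rec *recs, size_t count, int64_t lsn)
{
	for (size_t i = 0; i < count; i++) {
		recs[i].lsn = lsn;
		recs[i].id = next_id++;
	}
}

/* Return true if no two records share the same LSN. */
bool
lsn_is_unique(const struct index_rec *recs, size_t count)
{
	for (size_t i = 0; i < count; i++)
		for (size_t j = i + 1; j < count; j++)
			if (recs[i].lsn == recs[j].lsn)
				return false;
	return true;
}

/* Return true if no two records share the same id. */
bool
id_is_unique(const struct index_rec *recs, size_t count)
{
	for (size_t i = 0; i < count; i++)
		for (size_t j = i + 1; j < count; j++)
			if (recs[i].id == recs[j].id)
				return false;
	return true;
}
```

In this toy model, altering a primary key that rebuilds three indexes at once yields three records with the same LSN but three distinct ids, which is why a separate counter is a safer key than the LSN.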
https://github.com/tarantool/tarantool/tree/vy-prepare-vylog-for-alter
Changes in v2:
- Introduce a unique index id for identifying indexes in vylog
instead of the space_id/index_id pair used in v1. We still need
a unique id, because during index rebuild (yet to be implemented),
two indexes with the same space_id/index_id will coexist.
- Do not use 'incarnation' counter for distinguishing different
incarnations of the same index. Use LSN as before.
v1: https://www.freelists.org/post/tarantool-patches/PATCH-04-Prepare-vylog-for-space-alter
Vladimir Davydov (5):
vinyl: refactor vylog recovery
vinyl: rename vy_index::id to index_id
vinyl: rename vy_log_record::index_id/space_id to
index_def_id/space_def_id
vinyl: do not use index lsn to identify indexes in vylog
alter: rewrite space truncation using alter infrastructure
src/box/alter.cc | 107 +-------
src/box/memtx_space.c | 29 +-
src/box/space.h | 44 ----
src/box/sysview_engine.c | 17 --
src/box/vinyl.c | 611 +++++++++++++++---------------------------
src/box/vy_index.c | 416 +++++++++++++++--------------
src/box/vy_index.h | 36 +--
src/box/vy_log.c | 626 +++++++++++++++-----------------------------
src/box/vy_log.h | 282 ++++++++++++--------
src/box/vy_point_lookup.c | 2 +-
src/box/vy_read_iterator.c | 5 +-
src/box/vy_scheduler.c | 29 +-
src/box/vy_tx.c | 4 +-
test/unit/vy_log_stub.c | 8 +-
test/unit/vy_point_lookup.c | 6 +-
test/vinyl/layout.result | 64 ++---
16 files changed, 894 insertions(+), 1392 deletions(-)
--
2.11.0
* [PATCH v2 1/5] vinyl: refactor vylog recovery
@ 2018-03-20 11:29 Vladimir Davydov
From: Vladimir Davydov @ 2018-03-20 11:29 UTC
To: kostja; +Cc: tarantool-patches
The vy_recovery structure was initially designed to be opaque to the
outside world: to iterate over the objects stored in it, one is supposed
to use vy_recovery_iterate(), which invokes the given callback for each
recovered object, encoded as the vy_log_record that was used to create it.
Such a design becomes extremely difficult to use when we need to preserve
some context between callback invocations - e.g. see how ugly the backup
and garbage collection procedures look. And it is going to become even
more obfuscated once we introduce the notion of incomplete indexes
(indexes that are currently being built by ALTER).
So let's refactor the vylog recovery procedure: make the vy_recovery
structure transparent and allow iterating over the internal
representations of recovered objects directly, without callbacks.
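The difference between the two styles can be sketched in a few lines of standalone C. This is only an illustration under simplified assumptions: `run_info`, `index_info`, `rec`, `count_ctx`, `count_cb` and `count_live_runs` are hypothetical names, not the structures or functions touched by this patch, and plain `next` pointers stand in for rlist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for recovered objects. */
struct run_info {
	int64_t id;
	bool is_dropped;
	struct run_info *next;		/* next run of the same index */
};

struct index_info {
	uint32_t space_id;
	uint32_t index_id;
	struct run_info *runs;		/* head of the run list */
	struct index_info *next;	/* next index in the context */
};

/* Callback style: objects arrive one by one as flat records. */
enum rec_type { REC_INDEX, REC_RUN };

struct rec {
	enum rec_type type;
	uint32_t space_id;
	uint32_t index_id;
	int64_t run_id;
	bool is_dropped;
};

/* State that must be carried between callback invocations. */
struct count_ctx {
	uint32_t space_id;	/* remembered from the last REC_INDEX */
	uint32_t want_space_id;	/* space whose runs we are counting */
	int live_runs;
};

int
count_cb(const struct rec *rec, void *arg)
{
	struct count_ctx *ctx = arg;
	if (rec->type == REC_INDEX)
		ctx->space_id = rec->space_id;
	else if (!rec->is_dropped && ctx->space_id == ctx->want_space_id)
		ctx->live_runs++;
	return 0;
}

/* Direct style: the enclosing loop variable is the context. */
int
count_live_runs(const struct index_info *indexes, uint32_t space_id)
{
	int live_runs = 0;
	for (const struct index_info *i = indexes; i != NULL; i = i->next) {
		if (i->space_id != space_id)
			continue;
		for (const struct run_info *r = i->runs; r != NULL; r = r->next)
			if (!r->is_dropped)
				live_runs++;
	}
	return live_runs;
}
```

Both functions compute the same number, but the callback version has to remember which index the current run belongs to, whereas the nested loops get that for free from the data layout.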
---
src/box/vinyl.c | 399 +++++++++++++++++++++--------------------------
src/box/vy_index.c | 375 ++++++++++++++++++++++----------------------
src/box/vy_log.c | 340 ++++++++++------------------------------
src/box/vy_log.h | 197 +++++++++++++++--------
test/unit/vy_log_stub.c | 8 +-
test/vinyl/layout.result | 4 +-
6 files changed, 582 insertions(+), 741 deletions(-)
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index e0c30757..d3659b0b 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -3011,8 +3011,6 @@ struct vy_join_ctx {
struct cbus_call_msg cmsg;
/** ID of the space currently being relayed. */
uint32_t space_id;
- /** Ordinal number of the index. */
- uint32_t index_id;
/**
* Index key definition, as defined by the user.
* We only send the primary key, so the definition
@@ -3036,6 +3034,55 @@ struct vy_join_ctx {
struct rlist slices;
};
+/**
+ * Recover a slice and add it to the list of slices.
+ * Newer slices are supposed to be recovered first.
+ * Returns 0 on success, -1 on failure.
+ */
+static int
+vy_prepare_send_slice(struct vy_join_ctx *ctx,
+ struct vy_slice_recovery_info *slice_info)
+{
+ int rc = -1;
+ struct vy_run *run = NULL;
+ struct tuple *begin = NULL, *end = NULL;
+
+ run = vy_run_new(&ctx->env->run_env, slice_info->run->id);
+ if (run == NULL)
+ goto out;
+ if (vy_run_recover(run, ctx->env->path, ctx->space_id, 0) != 0)
+ goto out;
+
+ if (slice_info->begin != NULL) {
+ begin = vy_key_from_msgpack(ctx->env->index_env.key_format,
+ slice_info->begin);
+ if (begin == NULL)
+ goto out;
+ }
+ if (slice_info->end != NULL) {
+ end = vy_key_from_msgpack(ctx->env->index_env.key_format,
+ slice_info->end);
+ if (end == NULL)
+ goto out;
+ }
+
+ struct vy_slice *slice = vy_slice_new(slice_info->id, run,
+ begin, end, ctx->key_def);
+ if (slice == NULL)
+ goto out;
+
+ rlist_add_tail_entry(&ctx->slices, slice, in_join);
+ rc = 0;
+out:
+ if (run != NULL)
+ vy_run_unref(run);
+ if (begin != NULL)
+ tuple_unref(begin);
+ if (end != NULL)
+ tuple_unref(end);
+ return rc;
+}
+
static int
vy_send_range_f(struct cbus_call_msg *cmsg)
{
@@ -3068,28 +3115,38 @@ err:
return rc;
}
-/**
- * Merge and send all runs from the given relay context.
- * On success, delete runs.
- */
+/** Merge and send all runs of the given range. */
static int
-vy_send_range(struct vy_join_ctx *ctx)
+vy_send_range(struct vy_join_ctx *ctx,
+ struct vy_range_recovery_info *range_info)
{
- if (rlist_empty(&ctx->slices))
+ int rc;
+ struct vy_slice *slice, *tmp;
+
+ if (rlist_empty(&range_info->slices))
return 0; /* nothing to do */
- int rc = -1;
+ /* Recover slices. */
+ struct vy_slice_recovery_info *slice_info;
+ rlist_foreach_entry(slice_info, &range_info->slices, in_range) {
+ rc = vy_prepare_send_slice(ctx, slice_info);
+ if (rc != 0)
+ goto out_delete_slices;
+ }
+
+ /* Create a write iterator. */
struct rlist fake_read_views;
rlist_create(&fake_read_views);
ctx->wi = vy_write_iterator_new(ctx->key_def,
ctx->format, ctx->upsert_format,
true, true, &fake_read_views);
- if (ctx->wi == NULL)
+ if (ctx->wi == NULL) {
+ rc = -1;
goto out;
-
- struct vy_slice *slice;
+ }
rlist_foreach_entry(slice, &ctx->slices, in_join) {
- if (vy_write_iterator_new_slice(ctx->wi, slice) != 0)
+ rc = vy_write_iterator_new_slice(ctx->wi, slice);
+ if (rc != 0)
goto out_delete_wi;
}
@@ -3099,11 +3156,10 @@ vy_send_range(struct vy_join_ctx *ctx)
vy_send_range_f, NULL, TIMEOUT_INFINITY);
fiber_set_cancellable(cancellable);
- struct vy_slice *tmp;
+out_delete_slices:
rlist_foreach_entry_safe(slice, &ctx->slices, in_join, tmp)
vy_slice_delete(slice);
rlist_create(&ctx->slices);
-
out_delete_wi:
ctx->wi->iface->close(ctx->wi);
ctx->wi = NULL;
@@ -3111,96 +3167,59 @@ out:
return rc;
}
-/** Relay callback, passed to vy_recovery_iterate(). */
+/** Send all tuples stored in the given index. */
static int
-vy_join_cb(const struct vy_log_record *record, void *arg)
+vy_send_index(struct vy_join_ctx *ctx,
+ struct vy_index_recovery_info *index_info)
{
- struct vy_join_ctx *ctx = arg;
-
- if (record->type == VY_LOG_CREATE_INDEX ||
- record->type == VY_LOG_INSERT_RANGE) {
- /*
- * All runs of the current range have been recovered,
- * so send them to the replica.
- */
- if (vy_send_range(ctx) != 0)
- return -1;
- }
+ int rc = -1;
- if (record->type == VY_LOG_CREATE_INDEX) {
- ctx->space_id = record->space_id;
- ctx->index_id = record->index_id;
- if (ctx->key_def != NULL)
- free(ctx->key_def);
- ctx->key_def = key_def_new_with_parts(record->key_parts,
- record->key_part_count);
- if (ctx->key_def == NULL)
- return -1;
- if (ctx->format != NULL)
- tuple_format_unref(ctx->format);
- ctx->format = tuple_format_new(&vy_tuple_format_vtab,
- &ctx->key_def, 1, 0, NULL, 0,
- NULL);
- if (ctx->format == NULL)
- return -1;
- tuple_format_ref(ctx->format);
- if (ctx->upsert_format != NULL)
- tuple_format_unref(ctx->upsert_format);
- ctx->upsert_format = vy_tuple_format_new_upsert(ctx->format);
- if (ctx->upsert_format == NULL)
- return -1;
- tuple_format_ref(ctx->upsert_format);
- }
+ if (index_info->is_dropped)
+ return 0;
/*
* We are only interested in the primary index.
* Secondary keys will be rebuilt on the destination.
*/
- if (ctx->index_id != 0)
+ if (index_info->index_id != 0)
return 0;
- if (record->type == VY_LOG_INSERT_SLICE) {
- struct tuple_format *key_format = ctx->env->index_env.key_format;
- struct tuple *begin = NULL, *end = NULL;
- bool success = false;
-
- struct vy_run *run = vy_run_new(&ctx->env->run_env,
- record->run_id);
- if (run == NULL)
- goto done_slice;
- if (vy_run_recover(run, ctx->env->path,
- ctx->space_id, ctx->index_id) != 0)
- goto done_slice;
-
- if (record->begin != NULL) {
- begin = vy_key_from_msgpack(key_format, record->begin);
- if (begin == NULL)
- goto done_slice;
- }
- if (record->end != NULL) {
- end = vy_key_from_msgpack(key_format, record->end);
- if (end == NULL)
- goto done_slice;
- }
+ ctx->space_id = index_info->space_id;
- struct vy_slice *slice = vy_slice_new(record->slice_id,
- run, begin, end, ctx->key_def);
- if (slice == NULL)
- goto done_slice;
-
- rlist_add_entry(&ctx->slices, slice, in_join);
- success = true;
-done_slice:
- if (run != NULL)
- vy_run_unref(run);
- if (begin != NULL)
- tuple_unref(begin);
- if (end != NULL)
- tuple_unref(end);
- if (!success)
- return -1;
+ /* Create key definition and tuple format. */
+ ctx->key_def = key_def_new_with_parts(index_info->key_parts,
+ index_info->key_part_count);
+ if (ctx->key_def == NULL)
+ goto out;
+ ctx->format = tuple_format_new(&vy_tuple_format_vtab, &ctx->key_def,
+ 1, 0, NULL, 0, NULL);
+ if (ctx->format == NULL)
+ goto out_free_key_def;
+ tuple_format_ref(ctx->format);
+ ctx->upsert_format = vy_tuple_format_new_upsert(ctx->format);
+ if (ctx->upsert_format == NULL)
+ goto out_free_format;
+ tuple_format_ref(ctx->upsert_format);
+
+ /* Send ranges. */
+ struct vy_range_recovery_info *range_info;
+ assert(!rlist_empty(&index_info->ranges));
+ rlist_foreach_entry(range_info, &index_info->ranges, in_index) {
+ rc = vy_send_range(ctx, range_info);
+ if (rc != 0)
+ break;
}
- return 0;
+
+ tuple_format_unref(ctx->upsert_format);
+ ctx->upsert_format = NULL;
+out_free_format:
+ tuple_format_unref(ctx->format);
+ ctx->format = NULL;
+out_free_key_def:
+ free(ctx->key_def);
+ ctx->key_def = NULL;
+out:
+ return rc;
}
/** Relay cord function. */
@@ -3260,22 +3279,15 @@ vinyl_engine_join(struct engine *engine, struct vclock *vclock,
say_error("failed to recover vylog to join a replica");
goto out_join_cord;
}
- rc = vy_recovery_iterate(recovery, vy_join_cb, ctx);
+ rc = 0;
+ struct vy_index_recovery_info *index_info;
+ rlist_foreach_entry(index_info, &recovery->indexes, in_recovery) {
+ rc = vy_send_index(ctx, index_info);
+ if (rc != 0)
+ break;
+ }
vy_recovery_delete(recovery);
- /* Send the last range. */
- if (rc == 0)
- rc = vy_send_range(ctx);
-
- /* Cleanup. */
- if (ctx->key_def != NULL)
- free(ctx->key_def);
- if (ctx->format != NULL)
- tuple_format_unref(ctx->format);
- if (ctx->upsert_format != NULL)
- tuple_format_unref(ctx->upsert_format);
- struct vy_slice *slice, *tmp;
- rlist_foreach_entry_safe(slice, &ctx->slices, in_join, tmp)
- vy_slice_delete(slice);
+
out_join_cord:
cbus_stop_loop(&ctx->relay_pipe);
cpipe_destroy(&ctx->relay_pipe);
@@ -3355,70 +3367,29 @@ vinyl_space_apply_initial_join_row(struct space *space, struct request *request)
/* {{{ Garbage collection */
-/** Argument passed to vy_gc_cb(). */
-struct vy_gc_arg {
- /** Vinyl environment. */
- struct vy_env *env;
- /**
- * Specifies what kinds of runs to delete.
- * See VY_GC_*.
- */
- unsigned int gc_mask;
- /** LSN of the oldest checkpoint to save. */
- int64_t gc_lsn;
- /**
- * ID of the current space and index.
- * Needed for file name formatting.
- */
- uint32_t space_id;
- uint32_t index_id;
- /** Number of times the callback has been called. */
- int loops;
-};
-
/**
- * Garbage collection callback, passed to vy_recovery_iterate().
- *
* Given a record encoding information about a vinyl run, try to
* delete the corresponding files. On success, write a "forget" record
* to the log so that all information about the run is deleted on the
* next log rotation.
*/
-static int
-vy_gc_cb(const struct vy_log_record *record, void *cb_arg)
+static void
+vy_gc_run(struct vy_env *env,
+ struct vy_index_recovery_info *index_info,
+ struct vy_run_recovery_info *run_info)
{
- struct vy_gc_arg *arg = cb_arg;
-
- switch (record->type) {
- case VY_LOG_CREATE_INDEX:
- arg->space_id = record->space_id;
- arg->index_id = record->index_id;
- goto out;
- case VY_LOG_PREPARE_RUN:
- if ((arg->gc_mask & VY_GC_INCOMPLETE) == 0)
- goto out;
- break;
- case VY_LOG_DROP_RUN:
- if ((arg->gc_mask & VY_GC_DROPPED) == 0 ||
- record->gc_lsn >= arg->gc_lsn)
- goto out;
- break;
- default:
- goto out;
- }
-
ERROR_INJECT(ERRINJ_VY_GC,
{say_error("error injection: vinyl run %lld not deleted",
- (long long)record->run_id); goto out;});
+ (long long)run_info->id); return;});
/* Try to delete files. */
- if (vy_run_remove_files(arg->env->path, arg->space_id,
- arg->index_id, record->run_id) != 0)
- goto out;
+ if (vy_run_remove_files(env->path, index_info->space_id,
+ index_info->index_id, run_info->id) != 0)
+ return;
/* Forget the run on success. */
vy_log_tx_begin();
- vy_log_forget_run(record->run_id);
+ vy_log_forget_run(run_info->id);
/*
* Leave the record in the vylog buffer on disk error.
* If we fail to flush it before restart, we will retry
@@ -3426,23 +3397,35 @@ vy_gc_cb(const struct vy_log_record *record, void *cb_arg)
* is invoked, which is harmless.
*/
vy_log_tx_try_commit();
-out:
- if (++arg->loops % VY_YIELD_LOOPS == 0)
- fiber_sleep(0);
- return 0;
}
-/** Delete unused run files, see vy_gc_arg for more details. */
+/**
+ * Delete unused run files stored in the recovery context.
+ * @param env Vinyl environment.
+ * @param recovery Recovery context.
+ * @param gc_mask Specifies what kinds of runs to delete (see VY_GC_*).
+ * @param gc_lsn LSN of the oldest checkpoint to save.
+ */
static void
vy_gc(struct vy_env *env, struct vy_recovery *recovery,
unsigned int gc_mask, int64_t gc_lsn)
{
- struct vy_gc_arg arg = {
- .env = env,
- .gc_mask = gc_mask,
- .gc_lsn = gc_lsn,
- };
- vy_recovery_iterate(recovery, vy_gc_cb, &arg);
+ int loops = 0;
+ struct vy_index_recovery_info *index_info;
+ rlist_foreach_entry(index_info, &recovery->indexes, in_recovery) {
+ struct vy_run_recovery_info *run_info;
+ rlist_foreach_entry(run_info, &index_info->runs, in_index) {
+ if ((run_info->is_dropped &&
+ run_info->gc_lsn < gc_lsn &&
+ (gc_mask & VY_GC_DROPPED) != 0) ||
+ (run_info->is_incomplete &&
+ (gc_mask & VY_GC_INCOMPLETE) != 0)) {
+ vy_gc_run(env, index_info, run_info);
+ }
+ if (++loops % VY_YIELD_LOOPS == 0)
+ fiber_sleep(0);
+ }
+ }
}
static int
@@ -3469,52 +3452,6 @@ vinyl_engine_collect_garbage(struct engine *engine, int64_t lsn)
/* {{{ Backup */
-/** Argument passed to vy_backup_cb(). */
-struct vy_backup_arg {
- /** Vinyl environment. */
- struct vy_env *env;
- /** Backup callback. */
- int (*cb)(const char *, void *);
- /** Argument passed to @cb. */
- void *cb_arg;
- /**
- * ID of the current space and index.
- * Needed for file name formatting.
- */
- uint32_t space_id;
- uint32_t index_id;
- /** Number of times the callback has been called. */
- int loops;
-};
-
-/** Backup callback, passed to vy_recovery_iterate(). */
-static int
-vy_backup_cb(const struct vy_log_record *record, void *cb_arg)
-{
- struct vy_backup_arg *arg = cb_arg;
-
- if (record->type == VY_LOG_CREATE_INDEX) {
- arg->space_id = record->space_id;
- arg->index_id = record->index_id;
- }
-
- if (record->type != VY_LOG_CREATE_RUN || record->is_dropped)
- goto out;
-
- char path[PATH_MAX];
- for (int type = 0; type < vy_file_MAX; type++) {
- vy_run_snprint_path(path, sizeof(path), arg->env->path,
- arg->space_id, arg->index_id,
- record->run_id, type);
- if (arg->cb(path, arg->cb_arg) != 0)
- return -1;
- }
-out:
- if (++arg->loops % VY_YIELD_LOOPS == 0)
- fiber_sleep(0);
- return 0;
-}
-
static int
vinyl_engine_backup(struct engine *engine, struct vclock *vclock,
engine_backup_cb cb, void *cb_arg)
@@ -3535,12 +3472,32 @@ vinyl_engine_backup(struct engine *engine, struct vclock *vclock,
say_error("failed to recover vylog for backup");
return -1;
}
- struct vy_backup_arg arg = {
- .env = env,
- .cb = cb,
- .cb_arg = cb_arg,
- };
- int rc = vy_recovery_iterate(recovery, vy_backup_cb, &arg);
+ int rc = 0;
+ int loops = 0;
+ struct vy_index_recovery_info *index_info;
+ rlist_foreach_entry(index_info, &recovery->indexes, in_recovery) {
+ if (index_info->is_dropped)
+ continue;
+ struct vy_run_recovery_info *run_info;
+ rlist_foreach_entry(run_info, &index_info->runs, in_index) {
+ if (run_info->is_dropped || run_info->is_incomplete)
+ continue;
+ char path[PATH_MAX];
+ for (int type = 0; type < vy_file_MAX; type++) {
+ vy_run_snprint_path(path, sizeof(path),
+ env->path,
+ index_info->space_id,
+ index_info->index_id,
+ run_info->id, type);
+ rc = cb(path, cb_arg);
+ if (rc != 0)
+ goto out;
+ }
+ if (++loops % VY_YIELD_LOOPS == 0)
+ fiber_sleep(0);
+ }
+ }
+out:
vy_recovery_delete(recovery);
return rc;
}
diff --git a/src/box/vy_index.c b/src/box/vy_index.c
index 68fccab5..9c199ddd 100644
--- a/src/box/vy_index.c
+++ b/src/box/vy_index.c
@@ -36,7 +36,6 @@
#include <sys/stat.h>
#include <sys/types.h>
-#include "assoc.h"
#include "diag.h"
#include "errcode.h"
#include "histogram.h"
@@ -386,156 +385,150 @@ vy_index_create(struct vy_index *index)
return vy_index_init_range_tree(index);
}
-/** vy_index_recovery_cb() argument. */
-struct vy_index_recovery_cb_arg {
- /** Index being recovered. */
- struct vy_index *index;
- /** Last recovered range. */
- struct vy_range *range;
- /** Vinyl run environment. */
- struct vy_run_env *run_env;
- /**
- * All recovered runs hashed by ID.
- * It is needed in order not to load the same
- * run each time a slice is created for it.
- */
- struct mh_i64ptr_t *run_hash;
- /**
- * True if force_recovery mode is enabled.
+static struct vy_run *
+vy_index_recover_run(struct vy_index *index,
+ struct vy_run_recovery_info *run_info,
+ struct vy_run_env *run_env, bool force_recovery)
+{
+ assert(!run_info->is_dropped);
+ assert(!run_info->is_incomplete);
+
+ if (run_info->data != NULL) {
+ /* Already recovered. */
+ return run_info->data;
+ }
+
+ struct vy_run *run = vy_run_new(run_env, run_info->id);
+ if (run == NULL)
+ return NULL;
+
+ run->dump_lsn = run_info->dump_lsn;
+ if (vy_run_recover(run, index->env->path,
+ index->space_id, index->id) != 0 &&
+ (!force_recovery ||
+ vy_run_rebuild_index(run, index->env->path,
+ index->space_id, index->id,
+ index->cmp_def, index->key_def,
+ index->mem_format, index->upsert_format,
+ &index->opts) != 0)) {
+ vy_run_unref(run);
+ return NULL;
+ }
+ vy_index_add_run(index, run);
+
+ /*
+ * The same run can be referenced by more than one slice
+ * so we cache recovered runs in run_info to avoid loading
+ * the same run multiple times.
+ *
+ * Runs are stored with their reference counters elevated.
+ * We drop the extra references as soon as index recovery
+ * is complete (see vy_index_recover()).
*/
- bool force_recovery;
-};
+ run_info->data = run;
+ return run;
+}
-/** Index recovery callback, passed to vy_recovery_load_index(). */
-static int
-vy_index_recovery_cb(const struct vy_log_record *record, void *cb_arg)
+static struct vy_slice *
+vy_index_recover_slice(struct vy_index *index, struct vy_range *range,
+ struct vy_slice_recovery_info *slice_info,
+ struct vy_run_env *run_env, bool force_recovery)
{
- struct vy_index_recovery_cb_arg *arg = cb_arg;
- struct vy_index *index = arg->index;
- struct vy_range *range = arg->range;
- struct vy_run_env *run_env = arg->run_env;
- struct mh_i64ptr_t *run_hash = arg->run_hash;
- bool force_recovery = arg->force_recovery;
- struct tuple_format *key_format = index->env->key_format;
struct tuple *begin = NULL, *end = NULL;
+ struct vy_slice *slice = NULL;
struct vy_run *run;
- struct vy_slice *slice;
- bool success = false;
-
- assert(record->type == VY_LOG_CREATE_INDEX || index->commit_lsn >= 0);
- if (record->type == VY_LOG_INSERT_RANGE ||
- record->type == VY_LOG_INSERT_SLICE) {
- if (record->begin != NULL) {
- begin = vy_key_from_msgpack(key_format, record->begin);
- if (begin == NULL)
- goto out;
- }
- if (record->end != NULL) {
- end = vy_key_from_msgpack(key_format, record->end);
- if (end == NULL)
- goto out;
- }
- }
-
- switch (record->type) {
- case VY_LOG_CREATE_INDEX:
- assert(record->index_id == index->id);
- assert(record->space_id == index->space_id);
- assert(index->commit_lsn < 0);
- assert(record->index_lsn >= 0);
- index->commit_lsn = record->index_lsn;
- break;
- case VY_LOG_DUMP_INDEX:
- assert(record->index_lsn == index->commit_lsn);
- index->dump_lsn = record->dump_lsn;
- break;
- case VY_LOG_TRUNCATE_INDEX:
- assert(record->index_lsn == index->commit_lsn);
- index->truncate_count = record->truncate_count;
- break;
- case VY_LOG_DROP_INDEX:
- assert(record->index_lsn == index->commit_lsn);
- index->is_dropped = true;
- /*
- * If the index was dropped, we don't need to replay
- * truncate (see vy_prepare_truncate_space()).
- */
- index->truncate_count = UINT64_MAX;
- break;
- case VY_LOG_PREPARE_RUN:
- break;
- case VY_LOG_CREATE_RUN:
- if (record->is_dropped)
- break;
- assert(record->index_lsn == index->commit_lsn);
- run = vy_run_new(run_env, record->run_id);
- if (run == NULL)
+ if (slice_info->begin != NULL) {
+ begin = vy_key_from_msgpack(index->env->key_format,
+ slice_info->begin);
+ if (begin == NULL)
goto out;
- run->dump_lsn = record->dump_lsn;
- if (vy_run_recover(run, index->env->path,
- index->space_id, index->id) != 0 &&
- (!force_recovery ||
- vy_run_rebuild_index(run, index->env->path,
- index->space_id, index->id,
- index->cmp_def, index->key_def,
- index->mem_format,
- index->upsert_format,
- &index->opts) != 0)) {
- vy_run_unref(run);
+ }
+ if (slice_info->end != NULL) {
+ end = vy_key_from_msgpack(index->env->key_format,
+ slice_info->end);
+ if (end == NULL)
goto out;
- }
- struct mh_i64ptr_node_t node = { run->id, run };
- if (mh_i64ptr_put(run_hash, &node,
- NULL, NULL) == mh_end(run_hash)) {
- diag_set(OutOfMemory, 0,
- "mh_i64ptr_put", "mh_i64ptr_node_t");
- vy_run_unref(run);
+ }
+ if (begin != NULL && end != NULL &&
+ vy_key_compare(begin, end, index->cmp_def) >= 0) {
+ diag_set(ClientError, ER_INVALID_VYLOG_FILE,
+ tt_sprintf("begin >= end for slice %lld",
+ (long long)slice_info->id));
+ goto out;
+ }
+
+ run = vy_index_recover_run(index, slice_info->run,
+ run_env, force_recovery);
+ if (run == NULL)
+ goto out;
+
+ slice = vy_slice_new(slice_info->id, run, begin, end, index->cmp_def);
+ if (slice == NULL)
+ goto out;
+
+ vy_range_add_slice(range, slice);
+out:
+ if (begin != NULL)
+ tuple_unref(begin);
+ if (end != NULL)
+ tuple_unref(end);
+ return slice;
+}
+
+static struct vy_range *
+vy_index_recover_range(struct vy_index *index,
+ struct vy_range_recovery_info *range_info,
+ struct vy_run_env *run_env, bool force_recovery)
+{
+ struct tuple *begin = NULL, *end = NULL;
+ struct vy_range *range = NULL;
+
+ if (range_info->begin != NULL) {
+ begin = vy_key_from_msgpack(index->env->key_format,
+ range_info->begin);
+ if (begin == NULL)
goto out;
- }
- break;
- case VY_LOG_DROP_RUN:
- break;
- case VY_LOG_INSERT_RANGE:
- assert(record->index_lsn == index->commit_lsn);
- range = vy_range_new(record->range_id, begin, end,
- index->cmp_def);
- if (range == NULL)
+ }
+ if (range_info->end != NULL) {
+ end = vy_key_from_msgpack(index->env->key_format,
+ range_info->end);
+ if (end == NULL)
goto out;
- if (range->begin != NULL && range->end != NULL &&
- vy_key_compare(range->begin, range->end,
- index->cmp_def) >= 0) {
- diag_set(ClientError, ER_INVALID_VYLOG_FILE,
- tt_sprintf("begin >= end for range id %lld",
- (long long)range->id));
+ }
+ if (begin != NULL && end != NULL &&
+ vy_key_compare(begin, end, index->cmp_def) >= 0) {
+ diag_set(ClientError, ER_INVALID_VYLOG_FILE,
+ tt_sprintf("begin >= end for range %lld",
+ (long long)range_info->id));
+ goto out;
+ }
+
+ range = vy_range_new(range_info->id, begin, end, index->cmp_def);
+ if (range == NULL)
+ goto out;
+
+ /*
+ * Newer slices are stored closer to the head of the list,
+ * while we are supposed to add slices in chronological
+ * order, so use reverse iterator.
+ */
+ struct vy_slice_recovery_info *slice_info;
+ rlist_foreach_entry_reverse(slice_info, &range_info->slices, in_range) {
+ if (vy_index_recover_slice(index, range, slice_info,
+ run_env, force_recovery) == NULL) {
vy_range_delete(range);
+ range = NULL;
goto out;
}
- vy_index_add_range(index, range);
- arg->range = range;
- break;
- case VY_LOG_INSERT_SLICE:
- assert(range != NULL);
- assert(range->id == record->range_id);
- mh_int_t k = mh_i64ptr_find(run_hash, record->run_id, NULL);
- assert(k != mh_end(run_hash));
- run = mh_i64ptr_node(run_hash, k)->val;
- slice = vy_slice_new(record->slice_id, run, begin, end,
- index->cmp_def);
- if (slice == NULL)
- goto out;
- vy_range_add_slice(range, slice);
- break;
- default:
- unreachable();
}
- success = true;
+ vy_index_add_range(index, range);
out:
if (begin != NULL)
tuple_unref(begin);
if (end != NULL)
tuple_unref(end);
- return success ? 0 : -1;
+ return range;
}
int
@@ -545,19 +538,6 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
{
assert(index->range_count == 0);
- struct vy_index_recovery_cb_arg arg = {
- .index = index,
- .range = NULL,
- .run_env = run_env,
- .run_hash = NULL,
- .force_recovery = force_recovery,
- };
- arg.run_hash = mh_i64ptr_new();
- if (arg.run_hash == NULL) {
- diag_set(OutOfMemory, 0, "mh_i64ptr_new", "mh_i64ptr_t");
- return -1;
- }
-
/*
* Backward compatibility fixup: historically, we used
* box.info.signature for LSN of index creation, which
@@ -568,39 +548,14 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
if (index->opts.lsn != 0)
lsn = index->opts.lsn;
- int rc = vy_recovery_load_index(recovery, index->space_id, index->id,
- lsn, is_checkpoint_recovery,
- vy_index_recovery_cb, &arg);
-
- mh_int_t k;
- mh_foreach(arg.run_hash, k) {
- struct vy_run *run = mh_i64ptr_node(arg.run_hash, k)->val;
- if (run->refs > 1)
- vy_index_add_run(index, run);
- if (run->refs == 1 && rc == 0) {
- diag_set(ClientError, ER_INVALID_VYLOG_FILE,
- tt_sprintf("Unused run %lld in index %lld",
- (long long)run->id,
- (long long)index->commit_lsn));
- rc = -1;
- /*
- * Continue the loop to unreference
- * all runs in the hash.
- */
- }
- /* Drop the reference held by the hash. */
- vy_run_unref(run);
- }
- mh_i64ptr_delete(arg.run_hash);
-
- if (rc != 0) {
- /* Recovery callback failed. */
- return -1;
- }
-
- if (index->commit_lsn < 0) {
- /* Index was not found in the metadata log. */
- if (is_checkpoint_recovery) {
+ /*
+ * Look up the last incarnation of the index in vylog.
+ */
+ struct vy_index_recovery_info *index_info;
+ index_info = vy_recovery_lookup_index(recovery,
+ index->space_id, index->id);
+ if (is_checkpoint_recovery) {
+ if (index_info == NULL) {
/*
* All indexes created from snapshot rows must
* be present in vylog, because snapshot can
@@ -608,10 +563,21 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
* flushed.
*/
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
- tt_sprintf("Index %lld not found",
- (long long)index->commit_lsn));
+ tt_sprintf("Index %u/%u not found",
+ (unsigned)index->space_id,
+ (unsigned)index->id));
return -1;
}
+ if (lsn > index_info->index_lsn) {
+ /*
+ * The last incarnation of the index was created
+ * before the last checkpoint, load it now.
+ */
+ lsn = index_info->index_lsn;
+ }
+ }
+
+ if (index_info == NULL || lsn > index_info->index_lsn) {
/*
* If we failed to log index creation before restart,
* we won't find it in the log on recovery. This is
@@ -622,15 +588,58 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
return vy_index_init_range_tree(index);
}
- if (index->is_dropped) {
+ index->commit_lsn = lsn;
+
+ if (lsn < index_info->index_lsn || index_info->is_dropped) {
/*
- * Initial range is not stored in the metadata log
- * for dropped indexes, but we need it for recovery.
+ * Loading a past incarnation of the index, i.e.
+ * the index is going to dropped during final
+ * recovery. Mark it as such.
+ */
+ index->is_dropped = true;
+ /*
+ * If the index was dropped, we don't need to replay
+ * truncate (see vinyl_space_prepare_truncate()).
+ */
+ index->truncate_count = UINT64_MAX;
+ /*
+ * We need range tree initialized for all indexes,
+ * even for dropped ones.
*/
return vy_index_init_range_tree(index);
}
/*
+ * Loading the last incarnation of the index from vylog.
+ */
+ index->dump_lsn = index_info->dump_lsn;
+ index->truncate_count = index_info->truncate_count;
+
+ int rc = 0;
+ struct vy_range_recovery_info *range_info;
+ rlist_foreach_entry(range_info, &index_info->ranges, in_index) {
+ if (vy_index_recover_range(index, range_info,
+ run_env, force_recovery) == NULL) {
+ rc = -1;
+ break;
+ }
+ }
+
+ /*
+ * vy_index_recover_run() elevates reference counter
+ * of each recovered run. We need to drop the extra
+ * references once we are done.
+ */
+ struct vy_run *run;
+ rlist_foreach_entry(run, &index->runs, in_index) {
+ assert(run->refs > 1);
+ vy_run_unref(run);
+ }
+
+ if (rc != 0)
+ return -1;
+
+ /*
* Account ranges to the index and check that the range tree
* does not have holes or overlaps.
*/
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index c31a588e..8b95282b 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -115,8 +115,6 @@ static const char *vy_log_type_name[] = {
[VY_LOG_TRUNCATE_INDEX] = "truncate_index",
};
-struct vy_recovery;
-
/** Metadata log object. */
struct vy_log {
/**
@@ -170,111 +168,6 @@ struct vy_log {
};
static struct vy_log vy_log;
-/** Recovery context. */
-struct vy_recovery {
- /** space_id, index_id -> vy_index_recovery_info. */
- struct mh_i64ptr_t *index_id_hash;
- /** index_lsn -> vy_index_recovery_info. */
- struct mh_i64ptr_t *index_lsn_hash;
- /** ID -> vy_range_recovery_info. */
- struct mh_i64ptr_t *range_hash;
- /** ID -> vy_run_recovery_info. */
- struct mh_i64ptr_t *run_hash;
- /** ID -> vy_slice_recovery_info. */
- struct mh_i64ptr_t *slice_hash;
- /**
- * Maximal vinyl object ID, according to the metadata log,
- * or -1 in case no vinyl objects were recovered.
- */
- int64_t max_id;
-};
-
-/** Vinyl index info stored in a recovery context. */
-struct vy_index_recovery_info {
- /** LSN of the index creation. */
- int64_t index_lsn;
- /** Ordinal index number in the space. */
- uint32_t index_id;
- /** Space ID. */
- uint32_t space_id;
- /** Array of key part definitions. */
- struct key_part_def *key_parts;
- /** Number of key parts. */
- uint32_t key_part_count;
- /** True if the index was dropped. */
- bool is_dropped;
- /** LSN of the last index dump. */
- int64_t dump_lsn;
- /** Truncate count. */
- int64_t truncate_count;
- /**
- * List of all ranges in the index, linked by
- * vy_range_recovery_info::in_index.
- */
- struct rlist ranges;
- /**
- * List of all runs created for the index
- * (both committed and not), linked by
- * vy_run_recovery_info::in_index.
- */
- struct rlist runs;
-};
-
-/** Vinyl range info stored in a recovery context. */
-struct vy_range_recovery_info {
- /** Link in vy_index_recovery_info::ranges. */
- struct rlist in_index;
- /** ID of the range. */
- int64_t id;
- /** Start of the range, stored in MsgPack array. */
- char *begin;
- /** End of the range, stored in MsgPack array. */
- char *end;
- /**
- * List of all slices in the range, linked by
- * vy_slice_recovery_info::in_range.
- *
- * Newer slices are closer to the head.
- */
- struct rlist slices;
-};
-
-/** Run info stored in a recovery context. */
-struct vy_run_recovery_info {
- /** Link in vy_index_recovery_info::runs. */
- struct rlist in_index;
- /** ID of the run. */
- int64_t id;
- /** Max LSN stored on disk. */
- int64_t dump_lsn;
- /**
- * For deleted runs: LSN of the last checkpoint
- * that uses this run.
- */
- int64_t gc_lsn;
- /**
- * True if the run was not committed (there's
- * VY_LOG_PREPARE_RUN, but no VY_LOG_CREATE_RUN).
- */
- bool is_incomplete;
- /** True if the run was dropped (VY_LOG_DROP_RUN). */
- bool is_dropped;
-};
-
-/** Slice info stored in a recovery context. */
-struct vy_slice_recovery_info {
- /** Link in vy_range_recovery_info::slices. */
- struct rlist in_range;
- /** ID of the slice. */
- int64_t id;
- /** Run this slice was created for. */
- struct vy_run_recovery_info *run;
- /** Start of the slice, stored in MsgPack array. */
- char *begin;
- /** End of the slice, stored in MsgPack array. */
- char *end;
-};
-
static struct vy_recovery *
vy_recovery_new_locked(int64_t signature, bool only_checkpoint);
@@ -977,91 +870,6 @@ vy_log_end_recovery(void)
return 0;
}
-/** Argument passed to vy_log_rotate_cb_func(). */
-struct vy_log_rotate_cb_arg {
- struct xdir *dir;
- struct xlog *xlog;
- const struct vclock *vclock;
-};
-
-/** Callback passed to vy_recovery_iterate() for log rotation. */
-static int
-vy_log_rotate_cb_func(const struct vy_log_record *record, void *cb_arg)
-{
- struct vy_log_rotate_cb_arg *arg = cb_arg;
- struct xlog *xlog = arg->xlog;
- struct xrow_header row;
-
- say_verbose("save vylog record: %s", vy_log_record_str(record));
-
- /* Create the log file on the first write. */
- if (!xlog_is_open(xlog) &&
- xdir_create_xlog(arg->dir, xlog, arg->vclock) < 0)
- return -1;
-
- if (vy_log_record_encode(record, &row) < 0 ||
- xlog_write_row(xlog, &row) < 0)
- return -1;
- return 0;
-}
-
-/**
- * Create an vy_log file from a recovery context.
- */
-static int
-vy_log_create(const struct vclock *vclock, struct vy_recovery *recovery)
-{
- /*
- * Only create the log file if we have something
- * to write to it.
- */
- struct xlog xlog;
- xlog_clear(&xlog);
-
- say_verbose("saving vylog %lld", (long long)vclock_sum(vclock));
-
- struct vy_log_rotate_cb_arg arg = {
- .xlog = &xlog,
- .dir = &vy_log.dir,
- .vclock = vclock,
- };
- if (vy_recovery_iterate(recovery, vy_log_rotate_cb_func, &arg) < 0)
- goto err_write_xlog;
-
- if (!xlog_is_open(&xlog))
- goto done; /* nothing written */
-
- /* Mark the end of the snapshot. */
- struct xrow_header row;
- struct vy_log_record record;
- vy_log_record_init(&record);
- record.type = VY_LOG_SNAPSHOT;
- if (vy_log_record_encode(&record, &row) < 0 ||
- xlog_write_row(&xlog, &row) < 0)
- goto err_write_xlog;
-
- /* Finalize the new xlog. */
- if (xlog_flush(&xlog) < 0 ||
- xlog_sync(&xlog) < 0 ||
- xlog_rename(&xlog) < 0)
- goto err_write_xlog;
-
- xlog_close(&xlog, false);
-done:
- say_verbose("done saving vylog");
- return 0;
-
-err_write_xlog:
- /* Delete the unfinished xlog. */
- if (xlog_is_open(&xlog)) {
- if (unlink(xlog.filename) < 0)
- say_syserror("failed to delete file '%s'",
- xlog.filename);
- xlog_close(&xlog, false);
- }
- return -1;
-}
-
static ssize_t
vy_log_rotate_f(va_list ap)
{
@@ -1285,9 +1093,9 @@ vy_recovery_index_id_hash(uint32_t space_id, uint32_t index_id)
}
/** Lookup a vinyl index in vy_recovery::index_id_hash map. */
-static struct vy_index_recovery_info *
-vy_recovery_lookup_index_by_id(struct vy_recovery *recovery,
- uint32_t space_id, uint32_t index_id)
+struct vy_index_recovery_info *
+vy_recovery_lookup_index(struct vy_recovery *recovery,
+ uint32_t space_id, uint32_t index_id)
{
int64_t key = vy_recovery_index_id_hash(space_id, index_id);
struct mh_i64ptr_t *h = recovery->index_id_hash;
@@ -1412,6 +1220,7 @@ vy_recovery_create_index(struct vy_recovery *recovery, int64_t index_lsn,
free(index);
return -1;
}
+ rlist_add_entry(&recovery->indexes, index, in_recovery);
} else {
/*
* The index was dropped and recreated with the
@@ -1583,6 +1392,7 @@ vy_recovery_do_create_run(struct vy_recovery *recovery, int64_t run_id)
run->gc_lsn = -1;
run->is_incomplete = false;
run->is_dropped = false;
+ run->data = NULL;
rlist_create(&run->in_index);
if (recovery->max_id < run_id)
recovery->max_id = run_id;
@@ -2024,6 +1834,7 @@ vy_recovery_new_f(va_list ap)
goto fail;
}
+ rlist_create(&recovery->indexes);
recovery->index_id_hash = NULL;
recovery->index_lsn_hash = NULL;
recovery->range_hash = NULL;
@@ -2176,9 +1987,23 @@ vy_recovery_delete(struct vy_recovery *recovery)
free(recovery);
}
+/** Write a record to vylog. */
+static int
+vy_log_append_record(struct xlog *xlog, struct vy_log_record *record)
+{
+ say_verbose("save vylog record: %s", vy_log_record_str(record));
+
+ struct xrow_header row;
+ if (vy_log_record_encode(record, &row) < 0)
+ return -1;
+ if (xlog_write_row(xlog, &row) < 0)
+ return -1;
+ return 0;
+}
+
+/** Write all records corresponding to an index to vylog. */
static int
-vy_recovery_iterate_index(struct vy_index_recovery_info *index,
- vy_recovery_cb cb, void *cb_arg)
+vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
{
struct vy_range_recovery_info *range;
struct vy_slice_recovery_info *slice;
@@ -2192,7 +2017,7 @@ vy_recovery_iterate_index(struct vy_index_recovery_info *index,
record.space_id = index->space_id;
record.key_parts = index->key_parts;
record.key_part_count = index->key_part_count;
- if (cb(&record, cb_arg) != 0)
+ if (vy_log_append_record(xlog, &record) != 0)
return -1;
if (index->truncate_count > 0) {
@@ -2200,7 +2025,7 @@ vy_recovery_iterate_index(struct vy_index_recovery_info *index,
record.type = VY_LOG_TRUNCATE_INDEX;
record.index_lsn = index->index_lsn;
record.truncate_count = index->truncate_count;
- if (cb(&record, cb_arg) != 0)
+ if (vy_log_append_record(xlog, &record) != 0)
return -1;
}
@@ -2209,7 +2034,7 @@ vy_recovery_iterate_index(struct vy_index_recovery_info *index,
record.type = VY_LOG_DUMP_INDEX;
record.index_lsn = index->index_lsn;
record.dump_lsn = index->dump_lsn;
- if (cb(&record, cb_arg) != 0)
+ if (vy_log_append_record(xlog, &record) != 0)
return -1;
}
@@ -2223,8 +2048,7 @@ vy_recovery_iterate_index(struct vy_index_recovery_info *index,
}
record.index_lsn = index->index_lsn;
record.run_id = run->id;
- record.is_dropped = run->is_dropped;
- if (cb(&record, cb_arg) != 0)
+ if (vy_log_append_record(xlog, &record) != 0)
return -1;
if (!run->is_dropped)
@@ -2234,7 +2058,7 @@ vy_recovery_iterate_index(struct vy_index_recovery_info *index,
record.type = VY_LOG_DROP_RUN;
record.run_id = run->id;
record.gc_lsn = run->gc_lsn;
- if (cb(&record, cb_arg) != 0)
+ if (vy_log_append_record(xlog, &record) != 0)
return -1;
}
@@ -2245,7 +2069,7 @@ vy_recovery_iterate_index(struct vy_index_recovery_info *index,
record.range_id = range->id;
record.begin = range->begin;
record.end = range->end;
- if (cb(&record, cb_arg) != 0)
+ if (vy_log_append_record(xlog, &record) != 0)
return -1;
/*
* Newer slices are stored closer to the head of the list,
@@ -2260,7 +2084,7 @@ vy_recovery_iterate_index(struct vy_index_recovery_info *index,
record.run_id = slice->run->id;
record.begin = slice->begin;
record.end = slice->end;
- if (cb(&record, cb_arg) != 0)
+ if (vy_log_append_record(xlog, &record) != 0)
return -1;
}
}
@@ -2269,16 +2093,25 @@ vy_recovery_iterate_index(struct vy_index_recovery_info *index,
vy_log_record_init(&record);
record.type = VY_LOG_DROP_INDEX;
record.index_lsn = index->index_lsn;
- if (cb(&record, cb_arg) != 0)
+ if (vy_log_append_record(xlog, &record) != 0)
return -1;
}
return 0;
}
-int
-vy_recovery_iterate(struct vy_recovery *recovery,
- vy_recovery_cb cb, void *cb_arg)
+/** Create vylog from a recovery context. */
+static int
+vy_log_create(const struct vclock *vclock, struct vy_recovery *recovery)
{
+ say_verbose("saving vylog %lld", (long long)vclock_sum(vclock));
+
+ /*
+ * Only create the log file if we have something
+ * to write to it.
+ */
+ struct xlog xlog;
+ xlog_clear(&xlog);
+
mh_int_t i;
mh_foreach(recovery->index_id_hash, i) {
struct vy_index_recovery_info *index;
@@ -2290,59 +2123,44 @@ vy_recovery_iterate(struct vy_recovery *recovery,
*/
if (index->is_dropped && rlist_empty(&index->runs))
continue;
- if (vy_recovery_iterate_index(index, cb, cb_arg) < 0)
- return -1;
+
+ /* Create the log file on the first write. */
+ if (!xlog_is_open(&xlog) &&
+ xdir_create_xlog(&vy_log.dir, &xlog, vclock) != 0)
+ goto err_create_xlog;
+
+ if (vy_log_append_index(&xlog, index) != 0)
+ goto err_write_xlog;
}
+ if (!xlog_is_open(&xlog))
+ goto done; /* nothing written */
+
+ /* Mark the end of the snapshot. */
+ struct vy_log_record record;
+ vy_log_record_init(&record);
+ record.type = VY_LOG_SNAPSHOT;
+ if (vy_log_append_record(&xlog, &record) != 0)
+ goto err_write_xlog;
+
+ /* Finalize the new xlog. */
+ if (xlog_flush(&xlog) < 0 ||
+ xlog_sync(&xlog) < 0 ||
+ xlog_rename(&xlog) < 0)
+ goto err_write_xlog;
+
+ xlog_close(&xlog, false);
+done:
+ say_verbose("done saving vylog");
return 0;
-}
-int
-vy_recovery_load_index(struct vy_recovery *recovery,
- uint32_t space_id, uint32_t index_id,
- int64_t index_lsn, bool is_checkpoint_recovery,
- vy_recovery_cb cb, void *cb_arg)
-{
- struct vy_index_recovery_info *index;
- index = vy_recovery_lookup_index_by_id(recovery, space_id, index_id);
- if (index == NULL)
- return 0;
- /* See the comment to the function declaration. */
- if (index_lsn < index->index_lsn) {
- /*
- * Loading a past incarnation of the index.
- * Emit create/drop records to indicate that
- * it is going to be dropped by a WAL statement
- * and hence doesn't need to be recovered.
- */
- struct vy_log_record record;
- vy_log_record_init(&record);
- record.type = VY_LOG_CREATE_INDEX;
- record.index_id = index->index_id;
- record.space_id = index->space_id;
- record.index_lsn = index_lsn;
- if (cb(&record, cb_arg) != 0)
- return -1;
- vy_log_record_init(&record);
- record.type = VY_LOG_DROP_INDEX;
- record.index_lsn = index_lsn;
- if (cb(&record, cb_arg) != 0)
- return -1;
- return 0;
- } else if (is_checkpoint_recovery || index_lsn == index->index_lsn) {
- /*
- * Loading the last incarnation of the index.
- * Replay all records we have recovered from
- * the log for this index.
- */
- return vy_recovery_iterate_index(index, cb, cb_arg);
- } else {
- /*
- * The requested incarnation is missing in the recovery
- * context, because we failed to log it before restart.
- * Do nothing and let the caller retry logging.
- */
- assert(!is_checkpoint_recovery);
- assert(index_lsn > index->index_lsn);
- return 0;
- }
+err_write_xlog:
+ /* Delete the unfinished xlog. */
+ assert(xlog_is_open(&xlog));
+ if (unlink(xlog.filename) < 0)
+ say_syserror("failed to delete file '%s'",
+ xlog.filename);
+ xlog_close(&xlog, false);
+
+err_create_xlog:
+ return -1;
}
diff --git a/src/box/vy_log.h b/src/box/vy_log.h
index f17b122a..8fbacd0f 100644
--- a/src/box/vy_log.h
+++ b/src/box/vy_log.h
@@ -34,6 +34,7 @@
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
+#include <small/rlist.h>
#include "salad/stailq.h"
@@ -58,8 +59,7 @@ struct xlog;
struct vclock;
struct key_def;
struct key_part_def;
-
-struct vy_recovery;
+struct mh_i64ptr_t;
/** Type of a metadata log record. */
enum vy_log_record_type {
@@ -175,12 +175,6 @@ struct vy_log_record {
/** Unique ID of the run slice. */
int64_t slice_id;
/**
- * For VY_LOG_CREATE_RUN record: hint that the run
- * is dropped, i.e. there is a VY_LOG_DROP_RUN record
- * following this one.
- */
- bool is_dropped;
- /**
* Msgpack key for start of the range/slice.
* NULL if the range/slice starts from -inf.
*/
@@ -213,6 +207,127 @@ struct vy_log_record {
struct stailq_entry in_tx;
};
+/** Recovery context. */
+struct vy_recovery {
+ /**
+ * List of all indexes stored in the recovery context,
+ * linked by vy_index_recovery_info::in_recovery.
+ */
+ struct rlist indexes;
+ /** space_id, index_id -> vy_index_recovery_info. */
+ struct mh_i64ptr_t *index_id_hash;
+ /** index_lsn -> vy_index_recovery_info. */
+ struct mh_i64ptr_t *index_lsn_hash;
+ /** ID -> vy_range_recovery_info. */
+ struct mh_i64ptr_t *range_hash;
+ /** ID -> vy_run_recovery_info. */
+ struct mh_i64ptr_t *run_hash;
+ /** ID -> vy_slice_recovery_info. */
+ struct mh_i64ptr_t *slice_hash;
+ /**
+ * Maximal vinyl object ID, according to the metadata log,
+ * or -1 in case no vinyl objects were recovered.
+ */
+ int64_t max_id;
+};
+
+/** Vinyl index info stored in a recovery context. */
+struct vy_index_recovery_info {
+ /** Link in vy_recovery::indexes. */
+ struct rlist in_recovery;
+ /** LSN of the index creation. */
+ int64_t index_lsn;
+ /** Ordinal index number in the space. */
+ uint32_t index_id;
+ /** Space ID. */
+ uint32_t space_id;
+ /** Array of key part definitions. */
+ struct key_part_def *key_parts;
+ /** Number of key parts. */
+ uint32_t key_part_count;
+ /** True if the index was dropped. */
+ bool is_dropped;
+ /** LSN of the last index dump. */
+ int64_t dump_lsn;
+ /** Truncate count. */
+ int64_t truncate_count;
+ /**
+ * List of all ranges in the index, linked by
+ * vy_range_recovery_info::in_index.
+ */
+ struct rlist ranges;
+ /**
+ * List of all runs created for the index
+ * (both committed and not), linked by
+ * vy_run_recovery_info::in_index.
+ */
+ struct rlist runs;
+};
+
+/** Vinyl range info stored in a recovery context. */
+struct vy_range_recovery_info {
+ /** Link in vy_index_recovery_info::ranges. */
+ struct rlist in_index;
+ /** ID of the range. */
+ int64_t id;
+ /** Start of the range, stored in MsgPack array. */
+ char *begin;
+ /** End of the range, stored in MsgPack array. */
+ char *end;
+ /**
+ * List of all slices in the range, linked by
+ * vy_slice_recovery_info::in_range.
+ *
+ * Newer slices are closer to the head.
+ */
+ struct rlist slices;
+};
+
+/** Run info stored in a recovery context. */
+struct vy_run_recovery_info {
+ /** Link in vy_index_recovery_info::runs. */
+ struct rlist in_index;
+ /** ID of the run. */
+ int64_t id;
+ /** Max LSN stored on disk. */
+ int64_t dump_lsn;
+ /**
+ * For deleted runs: LSN of the last checkpoint
+ * that uses this run.
+ */
+ int64_t gc_lsn;
+ /**
+ * True if the run was not committed (there's
+ * VY_LOG_PREPARE_RUN, but no VY_LOG_CREATE_RUN).
+ */
+ bool is_incomplete;
+ /** True if the run was dropped (VY_LOG_DROP_RUN). */
+ bool is_dropped;
+ /*
+ * The following field is initialized to NULL and
+ * ignored by vy_log subsystem. It may be used by
+ * the caller to store some extra information.
+ *
+ * During recovery, we store a pointer to vy_run
+ * corresponding to this object.
+ */
+ void *data;
+};
+
+/** Slice info stored in a recovery context. */
+struct vy_slice_recovery_info {
+ /** Link in vy_range_recovery_info::slices. */
+ struct rlist in_range;
+ /** ID of the slice. */
+ int64_t id;
+ /** Run this slice was created for. */
+ struct vy_run_recovery_info *run;
+ /** Start of the slice, stored in MsgPack array. */
+ char *begin;
+ /** End of the slice, stored in MsgPack array. */
+ char *end;
+};
+
/**
* Initialize the metadata log.
* @dir is the directory where log files are stored.
@@ -359,70 +474,14 @@ vy_recovery_new(int64_t signature, bool only_checkpoint);
void
vy_recovery_delete(struct vy_recovery *recovery);
-typedef int
-(*vy_recovery_cb)(const struct vy_log_record *record, void *arg);
-
-/**
- * Iterate over all objects stored in a recovery context.
- *
- * This function invokes callback @cb for each object (index, run, etc)
- * stored in the given recovery context. The callback is passed a record
- * used to log the object and optional argument @cb_arg. If the callback
- * returns a value different from 0, iteration stops and -1 is returned,
- * otherwise the function returns 0.
- *
- * To ease the work done by the callback, records corresponding to
- * slices of a range always go right after the range, in the
- * chronological order, while an index's runs go after the index
- * and before its ranges.
- */
-int
-vy_recovery_iterate(struct vy_recovery *recovery,
- vy_recovery_cb cb, void *cb_arg);
-
/**
- * Load an index from a recovery context.
- *
- * Call @cb for each object related to the index. Break the loop and
- * return -1 if @cb returned a non-zero value, otherwise return 0.
- * Objects are loaded in the same order as by vy_recovery_iterate().
+ * Look up the last incarnation of an index stored in a recovery context.
*
- * Note, this function returns 0 if there's no index with the requested
- * id in the recovery context. In this case, @cb isn't called at all.
- *
- * The @is_checkpoint_recovery flag indicates that the row that created
- * the index was loaded from a snapshot, in which case @index_lsn is
- * the snapshot signature. Otherwise @index_lsn is the LSN of the WAL
- * row that created the index.
- *
- * The index is looked up by @space_id and @index_id while @index_lsn
- * is used to discern different incarnations of the same index as
- * follows. Let @record denote the vylog record corresponding to the
- * last incarnation of the index. Then
- *
- * - If @is_checkpoint_recovery is set and @index_lsn >= @record->index_lsn,
- * the last index incarnation was created before the snapshot and we
- * need to load it right now.
- *
- * - If @is_checkpoint_recovery is set and @index_lsn < @record->index_lsn,
- * the last index incarnation was created after the snapshot, i.e.
- * the index loaded now is going to be dropped so load a dummy.
- *
- * - If @is_checkpoint_recovery is unset and @index_lsn < @record->index_lsn,
- * the last index incarnation is created further in WAL, load a dummy.
- *
- * - If @is_checkpoint_recovery is unset and @index_lsn == @record->index_lsn,
- * load the last index incarnation.
- *
- * - If @is_checkpoint_recovery is unset and @index_lsn > @record->index_lsn,
- * it seems we failed to log index creation before restart. In this
- * case don't do anything. The caller is supposed to retry logging.
+ * Returns NULL if the index was not found.
*/
-int
-vy_recovery_load_index(struct vy_recovery *recovery,
- uint32_t space_id, uint32_t index_id,
- int64_t index_lsn, bool is_checkpoint_recovery,
- vy_recovery_cb cb, void *cb_arg);
+struct vy_index_recovery_info *
+vy_recovery_lookup_index(struct vy_recovery *recovery,
+ uint32_t space_id, uint32_t index_id);
/**
* Initialize a log record with default values.
diff --git a/test/unit/vy_log_stub.c b/test/unit/vy_log_stub.c
index daabf3f9..1fda0a6b 100644
--- a/test/unit/vy_log_stub.c
+++ b/test/unit/vy_log_stub.c
@@ -51,11 +51,9 @@ vy_log_tx_commit(void)
void
vy_log_write(const struct vy_log_record *record) {}
-int
-vy_recovery_load_index(struct vy_recovery *recovery,
- uint32_t space_id, uint32_t index_id,
- int64_t index_lsn, bool snapshot_recovery,
- vy_recovery_cb cb, void *cb_arg)
+struct vy_index_recovery_info *
+vy_recovery_lookup_index(struct vy_recovery *recovery,
+ uint32_t space_id, uint32_t index_id)
{
unreachable();
}
diff --git a/test/vinyl/layout.result b/test/vinyl/layout.result
index 5c78babf..603d2865 100644
--- a/test/vinyl/layout.result
+++ b/test/vinyl/layout.result
@@ -189,12 +189,12 @@ result
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [7, {2: 3}]
+ tuple: [7, {2: 2}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [7, {2: 2}]
+ tuple: [7, {2: 3}]
- HEADER:
timestamp: <timestamp>
type: INSERT
--
2.11.0
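
For readers following along: patch 1 links every index into a list on
the recovery context, so callers can walk recovered objects directly
instead of passing a callback. Below is a minimal, self-contained
sketch of that style. It does not use the real rlist or vy_recovery
types; recovery_ctx, index_info and both helper functions are
hypothetical names made up for illustration. The skip condition mirrors
the one in vy_log_create() (dropped indexes with no runs are not
written out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for vy_index_recovery_info. */
struct index_info {
	/* Link in recovery_ctx::indexes. */
	struct index_info *next;
	/* True if the index was dropped. */
	bool is_dropped;
	/* Stands in for the list of runs kept by the real structure. */
	int run_count;
};

/* Hypothetical, simplified stand-in for vy_recovery. */
struct recovery_ctx {
	/* Head of the list of all recovered indexes. */
	struct index_info *indexes;
};

/* Link an index into the context (added at the head of the list). */
static void
recovery_ctx_add(struct recovery_ctx *ctx, struct index_info *index)
{
	assert(index->next == NULL);
	index->next = ctx->indexes;
	ctx->indexes = index;
}

/*
 * Count the indexes that would be written to a new log file.
 * Dropped indexes that have no runs left are skipped.
 */
static int
recovery_ctx_count_live(const struct recovery_ctx *ctx)
{
	int count = 0;
	const struct index_info *index;
	for (index = ctx->indexes; index != NULL; index = index->next) {
		if (index->is_dropped && index->run_count == 0)
			continue;
		count++;
	}
	return count;
}
```

With plain iteration the caller keeps its own state in local variables
and can break out or skip entries with ordinary control flow, which is
what makes purging incomplete indexes after restart easier to express.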
* [PATCH v2 2/5] vinyl: rename vy_index::id to index_id
2018-03-20 11:29 [PATCH v2 0/5] Prepare vylog for space alter Vladimir Davydov
2018-03-20 11:29 ` [PATCH v2 1/5] vinyl: refactor vylog recovery Vladimir Davydov
@ 2018-03-20 11:29 ` Vladimir Davydov
2018-03-20 11:29 ` [PATCH v2 3/5] vinyl: rename vy_log_record::index_id/space_id to index_def_id/space_def_id Vladimir Davydov
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Vladimir Davydov @ 2018-03-20 11:29 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
Throughout Vinyl we use the name 'id' for members that hold unique
object identifiers: vy_slice::id, vy_run::id, vy_range::id. There is
one exception: vy_index::id is the ordinal number of the index in a
space, which is confusing. Besides, I'm planning to assign a unique id
to each vinyl index so that it can be looked up in vylog, and I'd like
to call the new member 'id' for consistency. So let's rename
vy_index::id to index_id.
---
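Not part of the commit message: a small sketch of why the ordinal
number and a unique id need separate names. The struct and helpers are
hypothetical (demo_index is not the real vy_index, and the 'id' member
is only the planned one); the point is that two incarnations of an
index can share space_id/index_id while being different objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down index descriptor (not the real vy_index). */
struct demo_index {
	/* Unique object id, assumed to be never reused. */
	int64_t id;
	/* ID of the space the index belongs to. */
	uint32_t space_id;
	/* Ordinal number of the index in the space; 0 is the primary key. */
	uint32_t index_id;
};

/* True if both descriptors occupy the same position in the same space. */
static bool
demo_index_same_slot(const struct demo_index *a, const struct demo_index *b)
{
	return a->space_id == b->space_id && a->index_id == b->index_id;
}

/* True if both descriptors refer to the very same index object. */
static bool
demo_index_same_object(const struct demo_index *a, const struct demo_index *b)
{
	return a->id == b->id;
}
```

For example, an old and a rebuilt primary key of one space would
compare equal with demo_index_same_slot() but not with
demo_index_same_object().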
src/box/vinyl.c | 38 ++++++++++++++++++++------------------
src/box/vy_index.c | 18 +++++++++---------
src/box/vy_index.h | 4 ++--
src/box/vy_point_lookup.c | 2 +-
src/box/vy_read_iterator.c | 5 +++--
src/box/vy_scheduler.c | 19 ++++++++++---------
src/box/vy_tx.c | 4 ++--
test/unit/vy_point_lookup.c | 6 +++---
8 files changed, 50 insertions(+), 46 deletions(-)
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index d3659b0b..af3621df 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -830,7 +830,7 @@ vinyl_index_commit_create(struct index *base, int64_t lsn)
* recovery.
*/
vy_log_tx_begin();
- vy_log_create_index(index->commit_lsn, index->id,
+ vy_log_create_index(index->commit_lsn, index->index_id,
index->space_id, index->key_def);
vy_log_insert_range(index->commit_lsn, range->id, NULL, NULL);
vy_log_tx_try_commit();
@@ -1305,7 +1305,7 @@ vinyl_index_bsize(struct index *base)
struct vy_index *index = vy_index(base);
ssize_t bsize = vy_index_mem_tree_size(index) +
index->page_index_size + index->bloom_size;
- if (index->id > 0)
+ if (index->index_id > 0)
bsize += index->stat.disk.count.bytes;
return bsize;
}
@@ -1425,7 +1425,8 @@ vy_check_is_unique(struct vy_env *env, struct vy_tx *tx, struct space *space,
if (found) {
tuple_unref(found);
diag_set(ClientError, ER_TUPLE_FOUND,
- index_name_by_id(space, index->id), space_name(space));
+ index_name_by_id(space, index->index_id),
+ space_name(space));
return -1;
}
return 0;
@@ -1448,7 +1449,7 @@ vy_insert_primary(struct vy_env *env, struct vy_tx *tx, struct space *space,
{
assert(vy_stmt_type(stmt) == IPROTO_INSERT);
assert(tx != NULL && tx->state == VINYL_TX_READY);
- assert(pk->id == 0);
+ assert(pk->index_id == 0);
/*
* A primary index is always unique and the new tuple must not
* conflict with existing tuples.
@@ -1477,7 +1478,7 @@ vy_insert_secondary(struct vy_env *env, struct vy_tx *tx, struct space *space,
assert(vy_stmt_type(stmt) == IPROTO_INSERT ||
vy_stmt_type(stmt) == IPROTO_REPLACE);
assert(tx != NULL && tx->state == VINYL_TX_READY);
- assert(index->id > 0);
+ assert(index->index_id > 0);
/*
* If the index is unique then the new tuple must not
* conflict with existing tuples. If the index is not
@@ -1529,7 +1530,7 @@ vy_replace_one(struct vy_env *env, struct vy_tx *tx, struct space *space,
(void)env;
assert(tx != NULL && tx->state == VINYL_TX_READY);
struct vy_index *pk = vy_index(space->index[0]);
- assert(pk->id == 0);
+ assert(pk->index_id == 0);
if (tuple_validate_raw(pk->mem_format, request->tuple))
return -1;
struct tuple *new_tuple =
@@ -1589,7 +1590,7 @@ vy_replace_impl(struct vy_env *env, struct vy_tx *tx, struct space *space,
return -1;
/* Primary key is dumped last. */
assert(!vy_is_committed_one(env, space, pk));
- assert(pk->id == 0);
+ assert(pk->index_id == 0);
if (tuple_validate_raw(pk->mem_format, request->tuple))
return -1;
new_stmt = vy_stmt_new_replace(pk->mem_format, request->tuple,
@@ -1729,7 +1730,7 @@ vy_index_full_by_key(struct vy_index *index, struct vy_tx *tx,
tuple_unref(key);
if (rc != 0)
return -1;
- if (index->id == 0 || found == NULL) {
+ if (index->index_id == 0 || found == NULL) {
*result = found;
return 0;
}
@@ -1835,7 +1836,7 @@ vy_delete(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
assert(stmt->old_tuple != NULL);
return vy_delete_impl(env, tx, space, stmt->old_tuple);
} else { /* Primary is the single index in the space. */
- assert(index->id == 0);
+ assert(index->index_id == 0);
struct tuple *delete =
vy_stmt_new_surrogate_delete_from_key(request->key,
pk->key_def,
@@ -1874,7 +1875,8 @@ vy_check_update(struct space *space, const struct vy_index *pk,
if (!key_update_can_be_skipped(pk->key_def->column_mask, column_mask) &&
vy_tuple_compare(old_tuple, new_tuple, pk->key_def) != 0) {
diag_set(ClientError, ER_CANT_UPDATE_PRIMARY_KEY,
- index_name_by_id(space, pk->id), space_name(space));
+ index_name_by_id(space, pk->index_id),
+ space_name(space));
return -1;
}
return 0;
@@ -1918,7 +1920,7 @@ vy_update(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
/* Apply update operations. */
struct vy_index *pk = vy_index(space->index[0]);
assert(pk != NULL);
- assert(pk->id == 0);
+ assert(pk->index_id == 0);
/* Primary key is dumped last. */
assert(!vy_is_committed_one(env, space, pk));
uint64_t column_mask = 0;
@@ -2007,7 +2009,7 @@ vy_insert_first_upsert(struct vy_env *env, struct vy_tx *tx,
assert(space->index_count > 0);
assert(vy_stmt_type(stmt) == IPROTO_INSERT);
struct vy_index *pk = vy_index(space->index[0]);
- assert(pk->id == 0);
+ assert(pk->index_id == 0);
if (vy_tx_set(tx, pk, stmt) != 0)
return -1;
struct vy_index *index;
@@ -2292,7 +2294,7 @@ vy_insert(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
if (pk == NULL)
/* The space hasn't the primary index. */
return -1;
- assert(pk->id == 0);
+ assert(pk->index_id == 0);
/* Primary key is dumped last. */
assert(!vy_is_committed_one(env, space, pk));
if (tuple_validate_raw(pk->mem_format, request->tuple))
@@ -3571,7 +3573,7 @@ vy_squash_process(struct vy_squash *squash)
struct key_def *def = index->cmp_def;
/* Upserts enabled only in the primary index. */
- assert(index->id == 0);
+ assert(index->index_id == 0);
/*
* Use the committed read view to avoid squashing
@@ -3666,7 +3668,7 @@ vy_squash_process(struct vy_squash *squash)
tuple_unref(result);
return 0;
}
- assert(index->id == 0);
+ assert(index->index_id == 0);
struct tuple *applied =
vy_apply_upsert(mem_stmt, result, def, mem->format,
mem->upsert_format, true);
@@ -3860,7 +3862,7 @@ vinyl_iterator_primary_next(struct iterator *base, struct tuple **ret)
{
assert(base->next = vinyl_iterator_primary_next);
struct vinyl_iterator *it = (struct vinyl_iterator *)base;
- assert(it->index->id == 0);
+ assert(it->index->index_id == 0);
struct tuple *tuple;
if (it->tx == NULL) {
@@ -3894,7 +3896,7 @@ vinyl_iterator_secondary_next(struct iterator *base, struct tuple **ret)
{
assert(base->next = vinyl_iterator_secondary_next);
struct vinyl_iterator *it = (struct vinyl_iterator *)base;
- assert(it->index->id > 0);
+ assert(it->index->index_id > 0);
struct tuple *tuple;
if (it->tx == NULL) {
@@ -3977,7 +3979,7 @@ vinyl_index_create_iterator(struct index *base, enum iterator_type type,
}
iterator_create(&it->base, base);
- if (index->id == 0)
+ if (index->index_id == 0)
it->base.next = vinyl_iterator_primary_next;
else
it->base.next = vinyl_iterator_secondary_next;
diff --git a/src/box/vy_index.c b/src/box/vy_index.c
index 9c199ddd..e4f567e2 100644
--- a/src/box/vy_index.c
+++ b/src/box/vy_index.c
@@ -62,7 +62,7 @@ vy_index_validate_formats(const struct vy_index *index)
assert(index->upsert_format != NULL);
uint32_t index_field_count = index->mem_format->index_field_count;
(void) index_field_count;
- if (index->id == 0) {
+ if (index->index_id == 0) {
assert(index->disk_format == index->mem_format);
assert(index->disk_format->index_field_count ==
index_field_count);
@@ -115,7 +115,7 @@ vy_index_name(struct vy_index *index)
{
char *buf = tt_static_buf();
snprintf(buf, TT_STATIC_BUF_LEN, "%u/%u",
- (unsigned)index->space_id, (unsigned)index->id);
+ (unsigned)index->space_id, (unsigned)index->index_id);
return buf;
}
@@ -236,7 +236,7 @@ vy_index_new(struct vy_index_env *index_env, struct vy_cache_env *cache_env,
index->in_dump.pos = UINT32_MAX;
index->in_compact.pos = UINT32_MAX;
index->space_id = index_def->space_id;
- index->id = index_def->iid;
+ index->index_id = index_def->iid;
index->opts = index_def->opts;
index->check_is_unique = index->opts.is_unique;
vy_index_read_set_new(&index->read_set);
@@ -355,7 +355,7 @@ vy_index_create(struct vy_index *index)
int rc;
char path[PATH_MAX];
vy_index_snprint_path(path, sizeof(path), index->env->path,
- index->space_id, index->id);
+ index->space_id, index->index_id);
char *path_sep = path;
while (*path_sep == '/') {
/* Don't create root */
@@ -404,10 +404,10 @@ vy_index_recover_run(struct vy_index *index,
run->dump_lsn = run_info->dump_lsn;
if (vy_run_recover(run, index->env->path,
- index->space_id, index->id) != 0 &&
+ index->space_id, index->index_id) != 0 &&
(!force_recovery ||
vy_run_rebuild_index(run, index->env->path,
- index->space_id, index->id,
+ index->space_id, index->index_id,
index->cmp_def, index->key_def,
index->mem_format, index->upsert_format,
&index->opts) != 0)) {
@@ -553,7 +553,7 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
*/
struct vy_index_recovery_info *index_info;
index_info = vy_recovery_lookup_index(recovery,
- index->space_id, index->id);
+ index->space_id, index->index_id);
if (is_checkpoint_recovery) {
if (index_info == NULL) {
/*
@@ -565,7 +565,7 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Index %u/%u not found",
(unsigned)index->space_id,
- (unsigned)index->id));
+ (unsigned)index->index_id));
return -1;
}
if (lsn > index_info->index_lsn) {
@@ -850,7 +850,7 @@ vy_index_commit_upsert(struct vy_index *index, struct vy_mem *mem,
* UPSERT is enabled only for the spaces with the single
* index.
*/
- assert(index->id == 0);
+ assert(index->index_id == 0);
const struct tuple *older;
int64_t lsn = vy_stmt_lsn(stmt);
diff --git a/src/box/vy_index.h b/src/box/vy_index.h
index 4368f6a5..7a5a8aa8 100644
--- a/src/box/vy_index.h
+++ b/src/box/vy_index.h
@@ -148,8 +148,8 @@ struct vy_index {
* until all pending operations have completed.
*/
int refs;
- /** Index ID visible to the user. */
- uint32_t id;
+ /** Ordinal index number in the index array. */
+ uint32_t index_id;
/** ID of the space this index belongs to. */
uint32_t space_id;
/** Index options. */
diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c
index ab0bc6b8..d92cb94f 100644
--- a/src/box/vy_point_lookup.c
+++ b/src/box/vy_point_lookup.c
@@ -267,7 +267,7 @@ vy_point_lookup_scan_slice(struct vy_index *index, struct vy_slice *slice,
vy_run_iterator_open(&run_itr, &index->stat.disk.iterator, slice,
ITER_EQ, key, rv, index->cmp_def, index->key_def,
index->disk_format, index->upsert_format,
- index->id == 0);
+ index->index_id == 0);
struct tuple *stmt;
rc = vy_run_iterator_next_key(&run_itr, &stmt);
while (rc == 0 && stmt != NULL) {
diff --git a/src/box/vy_read_iterator.c b/src/box/vy_read_iterator.c
index a265f587..3b5e34fc 100644
--- a/src/box/vy_read_iterator.c
+++ b/src/box/vy_read_iterator.c
@@ -649,7 +649,7 @@ vy_read_iterator_squash_upsert(struct vy_read_iterator *itr,
struct tuple *t = itr->curr_stmt;
/* Upserts enabled only in the primary index. */
- assert(vy_stmt_type(t) != IPROTO_UPSERT || index->id == 0);
+ assert(vy_stmt_type(t) != IPROTO_UPSERT || index->index_id == 0);
tuple_ref(t);
while (vy_stmt_type(t) == IPROTO_UPSERT) {
struct tuple *next;
@@ -755,7 +755,8 @@ vy_read_iterator_add_disk(struct vy_read_iterator *itr)
iterator_type, itr->key,
itr->read_view, index->cmp_def,
index->key_def, index->disk_format,
- index->upsert_format, index->id == 0);
+ index->upsert_format,
+ index->index_id == 0);
}
}
diff --git a/src/box/vy_scheduler.c b/src/box/vy_scheduler.c
index 382cf071..05234532 100644
--- a/src/box/vy_scheduler.c
+++ b/src/box/vy_scheduler.c
@@ -207,7 +207,7 @@ vy_dump_heap_less(struct heap_node *a, struct heap_node *b)
* ahead of secondary indexes of the same space, i.e. it must
* be dumped last.
*/
- return i1->id > i2->id;
+ return i1->index_id > i2->index_id;
}
#define HEAP_NAME vy_dump_heap
@@ -637,7 +637,7 @@ vy_task_write_run(struct vy_scheduler *scheduler, struct vy_task *task)
struct vy_run_writer writer;
if (vy_run_writer_create(&writer, task->new_run, index->env->path,
- index->space_id, index->id,
+ index->space_id, index->index_id,
index->cmp_def, index->key_def,
task->page_size, task->bloom_fpr,
task->max_output_count) != 0)
@@ -839,7 +839,7 @@ delete_mems:
index->is_dumping = false;
vy_scheduler_update_index(scheduler, index);
- if (index->id != 0)
+ if (index->index_id != 0)
vy_scheduler_unpin_index(scheduler, index->pk);
assert(scheduler->dump_task_count > 0);
@@ -893,7 +893,7 @@ vy_task_dump_abort(struct vy_scheduler *scheduler, struct vy_task *task,
index->is_dumping = false;
vy_scheduler_update_index(scheduler, index);
- if (index->id != 0)
+ if (index->index_id != 0)
vy_scheduler_unpin_index(scheduler, index->pk);
assert(scheduler->dump_task_count > 0);
@@ -935,7 +935,7 @@ vy_task_dump_new(struct vy_scheduler *scheduler, struct vy_index *index,
assert(scheduler->dump_generation < scheduler->generation);
struct errinj *inj = errinj(ERRINJ_VY_INDEX_DUMP, ERRINJ_INT);
- if (inj != NULL && inj->iparam == (int)index->id) {
+ if (inj != NULL && inj->iparam == (int)index->index_id) {
diag_set(ClientError, ER_INJECTION, "vinyl index dump");
goto err;
}
@@ -990,7 +990,7 @@ vy_task_dump_new(struct vy_scheduler *scheduler, struct vy_index *index,
struct vy_stmt_stream *wi;
bool is_last_level = (index->run_count == 0);
wi = vy_write_iterator_new(index->cmp_def, index->disk_format,
- index->upsert_format, index->id == 0,
+ index->upsert_format, index->index_id == 0,
is_last_level, scheduler->read_views);
if (wi == NULL)
goto err_wi;
@@ -1010,7 +1010,7 @@ vy_task_dump_new(struct vy_scheduler *scheduler, struct vy_index *index,
index->is_dumping = true;
vy_scheduler_update_index(scheduler, index);
- if (index->id != 0) {
+ if (index->index_id != 0) {
/*
* The primary index must be dumped after all
* secondary indexes of the same space - see
@@ -1124,7 +1124,8 @@ vy_task_compact_complete(struct vy_scheduler *scheduler, struct vy_task *task)
vy_log_tx_begin();
rlist_foreach_entry(run, &unused_runs, in_unused) {
if (vy_run_remove_files(index->env->path,
- index->space_id, index->id,
+ index->space_id,
+ index->index_id,
run->id) == 0) {
vy_log_forget_run(run->id);
}
@@ -1265,7 +1266,7 @@ vy_task_compact_new(struct vy_scheduler *scheduler, struct vy_index *index,
struct vy_stmt_stream *wi;
bool is_last_level = (range->compact_priority == range->slice_count);
wi = vy_write_iterator_new(index->cmp_def, index->disk_format,
- index->upsert_format, index->id == 0,
+ index->upsert_format, index->index_id == 0,
is_last_level, scheduler->read_views);
if (wi == NULL)
goto err_wi;
diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c
index 01130020..1b583240 100644
--- a/src/box/vy_tx.c
+++ b/src/box/vy_tx.c
@@ -525,7 +525,7 @@ vy_tx_prepare(struct vy_tx *tx)
MAYBE_UNUSED uint32_t current_space_id = 0;
stailq_foreach_entry(v, &tx->log, next_in_log) {
struct vy_index *index = v->index;
- if (index->id == 0) {
+ if (index->index_id == 0) {
/* The beginning of the new txn_stmt is met. */
current_space_id = index->space_id;
repsert = NULL;
@@ -816,7 +816,7 @@ vy_tx_set(struct vy_tx *tx, struct vy_index *index, struct tuple *stmt)
struct txv *old = write_set_search_key(&tx->write_set, index, stmt);
/* Found a match of the previous action of this transaction */
if (old != NULL && vy_stmt_type(stmt) == IPROTO_UPSERT) {
- assert(index->id == 0);
+ assert(index->index_id == 0);
uint8_t old_type = vy_stmt_type(old->stmt);
assert(old_type == IPROTO_UPSERT ||
old_type == IPROTO_INSERT ||
diff --git a/test/unit/vy_point_lookup.c b/test/unit/vy_point_lookup.c
index 52f4427e..c324160f 100644
--- a/test/unit/vy_point_lookup.c
+++ b/test/unit/vy_point_lookup.c
@@ -19,7 +19,7 @@ write_run(struct vy_run *run, const char *dir_name,
{
struct vy_run_writer writer;
if (vy_run_writer_create(&writer, run, dir_name,
- index->space_id, index->id,
+ index->space_id, index->index_id,
index->cmp_def, index->key_def,
4096, 0.1, 100500) != 0)
goto fail;
@@ -193,7 +193,7 @@ test_basic()
}
struct vy_stmt_stream *write_stream
= vy_write_iterator_new(pk->cmp_def, pk->disk_format,
- pk->upsert_format, pk->id == 0,
+ pk->upsert_format, true,
true, &read_views);
vy_write_iterator_new_mem(write_stream, run_mem);
struct vy_run *run = vy_run_new(&run_env, 1);
@@ -228,7 +228,7 @@ test_basic()
}
write_stream
= vy_write_iterator_new(pk->cmp_def, pk->disk_format,
- pk->upsert_format, pk->id == 0,
+ pk->upsert_format, true,
true, &read_views);
vy_write_iterator_new_mem(write_stream, run_mem);
run = vy_run_new(&run_env, 2);
--
2.11.0
* [PATCH v2 3/5] vinyl: rename vy_log_record::index_id/space_id to index_def_id/space_def_id
2018-03-20 11:29 [PATCH v2 0/5] Prepare vylog for space alter Vladimir Davydov
2018-03-20 11:29 ` [PATCH v2 1/5] vinyl: refactor vylog recovery Vladimir Davydov
2018-03-20 11:29 ` [PATCH v2 2/5] vinyl: rename vy_index::id to index_id Vladimir Davydov
@ 2018-03-20 11:29 ` Vladimir Davydov
2018-03-20 11:29 ` [PATCH v2 4/5] vinyl: do not use index lsn to identify indexes in vylog Vladimir Davydov
2018-03-20 11:29 ` [PATCH v2 5/5] alter: rewrite space truncation using alter infrastructure Vladimir Davydov
4 siblings, 0 replies; 7+ messages in thread
From: Vladimir Davydov @ 2018-03-20 11:29 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
I'm planning to assign a unique identifier to each vinyl index so that
it can be used instead of the LSN for identifying indexes in vylog. To
avoid confusing it with the index ordinal number, let's rename
vy_log_record::index_id to index_def_id and, for consistency, space_id
to space_def_id.
---
src/box/vy_log.c | 57 ++++++++++++++++++++++++++++----------------------------
src/box/vy_log.h | 10 +++++-----
2 files changed, 34 insertions(+), 33 deletions(-)
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index 8b95282b..a6f03a55 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -73,8 +73,8 @@ enum vy_log_key {
VY_LOG_KEY_RUN_ID = 2,
VY_LOG_KEY_BEGIN = 3,
VY_LOG_KEY_END = 4,
- VY_LOG_KEY_INDEX_ID = 5,
- VY_LOG_KEY_SPACE_ID = 6,
+ VY_LOG_KEY_INDEX_DEF_ID = 5,
+ VY_LOG_KEY_SPACE_DEF_ID = 6,
VY_LOG_KEY_DEF = 7,
VY_LOG_KEY_SLICE_ID = 8,
VY_LOG_KEY_DUMP_LSN = 9,
@@ -89,8 +89,8 @@ static const char *vy_log_key_name[] = {
[VY_LOG_KEY_RUN_ID] = "run_id",
[VY_LOG_KEY_BEGIN] = "begin",
[VY_LOG_KEY_END] = "end",
- [VY_LOG_KEY_INDEX_ID] = "index_id",
- [VY_LOG_KEY_SPACE_ID] = "space_id",
+ [VY_LOG_KEY_INDEX_DEF_ID] = "index_def_id",
+ [VY_LOG_KEY_SPACE_DEF_ID] = "space_def_id",
[VY_LOG_KEY_DEF] = "key_def",
[VY_LOG_KEY_SLICE_ID] = "slice_id",
[VY_LOG_KEY_DUMP_LSN] = "dump_lsn",
@@ -231,13 +231,14 @@ vy_log_record_snprint(char *buf, int size, const struct vy_log_record *record)
SNPRINT(total, mp_snprint, buf, size, record->end);
SNPRINT(total, snprintf, buf, size, ", ");
}
- if (record->index_id > 0)
+ if (record->index_def_id > 0)
SNPRINT(total, snprintf, buf, size, "%s=%"PRIu32", ",
- vy_log_key_name[VY_LOG_KEY_INDEX_ID], record->index_id);
- if (record->space_id > 0)
+ vy_log_key_name[VY_LOG_KEY_INDEX_DEF_ID],
+ record->index_def_id);
+ if (record->space_def_id > 0)
SNPRINT(total, snprintf, buf, size, "%s=%"PRIu32", ",
- vy_log_key_name[VY_LOG_KEY_SPACE_ID],
- record->space_id);
+ vy_log_key_name[VY_LOG_KEY_SPACE_DEF_ID],
+ record->space_def_id);
if (record->key_parts != NULL) {
SNPRINT(total, snprintf, buf, size, "%s=",
vy_log_key_name[VY_LOG_KEY_DEF]);
@@ -335,14 +336,14 @@ vy_log_record_encode(const struct vy_log_record *record,
size += p - record->end;
n_keys++;
}
- if (record->index_id > 0) {
- size += mp_sizeof_uint(VY_LOG_KEY_INDEX_ID);
- size += mp_sizeof_uint(record->index_id);
+ if (record->index_def_id > 0) {
+ size += mp_sizeof_uint(VY_LOG_KEY_INDEX_DEF_ID);
+ size += mp_sizeof_uint(record->index_def_id);
n_keys++;
}
- if (record->space_id > 0) {
- size += mp_sizeof_uint(VY_LOG_KEY_SPACE_ID);
- size += mp_sizeof_uint(record->space_id);
+ if (record->space_def_id > 0) {
+ size += mp_sizeof_uint(VY_LOG_KEY_SPACE_DEF_ID);
+ size += mp_sizeof_uint(record->space_def_id);
n_keys++;
}
if (record->key_parts != NULL) {
@@ -412,13 +413,13 @@ vy_log_record_encode(const struct vy_log_record *record,
memcpy(pos, record->end, p - record->end);
pos += p - record->end;
}
- if (record->index_id > 0) {
- pos = mp_encode_uint(pos, VY_LOG_KEY_INDEX_ID);
- pos = mp_encode_uint(pos, record->index_id);
+ if (record->index_def_id > 0) {
+ pos = mp_encode_uint(pos, VY_LOG_KEY_INDEX_DEF_ID);
+ pos = mp_encode_uint(pos, record->index_def_id);
}
- if (record->space_id > 0) {
- pos = mp_encode_uint(pos, VY_LOG_KEY_SPACE_ID);
- pos = mp_encode_uint(pos, record->space_id);
+ if (record->space_def_id > 0) {
+ pos = mp_encode_uint(pos, VY_LOG_KEY_SPACE_DEF_ID);
+ pos = mp_encode_uint(pos, record->space_def_id);
}
if (record->key_parts != NULL) {
pos = mp_encode_uint(pos, VY_LOG_KEY_DEF);
@@ -520,11 +521,11 @@ vy_log_record_decode(struct vy_log_record *record,
record->end = mp_decode_array(&tmp) > 0 ? pos : NULL;
mp_next(&pos);
break;
- case VY_LOG_KEY_INDEX_ID:
- record->index_id = mp_decode_uint(&pos);
+ case VY_LOG_KEY_INDEX_DEF_ID:
+ record->index_def_id = mp_decode_uint(&pos);
break;
- case VY_LOG_KEY_SPACE_ID:
- record->space_id = mp_decode_uint(&pos);
+ case VY_LOG_KEY_SPACE_DEF_ID:
+ record->space_def_id = mp_decode_uint(&pos);
break;
case VY_LOG_KEY_DEF: {
uint32_t part_count = mp_decode_array(&pos);
@@ -1765,7 +1766,7 @@ vy_recovery_process_record(struct vy_recovery *recovery,
switch (record->type) {
case VY_LOG_CREATE_INDEX:
rc = vy_recovery_create_index(recovery, record->index_lsn,
- record->index_id, record->space_id,
+ record->index_def_id, record->space_def_id,
record->key_parts, record->key_part_count);
break;
case VY_LOG_DROP_INDEX:
@@ -2013,8 +2014,8 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
vy_log_record_init(&record);
record.type = VY_LOG_CREATE_INDEX;
record.index_lsn = index->index_lsn;
- record.index_id = index->index_id;
- record.space_id = index->space_id;
+ record.index_def_id = index->index_id;
+ record.space_def_id = index->space_id;
record.key_parts = index->key_parts;
record.key_part_count = index->key_part_count;
if (vy_log_append_record(xlog, &record) != 0)
diff --git a/src/box/vy_log.h b/src/box/vy_log.h
index 8fbacd0f..ac9b987e 100644
--- a/src/box/vy_log.h
+++ b/src/box/vy_log.h
@@ -65,7 +65,7 @@ struct mh_i64ptr_t;
enum vy_log_record_type {
/**
* Create a new vinyl index.
- * Requires vy_log_record::index_lsn, index_id, space_id,
+ * Requires vy_log_record::index_lsn, index_def_id, space_def_id,
* key_def (with primary key parts).
*/
VY_LOG_CREATE_INDEX = 0,
@@ -185,9 +185,9 @@ struct vy_log_record {
*/
const char *end;
/** Ordinal index number in the space. */
- uint32_t index_id;
+ uint32_t index_def_id;
/** Space ID. */
- uint32_t space_id;
+ uint32_t space_def_id;
/** Index key definition, as defined by the user. */
const struct key_def *key_def;
/** Array of key part definitions. */
@@ -503,8 +503,8 @@ vy_log_create_index(int64_t index_lsn, uint32_t index_id, uint32_t space_id,
vy_log_record_init(&record);
record.type = VY_LOG_CREATE_INDEX;
record.index_lsn = index_lsn;
- record.index_id = index_id;
- record.space_id = space_id;
+ record.index_def_id = index_id;
+ record.space_def_id = space_id;
record.key_def = key_def;
vy_log_write(&record);
}
--
2.11.0
* [PATCH v2 4/5] vinyl: do not use index lsn to identify indexes in vylog
2018-03-20 11:29 [PATCH v2 0/5] Prepare vylog for space alter Vladimir Davydov
` (2 preceding siblings ...)
2018-03-20 11:29 ` [PATCH v2 3/5] vinyl: rename vy_log_record::index_id/space_id to index_def_id/space_def_id Vladimir Davydov
@ 2018-03-20 11:29 ` Vladimir Davydov
2018-03-22 15:08 ` Vladimir Davydov
2018-03-20 11:29 ` [PATCH v2 5/5] alter: rewrite space truncation using alter infrastructure Vladimir Davydov
4 siblings, 1 reply; 7+ messages in thread
From: Vladimir Davydov @ 2018-03-20 11:29 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
vy_log_record::index_lsn serves two purposes. First, it is used as a
unique object identifier in vylog (it is similar to range_id or slice_id
in this regard). Second, it is the LSN of the WAL row that committed the
index, and we use it to look up the appropriate index incarnation during
WAL recovery. Mixing these two functions is a bad design choice because
as a result we can't create two vinyl indexes in one WAL row, which may
happen on ALTER of a primary key. Besides, we can't create an index
object before WAL write, which is also needed for ALTER, because at that
time there's no LSN assigned to the index yet.
Therefore, we need to split this variable in two: index_id and
commit_lsn. To stay backward compatible, we rename index_lsn to
index_id everywhere in vylog and add a new record field, commit_lsn.
If commit_lsn is missing from a create_index record, the record must
have been left by an old vylog, so we initialize commit_lsn with
index_id (the former index_lsn) - see vy_log_record_decode().
---
src/box/vinyl.c | 24 +++---
src/box/vy_index.c | 30 +++++---
src/box/vy_index.h | 10 ++-
src/box/vy_log.c | 195 ++++++++++++++++++++++++++++-------------------
src/box/vy_log.h | 72 ++++++++---------
src/box/vy_scheduler.c | 10 +--
test/unit/vy_log_stub.c | 4 +-
test/vinyl/layout.result | 64 ++++++++--------
8 files changed, 229 insertions(+), 180 deletions(-)
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index af3621df..1908d5fc 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -781,6 +781,8 @@ vinyl_index_commit_create(struct index *base, int64_t lsn)
struct vy_env *env = vy_env(base->engine);
struct vy_index *index = vy_index(base);
+ assert(index->id >= 0);
+
if (env->status == VINYL_INITIAL_RECOVERY_LOCAL ||
env->status == VINYL_FINAL_RECOVERY_LOCAL) {
/*
@@ -791,7 +793,7 @@ vinyl_index_commit_create(struct index *base, int64_t lsn)
* the index isn't in the recovery context and we
* need to retry to log it now.
*/
- if (index->commit_lsn >= 0) {
+ if (index->is_committed) {
vy_scheduler_add_index(&env->scheduler, index);
return;
}
@@ -816,7 +818,8 @@ vinyl_index_commit_create(struct index *base, int64_t lsn)
if (index->opts.lsn != 0)
lsn = index->opts.lsn;
- index->commit_lsn = lsn;
+ assert(!index->is_committed);
+ index->is_committed = true;
assert(index->range_count == 1);
struct vy_range *range = vy_range_tree_first(index->tree);
@@ -830,9 +833,9 @@ vinyl_index_commit_create(struct index *base, int64_t lsn)
* recovery.
*/
vy_log_tx_begin();
- vy_log_create_index(index->commit_lsn, index->index_id,
- index->space_id, index->key_def);
- vy_log_insert_range(index->commit_lsn, range->id, NULL, NULL);
+ vy_log_create_index(index->id, index->space_id, index->index_id,
+ index->key_def, lsn);
+ vy_log_insert_range(index->id, range->id, NULL, NULL);
vy_log_tx_try_commit();
/*
* After we committed the index in the log, we can schedule
@@ -889,7 +892,7 @@ vinyl_index_commit_drop(struct index *base)
vy_log_tx_begin();
vy_log_index_prune(index, checkpoint_last(NULL));
- vy_log_drop_index(index->commit_lsn);
+ vy_log_drop_index(index->id);
vy_log_tx_try_commit();
}
@@ -938,7 +941,8 @@ vinyl_space_prepare_truncate(struct space *old_space, struct space *new_space)
struct vy_index *old_index = vy_index(old_space->index[i]);
struct vy_index *new_index = vy_index(new_space->index[i]);
- new_index->commit_lsn = old_index->commit_lsn;
+ new_index->id = old_index->id;
+ new_index->is_committed = old_index->is_committed;
if (truncate_done) {
/*
@@ -1015,10 +1019,8 @@ vinyl_space_commit_truncate(struct space *old_space, struct space *new_space)
assert(new_index->range_count == 1);
vy_log_index_prune(old_index, gc_lsn);
- vy_log_insert_range(new_index->commit_lsn,
- range->id, NULL, NULL);
- vy_log_truncate_index(new_index->commit_lsn,
- new_index->truncate_count);
+ vy_log_insert_range(new_index->id, range->id, NULL, NULL);
+ vy_log_truncate_index(new_index->id, new_index->truncate_count);
}
vy_log_tx_try_commit();
diff --git a/src/box/vy_index.c b/src/box/vy_index.c
index e4f567e2..de8c5f1e 100644
--- a/src/box/vy_index.c
+++ b/src/box/vy_index.c
@@ -220,8 +220,8 @@ vy_index_new(struct vy_index_env *index_env, struct vy_cache_env *cache_env,
if (index->mem == NULL)
goto fail_mem;
+ index->id = -1;
index->refs = 1;
- index->commit_lsn = -1;
index->dump_lsn = -1;
vy_cache_create(&index->cache, cache_env, cmp_def);
rlist_create(&index->sealed);
@@ -381,6 +381,10 @@ vy_index_create(struct vy_index *index)
return -1;
}
+ /* Assign unique id. */
+ assert(index->id < 0);
+ index->id = vy_log_next_id();
+
/* Allocate initial range. */
return vy_index_init_range_tree(index);
}
@@ -536,6 +540,8 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
struct vy_run_env *run_env, int64_t lsn,
bool is_checkpoint_recovery, bool force_recovery)
{
+ assert(index->id < 0);
+ assert(!index->is_committed);
assert(index->range_count == 0);
/*
@@ -552,7 +558,7 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
* Look up the last incarnation of the index in vylog.
*/
struct vy_index_recovery_info *index_info;
- index_info = vy_recovery_lookup_index(recovery,
+ index_info = vy_recovery_index_by_id(recovery,
index->space_id, index->index_id);
if (is_checkpoint_recovery) {
if (index_info == NULL) {
@@ -568,29 +574,31 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
(unsigned)index->index_id));
return -1;
}
- if (lsn > index_info->index_lsn) {
+ if (lsn > index_info->commit_lsn) {
/*
* The last incarnation of the index was created
* before the last checkpoint, load it now.
*/
- lsn = index_info->index_lsn;
+ lsn = index_info->commit_lsn;
}
}
- if (index_info == NULL || lsn > index_info->index_lsn) {
+ if (index_info == NULL || lsn > index_info->commit_lsn) {
/*
* If we failed to log index creation before restart,
* we won't find it in the log on recovery. This is
* OK as the index doesn't have any runs in this case.
* We will retry to log index in vy_index_commit_create().
- * For now, just create the initial range.
+ * For now, just create the initial range and assign id.
*/
+ index->id = vy_log_next_id();
return vy_index_init_range_tree(index);
}
- index->commit_lsn = lsn;
+ index->id = index_info->id;
+ index->is_committed = true;
- if (lsn < index_info->index_lsn || index_info->is_dropped) {
+ if (lsn < index_info->commit_lsn || index_info->is_dropped) {
/*
* Loading a past incarnation of the index, i.e.
* the index is going to dropped during final
@@ -672,7 +680,7 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
if (prev == NULL) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Index %lld has empty range tree",
- (long long)index->commit_lsn));
+ (long long)index->id));
return -1;
}
if (prev->end != NULL) {
@@ -1049,7 +1057,7 @@ vy_index_split_range(struct vy_index *index, struct vy_range *range)
vy_log_delete_range(range->id);
for (int i = 0; i < n_parts; i++) {
part = parts[i];
- vy_log_insert_range(index->commit_lsn, part->id,
+ vy_log_insert_range(index->id, part->id,
tuple_data_or_null(part->begin),
tuple_data_or_null(part->end));
rlist_foreach_entry(slice, &part->slices, in_range)
@@ -1115,7 +1123,7 @@ vy_index_coalesce_range(struct vy_index *index, struct vy_range *range)
* Log change in metadata.
*/
vy_log_tx_begin();
- vy_log_insert_range(index->commit_lsn, result->id,
+ vy_log_insert_range(index->id, result->id,
tuple_data_or_null(result->begin),
tuple_data_or_null(result->end));
for (it = first; it != end; it = vy_range_tree_next(index->tree, it)) {
diff --git a/src/box/vy_index.h b/src/box/vy_index.h
index 7a5a8aa8..417f2d36 100644
--- a/src/box/vy_index.h
+++ b/src/box/vy_index.h
@@ -148,6 +148,8 @@ struct vy_index {
* until all pending operations have completed.
*/
int refs;
+ /** Unique ID of this index. */
+ int64_t id;
/** Ordinal index number in the index array. */
uint32_t index_id;
/** ID of the space this index belongs to. */
@@ -253,11 +255,11 @@ struct vy_index {
* been dumped yet.
*/
int64_t dump_lsn;
- /**
- * LSN of the row that committed the index or -1 if
- * the index was not committed to the metadata log.
+ /*
+ * This flag is set if the index creation was
+ * committed to the metadata log.
*/
- int64_t commit_lsn;
+ bool is_committed;
/**
* This flag is set if the index was dropped.
* It is also set on local recovery if the index
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index a6f03a55..9c8dd631 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -68,7 +68,7 @@
* Used for packing a record in MsgPack.
*/
enum vy_log_key {
- VY_LOG_KEY_INDEX_LSN = 0,
+ VY_LOG_KEY_INDEX_ID = 0,
VY_LOG_KEY_RANGE_ID = 1,
VY_LOG_KEY_RUN_ID = 2,
VY_LOG_KEY_BEGIN = 3,
@@ -80,11 +80,12 @@ enum vy_log_key {
VY_LOG_KEY_DUMP_LSN = 9,
VY_LOG_KEY_GC_LSN = 10,
VY_LOG_KEY_TRUNCATE_COUNT = 11,
+ VY_LOG_KEY_COMMIT_LSN = 12,
};
/** vy_log_key -> human readable name. */
static const char *vy_log_key_name[] = {
- [VY_LOG_KEY_INDEX_LSN] = "index_lsn",
+ [VY_LOG_KEY_INDEX_ID] = "index_id",
[VY_LOG_KEY_RANGE_ID] = "range_id",
[VY_LOG_KEY_RUN_ID] = "run_id",
[VY_LOG_KEY_BEGIN] = "begin",
@@ -96,6 +97,7 @@ static const char *vy_log_key_name[] = {
[VY_LOG_KEY_DUMP_LSN] = "dump_lsn",
[VY_LOG_KEY_GC_LSN] = "gc_lsn",
[VY_LOG_KEY_TRUNCATE_COUNT] = "truncate_count",
+ [VY_LOG_KEY_COMMIT_LSN] = "commit_lsn",
};
/** vy_log_type -> human readable name. */
@@ -207,10 +209,10 @@ vy_log_record_snprint(char *buf, int size, const struct vy_log_record *record)
assert(record->type < vy_log_record_type_MAX);
SNPRINT(total, snprintf, buf, size, "%s{",
vy_log_type_name[record->type]);
- if (record->index_lsn > 0)
+ if (record->index_id > 0)
SNPRINT(total, snprintf, buf, size, "%s=%"PRIi64", ",
- vy_log_key_name[VY_LOG_KEY_INDEX_LSN],
- record->index_lsn);
+ vy_log_key_name[VY_LOG_KEY_INDEX_ID],
+ record->index_id);
if (record->range_id > 0)
SNPRINT(total, snprintf, buf, size, "%s=%"PRIi64", ",
vy_log_key_name[VY_LOG_KEY_RANGE_ID],
@@ -250,6 +252,10 @@ vy_log_record_snprint(char *buf, int size, const struct vy_log_record *record)
SNPRINT(total, snprintf, buf, size, "%s=%"PRIi64", ",
vy_log_key_name[VY_LOG_KEY_SLICE_ID],
record->slice_id);
+ if (record->commit_lsn > 0)
+ SNPRINT(total, snprintf, buf, size, "%s=%"PRIi64", ",
+ vy_log_key_name[VY_LOG_KEY_COMMIT_LSN],
+ record->commit_lsn);
if (record->dump_lsn > 0)
SNPRINT(total, snprintf, buf, size, "%s=%"PRIi64", ",
vy_log_key_name[VY_LOG_KEY_DUMP_LSN],
@@ -305,9 +311,9 @@ vy_log_record_encode(const struct vy_log_record *record,
size += mp_sizeof_array(2);
size += mp_sizeof_uint(record->type);
size_t n_keys = 0;
- if (record->index_lsn > 0) {
- size += mp_sizeof_uint(VY_LOG_KEY_INDEX_LSN);
- size += mp_sizeof_uint(record->index_lsn);
+ if (record->index_id > 0) {
+ size += mp_sizeof_uint(VY_LOG_KEY_INDEX_ID);
+ size += mp_sizeof_uint(record->index_id);
n_keys++;
}
if (record->range_id > 0) {
@@ -358,6 +364,11 @@ vy_log_record_encode(const struct vy_log_record *record,
size += mp_sizeof_uint(record->slice_id);
n_keys++;
}
+ if (record->commit_lsn > 0) {
+ size += mp_sizeof_uint(VY_LOG_KEY_COMMIT_LSN);
+ size += mp_sizeof_uint(record->commit_lsn);
+ n_keys++;
+ }
if (record->dump_lsn > 0) {
size += mp_sizeof_uint(VY_LOG_KEY_DUMP_LSN);
size += mp_sizeof_uint(record->dump_lsn);
@@ -387,9 +398,9 @@ vy_log_record_encode(const struct vy_log_record *record,
pos = mp_encode_array(pos, 2);
pos = mp_encode_uint(pos, record->type);
pos = mp_encode_map(pos, n_keys);
- if (record->index_lsn > 0) {
- pos = mp_encode_uint(pos, VY_LOG_KEY_INDEX_LSN);
- pos = mp_encode_uint(pos, record->index_lsn);
+ if (record->index_id > 0) {
+ pos = mp_encode_uint(pos, VY_LOG_KEY_INDEX_ID);
+ pos = mp_encode_uint(pos, record->index_id);
}
if (record->range_id > 0) {
pos = mp_encode_uint(pos, VY_LOG_KEY_RANGE_ID);
@@ -431,6 +442,10 @@ vy_log_record_encode(const struct vy_log_record *record,
pos = mp_encode_uint(pos, VY_LOG_KEY_SLICE_ID);
pos = mp_encode_uint(pos, record->slice_id);
}
+ if (record->commit_lsn > 0) {
+ pos = mp_encode_uint(pos, VY_LOG_KEY_COMMIT_LSN);
+ pos = mp_encode_uint(pos, record->commit_lsn);
+ }
if (record->dump_lsn > 0) {
pos = mp_encode_uint(pos, VY_LOG_KEY_DUMP_LSN);
pos = mp_encode_uint(pos, record->dump_lsn);
@@ -502,8 +517,8 @@ vy_log_record_decode(struct vy_log_record *record,
for (uint32_t i = 0; i < n_keys; i++) {
uint32_t key = mp_decode_uint(&pos);
switch (key) {
- case VY_LOG_KEY_INDEX_LSN:
- record->index_lsn = mp_decode_uint(&pos);
+ case VY_LOG_KEY_INDEX_ID:
+ record->index_id = mp_decode_uint(&pos);
break;
case VY_LOG_KEY_RANGE_ID:
record->range_id = mp_decode_uint(&pos);
@@ -552,6 +567,9 @@ vy_log_record_decode(struct vy_log_record *record,
case VY_LOG_KEY_SLICE_ID:
record->slice_id = mp_decode_uint(&pos);
break;
+ case VY_LOG_KEY_COMMIT_LSN:
+ record->commit_lsn = mp_decode_uint(&pos);
+ break;
case VY_LOG_KEY_DUMP_LSN:
record->dump_lsn = mp_decode_uint(&pos);
break;
@@ -568,6 +586,17 @@ vy_log_record_decode(struct vy_log_record *record,
goto fail;
}
}
+ if (record->type == VY_LOG_CREATE_INDEX && record->commit_lsn == 0) {
+ /*
+ * We used to use LSN as unique index identifier
+ * and didn't store LSN separately so if there's
+ * no 'commit_lsn' field in the record, we are
+ * recovering from an old vylog and 'id' is in
+ * fact the LSN of the WAL record that committed
+ * the index.
+ */
+ record->commit_lsn = record->index_id;
+ }
return 0;
fail:
buf = tt_static_buf();
@@ -1095,8 +1124,8 @@ vy_recovery_index_id_hash(uint32_t space_id, uint32_t index_id)
/** Lookup a vinyl index in vy_recovery::index_id_hash map. */
struct vy_index_recovery_info *
-vy_recovery_lookup_index(struct vy_recovery *recovery,
- uint32_t space_id, uint32_t index_id)
+vy_recovery_index_by_id(struct vy_recovery *recovery,
+ uint32_t space_id, uint32_t index_id)
{
int64_t key = vy_recovery_index_id_hash(space_id, index_id);
struct mh_i64ptr_t *h = recovery->index_id_hash;
@@ -1106,12 +1135,12 @@ vy_recovery_lookup_index(struct vy_recovery *recovery,
return mh_i64ptr_node(h, k)->val;
}
-/** Lookup a vinyl index in vy_recovery::index_lsn_hash map. */
+/** Lookup a vinyl index in vy_recovery::index_hash map. */
static struct vy_index_recovery_info *
-vy_recovery_lookup_index_by_lsn(struct vy_recovery *recovery, int64_t index_lsn)
+vy_recovery_lookup_index(struct vy_recovery *recovery, int64_t id)
{
- struct mh_i64ptr_t *h = recovery->index_lsn_hash;
- mh_int_t k = mh_i64ptr_find(h, index_lsn, NULL);
+ struct mh_i64ptr_t *h = recovery->index_hash;
+ mh_int_t k = mh_i64ptr_find(h, id, NULL);
if (k == mh_end(h))
return NULL;
return mh_i64ptr_node(h, k)->val;
@@ -1152,15 +1181,15 @@ vy_recovery_lookup_slice(struct vy_recovery *recovery, int64_t slice_id)
/**
* Handle a VY_LOG_CREATE_INDEX log record.
- * This function allocates a new vinyl index with ID @index_lsn
+ * This function allocates a new vinyl index with ID @id
* and inserts it to the hash.
* Return 0 on success, -1 on failure (ID collision or OOM).
*/
static int
-vy_recovery_create_index(struct vy_recovery *recovery, int64_t index_lsn,
- uint32_t index_id, uint32_t space_id,
+vy_recovery_create_index(struct vy_recovery *recovery, int64_t id,
+ uint32_t space_id, uint32_t index_id,
const struct key_part_def *key_parts,
- uint32_t key_part_count)
+ uint32_t key_part_count, int64_t commit_lsn)
{
struct vy_index_recovery_info *index;
struct key_part_def *key_parts_copy;
@@ -1175,7 +1204,7 @@ vy_recovery_create_index(struct vy_recovery *recovery, int64_t index_lsn,
if (key_parts == NULL) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Missing key definition for index %lld",
- (long long)index_lsn));
+ (long long)id));
return -1;
}
key_parts_copy = malloc(sizeof(*key_parts) * key_part_count);
@@ -1242,23 +1271,24 @@ vy_recovery_create_index(struct vy_recovery *recovery, int64_t index_lsn,
free(index->key_parts);
}
- index->index_lsn = index_lsn;
+ index->id = id;
index->key_parts = key_parts_copy;
index->key_part_count = key_part_count;
index->is_dropped = false;
+ index->commit_lsn = commit_lsn;
index->dump_lsn = -1;
index->truncate_count = 0;
/*
- * Add the index to the LSN hash.
+ * Add the index to the hash.
*/
- h = recovery->index_lsn_hash;
- node.key = index_lsn;
+ h = recovery->index_hash;
+ node.key = id;
node.val = index;
- if (mh_i64ptr_find(h, index_lsn, NULL) != mh_end(h)) {
+ if (mh_i64ptr_find(h, id, NULL) != mh_end(h)) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Duplicate index id %lld",
- (long long)index_lsn));
+ (long long)id));
return -1;
}
if (mh_i64ptr_put(h, &node, NULL, NULL) == mh_end(h)) {
@@ -1266,36 +1296,40 @@ vy_recovery_create_index(struct vy_recovery *recovery, int64_t index_lsn,
"mh_i64ptr_node_t");
return -1;
}
+
+ if (recovery->max_id < id)
+ recovery->max_id = id;
+
return 0;
}
/**
* Handle a VY_LOG_DROP_INDEX log record.
- * This function marks the vinyl index with ID @index_lsn as dropped.
+ * This function marks the vinyl index with ID @id as dropped.
* All ranges and runs of the index must have been deleted by now.
* Returns 0 on success, -1 if ID not found or index is already marked.
*/
static int
-vy_recovery_drop_index(struct vy_recovery *recovery, int64_t index_lsn)
+vy_recovery_drop_index(struct vy_recovery *recovery, int64_t id)
{
struct vy_index_recovery_info *index;
- index = vy_recovery_lookup_index_by_lsn(recovery, index_lsn);
+ index = vy_recovery_lookup_index(recovery, id);
if (index == NULL) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Index %lld deleted but not registered",
- (long long)index_lsn));
+ (long long)id));
return -1;
}
if (index->is_dropped) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Index %lld deleted twice",
- (long long)index_lsn));
+ (long long)id));
return -1;
}
if (!rlist_empty(&index->ranges)) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Dropped index %lld has ranges",
- (long long)index_lsn));
+ (long long)id));
return -1;
}
struct vy_run_recovery_info *run;
@@ -1303,7 +1337,7 @@ vy_recovery_drop_index(struct vy_recovery *recovery, int64_t index_lsn)
if (!run->is_dropped && !run->is_incomplete) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Dropped index %lld has active "
- "runs", (long long)index_lsn));
+ "runs", (long long)id));
return -1;
}
}
@@ -1314,25 +1348,25 @@ vy_recovery_drop_index(struct vy_recovery *recovery, int64_t index_lsn)
/**
* Handle a VY_LOG_DUMP_INDEX log record.
* This function updates LSN of the last dump of the vinyl index
- * with ID @index_lsn.
+ * with ID @id.
* Returns 0 on success, -1 if ID not found or index is dropped.
*/
static int
vy_recovery_dump_index(struct vy_recovery *recovery,
- int64_t index_lsn, int64_t dump_lsn)
+ int64_t id, int64_t dump_lsn)
{
struct vy_index_recovery_info *index;
- index = vy_recovery_lookup_index_by_lsn(recovery, index_lsn);
+ index = vy_recovery_lookup_index(recovery, id);
if (index == NULL) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Dump of unregistered index %lld",
- (long long)index_lsn));
+ (long long)id));
return -1;
}
if (index->is_dropped) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Dump of deleted index %lld",
- (long long)index_lsn));
+ (long long)id));
return -1;
}
index->dump_lsn = dump_lsn;
@@ -1341,25 +1375,25 @@ vy_recovery_dump_index(struct vy_recovery *recovery,
/**
* Handle a VY_LOG_TRUNCATE_INDEX log record.
- * This function updates truncate_count of the index with ID @index_lsn.
+ * This function updates truncate_count of the index with ID @id.
* Returns 0 on success, -1 if ID not found or index is dropped.
*/
static int
vy_recovery_truncate_index(struct vy_recovery *recovery,
- int64_t index_lsn, int64_t truncate_count)
+ int64_t id, int64_t truncate_count)
{
struct vy_index_recovery_info *index;
- index = vy_recovery_lookup_index_by_lsn(recovery, index_lsn);
+ index = vy_recovery_lookup_index(recovery, id);
if (index == NULL) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Truncation of unregistered index %lld",
- (long long)index_lsn));
+ (long long)id));
return -1;
}
if (index->is_dropped) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Truncation of deleted index %lld",
- (long long)index_lsn));
+ (long long)id));
return -1;
}
index->truncate_count = truncate_count;
@@ -1403,21 +1437,21 @@ vy_recovery_do_create_run(struct vy_recovery *recovery, int64_t run_id)
/**
* Handle a VY_LOG_PREPARE_RUN log record.
* This function creates a new incomplete vinyl run with ID @run_id
- * and adds it to the list of runs of the index with ID @index_lsn.
+ * and adds it to the list of runs of the index with ID @index_id.
* Return 0 on success, -1 if run already exists, index not found,
* or OOM.
*/
static int
-vy_recovery_prepare_run(struct vy_recovery *recovery, int64_t index_lsn,
+vy_recovery_prepare_run(struct vy_recovery *recovery, int64_t index_id,
int64_t run_id)
{
struct vy_index_recovery_info *index;
- index = vy_recovery_lookup_index_by_lsn(recovery, index_lsn);
+ index = vy_recovery_lookup_index(recovery, index_id);
if (index == NULL) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Run %lld created for unregistered "
"index %lld", (long long)run_id,
- (long long)index_lsn));
+ (long long)index_id));
return -1;
}
if (vy_recovery_lookup_run(recovery, run_id) != NULL) {
@@ -1438,29 +1472,29 @@ vy_recovery_prepare_run(struct vy_recovery *recovery, int64_t index_lsn,
/**
* Handle a VY_LOG_CREATE_RUN log record.
* This function adds the vinyl run with ID @run_id to the list
- * of runs of the index with ID @index_lsn and marks it committed.
+ * of runs of the index with ID @index_id and marks it committed.
* If the run does not exist, it will be created.
* Return 0 on success, -1 if index not found, run or index
* is dropped, or OOM.
*/
static int
-vy_recovery_create_run(struct vy_recovery *recovery, int64_t index_lsn,
+vy_recovery_create_run(struct vy_recovery *recovery, int64_t index_id,
int64_t run_id, int64_t dump_lsn)
{
struct vy_index_recovery_info *index;
- index = vy_recovery_lookup_index_by_lsn(recovery, index_lsn);
+ index = vy_recovery_lookup_index(recovery, index_id);
if (index == NULL) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Run %lld created for unregistered "
"index %lld", (long long)run_id,
- (long long)index_lsn));
+ (long long)index_id));
return -1;
}
if (index->is_dropped) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Run %lld created for deleted "
"index %lld", (long long)run_id,
- (long long)index_lsn));
+ (long long)index_id));
return -1;
}
struct vy_run_recovery_info *run;
@@ -1539,11 +1573,11 @@ vy_recovery_forget_run(struct vy_recovery *recovery, int64_t run_id)
* Handle a VY_LOG_INSERT_RANGE log record.
* This function allocates a new vinyl range with ID @range_id,
* inserts it to the hash, and adds it to the list of ranges of the
- * index with ID @index_lsn.
+ * index with ID @index_id.
* Return 0 on success, -1 on failure (ID collision or OOM).
*/
static int
-vy_recovery_insert_range(struct vy_recovery *recovery, int64_t index_lsn,
+vy_recovery_insert_range(struct vy_recovery *recovery, int64_t index_id,
int64_t range_id, const char *begin, const char *end)
{
if (vy_recovery_lookup_range(recovery, range_id) != NULL) {
@@ -1553,12 +1587,12 @@ vy_recovery_insert_range(struct vy_recovery *recovery, int64_t index_lsn,
return -1;
}
struct vy_index_recovery_info *index;
- index = vy_recovery_lookup_index_by_lsn(recovery, index_lsn);
+ index = vy_recovery_lookup_index(recovery, index_id);
if (index == NULL) {
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
tt_sprintf("Range %lld created for unregistered "
"index %lld", (long long)range_id,
- (long long)index_lsn));
+ (long long)index_id));
return -1;
}
@@ -1765,26 +1799,27 @@ vy_recovery_process_record(struct vy_recovery *recovery,
int rc;
switch (record->type) {
case VY_LOG_CREATE_INDEX:
- rc = vy_recovery_create_index(recovery, record->index_lsn,
- record->index_def_id, record->space_def_id,
- record->key_parts, record->key_part_count);
+ rc = vy_recovery_create_index(recovery, record->index_id,
+ record->space_def_id, record->index_def_id,
+ record->key_parts, record->key_part_count,
+ record->commit_lsn);
break;
case VY_LOG_DROP_INDEX:
- rc = vy_recovery_drop_index(recovery, record->index_lsn);
+ rc = vy_recovery_drop_index(recovery, record->index_id);
break;
case VY_LOG_INSERT_RANGE:
- rc = vy_recovery_insert_range(recovery, record->index_lsn,
+ rc = vy_recovery_insert_range(recovery, record->index_id,
record->range_id, record->begin, record->end);
break;
case VY_LOG_DELETE_RANGE:
rc = vy_recovery_delete_range(recovery, record->range_id);
break;
case VY_LOG_PREPARE_RUN:
- rc = vy_recovery_prepare_run(recovery, record->index_lsn,
+ rc = vy_recovery_prepare_run(recovery, record->index_id,
record->run_id);
break;
case VY_LOG_CREATE_RUN:
- rc = vy_recovery_create_run(recovery, record->index_lsn,
+ rc = vy_recovery_create_run(recovery, record->index_id,
record->run_id, record->dump_lsn);
break;
case VY_LOG_DROP_RUN:
@@ -1803,11 +1838,11 @@ vy_recovery_process_record(struct vy_recovery *recovery,
rc = vy_recovery_delete_slice(recovery, record->slice_id);
break;
case VY_LOG_DUMP_INDEX:
- rc = vy_recovery_dump_index(recovery, record->index_lsn,
+ rc = vy_recovery_dump_index(recovery, record->index_id,
record->dump_lsn);
break;
case VY_LOG_TRUNCATE_INDEX:
- rc = vy_recovery_truncate_index(recovery, record->index_lsn,
+ rc = vy_recovery_truncate_index(recovery, record->index_id,
record->truncate_count);
break;
default:
@@ -1837,19 +1872,19 @@ vy_recovery_new_f(va_list ap)
rlist_create(&recovery->indexes);
recovery->index_id_hash = NULL;
- recovery->index_lsn_hash = NULL;
+ recovery->index_hash = NULL;
recovery->range_hash = NULL;
recovery->run_hash = NULL;
recovery->slice_hash = NULL;
recovery->max_id = -1;
recovery->index_id_hash = mh_i64ptr_new();
- recovery->index_lsn_hash = mh_i64ptr_new();
+ recovery->index_hash = mh_i64ptr_new();
recovery->range_hash = mh_i64ptr_new();
recovery->run_hash = mh_i64ptr_new();
recovery->slice_hash = mh_i64ptr_new();
if (recovery->index_id_hash == NULL ||
- recovery->index_lsn_hash == NULL ||
+ recovery->index_hash == NULL ||
recovery->range_hash == NULL ||
recovery->run_hash == NULL ||
recovery->slice_hash == NULL) {
@@ -1974,9 +2009,9 @@ vy_recovery_delete(struct vy_recovery *recovery)
}
mh_i64ptr_delete(recovery->index_id_hash);
}
- if (recovery->index_lsn_hash != NULL) {
+ if (recovery->index_hash != NULL) {
/* Hash entries were deleted along with index_id_hash. */
- mh_i64ptr_delete(recovery->index_lsn_hash);
+ mh_i64ptr_delete(recovery->index_hash);
}
if (recovery->range_hash != NULL)
vy_recovery_delete_hash(recovery->range_hash);
@@ -2013,7 +2048,7 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
vy_log_record_init(&record);
record.type = VY_LOG_CREATE_INDEX;
- record.index_lsn = index->index_lsn;
+ record.index_id = index->id;
record.index_def_id = index->index_id;
record.space_def_id = index->space_id;
record.key_parts = index->key_parts;
@@ -2024,7 +2059,7 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
if (index->truncate_count > 0) {
vy_log_record_init(&record);
record.type = VY_LOG_TRUNCATE_INDEX;
- record.index_lsn = index->index_lsn;
+ record.index_id = index->id;
record.truncate_count = index->truncate_count;
if (vy_log_append_record(xlog, &record) != 0)
return -1;
@@ -2033,7 +2068,7 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
if (index->dump_lsn >= 0) {
vy_log_record_init(&record);
record.type = VY_LOG_DUMP_INDEX;
- record.index_lsn = index->index_lsn;
+ record.index_id = index->id;
record.dump_lsn = index->dump_lsn;
if (vy_log_append_record(xlog, &record) != 0)
return -1;
@@ -2047,7 +2082,7 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
record.type = VY_LOG_CREATE_RUN;
record.dump_lsn = run->dump_lsn;
}
- record.index_lsn = index->index_lsn;
+ record.index_id = index->id;
record.run_id = run->id;
if (vy_log_append_record(xlog, &record) != 0)
return -1;
@@ -2066,7 +2101,7 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
rlist_foreach_entry(range, &index->ranges, in_index) {
vy_log_record_init(&record);
record.type = VY_LOG_INSERT_RANGE;
- record.index_lsn = index->index_lsn;
+ record.index_id = index->id;
record.range_id = range->id;
record.begin = range->begin;
record.end = range->end;
@@ -2093,7 +2128,7 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
if (index->is_dropped) {
vy_log_record_init(&record);
record.type = VY_LOG_DROP_INDEX;
- record.index_lsn = index->index_lsn;
+ record.index_id = index->id;
if (vy_log_append_record(xlog, &record) != 0)
return -1;
}
diff --git a/src/box/vy_log.h b/src/box/vy_log.h
index ac9b987e..19987c61 100644
--- a/src/box/vy_log.h
+++ b/src/box/vy_log.h
@@ -65,18 +65,18 @@ struct mh_i64ptr_t;
enum vy_log_record_type {
/**
* Create a new vinyl index.
- * Requires vy_log_record::index_lsn, index_def_id, space_def_id,
- * key_def (with primary key parts).
+ * Requires vy_log_record::index_id, index_def_id, space_def_id,
+ * key_def (with primary key parts), commit_lsn.
*/
VY_LOG_CREATE_INDEX = 0,
/**
* Drop an index.
- * Requires vy_log_record::index_lsn.
+ * Requires vy_log_record::index_id.
*/
VY_LOG_DROP_INDEX = 1,
/**
* Insert a new range into a vinyl index.
- * Requires vy_log_record::index_lsn, range_id, begin, end.
+ * Requires vy_log_record::index_id, range_id, begin, end.
*/
VY_LOG_INSERT_RANGE = 2,
/**
@@ -86,7 +86,7 @@ enum vy_log_record_type {
VY_LOG_DELETE_RANGE = 3,
/**
* Prepare a vinyl run file.
- * Requires vy_log_record::index_lsn, run_id.
+ * Requires vy_log_record::index_id, run_id.
*
* Record of this type is written before creating a run file.
* It is needed to keep track of unfinished due to errors run
@@ -95,7 +95,7 @@ enum vy_log_record_type {
VY_LOG_PREPARE_RUN = 4,
/**
* Commit a vinyl run file creation.
- * Requires vy_log_record::index_lsn, run_id, dump_lsn.
+ * Requires vy_log_record::index_id, run_id, dump_lsn.
*
* Written after a run file was successfully created.
*/
@@ -136,7 +136,7 @@ enum vy_log_record_type {
VY_LOG_DELETE_SLICE = 9,
/**
* Update LSN of the last index dump.
- * Requires vy_log_record::index_lsn, dump_lsn.
+ * Requires vy_log_record::index_id, dump_lsn.
*/
VY_LOG_DUMP_INDEX = 10,
/**
@@ -152,7 +152,7 @@ enum vy_log_record_type {
VY_LOG_SNAPSHOT = 11,
/**
* Update truncate count of a vinyl index.
- * Requires vy_log_record::index_lsn, truncate_count.
+ * Requires vy_log_record::index_id, truncate_count.
*/
VY_LOG_TRUNCATE_INDEX = 12,
@@ -163,11 +163,8 @@ enum vy_log_record_type {
struct vy_log_record {
/** Type of the record. */
enum vy_log_record_type type;
- /**
- * LSN from the time of index creation.
- * Used to identify indexes in vylog.
- */
- int64_t index_lsn;
+ /** Unique ID of the vinyl index. */
+ int64_t index_id;
/** Unique ID of the vinyl range. */
int64_t range_id;
/** Unique ID of the vinyl run. */
@@ -194,6 +191,8 @@ struct vy_log_record {
struct key_part_def *key_parts;
/** Number of key parts. */
uint32_t key_part_count;
+ /** LSN of the WAL row corresponding to this record. */
+ int64_t commit_lsn;
/** Max LSN stored on disk. */
int64_t dump_lsn;
/**
@@ -216,8 +215,8 @@ struct vy_recovery {
struct rlist indexes;
/** space_id, index_id -> vy_index_recovery_info. */
struct mh_i64ptr_t *index_id_hash;
- /** index_lsn -> vy_index_recovery_info. */
- struct mh_i64ptr_t *index_lsn_hash;
+ /** ID -> vy_index_recovery_info. */
+ struct mh_i64ptr_t *index_hash;
/** ID -> vy_range_recovery_info. */
struct mh_i64ptr_t *range_hash;
/** ID -> vy_run_recovery_info. */
@@ -235,8 +234,8 @@ struct vy_recovery {
struct vy_index_recovery_info {
/** Link in vy_recovery::indexes. */
struct rlist in_recovery;
- /** LSN of the index creation. */
- int64_t index_lsn;
+ /** ID of the index. */
+ int64_t id;
/** Ordinal index number in the space. */
uint32_t index_id;
/** Space ID. */
@@ -247,6 +246,8 @@ struct vy_index_recovery_info {
uint32_t key_part_count;
/** True if the index was dropped. */
bool is_dropped;
+ /** LSN of the WAL row that committed the index. */
+ int64_t commit_lsn;
/** LSN of the last index dump. */
int64_t dump_lsn;
/** Truncate count. */
@@ -480,8 +481,8 @@ vy_recovery_delete(struct vy_recovery *recovery);
* Returns NULL if the index was not found.
*/
struct vy_index_recovery_info *
-vy_recovery_lookup_index(struct vy_recovery *recovery,
- uint32_t space_id, uint32_t index_id);
+vy_recovery_index_by_id(struct vy_recovery *recovery,
+ uint32_t space_id, uint32_t index_id);
/**
* Initialize a log record with default values.
@@ -496,39 +497,40 @@ vy_log_record_init(struct vy_log_record *record)
/** Helper to log a vinyl index creation. */
static inline void
-vy_log_create_index(int64_t index_lsn, uint32_t index_id, uint32_t space_id,
- const struct key_def *key_def)
+vy_log_create_index(int64_t id, uint32_t space_id, uint32_t index_id,
+ const struct key_def *key_def, int64_t commit_lsn)
{
struct vy_log_record record;
vy_log_record_init(&record);
record.type = VY_LOG_CREATE_INDEX;
- record.index_lsn = index_lsn;
- record.index_def_id = index_id;
+ record.index_id = id;
record.space_def_id = space_id;
+ record.index_def_id = index_id;
record.key_def = key_def;
+ record.commit_lsn = commit_lsn;
vy_log_write(&record);
}
/** Helper to log a vinyl index drop. */
static inline void
-vy_log_drop_index(int64_t index_lsn)
+vy_log_drop_index(int64_t id)
{
struct vy_log_record record;
vy_log_record_init(&record);
record.type = VY_LOG_DROP_INDEX;
- record.index_lsn = index_lsn;
+ record.index_id = id;
vy_log_write(&record);
}
/** Helper to log a vinyl range insertion. */
static inline void
-vy_log_insert_range(int64_t index_lsn, int64_t range_id,
+vy_log_insert_range(int64_t index_id, int64_t range_id,
const char *begin, const char *end)
{
struct vy_log_record record;
vy_log_record_init(&record);
record.type = VY_LOG_INSERT_RANGE;
- record.index_lsn = index_lsn;
+ record.index_id = index_id;
record.range_id = range_id;
record.begin = begin;
record.end = end;
@@ -548,24 +550,24 @@ vy_log_delete_range(int64_t range_id)
/** Helper to log a vinyl run file creation. */
static inline void
-vy_log_prepare_run(int64_t index_lsn, int64_t run_id)
+vy_log_prepare_run(int64_t index_id, int64_t run_id)
{
struct vy_log_record record;
vy_log_record_init(&record);
record.type = VY_LOG_PREPARE_RUN;
- record.index_lsn = index_lsn;
+ record.index_id = index_id;
record.run_id = run_id;
vy_log_write(&record);
}
/** Helper to log a vinyl run creation. */
static inline void
-vy_log_create_run(int64_t index_lsn, int64_t run_id, int64_t dump_lsn)
+vy_log_create_run(int64_t index_id, int64_t run_id, int64_t dump_lsn)
{
struct vy_log_record record;
vy_log_record_init(&record);
record.type = VY_LOG_CREATE_RUN;
- record.index_lsn = index_lsn;
+ record.index_id = index_id;
record.run_id = run_id;
record.dump_lsn = dump_lsn;
vy_log_write(&record);
@@ -623,24 +625,24 @@ vy_log_delete_slice(int64_t slice_id)
/** Helper to log index dump. */
static inline void
-vy_log_dump_index(int64_t index_lsn, int64_t dump_lsn)
+vy_log_dump_index(int64_t id, int64_t dump_lsn)
{
struct vy_log_record record;
vy_log_record_init(&record);
record.type = VY_LOG_DUMP_INDEX;
- record.index_lsn = index_lsn;
+ record.index_id = id;
record.dump_lsn = dump_lsn;
vy_log_write(&record);
}
/** Helper to log index truncation. */
static inline void
-vy_log_truncate_index(int64_t index_lsn, int64_t truncate_count)
+vy_log_truncate_index(int64_t id, int64_t truncate_count)
{
struct vy_log_record record;
vy_log_record_init(&record);
record.type = VY_LOG_TRUNCATE_INDEX;
- record.index_lsn = index_lsn;
+ record.index_id = id;
record.truncate_count = truncate_count;
vy_log_write(&record);
}
diff --git a/src/box/vy_scheduler.c b/src/box/vy_scheduler.c
index 05234532..7c75390c 100644
--- a/src/box/vy_scheduler.c
+++ b/src/box/vy_scheduler.c
@@ -583,7 +583,7 @@ vy_run_prepare(struct vy_run_env *run_env, struct vy_index *index)
if (run == NULL)
return NULL;
vy_log_tx_begin();
- vy_log_prepare_run(index->commit_lsn, run->id);
+ vy_log_prepare_run(index->id, run->id);
if (vy_log_tx_commit() < 0) {
vy_run_unref(run);
return NULL;
@@ -706,7 +706,7 @@ vy_task_dump_complete(struct vy_scheduler *scheduler, struct vy_task *task)
* to log index dump anyway.
*/
vy_log_tx_begin();
- vy_log_dump_index(index->commit_lsn, dump_lsn);
+ vy_log_dump_index(index->id, dump_lsn);
if (vy_log_tx_commit() < 0)
goto fail;
vy_run_discard(new_run);
@@ -766,7 +766,7 @@ vy_task_dump_complete(struct vy_scheduler *scheduler, struct vy_task *task)
* Log change in metadata.
*/
vy_log_tx_begin();
- vy_log_create_run(index->commit_lsn, new_run->id, dump_lsn);
+ vy_log_create_run(index->id, new_run->id, dump_lsn);
for (range = begin_range, i = 0; range != end_range;
range = vy_range_tree_next(index->tree, range), i++) {
assert(i < index->range_count);
@@ -778,7 +778,7 @@ vy_task_dump_complete(struct vy_scheduler *scheduler, struct vy_task *task)
if (++loops % VY_YIELD_LOOPS == 0)
fiber_sleep(0); /* see comment above */
}
- vy_log_dump_index(index->commit_lsn, dump_lsn);
+ vy_log_dump_index(index->id, dump_lsn);
if (vy_log_tx_commit() < 0)
goto fail_free_slices;
@@ -1103,7 +1103,7 @@ vy_task_compact_complete(struct vy_scheduler *scheduler, struct vy_task *task)
rlist_foreach_entry(run, &unused_runs, in_unused)
vy_log_drop_run(run->id, gc_lsn);
if (new_slice != NULL) {
- vy_log_create_run(index->commit_lsn, new_run->id,
+ vy_log_create_run(index->id, new_run->id,
new_run->dump_lsn);
vy_log_insert_slice(range->id, new_run->id, new_slice->id,
tuple_data_or_null(new_slice->begin),
diff --git a/test/unit/vy_log_stub.c b/test/unit/vy_log_stub.c
index 1fda0a6b..7cfaff84 100644
--- a/test/unit/vy_log_stub.c
+++ b/test/unit/vy_log_stub.c
@@ -52,8 +52,8 @@ void
vy_log_write(const struct vy_log_record *record) {}
struct vy_index_recovery_info *
-vy_recovery_lookup_index(struct vy_recovery *recovery,
- uint32_t space_id, uint32_t index_id)
+vy_recovery_index_by_id(struct vy_recovery *recovery,
+ uint32_t space_id, uint32_t index_id)
{
unreachable();
}
diff --git a/test/vinyl/layout.result b/test/vinyl/layout.result
index 603d2865..f1f52b9f 100644
--- a/test/vinyl/layout.result
+++ b/test/vinyl/layout.result
@@ -128,59 +128,59 @@ result
- - HEADER:
type: INSERT
BODY:
- tuple: [0, {0: 3, 7: [{'field': 0, 'collation': 1, 'type': 'string'}], 6: 512}]
+ tuple: [0, {7: [{'field': 0, 'collation': 1, 'type': 'string'}], 6: 512}]
- HEADER:
type: INSERT
BODY:
- tuple: [10, {0: 3, 9: 9}]
+ tuple: [10, {9: 9}]
- HEADER:
type: INSERT
BODY:
- tuple: [5, {0: 3, 2: 6, 9: 9}]
+ tuple: [5, {2: 8, 9: 9}]
- HEADER:
type: INSERT
BODY:
- tuple: [4, {0: 3, 2: 3}]
+ tuple: [4, {2: 5}]
- HEADER:
type: INSERT
BODY:
- tuple: [6, {2: 3}]
+ tuple: [6, {2: 5}]
- HEADER:
type: INSERT
BODY:
- tuple: [2, {0: 3}]
+ tuple: [2, {1: 1}]
- HEADER:
type: INSERT
BODY:
- tuple: [8, {2: 6, 8: 7}]
+ tuple: [8, {1: 1, 2: 8, 8: 9}]
- HEADER:
type: INSERT
BODY:
- tuple: [0, {0: 4, 5: 1, 6: 512, 7: [{'field': 1, 'is_nullable': true, 'type': 'unsigned'}]}]
+ tuple: [0, {0: 2, 5: 1, 6: 512, 7: [{'field': 1, 'is_nullable': true, 'type': 'unsigned'}]}]
- HEADER:
type: INSERT
BODY:
- tuple: [10, {0: 4, 9: 9}]
+ tuple: [10, {0: 2, 9: 9}]
- HEADER:
type: INSERT
BODY:
- tuple: [5, {0: 4, 2: 4, 9: 9}]
+ tuple: [5, {0: 2, 2: 6, 9: 9}]
- HEADER:
type: INSERT
BODY:
- tuple: [4, {0: 4, 2: 2}]
+ tuple: [4, {0: 2, 2: 4}]
- HEADER:
type: INSERT
BODY:
- tuple: [6, {2: 2}]
+ tuple: [6, {2: 4}]
- HEADER:
type: INSERT
BODY:
- tuple: [2, {0: 4, 1: 1}]
+ tuple: [2, {0: 2, 1: 3}]
- HEADER:
type: INSERT
BODY:
- tuple: [8, {1: 1, 2: 4, 8: 5}]
+ tuple: [8, {1: 3, 2: 6, 8: 7}]
- HEADER:
type: INSERT
BODY:
@@ -189,53 +189,53 @@ result
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [7, {2: 2}]
+ tuple: [7, {2: 4}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [7, {2: 3}]
+ tuple: [7, {2: 5}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [4, {0: 4, 2: 8}]
+ tuple: [4, {0: 2, 2: 10}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [5, {0: 4, 2: 8, 9: 12}]
+ tuple: [5, {0: 2, 2: 10, 9: 12}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [8, {1: 1, 2: 8, 8: 9}]
+ tuple: [8, {1: 3, 2: 10, 8: 11}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [10, {0: 4, 9: 12}]
+ tuple: [10, {0: 2, 9: 12}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [4, {0: 3, 2: 10}]
+ tuple: [4, {2: 12}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [5, {0: 3, 2: 10, 9: 12}]
+ tuple: [5, {2: 12, 9: 12}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [8, {2: 10, 8: 11}]
+ tuple: [8, {1: 1, 2: 12, 8: 13}]
- HEADER:
timestamp: <timestamp>
type: INSERT
BODY:
- tuple: [10, {0: 3, 9: 12}]
- - - 00000000000000000006.index
+ tuple: [10, {9: 12}]
+ - - 00000000000000000008.index
- - HEADER:
type: RUNINFO
BODY:
@@ -254,7 +254,7 @@ result
unpacked_size: 67
row_count: 3
min_key: ['ёёё']
- - - 00000000000000000006.run
+ - - 00000000000000000008.run
- - HEADER:
lsn: 9
type: INSERT
@@ -274,7 +274,7 @@ result
type: ROWINDEX
BODY:
row_index: "\0\0\0\0\0\0\0\x10\0\0\0 "
- - - 00000000000000000010.index
+ - - 00000000000000000012.index
- - HEADER:
type: RUNINFO
BODY:
@@ -293,7 +293,7 @@ result
unpacked_size: 71
row_count: 3
min_key: ['ёёё']
- - - 00000000000000000010.run
+ - - 00000000000000000012.run
- - HEADER:
lsn: 10
type: REPLACE
@@ -313,7 +313,7 @@ result
type: ROWINDEX
BODY:
row_index: "\0\0\0\0\0\0\0\x10\0\0\0\""
- - - 00000000000000000004.index
+ - - 00000000000000000006.index
- - HEADER:
type: RUNINFO
BODY:
@@ -332,7 +332,7 @@ result
unpacked_size: 67
row_count: 3
min_key: [null, 'ёёё']
- - - 00000000000000000004.run
+ - - 00000000000000000006.run
- - HEADER:
lsn: 9
type: INSERT
@@ -352,7 +352,7 @@ result
type: ROWINDEX
BODY:
row_index: "\0\0\0\0\0\0\0\x10\0\0\0 "
- - - 00000000000000000008.index
+ - - 00000000000000000010.index
- - HEADER:
type: RUNINFO
BODY:
@@ -371,7 +371,7 @@ result
unpacked_size: 91
row_count: 4
min_key: [null, 'ёёё']
- - - 00000000000000000008.run
+ - - 00000000000000000010.run
- - HEADER:
lsn: 10
type: DELETE
--
2.11.0
* [PATCH v2 5/5] alter: rewrite space truncation using alter infrastructure
2018-03-20 11:29 [PATCH v2 0/5] Prepare vylog for space alter Vladimir Davydov
` (3 preceding siblings ...)
2018-03-20 11:29 ` [PATCH v2 4/5] vinyl: do not use index lsn to identify indexes in vylog Vladimir Davydov
@ 2018-03-20 11:29 ` Vladimir Davydov
4 siblings, 0 replies; 7+ messages in thread
From: Vladimir Davydov @ 2018-03-20 11:29 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
Truncation of a space is equivalent to recreation of all space indexes
with the same definition. The reason why we use a special system space
to trigger space truncation (_truncate) is that we don't have
transactional DDL while space truncation has to be done atomically.
However, apart from the new system space, implementation of truncation
entailed a new vylog record (VY_LOG_TRUNCATE_INDEX) and quite a few
lines of code to handle it. So why couldn't we just invoke ALTER that
would recreate all indexes?
To answer this question, one needs to recall that back then vinyl used
LSN to identify indexes in vylog. As a result, we couldn't recreate more
than one index in one operation - if we did that, they would all have
the same LSN and hence wouldn't be distinguishable in vylog. So we had
to introduce a special vylog operation (VY_LOG_TRUNCATE_INDEX) that
bumps the truncation counter of an index instead of just dropping and
recreating it. We also had to introduce a pair of new virtual space
methods, prepare_truncate and commit_truncate, so that we could write
this new command to vylog in vinyl. Putting it all together, it becomes
obvious why we couldn't reuse ALTER code for space truncation.
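To make the collision concrete, here is a minimal, self-contained sketch
in C. It is not taken from vy_log.c: struct index_rec, count_by_key and
both recreate_* helpers are hypothetical names used only for
illustration. It compares keying entries by the LSN of the operation
with keying them by a monotonic counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an index entry kept by vylog recovery. */
struct index_rec {
	int64_t key;       /* identifier used for lookup */
	uint32_t index_id; /* ordinal index number in the space */
};

/* Count entries registered under @key (linear scan for brevity). */
int
count_by_key(const struct index_rec *recs, int n, int64_t key)
{
	int count = 0;
	for (int i = 0; i < n; i++) {
		if (recs[i].key == key)
			count++;
	}
	return count;
}

/*
 * Recreate @n indexes in one operation committed at @lsn, keying
 * every entry by that LSN. Returns how many entries share the key.
 */
int
recreate_keyed_by_lsn(struct index_rec *recs, int n, int64_t lsn)
{
	for (int i = 0; i < n; i++)
		recs[i] = (struct index_rec){ lsn, (uint32_t)i };
	return count_by_key(recs, n, lsn);
}

/*
 * Same operation, but every entry takes a fresh value from a
 * monotonic counter. Returns the largest number of entries that
 * share any single key.
 */
int
recreate_keyed_by_counter(struct index_rec *recs, int n, int64_t *next_id)
{
	int worst = 0;
	for (int i = 0; i < n; i++)
		recs[i] = (struct index_rec){ (*next_id)++, (uint32_t)i };
	for (int i = 0; i < n; i++) {
		int c = count_by_key(recs, n, recs[i].key);
		if (c > worst)
			worst = c;
	}
	return worst;
}
```

With three indexes recreated in one operation, the LSN-keyed variant
leaves all three entries under one key, so a lookup by LSN cannot tell
them apart, while the counter-keyed variant keeps one entry per key.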
Fortunately, things have changed since then. Now, vylog identifies
indexes by a unique index id instead of LSN. That means that now we can
simplify space truncation implementation a great deal by
- reusing alter_space_do() for space truncation,
- dropping space_vtab::prepare_truncate and commit_truncate,
- removing truncate_count from space, index, and vylog.
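The idea of expressing truncation as "drop and recreate every index"
can be sketched in plain C as follows. This is only an illustration
under assumptions, not the alter.cc code: struct alter_op,
enum alter_op_type and plan_truncate are hypothetical, and index ids
are assumed to be dense (0..index_count-1) to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation kinds queued for a single ALTER. */
enum alter_op_type { OP_DROP_INDEX, OP_CREATE_INDEX };

/* Hypothetical queued operation: what to do and to which index. */
struct alter_op {
	enum alter_op_type type;
	uint32_t index_id;
};

/*
 * Plan truncation of a space with @index_count indexes as one
 * ALTER: for each index queue a drop followed by a create with
 * the same definition. Returns the number of queued operations,
 * or -1 if @ops (capacity @cap) is too small.
 */
int
plan_truncate(uint32_t index_count, struct alter_op *ops, size_t cap)
{
	if ((size_t)index_count * 2 > cap)
		return -1;
	size_t n = 0;
	for (uint32_t i = 0; i < index_count; i++) {
		ops[n++] = (struct alter_op){ OP_DROP_INDEX, i };
		ops[n++] = (struct alter_op){ OP_CREATE_INDEX, i };
	}
	return (int)n;
}
```

All queued operations belong to one ALTER and therefore share one LSN,
which is why this approach depends on indexes no longer being keyed by
LSN in vylog.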
---
src/box/alter.cc | 107 ++++---------------------------
src/box/memtx_space.c | 29 +++------
src/box/space.h | 44 -------------
src/box/sysview_engine.c | 17 -----
src/box/vinyl.c | 160 ++++-------------------------------------------
src/box/vy_index.c | 25 +-------
src/box/vy_index.h | 22 -------
src/box/vy_log.c | 56 +----------------
src/box/vy_log.h | 29 ++++-----
9 files changed, 48 insertions(+), 441 deletions(-)
diff --git a/src/box/alter.cc b/src/box/alter.cc
index 8455373b..54db02e3 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -826,7 +826,6 @@ alter_space_do(struct txn *txn, struct alter_space *alter)
space_prepare_alter_xc(alter->old_space, alter->new_space);
alter->new_space->sequence = alter->old_space->sequence;
- alter->new_space->truncate_count = alter->old_space->truncate_count;
memcpy(alter->new_space->access, alter->old_space->access,
sizeof(alter->old_space->access));
@@ -1803,48 +1802,6 @@ on_replace_dd_index(struct trigger * /* trigger */, void *event)
scoped_guard.is_active = false;
}
-/* {{{ space truncate */
-
-struct truncate_space {
- /** Space being truncated. */
- struct space *old_space;
- /** Space created as a result of truncation. */
- struct space *new_space;
- /** Trigger executed to commit truncation. */
- struct trigger on_commit;
- /** Trigger executed to rollback truncation. */
- struct trigger on_rollback;
-};
-
-/**
- * Call the engine specific method to commit truncation
- * and delete the old space.
- */
-static void
-truncate_space_commit(struct trigger *trigger, void * /* event */)
-{
- struct truncate_space *truncate =
- (struct truncate_space *) trigger->data;
- space_commit_truncate(truncate->old_space, truncate->new_space);
- space_delete(truncate->old_space);
-}
-
-/**
- * Move the old space back to the cache and delete
- * the new space.
- */
-static void
-truncate_space_rollback(struct trigger *trigger, void * /* event */)
-{
- struct truncate_space *truncate =
- (struct truncate_space *) trigger->data;
- if (space_cache_replace(truncate->old_space) != truncate->new_space)
- unreachable();
-
- space_swap_triggers(truncate->new_space, truncate->old_space);
- space_delete(truncate->new_space);
-}
-
/**
* A trigger invoked on replace in space _truncate.
*
@@ -1871,16 +1828,13 @@ on_replace_dd_truncate(struct trigger * /* trigger */, void *event)
uint32_t space_id =
tuple_field_u32_xc(new_tuple, BOX_TRUNCATE_FIELD_SPACE_ID);
- uint64_t truncate_count =
- tuple_field_u64_xc(new_tuple, BOX_TRUNCATE_FIELD_COUNT);
struct space *old_space = space_cache_find_xc(space_id);
if (stmt->row->type == IPROTO_INSERT) {
/*
* Space creation during initial recovery -
- * initialize truncate_count.
+ * nothing to do.
*/
- old_space->truncate_count = truncate_count;
return;
}
@@ -1898,59 +1852,24 @@ on_replace_dd_truncate(struct trigger * /* trigger */, void *event)
*/
access_check_space_xc(old_space, PRIV_W);
- /*
- * Truncate counter is updated - truncate the space.
- */
- struct truncate_space *truncate =
- region_calloc_object_xc(&fiber()->gc, struct truncate_space);
-
- /* Create an empty copy of the old space. */
- struct rlist key_list;
- space_dump_def(old_space, &key_list);
- struct space *new_space = space_new_xc(old_space->def, &key_list);
- new_space->truncate_count = truncate_count;
- auto space_guard = make_scoped_guard([=] { space_delete(new_space); });
-
- /* Notify the engine about upcoming space truncation. */
- space_prepare_truncate_xc(old_space, new_space);
-
- space_guard.is_active = false;
-
- /* Preserve the access control lists during truncate. */
- memcpy(new_space->access, old_space->access, sizeof(old_space->access));
-
- /* Truncate does not affect space sequence. */
- new_space->sequence = old_space->sequence;
-
- /*
- * Replace the old space with the new one in the space
- * cache. Requests processed after this point will see
- * the space as truncated.
- */
- if (space_cache_replace(new_space) != old_space)
- unreachable();
+ struct alter_space *alter = alter_space_new(old_space);
+ auto scoped_guard =
+ make_scoped_guard([=] { alter_space_delete(alter); });
/*
- * Register the trigger that will commit or rollback
- * truncation depending on whether WAL write succeeds
- * or fails.
+ * Recreate all indexes of the truncated space.
*/
- truncate->old_space = old_space;
- truncate->new_space = new_space;
-
- trigger_create(&truncate->on_commit,
- truncate_space_commit, truncate, NULL);
- txn_on_commit(txn, &truncate->on_commit);
-
- trigger_create(&truncate->on_rollback,
- truncate_space_rollback, truncate, NULL);
- txn_on_rollback(txn, &truncate->on_rollback);
+ for (uint32_t i = 0; i < old_space->index_count; i++) {
+ struct index *old_index = old_space->index[i];
+ (void) new DropIndex(alter, old_index->def);
+ auto create_index = new CreateIndex(alter);
+ create_index->new_index_def = index_def_dup_xc(old_index->def);
+ }
- space_swap_triggers(truncate->new_space, truncate->old_space);
+ alter_space_do(txn, alter);
+ scoped_guard.is_active = false;
}
-/* }}} */
-
/* {{{ access control */
bool
diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c
index 2d94597a..c7e58946 100644
--- a/src/box/memtx_space.c
+++ b/src/box/memtx_space.c
@@ -818,16 +818,6 @@ memtx_space_build_secondary_key(struct space *old_space,
return rc;
}
-static int
-memtx_space_prepare_truncate(struct space *old_space,
- struct space *new_space)
-{
- struct memtx_space *old_memtx_space = (struct memtx_space *)old_space;
- struct memtx_space *new_memtx_space = (struct memtx_space *)new_space;
- new_memtx_space->replace = old_memtx_space->replace;
- return 0;
-}
-
static void
memtx_space_prune(struct space *space)
{
@@ -858,14 +848,6 @@ fail:
panic("failed to prune space");
}
-static void
-memtx_space_commit_truncate(struct space *old_space,
- struct space *new_space)
-{
- (void)new_space;
- memtx_space_prune(old_space);
-}
-
static int
memtx_space_prepare_alter(struct space *old_space, struct space *new_space)
{
@@ -883,9 +865,14 @@ memtx_space_commit_alter(struct space *old_space, struct space *new_space)
{
struct memtx_space *old_memtx_space = (struct memtx_space *)old_space;
struct memtx_space *new_memtx_space = (struct memtx_space *)new_space;
+ bool is_empty = new_space->index_count == 0 ||
+ index_size(new_space->index[0]) == 0;
- /* Delete all tuples when the last index is dropped. */
- if (new_space->index_count == 0)
+ /*
+ * Delete all tuples when the last index is dropped
+ * or the space is truncated.
+ */
+ if (is_empty)
memtx_space_prune(old_space);
else
new_memtx_space->bsize = old_memtx_space->bsize;
@@ -908,8 +895,6 @@ static const struct space_vtab memtx_space_vtab = {
/* .drop_primary_key = */ memtx_space_drop_primary_key,
/* .check_format = */ memtx_space_check_format,
/* .build_secondary_key = */ memtx_space_build_secondary_key,
- /* .prepare_truncate = */ memtx_space_prepare_truncate,
- /* .commit_truncate = */ memtx_space_commit_truncate,
/* .prepare_alter = */ memtx_space_prepare_alter,
/* .commit_alter = */ memtx_space_commit_alter,
};
diff --git a/src/box/space.h b/src/box/space.h
index 6408eedc..65f1531d 100644
--- a/src/box/space.h
+++ b/src/box/space.h
@@ -104,23 +104,6 @@ struct space_vtab {
struct space *new_space,
struct index *new_index);
/**
- * Notify the enigne about upcoming space truncation
- * so that it can prepare new_space object.
- */
- int (*prepare_truncate)(struct space *old_space,
- struct space *new_space);
- /**
- * Commit space truncation. Called after space truncate
- * record was written to WAL hence must not fail.
- *
- * The old_space is the space that was replaced with the
- * new_space as a result of truncation. The callback is
- * supposed to release resources associated with the
- * old_space and commit the new_space.
- */
- void (*commit_truncate)(struct space *old_space,
- struct space *new_space);
- /**
* Notify the engine about the changed space,
* before it's done, to prepare 'new_space' object.
*/
@@ -167,12 +150,6 @@ struct space {
struct space_def *def;
/** Sequence attached to this space or NULL. */
struct sequence *sequence;
- /**
- * Number of times the space has been truncated.
- * Updating this counter via _truncate space triggers
- * space truncation.
- */
- uint64_t truncate_count;
/** Enable/disable triggers. */
bool run_triggers;
/**
@@ -354,20 +331,6 @@ space_build_secondary_key(struct space *old_space,
}
static inline int
-space_prepare_truncate(struct space *old_space, struct space *new_space)
-{
- assert(old_space->vtab == new_space->vtab);
- return new_space->vtab->prepare_truncate(old_space, new_space);
-}
-
-static inline void
-space_commit_truncate(struct space *old_space, struct space *new_space)
-{
- assert(old_space->vtab == new_space->vtab);
- new_space->vtab->commit_truncate(old_space, new_space);
-}
-
-static inline int
space_prepare_alter(struct space *old_space, struct space *new_space)
{
assert(old_space->vtab == new_space->vtab);
@@ -525,13 +488,6 @@ space_build_secondary_key_xc(struct space *old_space,
}
static inline void
-space_prepare_truncate_xc(struct space *old_space, struct space *new_space)
-{
- if (space_prepare_truncate(old_space, new_space) != 0)
- diag_raise();
-}
-
-static inline void
space_prepare_alter_xc(struct space *old_space, struct space *new_space)
{
if (space_prepare_alter(old_space, new_space) != 0)
diff --git a/src/box/sysview_engine.c b/src/box/sysview_engine.c
index 27d9263a..f6122645 100644
--- a/src/box/sysview_engine.c
+++ b/src/box/sysview_engine.c
@@ -147,21 +147,6 @@ sysview_space_build_secondary_key(struct space *old_space,
}
static int
-sysview_space_prepare_truncate(struct space *old_space, struct space *new_space)
-{
- (void)old_space;
- (void)new_space;
- return 0;
-}
-
-static void
-sysview_space_commit_truncate(struct space *old_space, struct space *new_space)
-{
- (void)old_space;
- (void)new_space;
-}
-
-static int
sysview_space_prepare_alter(struct space *old_space, struct space *new_space)
{
(void)old_space;
@@ -200,8 +185,6 @@ static const struct space_vtab sysview_space_vtab = {
/* .drop_primary_key = */ sysview_space_drop_primary_key,
/* .check_format = */ sysview_space_check_format,
/* .build_secondary_key = */ sysview_space_build_secondary_key,
- /* .prepare_truncate = */ sysview_space_prepare_truncate,
- /* .commit_truncate = */ sysview_space_commit_truncate,
/* .prepare_alter = */ sysview_space_prepare_alter,
/* .commit_alter = */ sysview_space_commit_alter,
};
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 1908d5fc..f13a9fe9 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -904,137 +904,6 @@ vinyl_init_system_space(struct space *space)
}
static int
-vinyl_space_prepare_truncate(struct space *old_space, struct space *new_space)
-{
- struct vy_env *env = vy_env(old_space->engine);
-
- if (vinyl_check_wal(env, "DDL") != 0)
- return -1;
-
- assert(old_space->index_count == new_space->index_count);
- uint32_t index_count = new_space->index_count;
- if (index_count == 0)
- return 0;
-
- struct vy_index *pk = vy_index(old_space->index[0]);
-
- /*
- * On local recovery, we need to handle the following
- * scenarios:
- *
- * - Space truncation was successfully logged before restart.
- * In this case indexes of the old space contain data added
- * after truncation (recovered by vy_index_recover()) and
- * hence we just need to swap contents between old and new
- * spaces.
- *
- * - We failed to log space truncation before restart.
- * In this case we have to replay space truncation the
- * same way we handle it during normal operation.
- *
- * See also vy_commit_truncate_space().
- */
- bool truncate_done = (env->status == VINYL_FINAL_RECOVERY_LOCAL &&
- pk->truncate_count > old_space->truncate_count);
-
- for (uint32_t i = 0; i < index_count; i++) {
- struct vy_index *old_index = vy_index(old_space->index[i]);
- struct vy_index *new_index = vy_index(new_space->index[i]);
-
- new_index->id = old_index->id;
- new_index->is_committed = old_index->is_committed;
-
- if (truncate_done) {
- /*
- * We are replaying truncate from WAL and the
- * old space already contains data added after
- * truncate (recovered from vylog). Avoid
- * reloading the space content from vylog,
- * simply swap the contents of old and new
- * spaces instead.
- */
- vy_index_swap(old_index, new_index);
- new_index->is_dropped = old_index->is_dropped;
- new_index->truncate_count = old_index->truncate_count;
- vy_scheduler_remove_index(&env->scheduler, old_index);
- vy_scheduler_add_index(&env->scheduler, new_index);
- continue;
- }
-
- if (vy_index_init_range_tree(new_index) != 0)
- return -1;
-
- new_index->truncate_count = new_space->truncate_count;
- }
- return 0;
-}
-
-static void
-vinyl_space_commit_truncate(struct space *old_space, struct space *new_space)
-{
- struct vy_env *env = vy_env(old_space->engine);
-
- assert(old_space->index_count == new_space->index_count);
- uint32_t index_count = new_space->index_count;
- if (index_count == 0)
- return;
-
- struct vy_index *pk = vy_index(old_space->index[0]);
-
- /*
- * See the comment in vy_prepare_truncate_space().
- */
- if (env->status == VINYL_FINAL_RECOVERY_LOCAL &&
- pk->truncate_count > old_space->truncate_count)
- return;
-
- /*
- * Mark old indexes as dropped and remove them from the scheduler.
- * After this point no task can be scheduled or completed for any
- * of them (only aborted).
- */
- for (uint32_t i = 0; i < index_count; i++) {
- struct vy_index *index = vy_index(old_space->index[i]);
- index->is_dropped = true;
- vy_scheduler_remove_index(&env->scheduler, index);
- }
-
- /*
- * Log change in metadata.
- *
- * Since we can't fail here, in case of vylog write failure
- * we leave records we failed to write in vylog buffer so
- * that they get flushed along with the next write. If they
- * don't, we will replay them during WAL recovery.
- */
- vy_log_tx_begin();
- int64_t gc_lsn = checkpoint_last(NULL);
- for (uint32_t i = 0; i < index_count; i++) {
- struct vy_index *old_index = vy_index(old_space->index[i]);
- struct vy_index *new_index = vy_index(new_space->index[i]);
- struct vy_range *range = vy_range_tree_first(new_index->tree);
-
- assert(!new_index->is_dropped);
- assert(new_index->truncate_count == new_space->truncate_count);
- assert(new_index->range_count == 1);
-
- vy_log_index_prune(old_index, gc_lsn);
- vy_log_insert_range(new_index->id, range->id, NULL, NULL);
- vy_log_truncate_index(new_index->id, new_index->truncate_count);
- }
- vy_log_tx_try_commit();
-
- /*
- * After we committed space truncation in the metadata log,
- * we can make new indexes eligible for dump and compaction.
- */
- for (uint32_t i = 0; i < index_count; i++) {
- struct vy_index *index = vy_index(new_space->index[i]);
- vy_scheduler_add_index(&env->scheduler, index);
- }
-}
-
-static int
vinyl_space_prepare_alter(struct space *old_space, struct space *new_space)
{
struct vy_env *env = vy_env(old_space->engine);
@@ -1329,15 +1198,12 @@ vinyl_index_bsize(struct index *base)
* either.
*/
static inline bool
-vy_is_committed_one(struct vy_env *env, struct space *space,
- struct vy_index *index)
+vy_is_committed_one(struct vy_env *env, struct vy_index *index)
{
if (likely(env->status != VINYL_FINAL_RECOVERY_LOCAL))
return false;
if (index->is_dropped)
return true;
- if (index->truncate_count > space->truncate_count)
- return true;
if (vclock_sum(env->recovery_vclock) <= index->dump_lsn)
return true;
return false;
@@ -1354,7 +1220,7 @@ vy_is_committed(struct vy_env *env, struct space *space)
return false;
for (uint32_t iid = 0; iid < space->index_count; iid++) {
struct vy_index *index = vy_index(space->index[iid]);
- if (!vy_is_committed_one(env, space, index))
+ if (!vy_is_committed_one(env, index))
return false;
}
return true;
@@ -1591,7 +1457,7 @@ vy_replace_impl(struct vy_env *env, struct vy_tx *tx, struct space *space,
if (pk == NULL) /* space has no primary key */
return -1;
/* Primary key is dumped last. */
- assert(!vy_is_committed_one(env, space, pk));
+ assert(!vy_is_committed_one(env, pk));
assert(pk->index_id == 0);
if (tuple_validate_raw(pk->mem_format, request->tuple))
return -1;
@@ -1630,7 +1496,7 @@ vy_replace_impl(struct vy_env *env, struct vy_tx *tx, struct space *space,
for (uint32_t iid = 1; iid < space->index_count; ++iid) {
struct vy_index *index;
index = vy_index(space->index[iid]);
- if (vy_is_committed_one(env, space, index))
+ if (vy_is_committed_one(env, index))
continue;
/*
* Delete goes first, so if old and new keys
@@ -1763,7 +1629,7 @@ vy_delete_impl(struct vy_env *env, struct vy_tx *tx, struct space *space,
if (pk == NULL)
return -1;
/* Primary key is dumped last. */
- assert(!vy_is_committed_one(env, space, pk));
+ assert(!vy_is_committed_one(env, pk));
struct tuple *delete =
vy_stmt_new_surrogate_delete(pk->mem_format, tuple);
if (delete == NULL)
@@ -1775,7 +1641,7 @@ vy_delete_impl(struct vy_env *env, struct vy_tx *tx, struct space *space,
struct vy_index *index;
for (uint32_t i = 1; i < space->index_count; ++i) {
index = vy_index(space->index[i]);
- if (vy_is_committed_one(env, space, index))
+ if (vy_is_committed_one(env, index))
continue;
if (vy_tx_set(tx, index, delete) != 0)
goto error;
@@ -1924,7 +1790,7 @@ vy_update(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
assert(pk != NULL);
assert(pk->index_id == 0);
/* Primary key is dumped last. */
- assert(!vy_is_committed_one(env, space, pk));
+ assert(!vy_is_committed_one(env, pk));
uint64_t column_mask = 0;
const char *new_tuple, *new_tuple_end;
uint32_t new_size, old_size;
@@ -1978,7 +1844,7 @@ vy_update(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
for (uint32_t i = 1; i < space->index_count; ++i) {
index = vy_index(space->index[i]);
- if (vy_is_committed_one(env, space, index))
+ if (vy_is_committed_one(env, index))
continue;
if (vy_tx_set(tx, index, delete) != 0)
goto error;
@@ -2162,7 +2028,7 @@ vy_upsert(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
if (pk == NULL)
return -1;
/* Primary key is dumped last. */
- assert(!vy_is_committed_one(env, space, pk));
+ assert(!vy_is_committed_one(env, pk));
if (tuple_validate_raw(pk->mem_format, tuple))
return -1;
@@ -2258,7 +2124,7 @@ vy_upsert(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
for (uint32_t i = 1; i < space->index_count; ++i) {
index = vy_index(space->index[i]);
- if (vy_is_committed_one(env, space, index))
+ if (vy_is_committed_one(env, index))
continue;
if (vy_tx_set(tx, index, delete) != 0)
goto error;
@@ -2298,7 +2164,7 @@ vy_insert(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
return -1;
assert(pk->index_id == 0);
/* Primary key is dumped last. */
- assert(!vy_is_committed_one(env, space, pk));
+ assert(!vy_is_committed_one(env, pk));
if (tuple_validate_raw(pk->mem_format, request->tuple))
return -1;
/* First insert into the primary index. */
@@ -2311,7 +2177,7 @@ vy_insert(struct vy_env *env, struct vy_tx *tx, struct txn_stmt *stmt,
for (uint32_t iid = 1; iid < space->index_count; ++iid) {
struct vy_index *index = vy_index(space->index[iid]);
- if (vy_is_committed_one(env, space, index))
+ if (vy_is_committed_one(env, index))
continue;
if (vy_insert_secondary(env, tx, space, index,
stmt->new_tuple) != 0)
@@ -4080,8 +3946,6 @@ static const struct space_vtab vinyl_space_vtab = {
/* .drop_primary_key = */ vinyl_space_drop_primary_key,
/* .check_format = */ vinyl_space_check_format,
/* .build_secondary_key = */ vinyl_space_build_secondary_key,
- /* .prepare_truncate = */ vinyl_space_prepare_truncate,
- /* .commit_truncate = */ vinyl_space_commit_truncate,
/* .prepare_alter = */ vinyl_space_prepare_alter,
/* .commit_alter = */ vinyl_space_commit_alter,
};
diff --git a/src/box/vy_index.c b/src/box/vy_index.c
index de8c5f1e..59c79910 100644
--- a/src/box/vy_index.c
+++ b/src/box/vy_index.c
@@ -318,23 +318,8 @@ vy_index_delete(struct vy_index *index)
free(index);
}
-void
-vy_index_swap(struct vy_index *old_index, struct vy_index *new_index)
-{
- assert(old_index->stat.memory.count.rows == 0);
- assert(new_index->stat.memory.count.rows == 0);
-
- SWAP(old_index->dump_lsn, new_index->dump_lsn);
- SWAP(old_index->range_count, new_index->range_count);
- SWAP(old_index->run_count, new_index->run_count);
- SWAP(old_index->stat, new_index->stat);
- SWAP(old_index->run_hist, new_index->run_hist);
- SWAP(old_index->tree, new_index->tree);
- SWAP(old_index->range_heap, new_index->range_heap);
- rlist_swap(&old_index->runs, &new_index->runs);
-}
-
-int
+/** Initialize the range tree of a new index. */
+static int
vy_index_init_range_tree(struct vy_index *index)
{
struct vy_range *range = vy_range_new(vy_log_next_id(), NULL, NULL,
@@ -606,11 +591,6 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
*/
index->is_dropped = true;
/*
- * If the index was dropped, we don't need to replay
- * truncate (see vinyl_space_prepare_truncate()).
- */
- index->truncate_count = UINT64_MAX;
- /*
* We need range tree initialized for all indexes,
* even for dropped ones.
*/
@@ -621,7 +601,6 @@ vy_index_recover(struct vy_index *index, struct vy_recovery *recovery,
* Loading the last incarnation of the index from vylog.
*/
index->dump_lsn = index_info->dump_lsn;
- index->truncate_count = index_info->truncate_count;
int rc = 0;
struct vy_range_recovery_info *range_info;
diff --git a/src/box/vy_index.h b/src/box/vy_index.h
index 417f2d36..33d1da4a 100644
--- a/src/box/vy_index.h
+++ b/src/box/vy_index.h
@@ -267,15 +267,6 @@ struct vy_index {
*/
bool is_dropped;
/**
- * Number of times the index was truncated.
- *
- * After recovery is complete, it equals space->truncate_count.
- * On local recovery, it is loaded from the metadata log and may
- * be greater than space->truncate_count, which indicates that
- * the space is truncated in WAL.
- */
- uint64_t truncate_count;
- /**
* If pin_count > 0 the index can't be scheduled for dump.
* Used to make sure that the primary index is dumped last.
*/
@@ -346,19 +337,6 @@ vy_index_unref(struct vy_index *index)
}
/**
- * Swap disk contents (ranges, runs, and corresponding stats)
- * between two indexes. Used only on recovery, to skip reloading
- * indexes of a truncated space. The in-memory tree of the index
- * can't be populated - see vy_is_committed_one().
- */
-void
-vy_index_swap(struct vy_index *old_index, struct vy_index *new_index);
-
-/** Initialize the range tree of a new index. */
-int
-vy_index_init_range_tree(struct vy_index *index);
-
-/**
* Create a new vinyl index.
*
* This function is called when an index is created after recovery
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index 9c8dd631..0a5dd26e 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -264,10 +264,6 @@ vy_log_record_snprint(char *buf, int size, const struct vy_log_record *record)
SNPRINT(total, snprintf, buf, size, "%s=%"PRIi64", ",
vy_log_key_name[VY_LOG_KEY_GC_LSN],
record->gc_lsn);
- if (record->truncate_count > 0)
- SNPRINT(total, snprintf, buf, size, "%s=%"PRIi64", ",
- vy_log_key_name[VY_LOG_KEY_TRUNCATE_COUNT],
- record->truncate_count);
SNPRINT(total, snprintf, buf, size, "}");
return total;
}
@@ -379,11 +375,6 @@ vy_log_record_encode(const struct vy_log_record *record,
size += mp_sizeof_uint(record->gc_lsn);
n_keys++;
}
- if (record->truncate_count > 0) {
- size += mp_sizeof_uint(VY_LOG_KEY_TRUNCATE_COUNT);
- size += mp_sizeof_uint(record->truncate_count);
- n_keys++;
- }
size += mp_sizeof_map(n_keys);
/*
@@ -454,10 +445,6 @@ vy_log_record_encode(const struct vy_log_record *record,
pos = mp_encode_uint(pos, VY_LOG_KEY_GC_LSN);
pos = mp_encode_uint(pos, record->gc_lsn);
}
- if (record->truncate_count > 0) {
- pos = mp_encode_uint(pos, VY_LOG_KEY_TRUNCATE_COUNT);
- pos = mp_encode_uint(pos, record->truncate_count);
- }
assert(pos == tuple + size);
/*
@@ -577,7 +564,7 @@ vy_log_record_decode(struct vy_log_record *record,
record->gc_lsn = mp_decode_uint(&pos);
break;
case VY_LOG_KEY_TRUNCATE_COUNT:
- record->truncate_count = mp_decode_uint(&pos);
+ /* Not used anymore, ignore. */
break;
default:
diag_set(ClientError, ER_INVALID_VYLOG_FILE,
@@ -1277,7 +1264,6 @@ vy_recovery_create_index(struct vy_recovery *recovery, int64_t id,
index->is_dropped = false;
index->commit_lsn = commit_lsn;
index->dump_lsn = -1;
- index->truncate_count = 0;
/*
* Add the index to the hash.
@@ -1374,33 +1360,6 @@ vy_recovery_dump_index(struct vy_recovery *recovery,
}
/**
- * Handle a VY_LOG_TRUNCATE_INDEX log record.
- * This function updates truncate_count of the index with ID @id.
- * Returns 0 on success, -1 if ID not found or index is dropped.
- */
-static int
-vy_recovery_truncate_index(struct vy_recovery *recovery,
- int64_t id, int64_t truncate_count)
-{
- struct vy_index_recovery_info *index;
- index = vy_recovery_lookup_index(recovery, id);
- if (index == NULL) {
- diag_set(ClientError, ER_INVALID_VYLOG_FILE,
- tt_sprintf("Truncation of unregistered index %lld",
- (long long)id));
- return -1;
- }
- if (index->is_dropped) {
- diag_set(ClientError, ER_INVALID_VYLOG_FILE,
- tt_sprintf("Truncation of deleted index %lld",
- (long long)id));
- return -1;
- }
- index->truncate_count = truncate_count;
- return 0;
-}
-
-/**
* Allocate a vinyl run with ID @run_id and insert it to the hash.
* Return the new run on success, NULL on OOM.
*/
@@ -1842,8 +1801,8 @@ vy_recovery_process_record(struct vy_recovery *recovery,
record->dump_lsn);
break;
case VY_LOG_TRUNCATE_INDEX:
- rc = vy_recovery_truncate_index(recovery, record->index_id,
- record->truncate_count);
+ /* Not used anymore, ignore. */
+ rc = 0;
break;
default:
unreachable();
@@ -2056,15 +2015,6 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
if (vy_log_append_record(xlog, &record) != 0)
return -1;
- if (index->truncate_count > 0) {
- vy_log_record_init(&record);
- record.type = VY_LOG_TRUNCATE_INDEX;
- record.index_id = index->id;
- record.truncate_count = index->truncate_count;
- if (vy_log_append_record(xlog, &record) != 0)
- return -1;
- }
-
if (index->dump_lsn >= 0) {
vy_log_record_init(&record);
record.type = VY_LOG_DUMP_INDEX;
diff --git a/src/box/vy_log.h b/src/box/vy_log.h
index 19987c61..8e1a2a1d 100644
--- a/src/box/vy_log.h
+++ b/src/box/vy_log.h
@@ -151,8 +151,17 @@ enum vy_log_record_type {
*/
VY_LOG_SNAPSHOT = 11,
/**
- * Update truncate count of a vinyl index.
- * Requires vy_log_record::index_id, truncate_count.
+ * When we used LSN for identifying indexes in vylog, we
+ * couldn't simply recreate an index on space truncation,
+ * because in case the space had more than one index, we
+ * wouldn't be able to distinguish them after truncation.
+ * So we wrote special 'truncate' record.
+ *
+ * Now, we assign a unique id to each index and so we don't
+ * need a special record type for space truncation. If we
+ * are recovering from an old vylog, we simply ignore all
+ * 'truncate' records - this will result in replay of all
+ * WAL records written after truncation.
*/
VY_LOG_TRUNCATE_INDEX = 12,
@@ -200,8 +209,6 @@ struct vy_log_record {
* that uses this run.
*/
int64_t gc_lsn;
- /** Index truncate count. */
- int64_t truncate_count;
/** Link in vy_log::tx. */
struct stailq_entry in_tx;
};
@@ -250,8 +257,6 @@ struct vy_index_recovery_info {
int64_t commit_lsn;
/** LSN of the last index dump. */
int64_t dump_lsn;
- /** Truncate count. */
- int64_t truncate_count;
/**
* List of all ranges in the index, linked by
* vy_range_recovery_info::in_index.
@@ -635,18 +640,6 @@ vy_log_dump_index(int64_t id, int64_t dump_lsn)
vy_log_write(&record);
}
-/** Helper to log index truncation. */
-static inline void
-vy_log_truncate_index(int64_t id, int64_t truncate_count)
-{
- struct vy_log_record record;
- vy_log_record_init(&record);
- record.type = VY_LOG_TRUNCATE_INDEX;
- record.index_id = id;
- record.truncate_count = truncate_count;
- vy_log_write(&record);
-}
-
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
--
2.11.0
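Note on the compatibility rule described in the new VY_LOG_TRUNCATE_INDEX comment: the record type stays in the enum so that old logs still parse, but replaying it becomes a no-op. A minimal standalone sketch of that idea follows. All demo_* names are hypothetical and only loosely mirror vy_log.c; this is an illustration under those assumptions, not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record types; only the idea mirrors vy_log.h. */
enum demo_record_type {
	DEMO_CREATE_INDEX = 0,
	DEMO_DUMP_INDEX = 1,
	DEMO_TRUNCATE_INDEX = 2, /* obsolete, kept so old logs still parse */
};

struct demo_record {
	enum demo_record_type type;
	int64_t index_id;
	int64_t dump_lsn;
};

/* Hypothetical recovered state of a single index. */
struct demo_index {
	int64_t id;
	int64_t dump_lsn; /* -1 until the first dump record is seen */
	int is_created;
};

/* Apply one record. Returns 0 on success, -1 on a malformed log. */
static int
demo_process_record(struct demo_index *index,
		    const struct demo_record *record)
{
	assert(index != NULL && record != NULL);
	switch (record->type) {
	case DEMO_CREATE_INDEX:
		index->id = record->index_id;
		index->dump_lsn = -1;
		index->is_created = 1;
		return 0;
	case DEMO_DUMP_INDEX:
		if (!index->is_created || index->id != record->index_id)
			return -1;
		index->dump_lsn = record->dump_lsn;
		return 0;
	case DEMO_TRUNCATE_INDEX:
		/* Not used anymore: accept the record and change nothing. */
		return 0;
	}
	/* Unknown record type. */
	return -1;
}

/* Replay an array of records in order, stopping at the first error. */
static int
demo_replay(struct demo_index *index,
	    const struct demo_record *records, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (demo_process_record(index, &records[i]) != 0)
			return -1;
	}
	return 0;
}
```

Replaying a create, truncate, dump sequence with this sketch succeeds and leaves only the create and dump effects in the recovered state, which is the behaviour the comment describes for old vylog files.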
* Re: [PATCH v2 4/5] vinyl: do not use index lsn to identify indexes in vylog
2018-03-20 11:29 ` [PATCH v2 4/5] vinyl: do not use index lsn to identify indexes in vylog Vladimir Davydov
@ 2018-03-22 15:08 ` Vladimir Davydov
0 siblings, 0 replies; 7+ messages in thread
From: Vladimir Davydov @ 2018-03-22 15:08 UTC (permalink / raw)
To: kostja; +Cc: tarantool-patches
On Tue, Mar 20, 2018 at 02:29:04PM +0300, Vladimir Davydov wrote:
> diff --git a/src/box/vy_log.c b/src/box/vy_log.c
> index a6f03a55..9c8dd631 100644
> --- a/src/box/vy_log.c
> +++ b/src/box/vy_log.c
> @@ -2013,7 +2048,7 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
>
> vy_log_record_init(&record);
> record.type = VY_LOG_CREATE_INDEX;
> - record.index_lsn = index->index_lsn;
> + record.index_id = index->id;
> record.index_def_id = index->index_id;
> record.space_def_id = index->space_id;
> record.key_parts = index->key_parts;
I missed setting record.commit_lsn here. Fixed on the branch:
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index 0a5dd26e..ea4b902d 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -2012,6 +2012,7 @@ vy_log_append_index(struct xlog *xlog, struct vy_index_recovery_info *index)
record.space_def_id = index->space_id;
record.key_parts = index->key_parts;
record.key_part_count = index->key_part_count;
+ record.commit_lsn = index->commit_lsn;
if (vy_log_append_record(xlog, &record) != 0)
return -1;
diff --git a/test/vinyl/layout.result b/test/vinyl/layout.result
index f1f52b9f..8878cb5e 100644
--- a/test/vinyl/layout.result
+++ b/test/vinyl/layout.result
@@ -128,7 +128,8 @@ result
- - HEADER:
type: INSERT
BODY:
- tuple: [0, {7: [{'field': 0, 'collation': 1, 'type': 'string'}], 6: 512}]
+ tuple: [0, {7: [{'field': 0, 'collation': 1, 'type': 'string'}], 12: 3,
+ 6: 512}]
- HEADER:
type: INSERT
BODY:
@@ -156,7 +157,8 @@ result
- HEADER:
type: INSERT
BODY:
- tuple: [0, {0: 2, 5: 1, 6: 512, 7: [{'field': 1, 'is_nullable': true, 'type': 'unsigned'}]}]
+ tuple: [0, {0: 2, 5: 1, 6: 512, 7: [{'field': 1, 'is_nullable': true, 'type': 'unsigned'}],
+ 12: 4}]
- HEADER:
type: INSERT
BODY:
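The commit_lsn fix earlier in this reply is an instance of a general pitfall: when a record is rebuilt from recovered state, any field that is not copied explicitly keeps its default value. A minimal sketch of the pattern follows. The demo_* structs and the -1 "not set" default are assumptions made for illustration, not the actual vy_log definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory state of an index, as loaded by recovery. */
struct demo_index_info {
	int64_t id;
	int64_t commit_lsn;
	uint32_t space_id;
	uint32_t index_id;
};

/* Hypothetical log record; commit_lsn == -1 means "not set". */
struct demo_record {
	int64_t index_id;
	int64_t commit_lsn;
	uint32_t space_def_id;
	uint32_t index_def_id;
};

/* Reset a record to its defaults before filling it in. */
static void
demo_record_init(struct demo_record *record)
{
	memset(record, 0, sizeof(*record));
	record->commit_lsn = -1;
}

/* Build a 'create index' record from the recovered index state. */
static void
demo_fill_create_record(struct demo_record *record,
			const struct demo_index_info *index)
{
	assert(record != NULL && index != NULL);
	demo_record_init(record);
	record->index_id = index->id;
	record->index_def_id = index->index_id;
	record->space_def_id = index->space_id;
	/* Easy to miss: without this line the default (-1) is written. */
	record->commit_lsn = index->commit_lsn;
}
```

With the last assignment removed, the sketch would still compile and run, but the rewritten record would carry the default commit_lsn instead of the recovered one.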