[Tarantool-patches] [PATCH v3 11/13] txm: introduce txm_story

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Jul 16 03:20:53 MSK 2020


Thanks for the patch!

See 18 comments below.

On 15.07.2020 15:55, Aleksandr Lyapunov wrote:
> TXM story is a part of a history of a value in space.
> It's a story about a tuple, from the point it was added to space
> to the point when it was deleted from the space.
> All stories are linked into a list of stories of the same key of
> each index.
> 
> Part of #4897
> ---
>  src/box/txn.c | 798 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  src/box/txn.h | 193 ++++++++++++++
>  2 files changed, 991 insertions(+)
> 
> diff --git a/src/box/txn.c b/src/box/txn.c
> index ba81b58..4c46b60 100644
> --- a/src/box/txn.c
> +++ b/src/box/txn.c
> @@ -37,6 +37,28 @@
>  #include "xrow.h"
>  #include "errinj.h"
>  #include "iproto_constants.h"
> +#include "small/mempool.h"
> +
> +static uint32_t
> +txm_story_key_hash(const struct tuple *a)
> +{
> +	uintptr_t u = (uintptr_t)a;
> +	if (sizeof(uintptr_t) <= sizeof(uint32_t))
> +		return u;
> +	else
> +		return u ^ (u >> 32);
> +}
> +
> +#define mh_name _history

1. Why is the hash called 'history', but the objects are 'story'?
Or as 'history' you mean a list of stories of the same key in one
index?

> +#define mh_key_t struct tuple *
> +#define mh_node_t struct txm_story *
> +#define mh_arg_t int
> +#define mh_hash(a, arg) (txm_story_key_hash((*(a))->tuple))
> +#define mh_hash_key(a, arg) (txm_story_key_hash(a))
> +#define mh_cmp(a, b, arg) ((*(a))->tuple != (*(b))->tuple)
> +#define mh_cmp_key(a, b, arg) ((a) != (*(b))->tuple)
> +#define MH_SOURCE
> +#include "salad/mhash.h"
>  
>  struct tx_manager
>  {
> @@ -48,6 +70,14 @@ struct tx_manager
>  	 * so the list is ordered by rv_psn.
>  	 */
>  	struct rlist read_view_txs;
> +	/** Mempools for tx_story objects with difference index count. */

2. difference -> different.

> +	struct mempool txm_story_pool[BOX_INDEX_MAX];
> +	/** Hash table tuple -> txm_story of that tuple. */
> +	struct mh_history_t *history;
> +	/** List of all txm_story objects. */
> +	struct rlist all_stories;
> +	/** Iterator that sequentially traverses all txm_story objects. */
> +	struct rlist *traverse_all_stories;

3. The iterator is initialized and assigned to something, but is
never used for anything meaningful. The same for all_stories list.
At least on top of this commit.

>  };
>  
>  /** That's a definition, see declaration for description. */
> @@ -1294,11 +1327,23 @@ void
>  tx_manager_init()
>  {
>  	rlist_create(&txm.read_view_txs);
> +	for (size_t i = 0; i < BOX_INDEX_MAX; i++) {
> +		size_t item_size = sizeof(struct txm_story) +
> +			i * sizeof(struct txm_story_link);
> +		mempool_create(&txm.txm_story_pool[i],
> +			       cord_slab_cache(), item_size);

4. Is it correct you also create a pool for 0 index count?
Why?

Probably would be much simpler to just create one mempool with
objects having BOX_INDEX_MAX records in them. It would also solve
the DDL problem. At least partially.

Can the story objects be allocated on txn's region? AFAIU,
stories are created by transactions, and can't outlive their
creator.

> +	}
> +	txm.history = mh_history_new();
> +	rlist_create(&txm.all_stories);
> +	txm.traverse_all_stories = &txm.all_stories;
>  }
>  
>  void
>  tx_manager_free()
>  {
> +	mh_history_delete(txm.history);
> +	for (size_t i = 0; i < BOX_INDEX_MAX; i++)
> +		mempool_destroy(&txm.txm_story_pool[i]);
>  }
>  
>  int
> @@ -1345,3 +1390,756 @@ txm_cause_conflict(struct txn *breaker, struct txn *victim)
>  	rlist_add(&victim->conflicted_by_list, &tracker->in_conflicted_by_list);
>  	return 0;
>  }
> +
> +/**
> + * Creates new story and links it with the @tuple.
> + * @return story on success, NULL on error (diag is set).
> + */
> +static struct txm_story *
> +txm_story_new(struct space *space, struct tuple *tuple)
> +{
> +	assert(!tuple->is_dirty);
> +	uint32_t index_count = space->index_count;
> +	assert(index_count < BOX_INDEX_MAX);
> +	struct mempool *pool = &txm.txm_story_pool[index_count];
> +	struct txm_story *story = (struct txm_story *)mempool_alloc(pool);
> +	if (story == NULL) {
> +		size_t item_size = sizeof(struct txm_story) +
> +			index_count * sizeof(struct txm_story_link);
> +		diag_set(OutOfMemory, item_size, "tx_manager", "tx story");

5. Diag OOM takes size, allocator, and variable name. Here the
allocator is "mempool_alloc", and the variable name is "story".
Lets be consistent with other diag_sets. The same for all the
other new places.

> +		return story;

6. I know story is NULL, but why not to return NULL explicitly?
This looks confusing.

> +	}
> +	story->tuple = tuple;

7. Nit: would be better to call tuple_ref() right near this assignment.
For the sake of OOM handler below, worth moving this line down to the
tuple_ref() call below.

> +
> +	const struct txm_story **put_story = (const struct txm_story **)&story;
> +	struct txm_story **empty = NULL;
> +	mh_int_t pos = mh_history_put(txm.history, put_story, &empty, 0);
> +	if (pos == mh_end(txm.history)) {
> +		mempool_free(pool, story);
> +		diag_set(OutOfMemory, pos + 1,
> +			 "tx_manager", "tx history hash table");
> +		return NULL;
> +	}
> +	tuple->is_dirty = true;
> +	tuple_ref(tuple);
> +
> +	story->index_count = index_count;
> +	story->add_stmt = NULL;
> +	story->add_psn = 0;
> +	story->del_stmt = NULL;
> +	story->del_psn = 0;
> +	rlist_create(&story->reader_list);
> +	rlist_add_tail(&txm.all_stories, &story->in_all_stories);
> +	memset(story->link, 0, sizeof(story->link[0]) * index_count);
> +	return story;
> +}
> +
> +static void
> +txm_story_delete(struct txm_story *story);
> +
> +/**
> + * Creates new story of a @tuple that was added by @stmt.

8. Comments for functions should use imperative mood. The same
in other similar places.

> + * @return story on success, NULL on error (diag is set).
> + */
> +static struct txm_story *
> +txm_story_new_add_stmt(struct tuple *tuple, struct txn_stmt *stmt)
> +{
> +	struct txm_story *res = txm_story_new(stmt->space, tuple);
> +	if (res == NULL)
> +		return NULL;
> +	res->add_stmt = stmt;
> +	assert(stmt->add_story == NULL);
> +	stmt->add_story = res;
> +	return res;
> +}
> +
> +/**
> + * Creates new story of a @tuple that was deleted by @stmt.
> + * @return story on success, NULL on error (diag is set).
> + */
> +static struct txm_story *
> +txm_story_new_del_stmt(struct tuple *tuple, struct txn_stmt *stmt)
> +{
> +	struct txm_story *res = txm_story_new(stmt->space, tuple);
> +	if (res == NULL)
> +		return NULL;
> +	res->del_stmt = stmt;
> +	assert(stmt->del_story == NULL);
> +	stmt->del_story = res;
> +	return res;
> +}
> +
> +/**
> + * Undo txm_story_new_add_stmt.
> + */
> +static void
> +txm_story_delete_add_stmt(struct txm_story *story)
> +{
> +	story->add_stmt->add_story = NULL;
> +	story->add_stmt = NULL;
> +	txm_story_delete(story);
> +}
> +
> +/**
> + * Undo txm_story_new_del_stmt.
> + */
> +static void
> +txm_story_delete_del_stmt(struct txm_story *story)
> +{
> +	story->del_stmt->del_story = NULL;
> +	story->del_stmt = NULL;
> +	txm_story_delete(story);
> +}
> +
> +
> +/**
> + * Find a story of a @tuple. The story expected to be present (assert).
> + */
> +static struct txm_story *
> +txm_story_get(struct tuple *tuple)
> +{
> +	assert(tuple->is_dirty);
> +
> +	mh_int_t pos = mh_history_find(txm.history, tuple, 0);
> +	assert(pos != mh_end(txm.history));
> +	return *mh_history_node(txm.history, pos);
> +}
> +
> +/**
> + * Get the older tuple, extracting it from older story if necessary.
> + */
> +static struct tuple *
> +txm_story_older_tuple(struct txm_story_link *link)
> +{
> +	return link->older.is_story ? link->older.story->tuple
> +				    : link->older.tuple;

9. Why so? If older.is_story, it means it stores a story,
not a tuple. But you return vice versa.

10. Why do you need all the new structs in txn.h if they aren't
used out of txn.c anyway. At least some of them.

> +}
> +
> +/**
> + * Link a @story with older story in @index (in both directions).
> + */
> +static void
> +txm_story_link_story(struct txm_story *story, struct txm_story *older_story,
> +		     uint32_t index)

11. What is index? Index id? Index ordinal number?

> +{
> +	assert(older_story != NULL);
> +	struct txm_story_link *link = &story->link[index];
> +	/* Must be unlinked. */
> +	assert(!link->older.is_story);
> +	assert(link->older.tuple == NULL);
> +	link->older.is_story = true;
> +	link->older.story = older_story;
> +	older_story->link[index].newer_story = story;
> +}
> +
> +/**
> + * Link a @story with older tuple in @index. In case if the tuple is dirty -
> + * find and link with the corresponding story.
> + */
> +static void
> +txm_story_link_tuple(struct txm_story *story, struct tuple *older_tuple,
> +                     uint32_t index)
> +{
> +	struct txm_story_link *link = &story->link[index];
> +	/* Must be unlinked. */
> +	assert(!link->older.is_story);
> +	assert(link->older.tuple == NULL);
> +	if (older_tuple == NULL)
> +		return;
> +	if (older_tuple->is_dirty) {
> +		txm_story_link_story(story, txm_story_get(older_tuple), index);
> +		return;
> +	}
> +	link->older.tuple = older_tuple;
> +	tuple_ref(link->older.tuple);
> +}
> +
> +/**
> + * Unlink a @story with older story/tuple in @index.
> + */
> +static void
> +txm_story_unlink(struct txm_story *story, uint32_t index)
> +{
> +	struct txm_story_link *link = &story->link[index];
> +	if (link->older.is_story) {
> +		link->older.story->link[index].newer_story = NULL;
> +	} else if (link->older.tuple != NULL) {
> +		tuple_unref(link->older.tuple);
> +		link->older.tuple = NULL;
> +	}
> +	link->older.is_story = false;
> +	link->older.tuple = NULL;
> +}
> +
> +/**
> + * Check if a @story is visible for transaction @txn. Return visible tuple to
> + * @visible_tuple (can be set to NULL).
> + * @param is_prepared_ok - whether prepared (not committed) change is acceptable.

12. Would be easier to underand if you would reverse the flag and use the
existing concept 'read committed' for it. So the variable meaning would
be reversed and it would be named like 'do_read_committed_only'.

> + * @param own_change - return true if the change was made by @txn itself.

13. own_change -> is_own_change. In the code too. Check all the other
new flags too.

> + * @return true if the story is visible, false otherwise.
> + */
> +static bool
> +txm_story_is_visible(struct txm_story *story, struct txn *txn,
> +		     struct tuple **visible_tuple, bool is_prepared_ok,
> +		     bool *own_change)
> +{
> +	*own_change = false;
> +	*visible_tuple = NULL;
> +	struct txn_stmt *dels = story->del_stmt;
> +	while (dels != NULL) {
> +		if (dels->txn == txn) {
> +			/* Tuple is deleted by us (@txn). */
> +			*own_change = true;
> +			return true;
> +		}
> +		dels = dels->next_in_del_list;
> +	}
> +	if (is_prepared_ok && story->del_psn != 0 &&
> +	    (txn->rv_psn == 0 || story->del_psn < txn->rv_psn)) {
> +		/* Tuple is deleted by prepared TX. */

14. How does it follow from the condition? I don't understand.
You check some flags for the current transaction, which is
clearly not prepared. And there are no our own del stmt. So
it means you somehow made this conclusion from story->del_psn != 0
and story->del_psn < txn->rv_psn. I don't understand how does
it mean the author of the story is prepared but not committed.
And yet below the condition is almost the same, but means committed.

The same for prepared not committed add_stmt.

> +		return true;
> +	}
> +	if (story->del_psn != 0 && story->del_stmt == NULL &&
> +	    (txn->rv_psn == 0 || story->del_psn < txn->rv_psn)) {
> +		/* Tuple is deleted by committed TX. */
> +		return true;
> +	}
> +
> +	if (story->add_stmt != NULL && story->add_stmt->txn == txn) {
> +		/* Tuple is added by us (@txn). */
> +		*visible_tuple = story->tuple;
> +		*own_change = true;
> +		return true;

15. What if this transaction added the tuple, and then another transaction
also added a tuple with the same key. Shouldn't there by a cycle like for
dels?

> +	}
> +	if (is_prepared_ok && story->add_psn != 0 &&
> +	    (txn->rv_psn == 0 || story->add_psn < txn->rv_psn)) {
> +		/* Tuple is added by another prepared TX. */
> +		*visible_tuple = story->tuple;
> +		return true;
> +	}
> +	if (story->add_psn != 0 && story->add_stmt == NULL &&
> +		(txn->rv_psn == 0 || story->add_psn < txn->rv_psn)) {

16. Bad indentation.

> +		/* Tuple is added by committed TX. */
> +		*visible_tuple = story->tuple;
> +		return true;
> +	}
> +	if (story->add_psn == 0 && story->add_stmt == NULL) {
> +		/* added long time ago. */

17. Did't understand the comment nor the condition.

> +		*visible_tuple = story->tuple;
> +		return true;
> +	}
> +	return false;
> +}
> +
> +/**
> + * Temporary (allocated on region) struct that stores a conflicting TX.
> + */
> +struct txn_conflict
> +{
> +	struct txn *other_txn;
> +	struct txn_conflict *next;

18. What are these members?

I am going to stop here. The patch is very interesting, but I need to get
some sleep. Will return to it afterwards.


More information about the Tarantool-patches mailing list