* [Tarantool-patches] [PATCH v7 0/5] box/replication: add missing diag set and fix sigsegv
@ 2020-01-28 19:22 Cyrill Gorcunov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 1/5] box/request: add missing OutOfMemory diag_set Cyrill Gorcunov
` (4 more replies)
0 siblings, 5 replies; 15+ messages in thread
From: Cyrill Gorcunov @ 2020-01-28 19:22 UTC
To: tml
Updated series sits in gorcunov/gh-4730-diag-raise-master-7
Cyrill Gorcunov (5):
box/request: add missing OutOfMemory diag_set
box/applier: add missing diag_set on region_alloc failure
box/applier: fix nil dereference in applier rollback
errinj: add ERRINJ_REPLICA_TXN_WRITE
test: add replication/applier-rollback
src/box/applier.cc | 54 +-
src/box/request.c | 8 +-
src/lib/core/errinj.h | 1 +
test/box/errinj.result | 2614 ++++++++++---------
test/replication/applier-rollback-slave.lua | 16 +
test/replication/applier-rollback.result | 160 ++
test/replication/applier-rollback.test.lua | 79 +
test/replication/suite.ini | 2 +-
8 files changed, 1670 insertions(+), 1264 deletions(-)
create mode 100644 test/replication/applier-rollback-slave.lua
create mode 100644 test/replication/applier-rollback.result
create mode 100644 test/replication/applier-rollback.test.lua
--
2.20.1
* [Tarantool-patches] [PATCH v7 1/5] box/request: add missing OutOfMemory diag_set
2020-01-28 19:22 [Tarantool-patches] [PATCH v7 0/5] box/replication: add missing diag set and fix sigsegv Cyrill Gorcunov
@ 2020-01-28 19:22 ` Cyrill Gorcunov
2020-02-03 14:37 ` Sergey Ostanevich
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure Cyrill Gorcunov
` (3 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Cyrill Gorcunov @ 2020-01-28 19:22 UTC
To: tml
request_create_from_tuple() and request_handle_sequence()
may fail to allocate memory for a tuple. Set the diag
error in that case, otherwise a subsequent diag_raise()
leads to a nil dereference.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/request.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
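The contract this patch restores (a function that returns -1 must leave an error in the diag) can be sketched in plain C. This is a minimal model only: `toy_diag`, `toy_alloc`, `toy_copy_tuple` and the message format are hypothetical stand-ins invented for illustration, not Tarantool's real diag or region API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-fiber diagnostics area. */
struct toy_error {
	char msg[128];
};

struct toy_diag {
	struct toy_error *last; /* NULL means "no error recorded". */
	struct toy_error slot;  /* Storage for the last error. */
};

/* Record an out-of-memory error, loosely modelled on diag_set(OutOfMemory, ...). */
static void
toy_diag_set_oom(struct toy_diag *diag, size_t size, const char *allocator,
		 const char *object)
{
	snprintf(diag->slot.msg, sizeof(diag->slot.msg),
		 "Failed to allocate %zu bytes in %s for %s",
		 size, allocator, object);
	diag->last = &diag->slot;
}

/* Allocator that fails on demand so the error path can be exercised. */
static void *
toy_alloc(size_t size, int fail)
{
	return fail ? NULL : malloc(size);
}

/*
 * Copy tuple data. Returns 0 on success and -1 on failure. On failure
 * the diag is always set, which is what the caller depends on.
 */
static int
toy_copy_tuple(struct toy_diag *diag, const char *data, size_t size,
	       int fail_alloc, char **out)
{
	char *buf = toy_alloc(size, fail_alloc);
	if (buf == NULL) {
		toy_diag_set_oom(diag, size, "toy_alloc", "tuple");
		return -1;
	}
	memcpy(buf, data, size);
	*out = buf;
	return 0;
}

/* Caller-side reporting: it relies on diag->last being non-NULL. */
static const char *
toy_diag_last_message(const struct toy_diag *diag)
{
	assert(diag->last != NULL);
	return diag->last->msg;
}
```

Without the `toy_diag_set_oom()` call, `toy_diag_last_message()` would be handed an empty diag after a failed copy, which is the toy equivalent of the nil dereference described in the commit message.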
diff --git a/src/box/request.c b/src/box/request.c
index 82232a155..994f2da62 100644
--- a/src/box/request.c
+++ b/src/box/request.c
@@ -109,8 +109,10 @@ request_create_from_tuple(struct request *request, struct space *space,
* the tuple data to WAL on commit.
*/
char *buf = region_alloc(&fiber()->gc, size);
- if (buf == NULL)
+ if (buf == NULL) {
+ diag_set(OutOfMemory, size, "region_alloc", "tuple");
return -1;
+ }
memcpy(buf, data, size);
request->tuple = buf;
request->tuple_end = buf + size;
@@ -199,8 +201,10 @@ request_handle_sequence(struct request *request, struct space *space)
size_t buf_size = (request->tuple_end - request->tuple) +
mp_sizeof_uint(UINT64_MAX);
char *tuple = region_alloc(&fiber()->gc, buf_size);
- if (tuple == NULL)
+ if (tuple == NULL) {
+ diag_set(OutOfMemory, buf_size, "region_alloc", "tuple");
return -1;
+ }
char *tuple_end = mp_encode_array(tuple, len);
if (unlikely(key != data)) {
--
2.20.1
* [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure
2020-01-28 19:22 [Tarantool-patches] [PATCH v7 0/5] box/replication: add missing diag set and fix sigsegv Cyrill Gorcunov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 1/5] box/request: add missing OutOfMemory diag_set Cyrill Gorcunov
@ 2020-01-28 19:22 ` Cyrill Gorcunov
2020-02-03 14:39 ` Sergey Ostanevich
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 3/5] box/applier: fix nil dereference in applier rollback Cyrill Gorcunov
` (2 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Cyrill Gorcunov @ 2020-01-28 19:22 UTC
To: tml
If we hit the memory limit while allocating triggers,
we should set the diag error to prevent a nil dereference
in the diag_raise() call (for example, the one issued after
applier_apply_tx() fails).
Note that there are region_alloc_xc() helpers which throw
errors, but as far as I understand the rollback action has
to run first instead of an immediate throw/catch, thus we
use diag_set().
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/applier.cc | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
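The "record the error, then jump to the shared rollback label" pattern from the commit message can be sketched in plain C as follows. Everything here is a hypothetical model: `toy_txn`, `toy_region_alloc` and the `alloc_budget` parameter are invented for illustration and do not correspond to Tarantool's transaction or region API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical transaction state observed by the caller. */
struct toy_txn {
	int rolled_back;   /* Set to 1 once the rollback path has run. */
	const char *error; /* NULL while no error is recorded. */
};

/* Allocator that succeeds only while the budget is positive. */
static void *
toy_region_alloc(size_t size, int *budget)
{
	if (*budget <= 0)
		return NULL;
	(*budget)--;
	return malloc(size);
}

/*
 * Allocate two trigger objects. If either allocation fails, record the
 * error first and only then jump to the common rollback path, so that
 * the rollback runs and the caller still finds a non-NULL error.
 */
static int
toy_apply_tx(struct toy_txn *txn, int alloc_budget)
{
	int rc = 0;
	void *on_rollback = toy_region_alloc(16, &alloc_budget);
	void *on_commit = toy_region_alloc(16, &alloc_budget);
	if (on_rollback == NULL || on_commit == NULL) {
		txn->error = "out of memory: on_rollback/on_commit";
		goto rollback;
	}
	/* The real code would install the triggers and write the txn here. */
	goto out;
rollback:
	txn->rolled_back = 1;
	rc = -1;
out:
	/* free(NULL) is a no-op, so partial allocations are handled too. */
	free(on_rollback);
	free(on_commit);
	return rc;
}
```

A throwing allocator would leave `toy_apply_tx()` before the `rollback:` label is reached. Setting the error and using `goto` keeps the cleanup on a single path, which appears to be the reason the patch prefers diag_set over the `_xc` helpers.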
diff --git a/src/box/applier.cc b/src/box/applier.cc
index ae3d281a5..2ed5125d0 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -796,8 +796,11 @@ applier_apply_tx(struct stailq *rows)
sizeof(struct trigger));
on_commit = (struct trigger *)region_alloc(&txn->region,
sizeof(struct trigger));
- if (on_rollback == NULL || on_commit == NULL)
+ if (on_rollback == NULL || on_commit == NULL) {
+ diag_set(OutOfMemory, sizeof(struct trigger),
+ "region_alloc", "on_rollback/on_commit");
goto rollback;
+ }
trigger_create(on_rollback, applier_txn_rollback_cb, NULL, NULL);
txn_on_rollback(txn, on_rollback);
--
2.20.1
* [Tarantool-patches] [PATCH v7 3/5] box/applier: fix nil dereference in applier rollback
2020-01-28 19:22 [Tarantool-patches] [PATCH v7 0/5] box/replication: add missing diag set and fix sigsegv Cyrill Gorcunov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 1/5] box/request: add missing OutOfMemory diag_set Cyrill Gorcunov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure Cyrill Gorcunov
@ 2020-01-28 19:22 ` Cyrill Gorcunov
2020-02-04 22:19 ` Konstantin Osipov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 4/5] errinj: add ERRINJ_REPLICA_TXN_WRITE Cyrill Gorcunov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 5/5] test: add replication/applier-rollback Cyrill Gorcunov
4 siblings, 1 reply; 15+ messages in thread
From: Cyrill Gorcunov @ 2020-01-28 19:22 UTC
To: tml
Currently, when a transaction rollback happens, we just drop the existing
error and set ClientError in replicaset.applier.diag. This leaves the
current fiber with diag=nil, which in turn leads to a sigsegv once
diag_raise() is called right after applier_apply_tx():
| applier_f
| try {
| applier_subscribe
| applier_apply_tx
| // error happens
| txn_rollback
| diag_set(ClientError, ER_WAL_IO)
| diag_move(&fiber()->diag, &replicaset.applier.diag)
| // fiber->diag = nil
| applier_on_rollback
| diag_add_error(&applier->diag, diag_last_error(&replicaset.applier.diag))
| fiber_cancel(applier->reader);
| diag_raise() -> NULL dereference
| } catch { ... }
applier_f runs in a try/catch loop and handles errors depending on
what exactly happened while applying the transaction. In some cases it
reconnects the applier, in others the applier is simply cancelled and
reaped.
The problem is that the shared replicaset.applier.diag is handled only
on the FiberIsCancelled exception (while it is set inside the transaction
rollback action), and we never trigger this specific exception. But
even if we did, the original error which caused the applier to abort
is overwritten by ClientError, which is too general.
Thus:
- on transaction rollback save the original error which caused
the transaction abort to replicaset.applier.diag;
- there are cases (such as xlog error injection) where the diag
is explicitly cleared on the error path; in that case we set
ClientError instead;
- trigger the FiberIsCancelled exception, which will log the
problem and zap the applier;
- put a FIXME mark into the code: we need to figure out
whether the underlying error is really critical; maybe we
could retry the applier iteration instead.
Part-of #4730
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/applier.cc | 43 +++++++++++++++++++++++++++++++++++++++----
1 file changed, 39 insertions(+), 4 deletions(-)
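The difference between the old and the new rollback callback can be modelled with two toy diag slots, one for the fiber and one shared. This is a sketch under simplifying assumptions: `toy_diag`, `toy_rollback_cb_old`, `toy_rollback_cb_new` and `toy_diag_holds` are hypothetical names, errors are plain string pointers, and reference counting is ignored.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical diag: holds at most one error message. */
struct toy_diag {
	const char *last; /* NULL means the diag is empty. */
};

/* Text of the generic fallback error used by this model. */
#define TOY_ER_WAL_IO "Failed to write to disk"

/*
 * Old behaviour: overwrite whatever the fiber had with the generic
 * error and move it to the shared diag, leaving the fiber diag empty.
 */
static void
toy_rollback_cb_old(struct toy_diag *fiber, struct toy_diag *shared)
{
	fiber->last = TOY_ER_WAL_IO; /* Like diag_set(ClientError, ER_WAL_IO). */
	shared->last = fiber->last;  /* Like diag_move(): shared takes it... */
	fiber->last = NULL;          /* ...and the source becomes empty. */
}

/*
 * New behaviour: keep the original error, fall back to the generic one
 * only when nothing is recorded, and share it without emptying the
 * fiber diag.
 */
static void
toy_rollback_cb_new(struct toy_diag *fiber, struct toy_diag *shared)
{
	if (fiber->last == NULL)
		fiber->last = TOY_ER_WAL_IO;
	shared->last = fiber->last;
}

/* Return 1 if the diag holds exactly the given message, 0 otherwise. */
static int
toy_diag_holds(const struct toy_diag *diag, const char *msg)
{
	return diag->last != NULL && strcmp(diag->last, msg) == 0;
}
```

After the old callback the fiber diag is empty, so anything that raises from it has nothing to dereference. After the new one both diags hold the original message, or the generic fallback when the original was already lost.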
diff --git a/src/box/applier.cc b/src/box/applier.cc
index 2ed5125d0..967dc91de 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -692,9 +692,31 @@ static int
applier_txn_rollback_cb(struct trigger *trigger, void *event)
{
(void) trigger;
- /* Setup shared applier diagnostic area. */
- diag_set(ClientError, ER_WAL_IO);
- diag_move(&fiber()->diag, &replicaset.applier.diag);
+
+ /*
+ * We must not lose the original error; instead
+ * let's keep it in the replicaset diag instance.
+ *
+ * FIXME: We need to revisit this code and
+ * figure out if we can reconnect and retry
+ * the replication process instead of cancelling
+ * the applier with FiberIsCancelled.
+ */
+ struct error *e = diag_last_error(diag_get());
+ if (!e) {
+ /*
+ * If the information is already lost
+ * (say xlog cleared the diag instance),
+ * set up a general ClientError. Seriously,
+ * we need to unweave this mess: if an error
+ * happened it must never be cleared
+ * until error handling in rollback.
+ */
+ diag_set(ClientError, ER_WAL_IO);
+ e = diag_last_error(diag_get());
+ }
+ diag_add_error(&replicaset.applier.diag, e);
+
/* Broadcast the rollback event across all appliers. */
trigger_run(&replicaset.applier.on_rollback, event);
/* Rollback applier vclock to the committed one. */
@@ -849,8 +871,20 @@ applier_on_rollback(struct trigger *trigger, void *event)
diag_add_error(&applier->diag,
diag_last_error(&replicaset.applier.diag));
}
- /* Stop the applier fiber. */
+
+ /*
+ * Something really bad happened, we can't proceed,
+ * thus stop the applier and throw the FiberIsCancelled
+ * exception, which will be caught by the caller
+ * so that the fiber finishes gracefully.
+ *
+ * FIXME: Need to make sure that this is really
+ * a final error where we can no longer proceed and should
+ * zap the applier; probably we could reconnect and
+ * retry instead?
+ */
fiber_cancel(applier->reader);
+ diag_set(FiberIsCancelled);
return 0;
}
@@ -1098,6 +1132,7 @@ applier_f(va_list ap)
} catch (FiberIsCancelled *e) {
if (!diag_is_empty(&applier->diag)) {
diag_move(&applier->diag, &fiber()->diag);
+ diag_log();
applier_disconnect(applier, APPLIER_STOPPED);
break;
}
--
2.20.1
* [Tarantool-patches] [PATCH v7 4/5] errinj: add ERRINJ_REPLICA_TXN_WRITE
2020-01-28 19:22 [Tarantool-patches] [PATCH v7 0/5] box/replication: add missing diag set and fix sigsegv Cyrill Gorcunov
` (2 preceding siblings ...)
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 3/5] box/applier: fix nil dereference in applier rollback Cyrill Gorcunov
@ 2020-01-28 19:22 ` Cyrill Gorcunov
2020-02-04 22:45 ` Konstantin Osipov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 5/5] test: add replication/applier-rollback Cyrill Gorcunov
4 siblings, 1 reply; 15+ messages in thread
From: Cyrill Gorcunov @ 2020-01-28 19:22 UTC
To: tml
Add an error injection to test the nil dereference on applier rollback.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/applier.cc | 6 +
src/lib/core/errinj.h | 1 +
test/box/errinj.result | 2614 +++++++++++++++++++++-------------------
3 files changed, 1365 insertions(+), 1256 deletions(-)
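How a boolean error injection forces the rollback path can be sketched with a toy macro. This is an illustrative model only: `TOY_ERROR_INJECT`, `toy_errinjs` and `toy_apply_tx` are hypothetical names, and the real ERROR_INJECT machinery may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical registry of boolean error injections, all off by default. */
enum toy_errinj_id {
	TOY_ERRINJ_REPLICA_TXN_WRITE,
	toy_errinj_id_MAX
};

static bool toy_errinjs[toy_errinj_id_MAX];

/* Run CODE only when the given injection is switched on. */
#define TOY_ERROR_INJECT(ID, CODE)		\
	do {					\
		if (toy_errinjs[ID])		\
			CODE;			\
	} while (0)

/*
 * Pretend to apply a transaction. With the injection enabled, an error
 * is recorded and control jumps to the rollback label before the write
 * would happen.
 */
static int
toy_apply_tx(const char **error)
{
	*error = NULL;
	TOY_ERROR_INJECT(TOY_ERRINJ_REPLICA_TXN_WRITE, {
		*error = "replica txn write injection";
		goto rollback;
	});
	/* The real code would write the transaction here. */
	return 0;
rollback:
	return -1;
}
```

A test would flip `toy_errinjs[TOY_ERRINJ_REPLICA_TXN_WRITE]` to `true`, call `toy_apply_tx()`, and check that it returns -1 with the error set. That mirrors the intent of the injection added here: reach the rollback path deterministically.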
diff --git a/src/box/applier.cc b/src/box/applier.cc
index 967dc91de..e739f23e2 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -51,6 +51,7 @@
#include "txn.h"
#include "box.h"
#include "scoped_guard.h"
+#include "errinj.h"
STRS(applier_state, applier_STATE);
@@ -830,6 +831,11 @@ applier_apply_tx(struct stailq *rows)
trigger_create(on_commit, applier_txn_commit_cb, NULL, NULL);
txn_on_commit(txn, on_commit);
+ ERROR_INJECT(ERRINJ_REPLICA_TXN_WRITE, {
+ diag_set(ClientError, ER_INJECTION, "replica txn write injection");
+ goto rollback;
+ });
+
if (txn_write(txn) < 0)
goto fail;
diff --git a/src/lib/core/errinj.h b/src/lib/core/errinj.h
index 672da2119..da6adfe97 100644
--- a/src/lib/core/errinj.h
+++ b/src/lib/core/errinj.h
@@ -135,6 +135,7 @@ struct errinj {
_(ERRINJ_COIO_SENDFILE_CHUNK, ERRINJ_INT, {.iparam = -1}) \
_(ERRINJ_SWIM_FD_ONLY, ERRINJ_BOOL, {.bparam = false}) \
_(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT, {.iparam = 0}) \
+ _(ERRINJ_REPLICA_TXN_WRITE, ERRINJ_BOOL, {.bparam = false}) \
ENUM0(errinj_id, ERRINJ_LIST);
extern struct errinj errinjs[];
diff --git a/test/box/errinj.result b/test/box/errinj.result
index babe36b1b..68dd319eb 100644
--- a/test/box/errinj.result
+++ b/test/box/errinj.result
@@ -1,1783 +1,1885 @@
+-- test-run result file version 2
-- Test that recovery had been completed without errors
test_run = require('test_run').new()
----
-...
+ | ---
+ | ...
test_run:cmd("restart server default")
+ |
box.error.last() == nil
----
-- true
-...
+ | ---
+ | - true
+ | ...
+
errinj = box.error.injection
----
-...
+ | ---
+ | ...
net_box = require('net.box')
----
-...
+ | ---
+ | ...
+
space = box.schema.space.create('tweedledum')
----
-...
+ | ---
+ | ...
index = space:create_index('primary', { type = 'hash' })
----
-...
+ | ---
+ | ...
+
errinj.info()
----
-- ERRINJ_VY_RUN_WRITE_STMT_TIMEOUT:
- state: 0
- ERRINJ_WAL_WRITE:
- state: false
- ERRINJ_RELAY_BREAK_LSN:
- state: -1
- ERRINJ_HTTPC_EXECUTE:
- state: false
- ERRINJ_VYRUN_DATA_READ:
- state: false
- ERRINJ_SWIM_FD_ONLY:
- state: false
- ERRINJ_SQL_NAME_NORMALIZATION:
- state: false
- ERRINJ_VY_SCHED_TIMEOUT:
- state: 0
- ERRINJ_COIO_SENDFILE_CHUNK:
- state: -1
- ERRINJ_HTTP_RESPONSE_ADD_WAIT:
- state: false
- ERRINJ_WAL_WRITE_PARTIAL:
- state: -1
- ERRINJ_VY_GC:
- state: false
- ERRINJ_WAL_DELAY:
- state: false
- ERRINJ_INDEX_ALLOC:
- state: false
- ERRINJ_WAL_WRITE_EOF:
- state: false
- ERRINJ_WAL_SYNC:
- state: false
- ERRINJ_BUILD_INDEX:
- state: -1
- ERRINJ_BUILD_INDEX_DELAY:
- state: false
- ERRINJ_VY_RUN_FILE_RENAME:
- state: false
- ERRINJ_VY_COMPACTION_DELAY:
- state: false
- ERRINJ_VY_DUMP_DELAY:
- state: false
- ERRINJ_VY_DELAY_PK_LOOKUP:
- state: false
- ERRINJ_VY_TASK_COMPLETE:
- state: false
- ERRINJ_PORT_DUMP:
- state: false
- ERRINJ_WAL_BREAK_LSN:
- state: -1
- ERRINJ_WAL_IO:
- state: false
- ERRINJ_WAL_FALLOCATE:
- state: 0
- ERRINJ_DYN_MODULE_COUNT:
- state: 0
- ERRINJ_VY_INDEX_FILE_RENAME:
- state: false
- ERRINJ_TUPLE_FORMAT_COUNT:
- state: -1
- ERRINJ_TUPLE_ALLOC:
- state: false
- ERRINJ_VY_RUN_WRITE_DELAY:
- state: false
- ERRINJ_VY_READ_PAGE:
- state: false
- ERRINJ_RELAY_REPORT_INTERVAL:
- state: 0
- ERRINJ_VY_LOG_FILE_RENAME:
- state: false
- ERRINJ_VY_READ_PAGE_TIMEOUT:
- state: 0
- ERRINJ_XLOG_META:
- state: false
- ERRINJ_SIO_READ_MAX:
- state: -1
- ERRINJ_SNAP_COMMIT_DELAY:
- state: false
- ERRINJ_WAL_WRITE_DISK:
- state: false
- ERRINJ_SNAP_WRITE_DELAY:
- state: false
- ERRINJ_LOG_ROTATE:
- state: false
- ERRINJ_VY_RUN_WRITE:
- state: false
- ERRINJ_CHECK_FORMAT_DELAY:
- state: false
- ERRINJ_VY_LOG_FLUSH_DELAY:
- state: false
- ERRINJ_RELAY_FINAL_JOIN:
- state: false
- ERRINJ_REPLICA_JOIN_DELAY:
- state: false
- ERRINJ_RELAY_FINAL_SLEEP:
- state: false
- ERRINJ_VY_RUN_DISCARD:
- state: false
- ERRINJ_WAL_ROTATE:
- state: false
- ERRINJ_RELAY_EXIT_DELAY:
- state: 0
- ERRINJ_VY_POINT_ITER_WAIT:
- state: false
- ERRINJ_MEMTX_DELAY_GC:
- state: false
- ERRINJ_IPROTO_TX_DELAY:
- state: false
- ERRINJ_XLOG_READ:
- state: -1
- ERRINJ_TUPLE_FIELD:
- state: false
- ERRINJ_XLOG_GARBAGE:
- state: false
- ERRINJ_VY_INDEX_DUMP:
- state: -1
- ERRINJ_VY_READ_PAGE_DELAY:
- state: false
- ERRINJ_TESTING:
- state: false
- ERRINJ_RELAY_SEND_DELAY:
- state: false
- ERRINJ_VY_SQUASH_TIMEOUT:
- state: 0
- ERRINJ_VY_LOG_FLUSH:
- state: false
- ERRINJ_RELAY_TIMEOUT:
- state: 0
-...
+ | ---
+ | - ERRINJ_VY_RUN_WRITE_STMT_TIMEOUT:
+ | state: 0
+ | ERRINJ_WAL_BREAK_LSN:
+ | state: -1
+ | ERRINJ_VYRUN_DATA_READ:
+ | state: false
+ | ERRINJ_VY_SCHED_TIMEOUT:
+ | state: 0
+ | ERRINJ_HTTP_RESPONSE_ADD_WAIT:
+ | state: false
+ | ERRINJ_WAL_WRITE_EOF:
+ | state: false
+ | ERRINJ_BUILD_INDEX_DELAY:
+ | state: false
+ | ERRINJ_VY_DELAY_PK_LOOKUP:
+ | state: false
+ | ERRINJ_VY_POINT_ITER_WAIT:
+ | state: false
+ | ERRINJ_WAL_IO:
+ | state: false
+ | ERRINJ_VY_INDEX_FILE_RENAME:
+ | state: false
+ | ERRINJ_TUPLE_FORMAT_COUNT:
+ | state: -1
+ | ERRINJ_TUPLE_ALLOC:
+ | state: false
+ | ERRINJ_VY_RUN_FILE_RENAME:
+ | state: false
+ | ERRINJ_VY_READ_PAGE:
+ | state: false
+ | ERRINJ_RELAY_REPORT_INTERVAL:
+ | state: 0
+ | ERRINJ_RELAY_BREAK_LSN:
+ | state: -1
+ | ERRINJ_XLOG_META:
+ | state: false
+ | ERRINJ_SNAP_COMMIT_DELAY:
+ | state: false
+ | ERRINJ_VY_RUN_WRITE:
+ | state: false
+ | ERRINJ_BUILD_INDEX:
+ | state: -1
+ | ERRINJ_RELAY_FINAL_JOIN:
+ | state: false
+ | ERRINJ_REPLICA_JOIN_DELAY:
+ | state: false
+ | ERRINJ_LOG_ROTATE:
+ | state: false
+ | ERRINJ_MEMTX_DELAY_GC:
+ | state: false
+ | ERRINJ_XLOG_GARBAGE:
+ | state: false
+ | ERRINJ_VY_READ_PAGE_DELAY:
+ | state: false
+ | ERRINJ_SWIM_FD_ONLY:
+ | state: false
+ | ERRINJ_WAL_WRITE:
+ | state: false
+ | ERRINJ_HTTPC_EXECUTE:
+ | state: false
+ | ERRINJ_SQL_NAME_NORMALIZATION:
+ | state: false
+ | ERRINJ_WAL_WRITE_PARTIAL:
+ | state: -1
+ | ERRINJ_VY_GC:
+ | state: false
+ | ERRINJ_WAL_DELAY:
+ | state: false
+ | ERRINJ_XLOG_READ:
+ | state: -1
+ | ERRINJ_WAL_SYNC:
+ | state: false
+ | ERRINJ_VY_TASK_COMPLETE:
+ | state: false
+ | ERRINJ_PORT_DUMP:
+ | state: false
+ | ERRINJ_COIO_SENDFILE_CHUNK:
+ | state: -1
+ | ERRINJ_DYN_MODULE_COUNT:
+ | state: 0
+ | ERRINJ_SIO_READ_MAX:
+ | state: -1
+ | ERRINJ_REPLICA_TXN_WRITE:
+ | state: false
+ | ERRINJ_RELAY_TIMEOUT:
+ | state: 0
+ | ERRINJ_VY_DUMP_DELAY:
+ | state: false
+ | ERRINJ_VY_SQUASH_TIMEOUT:
+ | state: 0
+ | ERRINJ_VY_LOG_FLUSH_DELAY:
+ | state: false
+ | ERRINJ_RELAY_SEND_DELAY:
+ | state: false
+ | ERRINJ_VY_COMPACTION_DELAY:
+ | state: false
+ | ERRINJ_VY_LOG_FILE_RENAME:
+ | state: false
+ | ERRINJ_VY_RUN_DISCARD:
+ | state: false
+ | ERRINJ_WAL_ROTATE:
+ | state: false
+ | ERRINJ_VY_READ_PAGE_TIMEOUT:
+ | state: 0
+ | ERRINJ_VY_INDEX_DUMP:
+ | state: -1
+ | ERRINJ_TUPLE_FIELD:
+ | state: false
+ | ERRINJ_SNAP_WRITE_DELAY:
+ | state: false
+ | ERRINJ_IPROTO_TX_DELAY:
+ | state: false
+ | ERRINJ_RELAY_EXIT_DELAY:
+ | state: 0
+ | ERRINJ_RELAY_FINAL_SLEEP:
+ | state: false
+ | ERRINJ_WAL_WRITE_DISK:
+ | state: false
+ | ERRINJ_CHECK_FORMAT_DELAY:
+ | state: false
+ | ERRINJ_TESTING:
+ | state: false
+ | ERRINJ_VY_RUN_WRITE_DELAY:
+ | state: false
+ | ERRINJ_WAL_FALLOCATE:
+ | state: 0
+ | ERRINJ_VY_LOG_FLUSH:
+ | state: false
+ | ERRINJ_INDEX_ALLOC:
+ | state: false
+ | ...
errinj.set("some-injection", true)
----
-- 'error: can''t find error injection ''some-injection'''
-...
+ | ---
+ | - 'error: can''t find error injection ''some-injection'''
+ | ...
errinj.set("some-injection") -- check error
----
-- 'error: can''t find error injection ''some-injection'''
-...
+ | ---
+ | - 'error: can''t find error injection ''some-injection'''
+ | ...
space:select{222444}
----
-- []
-...
+ | ---
+ | - []
+ | ...
errinj.set("ERRINJ_TESTING", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:select{222444}
----
-- error: Error injection 'ERRINJ_TESTING'
-...
+ | ---
+ | - error: Error injection 'ERRINJ_TESTING'
+ | ...
errinj.set("ERRINJ_TESTING", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
-- Check how well we handle a failed log write
errinj.set("ERRINJ_WAL_IO", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:insert{1}
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
space:get{1}
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_WAL_IO", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:insert{1}
----
-- [1]
-...
+ | ---
+ | - [1]
+ | ...
errinj.set("ERRINJ_WAL_IO", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:update(1, {{'=', 2, 2}})
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
space:get{1}
----
-- [1]
-...
+ | ---
+ | - [1]
+ | ...
space:get{2}
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_WAL_IO", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:update(1, {{'=', 2, 2}})
----
-- [1, 2]
-...
+ | ---
+ | - [1, 2]
+ | ...
space:truncate()
----
-...
+ | ---
+ | ...
+
-- Check that WAL vclock isn't promoted on failed write.
lsn1 = box.info.vclock[box.info.id]
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_WAL_WRITE_PARTIAL", 0)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:insert{1}
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
errinj.set("ERRINJ_WAL_WRITE_PARTIAL", -1)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:insert{1}
----
-- [1]
-...
+ | ---
+ | - [1]
+ | ...
-- Check vclock was promoted only one time
box.info.vclock[box.info.id] == lsn1 + 1
----
-- true
-...
+ | ---
+ | - true
+ | ...
errinj.set("ERRINJ_WAL_WRITE_PARTIAL", 0)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:update(1, {{'=', 2, 2}})
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
space:get{1}
----
-- [1]
-...
+ | ---
+ | - [1]
+ | ...
errinj.set("ERRINJ_WAL_WRITE_PARTIAL", -1)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:update(1, {{'=', 2, 2}})
----
-- [1, 2]
-...
+ | ---
+ | - [1, 2]
+ | ...
-- Check vclock was promoted only two times
box.info.vclock[box.info.id] == lsn1 + 2
----
-- true
-...
+ | ---
+ | - true
+ | ...
space:truncate()
----
-...
+ | ---
+ | ...
+
-- Check a failed log rotation
errinj.set("ERRINJ_WAL_ROTATE", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:insert{1}
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
space:get{1}
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_WAL_ROTATE", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:insert{1}
----
-- [1]
-...
+ | ---
+ | - [1]
+ | ...
errinj.set("ERRINJ_WAL_ROTATE", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:update(1, {{'=', 2, 2}})
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
space:get{1}
----
-- [1]
-...
+ | ---
+ | - [1]
+ | ...
space:get{2}
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_WAL_ROTATE", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:update(1, {{'=', 2, 2}})
----
-- [1, 2]
-...
+ | ---
+ | - [1, 2]
+ | ...
space:get{1}
----
-- [1, 2]
-...
+ | ---
+ | - [1, 2]
+ | ...
space:get{2}
----
-...
+ | ---
+ | ...
space:truncate()
----
-...
+ | ---
+ | ...
+
space:drop()
----
-...
+ | ---
+ | ...
+
-- Check how well we handle a failed log write in DDL
s_disabled = box.schema.space.create('disabled')
----
-...
+ | ---
+ | ...
s_withindex = box.schema.space.create('withindex')
----
-...
+ | ---
+ | ...
index1 = s_withindex:create_index('primary', { type = 'hash' })
----
-...
+ | ---
+ | ...
s_withdata = box.schema.space.create('withdata')
----
-...
+ | ---
+ | ...
index2 = s_withdata:create_index('primary', { type = 'tree' })
----
-...
+ | ---
+ | ...
s_withdata:insert{1, 2, 3, 4, 5}
----
-- [1, 2, 3, 4, 5]
-...
+ | ---
+ | - [1, 2, 3, 4, 5]
+ | ...
s_withdata:insert{4, 5, 6, 7, 8}
----
-- [4, 5, 6, 7, 8]
-...
+ | ---
+ | - [4, 5, 6, 7, 8]
+ | ...
index3 = s_withdata:create_index('secondary', { type = 'hash', parts = {2, 'unsigned', 3, 'unsigned' }})
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_WAL_IO", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
test = box.schema.space.create('test')
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
s_disabled:create_index('primary', { type = 'hash' })
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
s_disabled.enabled
----
-- false
-...
+ | ---
+ | - false
+ | ...
s_disabled:insert{0}
----
-- error: 'No index #0 is defined in space ''disabled'''
-...
+ | ---
+ | - error: 'No index #0 is defined in space ''disabled'''
+ | ...
s_withindex:create_index('secondary', { type = 'tree', parts = { 2, 'unsigned'} })
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
s_withindex.index.secondary
----
-- null
-...
+ | ---
+ | - null
+ | ...
s_withdata.index.secondary:drop()
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
s_withdata.index.secondary.unique
----
-- true
-...
+ | ---
+ | - true
+ | ...
s_withdata:drop()
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
box.space['withdata'].enabled
----
-- true
-...
+ | ---
+ | - true
+ | ...
index4 = s_withdata:create_index('another', { type = 'tree', parts = { 5, 'unsigned' }, unique = false})
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
s_withdata.index.another
----
-- null
-...
+ | ---
+ | - null
+ | ...
errinj.set("ERRINJ_WAL_IO", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
test = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
index5 = s_disabled:create_index('primary', { type = 'hash' })
----
-...
+ | ---
+ | ...
s_disabled.enabled
----
-- true
-...
+ | ---
+ | - true
+ | ...
s_disabled:insert{0}
----
-- [0]
-...
+ | ---
+ | - [0]
+ | ...
index6 = s_withindex:create_index('secondary', { type = 'tree', parts = { 2, 'unsigned'} })
----
-...
+ | ---
+ | ...
s_withindex.index.secondary.unique
----
-- true
-...
+ | ---
+ | - true
+ | ...
s_withdata.index.secondary:drop()
----
-...
+ | ---
+ | ...
s_withdata.index.secondary
----
-- null
-...
+ | ---
+ | - null
+ | ...
s_withdata:drop()
----
-...
+ | ---
+ | ...
box.space['withdata']
----
-- null
-...
+ | ---
+ | - null
+ | ...
index7 = s_withdata:create_index('another', { type = 'tree', parts = { 5, 'unsigned' }, unique = false})
----
-- error: Space 'withdata' does not exist
-...
+ | ---
+ | - error: Space 'withdata' does not exist
+ | ...
s_withdata.index.another
----
-- null
-...
+ | ---
+ | - null
+ | ...
test:drop()
----
-...
+ | ---
+ | ...
s_disabled:drop()
----
-...
+ | ---
+ | ...
s_withindex:drop()
----
-...
+ | ---
+ | ...
+
-- Check transaction rollback when out of memory
env = require('test_run')
----
-...
+ | ---
+ | ...
test_run = env.new()
----
-...
+ | ---
+ | ...
+
s = box.schema.space.create('s')
----
-...
+ | ---
+ | ...
_ = s:create_index('pk')
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_TUPLE_ALLOC", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
s:auto_increment{}
----
-- error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
-...
+ | ---
+ | - error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
+ | ...
s:select{}
----
-- []
-...
+ | ---
+ | - []
+ | ...
s:auto_increment{}
----
-- error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
-...
+ | ---
+ | - error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
+ | ...
s:select{}
----
-- []
-...
+ | ---
+ | - []
+ | ...
s:auto_increment{}
----
-- error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
-...
+ | ---
+ | - error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
+ | ...
s:select{}
----
-- []
-...
+ | ---
+ | - []
+ | ...
test_run:cmd("setopt delimiter ';'")
----
-- true
-...
+ | ---
+ | - true
+ | ...
box.begin()
s:insert{1}
box.commit();
----
-- error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
-...
+ | ---
+ | - error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
+ | ...
box.rollback();
----
-...
+ | ---
+ | ...
s:select{};
----
-- []
-...
+ | ---
+ | - []
+ | ...
box.begin()
s:insert{1}
s:insert{2}
box.commit();
----
-- error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
-...
+ | ---
+ | - error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
+ | ...
s:select{};
----
-- []
-...
+ | ---
+ | - []
+ | ...
box.rollback();
----
-...
+ | ---
+ | ...
box.begin()
pcall(s.insert, s, {1})
s:insert{2}
box.commit();
----
-- error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
-...
+ | ---
+ | - error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
+ | ...
s:select{};
----
-- []
-...
+ | ---
+ | - []
+ | ...
box.rollback();
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_TUPLE_ALLOC", false);
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.begin()
s:insert{1}
errinj.set("ERRINJ_TUPLE_ALLOC", true)
s:insert{2}
box.commit();
----
-- error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
-...
+ | ---
+ | - error: Failed to allocate 16 bytes in slab allocator for memtx_tuple
+ | ...
errinj.set("ERRINJ_TUPLE_ALLOC", false);
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.rollback();
----
-...
+ | ---
+ | ...
s:select{};
----
-- []
-...
+ | ---
+ | - []
+ | ...
box.begin()
s:insert{1}
errinj.set("ERRINJ_TUPLE_ALLOC", true)
pcall(s.insert, s, {2})
box.commit();
----
-...
+ | ---
+ | ...
s:select{};
----
-- - [1]
-...
+ | ---
+ | - - [1]
+ | ...
box.rollback();
----
-...
+ | ---
+ | ...
+
test_run:cmd("setopt delimiter ''");
----
-- true
-...
+ | ---
+ | - true
+ | ...
errinj.set("ERRINJ_TUPLE_ALLOC", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
s:drop()
----
-...
+ | ---
+ | ...
s = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
_ = s:create_index('test', {parts = {1, 'unsigned', 3, 'unsigned', 5, 'unsigned'}})
----
-...
+ | ---
+ | ...
s:insert{1, 2, 3, 4, 5, 6}
----
-- [1, 2, 3, 4, 5, 6]
-...
+ | ---
+ | - [1, 2, 3, 4, 5, 6]
+ | ...
t = s:select{}[1]
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_TUPLE_FIELD", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
tostring(t[1]) .. tostring(t[2]) ..tostring(t[3]) .. tostring(t[4]) .. tostring(t[5]) .. tostring(t[6])
----
-- 1nil3nil5nil
-...
+ | ---
+ | - 1nil3nil5nil
+ | ...
errinj.set("ERRINJ_TUPLE_FIELD", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
tostring(t[1]) .. tostring(t[2]) ..tostring(t[3]) .. tostring(t[4]) .. tostring(t[5]) .. tostring(t[6])
----
-- '123456'
-...
+ | ---
+ | - '123456'
+ | ...
+
s:drop()
----
-...
+ | ---
+ | ...
s = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
_ = s:create_index('test', {parts = {2, 'unsigned', 4, 'unsigned', 6, 'unsigned'}})
----
-...
+ | ---
+ | ...
s:insert{1, 2, 3, 4, 5, 6}
----
-- [1, 2, 3, 4, 5, 6]
-...
+ | ---
+ | - [1, 2, 3, 4, 5, 6]
+ | ...
t = s:select{}[1]
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_TUPLE_FIELD", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
tostring(t[1]) .. tostring(t[2]) ..tostring(t[3]) .. tostring(t[4]) .. tostring(t[5]) .. tostring(t[6])
----
-- 12nil4nil6
-...
+ | ---
+ | - 12nil4nil6
+ | ...
errinj.set("ERRINJ_TUPLE_FIELD", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
tostring(t[1]) .. tostring(t[2]) ..tostring(t[3]) .. tostring(t[4]) .. tostring(t[5]) .. tostring(t[6])
----
-- '123456'
-...
+ | ---
+ | - '123456'
+ | ...
+
-- Cleanup
s:drop()
----
-...
+ | ---
+ | ...
+
--
-- gh-2046: don't store offsets for sequential multi-parts keys
--
s = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
_ = s:create_index('seq2', { parts = { 1, 'unsigned', 2, 'unsigned' }})
----
-...
+ | ---
+ | ...
_ = s:create_index('seq3', { parts = { 1, 'unsigned', 2, 'unsigned', 3, 'unsigned' }})
----
-...
+ | ---
+ | ...
_ = s:create_index('seq5', { parts = { 1, 'unsigned', 2, 'unsigned', 3, 'unsigned', 4, 'scalar', 5, 'number' }})
----
-...
+ | ---
+ | ...
_ = s:create_index('rnd1', { parts = { 3, 'unsigned' }})
----
-...
+ | ---
+ | ...
+
errinj.set("ERRINJ_TUPLE_FIELD", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
tuple = s:insert({1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
----
-...
+ | ---
+ | ...
tuple
----
-- [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
-...
+ | ---
+ | - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ | ...
tuple[1] -- not-null, always accessible
----
-- 1
-...
+ | ---
+ | - 1
+ | ...
tuple[2] -- null, doesn't have offset
----
-- null
-...
+ | ---
+ | - null
+ | ...
tuple[3] -- not null, has offset
----
-- 3
-...
+ | ---
+ | - 3
+ | ...
tuple[4] -- null, doesn't have offset
----
-- null
-...
+ | ---
+ | - null
+ | ...
tuple[5] -- null, doesn't have offset
----
-- null
-...
+ | ---
+ | - null
+ | ...
s.index.seq2:select({1})
----
-- - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
-...
+ | ---
+ | - - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ | ...
s.index.seq2:select({1, 2})
----
-- - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
-...
+ | ---
+ | - - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ | ...
s.index.seq3:select({1})
----
-- - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
-...
+ | ---
+ | - - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ | ...
s.index.seq3:select({1, 2, 3})
----
-- - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
-...
+ | ---
+ | - - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ | ...
s.index.seq5:select({1})
----
-- - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
-...
+ | ---
+ | - - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ | ...
s.index.seq5:select({1, 2, 3, 4, 5})
----
-- - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
-...
+ | ---
+ | - - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ | ...
s.index.rnd1:select({3})
----
-- - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
-...
+ | ---
+ | - - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ | ...
errinj.set("ERRINJ_TUPLE_FIELD", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
s:drop()
----
-...
+ | ---
+ | ...
+
space = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
_ = space:create_index('pk')
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_WAL_WRITE", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:insert{1}
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
errinj.set("ERRINJ_WAL_WRITE", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
errinj.set("ERRINJ_WAL_WRITE_DISK", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
_ = space:insert{1, require'digest'.urandom(192 * 1024)}
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
errinj.set("ERRINJ_WAL_WRITE_DISK", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
_ = space:insert{1}
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_WAL_WRITE", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.snapshot()
----
-- error: Error injection 'xlog write injection'
-...
+ | ---
+ | - error: Error injection 'xlog write injection'
+ | ...
errinj.set("ERRINJ_WAL_WRITE", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
space:drop()
----
-...
+ | ---
+ | ...
+
--test space:bsize() in case of memory error
utils = dofile('utils.lua')
----
-...
+ | ---
+ | ...
s = box.schema.space.create('space_bsize')
----
-...
+ | ---
+ | ...
idx = s:create_index('primary')
----
-...
+ | ---
+ | ...
+
for i = 1, 13 do s:insert{ i, string.rep('x', i) } end
----
-...
+ | ---
+ | ...
+
s:bsize()
----
-- 130
-...
+ | ---
+ | - 130
+ | ...
utils.space_bsize(s)
----
-- 130
-...
+ | ---
+ | - 130
+ | ...
+
errinj.set("ERRINJ_TUPLE_ALLOC", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
s:replace{1, "test"}
----
-- error: Failed to allocate 21 bytes in slab allocator for memtx_tuple
-...
+ | ---
+ | - error: Failed to allocate 21 bytes in slab allocator for memtx_tuple
+ | ...
s:bsize()
----
-- 130
-...
+ | ---
+ | - 130
+ | ...
utils.space_bsize(s)
----
-- 130
-...
+ | ---
+ | - 130
+ | ...
+
s:update({1}, {{'=', 3, '!'}})
----
-- error: Failed to allocate 20 bytes in slab allocator for memtx_tuple
-...
+ | ---
+ | - error: Failed to allocate 20 bytes in slab allocator for memtx_tuple
+ | ...
s:bsize()
----
-- 130
-...
+ | ---
+ | - 130
+ | ...
utils.space_bsize(s)
----
-- 130
-...
+ | ---
+ | - 130
+ | ...
+
errinj.set("ERRINJ_TUPLE_ALLOC", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
s:drop()
----
-...
+ | ---
+ | ...
+
space = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
index1 = space:create_index('primary')
----
-...
+ | ---
+ | ...
fiber = require'fiber'
----
-...
+ | ---
+ | ...
ch = fiber.channel(1)
----
-...
+ | ---
+ | ...
+
test_run:cmd('setopt delimiter ";"')
----
-- true
-...
+ | ---
+ | - true
+ | ...
function test()
errinj.set('ERRINJ_WAL_WRITE_DISK', true)
pcall(box.space.test.replace, box.space.test, {1, 1})
errinj.set('ERRINJ_WAL_WRITE_DISK', false)
ch:put(true)
end ;
----
-...
+ | ---
+ | ...
+
function run()
fiber.create(test)
box.snapshot()
end ;
----
-...
+ | ---
+ | ...
+
test_run:cmd('setopt delimiter ""');
----
-- true
-...
+ | ---
+ | - true
+ | ...
+
-- Port_dump can fail.
+
box.schema.user.grant('guest', 'read', 'space', '_space')
----
-...
+ | ---
+ | ...
+
cn = net_box.connect(box.cfg.listen)
----
-...
+ | ---
+ | ...
cn:ping()
----
-- true
-...
+ | ---
+ | - true
+ | ...
errinj.set('ERRINJ_PORT_DUMP', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
ok, ret = pcall(cn.space._space.select, cn.space._space)
----
-...
+ | ---
+ | ...
assert(not ok)
----
-- true
-...
+ | ---
+ | - true
+ | ...
assert(string.match(tostring(ret), 'Failed to allocate'))
----
-- Failed to allocate
-...
+ | ---
+ | - Failed to allocate
+ | ...
errinj.set('ERRINJ_PORT_DUMP', false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
cn:close()
----
-...
+ | ---
+ | ...
box.schema.user.revoke('guest', 'read', 'space', '_space')
----
-...
+ | ---
+ | ...
+
run()
----
-- error: Can't start a checkpoint while in cascading rollback
-...
+ | ---
+ | - error: Can't start a checkpoint while in cascading rollback
+ | ...
ch:get()
----
-- true
-...
+ | ---
+ | - true
+ | ...
+
box.space.test:select()
----
-- []
-...
+ | ---
+ | - []
+ | ...
test_run:cmd('restart server default')
+ |
box.space.test:select()
----
-- []
-...
+ | ---
+ | - []
+ | ...
box.space.test:drop()
----
-...
+ | ---
+ | ...
+
errinj = box.error.injection
----
-...
+ | ---
+ | ...
net_box = require('net.box')
----
-...
+ | ---
+ | ...
fiber = require'fiber'
----
-...
+ | ---
+ | ...
+
s = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
_ = s:create_index('pk')
----
-...
+ | ---
+ | ...
+
ch = fiber.channel(2)
----
-...
+ | ---
+ | ...
+
test_run:cmd("setopt delimiter ';'")
----
-- true
-...
+ | ---
+ | - true
+ | ...
function test(tuple)
ch:put({pcall(s.replace, s, tuple)})
end;
----
-...
+ | ---
+ | ...
test_run:cmd("setopt delimiter ''");
----
-- true
-...
+ | ---
+ | - true
+ | ...
+
errinj.set("ERRINJ_WAL_WRITE", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
_ = {fiber.create(test, {1, 2, 3}), fiber.create(test, {3, 4, 5})}
----
-...
+ | ---
+ | ...
+
{ch:get(), ch:get()}
----
-- - - false
- - Failed to write to disk
- - - false
- - Failed to write to disk
-...
+ | ---
+ | - - - false
+ | - Failed to write to disk
+ | - - false
+ | - Failed to write to disk
+ | ...
errinj.set("ERRINJ_WAL_WRITE", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
s:drop()
----
-...
+ | ---
+ | ...
+
-- rebuild some secondary indexes if the primary was changed
s = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
i1 = s:create_index('i1', {parts = {1, 'unsigned'}})
----
-...
+ | ---
+ | ...
--i2 = s:create_index('i2', {parts = {5, 'unsigned'}, unique = false})
--i3 = s:create_index('i3', {parts = {6, 'unsigned'}, unique = false})
i2 = i1 i3 = i1
----
-...
+ | ---
+ | ...
+
_ = s:insert{1, 4, 3, 4, 10, 10}
----
-...
+ | ---
+ | ...
_ = s:insert{2, 3, 1, 2, 10, 10}
----
-...
+ | ---
+ | ...
_ = s:insert{3, 2, 2, 1, 10, 10}
----
-...
+ | ---
+ | ...
_ = s:insert{4, 1, 4, 3, 10, 10}
----
-...
+ | ---
+ | ...
+
i1:select{}
----
-- - [1, 4, 3, 4, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [4, 1, 4, 3, 10, 10]
-...
+ | ---
+ | - - [1, 4, 3, 4, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [4, 1, 4, 3, 10, 10]
+ | ...
i2:select{}
----
-- - [1, 4, 3, 4, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [4, 1, 4, 3, 10, 10]
-...
+ | ---
+ | - - [1, 4, 3, 4, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [4, 1, 4, 3, 10, 10]
+ | ...
i3:select{}
----
-- - [1, 4, 3, 4, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [4, 1, 4, 3, 10, 10]
-...
+ | ---
+ | - - [1, 4, 3, 4, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [4, 1, 4, 3, 10, 10]
+ | ...
+
i1:alter({parts={2, 'unsigned'}})
----
-...
+ | ---
+ | ...
+
_ = collectgarbage('collect')
----
-...
+ | ---
+ | ...
i1:select{}
----
-- - [4, 1, 4, 3, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [1, 4, 3, 4, 10, 10]
-...
+ | ---
+ | - - [4, 1, 4, 3, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [1, 4, 3, 4, 10, 10]
+ | ...
i2:select{}
----
-- - [4, 1, 4, 3, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [1, 4, 3, 4, 10, 10]
-...
+ | ---
+ | - - [4, 1, 4, 3, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [1, 4, 3, 4, 10, 10]
+ | ...
i3:select{}
----
-- - [4, 1, 4, 3, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [1, 4, 3, 4, 10, 10]
-...
+ | ---
+ | - - [4, 1, 4, 3, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [1, 4, 3, 4, 10, 10]
+ | ...
+
box.error.injection.set('ERRINJ_BUILD_INDEX', i2.id)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
i1:alter{parts = {3, "unsigned"}}
----
-- error: Error injection 'build index'
-...
+ | ---
+ | - error: Error injection 'build index'
+ | ...
+
_ = collectgarbage('collect')
----
-...
+ | ---
+ | ...
i1:select{}
----
-- - [4, 1, 4, 3, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [1, 4, 3, 4, 10, 10]
-...
+ | ---
+ | - - [4, 1, 4, 3, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [1, 4, 3, 4, 10, 10]
+ | ...
i2:select{}
----
-- - [4, 1, 4, 3, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [1, 4, 3, 4, 10, 10]
-...
+ | ---
+ | - - [4, 1, 4, 3, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [1, 4, 3, 4, 10, 10]
+ | ...
i3:select{}
----
-- - [4, 1, 4, 3, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [1, 4, 3, 4, 10, 10]
-...
+ | ---
+ | - - [4, 1, 4, 3, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [1, 4, 3, 4, 10, 10]
+ | ...
+
box.error.injection.set('ERRINJ_BUILD_INDEX', i3.id)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
i1:alter{parts = {4, "unsigned"}}
----
-- error: Error injection 'build index'
-...
+ | ---
+ | - error: Error injection 'build index'
+ | ...
+
_ = collectgarbage('collect')
----
-...
+ | ---
+ | ...
i1:select{}
----
-- - [4, 1, 4, 3, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [1, 4, 3, 4, 10, 10]
-...
+ | ---
+ | - - [4, 1, 4, 3, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [1, 4, 3, 4, 10, 10]
+ | ...
i2:select{}
----
-- - [4, 1, 4, 3, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [1, 4, 3, 4, 10, 10]
-...
+ | ---
+ | - - [4, 1, 4, 3, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [1, 4, 3, 4, 10, 10]
+ | ...
i3:select{}
----
-- - [4, 1, 4, 3, 10, 10]
- - [3, 2, 2, 1, 10, 10]
- - [2, 3, 1, 2, 10, 10]
- - [1, 4, 3, 4, 10, 10]
-...
+ | ---
+ | - - [4, 1, 4, 3, 10, 10]
+ | - [3, 2, 2, 1, 10, 10]
+ | - [2, 3, 1, 2, 10, 10]
+ | - [1, 4, 3, 4, 10, 10]
+ | ...
+
box.error.injection.set('ERRINJ_BUILD_INDEX', -1)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
s:drop()
----
-...
+ | ---
+ | ...
+
--
-- Do not rebuild index if the only change is a key part type
-- compatible change.
--
s = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
pk = s:create_index('pk')
----
-...
+ | ---
+ | ...
sk = s:create_index('sk', {parts = {2, 'unsigned'}})
----
-...
+ | ---
+ | ...
s:replace{1, 1}
----
-- [1, 1]
-...
+ | ---
+ | - [1, 1]
+ | ...
box.error.injection.set('ERRINJ_BUILD_INDEX', sk.id)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
sk:alter({parts = {2, 'number'}})
----
-...
+ | ---
+ | ...
box.error.injection.set('ERRINJ_BUILD_INDEX', -1)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
s:drop()
----
-...
+ | ---
+ | ...
+
--
-- gh-3255: iproto can crash and discard responses, if a network
-- is saturated, and DML yields too long on commit.
--
+
s = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
_ = s:create_index('pk')
----
-...
+ | ---
+ | ...
box.schema.user.grant('guest', 'read,write,alter', 'space', 'test')
----
-...
+ | ---
+ | ...
c = net_box.connect(box.cfg.listen)
----
-...
+ | ---
+ | ...
+
ch = fiber.channel(200)
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_IPROTO_TX_DELAY", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
for i = 1, 100 do fiber.create(function() for j = 1, 10 do c.space.test:replace{1} end ch:put(true) end) end
----
-...
+ | ---
+ | ...
for i = 1, 100 do fiber.create(function() for j = 1, 10 do c.space.test:select() end ch:put(true) end) end
----
-...
+ | ---
+ | ...
for i = 1, 200 do ch:get() end
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_IPROTO_TX_DELAY", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
s:drop()
----
-...
+ | ---
+ | ...
+
--
-- gh-3325: do not cancel already sent requests, when a schema
-- change is detected.
--
+
box.schema.user.grant('guest', 'execute', 'universe')
----
-...
+ | ---
+ | ...
+
s = box.schema.create_space('test')
----
-...
+ | ---
+ | ...
pk = s:create_index('pk')
----
-...
+ | ---
+ | ...
+
box.schema.user.grant('guest', 'read,write,alter', 'space', 'test')
----
-...
+ | ---
+ | ...
box.schema.user.grant('guest', 'create', 'space')
----
-...
+ | ---
+ | ...
box.schema.user.grant('guest', 'write', 'space', '_index')
----
-...
+ | ---
+ | ...
s:replace{1, 1}
----
-- [1, 1]
-...
+ | ---
+ | - [1, 1]
+ | ...
cn = net_box.connect(box.cfg.listen)
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_WAL_DELAY", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
ok = nil
----
-...
+ | ---
+ | ...
err = nil
----
-...
+ | ---
+ | ...
test_run:cmd('setopt delimiter ";"')
----
-- true
-...
+ | ---
+ | - true
+ | ...
f = fiber.create(function()
local str = 'box.space.test:create_index("sk", {parts = {{2, "integer"}}})'
ok, err = pcall(cn.eval, cn, str)
end)
test_run:cmd('setopt delimiter ""');
----
-...
+ | ---
+ | ...
cn.space.test:get{1}
----
-- [1, 1]
-...
+ | ---
+ | - [1, 1]
+ | ...
errinj.set("ERRINJ_WAL_DELAY", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
while ok == nil do fiber.sleep(0.01) end
----
-...
+ | ---
+ | ...
ok, err
----
-- true
-- null
-...
+ | ---
+ | - true
+ | - null
+ | ...
cn:close()
----
-...
+ | ---
+ | ...
s:drop()
----
-...
+ | ---
+ | ...
box.schema.user.revoke('guest', 'execute', 'universe')
----
-...
+ | ---
+ | ...
box.schema.user.revoke('guest', 'create', 'space')
----
-...
+ | ---
+ | ...
box.schema.user.revoke('guest', 'write', 'space', '_index')
----
-...
+ | ---
+ | ...
--
-- If message memory pool is used up, stop the connection, until
-- the pool has free memory.
--
started = 0
----
-...
+ | ---
+ | ...
finished = 0
----
-...
+ | ---
+ | ...
continue = false
----
-...
+ | ---
+ | ...
test_run:cmd('setopt delimiter ";"')
----
-- true
-...
+ | ---
+ | - true
+ | ...
function long_poll_f()
started = started + 1
f = fiber.self()
while not continue do fiber.sleep(0.01) end
finished = finished + 1
end;
----
-...
+ | ---
+ | ...
+
box.schema.func.create('long_poll_f');
----
-...
+ | ---
+ | ...
box.schema.user.grant('guest', 'execute', 'function', 'long_poll_f');
----
-...
+ | ---
+ | ...
+
test_run:cmd('setopt delimiter ""');
----
-- true
-...
+ | ---
+ | - true
+ | ...
cn = net_box.connect(box.cfg.listen)
----
-...
+ | ---
+ | ...
function long_poll() cn:call('long_poll_f') end
----
-...
+ | ---
+ | ...
_ = fiber.create(long_poll)
----
-...
+ | ---
+ | ...
while started ~= 1 do fiber.sleep(0.01) end
----
-...
+ | ---
+ | ...
-- Simulate OOM for new requests.
errinj.set("ERRINJ_TESTING", true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
-- This request tries to allocate memory for request data and
-- fails. This stops the connection until an existing
-- request is finished.
log = require('log')
----
-...
+ | ---
+ | ...
-- Fill the log with garbage to not accidentally read log messages
-- produced by a previous test.
log.info(string.rep('a', 1000))
----
-...
+ | ---
+ | ...
_ = fiber.create(long_poll)
----
-...
+ | ---
+ | ...
while not test_run:grep_log('default', 'can not allocate memory for a new message', 1000) do fiber.sleep(0.01) end
----
-...
+ | ---
+ | ...
test_run:grep_log('default', 'stopping input on connection', 1000) ~= nil
----
-- true
-...
+ | ---
+ | - true
+ | ...
started == 1
----
-- true
-...
+ | ---
+ | - true
+ | ...
continue = true
----
-...
+ | ---
+ | ...
errinj.set("ERRINJ_TESTING", false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
-- Ensure that when memory is available again, the pending
-- request is executed.
while finished ~= 2 do fiber.sleep(0.01) end
----
-...
+ | ---
+ | ...
cn:close()
----
-...
+ | ---
+ | ...
+
box.schema.user.revoke('guest', 'execute', 'function', 'long_poll_f')
----
-...
+ | ---
+ | ...
box.schema.func.drop('long_poll_f')
----
-...
+ | ---
+ | ...
--
-- gh-3289: drop/truncate leaves the space in inconsistent
-- state if WAL write fails.
--
s = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
_ = s:create_index('pk')
----
-...
+ | ---
+ | ...
for i = 1, 10 do s:replace{i} end
----
-...
+ | ---
+ | ...
errinj.set('ERRINJ_WAL_IO', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
s:drop()
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
s:truncate()
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
s:drop()
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
s:truncate()
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
errinj.set('ERRINJ_WAL_IO', false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
for i = 1, 10 do s:replace{i + 10} end
----
-...
+ | ---
+ | ...
s:select()
----
-- - [1]
- - [2]
- - [3]
- - [4]
- - [5]
- - [6]
- - [7]
- - [8]
- - [9]
- - [10]
- - [11]
- - [12]
- - [13]
- - [14]
- - [15]
- - [16]
- - [17]
- - [18]
- - [19]
- - [20]
-...
+ | ---
+ | - - [1]
+ | - [2]
+ | - [3]
+ | - [4]
+ | - [5]
+ | - [6]
+ | - [7]
+ | - [8]
+ | - [9]
+ | - [10]
+ | - [11]
+ | - [12]
+ | - [13]
+ | - [14]
+ | - [15]
+ | - [16]
+ | - [17]
+ | - [18]
+ | - [19]
+ | - [20]
+ | ...
s:drop()
----
-...
+ | ---
+ | ...
+
--
-- gh-3432: check that deletion of temporary tuples is not delayed
-- if snapshot is in progress.
--
test_run:cmd("create server test with script='box/lua/cfg_memory.lua'")
----
-- true
-...
+ | ---
+ | - true
+ | ...
test_run:cmd(string.format("start server test with args='%d'", 100 * 1024 * 1024))
----
-- true
-...
+ | ---
+ | - true
+ | ...
test_run:cmd("switch test")
----
-- true
-...
+ | ---
+ | - true
+ | ...
+
fiber = require('fiber')
----
-...
+ | ---
+ | ...
+
-- Create a persistent space.
_ = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
_ = box.space.test:create_index('pk')
----
-...
+ | ---
+ | ...
for i = 1, 100 do box.space.test:insert{i} end
----
-...
+ | ---
+ | ...
+
-- Create a temporary space.
count = 500
----
-...
+ | ---
+ | ...
pad = string.rep('x', 100 * 1024)
----
-...
+ | ---
+ | ...
_ = box.schema.space.create('tmp', {temporary = true})
----
-...
+ | ---
+ | ...
_ = box.space.tmp:create_index('pk')
----
-...
+ | ---
+ | ...
for i = 1, count do box.space.tmp:insert{i, pad} end
----
-...
+ | ---
+ | ...
+
-- Start background snapshot.
c = fiber.channel(1)
----
-...
+ | ---
+ | ...
box.error.injection.set('ERRINJ_SNAP_WRITE_DELAY', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
_ = fiber.create(function() box.snapshot() c:put(true) end)
----
-...
+ | ---
+ | ...
+
-- Overwrite data stored in the temporary space while snapshot
-- is in progress to make sure that tuples stored in it are freed
-- immediately.
for i = 1, count do box.space.tmp:delete{i} end
----
-...
+ | ---
+ | ...
_ = collectgarbage('collect')
----
-...
+ | ---
+ | ...
for i = 1, count do box.space.tmp:insert{i, pad} end
----
-...
+ | ---
+ | ...
+
box.error.injection.set('ERRINJ_SNAP_WRITE_DELAY', false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
c:get()
----
-- true
-...
+ | ---
+ | - true
+ | ...
+
box.space.tmp:drop()
----
-...
+ | ---
+ | ...
box.space.test:drop()
----
-...
+ | ---
+ | ...
+
test_run:cmd("switch default")
----
-- true
-...
+ | ---
+ | - true
+ | ...
test_run:cmd("stop server test")
----
-- true
-...
+ | ---
+ | - true
+ | ...
test_run:cmd("cleanup server test")
----
-- true
-...
+ | ---
+ | - true
+ | ...
+
--
-- gh-3406: check that incomplete files got cleaned up after restart.
--
fio = require('fio')
----
-...
+ | ---
+ | ...
fiber = require('fiber')
----
-...
+ | ---
+ | ...
+
-- Check that snap.inprogress files are removed.
_ = box.schema.space.create('test')
----
-...
+ | ---
+ | ...
_ = box.space.test:create_index('primary')
----
-...
+ | ---
+ | ...
for i = 1, 10 do box.space.test:insert{i} end
----
-...
+ | ---
+ | ...
+
errinj.set('ERRINJ_SNAP_WRITE_DELAY', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
_ = fiber.create(function() box.snapshot() end)
----
-...
+ | ---
+ | ...
path = fio.pathjoin(box.cfg.memtx_dir, '*.snap.inprogress')
----
-...
+ | ---
+ | ...
while #fio.glob(path) == 0 do fiber.sleep(0.001) end
----
-...
+ | ---
+ | ...
#fio.glob(path) > 0
----
-- true
-...
+ | ---
+ | - true
+ | ...
+
test_run:cmd('restart server default')
+ |
+
fio = require('fio')
----
-...
+ | ---
+ | ...
fiber = require('fiber')
----
-...
+ | ---
+ | ...
errinj = box.error.injection
----
-...
+ | ---
+ | ...
+
#fio.glob(fio.pathjoin(box.cfg.memtx_dir, "*.snap.inprogress")) == 0
----
-- true
-...
+ | ---
+ | - true
+ | ...
box.space.test:drop()
----
-...
+ | ---
+ | ...
+
-- Check that run.inprogress, index.inprogress, and vylog.inprogress
-- files are removed.
_ = box.schema.space.create('test', {engine = 'vinyl'})
----
-...
+ | ---
+ | ...
_ = box.space.test:create_index('primary')
----
-...
+ | ---
+ | ...
+
errinj.set('ERRINJ_VY_LOG_FILE_RENAME', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.snapshot()
----
-- error: Error injection 'vinyl log file rename'
-...
+ | ---
+ | - error: Error injection 'vinyl log file rename'
+ | ...
errinj.set('ERRINJ_VY_LOG_FILE_RENAME', false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
errinj.set('ERRINJ_VY_GC', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
errinj.set('ERRINJ_VY_SCHED_TIMEOUT', 0.001)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
errinj.set('ERRINJ_VY_RUN_FILE_RENAME', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.space.test:insert{1}
----
-- [1]
-...
+ | ---
+ | - [1]
+ | ...
box.snapshot() -- error
----
-- error: Error injection 'vinyl run file rename'
-...
+ | ---
+ | - error: Error injection 'vinyl run file rename'
+ | ...
errinj.set('ERRINJ_VY_RUN_FILE_RENAME', false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
-- Wait for the scheduler to unthrottle.
repeat fiber.sleep(0.001) until pcall(box.snapshot)
----
-...
+ | ---
+ | ...
+
errinj.set('ERRINJ_VY_INDEX_FILE_RENAME', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.space.test:insert{2}
----
-- [2]
-...
+ | ---
+ | - [2]
+ | ...
box.snapshot() -- error
----
-- error: Error injection 'vinyl index file rename'
-...
+ | ---
+ | - error: Error injection 'vinyl index file rename'
+ | ...
errinj.set('ERRINJ_VY_INDEX_FILE_RENAME', false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
errinj.set('ERRINJ_VY_SCHED_TIMEOUT', 0)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
errinj.set('ERRINJ_VY_GC', false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
+
test_run:cmd('restart server default')
+ |
+
fio = require('fio')
----
-...
+ | ---
+ | ...
#fio.glob(fio.pathjoin(box.cfg.vinyl_dir, '*.vylog.inprogress')) == 0
----
-- true
-...
+ | ---
+ | - true
+ | ...
#fio.glob(fio.pathjoin(box.cfg.vinyl_dir, box.space.test.id, 0, '*.run.inprogress')) == 0
----
-- true
-...
+ | ---
+ | - true
+ | ...
#fio.glob(fio.pathjoin(box.cfg.vinyl_dir, box.space.test.id, 0, '*.index.inprogress')) == 0
----
-- true
-...
+ | ---
+ | - true
+ | ...
+
box.space.test:drop()
----
-...
+ | ---
+ | ...
+
-- gh-4276 - check grant privilege rollback
_ = box.schema.user.create('testg')
----
-...
+ | ---
+ | ...
_ = box.schema.space.create('testg'):create_index('pk')
----
-...
+ | ---
+ | ...
+
box.error.injection.set('ERRINJ_WAL_IO', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
-- the grant operation above fails and test hasn't any space test permissions
box.schema.user.grant('testg', 'read,write', 'space', 'testg')
----
-- error: Failed to write to disk
-...
+ | ---
+ | - error: Failed to write to disk
+ | ...
-- switch user and check they couldn't select
box.session.su('testg')
----
-...
+ | ---
+ | ...
box.space.testg:select()
----
-- error: Read access to space 'testg' is denied for user 'testg'
-...
+ | ---
+ | - error: Read access to space 'testg' is denied for user 'testg'
+ | ...
box.session.su('admin')
----
-...
+ | ---
+ | ...
box.error.injection.set('ERRINJ_WAL_IO', false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.schema.user.drop('testg')
----
-...
+ | ---
+ | ...
box.space.testg:drop()
----
-...
+ | ---
+ | ...
+
--
-- Errinj:get().
--
box.error.injection.get('bad name')
----
-- 'error: can''t find error injection ''bad name'''
-...
+ | ---
+ | - 'error: can''t find error injection ''bad name'''
+ | ...
box.error.injection.set('ERRINJ_WAL_IO', true)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.error.injection.get('ERRINJ_WAL_IO')
----
-- true
-...
+ | ---
+ | - true
+ | ...
box.error.injection.set('ERRINJ_WAL_IO', false)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.error.injection.get('ERRINJ_WAL_IO')
----
-- false
-...
+ | ---
+ | - false
+ | ...
box.error.injection.set('ERRINJ_TUPLE_FORMAT_COUNT', 20)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.error.injection.get('ERRINJ_TUPLE_FORMAT_COUNT')
----
-- 20
-...
+ | ---
+ | - 20
+ | ...
box.error.injection.set('ERRINJ_TUPLE_FORMAT_COUNT', -1)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.error.injection.get('ERRINJ_TUPLE_FORMAT_COUNT')
----
-- -1
-...
+ | ---
+ | - -1
+ | ...
box.error.injection.set('ERRINJ_RELAY_TIMEOUT', 0.5)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.error.injection.get('ERRINJ_RELAY_TIMEOUT')
----
-- 0.5
-...
+ | ---
+ | - 0.5
+ | ...
box.error.injection.set('ERRINJ_RELAY_TIMEOUT', 0)
----
-- ok
-...
+ | ---
+ | - ok
+ | ...
box.error.injection.get('ERRINJ_RELAY_TIMEOUT')
----
-- 0
-...
+ | ---
+ | - 0
+ | ...
--
2.20.1
* [Tarantool-patches] [PATCH v7 5/5] test: add replication/applier-rollback
2020-01-28 19:22 [Tarantool-patches] [PATCH v7 0/5] box/replication: add missing diag set and fix sigsegv Cyrill Gorcunov
` (3 preceding siblings ...)
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 4/5] errinj: add ERRINJ_REPLICA_TXN_WRITE Cyrill Gorcunov
@ 2020-01-28 19:22 ` Cyrill Gorcunov
4 siblings, 0 replies; 15+ messages in thread
From: Cyrill Gorcunov @ 2020-01-28 19:22 UTC (permalink / raw)
To: tml
In the test, force the ERRINJ_REPLICA_TXN_WRITE error injection
to fire, which initiates an applier transaction rollback.
Without the fix this causes a SIGSEGV, because the error is
not propagated to the rollback path.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
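For context, the failure mode this series guards against can be
sketched in plain C. This is a minimal illustration under stated
assumptions, not the actual Tarantool code: every toy_* name below is
a hypothetical stand-in for the diag area and the allocation helpers.
The point it tries to show is that a function returning -1 has to
record an error first, because the caller reads the last error when
it raises it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical error record; not Tarantool's struct error. */
struct toy_error {
	const char *type;
	size_t size;
};

/* Hypothetical diagnostics area holding the last recorded error. */
struct toy_diag {
	struct toy_error *last;
};

static struct toy_diag toy_diag_area = { NULL };
static struct toy_error toy_oom_error;

/* Record an out-of-memory error in the diagnostics area. */
static void
toy_diag_set_oom(size_t size)
{
	toy_oom_error.type = "OutOfMemory";
	toy_oom_error.size = size;
	toy_diag_area.last = &toy_oom_error;
}

/* Allocate memory; a nonzero 'fail' simulates allocator failure. */
static void *
toy_alloc(size_t size, int fail)
{
	return fail ? NULL : malloc(size);
}

/*
 * Follow the rule from the series: return -1 only after an
 * error has been recorded, so the caller can raise it safely.
 */
static int
toy_request_create(size_t size, int fail, void **out)
{
	void *buf = toy_alloc(size, fail);
	if (buf == NULL) {
		toy_diag_set_oom(size);
		return -1;
	}
	*out = buf;
	return 0;
}

/*
 * Caller side: reads the last error. If the callee returned -1
 * without recording an error, 'last' would be NULL here and the
 * dereference would crash; the assert makes that explicit.
 */
static const char *
toy_diag_last_type(void)
{
	assert(toy_diag_area.last != NULL);
	return toy_diag_area.last->type;
}
```

A caller would check the return value of toy_request_create() and
call toy_diag_last_type() only on -1. Dropping the toy_diag_set_oom()
call in the sketch reproduces the class of bug described here: the
failure is reported, but there is no error object to read.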
test/replication/applier-rollback-slave.lua | 16 ++
test/replication/applier-rollback.result | 160 ++++++++++++++++++++
test/replication/applier-rollback.test.lua | 79 ++++++++++
test/replication/suite.ini | 2 +-
4 files changed, 256 insertions(+), 1 deletion(-)
create mode 100644 test/replication/applier-rollback-slave.lua
create mode 100644 test/replication/applier-rollback.result
create mode 100644 test/replication/applier-rollback.test.lua
diff --git a/test/replication/applier-rollback-slave.lua b/test/replication/applier-rollback-slave.lua
new file mode 100644
index 000000000..26fb10055
--- /dev/null
+++ b/test/replication/applier-rollback-slave.lua
@@ -0,0 +1,16 @@
+--
+-- vim: ts=4 sw=4 et
+--
+
+print('arg', arg)
+
+box.cfg({
+ replication = os.getenv("MASTER"),
+ listen = os.getenv("LISTEN"),
+ memtx_memory = 107374182,
+ replication_timeout = 0.1,
+ replication_connect_timeout = 0.5,
+ read_only = true,
+})
+
+require('console').listen(os.getenv('ADMIN'))
diff --git a/test/replication/applier-rollback.result b/test/replication/applier-rollback.result
new file mode 100644
index 000000000..3209fc7fd
--- /dev/null
+++ b/test/replication/applier-rollback.result
@@ -0,0 +1,160 @@
+-- test-run result file version 2
+#!/usr/bin/env tarantool
+ | ---
+ | ...
+--
+-- vim: ts=4 sw=4 et
+--
+
+test_run = require('test_run').new()
+ | ---
+ | ...
+
+errinj = box.error.injection
+ | ---
+ | ...
+engine = test_run:get_cfg('engine')
+ | ---
+ | ...
+
+--
+-- Allow replica to connect to us
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+
+--
+-- Create replica instance, we're the master and
+-- start it, no data to sync yet though
+test_run:cmd("create server replica_slave with rpl_master=default, script='replication/applier-rollback-slave.lua'")
+ | ---
+ | - true
+ | ...
+test_run:cmd("start server replica_slave")
+ | ---
+ | - true
+ | ...
+
+--
+-- Fill initial data on the master instance
+test_run:cmd('switch default')
+ | ---
+ | - true
+ | ...
+
+_ = box.schema.space.create('test', {engine=engine})
+ | ---
+ | ...
+s = box.space.test
+ | ---
+ | ...
+
+s:format({{name = 'id', type = 'unsigned'}, {name = 'band_name', type = 'string'}})
+ | ---
+ | ...
+
+_ = s:create_index('primary', {type = 'tree', parts = {'id'}})
+ | ---
+ | ...
+s:insert({1, '1'})
+ | ---
+ | - [1, '1']
+ | ...
+s:insert({2, '2'})
+ | ---
+ | - [2, '2']
+ | ...
+s:insert({3, '3'})
+ | ---
+ | - [3, '3']
+ | ...
+
+--
+-- To make sure we're running
+box.info.status
+ | ---
+ | - running
+ | ...
+
+--
+-- Wait for data from master get propagated
+test_run:wait_lsn('replica_slave', 'default')
+ | ---
+ | ...
+
+--
+-- Now inject error into slave instance
+test_run:cmd('switch replica_slave')
+ | ---
+ | - true
+ | ...
+
+--
+-- To make sure we're running
+box.info.status
+ | ---
+ | - running
+ | ...
+
+errinj = box.error.injection
+ | ---
+ | ...
+errinj.set('ERRINJ_REPLICA_TXN_WRITE', true)
+ | ---
+ | - ok
+ | ...
+
+--
+-- Jump back to master node and write new
+-- entry which should cause error to happen
+-- on slave instance
+test_run:cmd('switch default')
+ | ---
+ | - true
+ | ...
+s:insert({4, '4'})
+ | ---
+ | - [4, '4']
+ | ...
+
+--
+-- Wait for error to trigger
+test_run:cmd('switch replica_slave')
+ | ---
+ | - true
+ | ...
+fiber = require('fiber')
+ | ---
+ | ...
+while test_run:grep_log('replica_slave', 'ER_INJECTION:[^\n]*') == nil do fiber.sleep(0.1) end
+ | ---
+ | ...
+
+----
+---- Such error cause the applier to be
+---- cancelled and reaped, thus stop the
+---- slave node and cleanup
+test_run:cmd('switch default')
+ | ---
+ | - true
+ | ...
+
+--
+-- Cleanup
+test_run:cmd("stop server replica_slave")
+ | ---
+ | - true
+ | ...
+test_run:cmd("delete server replica_slave")
+ | ---
+ | - true
+ | ...
+box.cfg{replication=""}
+ | ---
+ | ...
+box.space.test:drop()
+ | ---
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
diff --git a/test/replication/applier-rollback.test.lua b/test/replication/applier-rollback.test.lua
new file mode 100644
index 000000000..d31eff9f0
--- /dev/null
+++ b/test/replication/applier-rollback.test.lua
@@ -0,0 +1,79 @@
+#!/usr/bin/env tarantool
+--
+-- vim: ts=4 sw=4 et
+--
+
+test_run = require('test_run').new()
+
+errinj = box.error.injection
+engine = test_run:get_cfg('engine')
+
+--
+-- Allow replica to connect to us
+box.schema.user.grant('guest', 'replication')
+
+--
+-- Create replica instance, we're the master and
+-- start it, no data to sync yet though
+test_run:cmd("create server replica_slave with rpl_master=default, script='replication/applier-rollback-slave.lua'")
+test_run:cmd("start server replica_slave")
+
+--
+-- Fill initial data on the master instance
+test_run:cmd('switch default')
+
+_ = box.schema.space.create('test', {engine=engine})
+s = box.space.test
+
+s:format({{name = 'id', type = 'unsigned'}, {name = 'band_name', type = 'string'}})
+
+_ = s:create_index('primary', {type = 'tree', parts = {'id'}})
+s:insert({1, '1'})
+s:insert({2, '2'})
+s:insert({3, '3'})
+
+--
+-- To make sure we're running
+box.info.status
+
+--
+-- Wait for data from master get propagated
+test_run:wait_lsn('replica_slave', 'default')
+
+--
+-- Now inject error into slave instance
+test_run:cmd('switch replica_slave')
+
+--
+-- To make sure we're running
+box.info.status
+
+errinj = box.error.injection
+errinj.set('ERRINJ_REPLICA_TXN_WRITE', true)
+
+--
+-- Jump back to master node and write new
+-- entry which should cause error to happen
+-- on slave instance
+test_run:cmd('switch default')
+s:insert({4, '4'})
+
+--
+-- Wait for error to trigger
+test_run:cmd('switch replica_slave')
+fiber = require('fiber')
+while test_run:grep_log('replica_slave', 'ER_INJECTION:[^\n]*') == nil do fiber.sleep(0.1) end
+
+----
+---- Such error cause the applier to be
+---- cancelled and reaped, thus stop the
+---- slave node and cleanup
+test_run:cmd('switch default')
+
+--
+-- Cleanup
+test_run:cmd("stop server replica_slave")
+test_run:cmd("delete server replica_slave")
+box.cfg{replication=""}
+box.space.test:drop()
+box.schema.user.revoke('guest', 'replication')
diff --git a/test/replication/suite.ini b/test/replication/suite.ini
index ed1de3140..b804b85f6 100644
--- a/test/replication/suite.ini
+++ b/test/replication/suite.ini
@@ -3,7 +3,7 @@ core = tarantool
script = master.lua
description = tarantool/box, replication
disabled = consistent.test.lua
-release_disabled = catch.test.lua errinj.test.lua gc.test.lua gc_no_space.test.lua before_replace.test.lua quorum.test.lua recover_missing_xlog.test.lua sync.test.lua long_row_timeout.test.lua
+release_disabled = catch.test.lua errinj.test.lua gc.test.lua gc_no_space.test.lua before_replace.test.lua quorum.test.lua recover_missing_xlog.test.lua sync.test.lua long_row_timeout.test.lua applier-rollback.test.lua
config = suite.cfg
lua_libs = lua/fast_replica.lua lua/rlimit.lua
use_unix_sockets = True
--
2.20.1
* Re: [Tarantool-patches] [PATCH v7 1/5] box/request: add missing OutOfMemory diag_set
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 1/5] box/request: add missing OutOfMemory diag_set Cyrill Gorcunov
@ 2020-02-03 14:37 ` Sergey Ostanevich
0 siblings, 0 replies; 15+ messages in thread
From: Sergey Ostanevich @ 2020-02-03 14:37 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: tml
Thanks!
LGTM.
Sergos
On 28 Jan 22:22, Cyrill Gorcunov wrote:
> In request_create_from_tuple and request_handle_sequence
> we may be unable to request memory for tuples, don't
> forget to setup diag error otherwise diag_raise will
> lead to nil dereference.
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
> src/box/request.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/box/request.c b/src/box/request.c
> index 82232a155..994f2da62 100644
> --- a/src/box/request.c
> +++ b/src/box/request.c
> @@ -109,8 +109,10 @@ request_create_from_tuple(struct request *request, struct space *space,
> * the tuple data to WAL on commit.
> */
> char *buf = region_alloc(&fiber()->gc, size);
> - if (buf == NULL)
> + if (buf == NULL) {
> + diag_set(OutOfMemory, size, "region_alloc", "tuple");
> return -1;
> + }
> memcpy(buf, data, size);
> request->tuple = buf;
> request->tuple_end = buf + size;
> @@ -199,8 +201,10 @@ request_handle_sequence(struct request *request, struct space *space)
> size_t buf_size = (request->tuple_end - request->tuple) +
> mp_sizeof_uint(UINT64_MAX);
> char *tuple = region_alloc(&fiber()->gc, buf_size);
> - if (tuple == NULL)
> + if (tuple == NULL) {
> + diag_set(OutOfMemory, buf_size, "region_alloc", "tuple");
> return -1;
> + }
> char *tuple_end = mp_encode_array(tuple, len);
>
> if (unlikely(key != data)) {
> --
> 2.20.1
>
* Re: [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure Cyrill Gorcunov
@ 2020-02-03 14:39 ` Sergey Ostanevich
2020-02-04 22:15 ` Konstantin Osipov
0 siblings, 1 reply; 15+ messages in thread
From: Sergey Ostanevich @ 2020-02-03 14:39 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: tml
Hi!
Thanks for the patch!
LGTM.
Sergos
On 28 Jan 22:22, Cyrill Gorcunov wrote:
> In case if we're hitting memory limit allocating triggers
> we should setup diag error to prevent nil dereference
> in diag_raise call (for example from applier_apply_tx).
>
> Note that there are region_alloc_xc helpers which are
> throwing errors but as far as I understand we need the
> rollback action to process first instead of immediate
> throw/catch thus we use diag_set.
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
> src/box/applier.cc | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index ae3d281a5..2ed5125d0 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -796,8 +796,11 @@ applier_apply_tx(struct stailq *rows)
> sizeof(struct trigger));
> on_commit = (struct trigger *)region_alloc(&txn->region,
> sizeof(struct trigger));
> - if (on_rollback == NULL || on_commit == NULL)
> + if (on_rollback == NULL || on_commit == NULL) {
> + diag_set(OutOfMemory, sizeof(struct trigger),
> + "region_alloc", "on_rollback/on_commit");
> goto rollback;
> + }
>
> trigger_create(on_rollback, applier_txn_rollback_cb, NULL, NULL);
> txn_on_rollback(txn, on_rollback);
> --
> 2.20.1
>
* Re: [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure
2020-02-03 14:39 ` Sergey Ostanevich
@ 2020-02-04 22:15 ` Konstantin Osipov
2020-02-05 7:46 ` Cyrill Gorcunov
0 siblings, 1 reply; 15+ messages in thread
From: Konstantin Osipov @ 2020-02-04 22:15 UTC (permalink / raw)
To: Sergey Ostanevich; +Cc: tml
* Sergey Ostanevich <sergos@tarantool.org> [20/02/03 17:42]:
> Hi!
>
> Thanks for the patch!
>
> LGTM.
This code is dead actually. There is no region quota and OOM is
impossible here. We haven't had a policy to check these errors
before.
No harm in pushing it, but no value either.
--
Konstantin Osipov, Moscow, Russia
* Re: [Tarantool-patches] [PATCH v7 3/5] box/applier: fix nil dereference in applier rollback
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 3/5] box/applier: fix nil dereference in applier rollback Cyrill Gorcunov
@ 2020-02-04 22:19 ` Konstantin Osipov
2020-02-05 7:33 ` Cyrill Gorcunov
0 siblings, 1 reply; 15+ messages in thread
From: Konstantin Osipov @ 2020-02-04 22:19 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: tml
* Cyrill Gorcunov <gorcunov@gmail.com> [20/01/28 22:58]:
> + /*
> + * Something really bad happened, we can't proceed
> + * thus stop the applier and throw FiberIsCancelled
> + * exception which will be catched by the caller
> + * and the fiber gracefully finish.
> + *
> + * FIXME: Need to make sure that this is a really
> + * final error where we can't longer proceed and should
> + * zap the applier, probably we could reconnect and
> + * retry instead?
> + */
> fiber_cancel(applier->reader);
> + diag_set(FiberIsCancelled);
Now that I have seen the entire series and the test case, I think
there are two different issues here and only one of them is
leading to a crash.
One, is that we reset the original error with some vague
ER_WAL_IO. This shouldn't happen, but it's harmless. Still, your
fix for it is good. I don't think though it should check for !e.
Two, is a failure to propagate the error from the cancelled
applier fiber.
I think it's better to fix the call site to raise
replicaset.applier.diag than to add yet another vague error
(FiberIsCancelled) topped with prolific comments about how
the current code is broken.
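The second point can be sketched as follows. This is only an illustration in plain C, not Tarantool code: `error_sketch`, `worker_state_sketch`, `worker_fail_sketch` and `worker_join_sketch` are hypothetical names, and no fibers are involved. The idea is that the failing worker saves its error in shared state, and the call site reports that saved error rather than setting a new generic one.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical error record; not Tarantool's struct error. */
struct error_sketch {
	int code;
	char msg[64];
};

/* Hypothetical state shared between a worker and its caller. */
struct worker_state_sketch {
	bool has_error;
	struct error_sketch diag;
};

/* Worker side: save the error that made the worker stop. */
static void
worker_fail_sketch(struct worker_state_sketch *state, int code,
		   const char *msg)
{
	state->has_error = true;
	state->diag.code = code;
	snprintf(state->diag.msg, sizeof(state->diag.msg), "%s", msg);
}

/*
 * Call site: after the worker has stopped, hand back the error it
 * saved instead of inventing a generic "cancelled" one.
 * Returns 0 if the worker finished cleanly, -1 otherwise.
 */
static int
worker_join_sketch(const struct worker_state_sketch *state,
		   struct error_sketch *out)
{
	if (!state->has_error)
		return 0;
	*out = state->diag;
	return -1;
}
```

With this shape the caller sees the original cause (for example the injected error) and no extra error type is needed.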
--
Konstantin Osipov, Moscow, Russia
* Re: [Tarantool-patches] [PATCH v7 4/5] errinj: add ERRINJ_REPLICA_TXN_WRITE
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 4/5] errinj: add ERRINJ_REPLICA_TXN_WRITE Cyrill Gorcunov
@ 2020-02-04 22:45 ` Konstantin Osipov
0 siblings, 0 replies; 15+ messages in thread
From: Konstantin Osipov @ 2020-02-04 22:45 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: tml
* Cyrill Gorcunov <gorcunov@gmail.com> [20/01/28 22:58]:
> To test rollback error nil dereference
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
> src/box/applier.cc | 6 +
> src/lib/core/errinj.h | 1 +
> test/box/errinj.result | 2614 +++++++++++++++++++++-------------------
> 3 files changed, 1365 insertions(+), 1256 deletions(-)
>
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index 967dc91de..e739f23e2 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -51,6 +51,7 @@
> #include "txn.h"
> #include "box.h"
> #include "scoped_guard.h"
> +#include "errinj.h"
>
> STRS(applier_state, applier_STATE);
>
> @@ -830,6 +831,11 @@ applier_apply_tx(struct stailq *rows)
> trigger_create(on_commit, applier_txn_commit_cb, NULL, NULL);
> txn_on_commit(txn, on_commit);
>
> + ERROR_INJECT(ERRINJ_REPLICA_TXN_WRITE, {
> + diag_set(ClientError, ER_INJECTION, "replica txn write injection");
> + goto rollback;
> + });
If the source of the error is in txn_prepare(), I think the injection
should be moved into it. But I see no way txn_prepare() could fail
inside an applier, as long as we deal with memtx spaces.
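A rough sketch of what placing the injection at the source of the failure could look like. Everything here is a simplified stand-in: `ERROR_INJECT_SKETCH`, `errinj_flags`, `prepare_sketch` and `apply_sketch` are hypothetical and only imitate the shape of the real ERROR_INJECT macro and the applier path.

```c
#include <stdbool.h>

/* Hypothetical injection ids; the real list lives in errinj.h. */
enum errinj_id_sketch {
	ERRINJ_SKETCH_TXN_WRITE,
	errinj_sketch_MAX
};

/* One flag per injection point, all disabled by default. */
static bool errinj_flags[errinj_sketch_MAX];

/* Run CODE only when the given injection is enabled. */
#define ERROR_INJECT_SKETCH(ID, CODE) \
	do { if (errinj_flags[ID]) CODE; } while (0)

/* Stand-in for the function that is the real source of the error. */
static int
prepare_sketch(void)
{
	ERROR_INJECT_SKETCH(ERRINJ_SKETCH_TXN_WRITE, { return -1; });
	return 0;
}

/* Stand-in for the caller: it only reacts to the failure. */
static int
apply_sketch(void)
{
	if (prepare_sketch() != 0)
		return -1; /* the rollback path would start here */
	return 0;
}
```

The caller then exercises its normal error handling, and the test does not depend on an injection point that exists only in the caller.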
--
Konstantin Osipov, Moscow, Russia
* Re: [Tarantool-patches] [PATCH v7 3/5] box/applier: fix nil dereference in applier rollback
2020-02-04 22:19 ` Konstantin Osipov
@ 2020-02-05 7:33 ` Cyrill Gorcunov
0 siblings, 0 replies; 15+ messages in thread
From: Cyrill Gorcunov @ 2020-02-05 7:33 UTC (permalink / raw)
To: Konstantin Osipov; +Cc: tml
On Wed, Feb 05, 2020 at 01:19:02AM +0300, Konstantin Osipov wrote:
> * Cyrill Gorcunov <gorcunov@gmail.com> [20/01/28 22:58]:
> > + /*
> > + * Something really bad happened, we can't proceed
> > + * thus stop the applier and throw FiberIsCancelled
> > + * exception which will be catched by the caller
> > + * and the fiber gracefully finish.
> > + *
> > + * FIXME: Need to make sure that this is a really
> > + * final error where we can't longer proceed and should
> > + * zap the applier, probably we could reconnect and
> > + * retry instead?
> > + */
> > fiber_cancel(applier->reader);
> > + diag_set(FiberIsCancelled);
>
> Now that I have seen the entire series and the test case, I think
> there are two different issues here and only one of them is
> leading to a crash.
>
> One, is that we reset the original error with some vague
> ER_WAL_IO. This shouldn't happen, but it's harmless. Still, your
> fix for it is good. I don't think though it should check for !e.
>
> Two, is a failure to propagate the error from the cancelled
> applier fiber.
>
> I think it's better to fix the call site to raise
> replicaset.applier.diag than to add yet another vague error
> (FiberIsCancelled) topped with prolific comments about how
> the current code is broken.
OK, thanks a lot, Kostya! I read all your mails about this
series, and really appreciate the feedback! I'll rework the
patchset.
* Re: [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure
2020-02-04 22:15 ` Konstantin Osipov
@ 2020-02-05 7:46 ` Cyrill Gorcunov
2020-02-05 9:49 ` Konstantin Osipov
0 siblings, 1 reply; 15+ messages in thread
From: Cyrill Gorcunov @ 2020-02-05 7:46 UTC (permalink / raw)
To: Konstantin Osipov; +Cc: tml
On Wed, Feb 05, 2020 at 01:15:56AM +0300, Konstantin Osipov wrote:
> * Sergey Ostanevich <sergos@tarantool.org> [20/02/03 17:42]:
> > Hi!
> >
> > Thanks for the patch!
> >
> > LGTM.
>
> This code is dead actually. There is no region quota and OOM is
> impossible here. We haven't had a policy to check these errors
> before.
>
> No harm in pushing it, but no value either.
Wait, region_alloc (like other slab-related functions) uses
malloc at the low level (see slab_get_large), thus there is
no guarantee that NULL will never be returned. Moreover, the malloc
interface never claimed that NULL is returned iff there is
no free memory in the system (actually this is not how malloc
works now, but the API states explicitly that we should be ready
for NULL and handle it properly).
IOW I think we should handle NULLs to be stable in the long term.
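The pattern under discussion, as a minimal standalone sketch. It is not the Tarantool implementation: `diag_sketch`, `diag_set_oom`, `alloc_sketch` and `copy_tuple_sketch` are hypothetical names that only mimic the shape of diag_set(OutOfMemory, ...) and region_alloc(). The point is that the error is recorded before returning -1, so the caller never raises an empty diag.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-fiber diagnostics area. */
struct diag_sketch {
	int is_set;
	char msg[128];
};

static struct diag_sketch last_error;

/* Record an out-of-memory error with the size and the object name. */
static void
diag_set_oom(size_t size, const char *allocator, const char *object)
{
	last_error.is_set = 1;
	snprintf(last_error.msg, sizeof(last_error.msg),
		 "Failed to allocate %zu bytes in %s for %s",
		 size, allocator, object);
}

/* Set to 1 to make the next allocation fail (for testing only). */
static int fail_next_alloc;

/* Hypothetical allocator that may return NULL, like malloc(). */
static void *
alloc_sketch(size_t size)
{
	if (fail_next_alloc) {
		fail_next_alloc = 0;
		return NULL;
	}
	return malloc(size);
}

/*
 * Copy data into a freshly allocated buffer. On allocation failure
 * the error is set first and only then -1 is returned.
 */
static int
copy_tuple_sketch(const char *data, size_t size, char **out)
{
	char *buf = alloc_sketch(size);
	if (buf == NULL) {
		diag_set_oom(size, "alloc_sketch", "tuple");
		return -1;
	}
	memcpy(buf, data, size);
	*out = buf;
	return 0;
}
```

A caller that sees -1 can then rely on `last_error` being filled in, whatever the reason the allocator gave up.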
* Re: [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure
2020-02-05 7:46 ` Cyrill Gorcunov
@ 2020-02-05 9:49 ` Konstantin Osipov
2020-02-05 10:06 ` Cyrill Gorcunov
0 siblings, 1 reply; 15+ messages in thread
From: Konstantin Osipov @ 2020-02-05 9:49 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: tml
* Cyrill Gorcunov <gorcunov@gmail.com> [20/02/05 10:50]:
> > This code is dead actually. There is no region quota and OOM is
> > impossible here. We haven't had a policy to check these errors
> > before.
> >
> > No harm in pushing it, but no value either.
>
> Wait, region_alloc (like other slab-related functions) uses
> malloc at the low level (see slab_get_large), thus there is
> no guarantee that NULL will never be returned. Moreover, the malloc
> interface never claimed that NULL is returned iff there is
> no free memory in the system (actually this is not how malloc
> works now, but the API states explicitly that we should be ready
> for NULL and handle it properly).
>
> IOW I think we should handle NULLs to be stable in the long term.
While I sort of agree with the discipline of checking the malloc
return value, just as a style habit, you won't get NULL
from malloc() in practice. The OOM killer will do its job first.
Also, if you do, you might as well crash on the next line, when
accessing the null pointer.
--
Konstantin Osipov, Moscow, Russia
https://scylladb.com
* Re: [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure
2020-02-05 9:49 ` Konstantin Osipov
@ 2020-02-05 10:06 ` Cyrill Gorcunov
0 siblings, 0 replies; 15+ messages in thread
From: Cyrill Gorcunov @ 2020-02-05 10:06 UTC (permalink / raw)
To: Konstantin Osipov; +Cc: tml
On Wed, Feb 05, 2020 at 12:49:46PM +0300, Konstantin Osipov wrote:
> >
> > IOW I think we should handle NULLs to be stable in the long term.
>
> While I sort of agree with the discipline of checking the malloc
> return value, just as a style habit, you won't get NULL
> from malloc() in practice. The OOM killer will do its job first.
>
> Also, if you do, you might as well crash on the next line, when
> accessing the null pointer.
Actually, I've seen a ticket in our GitHub about wrapping
malloc with xmalloc, which would crash on malloc failure.
IOW, let's leave the explicit check for now.
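For reference, the kind of wrapper meant here could look roughly like the sketch below. `xmalloc_sketch` is a hypothetical name, not an existing Tarantool function, and the exact behaviour proposed in that ticket is not reproduced here.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: never returns NULL. If malloc() fails,
 * print a message and abort instead of letting the caller
 * dereference a null pointer somewhere later.
 */
static void *
xmalloc_sketch(size_t size)
{
	/* malloc(0) may legally return NULL, so ask for one byte. */
	if (size == 0)
		size = 1;
	void *ptr = malloc(size);
	if (ptr == NULL) {
		fprintf(stderr, "Can't allocate %zu bytes\n", size);
		abort();
	}
	return ptr;
}
```

With such a wrapper the callers need no NULL checks at all, which is the trade-off against the explicit diag_set path kept in this series.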
2020-01-28 19:22 [Tarantool-patches] [PATCH v7 0/5] box/replication: add missing diag set and fix sigsegv Cyrill Gorcunov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 1/5] box/request: add missing OutOfMemory diag_set Cyrill Gorcunov
2020-02-03 14:37 ` Sergey Ostanevich
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 2/5] box/applier: add missing diag_set on region_alloc failure Cyrill Gorcunov
2020-02-03 14:39 ` Sergey Ostanevich
2020-02-04 22:15 ` Konstantin Osipov
2020-02-05 7:46 ` Cyrill Gorcunov
2020-02-05 9:49 ` Konstantin Osipov
2020-02-05 10:06 ` Cyrill Gorcunov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 3/5] box/applier: fix nil dereference in applier rollback Cyrill Gorcunov
2020-02-04 22:19 ` Konstantin Osipov
2020-02-05 7:33 ` Cyrill Gorcunov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 4/5] errinj: add ERRINJ_REPLICA_TXN_WRITE Cyrill Gorcunov
2020-02-04 22:45 ` Konstantin Osipov
2020-01-28 19:22 ` [Tarantool-patches] [PATCH v7 5/5] test: add replication/applier-rollback Cyrill Gorcunov