From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: Nikita Pettik <korablev@tarantool.org>, tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH] vinyl: add NULL check of xrow_upsert_execute() retval Date: Fri, 29 May 2020 23:24:26 +0200 [thread overview] Message-ID: <7445b38c-f664-ca79-bb05-73a73ddc4d6d@tarantool.org> (raw) In-Reply-To: <776e8b91b93c79dabd2932b5d665236c5da313c8.1590546551.git.korablev@tarantool.org> Hi! Thanks for the patch! While the patch is obviously correct (we need to check NULL for sure), it solves the problem only partially, and creates another. We discussed that verbally, and here is a short resume of what is happening in the patch, and where we have a tricky problem: if there are 2 perfectly valid upserts, each with 2.5k operations, and they are merged into one, both of them are skipped, because after merge they become too fat - opcount > 4k. It looks at first that this can only happen when field count > 4k, because otherwise all the operations would be squashed into something smaller or equal than field count, but it is not. There are a few cases, when even after squash total operation count will be bigger than field count: 1) operations are complex - ':', '&', '|', '^', '#', '!'. The last two operations are actually used by people. These operations are not squashed. The last one - '!' - can't be squashed even in theory. 2) operations have negative field number. For example, {'=', -1, ...} - assign a value to the last field in the tuple. But honestly I don't remember. Perhaps they are merged, if in both squashed upserts the field number is the same. But imagine this: {'=', -1, 100}, and {'=', 5, 100}. They look different, but if the tuple has only 5 fields, they operate on the same field. That means it is not safe to drop any upsert having more than 4k operations. Because it can consist of many small valid upserts. I don't know how to fix it in a simple way. The only thing I could come up with is probably don't squash such fat upserts. Just keep them all on the disk, until they eventually meet bottom of their key, or a terminal statement like REPLACE/INSERT/DELETE. This is not only about disk, btw. 2 fat upserts could be inserted into the memory level, turn into an invalid upsert, and that will be skipped. Here is a test. Create a tuple, and dump it on disk so as it would disappear from the memory level and from the cache: box.cfg{} s = box.schema.create_space('test', {engine = 'vinyl'}) pk = s:create_index('pk') s:insert({1, 1}) box.snapshot() Then restart (to ensure the cache is clear), and create 2 upserts: box.cfg{} s = box.space.test ops = {} op = {'=', 2, 100} for i = 1, 2500 do table.insert(ops, op) end s:upsert({1}, ops) op = {'=', -1, 200} ops = {} for i = 1, 2500 do table.insert(ops, op) end s:upsert({1}, ops) Now if I do select, I get tarantool> s:select{} --- - - [1, 200] ... But if I do dump + select, I get: tarantool> box.snapshot() --- - ok ... tarantool> s:select{} --- - - [1, 100] ... During dump the second upsert was skipped even though it was valid.
next prev parent reply other threads:[~2020-05-29 21:24 UTC|newest] Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-05-27 2:56 Nikita Pettik 2020-05-29 21:24 ` Vladislav Shpilevoy [this message] 2020-05-29 21:34 ` Vladislav Shpilevoy 2020-07-08 12:22 ` Nikita Pettik 2020-05-29 23:04 ` Konstantin Osipov 2020-07-08 12:53 ` Nikita Pettik 2020-07-09 11:56 ` Nikita Pettik
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=7445b38c-f664-ca79-bb05-73a73ddc4d6d@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=korablev@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH] vinyl: add NULL check of xrow_upsert_execute() retval' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox