From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp50.i.mail.ru (smtp50.i.mail.ru [94.100.177.110]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 12540469719 for ; Sat, 29 Feb 2020 20:09:57 +0300 (MSK) References: <08e87903-c00a-9f8a-ff25-5199eb6be0f8@tarantool.org> From: Vladislav Shpilevoy Message-ID: <98c0554d-be84-578c-f8e9-d64b23d63623@tarantool.org> Date: Sat, 29 Feb 2020 18:09:56 +0100 MIME-Version: 1.0 In-Reply-To: <08e87903-c00a-9f8a-ff25-5199eb6be0f8@tarantool.org> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [Tarantool-patches] [PATCH vshard 1/1] router: bucket_id_strcrc32 and bucket_id_mpcrc32 List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Oleg Babin , tarantool-patches@dev.tarantool.org Thanks for the comments! > On 26/02/2020 02:52, Vladislav Shpilevoy wrote: >> Closes #207 >> >> @TarantoolBot document >> Title: vshard.router.bucket_id_strcrc32() and .bucket_id_mpcrc32() >> vshard.router.bucket_id() is deprecated, each its usage logs a >> warning. It still works, but will be deleted in future. > > I think it's quite agressive to print message in logs for each time. Moreover it could lead to unexpected behaviour for some people. > IMO the most proper way print such message only once. Consider using something like that - https://github.com/tarantool/errors/blob/master/errors/deprecate.lua#L21 The aforementioned module won't print a warning only once. It will print all the errors again once a code reload happens, because 'local errors_issued = {}' will be called again on reload, and will clear 'errors_issued' table. But perhaps it was done intentionally. I made it the same way. Intentionally. In that case it will help user to replace all bucket_id() calls during reloads, and after each reload he will be able to validate whether the message is printed again anywhere. ==================== +local router_bucket_id_deprecated_warn = true local function router_bucket_id(router, key) if key == nil then error("Usage: vshard.router.bucket_id(key)") end - log.warn('vshard.router.bucket_id() is deprecated, use '.. - 'vshard.router.bucket_id_strcrc32() or '.. - 'vshard.router.bucket_id_mpcrc32()') + if router_bucket_id_deprecated_warn then + router_bucket_id_deprecated_warn = false + log.warn('vshard.router.bucket_id() is deprecated, use '.. + 'vshard.router.bucket_id_strcrc32() or '.. + 'vshard.router.bucket_id_mpcrc32()') + end return lhash.strcrc32(key) % router.total_bucket_count + 1 end ==================== >> Behaviour of the old bucket_id() function is now available as >> vshard.router.bucket_id_strcrc32(). It works exactly like the old >> function, but does not log a warning. >> >> The reason why there is a new function bucket_id_mpcrc32() is that >> the old bucket_id() and the new bucket_id_strcrc32() are not >> consistent for cdata numbers. In particular, they return 3 >> different values for normal Lua numbers like 123, for unsigned >> long long cdata (like 123ULL, or ffi.cast('unsigned long long', >> 123)), and for signed long long cdata (like 123LL, or >> ffi.cast('long long', 123)). Note, this is important! >> >>      vshard.router.bucket_id(123) >>      vshard.router.bucket_id(123LL) >>      vshard.router.bucket_id(123ULL) >> >>      Return 3 different values!!! >> >> For float and double cdata (ffi.cast('float', number), >> ffi.cast('double', number)) these functions return different >> values even for the same numbers of the same floating point type. >> This is because tostring() on a floating point cdata number >> returns not the number, but a pointer at it. Different on each >> call. >> >> vshard.router.bucket_id_strcrc32() behaves exactly the same. >> >> vshard.router.bucket_id_mpcrc32() is safer. > > Looks like it's not enough safe for new "double" type: > > | tarantool> box.space.test:create_index('double', {type='tree', parts = {1, 'double'}}) > | tarantool> box.space.test:insert({ffi.cast('double', 1)}) -- cdata > | --- > | - [1] > | ... > | tarantool> type(box.space.test:select()[1][1]) > | --- > | - number > | ... > > And results of msgpack.encode are different. > > I think you can return to your idea (https://github.com/tarantool/vshard/issues/207#issuecomment-589976204) with "tonumber": > |tarantool> tonumber(ffi.cast('double', 100)) > | --- > | - 100 > | ... > > Such approach will lead to precision lost for huge unsigned/signed integers. But it will be predictable and determined for all cases. I added that example to docbot request, slightly modified.