* [Tarantool-patches] [PATCH 0/2] Sanitize uuid and bit alignment
@ 2020-05-15 23:03 Vladislav Shpilevoy
  2020-05-15 23:03 ` [Tarantool-patches] [PATCH 1/2] bit: fix unaligned memory access and UB bit shift Vladislav Shpilevoy
  ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-15 23:03 UTC (permalink / raw)
  To: tarantool-patches, tsafin, gorcunov, alyapunov

The patchset fixes unaligned memory access in the UUID and bit libraries.
In the bit library it also fixes the bit-shift overflow UB.

This is not all of issue 4609, but the issue is going to consist of many
independent parts like this one, which can be reviewed and pushed without
waiting for the whole patchset, to speed up the process.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4609-sanitize-uuid-and-bit
Issue: https://github.com/tarantool/tarantool/issues/4609

Vladislav Shpilevoy (2):
  bit: fix unaligned memory access and UB bit shift
  uuid: fix unaligned memory access

 src/lib/bit/bit.h      |  30 ++++---
 src/lib/uuid/tt_uuid.h |  11 +--
 test/unit/bit.c        |   4 +-
 test/unit/bit.result   | 180 -----------------------------------------
 4 files changed, 27 insertions(+), 198 deletions(-)

--
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 14+ messages in thread
* [Tarantool-patches] [PATCH 1/2] bit: fix unaligned memory access and UB bit shift
  2020-05-15 23:03 [Tarantool-patches] [PATCH 0/2] Sanitize uuid and bit alignment Vladislav Shpilevoy
@ 2020-05-15 23:03 ` Vladislav Shpilevoy
  2020-05-21 14:37   ` Timur Safin
  2020-05-15 23:03 ` [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access Vladislav Shpilevoy
  2020-05-21 19:33 ` [Tarantool-patches] [PATCH 0/2] Sanitize uuid and bit alignment Vladislav Shpilevoy
  2 siblings, 1 reply; 14+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-15 23:03 UTC (permalink / raw)
  To: tarantool-patches, tsafin, gorcunov, alyapunov

Some functions of the lib/bit library dereferenced a byte array as
a uint64 or uint32 pointer. That led to unaligned memory access
when the array didn't have the needed alignment, and made clang's
'alignment' sanitizer go crazy.

Also there were places where a bitwise shift by the full type width
was done. For example, a uint32 number was allowed to be shifted by
32 bits in either direction. This is undefined behaviour.

The patch fixes these UBs. Bitwise shifts by too many bits are
banned. Unaligned access is fixed in some places by changing uint64
to uint8, where that is possible without a perf loss, and in the
others by using an explicitly unaligned load. It works the same,
but the sanitizer is aware of it.
Part of #4609
---
 src/lib/bit/bit.h    |  30 +++++---
 test/unit/bit.c      |   4 +-
 test/unit/bit.result | 180 -------------------------------------------
 3 files changed, 21 insertions(+), 193 deletions(-)

diff --git a/src/lib/bit/bit.h b/src/lib/bit/bit.h
index b11034cb9..eb392b820 100644
--- a/src/lib/bit/bit.h
+++ b/src/lib/bit/bit.h
@@ -202,10 +202,10 @@ bitmap_size(size_t bit_count)
 inline bool
 bit_test(const void *data, size_t pos)
 {
-	size_t chunk = pos / (CHAR_BIT * sizeof(unsigned long));
-	size_t offset = pos % (CHAR_BIT * sizeof(unsigned long));
+	size_t chunk = pos / CHAR_BIT;
+	size_t offset = pos % CHAR_BIT;
 
-	const unsigned long *ldata = (const unsigned long *) data;
+	const uint8_t *ldata = (const uint8_t *)data;
 	return (ldata[chunk] >> offset) & 0x1;
 }
 
@@ -222,10 +222,10 @@ bit_test(const void *data, size_t pos)
 inline bool
 bit_set(void *data, size_t pos)
 {
-	size_t chunk = pos / (CHAR_BIT * sizeof(unsigned long));
-	size_t offset = pos % (CHAR_BIT * sizeof(unsigned long));
+	size_t chunk = pos / CHAR_BIT;
+	size_t offset = pos % CHAR_BIT;
 
-	unsigned long *ldata = (unsigned long *) data;
+	uint8_t *ldata = (uint8_t *)data;
 	bool prev = (ldata[chunk] >> offset) & 0x1;
 	ldata[chunk] |= (1UL << offset);
 	return prev;
@@ -244,10 +244,10 @@ bit_set(void *data, size_t pos)
 inline bool
 bit_clear(void *data, size_t pos)
 {
-	size_t chunk = pos / (CHAR_BIT * sizeof(unsigned long));
-	size_t offset = pos % (CHAR_BIT * sizeof(unsigned long));
+	size_t chunk = pos / CHAR_BIT;
+	size_t offset = pos % CHAR_BIT;
 
-	unsigned long *ldata = (unsigned long *) data;
+	uint8_t *ldata = (uint8_t *)data;
 	bool prev = (ldata[chunk] >> offset) & 0x1;
 	ldata[chunk] &= ~(1UL << offset);
 	return prev;
@@ -413,6 +413,7 @@ bit_count_u64(uint64_t x)
 inline uint32_t
 bit_rotl_u32(uint32_t x, int r)
 {
+	assert(r > 0 && r < 32);
 	/* gcc recognises this code and generates a rotate instruction */
 	return ((x << r) | (x >> (32 - r)));
 }
@@ -423,6 +424,7 @@ bit_rotl_u32(uint32_t x, int r)
 inline uint64_t
 bit_rotl_u64(uint64_t x, int r)
 {
+	assert(r > 0 && r < 64);
 	/* gcc recognises this code and generates a rotate instruction */
 	return ((x << r) | (x >> (64 - r)));
 }
@@ -433,6 +435,7 @@ bit_rotl_u64(uint64_t x, int r)
 __attribute__ ((const)) inline uintmax_t
 bit_rotl_umax(uintmax_t x, int r)
 {
+	assert(r > 0 && r < (int)(sizeof(uintmax_t) * CHAR_BIT));
 	/* gcc recognises this code and generates a rotate instruction */
 	return ((x << r) | (x >> (sizeof(uintmax_t) * CHAR_BIT - r)));
 }
@@ -446,6 +449,7 @@ bit_rotl_umax(uintmax_t x, int r)
 inline uint32_t
 bit_rotr_u32(uint32_t x, int r)
 {
+	assert(r > 0 && r < 32);
 	/* gcc recognises this code and generates a rotate instruction */
 	return ((x >> r) | (x << (32 - r)));
 }
@@ -456,6 +460,7 @@ bit_rotr_u32(uint32_t x, int r)
 inline uint64_t
 bit_rotr_u64(uint64_t x, int r)
 {
+	assert(r > 0 && r < 64);
 	/* gcc recognises this code and generates a rotate instruction */
 	return ((x >> r) | (x << (64 - r)));
 }
@@ -538,9 +543,11 @@ bit_index_u64(uint64_t x, int *indexes, int offset);
 /* Use bigger words on x86_64 */
 #define ITER_UINT uint64_t
 #define ITER_CTZ bit_ctz_u64
+#define ITER_LOAD load_u64
 #else
 #define ITER_UINT uint32_t
 #define ITER_CTZ bit_ctz_u32
+#define ITER_LOAD load_u32
 #endif
 /** @endcond **/
 
@@ -589,7 +596,7 @@ bit_iterator_init(struct bit_iterator *it, const void *data, size_t size,
 	/* Check if size is a multiple of sizeof(ITER_UINT) */
 	const char *e = it->next + size % sizeof(ITER_UINT);
 	if (bit_likely(it->next == e)) {
-		it->word = *(ITER_UINT *) it->next;
+		it->word = ITER_LOAD(it->next);
 		it->next += sizeof(ITER_UINT);
 	} else {
 		it->word = it->word_xor;
@@ -615,7 +622,7 @@ bit_iterator_next(struct bit_iterator *it)
 		return SIZE_MAX;
 
 	/* Extract the next word from memory */
-	it->word = *(ITER_UINT *) it->next;
+	it->word = ITER_LOAD(it->next);
 	it->word ^= it->word_xor;
 	it->word_base = (it->next - it->start) * CHAR_BIT;
 	it->next += sizeof(ITER_UINT);
@@ -629,6 +636,7 @@ bit_iterator_next(struct bit_iterator *it)
 	return it->word_base + bit;
 }
 
+#undef ITER_LOAD
 #undef ITER_CTZ
 #undef ITER_UINT
 
diff --git a/test/unit/bit.c b/test/unit/bit.c
index da514484d..1dae3671c 100644
--- a/test/unit/bit.c
+++ b/test/unit/bit.c
@@ -80,7 +80,7 @@ test_rotl_rotr_one(int rot)
 	printf("bit_rotr_u64(%" PRIu64 ", %d) => %" PRIu64 "\n",
 	       val64, rot, bit_rotr_u64(val64, rot));
 
-	if (vals[i] > UINT32_MAX || rot > 32)
+	if (vals[i] > UINT32_MAX || rot >= 32)
 		continue;
 
 	printf("bit_rotl_u32(%" PRIu32 ", %d) => %" PRIu32 "\n",
@@ -95,7 +95,7 @@ test_rotl_rotr(void)
 {
 	header();
 
-	int rots[] = { 0, 1, 15, 16, 31, 32, 63, 64 };
+	int rots[] = { 1, 15, 16, 31, 32, 63 };
 	for (unsigned r = 0; r < sizeof(rots) / sizeof(rots[0]); r++) {
 		test_rotl_rotr_one(rots[r]);
 	}
diff --git a/test/unit/bit.result b/test/unit/bit.result
index 2d4ccc6f4..cfce9f532 100644
--- a/test/unit/bit.result
+++ b/test/unit/bit.result
@@ -134,96 +134,6 @@ bit_count_u64(15984546468465238145) => 25
 bit_count_u64(18446744073709551615) => 64
 *** test_count: done ***
 *** test_rotl_rotr ***
-bit_rotl_u64(0, 0) => 0
-bit_rotr_u64(0, 0) => 0
-bit_rotl_u32(0, 0) => 0
-bit_rotr_u32(0, 0) => 0
-bit_rotl_u64(1, 0) => 1
-bit_rotr_u64(1, 0) => 1
-bit_rotl_u32(1, 0) => 1
-bit_rotr_u32(1, 0) => 1
-bit_rotl_u64(2, 0) => 2
-bit_rotr_u64(2, 0) => 2
-bit_rotl_u32(2, 0) => 2
-bit_rotr_u32(2, 0) => 2
-bit_rotl_u64(32768, 0) => 32768
-bit_rotr_u64(32768, 0) => 32768
-bit_rotl_u32(32768, 0) => 32768
-bit_rotr_u32(32768, 0) => 32768
-bit_rotl_u64(65535, 0) => 65535
-bit_rotr_u64(65535, 0) => 65535
-bit_rotl_u32(65535, 0) => 65535
-bit_rotr_u32(65535, 0) => 65535
-bit_rotl_u64(65536, 0) => 65536
-bit_rotr_u64(65536, 0) => 65536
-bit_rotl_u32(65536, 0) => 65536
-bit_rotr_u32(65536, 0) => 65536
-bit_rotl_u64(726075912, 0) => 726075912
-bit_rotr_u64(726075912, 0) => 726075912
-bit_rotl_u32(726075912, 0) => 726075912
-bit_rotr_u32(726075912, 0) => 726075912
-bit_rotl_u64(858993459, 0) => 858993459
-bit_rotr_u64(858993459, 0) => 858993459
-bit_rotl_u32(858993459, 0) => 858993459
-bit_rotr_u32(858993459, 0) => 858993459
-bit_rotl_u64(1073741824, 0) => 1073741824
-bit_rotr_u64(1073741824, 0) => 1073741824
-bit_rotl_u32(1073741824, 0) => 1073741824
-bit_rotr_u32(1073741824, 0) => 1073741824
-bit_rotl_u64(1245250552, 0) => 1245250552
-bit_rotr_u64(1245250552, 0) => 1245250552
-bit_rotl_u32(1245250552, 0) => 1245250552
-bit_rotr_u32(1245250552, 0) => 1245250552
-bit_rotl_u64(1431655765, 0) => 1431655765
-bit_rotr_u64(1431655765, 0) => 1431655765
-bit_rotl_u32(1431655765, 0) => 1431655765
-bit_rotr_u32(1431655765, 0) => 1431655765
-bit_rotl_u64(1656977767, 0) => 1656977767
-bit_rotr_u64(1656977767, 0) => 1656977767
-bit_rotl_u32(1656977767, 0) => 1656977767
-bit_rotr_u32(1656977767, 0) => 1656977767
-bit_rotl_u64(2147483648, 0) => 2147483648
-bit_rotr_u64(2147483648, 0) => 2147483648
-bit_rotl_u32(2147483648, 0) => 2147483648
-bit_rotr_u32(2147483648, 0) => 2147483648
-bit_rotl_u64(2283114629, 0) => 2283114629
-bit_rotr_u64(2283114629, 0) => 2283114629
-bit_rotl_u32(2283114629, 0) => 2283114629
-bit_rotr_u32(2283114629, 0) => 2283114629
-bit_rotl_u64(2502548245, 0) => 2502548245
-bit_rotr_u64(2502548245, 0) => 2502548245
-bit_rotl_u32(2502548245, 0) => 2502548245
-bit_rotr_u32(2502548245, 0) => 2502548245
-bit_rotl_u64(4294967295, 0) => 4294967295
-bit_rotr_u64(4294967295, 0) => 4294967295
-bit_rotl_u32(4294967295, 0) => 4294967295
-bit_rotr_u32(4294967295, 0) => 4294967295
-bit_rotl_u64(708915120906848425, 0) => 708915120906848425
-bit_rotr_u64(708915120906848425, 0) => 708915120906848425
-bit_rotl_u64(1960191741125985428, 0) => 1960191741125985428
-bit_rotr_u64(1960191741125985428, 0) => 1960191741125985428
-bit_rotl_u64(3689348814741910323, 0) => 3689348814741910323
-bit_rotr_u64(3689348814741910323, 0) => 3689348814741910323
-bit_rotl_u64(5578377670650038654, 0) => 5578377670650038654
-bit_rotr_u64(5578377670650038654, 0) => 5578377670650038654
-bit_rotl_u64(9223372036854775808, 0) => 9223372036854775808
-bit_rotr_u64(9223372036854775808, 0) => 9223372036854775808
-bit_rotl_u64(10755112315580060033, 0) => 10755112315580060033
-bit_rotr_u64(10755112315580060033, 0) => 10755112315580060033
-bit_rotl_u64(11163782031541429823, 0) => 11163782031541429823
-bit_rotr_u64(11163782031541429823, 0) => 11163782031541429823
-bit_rotl_u64(13903686156871869732, 0) => 13903686156871869732
-bit_rotr_u64(13903686156871869732, 0) => 13903686156871869732
-bit_rotl_u64(14237897302422917095, 0) => 14237897302422917095
-bit_rotr_u64(14237897302422917095, 0) => 14237897302422917095
-bit_rotl_u64(14302190498657618739, 0) => 14302190498657618739
-bit_rotr_u64(14302190498657618739, 0) => 14302190498657618739
-bit_rotl_u64(15766411510232741269, 0) => 15766411510232741269
-bit_rotr_u64(15766411510232741269, 0) => 15766411510232741269
-bit_rotl_u64(15984546468465238145, 0) => 15984546468465238145
-bit_rotr_u64(15984546468465238145, 0) => 15984546468465238145
-bit_rotl_u64(18446744073709551615, 0) => 18446744073709551615
-bit_rotr_u64(18446744073709551615, 0) => 18446744073709551615
 bit_rotl_u64(0, 1) => 0
 bit_rotr_u64(0, 1) => 0
 bit_rotl_u32(0, 1) => 0
@@ -586,68 +496,36 @@ bit_rotl_u64(18446744073709551615, 31) => 18446744073709551615
 bit_rotr_u64(18446744073709551615, 31) => 18446744073709551615
 bit_rotl_u64(0, 32) => 0
 bit_rotr_u64(0, 32) => 0
-bit_rotl_u32(0, 32) => 0
-bit_rotr_u32(0, 32) => 0
 bit_rotl_u64(1, 32) => 4294967296
 bit_rotr_u64(1, 32) => 4294967296
-bit_rotl_u32(1, 32) => 1
-bit_rotr_u32(1, 32) => 1
 bit_rotl_u64(2, 32) => 8589934592
 bit_rotr_u64(2, 32) => 8589934592
-bit_rotl_u32(2, 32) => 2
-bit_rotr_u32(2, 32) => 2
 bit_rotl_u64(32768, 32) => 140737488355328
 bit_rotr_u64(32768, 32) => 140737488355328
-bit_rotl_u32(32768, 32) => 32768
-bit_rotr_u32(32768, 32) => 32768
 bit_rotl_u64(65535, 32) => 281470681743360
 bit_rotr_u64(65535, 32) => 281470681743360
-bit_rotl_u32(65535, 32) => 65535
-bit_rotr_u32(65535, 32) => 65535
 bit_rotl_u64(65536, 32) => 281474976710656
 bit_rotr_u64(65536, 32) => 281474976710656
-bit_rotl_u32(65536, 32) => 65536
-bit_rotr_u32(65536, 32) => 65536
 bit_rotl_u64(726075912, 32) => 3118472296453373952
 bit_rotr_u64(726075912, 32) => 3118472296453373952
-bit_rotl_u32(726075912, 32) => 726075912
-bit_rotr_u32(726075912, 32) => 726075912
 bit_rotl_u64(858993459, 32) => 3689348813882916864
 bit_rotr_u64(858993459, 32) => 3689348813882916864
-bit_rotl_u32(858993459, 32) => 858993459
-bit_rotr_u32(858993459, 32) => 858993459
 bit_rotl_u64(1073741824, 32) => 4611686018427387904
 bit_rotr_u64(1073741824, 32) => 4611686018427387904
-bit_rotl_u32(1073741824, 32) => 1073741824
-bit_rotr_u32(1073741824, 32) => 1073741824
 bit_rotl_u64(1245250552, 32) => 5348310396165947392
 bit_rotr_u64(1245250552, 32) => 5348310396165947392
-bit_rotl_u32(1245250552, 32) => 1245250552
-bit_rotr_u32(1245250552, 32) => 1245250552
 bit_rotl_u64(1431655765, 32) => 6148914689804861440
 bit_rotr_u64(1431655765, 32) => 6148914689804861440
-bit_rotl_u32(1431655765, 32) => 1431655765
-bit_rotr_u32(1431655765, 32) => 1431655765
 bit_rotl_u64(1656977767, 32) => 7116665319464108032
 bit_rotr_u64(1656977767, 32) => 7116665319464108032
-bit_rotl_u32(1656977767, 32) => 1656977767
-bit_rotr_u32(1656977767, 32) => 1656977767
 bit_rotl_u64(2147483648, 32) => 9223372036854775808
 bit_rotr_u64(2147483648, 32) => 9223372036854775808
-bit_rotl_u32(2147483648, 32) => 2147483648
-bit_rotr_u32(2147483648, 32) => 2147483648
 bit_rotl_u64(2283114629, 32) => 9805902664574173184
 bit_rotr_u64(2283114629, 32) => 9805902664574173184
-bit_rotl_u32(2283114629, 32) => 2283114629
-bit_rotr_u32(2283114629, 32) => 2283114629
 bit_rotl_u64(2502548245, 32) => 10748362868937195520
 bit_rotr_u64(2502548245, 32) => 10748362868937195520
-bit_rotl_u32(2502548245, 32) => 2502548245
-bit_rotr_u32(2502548245, 32) => 2502548245
 bit_rotl_u64(4294967295, 32) => 18446744069414584320
 bit_rotr_u64(4294967295, 32) => 18446744069414584320
-bit_rotl_u32(4294967295, 32) => 4294967295
-bit_rotr_u32(4294967295, 32) => 4294967295
 bit_rotl_u64(708915120906848425, 32) => 16541238372230140555
 bit_rotr_u64(708915120906848425, 32) => 16541238372230140555
 bit_rotl_u64(1960191741125985428, 32) => 14229128056835145728
@@ -732,64 +610,6 @@ bit_rotl_u64(15984546468465238145, 63) => 17215645271087394880
 bit_rotr_u64(15984546468465238145, 63) => 13522348863220924675
 bit_rotl_u64(18446744073709551615, 63) => 18446744073709551615
 bit_rotr_u64(18446744073709551615, 63) => 18446744073709551615
-bit_rotl_u64(0, 64) => 0
-bit_rotr_u64(0, 64) => 0
-bit_rotl_u64(1, 64) => 1
-bit_rotr_u64(1, 64) => 1
-bit_rotl_u64(2, 64) => 2
-bit_rotr_u64(2, 64) => 2
-bit_rotl_u64(32768, 64) => 32768
-bit_rotr_u64(32768, 64) => 32768
-bit_rotl_u64(65535, 64) => 65535
-bit_rotr_u64(65535, 64) => 65535
-bit_rotl_u64(65536, 64) => 65536
-bit_rotr_u64(65536, 64) => 65536
-bit_rotl_u64(726075912, 64) => 726075912
-bit_rotr_u64(726075912, 64) => 726075912
-bit_rotl_u64(858993459, 64) => 858993459
-bit_rotr_u64(858993459, 64) => 858993459
-bit_rotl_u64(1073741824, 64) => 1073741824
-bit_rotr_u64(1073741824, 64) => 1073741824
-bit_rotl_u64(1245250552, 64) => 1245250552
-bit_rotr_u64(1245250552, 64) => 1245250552
-bit_rotl_u64(1431655765, 64) => 1431655765
-bit_rotr_u64(1431655765, 64) => 1431655765
-bit_rotl_u64(1656977767, 64) => 1656977767
-bit_rotr_u64(1656977767, 64) => 1656977767
-bit_rotl_u64(2147483648, 64) => 2147483648
-bit_rotr_u64(2147483648, 64) => 2147483648
-bit_rotl_u64(2283114629, 64) => 2283114629
-bit_rotr_u64(2283114629, 64) => 2283114629
-bit_rotl_u64(2502548245, 64) => 2502548245
-bit_rotr_u64(2502548245, 64) => 2502548245
-bit_rotl_u64(4294967295, 64) => 4294967295
-bit_rotr_u64(4294967295, 64) => 4294967295
-bit_rotl_u64(708915120906848425, 64) => 708915120906848425
-bit_rotr_u64(708915120906848425, 64) => 708915120906848425
-bit_rotl_u64(1960191741125985428, 64) => 1960191741125985428
-bit_rotr_u64(1960191741125985428, 64) => 1960191741125985428
-bit_rotl_u64(3689348814741910323, 64) => 3689348814741910323
-bit_rotr_u64(3689348814741910323, 64) => 3689348814741910323
-bit_rotl_u64(5578377670650038654, 64) => 5578377670650038654
-bit_rotr_u64(5578377670650038654, 64) => 5578377670650038654
-bit_rotl_u64(9223372036854775808, 64) => 9223372036854775808
-bit_rotr_u64(9223372036854775808, 64) => 9223372036854775808
-bit_rotl_u64(10755112315580060033, 64) => 10755112315580060033
-bit_rotr_u64(10755112315580060033, 64) => 10755112315580060033
-bit_rotl_u64(11163782031541429823, 64) => 11163782031541429823
-bit_rotr_u64(11163782031541429823, 64) => 11163782031541429823
-bit_rotl_u64(13903686156871869732, 64) => 13903686156871869732
-bit_rotr_u64(13903686156871869732, 64) => 13903686156871869732
-bit_rotl_u64(14237897302422917095, 64) => 14237897302422917095
-bit_rotr_u64(14237897302422917095, 64) => 14237897302422917095
-bit_rotl_u64(14302190498657618739, 64) => 14302190498657618739
-bit_rotr_u64(14302190498657618739, 64) => 14302190498657618739
-bit_rotl_u64(15766411510232741269, 64) => 15766411510232741269
-bit_rotr_u64(15766411510232741269, 64) => 15766411510232741269
-bit_rotl_u64(15984546468465238145, 64) => 15984546468465238145
-bit_rotr_u64(15984546468465238145, 64) => 15984546468465238145
-bit_rotl_u64(18446744073709551615, 64) => 18446744073709551615
-bit_rotr_u64(18446744073709551615, 64) => 18446744073709551615
 *** test_rotl_rotr: done ***
 *** test_bswap ***
 bswap_u64(0) => 0
--
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] bit: fix unaligned memory access and UB bit shift 2020-05-15 23:03 ` [Tarantool-patches] [PATCH 1/2] bit: fix unaligned memory access and UB bit shift Vladislav Shpilevoy @ 2020-05-21 14:37 ` Timur Safin 0 siblings, 0 replies; 14+ messages in thread From: Timur Safin @ 2020-05-21 14:37 UTC (permalink / raw) To: 'Vladislav Shpilevoy', tarantool-patches, gorcunov, alyapunov LGTM, also : -----Original Message----- : From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> : Sent: Saturday, May 16, 2020 2:04 AM : To: tarantool-patches@dev.tarantool.org; tsafin@tarantool.org; : gorcunov@gmail.com; alyapunov@tarantool.org : Subject: [PATCH 1/2] bit: fix unaligned memory access and UB bit shift : : Lib/bit library in some functions dereferenced a byte array as a : pointer at uint64 or uint32. It led to unaligned memory access, : when the array didn't have the needed alignment. That made clang : 'alignment' sanitizer crazy. : : Also there were places, where bitwise shift was done for the type : size. For example, a uint32 number was allowed to make 32 bit : shift in both sides. This is undefined behaviour. : : The patch fixes these UBs. Bitwise shift of too long size is : banned. Unaligned access is fixed in some places by changing : uint64 to uint8, when it is possible without perf loss. And in the : others by using explicitly specified unaligned load. So it works : the same, but the sanitizer is aware of that. 
: : Part of #4609 : --- : src/lib/bit/bit.h | 30 +++++--- : test/unit/bit.c | 4 +- : test/unit/bit.result | 180 ------------------------------------------- : 3 files changed, 21 insertions(+), 193 deletions(-) : : diff --git a/src/lib/bit/bit.h b/src/lib/bit/bit.h : index b11034cb9..eb392b820 100644 : --- a/src/lib/bit/bit.h : +++ b/src/lib/bit/bit.h : @@ -202,10 +202,10 @@ bitmap_size(size_t bit_count) : inline bool : bit_test(const void *data, size_t pos) : { : - size_t chunk = pos / (CHAR_BIT * sizeof(unsigned long)); : - size_t offset = pos % (CHAR_BIT * sizeof(unsigned long)); : + size_t chunk = pos / CHAR_BIT; : + size_t offset = pos % CHAR_BIT; : : - const unsigned long *ldata = (const unsigned long *) data; : + const uint8_t *ldata = (const uint8_t *)data; : return (ldata[chunk] >> offset) & 0x1; : } : : @@ -222,10 +222,10 @@ bit_test(const void *data, size_t pos) : inline bool : bit_set(void *data, size_t pos) : { : - size_t chunk = pos / (CHAR_BIT * sizeof(unsigned long)); : - size_t offset = pos % (CHAR_BIT * sizeof(unsigned long)); : + size_t chunk = pos / CHAR_BIT; : + size_t offset = pos % CHAR_BIT; : : - unsigned long *ldata = (unsigned long *) data; : + uint8_t *ldata = (uint8_t *)data; : bool prev = (ldata[chunk] >> offset) & 0x1; : ldata[chunk] |= (1UL << offset); : return prev; : @@ -244,10 +244,10 @@ bit_set(void *data, size_t pos) : inline bool : bit_clear(void *data, size_t pos) : { : - size_t chunk = pos / (CHAR_BIT * sizeof(unsigned long)); : - size_t offset = pos % (CHAR_BIT * sizeof(unsigned long)); : + size_t chunk = pos / CHAR_BIT; : + size_t offset = pos % CHAR_BIT; : : - unsigned long *ldata = (unsigned long *) data; : + uint8_t *ldata = (uint8_t *)data; : bool prev = (ldata[chunk] >> offset) & 0x1; : ldata[chunk] &= ~(1UL << offset); : return prev; : @@ -413,6 +413,7 @@ bit_count_u64(uint64_t x) : inline uint32_t : bit_rotl_u32(uint32_t x, int r) : { : + assert(r > 0 && r < 32); : /* gcc recognises this code and generates a rotate 
instruction */ : return ((x << r) | (x >> (32 - r))); : } : @@ -423,6 +424,7 @@ bit_rotl_u32(uint32_t x, int r) : inline uint64_t : bit_rotl_u64(uint64_t x, int r) : { : + assert(r > 0 && r < 64); : /* gcc recognises this code and generates a rotate instruction */ : return ((x << r) | (x >> (64 - r))); : } : @@ -433,6 +435,7 @@ bit_rotl_u64(uint64_t x, int r) : __attribute__ ((const)) inline uintmax_t : bit_rotl_umax(uintmax_t x, int r) : { : + assert(r > 0 && r < (int)(sizeof(uintmax_t) * CHAR_BIT)); : /* gcc recognises this code and generates a rotate instruction */ : return ((x << r) | (x >> (sizeof(uintmax_t) * CHAR_BIT - r))); : } : @@ -446,6 +449,7 @@ bit_rotl_umax(uintmax_t x, int r) : inline uint32_t : bit_rotr_u32(uint32_t x, int r) : { : + assert(r > 0 && r < 32); : /* gcc recognises this code and generates a rotate instruction */ : return ((x >> r) | (x << (32 - r))); : } : @@ -456,6 +460,7 @@ bit_rotr_u32(uint32_t x, int r) : inline uint64_t : bit_rotr_u64(uint64_t x, int r) : { : + assert(r > 0 && r < 64); : /* gcc recognises this code and generates a rotate instruction */ : return ((x >> r) | (x << (64 - r))); : } : @@ -538,9 +543,11 @@ bit_index_u64(uint64_t x, int *indexes, int offset); : /* Use bigger words on x86_64 */ : #define ITER_UINT uint64_t : #define ITER_CTZ bit_ctz_u64 : +#define ITER_LOAD load_u64 : #else : #define ITER_UINT uint32_t : #define ITER_CTZ bit_ctz_u32 : +#define ITER_LOAD load_u32 : #endif : /** @endcond **/ : : @@ -589,7 +596,7 @@ bit_iterator_init(struct bit_iterator *it, const void : *data, size_t size, : /* Check if size is a multiple of sizeof(ITER_UINT) */ : const char *e = it->next + size % sizeof(ITER_UINT); : if (bit_likely(it->next == e)) { : - it->word = *(ITER_UINT *) it->next; : + it->word = ITER_LOAD(it->next); : it->next += sizeof(ITER_UINT); : } else { : it->word = it->word_xor; : @@ -615,7 +622,7 @@ bit_iterator_next(struct bit_iterator *it) : return SIZE_MAX; : : /* Extract the next word from memory */ : - 
it->word = *(ITER_UINT *) it->next; : + it->word = ITER_LOAD(it->next); : it->word ^= it->word_xor; : it->word_base = (it->next - it->start) * CHAR_BIT; : it->next += sizeof(ITER_UINT); : @@ -629,6 +636,7 @@ bit_iterator_next(struct bit_iterator *it) : return it->word_base + bit; : } : : +#undef ITER_LOAD : #undef ITER_CTZ : #undef ITER_UINT : : diff --git a/test/unit/bit.c b/test/unit/bit.c : index da514484d..1dae3671c 100644 : --- a/test/unit/bit.c : +++ b/test/unit/bit.c : @@ -80,7 +80,7 @@ test_rotl_rotr_one(int rot) : printf("bit_rotr_u64(%" PRIu64 ", %d) => %" PRIu64 "\n", : val64, rot, bit_rotr_u64(val64, rot)); : : - if (vals[i] > UINT32_MAX || rot > 32) : + if (vals[i] > UINT32_MAX || rot >= 32) : continue; : : printf("bit_rotl_u32(%" PRIu32 ", %d) => %" PRIu32 "\n", : @@ -95,7 +95,7 @@ test_rotl_rotr(void) : { : header(); : : - int rots[] = { 0, 1, 15, 16, 31, 32, 63, 64 }; : + int rots[] = { 1, 15, 16, 31, 32, 63 }; : for (unsigned r = 0; r < sizeof(rots) / sizeof(rots[0]); r++) { : test_rotl_rotr_one(rots[r]); : } : diff --git a/test/unit/bit.result b/test/unit/bit.result : index 2d4ccc6f4..cfce9f532 100644 : --- a/test/unit/bit.result : +++ b/test/unit/bit.result : @@ -134,96 +134,6 @@ bit_count_u64(15984546468465238145) => 25 : bit_count_u64(18446744073709551615) => 64 : *** test_count: done *** : *** test_rotl_rotr *** : -bit_rotl_u64(0, 0) => 0 : -bit_rotr_u64(0, 0) => 0 : -bit_rotl_u32(0, 0) => 0 : -bit_rotr_u32(0, 0) => 0 : -bit_rotl_u64(1, 0) => 1 : -bit_rotr_u64(1, 0) => 1 : -bit_rotl_u32(1, 0) => 1 : -bit_rotr_u32(1, 0) => 1 : -bit_rotl_u64(2, 0) => 2 : -bit_rotr_u64(2, 0) => 2 : -bit_rotl_u32(2, 0) => 2 : -bit_rotr_u32(2, 0) => 2 : -bit_rotl_u64(32768, 0) => 32768 : -bit_rotr_u64(32768, 0) => 32768 : -bit_rotl_u32(32768, 0) => 32768 : -bit_rotr_u32(32768, 0) => 32768 : -bit_rotl_u64(65535, 0) => 65535 : -bit_rotr_u64(65535, 0) => 65535 : -bit_rotl_u32(65535, 0) => 65535 : -bit_rotr_u32(65535, 0) => 65535 : -bit_rotl_u64(65536, 0) => 65536 : 
-bit_rotr_u64(65536, 0) => 65536 : -bit_rotl_u32(65536, 0) => 65536 : -bit_rotr_u32(65536, 0) => 65536 : -bit_rotl_u64(726075912, 0) => 726075912 : -bit_rotr_u64(726075912, 0) => 726075912 : -bit_rotl_u32(726075912, 0) => 726075912 : -bit_rotr_u32(726075912, 0) => 726075912 : -bit_rotl_u64(858993459, 0) => 858993459 : -bit_rotr_u64(858993459, 0) => 858993459 : -bit_rotl_u32(858993459, 0) => 858993459 : -bit_rotr_u32(858993459, 0) => 858993459 : -bit_rotl_u64(1073741824, 0) => 1073741824 : -bit_rotr_u64(1073741824, 0) => 1073741824 : -bit_rotl_u32(1073741824, 0) => 1073741824 : -bit_rotr_u32(1073741824, 0) => 1073741824 : -bit_rotl_u64(1245250552, 0) => 1245250552 : -bit_rotr_u64(1245250552, 0) => 1245250552 : -bit_rotl_u32(1245250552, 0) => 1245250552 : -bit_rotr_u32(1245250552, 0) => 1245250552 : -bit_rotl_u64(1431655765, 0) => 1431655765 : -bit_rotr_u64(1431655765, 0) => 1431655765 : -bit_rotl_u32(1431655765, 0) => 1431655765 : -bit_rotr_u32(1431655765, 0) => 1431655765 : -bit_rotl_u64(1656977767, 0) => 1656977767 : -bit_rotr_u64(1656977767, 0) => 1656977767 : -bit_rotl_u32(1656977767, 0) => 1656977767 : -bit_rotr_u32(1656977767, 0) => 1656977767 : -bit_rotl_u64(2147483648, 0) => 2147483648 : -bit_rotr_u64(2147483648, 0) => 2147483648 : -bit_rotl_u32(2147483648, 0) => 2147483648 : -bit_rotr_u32(2147483648, 0) => 2147483648 : -bit_rotl_u64(2283114629, 0) => 2283114629 : -bit_rotr_u64(2283114629, 0) => 2283114629 : -bit_rotl_u32(2283114629, 0) => 2283114629 : -bit_rotr_u32(2283114629, 0) => 2283114629 : -bit_rotl_u64(2502548245, 0) => 2502548245 : -bit_rotr_u64(2502548245, 0) => 2502548245 : -bit_rotl_u32(2502548245, 0) => 2502548245 : -bit_rotr_u32(2502548245, 0) => 2502548245 : -bit_rotl_u64(4294967295, 0) => 4294967295 : -bit_rotr_u64(4294967295, 0) => 4294967295 : -bit_rotl_u32(4294967295, 0) => 4294967295 : -bit_rotr_u32(4294967295, 0) => 4294967295 : -bit_rotl_u64(708915120906848425, 0) => 708915120906848425 : -bit_rotr_u64(708915120906848425, 0) => 
708915120906848425 : -bit_rotl_u64(1960191741125985428, 0) => 1960191741125985428 : -bit_rotr_u64(1960191741125985428, 0) => 1960191741125985428 : -bit_rotl_u64(3689348814741910323, 0) => 3689348814741910323 : -bit_rotr_u64(3689348814741910323, 0) => 3689348814741910323 : -bit_rotl_u64(5578377670650038654, 0) => 5578377670650038654 : -bit_rotr_u64(5578377670650038654, 0) => 5578377670650038654 : -bit_rotl_u64(9223372036854775808, 0) => 9223372036854775808 : -bit_rotr_u64(9223372036854775808, 0) => 9223372036854775808 : -bit_rotl_u64(10755112315580060033, 0) => 10755112315580060033 : -bit_rotr_u64(10755112315580060033, 0) => 10755112315580060033 : -bit_rotl_u64(11163782031541429823, 0) => 11163782031541429823 : -bit_rotr_u64(11163782031541429823, 0) => 11163782031541429823 : -bit_rotl_u64(13903686156871869732, 0) => 13903686156871869732 : -bit_rotr_u64(13903686156871869732, 0) => 13903686156871869732 : -bit_rotl_u64(14237897302422917095, 0) => 14237897302422917095 : -bit_rotr_u64(14237897302422917095, 0) => 14237897302422917095 : -bit_rotl_u64(14302190498657618739, 0) => 14302190498657618739 : -bit_rotr_u64(14302190498657618739, 0) => 14302190498657618739 : -bit_rotl_u64(15766411510232741269, 0) => 15766411510232741269 : -bit_rotr_u64(15766411510232741269, 0) => 15766411510232741269 : -bit_rotl_u64(15984546468465238145, 0) => 15984546468465238145 : -bit_rotr_u64(15984546468465238145, 0) => 15984546468465238145 : -bit_rotl_u64(18446744073709551615, 0) => 18446744073709551615 : -bit_rotr_u64(18446744073709551615, 0) => 18446744073709551615 : bit_rotl_u64(0, 1) => 0 : bit_rotr_u64(0, 1) => 0 : bit_rotl_u32(0, 1) => 0 : @@ -586,68 +496,36 @@ bit_rotl_u64(18446744073709551615, 31) => : 18446744073709551615 : bit_rotr_u64(18446744073709551615, 31) => 18446744073709551615 : bit_rotl_u64(0, 32) => 0 : bit_rotr_u64(0, 32) => 0 : -bit_rotl_u32(0, 32) => 0 : -bit_rotr_u32(0, 32) => 0 : bit_rotl_u64(1, 32) => 4294967296 : bit_rotr_u64(1, 32) => 4294967296 : -bit_rotl_u32(1, 32) 
=> 1
-bit_rotr_u32(1, 32) => 1
 bit_rotl_u64(2, 32) => 8589934592
 bit_rotr_u64(2, 32) => 8589934592
-bit_rotl_u32(2, 32) => 2
-bit_rotr_u32(2, 32) => 2
 bit_rotl_u64(32768, 32) => 140737488355328
 bit_rotr_u64(32768, 32) => 140737488355328
-bit_rotl_u32(32768, 32) => 32768
-bit_rotr_u32(32768, 32) => 32768
 bit_rotl_u64(65535, 32) => 281470681743360
 bit_rotr_u64(65535, 32) => 281470681743360
-bit_rotl_u32(65535, 32) => 65535
-bit_rotr_u32(65535, 32) => 65535
 bit_rotl_u64(65536, 32) => 281474976710656
 bit_rotr_u64(65536, 32) => 281474976710656
-bit_rotl_u32(65536, 32) => 65536
-bit_rotr_u32(65536, 32) => 65536
 bit_rotl_u64(726075912, 32) => 3118472296453373952
 bit_rotr_u64(726075912, 32) => 3118472296453373952
-bit_rotl_u32(726075912, 32) => 726075912
-bit_rotr_u32(726075912, 32) => 726075912
 bit_rotl_u64(858993459, 32) => 3689348813882916864
 bit_rotr_u64(858993459, 32) => 3689348813882916864
-bit_rotl_u32(858993459, 32) => 858993459
-bit_rotr_u32(858993459, 32) => 858993459
 bit_rotl_u64(1073741824, 32) => 4611686018427387904
 bit_rotr_u64(1073741824, 32) => 4611686018427387904
-bit_rotl_u32(1073741824, 32) => 1073741824
-bit_rotr_u32(1073741824, 32) => 1073741824
 bit_rotl_u64(1245250552, 32) => 5348310396165947392
 bit_rotr_u64(1245250552, 32) => 5348310396165947392
-bit_rotl_u32(1245250552, 32) => 1245250552
-bit_rotr_u32(1245250552, 32) => 1245250552
 bit_rotl_u64(1431655765, 32) => 6148914689804861440
 bit_rotr_u64(1431655765, 32) => 6148914689804861440
-bit_rotl_u32(1431655765, 32) => 1431655765
-bit_rotr_u32(1431655765, 32) => 1431655765
 bit_rotl_u64(1656977767, 32) => 7116665319464108032
 bit_rotr_u64(1656977767, 32) => 7116665319464108032
-bit_rotl_u32(1656977767, 32) => 1656977767
-bit_rotr_u32(1656977767, 32) => 1656977767
 bit_rotl_u64(2147483648, 32) => 9223372036854775808
 bit_rotr_u64(2147483648, 32) => 9223372036854775808
-bit_rotl_u32(2147483648, 32) => 2147483648
-bit_rotr_u32(2147483648, 32) => 2147483648
 bit_rotl_u64(2283114629, 32) => 9805902664574173184
 bit_rotr_u64(2283114629, 32) => 9805902664574173184
-bit_rotl_u32(2283114629, 32) => 2283114629
-bit_rotr_u32(2283114629, 32) => 2283114629
 bit_rotl_u64(2502548245, 32) => 10748362868937195520
 bit_rotr_u64(2502548245, 32) => 10748362868937195520
-bit_rotl_u32(2502548245, 32) => 2502548245
-bit_rotr_u32(2502548245, 32) => 2502548245
 bit_rotl_u64(4294967295, 32) => 18446744069414584320
 bit_rotr_u64(4294967295, 32) => 18446744069414584320
-bit_rotl_u32(4294967295, 32) => 4294967295
-bit_rotr_u32(4294967295, 32) => 4294967295
 bit_rotl_u64(708915120906848425, 32) => 16541238372230140555
 bit_rotr_u64(708915120906848425, 32) => 16541238372230140555
 bit_rotl_u64(1960191741125985428, 32) => 14229128056835145728
@@ -732,64 +610,6 @@ bit_rotl_u64(15984546468465238145, 63) =>
 17215645271087394880
 bit_rotr_u64(15984546468465238145, 63) => 13522348863220924675
 bit_rotl_u64(18446744073709551615, 63) => 18446744073709551615
 bit_rotr_u64(18446744073709551615, 63) => 18446744073709551615
-bit_rotl_u64(0, 64) => 0
-bit_rotr_u64(0, 64) => 0
-bit_rotl_u64(1, 64) => 1
-bit_rotr_u64(1, 64) => 1
-bit_rotl_u64(2, 64) => 2
-bit_rotr_u64(2, 64) => 2
-bit_rotl_u64(32768, 64) => 32768
-bit_rotr_u64(32768, 64) => 32768
-bit_rotl_u64(65535, 64) => 65535
-bit_rotr_u64(65535, 64) => 65535
-bit_rotl_u64(65536, 64) => 65536
-bit_rotr_u64(65536, 64) => 65536
-bit_rotl_u64(726075912, 64) => 726075912
-bit_rotr_u64(726075912, 64) => 726075912
-bit_rotl_u64(858993459, 64) => 858993459
-bit_rotr_u64(858993459, 64) => 858993459
-bit_rotl_u64(1073741824, 64) => 1073741824
-bit_rotr_u64(1073741824, 64) => 1073741824
-bit_rotl_u64(1245250552, 64) => 1245250552
-bit_rotr_u64(1245250552, 64) => 1245250552
-bit_rotl_u64(1431655765, 64) => 1431655765
-bit_rotr_u64(1431655765, 64) => 1431655765
-bit_rotl_u64(1656977767, 64) => 1656977767
-bit_rotr_u64(1656977767, 64) => 1656977767
-bit_rotl_u64(2147483648, 64) => 2147483648
-bit_rotr_u64(2147483648, 64) => 2147483648
-bit_rotl_u64(2283114629, 64) => 2283114629
-bit_rotr_u64(2283114629, 64) => 2283114629
-bit_rotl_u64(2502548245, 64) => 2502548245
-bit_rotr_u64(2502548245, 64) => 2502548245
-bit_rotl_u64(4294967295, 64) => 4294967295
-bit_rotr_u64(4294967295, 64) => 4294967295
-bit_rotl_u64(708915120906848425, 64) => 708915120906848425
-bit_rotr_u64(708915120906848425, 64) => 708915120906848425
-bit_rotl_u64(1960191741125985428, 64) => 1960191741125985428
-bit_rotr_u64(1960191741125985428, 64) => 1960191741125985428
-bit_rotl_u64(3689348814741910323, 64) => 3689348814741910323
-bit_rotr_u64(3689348814741910323, 64) => 3689348814741910323
-bit_rotl_u64(5578377670650038654, 64) => 5578377670650038654
-bit_rotr_u64(5578377670650038654, 64) => 5578377670650038654
-bit_rotl_u64(9223372036854775808, 64) => 9223372036854775808
-bit_rotr_u64(9223372036854775808, 64) => 9223372036854775808
-bit_rotl_u64(10755112315580060033, 64) => 10755112315580060033
-bit_rotr_u64(10755112315580060033, 64) => 10755112315580060033
-bit_rotl_u64(11163782031541429823, 64) => 11163782031541429823
-bit_rotr_u64(11163782031541429823, 64) => 11163782031541429823
-bit_rotl_u64(13903686156871869732, 64) => 13903686156871869732
-bit_rotr_u64(13903686156871869732, 64) => 13903686156871869732
-bit_rotl_u64(14237897302422917095, 64) => 14237897302422917095
-bit_rotr_u64(14237897302422917095, 64) => 14237897302422917095
-bit_rotl_u64(14302190498657618739, 64) => 14302190498657618739
-bit_rotr_u64(14302190498657618739, 64) => 14302190498657618739
-bit_rotl_u64(15766411510232741269, 64) => 15766411510232741269
-bit_rotr_u64(15766411510232741269, 64) => 15766411510232741269
-bit_rotl_u64(15984546468465238145, 64) => 15984546468465238145
-bit_rotr_u64(15984546468465238145, 64) => 15984546468465238145
-bit_rotl_u64(18446744073709551615, 64) => 18446744073709551615
-bit_rotr_u64(18446744073709551615, 64) => 18446744073709551615
 *** test_rotl_rotr: done ***
 *** test_bswap ***
 bswap_u64(0) => 0
-- 
2.21.1 (Apple Git-122.3)
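The deleted test cases above all rotate by 64 — the full width of the operand. In C, shifting a 64-bit value by 64 or more is undefined behaviour, which is why the patch bans such calls and removes the cases from the expected output rather than defining a result. For readers who need a rotation that stays defined for any count, the usual alternative is to mask the shift amount; this is a sketch of that alternative, not the patch's approach (the patch simply forbids full-width shifts):

```c
#include <stdint.h>

/*
 * Rotations with the shift count masked to the word width. Both shift
 * amounts stay strictly below 64, so there is no UB even when the
 * count is 0, 64, or larger.
 */
static inline uint64_t
rotl64_safe(uint64_t x, unsigned c)
{
	c &= 63;
	/* (64 - c) & 63 maps c == 0 to a shift by 0, not by 64. */
	return (x << c) | (x >> ((64 - c) & 63));
}

static inline uint64_t
rotr64_safe(uint64_t x, unsigned c)
{
	c &= 63;
	return (x >> c) | (x << ((64 - c) & 63));
}
```

With this definition the removed cases become well-defined identity rotations: `rotl64_safe(x, 64) == x` for every `x`, matching the values the old test expected.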
* [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-15 23:03 [Tarantool-patches] [PATCH 0/2] Sanitize uuid and bit alignment Vladislav Shpilevoy
  2020-05-15 23:03 ` [Tarantool-patches] [PATCH 1/2] bit: fix unaligned memory access and UB bit shift Vladislav Shpilevoy
@ 2020-05-15 23:03 ` Vladislav Shpilevoy
  2020-05-18 12:55   ` Aleksandr Lyapunov
  2020-05-21 19:33 ` [Tarantool-patches] [PATCH 0/2] Sanitize uuid and bit alignment Vladislav Shpilevoy
  2 siblings, 1 reply; 14+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-15 23:03 UTC (permalink / raw)
To: tarantool-patches, tsafin, gorcunov, alyapunov

Functions tt_uuid_is_nil() and tt_uuid_is_equal() used struct tt_uuid
as an array of two uint64. But it is wrong, because struct tt_uuid has
4 byte alignment, while uint64 requires 8. That led to unaligned
memory access in these 2 functions.

The patch makes these functions treat struct tt_uuid as 4 uint32
instead of 2 uint64.

Part of #4609
---
 src/lib/uuid/tt_uuid.h | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/lib/uuid/tt_uuid.h b/src/lib/uuid/tt_uuid.h
index c2c436ed4..d62991c65 100644
--- a/src/lib/uuid/tt_uuid.h
+++ b/src/lib/uuid/tt_uuid.h
@@ -158,8 +158,8 @@ tt_uuid_bswap(struct tt_uuid *uu)
 inline bool
 tt_uuid_is_nil(const struct tt_uuid *uu)
 {
-        const uint64_t *p = (const uint64_t *) uu;
-        return !p[0] && !p[1];
+        const uint32_t *p = (const uint32_t *) uu;
+        return p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 0;
 }
 
 /**
@@ -172,9 +172,10 @@ tt_uuid_is_nil(const struct tt_uuid *uu)
 inline bool
 tt_uuid_is_equal(const struct tt_uuid *lhs, const struct tt_uuid *rhs)
 {
-        const uint64_t *lp = (const uint64_t *) lhs;
-        const uint64_t *rp = (const uint64_t *) rhs;
-        return lp[0] == rp[0] && lp[1] == rp[1];
+        const uint32_t *lp = (const uint32_t *) lhs;
+        const uint32_t *rp = (const uint32_t *) rhs;
+        return lp[0] == rp[0] && lp[1] == rp[1] && lp[2] == rp[2] &&
+               lp[3] == rp[3];
 }
 
 extern const struct tt_uuid uuid_nil;
-- 
2.21.1 (Apple Git-122.3)
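The failure mode described in the commit message is easy to reproduce in isolation: a 16-byte struct with only 4-byte alignment, reinterpreted through a `uint64_t *`, performs loads that clang's `alignment` sanitizer flags as UB. The sketch below is purely illustrative — it uses a stand-in struct (the real `struct tt_uuid` definition is not quoted in this thread) and shows both the UB-prone cast and a `memcpy`-based load that expresses the same 8-byte access without any alignment requirement:

```c
#include <stdint.h>
#include <string.h>

/*
 * Stand-in with the same size/alignment situation as struct tt_uuid:
 * 16 bytes of fields, but only 4-byte alignment.
 */
struct fake_uuid {
	uint32_t a, b, c, d;
};

/*
 * UB-prone version: casts a 4-byte-aligned object to a pointer type
 * that (on most 64-bit ABIs) requires 8-byte alignment. It usually
 * "works" on x86, but -fsanitize=alignment rightly complains.
 */
int
is_nil_ub(const struct fake_uuid *uu)
{
	const uint64_t *p = (const uint64_t *) uu; /* may be misaligned */
	return p[0] == 0 && p[1] == 0;
}

/*
 * Safe version: memcpy carries no alignment requirement, and compilers
 * lower these fixed-size copies to plain 8-byte loads anyway.
 */
int
is_nil_safe(const struct fake_uuid *uu)
{
	uint64_t lo, hi;
	memcpy(&lo, uu, sizeof(lo));
	memcpy(&hi, (const char *) uu + 8, sizeof(hi));
	return lo == 0 && hi == 0;
}
```

This is the same "explicitly specified unaligned load" idea mentioned for the bit library in PATCH 1/2, applied to the UUID case for illustration.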
* Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-15 23:03 ` [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access Vladislav Shpilevoy
@ 2020-05-18 12:55   ` Aleksandr Lyapunov
  2020-05-18 21:17     ` Vladislav Shpilevoy
  0 siblings, 1 reply; 14+ messages in thread
From: Aleksandr Lyapunov @ 2020-05-18 12:55 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, tsafin, gorcunov

Thanks for the patch! see my comment below.

On 5/16/20 2:03 AM, Vladislav Shpilevoy wrote:
> tt_uuid_is_nil(const struct tt_uuid *uu)
> {
> -        const uint64_t *p = (const uint64_t *) uu;
> -        return !p[0] && !p[1];
> +        const uint32_t *p = (const uint32_t *) uu;
> +        return p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 0;
> }
>
> /**
> @@ -172,9 +172,10 @@ tt_uuid_is_nil(const struct tt_uuid *uu)
> inline bool
> tt_uuid_is_equal(const struct tt_uuid *lhs, const struct tt_uuid *rhs)
> {
> -        const uint64_t *lp = (const uint64_t *) lhs;
> -        const uint64_t *rp = (const uint64_t *) rhs;
> -        return lp[0] == rp[0] && lp[1] == rp[1];
> +        const uint32_t *lp = (const uint32_t *) lhs;
> +        const uint32_t *rp = (const uint32_t *) rhs;
> +        return lp[0] == rp[0] && lp[1] == rp[1] && lp[2] == rp[2] &&
> +               lp[3] == rp[3];

It seems that we degrade performance just for clang to be happy..
I would suggest to use memcmp in this case.
It's portable and allows a compiler to generate the best possible code.
I've measured it (gcc) and memcmp version is twice faster than your solution.
Even for _is_nil method it's better to use memcmp with statically
allocated zero buffer.
* Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-18 12:55   ` Aleksandr Lyapunov
@ 2020-05-18 21:17     ` Vladislav Shpilevoy
  2020-05-19  7:28       ` Aleksandr Lyapunov
  0 siblings, 1 reply; 14+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-18 21:17 UTC (permalink / raw)
To: Aleksandr Lyapunov, tarantool-patches, tsafin, gorcunov

Thanks for the review!

> On 5/16/20 2:03 AM, Vladislav Shpilevoy wrote:
>> tt_uuid_is_nil(const struct tt_uuid *uu)
>> {
>> -        const uint64_t *p = (const uint64_t *) uu;
>> -        return !p[0] && !p[1];
>> +        const uint32_t *p = (const uint32_t *) uu;
>> +        return p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 0;
>> }
>>
>> /**
>> @@ -172,9 +172,10 @@ tt_uuid_is_nil(const struct tt_uuid *uu)
>> inline bool
>> tt_uuid_is_equal(const struct tt_uuid *lhs, const struct tt_uuid *rhs)
>> {
>> -        const uint64_t *lp = (const uint64_t *) lhs;
>> -        const uint64_t *rp = (const uint64_t *) rhs;
>> -        return lp[0] == rp[0] && lp[1] == rp[1];
>> +        const uint32_t *lp = (const uint32_t *) lhs;
>> +        const uint32_t *rp = (const uint32_t *) rhs;
>> +        return lp[0] == rp[0] && lp[1] == rp[1] && lp[2] == rp[2] &&
>> +               lp[3] == rp[3];
>
> It seems that we degrade performance just for clang to be happy..

Yeah, well. This is the same like saying that we degrade performance
when we do OOM checks. Unaligned memory access is UB. This is a bug.

> I would suggest to use memcmp in this case.
> It's portable and allows a compiler to generate the best possible code.
> I've measured it (gcc) and memcmp version is twice faster than your solution.
> Even for _is_nil method it's better to use memcmp with statically allocated zero buffer.

Could you please show the benchmark? I did my own, and I can't see any
significant difference. The present difference is so small, that it
looks like jitter. Both in is_nil and is_eq.

I did a very simple bench in Lua, without any GCed objects.

uuid = require('uuid')
uuid1 = uuid.new()
uuid2 = uuid.new()
clock = require('clock')

function bench1()
    local res = 0
    for i = 1, 10000000 do
        res = res + ((uuid1 == uuid2) and 1 or 0)
    end
    return res
end

function bench2()
    local res = 0
    for i = 1, 100000000 do
        res = res + ((uuid1:isnil() and uuid2:isnil()) and 1 or 0)
    end
    return res
end

clock.bench(bench2)

Bench1 is to check is_eq(). Bench2 is to check is_nil(). The build was
Release. Here is what I've got:

uuid is eq:
    4 x uint32 (my solution):  2.691714 2.727981 2.852932 2.764013
    memcmp (your solution):    2.701216 2.731426 2.641924 2.628619
    2 x uint64 (old solution): 2.671254 2.644062 2.639914 2.66742

uuid is nil:
    4 x uint32: 0.2599   0.256139 0.258356 0.259495
    memcmp:     0.26042  0.263186 0.262118 0.261504
    2 x uint64: 0.256665 0.259839 0.259746 0.258651
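The Lua loop above measures through the interpreter, which can mask small differences between the C implementations. The same two strategies can also be compared natively; the following harness is a sketch of one way to do that (the struct, names, and iteration counts are this summary's invention, not the benchmark shared later in the thread):

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Stand-in for struct tt_uuid: 16 bytes, 4-byte alignment. */
struct fake_uuid { uint32_t a, b, c, d; };

/* Field-by-field comparison through aligned 32-bit loads. */
int
eq_u32x4(const struct fake_uuid *l, const struct fake_uuid *r)
{
	const uint32_t *lp = (const uint32_t *) l;
	const uint32_t *rp = (const uint32_t *) r;
	return lp[0] == rp[0] && lp[1] == rp[1] &&
	       lp[2] == rp[2] && lp[3] == rp[3];
}

/* memcmp comparison; compilers expand it to a 16-byte compare. */
int
eq_memcmp(const struct fake_uuid *l, const struct fake_uuid *r)
{
	return memcmp(l, r, sizeof(*l)) == 0;
}

/*
 * Times 'iters' calls and returns elapsed seconds. Note the indirect
 * call defeats inlining, so a serious benchmark would instantiate the
 * loop per comparator instead of taking a function pointer.
 */
double
bench(int (*eq)(const struct fake_uuid *, const struct fake_uuid *),
      const struct fake_uuid *l, const struct fake_uuid *r,
      long iters, long *hits)
{
	struct timespec t0, t1;
	long res = 0;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (long i = 0; i < iters; i++)
		res += eq(l, r);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	*hits = res; /* also keeps the loop from being optimized away */
	return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
}
```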
* Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-18 21:17     ` Vladislav Shpilevoy
@ 2020-05-19  7:28       ` Aleksandr Lyapunov
  2020-05-19  8:34         ` Timur Safin
  2020-05-19 21:24         ` Vladislav Shpilevoy
  0 siblings, 2 replies; 14+ messages in thread
From: Aleksandr Lyapunov @ 2020-05-19 7:28 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, tsafin, gorcunov

On 5/19/20 12:17 AM, Vladislav Shpilevoy wrote:
>
> Yeah, well. This is the same like saying that we degrade performance
> when we do OOM checks. Unaligned memory access is UB. This is a bug.

It's not the same. You could see I propose a solution that both violates
no rules and have the best performance. It's very far from just a complain.

>> I would suggest to use memcmp in this case.
>> It's portable and allows a compiler to generate the best possible code.
>> I've measured it (gcc) and memcmp version is twice faster than your solution.
>> Even for _is_nil method it's better to use memcmp with statically allocated zero buffer.
> Could you please show the benchmark? I did my own, and I can't see any
> significant difference. The present difference is so small, that it
> looks like jitter. Both in is_nil and is_eq.
>
> I did a very simple bench in Lua, without any GCed objects.

I fear that Lua is not suitable for performance tests.
Here you are: https://pastebin.com/VXkS1v6M
Also please take a look at disasm: https://godbolt.org/z/s6Cti4
* Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-19  7:28       ` Aleksandr Lyapunov
@ 2020-05-19  8:34         ` Timur Safin
  0 siblings, 0 replies; 14+ messages in thread
From: Timur Safin @ 2020-05-19 8:34 UTC (permalink / raw)
To: 'Aleksandr Lyapunov', 'Vladislav Shpilevoy', tarantool-patches, gorcunov

: From: Aleksandr Lyapunov <alyapunov@tarantool.org>
:
: > Could you please show the benchmark? I did my own, and I can't see any
: > significant difference. The present difference is so small, that it
: > looks like jitter. Both in is_nil and is_eq.
: >
: > I did a very simple bench in Lua, without any GCed objects.
:
: I fear that Lua is not suitable for performance tests.
: Here you are: https://pastebin.com/VXkS1v6M
: Also please take a look at disasm: https://godbolt.org/z/s6Cti4

Haha, good example - love it!

One may assume that this is awkward C++ aliasing rules (where only
std::byte_t, char, and unsigned char are aliasing) and should look
differently in C mode, but no - it's the same for C enforced
https://godbolt.org/z/fdbBLf

So I'd agree - memcmp is the simplest way to do the fastest comparison
while keeping that picky aliasing analysis in compiler happy.

Regards,
Timur
* Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-19  7:28       ` Aleksandr Lyapunov
  2020-05-19  8:34         ` Timur Safin
@ 2020-05-19 21:24         ` Vladislav Shpilevoy
  2020-05-20  8:18           ` Aleksandr Lyapunov
  2020-05-21 14:37           ` Timur Safin
  1 sibling, 2 replies; 14+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-19 21:24 UTC (permalink / raw)
To: Aleksandr Lyapunov, tarantool-patches, tsafin, gorcunov

Thanks for the comments!

On 19/05/2020 09:28, Aleksandr Lyapunov wrote:
>
> On 5/19/20 12:17 AM, Vladislav Shpilevoy wrote:
>>
>> Yeah, well. This is the same like saying that we degrade performance
>> when we do OOM checks. Unaligned memory access is UB. This is a bug.
> It's not the same. You could see I propose a solution that both violates no
> rules and have the best performance. It's very far from just a complain.

The best performance, but not in tarantool as I see. Since your bench
has nothing to do with tarantool code.

>>> I would suggest to use memcmp in this case.
>>> It's portable and allows a compiler to generate the best possible code.
>>> I've measured it (gcc) and memcmp version is twice faster than your solution.
>>> Even for _is_nil method it's better to use memcmp with statically allocated zero buffer.
>> Could you please show the benchmark? I did my own, and I can't see any
>> significant difference. The present difference is so small, that it
>> looks like jitter. Both in is_nil and is_eq.
>>
>> I did a very simple bench in Lua, without any GCed objects.
> I fear that Lua is not suitable for performance tests.
> Here you are: https://pastebin.com/VXkS1v6M
> Also please take a look at disasm: https://godbolt.org/z/s6Cti4

Lua is fine if you don't use GC, for most of the cases. To see if there
is any observable difference. In the code we check I couldn't find a
difference. *Observable*. However the disassemble persuaded me. I
checked tarantool's executable in Release mode.

Here is what I've got for 4 x uint32 in objdump:

10021b210: 55                    pushq  %rbp
10021b211: 48 89 e5              movq   %rsp, %rbp
10021b214: 8b 07                 movl   (%rdi), %eax
10021b216: 3b 06                 cmpl   (%rsi), %eax
10021b218: 75 1b                 jne    27 <_tt_uuid_is_equal+0x25>
10021b21a: 8b 47 04              movl   4(%rdi), %eax
10021b21d: 3b 46 04              cmpl   4(%rsi), %eax
10021b220: 75 17                 jne    23 <_tt_uuid_is_equal+0x29>
10021b222: 8b 47 08              movl   8(%rdi), %eax
10021b225: 3b 46 08              cmpl   8(%rsi), %eax
10021b228: 75 13                 jne    19 <_tt_uuid_is_equal+0x2d>
10021b22a: 8b 47 0c              movl   12(%rdi), %eax
10021b22d: 3b 46 0c              cmpl   12(%rsi), %eax
10021b230: 0f 94 c0              sete   %al
10021b233: 5d                    popq   %rbp
10021b234: c3                    retq

Here is what I got for memcmp:

10021b220: 55                    pushq  %rbp
10021b221: 48 89 e5              movq   %rsp, %rbp
10021b224: f3 0f 6f 07           movdqu (%rdi), %xmm0
10021b228: f3 0f 6f 0e           movdqu (%rsi), %xmm1
10021b22c: 66 0f 74 c8           pcmpeqb %xmm0, %xmm1
10021b230: 66 0f d7 c1           pmovmskb %xmm1, %eax
10021b234: 3d ff ff 00 00        cmpl   $65535, %eax
10021b239: 0f 94 c0              sete   %al
10021b23c: 5d                    popq   %rbp
10021b23d: c3                    retq

So yeah, looks like memcmp() is better indeed. Also I looked at
is_nil() - it looked like huge shit. I am really surprised the compiler
couldn't optimize it to the same what we see with memcmp().

Thanks for helping, I am glad you took a look at this.

I used memcmp() for both places. I tried using a const 0 buffer in
tt_uuid_is_nil() inside of it, but appeared there is no difference with
comparing with uuid_nil directly. In the assembly. In clang. So I just
took tt_uuid_is_equal(uu, &uuid_nil)

====================
diff --git a/src/lib/uuid/tt_uuid.h b/src/lib/uuid/tt_uuid.h
index d62991c65..70c3b98b1 100644
--- a/src/lib/uuid/tt_uuid.h
+++ b/src/lib/uuid/tt_uuid.h
@@ -149,19 +149,6 @@ tt_uuid_bswap(struct tt_uuid *uu)
         uu->time_hi_and_version = bswap_u16(uu->time_hi_and_version);
 }
 
-/**
- * \brief Test that uuid is nil
- * \param uu UUID
- * \retval true if all members of \a uu 0
- * \retval false otherwise
- */
-inline bool
-tt_uuid_is_nil(const struct tt_uuid *uu)
-{
-        const uint32_t *p = (const uint32_t *) uu;
-        return p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 0;
-}
-
 /**
  * \brief Test that \a lhs equal \a rhs
  * \param lhs UUID
@@ -172,14 +159,23 @@ tt_uuid_is_nil(const struct tt_uuid *uu)
 inline bool
 tt_uuid_is_equal(const struct tt_uuid *lhs, const struct tt_uuid *rhs)
 {
-        const uint32_t *lp = (const uint32_t *) lhs;
-        const uint32_t *rp = (const uint32_t *) rhs;
-        return lp[0] == rp[0] && lp[1] == rp[1] && lp[2] == rp[2] &&
-               lp[3] == rp[3];
+        return memcmp(lhs, rhs, sizeof(*lhs)) == 0;
 }
 
 extern const struct tt_uuid uuid_nil;
 
+/**
+ * \brief Test that uuid is nil.
+ * \param uu UUID.
+ * \retval true If all members of \a uu 0.
+ * \retval false Otherwise.
+ */
+inline bool
+tt_uuid_is_nil(const struct tt_uuid *uu)
+{
+        return tt_uuid_is_equal(uu, &uuid_nil);
+}
+
 char *
 tt_uuid_str(const struct tt_uuid *uu);
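Pulled out of the header, the committed variant above behaves like this self-contained sketch; the struct here is a stand-in mimicking `struct tt_uuid`'s 16-byte, padding-free layout (the real type has named UUID fields), so the `memcmp` never compares indeterminate padding bytes:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct tt_uuid: 16 bytes, no padding. */
struct fake_uuid { uint32_t a, b, c, d; };

/* Static storage duration, so it is zero-initialized: the nil UUID. */
static const struct fake_uuid fake_uuid_nil;

static inline bool
fake_uuid_is_equal(const struct fake_uuid *lhs, const struct fake_uuid *rhs)
{
	/* memcmp has no alignment requirement, and the compiler can
	 * expand it into a single 16-byte vector compare. */
	return memcmp(lhs, rhs, sizeof(*lhs)) == 0;
}

static inline bool
fake_uuid_is_nil(const struct fake_uuid *uu)
{
	/* Comparing against the nil constant generated the same
	 * assembly as a local zero buffer, per the message above. */
	return fake_uuid_is_equal(uu, &fake_uuid_nil);
}
```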
* Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-19 21:24         ` Vladislav Shpilevoy
@ 2020-05-20  8:18           ` Aleksandr Lyapunov
  2020-05-20 21:38             ` Vladislav Shpilevoy
  0 siblings, 1 reply; 14+ messages in thread
From: Aleksandr Lyapunov @ 2020-05-20 8:18 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, tsafin, gorcunov

Thanks for the changes! Perfect with uuid_nil, lgtm.

On 5/20/20 12:24 AM, Vladislav Shpilevoy wrote:
> Thanks for the comments!
>
* Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-20  8:18           ` Aleksandr Lyapunov
@ 2020-05-20 21:38             ` Vladislav Shpilevoy
  2020-05-21  8:28               ` Aleksandr Lyapunov
  0 siblings, 1 reply; 14+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-20 21:38 UTC (permalink / raw)
To: Aleksandr Lyapunov, tarantool-patches, tsafin, gorcunov

Only the second commit? Did you look at the first commit, for lib/bit?

On 20/05/2020 10:18, Aleksandr Lyapunov wrote:
> Thanks for the changes! Perfect with uuid_nil, lgtm.
* Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-20 21:38             ` Vladislav Shpilevoy
@ 2020-05-21  8:28               ` Aleksandr Lyapunov
  0 siblings, 0 replies; 14+ messages in thread
From: Aleksandr Lyapunov @ 2020-05-21 8:28 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, tsafin, gorcunov

The first commit lgtm too. Sorry, I meant to approve it once I checked
it about a week ago and examined the performance too. I found out that
on some platforms (at least on Xeon Gold) your solution for bit_set is
even faster.

On 5/21/20 12:38 AM, Vladislav Shpilevoy wrote:
> Only the second commit? Did you look at the first commit, for lib/bit?
>
> On 20/05/2020 10:18, Aleksandr Lyapunov wrote:
>> Thanks for the changes! Perfect with uuid_nil, lgtm.
* Re: [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access
  2020-05-19 21:24         ` Vladislav Shpilevoy
  2020-05-20  8:18           ` Aleksandr Lyapunov
@ 2020-05-21 14:37           ` Timur Safin
  0 siblings, 0 replies; 14+ messages in thread
From: Timur Safin @ 2020-05-21 14:37 UTC (permalink / raw)
To: 'Vladislav Shpilevoy', 'Aleksandr Lyapunov', tarantool-patches, gorcunov

LGTM

: -----Original Message-----
: From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
: Sent: Wednesday, May 20, 2020 12:25 AM
: To: Aleksandr Lyapunov <alyapunov@tarantool.org>; tarantool-
: patches@dev.tarantool.org; tsafin@tarantool.org; gorcunov@gmail.com
: Subject: Re: [PATCH 2/2] uuid: fix unaligned memory access
:
: ====================
: diff --git a/src/lib/uuid/tt_uuid.h b/src/lib/uuid/tt_uuid.h
: index d62991c65..70c3b98b1 100644
: --- a/src/lib/uuid/tt_uuid.h
: +++ b/src/lib/uuid/tt_uuid.h
: @@ -149,19 +149,6 @@ tt_uuid_bswap(struct tt_uuid *uu)
:          uu->time_hi_and_version = bswap_u16(uu->time_hi_and_version);
:  }
:
: -/**
: - * \brief Test that uuid is nil
: - * \param uu UUID
: - * \retval true if all members of \a uu 0
: - * \retval false otherwise
: - */
: -inline bool
: -tt_uuid_is_nil(const struct tt_uuid *uu)
: -{
: -        const uint32_t *p = (const uint32_t *) uu;
: -        return p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 0;
: -}
: -
: /**
:  * \brief Test that \a lhs equal \a rhs
:  * \param lhs UUID
: @@ -172,14 +159,23 @@ tt_uuid_is_nil(const struct tt_uuid *uu)
:  inline bool
:  tt_uuid_is_equal(const struct tt_uuid *lhs, const struct tt_uuid *rhs)
:  {
: -        const uint32_t *lp = (const uint32_t *) lhs;
: -        const uint32_t *rp = (const uint32_t *) rhs;
: -        return lp[0] == rp[0] && lp[1] == rp[1] && lp[2] == rp[2] &&
: -               lp[3] == rp[3];
: +        return memcmp(lhs, rhs, sizeof(*lhs)) == 0;
:  }
:
:  extern const struct tt_uuid uuid_nil;
:
: +/**
: + * \brief Test that uuid is nil.
: + * \param uu UUID.
: + * \retval true If all members of \a uu 0.
: + * \retval false Otherwise.
: + */
: +inline bool
: +tt_uuid_is_nil(const struct tt_uuid *uu)
: +{
: +        return tt_uuid_is_equal(uu, &uuid_nil);
: +}
: +
: char *
: tt_uuid_str(const struct tt_uuid *uu);
* Re: [Tarantool-patches] [PATCH 0/2] Sanitize uuid and bit alignment
  2020-05-15 23:03 [Tarantool-patches] [PATCH 0/2] Sanitize uuid and bit alignment Vladislav Shpilevoy
  2020-05-15 23:03 ` [Tarantool-patches] [PATCH 1/2] bit: fix unaligned memory access and UB bit shift Vladislav Shpilevoy
  2020-05-15 23:03 ` [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access Vladislav Shpilevoy
@ 2020-05-21 19:33 ` Vladislav Shpilevoy
  2 siblings, 0 replies; 14+ messages in thread
From: Vladislav Shpilevoy @ 2020-05-21 19:33 UTC (permalink / raw)
To: tarantool-patches, tsafin, gorcunov, alyapunov

Pushed to master.

On 16/05/2020 01:03, Vladislav Shpilevoy wrote:
> The patchset fixes unaligned memory access in UUID and bit
> libraries. Also in the bit library the bitshift overflow UB is
> fixed.
>
> This is not all of issue 4609, but it is going to consist of a lot
> of independent parts like this, which can be reviewed and pushed
> without waiting for the whole patchset. To speed up the process.
>
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4609-sanitize-uuid-and-bit
> Issue: https://github.com/tarantool/tarantool/issues/4609
>
> Vladislav Shpilevoy (2):
>   bit: fix unaligned memory access and UB bit shift
>   uuid: fix unaligned memory access
>
>  src/lib/bit/bit.h      |  30 ++++---
>  src/lib/uuid/tt_uuid.h |  11 +--
>  test/unit/bit.c        |   4 +-
>  test/unit/bit.result   | 180 -----------------------------------------
>  4 files changed, 27 insertions(+), 198 deletions(-)
>
end of thread, other threads: [~2020-05-21 19:33 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-15 23:03 [Tarantool-patches] [PATCH 0/2] Sanitize uuid and bit alignment Vladislav Shpilevoy
2020-05-15 23:03 ` [Tarantool-patches] [PATCH 1/2] bit: fix unaligned memory access and UB bit shift Vladislav Shpilevoy
2020-05-21 14:37   ` Timur Safin
2020-05-15 23:03 ` [Tarantool-patches] [PATCH 2/2] uuid: fix unaligned memory access Vladislav Shpilevoy
2020-05-18 12:55   ` Aleksandr Lyapunov
2020-05-18 21:17     ` Vladislav Shpilevoy
2020-05-19  7:28       ` Aleksandr Lyapunov
2020-05-19  8:34         ` Timur Safin
2020-05-19 21:24         ` Vladislav Shpilevoy
2020-05-20  8:18           ` Aleksandr Lyapunov
2020-05-20 21:38             ` Vladislav Shpilevoy
2020-05-21  8:28               ` Aleksandr Lyapunov
2020-05-21 14:37           ` Timur Safin
2020-05-21 19:33 ` [Tarantool-patches] [PATCH 0/2] Sanitize uuid and bit alignment Vladislav Shpilevoy