Description
Describe the bug
When I construct a string with a char*, the string is initialised like this:
void init(string_view other) noexcept(false)
{
    // "other" is the string_view I passed in the constructor
    sz_ptr_t start; // after allocating memory start = 0x7bfff5900029
    if (!_with_alloc(
            [&](sz_alloc_type &alloc) { return (start = sz_string_init_length(&string_, other.size(), &alloc)); }))
        throw std::bad_alloc();
    sz_copy(start, (sz_cptr_t)other.data(), other.size());
}
The result of the allocation is at address 0x7b......29, which is not 8-byte aligned and seems strange. The first copy to "target" then happens here:
#if SZ_USE_MISALIGNED_LOADS
while (length >= 8) *(sz_u64_t *)target = *(sz_u64_t const *)source, target += 8, source += 8, length -= 8;
#endif
results in the UBSAN warning:
runtime error: store to misaligned address 0x7bfff5900029 for type 'sz_u64_t' (aka 'unsigned long'), which requires 8 byte alignment
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
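For comparison, here is a minimal sketch of how the same 8-bytes-at-a-time loop could be written without dereferencing a misaligned sz_u64_t pointer. The name copy_words is hypothetical and not part of StringZilla; the idea is only that std::memcpy into a local word is well-defined for any address, and compilers typically lower it to a single unaligned move on x86.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of StringZilla: copies `length` bytes in
// 8-byte words without ever casting the pointers to std::uint64_t *.
inline void copy_words(char *target, char const *source, std::size_t length) {
    while (length >= 8) {
        std::uint64_t word;
        std::memcpy(&word, source, sizeof(word)); // unaligned load, well-defined
        std::memcpy(target, &word, sizeof(word)); // unaligned store, well-defined
        target += 8, source += 8, length -= 8;
    }
    while (length--) *target++ = *source++; // byte-wise tail
}
```

Calling copy_words(buffer + 1, text, size) with a deliberately odd target address should not trigger the misaligned-store report, because no sz_u64_t-typed access to that address ever happens.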
There are a number of things I don't understand about this.
- I'm on an x86_64 platform, so unaligned reads and stores should be supported according to the comment in stringzilla.h:
/**
 *  @brief  A misaligned load can be - trying to fetch eight consecutive bytes from an address
 *          that is not divisible by eight. On x86 enabled by default. On ARM it's not.
 *
 *  Most platforms support it, but there is no industry standard way to check for those.
 *  This value will mostly affect the performance of the serial (SWAR) backend.
 */
#ifndef SZ_USE_MISALIGNED_LOADS
#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
#define SZ_USE_MISALIGNED_LOADS (1) // true or false
- This custom copy function exists because memcpy(NULL) is UB, so it copies 8 bytes at a time, or 1 byte at a time, in a loop. Wouldn't it be more efficient to check for null and then call memcpy? memcpy performs better than a manual loop.
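A rough sketch of the "check, then memcpy" idea from the second point follows. The name copy_or_skip is hypothetical and is not how sz_copy is implemented; it assumes the only problematic case is a null pointer paired with a zero length, so returning early on an empty copy keeps null pointers away from memcpy.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper, not StringZilla's sz_copy: skips the call when there
// is nothing to copy, so a null pointer with length 0 never reaches memcpy.
inline void copy_or_skip(char *target, char const *source, std::size_t length) {
    if (length == 0) return;             // covers the (nullptr, nullptr, 0) case
    std::memcpy(target, source, length); // alignment-agnostic, library-optimised
}
```

Since memcpy has no alignment requirement on its arguments, this form would also sidestep the misaligned-store report above.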
Steps to reproduce
Construct a StringZilla string as a static global variable that is initialised during dynamic initialisation, passing a string literal (const char*) to the constructor of the string class.
Expected behavior
???
StringZilla version
3.12.6
Operating System
Linux Mint
Hardware architecture
x86
Which interface are you using?
C++ bindings
Contact Details
No response
Are you open to being tagged as a contributor?
- I am open to being mentioned in the project .git history as a contributor
Is there an existing issue for this?
- I have searched the existing issues
Code of Conduct
- I agree to follow this project's Code of Conduct
Edit: I have posted a question about this on Stack Overflow. Apparently it is undefined behaviour. Apparently the address is 0x.....9 because that's where the SSO buffer starts? Wouldn't it just be easier to use memcpy?
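To illustrate why an inline (SSO) buffer could start at an address ending in 9, here is a hypothetical layout. It is an assumption for illustration only and is not copied from stringzilla.h: a pointer followed by a one-byte length pushes the character buffer to offset sizeof(char *) + 1, which is 9 on a 64-bit platform.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout for illustration only; not copied from stringzilla.h.
struct sso_string_sketch {
    char *start;         // points at `chars` while the string is short
    std::uint8_t length; // one-byte length directly after the pointer
    char chars[23];      // inline buffer, begins at sizeof(char *) + 1
};

// True only if the inline buffer happens to be suitably aligned for 8-byte words.
inline bool inline_buffer_is_word_aligned() {
    return offsetof(sso_string_sketch, chars) % alignof(std::uint64_t) == 0;
}
```

Under this assumed layout, an object placed at an address ending in 0x20 would have its inline buffer at 0x29, which would match the address in the report.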