I’ve done a lot of reading recently on reinterpret_cast, as I want to ensure I’m using it correctly and not accidentally invoking undefined behavior. I feel like cppreference and this great writeup on strict aliasing have me 95% of the way there, but I wanted some clarification on my understanding of what is, and is not, UB.
Let’s say I have a struct:
struct __attribute__((packed)) SimpleStruct {
    uint32_t a = 0;
    uint8_t b = 1;
    int16_t c = 2;
    uint8_t d[5] = {0, 1, 2, 3, 4};
};
I’ve used the __attribute__((packed)) attribute (a GCC/Clang extension) to ensure no padding bytes are used, to the detriment of performance/optimizations. Per the standard, examining the byte representation of the object via a reinterpret_cast to unsigned char * is allowed, and not UB:
SimpleStruct simple_struct;
unsigned char *bytes_of_simple_struct = reinterpret_cast<unsigned char *>(&simple_struct);
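To make the read-only case concrete, here is a minimal sketch of how I would inspect the bytes. The helper to_hex and the function demo_inspect are names I made up for illustration, and the layout checks assume a GCC/Clang-style compiler that honors __attribute__((packed)):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Same struct as above; __attribute__((packed)) is a compiler extension.
struct __attribute__((packed)) SimpleStruct {
    uint32_t a = 0;
    uint8_t b = 1;
    int16_t c = 2;
    uint8_t d[5] = {0, 1, 2, 3, 4};
};

// With no padding, the members should sit back to back: 4 + 1 + 2 + 5 bytes.
static_assert(sizeof(SimpleStruct) == 12, "unexpected padding");
static_assert(offsetof(SimpleStruct, b) == 4, "b should follow a directly");
static_assert(offsetof(SimpleStruct, c) == 5, "c should follow b directly");
static_assert(offsetof(SimpleStruct, d) == 7, "d should follow c directly");

// Hypothetical helper: formats an object's bytes as lowercase hex,
// reading them only through a pointer to const unsigned char.
template <typename T>
std::string to_hex(const T &object) {
    const unsigned char *bytes = reinterpret_cast<const unsigned char *>(&object);
    std::string out;
    char buf[3];  // two hex digits plus the terminating NUL
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        std::snprintf(buf, sizeof(buf), "%02x", static_cast<unsigned>(bytes[i]));
        out += buf;
    }
    return out;
}

// Usage: only the endianness-independent bytes are checked here.
inline void demo_inspect() {
    SimpleStruct simple_struct;
    const std::string hex = to_hex(simple_struct);
    assert(hex.size() == 2 * sizeof(SimpleStruct));
    assert(hex.substr(0, 10) == "0000000001");  // a == 0, then b == 1
    assert(hex.substr(14) == "0001020304");     // the five bytes of d
}
```

The two bytes of c are deliberately left out of the checks, since their order depends on the endianness of the system.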
Now, and this is the part I wanted clarification on, I believe modifying bytes of the struct via this pointer is also allowed, and not UB (assuming you stay within the size of the object):
static_assert(sizeof(simple_struct) == 12);
bytes_of_simple_struct[0] = 0x1U;
Now, I understand that the value of simple_struct.a will depend on the endianness of the system. However, accessing simple_struct.a after this modification of bytes is still defined behavior, correct? My reasoning is that as long as I haven’t modified the bytes into an invalid representation of the type they make up, behavior should still be defined.
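Here is a minimal sketch of what I expect to happen, assuming the system is either little- or big-endian (mixed-endian layouts are ignored). The helpers is_little_endian, write_first_byte_then_read_a and expected_a are hypothetical names of my own:

```cpp
#include <cassert>
#include <cstdint>

// Same struct as above; __attribute__((packed)) is a compiler extension.
struct __attribute__((packed)) SimpleStruct {
    uint32_t a = 0;
    uint8_t b = 1;
    int16_t c = 2;
    uint8_t d[5] = {0, 1, 2, 3, 4};
};

// Hypothetical helper: true when the least significant byte is stored first.
inline bool is_little_endian() {
    const uint32_t probe = 1;
    return *reinterpret_cast<const unsigned char *>(&probe) == 1;
}

// Hypothetical helper: overwrites the first byte of a fresh object through
// an unsigned char pointer, then reads member a back normally.
inline uint32_t write_first_byte_then_read_a(unsigned char value) {
    SimpleStruct simple_struct;  // a starts out as 0
    unsigned char *bytes = reinterpret_cast<unsigned char *>(&simple_struct);
    bytes[0] = value;
    return simple_struct.a;
}

// Value of a that I would expect, assuming only little- or big-endian layouts.
inline uint32_t expected_a(unsigned char value) {
    return is_little_endian() ? uint32_t{value} : uint32_t{value} << 24;
}

// Usage: the value read back should match the endianness-based prediction.
inline void demo_modify() {
    assert(write_first_byte_then_read_a(0x01U) == expected_a(0x01U));
    assert(write_first_byte_then_read_a(0xFFU) == expected_a(0xFFU));
}
```

If my understanding is right, any value written this way is fine for a, because every 4-byte pattern is a valid uint32_t.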
Conversely, if my struct had a bool instead:
struct __attribute__((packed)) SimpleStruct {
    bool a_bool = false;
    uint8_t b = 1;
    int16_t c = 2;
    uint8_t d[5] = {0, 1, 2, 3, 4};
};
Then doing something like this:
bytes_of_simple_struct[0] = 0xFFU;
assert(simple_struct.a_bool == false);
Would be invoking UB, since I’ve now modified the underlying byte of a_bool such that it is no longer a valid representation of type bool. Basically, as long as any byte modification still obeys the rules for what bytes can represent each type, behavior should be defined? And in the case of the basic numeric types, you can essentially modify the bytes to anything (whether or not this is useful is another story), as any byte value is a valid uint8_t, any two bytes are a valid uint16_t, etc.
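If that reading is right, the safe pattern would be to check the byte through unsigned char before ever reading it as a bool. Below is a minimal sketch of that idea. It assumes sizeof(bool) == 1 and that false and true are stored as 0 and 1, which is what common ABIs do but is not something the standard promises. SimpleStructWithBool is my own renaming of the second struct, and the helper names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// The bool variant of the struct, renamed here to keep the examples apart.
struct __attribute__((packed)) SimpleStructWithBool {
    bool a_bool = false;
    uint8_t b = 1;
    int16_t c = 2;
    uint8_t d[5] = {0, 1, 2, 3, 4};
};

// Assumption: bool occupies exactly one byte on this implementation.
static_assert(sizeof(bool) == 1, "this sketch assumes a one-byte bool");

// Hypothetical helper: looks at the first byte as an unsigned char only,
// and reports whether it is one of the two patterns assumed to be valid.
inline bool first_byte_is_valid_bool(const SimpleStructWithBool &s) {
    const unsigned char byte = *reinterpret_cast<const unsigned char *>(&s);
    return byte == 0 || byte == 1;
}

// Hypothetical helper: reads a_bool only when its byte passed the check,
// otherwise returns the caller's fallback without touching a_bool.
inline bool read_a_bool_or(const SimpleStructWithBool &s, bool fallback) {
    return first_byte_is_valid_bool(s) ? s.a_bool : fallback;
}

// Usage: after writing 0xFF the member is never read as a bool.
inline void demo_bool() {
    SimpleStructWithBool s;
    assert(first_byte_is_valid_bool(s));
    reinterpret_cast<unsigned char *>(&s)[0] = 0xFFU;
    assert(!first_byte_is_valid_bool(s));
    assert(read_a_bool_or(s, false) == false);
}
```

The point of the sketch is that the write itself and the unsigned char read are never the problem; the only access avoided is the read of a_bool as a bool after its byte holds 0xFF.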
Is my understanding correct?