I have a structure:
typedef struct {
    int i;
    float f;
} S;
The rules for converting between a pointer to a structure and a pointer to its first member say that I can write this:
S s;
int *pInt = (int *)&s;
However, is it legal according to the C standard to convert an arbitrary pointer to int back into a pointer to a structure, and then access its other members?
int *pInt = ...; // some pointer to int
...
S *pS = (S *)pInt;
float f = pS->f;
pS->f = 10;
I tried to find answers to this question in the standard, but did not find a direct answer.
I have a guarantee that the place where pInt points to actually contains a structure (and in fact, structures can be of different types and sizes, but they all have the same first int member – this is how I determine what kind of structure it is), but I want to understand if there is undefined behavior here?
I have a guarantee that the place where pInt points to actually contains a structure (and in fact, structures can be of different types and sizes, but they all have the same first int member – this is how I determine what kind of structure it is), but I want to understand if there is undefined behavior here?
Yes, the behavior is undefined. If pInt points to memory where there is a structure of type Q (meaning that memory was declared as a Q or has been given an effective type of Q by copying bytes representing a Q into it) and Q is not S and does not contain an S member, then the behavior of using ((S *) pInt)->member to read a member of the Q structure is not defined by the C standard.
It is not clear why you are asking this question. If you have a pInt that is a pointer to an int, and that int tells you what type of structure is there, then simply examine *pInt to decide what type to use for later access. You could have a switch statement or other code that uses ((S *) pInt)->member if the int indicates S or ((Q *) pInt)->member if the int indicates Q, and so on. Why would you want to access the memory as an S type if the memory does not contain an S?
In response to a comment, here is sample code to access the structure after identifying its type. We presume some constants CodeForStructureN have been defined, each identifying a corresponding structure type named StructureN:
#include <stdio.h>
#include <stdlib.h>

void foo(void *buffer)
{
    // Read the leading int to find out which structure the buffer holds.
    switch (*(int *) buffer)
    {
        // Each case has its own braces so that S can be declared per case.
        case CodeForStructure0:
        {
            Structure0 *S = buffer;
            … Use S to access Structure0 members, as in S->member …
            break;
        }
        case CodeForStructure1:
        {
            Structure1 *S = buffer;
            … Use S to access Structure1 members, as in S->member …
            break;
        }
        case CodeForStructure2:
        {
            Structure2 *S = buffer;
            … Use S to access Structure2 members, as in S->member …
            break;
        }
        …
        default:
            fprintf(stderr, "Unrecognized structure code.\n");
            exit(EXIT_FAILURE);
    }
}
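For readers who want something they can compile, here is a minimal sketch of the same dispatch. The type codes (CODE_POINT, CODE_LABEL), the structures Point and Label, and the function describe are all invented for illustration and are not from the question; the only thing taken from the answer is the pattern of reading the leading int first and then accessing the buffer through the matching structure type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type codes and structures, invented for this sketch. */
enum { CODE_POINT = 0, CODE_LABEL = 1 };

typedef struct { int code; float x, y; } Point;
typedef struct { int code; char text[16]; } Label;

/* Returns a value taken from whichever structure the buffer holds,
   or -1 when the leading code is not recognized. */
int describe(void *buffer)
{
    assert(buffer != NULL);

    /* A pointer to a structure, suitably converted, points to its
       initial member, so reading the leading int is well defined. */
    switch (*(int *) buffer)
    {
        case CODE_POINT:
        {
            Point *p = buffer;   /* the buffer really holds a Point */
            return (int) (p->x + p->y);
        }
        case CODE_LABEL:
        {
            Label *l = buffer;   /* the buffer really holds a Label */
            return (int) l->text[0];
        }
        default:
            return -1;
    }
}
```

The other members are only ever accessed through the type that the code identifies, which is the condition the answer above relies on.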
Eric Postpischil’s solution is a good one, but I think Andrew Henle’s mention of unions begs for expansion.
typedef struct {
    int type;
    union {
        struct a_s a;
        struct b_s b;
        struct c_s c;
    } u;
} s_t;
...
s_t x;
get_s_msg(&x);
switch (x.type) {
    case 0: {
        struct a_s *a = &x.u.a;
        printf("a->a_field=%d\n", a->a_field);
        break;
    }
    ...
An advantage of using the union is you can declare a buffer of that type and it will be the right size to hold any of the underlying structure types. Another advantage is better compile-time type checking which can find some bugs early. Many would say the resulting code is more readable.
A disadvantage is wasted space if you need to hold many of them, and they are of greatly varying size; they’ll all be the worst-case size. Also the union is pretty much defeated for a message type that is itself of variable length.
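As a rough, compilable sketch of the tagged-union layout described above: the member structures, their field names (a_field, b_field), the enum msg_type, and the helper payload_value are assumptions made up for this example, since the answer does not define them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message structures; the field names are invented. */
struct a_s { int a_field; };
struct b_s { double b_field; };

/* Hypothetical tag values for the two message kinds. */
enum msg_type { MSG_A = 0, MSG_B = 1 };

typedef struct {
    int type;            /* selects the active union member */
    union {
        struct a_s a;
        struct b_s b;
    } u;
} s_t;

/* Returns the payload as a double, or -1.0 for an unknown tag.
   Only the union member named by the tag is read. */
double payload_value(const s_t *msg)
{
    assert(msg != NULL);
    switch (msg->type) {
    case MSG_A: return (double) msg->u.a.a_field;
    case MSG_B: return msg->u.b.b_field;
    default:    return -1.0;
    }
}
```

Because every s_t is at least as large as its largest member, a single variable of this type can hold any of the messages, which is the sizing trade-off discussed above.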
The way the aliasing rules for C are written does not include any general provision allowing an object of structure or union type to be accessed via an lvalue of any non-character primitive type, though it does provide for accessing an object of member type using an lvalue of the struct or union type.
This would make sense if one regarded it as too obvious to need saying that an lvalue which is freshly and visibly derived from an lvalue of another type should be assumed capable of aliasing the same things as the lvalue it was derived from. A compiler that isn't being willfully obtuse would be able to recognize that even though foo->intArray[2] = 3; would be equivalent to *(foo->intArray+2) = 3; and the left operand is a dereferenced int*, it is freshly visibly derived from an lvalue of *foo's type. If a structure has a member of type int, it would be rare for code to access that storage through the structure, and then via an int*, without an intervening operation that takes the address of the structure member, but it would be common for code to access the storage via int* and then through the structure, without any intervening access.
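To make the subscript point concrete, here is a small sketch. The structure holder and the two functions are hypothetical names chosen for this example. The C standard defines E1[E2] as identical to (*((E1)+(E2))), so both functions perform the same store through an lvalue of type int that is derived from the structure pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure matching the foo->intArray expression above. */
struct holder { int intArray[4]; };

/* Stores through the subscript form. */
void store_subscript(struct holder *foo)
{
    assert(foo != NULL);
    foo->intArray[2] = 3;
}

/* Stores through the pointer-arithmetic form that the subscript
   operator is defined in terms of. */
void store_pointer(struct holder *foo)
{
    assert(foo != NULL);
    *(foo->intArray + 2) = 3;
}
```

In both cases the lvalue that is written has type int, not struct holder, yet it is visibly formed from foo, which is the situation the paragraph above describes.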
The authors of the Standard made no effort to consider all possible corner cases and ensure that the Standard mandated useful treatment. Given that it relies upon compilers to exercise common sense with such constructs as structPtr->intArray[index], even though they violate the type-based access constraint, and given that compilers almost invariably support a dialect where the notion of type-based aliasing doesn't exist, treating code which does anything tricky as requiring the use of such a dialect is wiser than trying to guess what usage patterns a compiler will or will not support.