Introduction
I am looking for advice on the best way to guarantee a particular in-memory layout for a struct that is passed from one CPU to another via shared memory in an asymmetric multiprocessing (AMP) system. In my case, the sending CPU is AArch64, and the receiving CPU is Arm32. I know that the compiler has to add padding between the members of a struct to give each field its required alignment, but is there any way to guarantee a particular amount of padding? Even if you ensure that no padding is needed, by rearranging the fields or adding manual padding with dummy variables, is there any guarantee that the compiler will not add padding anyway? I believe that using #pragma pack or the packed attribute will force the compiler not to include any padding at all, but it has some significant downsides.
- First, you lose the ability to create references to the struct's fields, even when the field in question happens to be aligned. You can still take a pointer to a field: if the field is aligned, that is fine, but if it is misaligned, you will get a warning when creating the pointer, and the existence of the pointer is undefined behavior.
- Second, if a field is misaligned, the compiler has to emit extra instructions to handle the unaligned access every time the field is accessed. It is also easy to add a field and accidentally misalign all the following fields without noticing.
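To make both downsides concrete, here is a minimal sketch, assuming GCC or Clang (the packed attribute is a compiler extension, not standard C++). PackedMsg and the helper functions are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical message type, used only for illustration.
struct __attribute__((packed)) PackedMsg {
    std::uint8_t  tag;      // offset 0
    std::uint32_t counter;  // offset 1: misaligned for a 4-byte type
    std::uint16_t flags;    // offset 5: misaligned for a 2-byte type
};

// Packing removes all padding and drops the struct's alignment to 1.
static_assert(offsetof(PackedMsg, counter) == 1, "counter is misaligned");
static_assert(offsetof(PackedMsg, flags) == 5, "flags is misaligned");
static_assert(sizeof(PackedMsg) == 7, "no padding anywhere");
static_assert(alignof(PackedMsg) == 1, "packed struct has alignment 1");

// Accessing the member through the struct is fine: the compiler knows the
// field may be misaligned and generates a suitable (possibly slower) access.
inline std::uint32_t read_counter(const PackedMsg& m) {
    return m.counter;
}

// Instead of forming a uint32_t* to the misaligned field, copy the bytes out.
inline std::uint32_t read_counter_from_bytes(const unsigned char* bytes) {
    std::uint32_t value;
    std::memcpy(&value, bytes + offsetof(PackedMsg, counter), sizeof value);
    return value;
}

// Both access paths must agree on the stored value.
inline void check_reads_agree(const PackedMsg& m) {
    unsigned char raw[sizeof(PackedMsg)];
    std::memcpy(raw, &m, sizeof m);
    assert(read_counter(m) == read_counter_from_bytes(raw));
}
```

Inserting a single uint8_t before counter would shift every later offset, and nothing would complain unless asserts like the ones above were in place.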
My problem with #pragma pack is not that it doesn’t get the job done but that it makes the struct harder to work with (you cannot create references to its fields), and you might end up with a bunch of overhead (or undefined behavior) without noticing. I understand the sentiment of not adding a feature to do something that could already be done just as well, but IMO, you should be able to do it sustainably, that is, in a way that is resilient to future changes.
All of this would be solved if I could just be sure that the compiler isn’t going to add padding beyond what is necessary to align the next field in the struct. That is to say, the AArch64 compiler and the Arm32 compiler would generate the same deterministic layout given the same sizes and alignments of the fields. On AArch64, Arm32, and x86-64 (where my unit tests are running), the alignment of each of the standard int and float types is equal to its size. So if I had the above guarantee, I could be sure that a struct of standard int and float types would be laid out the same way in both programs. Rust guarantees a deterministic layout given the size and alignment of the fields for C-compatible structs (marked #[repr(C)]), but I have not been able to find a similar guarantee in any C/C++/GNU/Arm standard. I am wondering what the typical way to tackle this problem is, and whether there is some such guarantee out there that I missed.
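The "no more padding than necessary" rule I am hoping for can be written down as a small computation: each field goes at the next offset that is a multiple of its alignment. The sketch below is only an illustration of that rule, not something taken from a standard. Sample, align_up, and the kXxxOff constants are hypothetical names, and the asserts only prove what the compiler at hand does for the target it is building for.

```cpp
#include <cstddef>
#include <cstdint>

// Round `offset` up to the next multiple of `align`.
// Assumes `align` is a power of two.
constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) & ~(align - 1);
}

// Hypothetical struct made of fixed-width types only.
struct Sample {
    std::uint8_t  kind;
    std::uint32_t id;
    std::uint16_t len;
    std::uint64_t timestamp;
};

// Offsets predicted by the minimal-padding rule: place each member at the
// first suitably aligned offset after the end of the previous member.
constexpr std::size_t kKindOff = 0;
constexpr std::size_t kIdOff =
    align_up(kKindOff + sizeof(std::uint8_t), alignof(std::uint32_t));
constexpr std::size_t kLenOff =
    align_up(kIdOff + sizeof(std::uint32_t), alignof(std::uint16_t));
constexpr std::size_t kTimestampOff =
    align_up(kLenOff + sizeof(std::uint16_t), alignof(std::uint64_t));

// Struct alignment is the largest member alignment; the size is rounded up
// to a multiple of that alignment (tail padding).
constexpr std::size_t kAlign = alignof(std::uint64_t);
constexpr std::size_t kSize =
    align_up(kTimestampOff + sizeof(std::uint64_t), kAlign);

// If the compiler deviates from the rule on some target, the build fails there.
static_assert(offsetof(Sample, kind) == kKindOff, "kind offset");
static_assert(offsetof(Sample, id) == kIdOff, "id offset");
static_assert(offsetof(Sample, len) == kLenOff, "len offset");
static_assert(offsetof(Sample, timestamp) == kTimestampOff, "timestamp offset");
static_assert(alignof(Sample) == kAlign, "struct alignment");
static_assert(sizeof(Sample) == kSize, "struct size");
```

With every alignment equal to the type's size, this predicts offsets 0, 4, 8, and 16 and a total size of 24 bytes.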
I actually did find something close in the Arm ABI (both 32-bit and 64-bit), the document furthest down the reference chain, which says that the members of a struct are laid out “sequentially in memory”. (The 64-bit version of this ABI adds the parenthetical, “possibly with inter-member padding”.) Can I assume that the ABI is calling for no more padding than necessary? At a theoretical level, the ABI is the specification for how two programs are meant to communicate with one another using function calls, data layout, etc. That way, all programs and libraries that comply with the ABI will be compatible with all other compliant programs and libraries. Since the 32-bit and 64-bit Arm ABIs use the exact same wording for how to lay out structs, can I assume that structs can be passed from programs that use the 64-bit ABI to programs that use the 32-bit ABI?
Research to Date (from most general to most specific)
C/C++ Standard
The C and C++ standards both require that the struct/class fields be laid out in memory in the order in which they are declared but put no restriction on how much padding the compiler can put between the fields.
GNU Documentation
The GCC documentation does not specify how much padding the compiler should insert but leaves it up to the ABI (Application Binary Interface).
The application binary interface implemented by a C or C++ compiler affects code generation and runtime support for:
- size and alignment of data types
- layout of structured types
- calling conventions
- register usage conventions
- interfaces for runtime arithmetic support
- object file formats
In addition, the application binary interface implemented by a C++ compiler affects code generation and runtime support for:
- name mangling
- exception handling
- invoking constructors and destructors
- layout, alignment, and padding of classes
- layout and alignment of virtual tables
— https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Compatibility.html
Arm C++ ABI
The Arm C++ ABIs (both the 32-bit and 64-bit ABIs) reference the generic Itanium C++ ABI for non-POD (plain old data) class types, but for POD types (which is what we care about), the generic Itanium C++ ABI leaves the layout to the respective C ABI.
The generic C++ ABI (originally developed for Itanium, [GCPPABI]) specifies:
The layout of C++ non-POD class types in terms of the layout of POD types (specified for this ABI by the Procedure Call Standard for the Arm Architecture, summarized in Procedure call standard for the Arm architecture).
— https://github.com/ARM-software/abi-aa/blob/2982a9f3b512a5bfdc9e3fea5d3b298f9165c36b/bsabi32/bsabi32.rst#the-generic-c-abi
Procedure Call Standard for the Arm Architecture
This part of the Arm C/C++ ABIs (both the 32-bit and 64-bit ABIs) describes structures and unions according to their “Fundamental Data Types”. The members of an “aggregate composite type” are “laid out sequentially in memory”. Is this the guarantee I am looking for? Does it guarantee no extra spacing or just order?
Structs specifically:
8.1.6 Structure, Union and Class Layout
Structures and unions are laid out according to the Fundamental Data Types of which they are composed (see Composite Types). All members are laid out in declaration order. Additional rules applying to C++ non-POD class layout are described in CPPABI32 and GCPPABI.
— https://github.com/ARM-software/abi-aa/blob/2a70c42d62e9c3eb5887fa50b71257f20daca6f9/aapcs32/aapcs32.rst#structure-union-and-class-layout
“Fundamental” composites/aggregates:
5.3 Composite Types
A Composite Type is a collection of one or more Fundamental Data Types that are handled as a single entity at the procedure call level. A Composite Type can be any of:
- An aggregate, where the members are laid out sequentially in memory
- A union, where each of the members has the same address
- An array, which is a repeated sequence of some other type (its base type).
The definitions are recursive; that is, each of the types may contain a Composite Type as a member.
- The member alignment of an element of a composite type is the alignment of that member after the application of any language alignment modifiers to that member
- The natural alignment of a composite type is the maximum of each of the member alignments of the ‘top-level’ members of the composite type i.e. before any alignment adjustment of the entire composite is applied
5.3.1 Aggregates
- The alignment of an aggregate shall be the alignment of its most-aligned component.
- The size of an aggregate shall be the smallest multiple of its alignment that is sufficient to hold all of its members when they are laid out according to these rules.
— https://github.com/ARM-software/abi-aa/blob/2a70c42d62e9c3eb5887fa50b71257f20daca6f9/aapcs32/aapcs32.rst#composite-types
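As a worked example of the two aggregate rules quoted above (alignment is that of the most-aligned member, size is the smallest sufficient multiple of that alignment), here is a sketch with a nested aggregate. Inner and Outer are hypothetical types, and the expected numbers assume the reading that members are placed with no more padding than alignment requires, which is exactly the point in question.

```cpp
#include <cstddef>
#include <cstdint>

// Members end at byte 3; alignment is 2 (from `a`), so the size is rounded
// up to 4, leaving one byte of tail padding.
struct Inner {
    std::uint16_t a;  // offset 0
    std::uint8_t  b;  // offset 2
};

// The definitions are recursive: `inner` is placed like any other member,
// using Inner's own size (4) and alignment (2).
struct Outer {
    std::uint8_t  tag;    // offset 0
    Inner         inner;  // offset 2 (next multiple of 2 after byte 1)
    std::uint32_t value;  // offset 8 (next multiple of 4 after byte 6)
};

static_assert(alignof(Inner) == 2, "alignment of most-aligned member");
static_assert(sizeof(Inner) == 4, "smallest multiple of 2 holding 3 bytes");

static_assert(offsetof(Outer, inner) == 2, "nested aggregate offset");
static_assert(offsetof(Outer, value) == 8, "value offset");
static_assert(alignof(Outer) == 4, "alignment of most-aligned member");
static_assert(sizeof(Outer) == 12, "smallest multiple of 4 holding 12 bytes");
```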
You can use a bunch of static_asserts on offsetof and sizeof (and possibly alignof) to confirm that the layout is exactly as you expect on every platform you want to use it on. This will ensure you have what you want even absent any standard guarantees.
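A minimal sketch of what that could look like, assuming a header that is compiled into both the AArch64 and the Arm32 builds (and into the x86-64 unit tests). SharedMsg and its fields are hypothetical. The explicit reserved bytes are one possible choice so that no padding is left implicit.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical shared-memory message; every byte is accounted for explicitly.
struct SharedMsg {
    std::uint32_t sequence;     // offset 0
    std::uint8_t  command;      // offset 4
    std::uint8_t  reserved[3];  // offset 5: manual padding up to offset 8
    std::uint64_t payload;      // offset 8
};

// offsetof is only unconditionally supported for standard-layout types, and
// copying through shared memory needs a trivially copyable type.
static_assert(std::is_standard_layout<SharedMsg>::value, "standard layout");
static_assert(std::is_trivially_copyable<SharedMsg>::value, "trivially copyable");

// The layout contract: if any compiler or target disagrees, the build fails.
static_assert(offsetof(SharedMsg, sequence) == 0, "sequence offset");
static_assert(offsetof(SharedMsg, command) == 4, "command offset");
static_assert(offsetof(SharedMsg, reserved) == 5, "reserved offset");
static_assert(offsetof(SharedMsg, payload) == 8, "payload offset");
static_assert(sizeof(SharedMsg) == 16, "total size");
static_assert(alignof(SharedMsg) == 8, "alignment");
```

These checks cover size, offsets, and alignment only. They say nothing about byte order, so they assume both CPUs run with the same endianness.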
(Though I do agree with the commenters – you probably should use a proper serialization library for this task)
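For comparison, here is a hand-rolled sketch of the serialization idea. It is not a real serialization library, just an explicit little-endian byte encoding that does not depend on struct layout at all. WireMsg, kWireSize, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory message; its layout no longer matters on the wire.
struct WireMsg {
    std::uint32_t sequence;
    std::uint8_t  command;
    std::uint64_t payload;
};

// Wire format: 4 bytes sequence, 1 byte command, 8 bytes payload.
constexpr std::size_t kWireSize = 4 + 1 + 8;

// Write the low `n` bytes of `value` in little-endian order (n <= 8).
inline void put_le(unsigned char* out, std::uint64_t value, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<unsigned char>(value >> (8 * i));
    }
}

// Read `n` little-endian bytes into an integer (n <= 8).
inline std::uint64_t get_le(const unsigned char* in, std::size_t n) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < n; ++i) {
        value |= static_cast<std::uint64_t>(in[i]) << (8 * i);
    }
    return value;
}

// `out` must point to at least kWireSize bytes.
inline void encode(const WireMsg& m, unsigned char* out) {
    put_le(out, m.sequence, 4);
    out[4] = m.command;
    put_le(out + 5, m.payload, 8);
}

// `in` must point to at least kWireSize bytes.
inline WireMsg decode(const unsigned char* in) {
    WireMsg m{};
    m.sequence = static_cast<std::uint32_t>(get_le(in, 4));
    m.command = in[4];
    m.payload = get_le(in + 5, 8);
    return m;
}

// Encoding and then decoding must give back the same field values.
inline void check_roundtrip(const WireMsg& m) {
    unsigned char buf[kWireSize];
    encode(m, buf);
    const WireMsg back = decode(buf);
    assert(back.sequence == m.sequence);
    assert(back.command == m.command);
    assert(back.payload == m.payload);
}
```

The trade-off is a copy per message in exchange for a format that is independent of padding, alignment, and endianness on either CPU.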