Some hardware cannot read non-aligned data. For example, Bitmap images are aligned to 4 bytes, both in the header and in each scan line, in order to maintain device independence.
For example, you can cause an alignment fault by accessing unaligned memory, as per the example in the wiki. However, does this actually hold true for files?
If I tried to read data from a binary file (like Bitmaps) at a non-aligned index, will this cause an alignment fault on certain hardware? My guess is that it will, because Bitmaps safeguard against this. However, is it entirely necessary to take this precaution if I want to make a binary file, or even a piece of code that loads that file, device independent?
Some hardware cannot read non-aligned data.
True.
For example Bitmap images
are aligned to 4-bytes from the header and with each scan line in
order to maintain device independence.
I have a number of points to make on this sentence:
- The internal organization of bitmap files is not an example of hardware being or not being able to read non-aligned data.
- The word “device” in the term “device independence” for bitmaps refers to the display, not to the CPU. Data alignment is an issue that the CPU has to deal with.
- The reason for this organization is not to maintain device independence. Windows .BMP files (I suppose that’s what you mean by “Bitmap images”) are 4-byte aligned because the x86 architecture performs best when reading aligned data. It can, however, read non-aligned data. It just won’t be as fast.
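To make the 4-byte scan line alignment concrete, here is a minimal sketch of how the padded row size of an uncompressed BMP is usually computed. The helper name bmp_row_stride is hypothetical and is not part of any API.

```c
#include <stdint.h>

/* Bytes occupied by one scan line of an uncompressed BMP.
   The raw row size in bits is rounded up to the next multiple of 32 bits,
   so every row starts on a 4-byte boundary.
   bmp_row_stride is a hypothetical helper name used for illustration. */
uint32_t bmp_row_stride(uint32_t width_px, uint32_t bits_per_pixel)
{
    return ((width_px * bits_per_pixel + 31u) / 32u) * 4u;
}
```

For instance, a 24-bit image that is 2 pixels wide has 6 bytes of pixel data per row, and the formula pads that to 8.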
For example you can cause an alignment fault by accessing unaligned
memory as per the example in the wiki. However, does this actually
hold true for files?
Yes and no. Obviously, a CPU could not care less whether a certain file sitting on the disk is or is not aligned. But in order for the file to be of any use at all, it will need to be loaded into memory at some point, right? That’s when proper alignment is useful.
If I tried to read data from a binary file (like Bitmaps) at a
non-aligned index will this cause an alignment fault on certain
hardware?
On some hardware, yes, but not on x86 or x64. There, ordinary unaligned loads work and you just get sub-optimal performance.
My guess is that it will because Bitmaps safeguard against
this,
Taking file format organizations as hints about potential CPU behaviors is as misguided as taking football team colors as hints about stock market investment decisions.
however is it entirely necessary to take this precaution if I
want to make a binary file or even a piece of code to load that file
device independent?
Again, regarding “device independent”: I presume that by this you mean “hardware architecture independent” (which in practice means “CPU independent”), so the answer here is that it depends on how independent you want to be.
- If you want to be completely independent, then you cannot go with binary files at all.
- If you want to be moderately independent, then go ahead with your binary file but make sure to use an alignment of 4 or better yet of 8.
- If you want to be just reasonably independent, ignore alignment. It is just a file. If someone has to read it on some weird hardware, they can read it byte by byte.
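If you go for the middle option, the writer has to pad each field or record so that it starts at a multiple of the chosen alignment. A minimal sketch of that rounding, assuming a hypothetical helper named align_up and offsets counted from the start of the file:

```c
#include <stddef.h>

/* Rounds offset up to the next multiple of alignment.
   alignment must be non-zero; 4 or 8 are the values suggested above.
   align_up is a hypothetical helper name used for illustration. */
size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}
```

The writer would then emit align_up(offset, 8) - offset padding bytes before the next record.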
If I tried to read data from a binary file (like Bitmaps) at a
non-aligned index will this cause an alignment fault on certain
hardware?
Yes, it can. It depends on the hardware: some platforms are more tolerant and some less, but it can cause faults. In particular, in C, reading through an improperly aligned pointer is undefined behaviour, meaning anything (a crash, data corruption, or nothing at all) may happen.
The only safe and platform-independent way is to always read byte by byte. If that is too inefficient, you’ll need a platform-dependent solution, which can then perform only the alignment that the platform truly needs.
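As a rough sketch of the byte-by-byte approach, the function below assembles a 32-bit value from four individual bytes at any offset in a buffer. It assumes the file stores the field in little-endian order (as BMP headers do); load_u32_le is a hypothetical name.

```c
#include <stdint.h>

/* Builds a 32-bit unsigned value from four bytes stored in little-endian
   order. Only single-byte reads are performed, so p may point at any
   address, aligned or not, and the result does not depend on the byte
   order of the host CPU.
   load_u32_le is a hypothetical helper name used for illustration. */
uint32_t load_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Because the value is built with shifts rather than by reinterpreting memory, the same code should behave identically on strict-alignment and big-endian machines.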
The code which reads a file just has to respect the file rules. A zip file often contains data which isn’t even byte-aligned (basically that’s how the format achieves compression: it uses fewer than 8 bits for common bytes in the input). Unzipping code must therefore deal with individual bits.
For bitmaps, it’s often easier. Having 4-byte aligned structures means that you can memory map the file, and then safely access the header fields as integers. If they weren’t aligned, you’d need to copy the 4 unaligned bytes to 4 aligned bytes. This can often be done transparently, e.g. by fread. It copies bytes from its internal buffer to the address you specified.
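The copy from 4 unaligned bytes to 4 aligned bytes can be sketched with memcpy, which is one common way to do it when the data is already in a buffer or a memory-mapped file. The name load_u32_unaligned is hypothetical, and unlike a byte-by-byte decode the result is in the host's byte order.

```c
#include <stdint.h>
#include <string.h>

/* Copies 4 bytes from an arbitrary, possibly unaligned, address into a
   properly aligned local variable and returns it. The bytes are not
   reordered, so the value is interpreted in host byte order.
   load_u32_unaligned is a hypothetical helper name used for illustration. */
uint32_t load_u32_unaligned(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Calling fread(&value, sizeof value, 1, fp) has the same effect for data coming straight from a file: the destination variable is aligned, whatever the offset in the file.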