My question
I have tried using NASM’s strict keyword and -O0 to inhibit it from turning mov dword [rdi+0], 0xab into mov dword [rdi], 0xab, but haven’t been successful. My hunch is that the +0 gets preprocessed away, or that it is such a simple “optimization” that it can’t be disabled, but I don’t know NASM well enough to be certain, so I’m looking for confirmation, and other ideas.
Reproducing
foo.s:
foo:
mov dword [rdi+0], 0xab
mov dword [rdi+4], 0xcd
ret
Running nasm -f elf64 foo.s and then xxd foo.o outputs this line:
00000180: c707 ab00 0000 c747 04cd 0000 00c3 0000 .......G........
So the first mov is turned into c707 ab00 0000, and the second into c747 04cd 0000 00. Notice how the second mov is encoded differently, taking an extra 04 offset byte.
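To make the difference concrete, here is a minimal sketch that splits the byte after the c7 opcode (the ModRM byte) into its fields. The names split_modrm and disp_size are hypothetical helpers for illustration, and the sketch assumes the simple case from this question (a plain base register, no SIB byte, no RIP-relative addressing).

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
struct modrm {
    uint8_t mod;
    uint8_t reg;
    uint8_t rm;
};

/* Hypothetical helper: split a ModRM byte into its three fields. */
static struct modrm split_modrm(uint8_t byte) {
    struct modrm m;
    m.mod = (uint8_t)(byte >> 6);
    m.reg = (uint8_t)((byte >> 3) & 7);
    m.rm = (uint8_t)(byte & 7);
    return m;
}

/* Hypothetical helper: number of displacement bytes implied by mod.
 * Assumption: rm is a plain base register such as rdi (rm = 7), so the
 * SIB (rm = 4) and RIP-relative (mod = 0, rm = 5) cases are ignored. */
static int disp_size(struct modrm m) {
    switch (m.mod) {
    case 0: return 0;  /* [reg], no displacement */
    case 1: return 1;  /* [reg + disp8] */
    case 2: return 4;  /* [reg + disp32] */
    default: return 0; /* mod = 3: register operand, no memory access */
    }
}
```

Under these assumptions, 0x07 splits into mod = 0, rm = 7 (rdi) with no displacement byte, while 0x47 splits into mod = 1, rm = 7 with one displacement byte, which is the extra 04 in the second mov.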
Why the hell do you want this
So I know this sounds like an XY problem, but I’m writing my first compiler for my own programming language. I am trying to keep the compiler as small and simple as possible, happily trading some of the output’s performance away. In order to make sure my compiler’s output is correct, I am comparing the .so it outputs to the one that NASM+ld together output, which means that every one of my tests has a .s NASM file. This frees me from having to worry about whether my compiler’s output is correct, but it does mean my compiler needs to output the exact same bytes as NASM+ld for my tests.
Someone suggested using objcopy to edit text section bytes, but simply search-and-replacing c707 with c747 00 of course won’t do, since those bytes might be, say, a regular constant in the code. I could have my tests.sh do something like first get the offsets of all of the instructions’ first bytes, and then use that to limit the search-and-replace, but I’m not sure yet whether I’d use objcopy or objdump to somehow get these offsets.
For now I just put this if-statement in my compiler for the special case of offset 0. I am aware that compilers for programming languages typically can’t hardcode stuff like this, but with the way I designed my programming language it works just fine:
    for (size_t i = 0; i < moved_value_count; i++) {
        if (i == 0) {
            // moving value (0xab) to index 0:
            // c707 ab00 0000
        } else {
            // moving value (0xcd) to index 1, 2, etc:
            // c747 04cd 0000 00
        }
    }
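As a rough sketch of what the two branches above boil down to, the special case could live in one small emitter. The function name emit_mov_dword_rdi and its signature are hypothetical, not part of my actual compiler. It mirrors the encodings shown in the xxd output above: no displacement byte for offset 0, a one-byte displacement for small offsets, and (an addition beyond what the test above exercises) a four-byte displacement for anything outside the signed 8-bit range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the bytes of `mov dword [rdi+disp], imm`
 * into out and returns how many bytes were written (6, 7 or 10).
 * The caller must provide room for at least 10 bytes. */
static size_t emit_mov_dword_rdi(uint8_t *out, int32_t disp, uint32_t imm) {
    size_t n = 0;

    out[n++] = 0xc7; /* opcode C7 /0: mov r/m32, imm32 */

    if (disp == 0) {
        out[n++] = 0x07; /* mod = 00, rm = rdi: no displacement byte */
    } else if (disp >= -128 && disp <= 127) {
        out[n++] = 0x47; /* mod = 01, rm = rdi: one displacement byte */
        out[n++] = (uint8_t)disp;
    } else {
        out[n++] = 0x87; /* mod = 10, rm = rdi: four displacement bytes */
        for (int i = 0; i < 4; i++) {
            out[n++] = (uint8_t)((uint32_t)disp >> (8 * i)); /* little-endian */
        }
    }

    /* 32-bit immediate, little-endian */
    for (int i = 0; i < 4; i++) {
        out[n++] = (uint8_t)(imm >> (8 * i));
    }
    return n;
}
```

Calling it with (buf, 0, 0xab) and then (buf, 4, 0xcd) writes c7 07 ab 00 00 00 and c7 47 04 cd 00 00 00 respectively, matching the two instructions in foo.o.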