I was learning linker scripts and wrote the assembly code below:
#hello.s
.data
msg : .string "Hello, world!\n"
len = . - msg
.text
.global _start
_start:
movl $len, %edx
movl $msg, %ecx
movl $1, %ebx
movl $4, %eax # sys_write
int $0x80
movl $0,%ebx
movl $1,%eax # sys_exit
int $0x80
and the linker script below:
/* hello.ld */
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", "elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SECTIONS
{
. = 0x10000;
.text :
{
*(.text);
}
.data :
{
*(.data);
}
.bss :
{
*(.bss);
}
}
I assembled, linked, and ran it with:
as -o hello.o hello.s
ld -T hello.ld -o hello hello.o
./hello
hello runs normally when I set . = 0x10000. But when I set . = 0x1040, I get a segmentation fault. There is an answer here that explains why custom code cannot be placed below address 0x10000.
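For context, here is a small shell sketch of the check I understand that answer to describe. It assumes the 0x10000 limit comes from the Linux vm.mmap_min_addr setting (readable at /proc/sys/vm/mmap_min_addr, commonly 65536); addr_is_mappable is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed if address $1 is at or above limit $2.
# Both arguments may be decimal or 0x-prefixed hexadecimal.
addr_is_mappable() {
    [ "$(($1))" -ge "$(($2))" ]
}

# Assumption: the limit is vm.mmap_min_addr; fall back to 65536 (0x10000)
# if the file cannot be read on this system.
limit=$(cat /proc/sys/vm/mmap_min_addr 2>/dev/null || echo 65536)

# Compare the two link addresses tried in hello.ld against that limit.
for addr in 0x10000 0x1040; do
    if addr_is_mappable "$addr" "$limit"; then
        echo "$addr: at or above limit $limit"
    else
        echo "$addr: below limit $limit"
    fi
done
```

This only compares numbers; it does not prove that the limit is the cause of the crash on a given machine.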
However, when I used readelf -h to inspect an executable (such as a.out) generated by gcc, the entry point address was 0x1040. Disassembling that file with objdump -d also shows the definition of _start at 0x1040. But gdb cannot insert a breakpoint at that position with b *1040. When I use b _start and run in gdb, there is another _start in /lib64/ld-linux-x86-64.so.2.
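To make the comparison between the two binaries easier to reproduce, the relevant header fields can be pulled out side by side. This is a minimal sketch: elf_summary is a hypothetical helper name, the file names hello and a.out are the ones used above, and the field labels Type and Entry point address are the ones GNU readelf -h prints. Whether the two files report the same Type may matter for how the 0x1040 value should be read, but I have not confirmed that.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -h` output on stdin and keep only
# the Type and Entry point address lines.
elf_summary() {
    grep -E '^[[:space:]]*(Type|Entry point address):'
}

# Print the two fields for each binary that exists in the current directory.
for f in hello a.out; do
    if [ -f "$f" ]; then
        echo "== $f"
        readelf -h "$f" | elf_summary
    fi
done
```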
Here is the definition at 0x1040:
0000000000001040 <_start>:
1040: f3 0f 1e fa endbr64
1044: 31 ed xor %ebp,%ebp
1046: 49 89 d1 mov %rdx,%r9
1049: 5e pop %rsi
104a: 48 89 e2 mov %rsp,%rdx
104d: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
1051: 50 push %rax
1052: 54 push %rsp
1053: 45 31 c0 xor %r8d,%r8d
1056: 31 c9 xor %ecx,%ecx
1058: 48 8d 3d ca 00 00 00 lea 0xca(%rip),%rdi # 1129 <main>
105f: ff 15 73 2f 00 00 call *0x2f73(%rip) # 3fd8 <__libc_start_main@GLIBC_2.34>
1065: f4 hlt
1066: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
106d: 00 00 00
My questions are:
- Why can gcc put _start at address 0x1040?
- Will the code at 0x1040 be executed?
- Which _start is the true one?
I have searched on Google but only found the explanation here of why custom code cannot be placed below 0x10000.