Thiết kế website giá rẻ

Question

I’m currently playing with ELF files, and I’m confused about how PT_LOAD segments are loaded into memory. Specifically, I’d like to understand how p_offset, p_filesz, p_vaddr, and p_memsz are used.

First, this is my program header output from **readelf**:

➜  ~ readelf -l /usr/bin/cat

Elf file type is DYN (Shared object file)
Entry point 0x31f0
There are 13 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000002d8 0x00000000000002d8  R      0x8
  INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000000016e0 0x00000000000016e0  R      0x1000
  LOAD           0x0000000000002000 0x0000000000002000 0x0000000000002000
                 0x0000000000004431 0x0000000000004431  R E    0x1000
  LOAD           0x0000000000007000 0x0000000000007000 0x0000000000007000
                 0x00000000000021d0 0x00000000000021d0  R      0x1000
  LOAD           0x0000000000009a90 0x000000000000aa90 0x000000000000aa90
                 0x0000000000000630 0x00000000000007c8  RW     0x1000
  DYNAMIC        0x0000000000009c38 0x000000000000ac38 0x000000000000ac38
                 0x00000000000001f0 0x00000000000001f0  RW     0x8
  NOTE           0x0000000000000338 0x0000000000000338 0x0000000000000338
                 0x0000000000000020 0x0000000000000020  R      0x8
  NOTE           0x0000000000000358 0x0000000000000358 0x0000000000000358
                 0x0000000000000044 0x0000000000000044  R      0x4
  GNU_PROPERTY   0x0000000000000338 0x0000000000000338 0x0000000000000338
                 0x0000000000000020 0x0000000000000020  R      0x8
  GNU_EH_FRAME   0x000000000000822c 0x000000000000822c 0x000000000000822c
                 0x00000000000002bc 0x00000000000002bc  R      0x4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_RELRO      0x0000000000009a90 0x000000000000aa90 0x000000000000aa90
                 0x0000000000000570 0x0000000000000570  R      0x1

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt 
   03     .init .plt .plt.got .plt.sec .text .fini 
   04     .rodata .eh_frame_hdr .eh_frame 
   05     .init_array .fini_array .data.rel.ro .dynamic .got .data .bss 
   06     .dynamic 
   07     .note.gnu.property 
   08     .note.gnu.build-id .note.ABI-tag 
   09     .note.gnu.property 
   10     .eh_frame_hdr 
   11     
   12     .init_array .fini_array .data.rel.ro .dynamic .got
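For reference, each row readelf prints above corresponds to one Elf64_Phdr entry in the file. Here is a minimal sketch of decoding such an entry, assuming a little-endian 64-bit ELF. The helper name `parse_phdr` and the constant names are made up for illustration, and the bytes are rebuilt from the readelf values rather than read from /usr/bin/cat:

```python
import struct

# Elf64_Phdr layout (little-endian): two 32-bit fields, then six 64-bit fields.
PHDR_FMT = "<IIQQQQQQ"
PHDR_FIELDS = ("p_type", "p_flags", "p_offset", "p_vaddr",
               "p_paddr", "p_filesz", "p_memsz", "p_align")
PT_LOAD = 1
PF_W, PF_R = 0x2, 0x4


def parse_phdr(raw: bytes) -> dict:
    """Decode one 56-byte Elf64_Phdr entry into a dict of named fields."""
    return dict(zip(PHDR_FIELDS, struct.unpack(PHDR_FMT, raw)))


# Rebuild the fourth LOAD entry from the readelf output above (flags RW).
raw = struct.pack(PHDR_FMT, PT_LOAD, PF_R | PF_W,
                  0x9a90, 0xaa90, 0xaa90, 0x630, 0x7c8, 0x1000)
phdr = parse_phdr(raw)
print({name: hex(value) for name, value in phdr.items()})

# Each entry is 56 bytes, so 13 entries take 0x2d8 bytes, matching the PHDR row.
print(hex(13 * struct.calcsize(PHDR_FMT)))
```

The only fields that matter for the rest of this question are p_offset, p_vaddr, p_filesz, p_memsz, and p_align.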

In the output above, I can see four LOAD segments. I’m using gdb to examine how those four segments are mapped into memory. I started the program and checked the process mappings as follows:

(gdb) shell cat /proc/18331/maps
555555554000-555555556000 r--p 00000000 103:03 3276961                   /usr/bin/cat
555555556000-55555555b000 r-xp 00002000 103:03 3276961                   /usr/bin/cat
55555555b000-55555555e000 r--p 00007000 103:03 3276961                   /usr/bin/cat
55555555e000-555555560000 rw-p 00009000 103:03 3276961                   /usr/bin/cat
7ffff7fc9000-7ffff7fcd000 r--p 00000000 00:00 0                          [vvar]
7ffff7fcd000-7ffff7fcf000 r-xp 00000000 00:00 0                          [vdso]
7ffff7fcf000-7ffff7fd0000 r--p 00000000 103:03 3325699                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7ffff7fd0000-7ffff7ff3000 r-xp 00001000 103:03 3325699                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7ffff7ff3000-7ffff7ffb000 r--p 00024000 103:03 3325699                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7ffff7ffc000-7ffff7ffe000 rw-p 0002c000 103:03 3325699                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0 
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0                          [stack]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

Let’s look at the fourth segment. From the process mappings, I can see that it is a 0x2000-byte mapping at offset 0x9000 in the /usr/bin/cat file. What confuses me is the difference between p_offset (0x9a90) and p_vaddr (0xaa90).
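To show where the 0x2000 and 0x9000 figures come from, here is a small sketch that splits one line of /proc/<pid>/maps into its fields. The function name `parse_maps_line` is hypothetical, and the input is the fourth /usr/bin/cat line copied from the output above:

```python
def parse_maps_line(line: str) -> dict:
    """Split one /proc/<pid>/maps line into start, end, perms, offset, size."""
    fields = line.split()
    start_text, end_text = fields[0].split("-")
    start, end = int(start_text, 16), int(end_text, 16)
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        "offset": int(fields[2], 16),  # offset into the backing file
        "size": end - start,           # length of the mapping in bytes
    }


line = ("55555555e000-555555560000 rw-p 00009000 103:03 3276961"
        "                   /usr/bin/cat")
mapping = parse_maps_line(line)
print(hex(mapping["size"]), hex(mapping["offset"]), mapping["perms"])
```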

After reading through a lot of material, my understanding is as follows. Because the p_offset of this segment is 0x9a90, its data sits in the same file page as the end of the previous segment. p_vaddr is therefore moved 0x1000 bytes forward so that the segment is placed in a different page in memory. This means the address actually passed to mmap is calculated as a bias plus p_vaddr, rounded down to the nearest page start address. The file offset is p_offset minus (p_offset mod p_align), and the length of the mapping is p_filesz plus (p_offset mod p_align), rounded up to a page boundary:

mmap(PAGESTART(bias + p_vaddr), PAGE_ALIGN(p_filesz + p_offset mod p_align), FLAGS, p_offset - p_offset mod p_align)

So here is my first question: is the above guess right?
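To make the guess concrete, here is the same arithmetic as a short sketch, plugged with the values of the fourth LOAD segment. It assumes a 4 KiB page size and takes the load bias 0x555555554000 from the first /usr/bin/cat line in the maps output. The helper names (`page_start`, `page_align`, `mmap_args`) are made up for illustration and are not taken from any loader source:

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def page_start(addr: int) -> int:
    """Round an address down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


def page_align(length: int) -> int:
    """Round a length up to a whole number of pages."""
    return (length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def mmap_args(bias, p_offset, p_vaddr, p_filesz, p_align):
    """Return (address, length, file offset) following the guess above."""
    in_page = p_offset % p_align
    address = page_start(bias + p_vaddr)
    length = page_align(p_filesz + in_page)
    file_offset = p_offset - in_page
    return address, length, file_offset


bias = 0x555555554000  # start of the first /usr/bin/cat mapping
address, length, file_offset = mmap_args(
    bias, p_offset=0x9a90, p_vaddr=0xaa90, p_filesz=0x630, p_align=0x1000)
print(hex(address), hex(length), hex(file_offset))
```

With these inputs the result is address 0x55555555e000, length 0x2000, and file offset 0x9000, which lines up with the fourth /usr/bin/cat row in the maps output. The sketch relies on p_offset and p_vaddr being congruent modulo p_align, which the ELF specification requires for loadable segments, so the in-page remainder is the same whichever of the two it is computed from.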

Moving on, in the fourth segment, I can see many sections:

.init_array .fini_array .data.rel.ro .dynamic .got .data .bss

I checked this using objdump; the output is as follows:

➜  ~ objdump --disassemble-all /usr/bin/cat --start-address=0x9000 --stop-address=0xd000

/usr/bin/cat:     file format elf64-x86-64


Disassembly of section .eh_frame:

0000000000009000 <.eh_frame+0xb18>:

Disassembly of section .init_array:

000000000000aa90 <.init_array>:
...

Disassembly of section .fini_array:

000000000000aa98 <.fini_array>:
...

Disassembly of section .data.rel.ro:

000000000000aaa0 <quoting_style_args@@Base-0x140>:
...

The point is that the .init_array section starts at 0xaa90 (which is the p_vaddr), not where I expected. I thought this section would start at 0x9a90, which is the p_offset of this segment.

So if the offset is 0x9000 and the size is 0x2000 for this mapping, doesn’t that mean I will miss some bytes of this segment? mmap only maps from 0x9000 to 0xb000, but .init_array starts at 0xaa90, which means the segment should end at 0xaa90 + p_filesz = 0xaa90 + 0x630 = 0xb0c0.
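One way to sanity-check this arithmetic is to keep the two address spaces apart: 0x9000 to 0xb000 is a range of file offsets, while 0xaa90 and 0xb0c0 are virtual addresses (before the load bias is added). The sketch below computes both ranges separately for the fourth segment, assuming 4 KiB pages. All variable names are illustrative only:

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages

p_offset, p_vaddr = 0x9a90, 0xaa90
p_filesz, p_memsz = 0x630, 0x7c8
map_file_offset, map_length = 0x9000, 0x2000  # from the maps output

# Range of the mapping, expressed as file offsets.
file_range = (map_file_offset, map_file_offset + map_length)
# Bytes of the segment inside the file.
segment_in_file = (p_offset, p_offset + p_filesz)

# The same mapping, expressed as virtual addresses without the load bias.
map_vaddr = p_vaddr & ~(PAGE_SIZE - 1)
vaddr_range = (map_vaddr, map_vaddr + map_length)
# The segment in memory, including the zero-filled tail up to p_memsz.
segment_in_memory = (p_vaddr, p_vaddr + p_memsz)


def contains(outer, inner):
    """True if the half-open range inner lies entirely inside outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


print([hex(x) for x in file_range], [hex(x) for x in segment_in_file])
print([hex(x) for x in vaddr_range], [hex(x) for x in segment_in_memory])
print(contains(file_range, segment_in_file),
      contains(vaddr_range, segment_in_memory))
```

In file terms the segment occupies 0x9a90 to 0xa0c0, and in unbiased virtual terms the mapping covers 0xa000 to 0xc000, so both containment checks print True under these assumptions.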

PT_LOAD mapping mechanism