Golfer - Part 1
We have a 32-bit binary called golfer:
$ file golfer
golfer: ELF 32-bit
Reverse engineering
It is quite a strange binary, since file outputs very little information about it. In fact, we cannot analyze it with Ghidra or objdump. Therefore, let's look at the hexdump with xxd:
$ xxd golfer
00000000: 7f45 4c46 0161 3466 5455 487d 7952 7b6c .ELF.a4fTUH}yR{l
00000010: 0200 0300 0100 0000 4c00 0008 2c00 0000 ........L...,...
00000020: 675f 3330 4272 efbe 3400 2000 0100 0000 g_30Br..4. .....
00000030: 0000 0000 0000 0008 0000 0008 3801 0000 ............8...
00000040: 3801 0000 0500 0000 0010 0000 e9d6 0000 8...............
00000050: 00fe c3fe c2b9 0a00 0008 e8d0 0000 00b9 ................
00000060: 0800 0008 e8c6 0000 00b9 2400 0008 e8bc ..........$.....
00000070: 0000 00b9 0e00 0008 e8b2 0000 00b9 0c00 ................
00000080: 0008 e8a8 0000 00b9 2300 0008 e89e 0000 ........#.......
00000090: 00b9 0900 0008 e894 0000 00b9 2100 0008 ............!...
000000a0: e88a 0000 00b9 0600 0008 e880 0000 00b9 ................
000000b0: 0d00 0008 e876 0000 00b9 2200 0008 e86c .....v...."....l
000000c0: 0000 00b9 2100 0008 e862 0000 00b9 0500 ....!....b......
000000d0: 0008 e858 0000 00b9 2100 0008 e84e 0000 ...X....!....N..
000000e0: 00b9 2000 0008 e844 0000 00b9 2300 0008 .. ....D....#...
000000f0: e83a 0000 00b9 0f00 0008 e830 0000 00b9 .:.........0....
00000100: 0700 0008 e826 0000 00b9 2200 0008 e81c .....&....".....
00000110: 0000 00b9 2500 0008 e812 0000 00b9 0b00 ....%...........
00000120: 0008 e808 0000 0030 c0fe c0b3 2acd 8055 .......0....*..U
00000130: 89e5 b004 cd80 c9c3 ........
It is indeed a very short binary (0x138 = 312 bytes). If we run it, nothing appears to happen. Looking more closely at the xxd output, we see some printable characters near the top. We can view them using strings:
$ strings golfer
a4fTUH}yR{l
g_30Br
There's an H, a T, a B, and also { and }. So, we might think that the flag is there, but scrambled.
Using pwntools, we can disassemble the raw bytes and see a curious pattern (the first few lines of the listing are header bytes decoded as if they were code, so they can be ignored):
$ xxd -p golfer | tr -d \\n
7f454c46016134665455487d79527b6c02000300010000004c0000082c000000675f33304272efbe340020000100000000000000000000080000000838010000380100000500000000100000e9d6000000fec3fec2b90a000008e8d0000000b908000008e8c6000000b924000008e8bc000000b90e000008e8b2000000b90c000008e8a8000000b923000008e89e000000b909000008e894000000b921000008e88a000000b906000008e880000000b90d000008e876000000b922000008e86c000000b921000008e862000000b905000008e858000000b921000008e84e000000b920000008e844000000b923000008e83a000000b90f000008e830000000b907000008e826000000b922000008e81c000000b925000008e812000000b90b000008e80800000030c0fec0b32acd805589e5b004cd80c9c3
$ pwn disasm 38010000380100000500000000100000e9d6000000fec3fec2b90a000008e8d0000000b908000008e8c6000000b924000008e8bc000000b90e000008e8b2000000b90c000008e8a8000000b923000008e89e000000b909000008e894000000b921000008e88a000000b906000008e880000000b90d000008e876000000b922000008e86c000000b921000008e862000000b905000008e858000000b921000008e84e000000b920000008e844000000b923000008e83a000000b90f000008e830000000b907000008e826000000b922000008e81c000000b925000008e812000000b90b000008e80800000030c0fec0b32acd805589e5b004cd80c9c3
0: 38 01 cmp BYTE PTR [ecx], al
2: 00 00 add BYTE PTR [eax], al
4: 38 01 cmp BYTE PTR [ecx], al
6: 00 00 add BYTE PTR [eax], al
8: 05 00 00 00 00 add eax, 0x0
d: 10 00 adc BYTE PTR [eax], al
f: 00 e9 add cl, ch
11: d6 (bad)
12: 00 00 add BYTE PTR [eax], al
14: 00 fe add dh, bh
16: c3 ret
17: fe c2 inc dl
19: b9 0a 00 00 08 mov ecx, 0x800000a
1e: e8 d0 00 00 00 call 0xf3
23: b9 08 00 00 08 mov ecx, 0x8000008
28: e8 c6 00 00 00 call 0xf3
2d: b9 24 00 00 08 mov ecx, 0x8000024
32: e8 bc 00 00 00 call 0xf3
37: b9 0e 00 00 08 mov ecx, 0x800000e
3c: e8 b2 00 00 00 call 0xf3
41: b9 0c 00 00 08 mov ecx, 0x800000c
46: e8 a8 00 00 00 call 0xf3
4b: b9 23 00 00 08 mov ecx, 0x8000023
50: e8 9e 00 00 00 call 0xf3
55: b9 09 00 00 08 mov ecx, 0x8000009
5a: e8 94 00 00 00 call 0xf3
5f: b9 21 00 00 08 mov ecx, 0x8000021
64: e8 8a 00 00 00 call 0xf3
69: b9 06 00 00 08 mov ecx, 0x8000006
6e: e8 80 00 00 00 call 0xf3
73: b9 0d 00 00 08 mov ecx, 0x800000d
78: e8 76 00 00 00 call 0xf3
7d: b9 22 00 00 08 mov ecx, 0x8000022
82: e8 6c 00 00 00 call 0xf3
87: b9 21 00 00 08 mov ecx, 0x8000021
8c: e8 62 00 00 00 call 0xf3
91: b9 05 00 00 08 mov ecx, 0x8000005
96: e8 58 00 00 00 call 0xf3
9b: b9 21 00 00 08 mov ecx, 0x8000021
a0: e8 4e 00 00 00 call 0xf3
a5: b9 20 00 00 08 mov ecx, 0x8000020
aa: e8 44 00 00 00 call 0xf3
af: b9 23 00 00 08 mov ecx, 0x8000023
b4: e8 3a 00 00 00 call 0xf3
b9: b9 0f 00 00 08 mov ecx, 0x800000f
be: e8 30 00 00 00 call 0xf3
c3: b9 07 00 00 08 mov ecx, 0x8000007
c8: e8 26 00 00 00 call 0xf3
cd: b9 22 00 00 08 mov ecx, 0x8000022
d2: e8 1c 00 00 00 call 0xf3
d7: b9 25 00 00 08 mov ecx, 0x8000025
dc: e8 12 00 00 00 call 0xf3
e1: b9 0b 00 00 08 mov ecx, 0x800000b
e6: e8 08 00 00 00 call 0xf3
eb: 30 c0 xor al, al
ed: fe c0 inc al
ef: b3 2a mov bl, 0x2a
f1: cd 80 int 0x80
f3: 55 push ebp
f4: 89 e5 mov ebp, esp
f6: b0 04 mov al, 0x4
f8: cd 80 int 0x80
fa: c9 leave
fb: c3 ret
There are a lot of mov ecx, <addr>; call 0xf3 pairs. Notice that the binary is loaded at address 0x8000000, so every <addr> is that base address plus an offset into the file. For instance, 0x800000a refers to the byte at offset 10 of the hexdump (H), 0x8000008 to the byte at offset 8 (T), and 0x8000024 to the byte at offset 36 (B).
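The address-to-offset translation can be checked without touching the file. The following is a minimal sketch, assuming the first 0x28 bytes are exactly those shown in the xxd output; the names HEADER, BASE, and char_at are illustrative and not part of the challenge.

```python
# First 0x28 bytes of golfer, copied from the xxd output above.
HEADER = bytes.fromhex(
    "7f454c46016134665455487d79527b6c"
    "02000300010000004c0000082c000000"
    "675f33304272efbe"
)
BASE = 0x8000000  # load address of the binary (assumed from the operands seen above)


def char_at(addr: int) -> str:
    """Turn a virtual address into a file offset and return the byte stored there."""
    return chr(HEADER[addr - BASE])


# The first three operands of the mov ecx instructions.
first_three = "".join(char_at(a) for a in (0x800000A, 0x8000008, 0x8000024))
print(first_three)  # HTB
```

Subtracting the base gives a zero-based offset, which is why Python's indexing matches the addresses directly.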
Solution
We can get those characters using head. Since head -c N prints the first N bytes, the byte at offset k is the last byte of head -c k+1, so we need to add 1 to each offset:
$ head -c 11 golfer | tail -c 1
H
$ head -c 9 golfer | tail -c 1
T
$ head -c 37 golfer | tail -c 1
B
Nice, all we have to do now is automate the extraction of the flag using the following Python script:
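#!/usr/bin/env python3
with open('golfer', 'rb') as f:
    binary = f.read()

flag = ''
# Skip inc dl (fe c2) and the b9 opcode to land on the first mov ecx operand
index = binary.index(b'\xfe\xc2') + 3
while binary[index - 1] == 0xb9:
    # The operand is a virtual address; subtract the base to get a file offset
    addr = int.from_bytes(binary[index:index + 4], 'little') - 0x8000000
    flag += chr(binary[addr])
    # Each mov ecx (5 bytes) + call (5 bytes) pair is 10 bytes long
    index += 10

print(flag)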
Flag
If we run the script, we will get the flag:
$ python3 solve.py
HTB{y0U_4R3_a_g0lf3r}
The full script can be found here: solve.py.
Another approach
According to readelf, the program's entry point is at address 0x800004c:
$ readelf -a golfer
ELF Header:
Magic: 7f 45 4c 46 01 61 34 66 54 55 48 7d 79 52 7b 6c
Class: ELF32
Data: <unknown: 61>
Version: 52 <unknown>
OS/ABI: <unknown: 66>
ABI Version: 84
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x800004c
Start of program headers: 44 (bytes into file)
Start of section headers: 808673127 (bytes into file)
Flags: 0xbeef7242
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 1
Size of section headers: 0 (bytes)
Number of section headers: 0
Section header string table index: 0
readelf: Warning: possibly corrupt ELF file header - it has a non-zero section header offset, but no section headers
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x08000000 0x08000000 0x00138 0x00138 R E 0x1000
There is no dynamic section in this file.
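The same fields can be read by hand, which is useful when tools refuse a malformed header. This is a minimal sketch, assuming the standard Elf32_Ehdr and Elf32_Phdr layouts and the first 0x4c bytes exactly as shown in the xxd output; HEAD and entry_file_offset are illustrative names.

```python
import struct

# First 0x4c bytes of golfer, copied from the xxd output earlier in this write-up.
HEAD = bytes.fromhex(
    "7f454c46016134665455487d79527b6c"
    "02000300010000004c0000082c000000"
    "675f33304272efbe3400200001000000"
    "00000000000000080000000838010000"
    "380100000500000000100000"
)

# Elf32_Ehdr: 16 bytes of e_ident followed by little-endian fixed-size fields.
(_, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
 e_ehsize, e_phentsize, e_phnum, _, _, _) = struct.unpack_from(
    "<16sHHIIIIIHHHHHH", HEAD, 0)

# Elf32_Phdr: eight 32-bit words, located at e_phoff.
(p_type, p_offset, p_vaddr, p_paddr,
 p_filesz, p_memsz, p_flags, p_align) = struct.unpack_from("<8I", HEAD, e_phoff)

# File offset of the first executed instruction.
entry_file_offset = e_entry - p_vaddr + p_offset
print(hex(e_entry), hex(p_vaddr), hex(entry_file_offset))
```

Note that the program header starts at offset 44, inside the 52-byte ELF header, so the two structures overlap. That overlap is one of the tricks that keeps the file this small.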
Since the entry point is 0x800004c and the single LOAD segment maps file offset 0 to virtual address 0x8000000, the actual machine instructions start at offset 0x4c of the binary:
$ xxd -p -s 0x4c golfer
e9d6000000fec3fec2b90a000008e8d0000000b908000008e8c6000000b9
24000008e8bc000000b90e000008e8b2000000b90c000008e8a8000000b9
23000008e89e000000b909000008e894000000b921000008e88a000000b9
06000008e880000000b90d000008e876000000b922000008e86c000000b9
21000008e862000000b905000008e858000000b921000008e84e000000b9
20000008e844000000b923000008e83a000000b90f000008e830000000b9
07000008e826000000b922000008e81c000000b925000008e812000000b9
0b000008e80800000030c0fec0b32acd805589e5b004cd80c9c3
$ pwn disasm $(xxd -p -s 0x4c golfer | tr -d \\n)
0: e9 d6 00 00 00 jmp 0xdb
5: fe c3 inc bl
7: fe c2 inc dl
9: b9 0a 00 00 08 mov ecx, 0x800000a
e: e8 d0 00 00 00 call 0xe3
13: b9 08 00 00 08 mov ecx, 0x8000008
18: e8 c6 00 00 00 call 0xe3
...
d1: b9 0b 00 00 08 mov ecx, 0x800000b
d6: e8 08 00 00 00 call 0xe3
db: 30 c0 xor al, al
dd: fe c0 inc al
df: b3 2a mov bl, 0x2a
e1: cd 80 int 0x80
e3: 55 push ebp
e4: 89 e5 mov ebp, esp
e6: b0 04 mov al, 0x4
e8: cd 80 int 0x80
ea: c9 leave
eb: c3 ret
Now the code makes more sense. It starts with a jump to 0xdb, where al is set to 1 (xor al, al; inc al) and bl to 42 (0x2a) before int 0x80 is executed. With $eax = 1 the system call is sys_exit, so the program simply exits. The exit code is the value in $ebx, that is, 42:
$ ./golfer; echo $?
42
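The jump and call targets shown by the disassembler follow from the rel32 encoding: the 4-byte operand is a signed displacement added to the address of the next instruction. Here is a small sketch of that arithmetic, with offsets relative to 0x4c as in the listing; rel32_target is an illustrative helper, not a pwntools function.

```python
def rel32_target(insn_offset: int, insn: bytes) -> int:
    """Target of a 5-byte x86 jmp rel32 (0xe9) or call rel32 (0xe8)."""
    if len(insn) != 5 or insn[0] not in (0xE8, 0xE9):
        raise ValueError("expected a 5-byte jmp/call rel32 instruction")
    # The displacement is little-endian, signed, and relative to the next instruction.
    disp = int.from_bytes(insn[1:5], "little", signed=True)
    return insn_offset + 5 + disp


jmp_target = rel32_target(0x00, bytes.fromhex("e9d6000000"))   # first instruction
call_target = rel32_target(0x0E, bytes.fromhex("e8d0000000"))  # first call
print(hex(jmp_target), hex(call_target))  # 0xdb 0xe3
```

Each later call has a displacement 10 bytes smaller than the previous one, which is why they all land on the same routine.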
However, if we replace the jmp 0xdb instruction with nop instructions (0x90 in hexadecimal), execution falls through into the rest of the code and the program prints the flag character by character. The inc bl and inc dl instructions set $ebx = 1 and $edx = 1 (this relies on both registers being zero when the process starts). After that, each pair of instructions puts a character address in $ecx and calls 0xe3. That routine performs a sys_write system call, because it sets al to 4 before int 0x80. The file descriptor is $ebx = 1 (stdout), the buffer is the address in $ecx, and the length to print is $edx = 1.
So, let’s do this:
$ xxd -p golfer | tr -d \\n | sed s/e9d6000000/9090909090/g | xxd -r -p > golfer_patched
$ chmod +x golfer_patched
$ ./golfer_patched
HTB{y0U_4R3_a_g0lf3r}
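The sed command above replaces every occurrence of the hex string e9d6000000, including any that happen to straddle a byte boundary. In this file there is only one match, but a patch at a fixed offset avoids the question entirely. The following is a minimal sketch of that alternative, assuming the entry point is at file offset 0x4c as derived above; nop_out_jump and patch_file are illustrative names.

```python
import os

ENTRY_OFFSET = 0x4C                # file offset of the entry point
JMP = bytes.fromhex("e9d6000000")  # jmp 0xdb, as seen in the disassembly
NOP = b"\x90"


def nop_out_jump(data: bytes) -> bytes:
    """Return a copy of data with the 5-byte jmp at the entry point replaced by NOPs."""
    end = ENTRY_OFFSET + len(JMP)
    if data[ENTRY_OFFSET:end] != JMP:
        raise ValueError("unexpected bytes at the entry point")
    patched = bytearray(data)
    patched[ENTRY_OFFSET:end] = NOP * len(JMP)
    return bytes(patched)


def patch_file(src: str, dst: str) -> None:
    """Read src, patch it, write the result to dst, and mark dst executable."""
    with open(src, "rb") as f:
        original = f.read()
    with open(dst, "wb") as f:
        f.write(nop_out_jump(original))
    os.chmod(dst, 0o755)
```

Calling patch_file('golfer', 'golfer_patched') should produce the same file as the xxd and sed pipeline. The check inside nop_out_jump makes the patch fail loudly if it is run against a different binary.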