Although I spent a summer at OPPO, where my mentor was a real Android expert, I have to say I have little experience in Android reverse engineering. Over the past few months, I spent a lot of time analyzing an Android service binary and the files associated with it.
Over the past few weeks, I spent quite a lot of time on frida. There are a few examples around, but most of them focus on hooking Java-level code. Also, due to the nature of my current project, I ran into some unusual problems that most APK reverse engineers will probably never encounter. As I spent more time on this, my frida scripts grew messier and messier, and I had to find a way to organize them.
In this blog post, I document some of those problems and some frida features that other people seldom mention.
The simplest example
You may already know that frida agents are written in JavaScript, and that frida provides bindings for several other languages. But no matter what language you use on the client side, the code injected into the process has to be JavaScript.
To keep things simple, I use Python on the client side. Sometimes I just write a tiny script and invoke frida from the terminal directly.
Let's start with the simplest example.
import frida, sys

def on_message(message, data):
    if message['type'] == 'send':
        print("[*] {0}".format(message['payload']))
    else:
        print(message)

jscode = """
Java.perform(() => {
  // Function to hook is defined here
  const MainActivity = Java.use('com.example.seccon2015.rock_paper_scissors.MainActivity');
  // Whenever button is clicked
  const onClick = MainActivity.onClick;
  onClick.implementation = function (v) {
    // Show a message to know that the function got called
    send('onClick');
    // Call the original onClick handler
    onClick.call(this, v);
    // Set our values after running the original onClick handler
    this.m.value = 0;
    this.n.value = 1;
    this.cnt.value = 999;
    // Log to the console that it's done, and we should have the flag!
    console.log('Done:' + JSON.stringify(this.cnt));
  };
});
"""

process = frida.get_usb_device().attach('com.example.seccon2015.rock_paper_scissors')
script = process.create_script(jscode)
script.on('message', on_message)
print('[*] Running CTF')
script.load()
sys.stdin.read()
This is the Android example from the official website. To inject code into a target process, you first need to find the device, and then find the process on that device.
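The device-then-process lookup can be spelled out step by step. This is only a minimal sketch, assuming the frida Python package is installed and a frida-server is reachable over USB. `pick_process` and `attach_by_name` are hypothetical helper names of mine, not part of frida.

```python
def pick_process(processes, name):
    """Return the first process whose name matches exactly, else None.

    `processes` is any iterable of objects with `.name` and `.pid`,
    which is the shape of what Device.enumerate_processes() returns.
    """
    for proc in processes:
        if proc.name == name:
            return proc
    return None


def attach_by_name(name):
    # Imported lazily so pick_process() can be used without frida installed.
    import frida

    # Step 1: find the device.
    device = frida.get_usb_device(timeout=5)
    # Step 2: find the process on that device.
    proc = pick_process(device.enumerate_processes(), name)
    if proc is None:
        raise RuntimeError("process not found: " + name)
    # Step 3: attach to it and get a session to create scripts on.
    return device.attach(proc.pid)
```

Attaching by pid instead of by name makes a missing process fail with a clear error before anything is injected.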
Organize frida project
As you hook and modify more and more functions, the script grows and becomes hard to read. The natural fix is to split the code into modules, but how do you do that in frida?
You cannot simply write multiple JavaScript files as in a normal web project, because each script is injected into the process's memory on its own, so one script cannot import another.
Fortunately, someone has solved this problem: we can write TypeScript modules and compile them into a single JavaScript file.
Use this repository and split your code into TypeScript modules. The build step compiles them into a single _agent.js.
$ git clone https://github.com/oleavr/frida-agent-example.git
$ cd frida-agent-example/
$ npm install
With this, we can split the code across multiple files and keep the main logic in index.ts.
[!notice] The folder path must not contain spaces, or npm run build will fail.
Hook service binary
In the README of the above repository, the final command uses frida -f to spawn the process. This may work for APK applications, but if you want to hook a native service, spawning the process may not work.
So most of the time I use -n to attach to the already running process.
frida -U -l agent.js -n vendor.oplus.hardware.biometrics.face@1.0-service
However, you need to make sure the process shows up in frida-ps.
As for the injected script: to hook an APK and run Java code, you can use Java.perform as in the example above. What is the API for native binaries?
Here is the most trivial example of hooking a binary:
var func_address = Module.findBaseAddress("libc.so").add(0xdeadbeef);
Interceptor.attach(func_address, {
  onEnter: function(args) {
    console.log('entered function');
  }
});
When you hook a binary, the target function may not be implemented in the main executable. It could live in a shared library, which is why we need Module.findBaseAddress("libxxxx.so").add(0xdeadbeef): the base address of the library plus the offset of the function inside it.
Access registers (or handle indirect calls)
When analyzing binary files, one of the most annoying things is indirect calls. Static analysis alone often cannot tell you where a branch or jump instruction goes; you have to run the code.
In the previous example, we specified a function's start address and printed something before the function ran. In fact, this address does not have to be the start of a function.
For example, in the following snippet we want to know where the code branches to, so we need the value of x4.
00xxx9e4 a0 6b 40 f9 ldr param_1,[x29, #local_270]
00xxx9e8 00 18 40 f9 ldr param_1,[param_1, #0x30]
00xxx9ec 40 06 00 54 b.eq LAB_0035dab4
00xxx9f0 04 38 40 f9 ldr x4,[param_1, #0x70]
00xxx9f4 00 08 40 f9 ldr param_1,[param_1, #0x10]
00xxx9f8 80 00 3f d6 blr x4
We can hook at offset 9f4, which is after x4 is loaded and before the blr.
var func_address = Module.findBaseAddress("libx.so").add(0x9f4);
Interceptor.attach(func_address, {
  onEnter: function(args) {
    console.log(this.context.x4);
  }
});
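The printed x4 is an absolute address, which changes on every run. To compare it with the disassembly, it helps to turn it back into a module name plus offset. Here is a minimal sketch on the Python side, assuming the agent has also sent over the output of Process.enumerateModules() as JSON, where each entry has a name, a base (hex string) and a size. `resolve_target` is a hypothetical helper name.

```python
def resolve_target(address, modules):
    """Map an absolute address to (module_name, offset), or None if unmapped.

    `modules` is assumed to be a list of dicts such as
    {"name": "libx.so", "base": "0x7000000000", "size": 2097152}.
    """
    for mod in modules:
        base = int(mod["base"], 16)
        if base <= address < base + mod["size"]:
            return mod["name"], address - base
    return None


# Example with made-up numbers: the value of x4 as a hex string.
modules = [{"name": "libx.so", "base": "0x7000000000", "size": 0x200000}]
hit = resolve_target(int("0x700012345c", 16), modules)
if hit is not None:
    print("{}+{:#x}".format(*hit))
```

The resulting offset can then be looked up in the disassembler, after adding whatever image base the disassembler uses.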
Send result back
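Now we know how to pause a function and modify it. What's next? What if I don't want to modify anything and just want to test the function (fuzzing, for example)? I have a bunch of input test cases and only want to know the result of running each one. In that case, the agent can send the result back to the client with send:
onEnter: function(args) {
  // The first argument must be JSON-serializable.
  // The optional second argument is for raw bytes (e.g. an ArrayBuffer).
  send('tag', content);
}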
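On the Python side, whatever the agent passes to send arrives in the message callback as a dict with type 'send' and the value under 'payload'. A minimal sketch of a collector follows. `make_collector` is a hypothetical helper, and the payload shape in the usage note is only an assumption.

```python
def make_collector(results, errors):
    """Build an on_message callback that stores payloads instead of printing them."""
    def on_message(message, data):
        if message["type"] == "send":
            # `data` holds the optional raw bytes sent alongside the payload.
            results.append((message["payload"], data))
        elif message["type"] == "error":
            # Exceptions thrown inside the agent show up as 'error' messages.
            errors.append(message.get("description", str(message)))
    return on_message
```

It is registered like the handler in the first example, with script.on('message', make_collector(results, errors)). After a test case has run, results holds one entry per send call, for example ({'tag': 'case-1', 'content': 0}, None) if the agent sent an object of that shape.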
Parallel running
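# manager.py
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed
import tqdm
# ...
def worker(task):
    # Run one frida worker in its own process and wait for it to finish,
    # so that max_workers really limits how many run at the same time.
    return subprocess.run(['python', 'worker.py', task]).returncode
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(worker, task) for task in all_tasks]
    # Advance the progress bar as tasks complete, not as they are submitted.
    for _ in tqdm.tqdm(as_completed(futures), total=len(futures)):
        pass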
# worker.py
import argparse
import frida
# ...
def worker(param1, param2):
    device = frida.get_usb_device()
    # ...
    # put your frida logic here
    # ...
if __name__ == '__main__':
    # ...
    worker(args.param1, args.param2)  # ...
Data consistency issue
To avoid data consistency issues, each task runs in its own process and receives its input as arguments, and a Python thread lock guards the result writes so that one result cannot overwrite another.
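The locking part can be as small as the sketch below. `ResultWriter` and the tab-separated line format are my own illustration, not something frida provides. Note that a threading.Lock only protects threads inside one Python process, which fits a manager that collects results from a thread pool.

```python
import threading


class ResultWriter:
    """Append one line per result to a file, one writer at a time."""

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()

    def append(self, task, result):
        line = "{}\t{}\n".format(task, result)
        # Only one thread may be inside this block at a time, so lines
        # from different tasks are never interleaved or lost.
        with self._lock:
            with open(self._path, "a", encoding="utf-8") as f:
                f.write(line)
```

Each pool thread would call writer.append(task, result) once its worker process has finished.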
Early Tracing
https://github.com/frida/frida/issues/13
I did not find a good way to solve this issue. Instead, I ended up using frida.spawn to launch the process directly, so that the hooks are in place before it starts running.
If there are initialization functions, we can also invoke them directly through the NativeFunction API.
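The order of operations matters here: spawn leaves the process suspended, so the script has to be loaded before the process is resumed. A minimal sketch, assuming `device` is a frida device object such as the one returned by frida.get_usb_device(). `spawn_and_instrument` is a hypothetical helper name.

```python
def spawn_and_instrument(device, program, js_source, on_message):
    """Spawn `program` suspended, install the agent, then let it run."""
    pid = device.spawn([program])        # the new process starts suspended
    session = device.attach(pid)
    script = session.create_script(js_source)
    script.on("message", on_message)
    script.load()                        # hooks are installed at this point
    device.resume(pid)                   # only now does the target execute
    return pid, session, script
```

Whether spawning works at all depends on the target. As noted earlier, it may not work for a native service, so treat this as something to try rather than a guaranteed fix.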
Frida server dead
https://github.com/ViRb3/magisk-frida
On a rooted Android phone, we can install this Magisk module, which starts the frida server automatically. So if the phone restarts because of some frida error, or the frida server itself dies, the server is launched again and we can reconnect to it later.
A related issue is that the frida server often gets stuck. The process itself does not terminate, which makes the problem harder to detect.
The approach I used is to run a separate script that monitors the result file. If the result file gets no new content for a while, the frida server is likely stuck. Since the Magisk module is already installed, we just need to kill the server process and let it restart automatically.
#!/bin/bash
FILE_PATH="result.txt"
SLEEP_DURATION=60
LAST_LINE_COUNT=0
echo "[+] Start Monitoring result file"
while true; do
    CURRENT_LINE_COUNT=$(wc -l < "$FILE_PATH")
    echo "[+] Current Line Num: $CURRENT_LINE_COUNT"
    if [ "$CURRENT_LINE_COUNT" -eq "$LAST_LINE_COUNT" ]; then
        PIDS=$(adb shell "su -c 'ps -A | grep frida' | awk '{print \$2}'")
        for PID in $PIDS; do
            if [ -n "$PID" ]; then
                adb shell "su -c 'kill -9 $PID'"
                echo "[+] Long time no new result, killing frida: PID=$PID"
            fi
        done
    fi
    LAST_LINE_COUNT=$CURRENT_LINE_COUNT
    sleep $SLEEP_DURATION
done
Hook instruction
We have seen how to hook a function in Frida, but what if you want to read a register at some instruction inside a function, rather than at its start?
This is a bit tricky, because callbacks such as onEnter and onLeave assume that you are hooking the start of a function. In my experience, if you use them at an address that is not a function start, the stack gets corrupted and the program no longer runs correctly.
Fortunately, Frida provides a less well-known way to handle this case. Instead of a callbacks object, we can pass a plain function to Interceptor.attach. It is simply called when the instruction is reached, without any of the extra function-level handling.
Interceptor.attach(address, function (args) {
  console.log(this.context.x0);
});
CModule
Sometimes you need to do the work in lower-level code (C, for example). That is where CModule comes in: it lets you compile a small piece of C source code and inject the resulting binary directly into the process.
Note that the C source needs to include the frida-gum header files.
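As a rough illustration, the sketch below builds an agent whose onEnter is written in C and only counts how often a function is hit. It is kept as a Python string so that it fits the client scripts above. The module name and offset are hypothetical placeholders, `build_cmodule_agent` is my own helper, and the agent uses the same Module.findBaseAddress call as the earlier examples, which may need adjusting for your Frida version.

```python
AGENT_TEMPLATE = r"""
// Counter shared between JavaScript and C.
const hits = Memory.alloc(4);
hits.writeInt(0);

const cm = new CModule(`
#include <gum/guminterceptor.h>

extern volatile int hits;

void
onEnter (GumInvocationContext * ic)
{
  hits++;
}
`, { hits });

// Hypothetical target: the start of a function inside the module.
const target = Module.findBaseAddress("__MODULE__").add(__OFFSET__);
Interceptor.attach(target, cm);

// Report the counter to the client once per second.
setInterval(() => send({ hits: hits.readInt() }), 1000);
"""


def build_cmodule_agent(module_name, offset):
    """Fill the module name and offset placeholders into the agent source."""
    return (AGENT_TEMPLATE
            .replace("__MODULE__", module_name)
            .replace("__OFFSET__", hex(offset)))
```

The result is passed to create_script like any other agent source. Keeping the hot path in C avoids entering the JavaScript runtime on every call, which matters when the hooked function is called very often.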