Dynamic string deobfuscation on Android

Introduction

A couple of years ago, when I was analyzing Android malware at Threat Fabric, we ran into a time-consuming problem: obfuscated strings. A lot of malware used them, and most samples did so in a different way. Strings are very useful when analyzing malware because they reveal a lot about what is happening under the hood.

At first we wrote a custom plugin for our decompiler for every type of string obfuscation we encountered. However, this was very time-consuming, so I tried to come up with a more efficient solution. In this blog post I will go over my proposed solution, explain how it works, why I made the choices I did, and which challenges remain.

Problem definition

As briefly discussed in the introduction, our main problem is string obfuscation. Here I will explain the problem in more depth, so skip ahead if you are already familiar with it.

Malware authors, on Android or any other platform, will try their best to hide their intentions from the analysts who reverse engineer their code. Let's look at a real-life sample.

package com.threesuchm;

import android.app.admin.DeviceAdminReceiver;
import android.content.Context;
import android.content.Intent;

public class p044x extends DeviceAdminReceiver {
   public void onDisabled(Context var1, Intent var2) {
      double.ifdf(var1, class.fddo("8aa669cb198f9845042291e46b13a0c1"), false);
   }

   public void onEnabled(Context var1, Intent var2) {
      double.ifdf(var1, class.fddo("8aa669cb198f9845042291e46b13a0c1"), true);
   }

   public void onReceive(Context var1, Intent var2) {
      super.onReceive(var1, var2);
   }
}

Let's unpack this. What immediately catches the eye is the use of double: in Java, double is a primitive data type, yet here a rather strange method is called on it. Further analysis of the malware shows that the authors created their own class and named it double. They probably did this to trip up decompilers and more novice analysts.

A similar trick can be seen within the parentheses of the aforementioned method call. Suppose we want to know what the onEnabled method does. First, it calls the following method:

public static void ifdf(Context var0, String var1, Boolean var2) {
    var0.getSharedPreferences(class.fddo("83a276cc"), 0).edit().putBoolean(var1, var2).apply();
}

This method alters the Shared Preferences of the app. As you can see, it takes three arguments: a Context, a String and a Boolean. If we look at the call site again, we can see that the following is passed as the String:

class.fddo("8aa669cb198f9845042291e46b13a0c1")

The keyword class is used here, but this is again a class the authors made themselves, just like double. The signature of this method is:

public static String fddo(String var0)

In other words, it is a method which takes a String and returns a String. This is what we call string obfuscation: the actual String does not appear in the decompiled code but is computed at runtime. So if I want to know which String is computed there, I need to reverse engineer the method that computes it. These methods can be lengthy, and a single sample can contain several different ones. More importantly, I need to reverse engineer every obfuscation method for every sample I analyze. As there are countless ways of implementing string obfuscation, reverse engineering them one by one is a never-ending endeavour.
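To make this concrete, here is a minimal sketch of what such a routine could look like. It is not the fddo method from the sample, whose implementation is not shown in this post. The class name, the key and the hex-plus-XOR scheme are all assumptions chosen purely for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example; NOT the routine used by the sample above.
final class XorStringDemo {
    // Assumed single-byte key, chosen only for this illustration.
    private static final int KEY = 0x5A;

    // Encodes a plain string as the hex digits of its XOR-ed UTF-8 bytes.
    static String obfuscate(String plain) {
        byte[] bytes = plain.getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", (b ^ KEY) & 0xFF));
        }
        return hex.toString();
    }

    // Same shape as fddo: takes a String and returns a String.
    static String deobfuscate(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            int value = Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
            bytes[i] = (byte) (value ^ KEY);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Even a toy scheme like this hides every string literal from a static view of the code. An analyst has to either reimplement deobfuscate or let the app run it, and the latter is the approach taken in the rest of this post.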

SHA256 hash of the sample used:

b918f476abaf16b19b0115f22e85a0e2b5946e3b9cb386bf80d5785698472961

Architecture

Deobfuscation is achieved through code injection. We inject code into every app on the device so we can communicate with it. A server runs on the device and receives information from my decompiler, which runs on my computer. The decompiler first analyzes the code, extracts all necessary information and sends it to the server. The server forwards this information to the target app. The injected code inside the target app then uses this information to hook the deobfuscation method and call it with the provided arguments, which yields the deobfuscated strings. These are sent back to the server, which returns them to my decompiler, where they are shown in my view in place of the calls to the deobfuscation method.

Server

The server running on the Android device opens a socket and waits for a connection. My decompiler will connect to it and send it some information, like the following:

{
    "data": [
        {
            "static": true,
            "methodName": "test",
            "className": "ObfuscationMethods",
            "locationPackage": "nl.securify.stringobfuscationtest",
            "arguments": [
                {
                    "type": "Ljava/lang/String;",
                    "arg": "test"
                }
            ]
        }
    ],
    "manifestPackagename": "nl.securify.stringobfuscationtest"
}

This JSON contains all the information necessary to find the app nl.securify.stringobfuscationtest and call the test method in class ObfuscationMethods with a single String argument, "test". The server broadcasts this JSON to all the apps on the device using an Intent. The server also exposes an IntentService which receives the deobfuscated strings. Once our target app has sent the deobfuscated strings to the IntentService, the server returns them to my decompiler.
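A request of this shape could be assembled roughly as follows. This is only a sketch: the JSON field names are taken from the example above, while the class name RequestBuilder and its methods are hypothetical, and the hand-rolled escaping only covers quotes and backslashes. A real client would more likely use a JSON library.

```java
// Hypothetical helper; only the JSON field names come from the example above.
final class RequestBuilder {
    // Minimal escaping for backslashes and double quotes (assumption: no control characters).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a request for one static method that takes a single String argument.
    static String build(String manifestPackage, String locationPackage,
                        String className, String methodName, String arg) {
        return "{\"data\":[{"
                + "\"static\":true,"
                + "\"methodName\":" + quote(methodName) + ","
                + "\"className\":" + quote(className) + ","
                + "\"locationPackage\":" + quote(locationPackage) + ","
                + "\"arguments\":[{\"type\":\"Ljava/lang/String;\",\"arg\":" + quote(arg) + "}]"
                + "}],\"manifestPackagename\":" + quote(manifestPackage) + "}";
    }
}
```

Calling build with the package nl.securify.stringobfuscationtest for both package parameters, the class ObfuscationMethods, the method test and the argument test produces a compact version of the request shown above.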

Code injection

To achieve code injection we use the Xposed framework, which makes it relatively easy to inject code into an app. The injected code registers a Broadcast Receiver, so every app on the device now exposes a Broadcast Receiver which we can talk to! We have essentially created our own API inside all the other apps on our device.

Upon receiving a broadcast message, the injected code checks whether the server is requesting something from the app it is running in. If not, it stops. If the current app is the one the server is asking for, it reads the information in the message and calls the specified methods with the specified arguments. The results of these method calls are the deobfuscated strings. Once all the methods have been called, the injected code sends the deobfuscated strings to the server using an Intent. The response will look like this:

{
    "deobfuscated": [
        "Foobar"
    ]
}

Currently deobfuscation is limited to static methods only. Luckily, most string obfuscation solutions use static methods.
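The invocation step can be sketched with plain Java reflection. This is not the tool's actual code. The class StaticInvoker and its methods are hypothetical names, the sketch assumes the fully qualified class name has already been assembled from the request, and it only resolves object types and int from the type descriptors.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical sketch of the invocation step; names are not taken from the tool.
final class StaticInvoker {
    // Maps a type descriptor such as "Ljava/lang/String;" to a Class.
    // Assumption: only object types and int are handled in this sketch.
    static Class<?> resolveType(String descriptor, ClassLoader loader)
            throws ClassNotFoundException {
        if (descriptor.equals("I")) {
            return int.class;
        }
        if (descriptor.startsWith("L") && descriptor.endsWith(";")) {
            String name = descriptor.substring(1, descriptor.length() - 1).replace('/', '.');
            return Class.forName(name, false, loader);
        }
        throw new IllegalArgumentException("Unsupported descriptor: " + descriptor);
    }

    // Calls a static method and returns its result as a String.
    static String invoke(ClassLoader loader, String className, String methodName,
                         String[] descriptors, Object[] args) throws Exception {
        Class<?>[] types = new Class<?>[descriptors.length];
        for (int i = 0; i < descriptors.length; i++) {
            types[i] = resolveType(descriptors[i], loader);
        }
        Class<?> target = Class.forName(className, true, loader);
        Method method = target.getDeclaredMethod(methodName, types);
        if (!Modifier.isStatic(method.getModifiers())) {
            throw new IllegalArgumentException("Only static methods are supported");
        }
        method.setAccessible(true); // the target method may not be public
        return String.valueOf(method.invoke(null, args)); // null receiver: static call
    }
}
```

The null receiver passed to Method.invoke is what makes static methods the easy case: no instance of the obfuscated class has to be constructed or located first. Inside a hooked app the loader would presumably be the target app's own class loader, so that the malware's classes can be resolved.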

Client

Currently I have only implemented a client for the JEB decompiler. The plugin analyzes the currently opened class and searches for calls matching predefined signatures, e.g. a call to a static method which takes a String as input and returns a String as output. It creates a JSON payload like the one in the Server section, sends it to the server and waits for a response. Once the response is received, the strings show up in my view in place of the calls to the deobfuscation method.
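The signature check itself can be illustrated independently of any decompiler. The sketch below does not use the JEB API. CallSite is a hypothetical stand-in for whatever the decompiler reports about a method call, and the owner names in the comments are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a call site; this is not a JEB type.
final class CallSite {
    final String owner;      // e.g. "Lcom/example/Decoder;" (made-up name)
    final String name;       // method name
    final String descriptor; // e.g. "(Ljava/lang/String;)Ljava/lang/String;"
    final boolean isStatic;

    CallSite(String owner, String name, String descriptor, boolean isStatic) {
        this.owner = owner;
        this.name = name;
        this.descriptor = descriptor;
        this.isStatic = isStatic;
    }
}

final class SignatureFilter {
    // The example signature from the text: one String in, one String out.
    static final String STRING_TO_STRING = "(Ljava/lang/String;)Ljava/lang/String;";

    static boolean isCandidate(CallSite call) {
        return call.isStatic && STRING_TO_STRING.equals(call.descriptor);
    }

    // Keeps only the calls that look like string deobfuscation calls.
    static List<CallSite> candidates(List<CallSite> calls) {
        List<CallSite> result = new ArrayList<>();
        for (CallSite call : calls) {
            if (isCandidate(call)) {
                result.add(call);
            }
        }
        return result;
    }
}
```

A signature this broad also matches ordinary framework methods such as System.getProperty, so a real plugin would likely need additional filtering, for example on the owning class.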

As the protocol is simple, it should not be too complicated to add support for more decompilers. The biggest hurdle is getting the analysis right.

Considerations

Why dynamically?

Firstly, because it required less work. To me it seemed easier to use code injection as a means to deobfuscate strings than to write my own emulator or use someone else's (which were mostly buggy and hard to use at the time I developed this).

Why Xposed?

Another attractive tool would be Frida. Frida is a great tool which I use daily while pentesting mobile apps. However, at the time of developing this tool I was less acquainted with Frida than with Xposed. Also, in my experience Xposed has been more stable than Frida most of the time.

Challenges

Hookability

For the deobfuscation to succeed we need to be able to hook the app. However, this can prove to be a challenge.

One older malware sample I analyzed would check whether its C2 (Command & Control) server was up. If it was not, the app would quit, which made deobfuscating its strings significantly harder. I am fairly certain a solution to this exists; I just have not gotten around to researching it.

Lots of Android malware also disables its main component. This ensures the user does not see the app icon in the app drawer, but it also means that once the malware terminates itself, you cannot start it again by sending a normal Intent.

RASP solutions which detect and (technically) prevent hooking will also make it harder to use this solution for deobfuscation.

When to use?

This tool can be used when dealing with String obfuscation on the Java layer. Although not every type of obfuscation is supported, it would not be complicated to add support for more ways of String obfuscation on the Java layer.

Conclusion

There is definitely lots that can be improved upon, but it is a nice start. Whenever I have had to deal with string obfuscation on the Java layer, this tool has helped me deobfuscate most of them. I hope this tool can help others facing the same issues.

Thanks for reading.
