Analyzing GuLoader: How to Approach Deobfuscation of Complex Samples

Written by anyrun | Published 2023/06/22

TL;DR: This article focuses on static analysis, but if you want to analyze a GuLoader sample dynamically, you can use the ANY.RUN cloud malware sandbox. Visit our blog to find the sample we'll analyze, as well as unpacking instructions and a Ghidra script that partially automates much of what we’re going to cover.

As a malware researcher, you're often up against complex, heavily obfuscated samples. And GuLoader, the malware we're dealing with today, is a classic case in point.

Take a look at this pseudo-code, produced by decompiling its assembly code — it is ugly and unreadable.

Faced with a puzzle like this, you might find yourself at a loss. Where should you even begin? How do you tackle the analysis of this sample? Let's break it down.

In this article, then, we’ll explore strategies to deobfuscate such code, using GuLoader as a reference. You’ll learn about:

  • Common obfuscation tactics used by adversaries
  • How to defeat them
  • How to deobfuscate a GuLoader sample

The article is based on the GuLoader malware analysis previously published by ANY.RUN. Visit our blog to find the sample we'll analyze, as well as unpacking instructions and a Ghidra script that partially automates much of what we’re going to cover from this point on.

This article focuses on static analysis. But if you want to analyze a GuLoader sample dynamically, you can use the ANY.RUN cloud malware sandbox.

Apply for a free 14-day trial of the Enterprise plan using your business email. Take advantage of the extended lifetime of VM instances, unlimited tasks, and Windows versions 7 through 11.

Dynamic analysis lets you see how malware works in the real world by linking its behavior to its technical setup. It's a way to check its actions across different system setups and collect IOCs.

Without further delay, let's dive in.

Identifying Obfuscation Methods

After unpacking the sample, we begin with a manual analysis of the shellcode and quickly realize that it’s obfuscated. Studying the code enables us to group the obfuscation techniques GuLoader employs into categories:

  • XMM instructions
  • Unconditional JMP instructions
  • Junk instructions
  • Fake comparison instructions
  • Fake PUSHAD instructions
  • Fake PUSH instructions
  • Opaque predicates
  • Arithmetic expressions

This is a good time to stop, analyze the obfuscation methods being used, and develop a deobfuscation strategy.

For example, in our case, most of these techniques introduce code that doesn't alter the final execution outcome. Therefore, we can often safely “NOP” them to improve readability. But proceed with caution — not all obfuscated code is irrelevant to the program's operation, as we’ll soon discover.

Now, let's examine these obfuscation techniques individually, and see how to defeat them.

1. XMM instructions

The code is littered with many XMM instructions. These appear disordered and add complexity to the analysis process. We encounter them right from the first byte of the unpacked shellcode.

Note that many emulation engines stumble over them due to the lack of default support. We put Angr, Triton, and Ghidra's embedded engines to the test — all fell short.

How we dealt with it

In GuLoader's case, XMM instructions don't actually impact the intended behavior of the code. You’ll encounter similar obfuscation methods in a lot of malware. Consequently, we can safely replace all XMM instructions with "NOP," as shown in the following table:

Here's how the result looks in the Ghidra disassembler:
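The patching step itself can be sketched in Python. This is a minimal illustration, not the Ghidra script mentioned earlier: it assumes the instructions have already been decoded by a disassembler into hypothetical `(offset, size, text)` tuples, and the function name `nop_xmm_instructions` and the three-instruction listing are made up for the example.

```python
import re

NOP = 0x90  # single-byte x86 NOP opcode

# Matches an XMM register operand such as "xmm0" or "XMM15".
XMM_OPERAND = re.compile(r"\bxmm\d+\b", re.IGNORECASE)


def nop_xmm_instructions(code, instructions):
    """Return a copy of `code` with every XMM instruction overwritten by NOPs.

    `instructions` is an iterable of (offset, size, text) tuples, where
    `text` is the disassembly of code[offset:offset + size]. This layout
    is an assumption made for the sketch.
    """
    patched = bytearray(code)
    for offset, size, text in instructions:
        if XMM_OPERAND.search(text):
            # Fill the whole instruction so later offsets stay valid.
            patched[offset:offset + size] = bytes([NOP]) * size
    return bytes(patched)


# Hypothetical listing: two XMM instructions around one ordinary MOV.
code = bytes.fromhex("660fefc1" "b805000000" "660ffcd3")
listing = [
    (0, 4, "PXOR XMM0,XMM1"),
    (4, 5, "MOV EAX,0x5"),
    (9, 4, "PADDB XMM2,XMM3"),
]
patched = nop_xmm_instructions(code, listing)
print(patched.hex())
```

Overwriting each instruction with the same number of single-byte NOPs keeps every other offset and jump target unchanged, which is why the sketch patches in place rather than deleting bytes.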

2. Unconditional JMP instructions

Unconditional JMP instructions segment the code into smaller chunks. This method is frequently used to avoid detection by antivirus software and other security tools. Additionally, it can make the analysts' job more time-consuming and frustrating, as they must jump between these blocks, especially when dealing with a large amount of code. GuLoader and other malware commonly employ this technique.

How we dealt with it

This obfuscation method is quite easy to defeat. The decompiler often concatenates these blocks successfully, so the decompiled code stays readable even with the unconditional jumps present. As such, we can leave the small blocks as they are without needing to merge them.

3. Junk instructions

Junk assembly instructions often come into play as an extra layer of obfuscation. These instructions do not perform any tangible function, typically leaving the register values, execution flow, or memory unaltered.

You will encounter these within GuLoader as well.

Look out for instructions that perform no action ("NOP", "FNOP") and those that shift or rotate by zero bits ("SHL reg, 0"; "ROL reg, 0"). Other non-impactful instructions like "OR reg, 0", "XOR reg, 0", "CLD", "WAIT" are also present.
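A simple text matcher is enough to flag the junk instructions listed above. The sketch below is an illustration under assumptions: it works on disassembly strings in the common "MNEMONIC reg, imm" form, and the names `JUNK_PATTERNS` and `is_junk` are hypothetical. Note that "OR reg, 0" and "XOR reg, 0" still update the flags, so treating them as junk assumes nothing later reads those flags.

```python
import re

# Patterns for the junk instructions named in the text. The operand
# format ("reg, imm" with optional 0x prefix) is an assumption.
JUNK_PATTERNS = [
    # Instructions with no operands that have no useful effect here.
    re.compile(r"^(NOP|FNOP|CLD|WAIT)$", re.IGNORECASE),
    # Shift, rotate, OR, or XOR with an immediate of zero.
    re.compile(r"^(SHL|ROL|OR|XOR)\s+\w+\s*,\s*(0x)?0+$", re.IGNORECASE),
]


def is_junk(text):
    """Return True if the disassembly line matches a known junk pattern."""
    normalized = " ".join(text.split())  # collapse repeated whitespace
    return any(p.match(normalized) for p in JUNK_PATTERNS)


for line in ["FNOP", "ROL ECX, 0x0", "XOR EAX, EAX", "SHL EAX, 0x10"]:
    print(line, "->", is_junk(line))
```

"XOR EAX, EAX" and "SHL EAX, 0x10" are deliberately rejected: the first zeroes a register and the second shifts by a non-zero amount, so neither is a no-op.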

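How we dealt with it

Junk instructions are the simplest case: since they change nothing, we replace them with "NOP".

4. Fake comparison instructions

GuLoader also inserts comparison instructions whose results are never used by any later instruction.

How we dealt with it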

Addressing fake comparison instructions is more challenging than just replacing junk ones with "NOP". We can't remove all comparison instructions as some are necessary for the correct code function. One way to tackle this is to “mark” all comparison instructions we encounter. If no instructions are found that use the comparison's result, it's safe to NOP it. If we find a conditional jump or similar, we unmark the comparison to avoid removal.
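The mark-and-unmark idea can be sketched as a single pass over a list of mnemonics. This is a simplified illustration, not the original script: it assumes a linear instruction listing, ignores other flag-writing instructions such as ADD, and conservatively keeps a comparison that is still pending when the listing ends. The names `reads_flags` and `find_fake_comparisons` are hypothetical.

```python
COMPARISONS = {"CMP", "TEST"}
OTHER_FLAG_READERS = {"ADC", "SBB", "PUSHF", "PUSHFD", "LAHF"}


def reads_flags(mnemonic):
    """Return True for instructions that consume the flags register."""
    m = mnemonic.upper()
    if m == "JMP":
        return False  # unconditional jumps ignore the flags
    return (
        m.startswith("J")        # conditional jumps (conservative for JCXZ/JECXZ)
        or m.startswith("SET")   # SETcc
        or m.startswith("CMOV")  # CMOVcc
        or m in OTHER_FLAG_READERS
    )


def find_fake_comparisons(mnemonics):
    """Return indices of comparisons that are overwritten before being used."""
    fake = []
    pending = None  # index of the currently marked comparison, if any
    for i, mnemonic in enumerate(mnemonics):
        m = mnemonic.upper()
        if reads_flags(m):
            pending = None  # result is used: unmark, keep the comparison
        elif m in COMPARISONS:
            if pending is not None:
                fake.append(pending)  # earlier result was never read
            pending = i
    return fake


listing = ["CMP", "MOV", "TEST", "CMP", "JNZ", "CMP"]
print(find_fake_comparisons(listing))
```

In the sample listing, the comparisons at indices 0 and 2 are reported because a later comparison overwrites their flags before anything reads them, while the one at index 3 is kept because "JNZ" depends on it.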

The following table shows an example where all comparison instructions except "CMP EDX,0x0" were selectively replaced with NOP:

5. Fake PUSHAD instructions

GuLoader also employs fake "PUSHAD" instructions, each coupled with a matching "POPAD". The instructions between the pair can temporarily modify register values, but the "POPAD" restores the originals, nullifying those changes.

How we dealt with it

Our research showed that all "PUSHAD" instructions in GuLoader are extraneous. So, we address this by replacing "PUSHAD", "POPAD", and the intermediate instructions with NOP:

However, not all "POPAD" instructions in GuLoader are junk. We leave those without a corresponding "PUSHAD" untouched.
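A rough sketch of this pairing logic follows. It assumes a linear list of mnemonics with no nested "PUSHAD" blocks, and the function names `pushad_regions` and `nop_regions` are hypothetical. A "POPAD" that appears while no "PUSHAD" is open is simply skipped, matching the rule above.

```python
def pushad_regions(mnemonics):
    """Return inclusive (start, end) index pairs for each PUSHAD..POPAD block."""
    regions = []
    start = None  # index of the currently open PUSHAD, if any
    for i, mnemonic in enumerate(mnemonics):
        m = mnemonic.upper()
        if m == "PUSHAD" and start is None:
            start = i
        elif m == "POPAD" and start is not None:
            regions.append((start, i))
            start = None
        # A POPAD with no open PUSHAD falls through and is left untouched.
    return regions


def nop_regions(mnemonics, regions):
    """Return a copy of the listing with every region replaced by NOPs."""
    out = list(mnemonics)
    for start, end in regions:
        for i in range(start, end + 1):
            out[i] = "NOP"
    return out


listing = ["MOV", "PUSHAD", "MOV", "CMP", "JNZ", "POPAD", "POPAD", "ADD"]
regions = pushad_regions(listing)
print(regions)
print(nop_regions(listing, regions))
```

In the sample listing, the second "POPAD" has no matching "PUSHAD" and therefore survives the pass.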

6. Fake PUSH instructions

Another obfuscation technique similar to the previous one is the use of fake "PUSH" instructions. These instructions push a value onto the stack only to pop it off immediately.

An example is the inclusion of a "PUSH SS" instruction, possibly followed by instructions modifying a register or memory location. However, the subsequent "POP SS" restores the stack pointer to its initial value.

How we dealt with it

Defeating fake PUSH instructions resembles the process for fake PUSHAD, but it's crucial to leave the non-pushed registers unaltered.

7. Opaque predicates

Opaque predicates are conditional statements that always evaluate to the same result, either true or false, yet are tough to analyze or predict. GuLoader's code contains them, and they make the logic harder to follow.

For instance, a pair of instructions like “MOV BL, 0xB6” and “CMP BL, 0xB6” could be followed by a conditional jump like “JNZ ADDR”. Because the compared value equals the moved value, the comparison always sets the zero flag, so the “JNZ” condition is always false. The jump is never taken and serves only to confuse the analyst.
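To see why the outcome is fixed, the example can be worked through in a few lines of Python. This is only an illustration of the flag arithmetic; the helper name `cmp_sets_zero_flag` is hypothetical.

```python
def cmp_sets_zero_flag(a, b, bits=8):
    """Model the zero flag after `CMP a, b` on operands of the given width."""
    mask = (1 << bits) - 1
    return ((a - b) & mask) == 0


bl = 0xB6                            # MOV BL, 0xB6
zf = cmp_sets_zero_flag(bl, 0xB6)    # CMP BL, 0xB6
jnz_taken = not zf                   # JNZ jumps only when the zero flag is clear

print("ZF set:", zf)
print("JNZ taken:", jnz_taken)
```

Since both operands are the same constant, the result does not depend on any runtime state, which is what makes the predicate opaque rather than a real branch.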

How we dealt with it

Overcoming opaque predicates can seem challenging due to the requirement of "predicting" the jump condition. However, our situation is more straightforward as all opaque predicates fall within the "PUSHAD" and "POPAD" blocks. Therefore, we simply replace all predicates between these instructions with NOP.

8. Arithmetic expressions

Obfuscated arithmetic expressions are among the more interesting techniques used by GuLoader. They make it tougher to understand the actual operations performed. These expressions incorporate arithmetic actions like addition or subtraction. Sometimes, they’re mixed with other obfuscations like fake comparisons, opaque predicates, and junk instructions.

One example is moving a constant value into a register and executing arithmetic operations:

Another instance is pushing a constant value onto the stack and performing calculations on the memory:

How we dealt with it

To deobfuscate GuLoader's arithmetic expressions, we adopt an approach similar to handling fake comparisons. We mark all "MOV" instructions where the second argument is a scalar value, and all "PUSH" instructions where the argument is scalar. On encountering an arithmetic operation, we update the constant value in the first instruction and replace the current one with NOP. This way, the first instruction has the result, and subsequent arithmetic instructions are replaced by NOP.

Below are examples with optimized “MOV” and “PUSH” operations:

Be cautious with operand sizes. Preserving the correct size during operations is crucial.
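The folding approach, including the operand-size masking, can be sketched as follows. This is a minimal illustration under assumptions: instructions are modeled as hypothetical `(mnemonic, dest, imm)` tuples, only "ADD" and "SUB" with immediate operands are folded, and the sketch does not check whether something reads the register between the "MOV" and the arithmetic. The name `fold_constants` is made up for the example.

```python
MASKS = {8: 0xFF, 16: 0xFFFF, 32: 0xFFFFFFFF}

OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
}


def fold_constants(instructions, bits=32):
    """Fold `MOV dest, imm` plus later ADD/SUB on dest into a single MOV.

    `instructions` is a list of (mnemonic, dest, imm) tuples; this
    representation is an assumption made for the sketch.
    """
    mask = MASKS[bits]
    out = []
    marked = {}  # dest -> index in `out` of the marked MOV
    for mnemonic, dest, imm in instructions:
        if mnemonic == "MOV":
            marked[dest] = len(out)
            out.append((mnemonic, dest, imm & mask))
        elif mnemonic in OPS and dest in marked:
            index = marked[dest]
            value = out[index][2]
            # Update the constant in the first instruction, truncated
            # to the operand size, and NOP the arithmetic instruction.
            out[index] = ("MOV", dest, OPS[mnemonic](value, imm) & mask)
            out.append(("NOP", None, None))
        else:
            marked.pop(dest, None)  # dest no longer holds a known constant
            out.append((mnemonic, dest, imm))
    return out


wide = fold_constants([("MOV", "EAX", 0x1000), ("ADD", "EAX", 0x234), ("SUB", "EAX", 0x34)])
narrow = fold_constants([("MOV", "BL", 0xF0), ("ADD", "BL", 0x20)], bits=8)
print(wide)
print(narrow)
```

The 8-bit case shows why the size matters: 0xF0 + 0x20 overflows a byte, so the folded constant has to be truncated to 0x10 rather than stored as 0x110.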

Wrapping up our analysis

In this article from ANY.RUN experts, we used GuLoader as a real-world example to highlight typical deobfuscation strategies. We started by analyzing the code, then grouped similar obfuscation tactics, and finally devised a way to tackle each one.

But it's crucial to understand that these techniques aren't a one-size-fits-all solution. Each malware sample may present its own obfuscation strategies that need their own countermeasures.

Our discussion underlines the significance of breaking down a complex task like deobfuscation into clear, manageable steps. Remember that successfully deobfuscating any malware involves detailed analysis, spotting obfuscation patterns, and creating optimized strategies.


Written by anyrun | ANY.RUN is an online interactive cloud-based sandbox for malware analysis.
Published by HackerNoon on 2023/06/22