Handling Deserialization Vulnerabilities in Java Applications

Java has been around for a long time and remains relevant today. Many well-known firms and open-source projects rely heavily on it, and it is without question one of the most popular programming languages.

But why is that?

Java performs admirably in the majority of situations. Its object-oriented nature, efficient memory allocation and deallocation, and standard syntax all contribute to this. So, does this mean Java is the "holy grail" of programming languages?

Of course not! Just as everything has its benefits and drawbacks, Java has its own set of problems. As a result, developers frequently choose a hybrid solution, in which they combine several technologies that perform well.

When it comes to vulnerabilities, it's crucial to understand how Java objects are stored and accessed. As we all know, Java is an object-oriented programming language, which means that classes are instantiated as objects, and a program works with the methods and data of those objects.

Getters and setters are two popular kinds of methods used to enforce data encapsulation in a program, which is one of Java's most important features when it comes to security. However, objects exist only in memory while the program is running, which means that their state is transient and is lost when the program exits.

This is a problem when building applications or systems that must persist information across sessions. There has to be some mechanism to store and retrieve that information. In Java, this mechanism is serialization and deserialization.

Serialization, Deserialization, and Byte Stream

As mentioned earlier, Java extensively uses objects to create, access, and manipulate class data. However, an object is removed by the garbage collector once it is no longer referenced.

Suppose this object needs to be stored on a disk or transmitted over a network. It must first be converted into a format suitable for storage or transmission: a byte stream.

So putting it together, serialization is the process of converting an object (an instance of a class) into a byte stream. This byte stream does not contain the class's actual code. Instead, it contains a description of the class and the values of the object's fields, encoded as bytes. To be transformed into a byte stream this way, the class of the instance must implement the interface 'Serializable'.

Below is a snippet showing the serialization process:
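
A minimal sketch of that process follows. The fields of the "Demo" class and the helper class SerializeDemo are assumptions made for illustration; only the class name "Demo", the 'Serializable' interface, writeObject, and the file name file.ser come from the text.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class for illustration; the fields are assumptions.
class Demo implements Serializable {
    private static final long serialVersionUID = 1L;

    int id;
    String name;

    Demo(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class SerializeDemo {
    // Converts the object into a byte stream and writes it to the named file.
    static void serialize(Demo object, String fileName) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(fileName))) {
            out.writeObject(object);
        }
    }

    public static void main(String[] args) throws IOException {
        serialize(new Demo(1, "example"), "file.ser");
    }
}
```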

As we see above, the class "Demo" implements the interface 'Serializable'. Further, we call the writeObject method on the ObjectOutputStream instance and pass the object to be serialized as an argument.

As you might have guessed, deserialization is just the opposite process, where we reconstruct the original object from the byte stream. Deserialization does not call the class's constructor to set the values. Rather, it allocates a new object and uses reflection to write the data to its fields (just as serialization uses reflection to read all the data that needs to be serialized).

Refer to the below snippet to understand the deserialization process:
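
A minimal sketch of the round trip, under the same assumptions as before: the fields of "Demo" and the helper class DeserializeDemo are hypothetical, and the object is written first so there is a file to read back.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class for illustration; the fields are assumptions.
class Demo implements Serializable {
    private static final long serialVersionUID = 1L;

    int id;
    String name;

    Demo(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class DeserializeDemo {
    // Writes the object to a file so that there is something to read back.
    static void serialize(Demo object, String fileName) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(fileName))) {
            out.writeObject(object);
        }
    }

    // Reads the byte stream from the file and rebuilds the object.
    static Demo deserialize(String fileName)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(fileName))) {
            return (Demo) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        serialize(new Demo(1, "example"), "file.ser");
        Demo restored = deserialize("file.ser");
        System.out.println(restored.id + " " + restored.name); // prints "1 example"
    }
}
```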

Here, we use the readObject method in contrast to writeObject (used during serialization).

Before we move on to the vulnerability in deserialization, let's discuss a tool that can help you recognize the vulnerability. After all, you can't fix a problem if you don't know it exists!

WhiteSource Cure is a free remediation tool for custom code that puts security first. Its aim is to help you write code that is as free from vulnerabilities as possible, and it integrates right into your IDE.

After a vulnerable piece of code is identified and marked for review, the tool proposes a remediation that makes the code less vulnerable. It also generates a detailed analysis of what was wrong with the existing code and how the remediation fixes the flaw. This is possible through static code analysis.

Let's see how remediation helps you write code that is resistant to SQL injection attacks:

Below are two snippets of a query statement before and after remediation:
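
As a rough sketch of the kind of query that gets flagged (this is not the tool's actual output; the users table, the name column, and the class and method names are hypothetical), the vulnerable version concatenates user input straight into the SQL text:

```java
class UnsafeQuery {
    // Vulnerable on purpose: the input becomes part of the SQL statement itself.
    static String buildQuery(String userName) {
        return "SELECT * FROM users WHERE name = '" + userName + "'";
    }

    public static void main(String[] args) {
        // A regular value produces the intended query.
        System.out.println(buildQuery("alice"));
        // A crafted value changes the meaning of the WHERE clause.
        System.out.println(buildQuery("x' OR '1'='1"));
    }
}
```

With the second input, the WHERE clause becomes name = 'x' OR '1'='1', which matches every row.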

After remediation:
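
Again as a sketch rather than the tool's real output, a remediated version could validate the input and bind it as a parameter with the standard JDBC PreparedStatement. The validation rule, the users table, and all names here are assumptions chosen for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.regex.Pattern;

class SafeQuery {
    // Hypothetical rule: a name is 1 to 32 letters, digits, or underscores.
    private static final Pattern VALID_NAME =
        Pattern.compile("[A-Za-z0-9_]{1,32}");

    static boolean isValidName(String userName) {
        return userName != null && VALID_NAME.matcher(userName).matches();
    }

    // Bad user input should always be sanitized
    static boolean userExists(Connection connection, String userName)
            throws SQLException {
        if (!isValidName(userName)) {
            throw new IllegalArgumentException("Invalid user name");
        }
        // The placeholder keeps the input out of the SQL text entirely.
        String sql = "SELECT 1 FROM users WHERE name = ?";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, userName);
            try (ResultSet rows = statement.executeQuery()) {
                return rows.next();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidName("alice"));        // prints "true"
        System.out.println(isValidName("x' OR '1'='1")); // prints "false"
    }
}
```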

If we look closely, we see a slight difference in the query after remediation. The input is no longer concatenated directly into the query string, and it is first checked for validity (whether it's a regular string or contains SQL).

There's also an additional comment that says, "Bad user input should always be sanitized", which is exactly what the remediation does. For the complete code, the detailed vulnerability explanation, and the remediation, head over to WhiteSource.

This security remediation tool is not limited to SQL injection. It also covers other kinds of vulnerabilities, such as XSS and unsafe deserialization, which we'll discuss next.

The Problem with Deserialization and Workarounds

So far, we have seen how serialization and deserialization create a medium to store and fetch Java objects. At the same time, though, it is important to know that there exists a potential exposure in deserialization.

A Java deserialization vulnerability arises when a malicious user supplies a modified serialized object to a system, resulting in the system or its data being compromised.

In the worst case, deserializing such an object can lead to arbitrary code execution. The root cause of this kind of exploit is how the deserialization process actually works.

As discussed above, Java doesn't call the class constructor during deserialization but instead uses reflection to populate the fields of a newly allocated object. Since the constructor is not invoked, any validation done in the constructor is also skipped, which gives an intruder a way to introduce values the application would otherwise reject.
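
To make the skipped constructor concrete, here is a small sketch. The Account class, its balance field, and the constructorCalls counter are hypothetical names introduced only for this illustration; the round trip happens in memory rather than through a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class whose constructor validates its argument.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    // Static fields are not serialized; this only counts constructor runs.
    static int constructorCalls = 0;

    final int balance;

    Account(int balance) {
        constructorCalls++;
        if (balance < 0) {
            throw new IllegalArgumentException("balance must not be negative");
        }
        this.balance = balance;
    }
}

class ConstructorSkipDemo {
    static byte[] toBytes(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account original = new Account(100);
        Account copy = (Account) fromBytes(toBytes(original));
        System.out.println(copy.balance);             // prints "100"
        System.out.println(Account.constructorCalls); // prints "1"
    }
}
```

The counter stays at 1 because only the original object went through the constructor; the copy was populated directly from the stream, so the negative-balance check never ran for it.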

As a simple example, consider the deserialization code shown above. During deserialization, the stream is read from a file (file.ser). Manipulating the byte stream essentially means tampering with the serialized file to change or add values. When the manipulated file is then read, the tampered values end up in the program's objects.
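
The effect of such tampering can be sketched in memory. The Account class and the tamperLastInt helper are hypothetical. The sketch relies on one assumption worth stating: for a class with a single int field and default serialization, that field's value occupies the last four bytes of the stream. That holds for this particular class but is not a general rule.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

// Hypothetical class: the constructor refuses negative balances.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    final int balance;

    Account(int balance) {
        if (balance < 0) {
            throw new IllegalArgumentException("balance must not be negative");
        }
        this.balance = balance;
    }
}

class TamperDemo {
    static byte[] serialize(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    // Overwrites the trailing four bytes (the int field, big-endian)
    // in a copy of the stream. Assumes the layout described above.
    static byte[] tamperLastInt(byte[] stream, int newValue) {
        byte[] copy = stream.clone();
        ByteBuffer.wrap(copy, copy.length - 4, 4).putInt(newValue);
        return copy;
    }

    public static void main(String[] args) throws Exception {
        byte[] honest = serialize(new Account(100));
        byte[] tampered = tamperLastInt(honest, -500);
        Account forged = (Account) deserialize(tampered);
        System.out.println(forged.balance); // prints "-500"
    }
}
```

The forged object holds a negative balance even though the constructor would have rejected it, because the value came from the stream and not from a constructor call.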

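While this is already harmful, things get worse with gadget chains. A gadget is a class already available to the application whose methods run automatically during deserialization. By choosing field values carefully, an attacker can chain such gadgets together and achieve remote code execution.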
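
A deliberately harmless sketch of the underlying mechanism: the Gadget and Expected classes are hypothetical, and the "side effect" is just a flag. It shows that a class's readObject method runs while the stream is being read, before the receiving code gets a chance to check or cast the result.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a gadget: its readObject has a side effect.
class Gadget implements Serializable {
    private static final long serialVersionUID = 1L;

    static boolean sideEffectRan = false;

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // A real gadget might touch files, the network, or other objects here.
        sideEffectRan = true;
    }
}

// The type the receiving code actually expects.
class Expected implements Serializable {
    private static final long serialVersionUID = 1L;
}

class GadgetDemo {
    static byte[] toBytes(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    // Returns true if the bytes held an Expected object, false otherwise.
    static boolean receive(byte[] untrusted)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(untrusted))) {
            // The cast is evaluated only after readObject has finished.
            Expected value = (Expected) in.readObject();
            return value != null;
        } catch (ClassCastException wrongType) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        boolean accepted = receive(toBytes(new Gadget()));
        System.out.println(accepted);             // prints "false"
        System.out.println(Gadget.sideEffectRan); // prints "true"
    }
}
```

The receiver rejects the object, yet the gadget's code has already run by then. That is why type checks after readObject returns are not a sufficient defense.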

To safeguard against these vulnerabilities, several approaches can be used. A few of them are:

- Apply basic input sanitization when processing objects from a deserialized byte stream.
- Allow only specified types (classes) of objects to be deserialized. This is another important part of mitigating unsafe deserialization attacks: it reduces uncertainty in your application and helps avoid application crashes or DoS attacks.
- Enforce such restrictions with the whitelist approach, which is safer because the program only deserializes objects that belong to a pre-approved set of classes.
- Lastly, the surest way to tackle such vulnerabilities is to avoid serialization and deserialization in the application altogether. Re-architecting the application is admittedly a long and difficult process, but it is the preferred option if you're risk-averse when it comes to security.
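
One common way to implement the whitelist approach is to subclass ObjectInputStream and override resolveClass, so that a class is rejected before any of its data is read. The sketch below assumes hypothetical names (AllowListObjectInputStream, Demo, Other, AllowListDemo). Since Java 9, the JDK also provides java.io.ObjectInputFilter for the same purpose, which may be the better fit on newer runtimes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.Collections;
import java.util.Set;

// Rejects any class whose name is not in the pre-approved set.
class AllowListObjectInputStream extends ObjectInputStream {
    private final Set<String> allowedClassNames;

    AllowListObjectInputStream(InputStream in, Set<String> allowedClassNames)
            throws IOException {
        super(in);
        this.allowedClassNames = allowedClassNames;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        if (!allowedClassNames.contains(desc.getName())) {
            throw new InvalidClassException(
                desc.getName(), "class is not on the allow-list");
        }
        return super.resolveClass(desc);
    }
}

// Hypothetical approved class.
class Demo implements Serializable {
    private static final long serialVersionUID = 1L;
    int id;
    String name;

    Demo(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical class that is serializable but not approved.
class Other implements Serializable {
    private static final long serialVersionUID = 1L;
}

class AllowListDemo {
    static byte[] toBytes(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    // Deserializes the bytes, permitting only the Demo class.
    static Object readAllowed(byte[] bytes)
            throws IOException, ClassNotFoundException {
        Set<String> allowed = Collections.singleton(Demo.class.getName());
        try (ObjectInputStream in = new AllowListObjectInputStream(
                 new ByteArrayInputStream(bytes), allowed)) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Demo demo = (Demo) readAllowed(toBytes(new Demo(1, "example")));
        System.out.println(demo.id + " " + demo.name); // prints "1 example"
        try {
            readAllowed(toBytes(new Other()));
        } catch (InvalidClassException rejected) {
            System.out.println("rejected: " + rejected.classname);
        }
    }
}
```

Note that an approved class with object-typed fields would need the classes of those fields on the list as well. Demo only needs itself because its fields are an int and a String, neither of which goes through resolveClass.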

Conclusion

Whatever technique you use to prevent such attacks and patch vulnerabilities, the bottom line is that you should never trust input, even if it appears to come from a legitimate source or from an application rather than a user. Running basic sanitization checks before processing any input, with the help of tools like WhiteSource Cure, can help prevent serious exploitation.