Scala Object Serialization for MapR-DB

Written by anicolaspp | Published 2018/12/20
Tech Story Tags: scala | mapr | big-data | programming | coding


Previously, we discussed some of the advantages and features that come with MapR-DB. This time, however, we are going to get our hands dirty using this enterprise-grade database.

When using MapR-DB, it is common practice to serialize and deserialize our business objects (commonly known as POJOs) to and from JSON all the time, since MapR-DB stores its data in JSON format. These operations are frequent enough that it is worth looking at them from Scala's point of view.

MapR-DB Document API

There are a series of steps to be followed to create new objects and to insert them into MapR-DB. Let’s look at the typical workflow.
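A minimal sketch of that workflow might look like the following. The table path and field names are of my own choosing, purely for illustration:

```scala
import org.ojai.store.{Connection, DocumentStore, DriverManager}

object InsertExample extends App {
  // Get a connection to MapR-DB through the OJAI driver
  val connection: Connection = DriverManager.getConnection("ojai:mapr:")

  // Open the store that holds our documents
  val store: DocumentStore = connection.getStore("/user/mapr/tables/links")

  // Build the document manually through the fluent API
  val document = connection
    .newDocument()
    .set("_id", "id-1")
    .set("name", "MapR")
    .set("url", "https://mapr.com")

  store.insertOrReplace(document)

  store.close()
  connection.close()
}
```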

These are the basic steps to insert into MapR-DB. This snippet can be extended for more complex use cases, but essentially, most of them will look very similar to this one.

There is one issue that can be easily recognized here. The way we create the document object through the fluent API is far from convenient. Normally, we would like to pass a POJO instead of building the document manually.

In Java, we could do the following.
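A sketch of that approach, using Lombok to cut down on the bean boilerplate (again, class and field names are illustrative):

```java
import lombok.AllArgsConstructor;
import lombok.Data;
import org.ojai.Document;
import org.ojai.store.Connection;
import org.ojai.store.DocumentStore;
import org.ojai.store.DriverManager;

public class InsertLink {

    // Lombok generates getters, setters, equals/hashCode and toString,
    // turning Link into a valid Java Bean
    @Data
    @AllArgsConstructor
    public static class Link {
        private String _id;
        private String name;
        private String url;
    }

    public static void main(String[] args) {
        Connection connection = DriverManager.getConnection("ojai:mapr:");
        DocumentStore store = connection.getStore("/user/mapr/tables/links");

        Link link = new Link("id-1", "MapR", "https://mapr.com");

        // newDocument(bean) builds the document from the bean's properties
        Document document = connection.newDocument(link);

        store.insertOrReplace(document);

        store.close();
        connection.close();
    }
}
```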

In the code above, we can see that our class Link is used to create the document that will be saved to the database. MapR-DB will use the Java Bean to create the document object.

Now, the problem becomes a little more tedious when using Scala, which should be your language of choice, anyway.

The Scala Issue

Using Scala, we could also use Java Bean to create the desired objects as in Java, yet other problems quickly arise. Let’s see the same example we used before, but this time in Scala.
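A sketch of the Scala attempt, using `@BeanProperty` to expose bean-style accessors on a case class:

```scala
import scala.beans.BeanProperty

import org.ojai.store.DriverManager

case class Link(
  @BeanProperty var _id: String,
  @BeanProperty var name: String,
  @BeanProperty var url: String
)

object ScalaInsert extends App {
  val connection = DriverManager.getConnection("ojai:mapr:")
  val store      = connection.getStore("/user/mapr/tables/links")

  val link = Link("id-1", "MapR", "https://mapr.com")

  // This is where things break down: the _id field does not yield
  // a valid bean property, as discussed below
  val document = connection.newDocument(link)

  store.insertOrReplace(document)
}
```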

If you try this out, you will discover that the object link cannot be converted to a Java Bean because the field _id starts with an underscore. This might look small, but every document inserted into MapR-DB must have the field _id, turning this initial, small issue into a deal breaker.

We can always go back to manual document construction for each POJO we have, but we should walk away from this idea as soon as it comes to us, for obvious reasons.

Another alternative is to look at mechanisms to convert Scala objects to Document. It is evident we need a type class to do the heavy lifting and bring flexibility to the conversion system.

Let’s define a type class to do this work. Let’s name it MySerializer, for lack of a better name.
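A sketch of such a type class follows. Here I assume Jackson (with its Scala module) for the default conversion, producing a JSON string that OJAI then parses into a Document; the names are of my own choosing:

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule

import org.ojai.Document
import org.ojai.store.Connection

// Type class: knows how to turn a value of type A into an OJAI Document
trait MySerializer[A] {
  def serialize(value: A)(implicit connection: Connection): Document
}

object MySerializer {
  private val mapper = new ObjectMapper().registerModule(DefaultScalaModule)

  // Default instance: serialize the object to JSON with Jackson and let
  // OJAI parse that JSON into a Document
  implicit def default[A]: MySerializer[A] = new MySerializer[A] {
    def serialize(value: A)(implicit connection: Connection): Document =
      connection.newDocument(mapper.writeValueAsString(value))
  }
}
```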

As we can see, MySerializer uses a default way to convert objects to documents using Jackson serialization. Having a default serializer is a good option since the majority of objects will use it; yet not every object is built the same, so we need specializations as well.

Now, our code will look as follows.
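With the type class in place, inserting a Link could be sketched like this (summoning the default, Jackson-based instance from implicit scope):

```scala
import org.ojai.store.{Connection, DriverManager}

// A plain case class; no bean annotations needed anymore
case class Link(_id: String, name: String, url: String)

object TypeClassInsert extends App {
  implicit val connection: Connection = DriverManager.getConnection("ojai:mapr:")
  val store = connection.getStore("/user/mapr/tables/links")

  val link = Link("id-1", "MapR", "https://mapr.com")

  // Picks up MySerializer.default from implicit scope
  val document = implicitly[MySerializer[Link]].serialize(link)

  store.insertOrReplace(document)
}
```

Since Jackson serializes case class fields by name, the _id field makes it into the JSON untouched, which is exactly why the default instance works here.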

As mentioned before, sometimes the default document conversion won’t work. For instance, let’s look at the following example.
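Consider a class whose fields do not include _id (the class and its fields are, again, illustrative):

```scala
// Person has no _id field, so the default Jackson-based conversion
// produces a document without the key MapR-DB requires
case class Person(firstName: String, lastName: String, age: Int)
```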

Using the default converter with Person will cause an error when trying to save the generated document to the database; as stated before, MapR-DB needs an _id as the document key. In this case, we need a custom converter for the class Person.

This is where the type class mechanism shines. We can specify the exact way to create documents from Person. Let’s see how.
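A sketch of a custom instance for Person follows; deriving the _id from firstName and lastName is my own assumption for the example:

```scala
import org.ojai.Document
import org.ojai.store.Connection

case class Person(firstName: String, lastName: String, age: Int)

object PersonSerializer {
  // Custom instance: when brought into scope, it takes precedence over
  // the generic default, giving Person its own layout with an explicit _id
  implicit val personSerializer: MySerializer[Person] = new MySerializer[Person] {
    def serialize(person: Person)(implicit connection: Connection): Document =
      connection
        .newDocument()
        .set("_id", s"${person.firstName}-${person.lastName}")
        .set("firstName", person.firstName)
        .set("lastName", person.lastName)
        .set("age", person.age)
  }
}
```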

Notice that we have both options: one is to use the default serializer, and the other is to use a custom serializer for the specific object in question. This allows a fine-grained serialization mechanism that ultimately yields genericity without giving up specialization.

At the same time, the serialization system lives outside of the objects themselves. We can modify how serialization works without affecting the objects at all. Ultimately, we could override how serialization is done in a specific context while having different serialization mechanics for different situations as they are needed. This is almost impossible to do in Java, but Scala is a beast in the world of ad-hoc polymorphism.

Conclusions

The MapR-DB OJAI API is nice, but it does not play well with Scala objects, especially those that do not comply with the Java Bean specification. On the other hand, Scala offers advanced constructs like type classes that allow us to work around many of these interoperability issues while keeping type safety and enabling ad-hoc polymorphism.

Thanks to the Lombok project for helping us write cleaner Java code.

Thanks to Simulacrum for enabling type classes in Scala without boilerplate.
