Microservices Deserve Modern Programming Platforms: Java May Not be the Best Option

Microservices are very popular today, even in traditional corporate IT shops. Often though they are implemented using languages, such as Java, born in the early ’90s and designed for a world of monolithic applications. Do you remember the big old Application Servers?

Ignoring the programming platforms developed in the last ten years can lead to sub-optimal results and high run costs when adopting Microservices

The last decade has seen the rise of new programming platforms, all aiming to provide better support for “modern distributed computing”, which is the foundation of Microservices. Such technologies promise to optimise infrastructure costs and to efficiently address the ever-increasing workloads generated by the digital revolution.

Moreover, with the advent of Containers, developers can “write in whatever language they want and run everywhere”, making the original Java proposition, “write once, run everywhere,” much less relevant.

Ignoring such advancements in the application development space can lead to sub-optimal results when adopting Microservices-based architectures.

The focus of this post is on two of these technologies, Node and Go. Why those? I was intrigued by a strange fact: they share almost exactly the same date of birth. And maybe not by chance.

November 2009, almost 15 years after the release of Java

On November 8, 2009, Ryan Dahl presented Node for the first time: an open source platform to run JavaScript (and, once compiled to JavaScript, TypeScript as well) on the server.

Two days later Google announced Go, a new open source programming language. Since then, both Node and Go have steadily gained traction and can now be considered mainstream. According to the Stack Overflow 2020 Developer Survey, Node is the most popular platform and Go is the third most wanted language.

This may be seen as a coincidence, considering the very different nature of the two technologies. But maybe there is something profound behind such close birth dates, something that is still relevant for us today.

Microservices provide scalability as long as they are used to run efficiently many small tasks in a concurrent mode

By 2009, the exponential growth in the demand for digital services was already a fact. But such exponential growth in load could not be addressed by a similarly exponential increase in infrastructure. In response to this challenge, a new architectural style emerged: horizontally scalable distributed systems based on multi-core commodity (e.g. x86) processors, in other words the foundation of what Microservices are.

With Microservices you want to optimize the concurrent processing of many small tasks, which is not what languages like Java and C# were designed for

Such architectures have proved to scale at sustainable costs as long as they are able to process many small tasks at “the same time”, maximising the use of the CPU cycles and memory of those commodity multi-core processors.

Traditional languages like Java, born in a different era, had not been designed with horizontally scalable distributed architectures in mind. Applications in the ’90s were monolithic: monolithic application servers running monolithic processes.

In 2009, the new requirement was to concurrently run many small tasks on many small machines (massively concurrent/parallel systems). So there was clearly a mismatch.

Both Node and Go came in to address this mismatch, even if from very different directions.

What do Node and Go share? A natural support for concurrency

Being born at the same time may be seen as a coincidence, especially since Node and Go are very different platforms in many respects.

But maybe it is not a coincidence if we look at what they have in common: a natural support for concurrency. This is why they can be seen as two different responses to the same challenge posed by modern distributed computing: running many small tasks concurrently and efficiently.

Why is concurrency important in distributed architectures?

Let’s look at a typical program, a program that interacts with databases, REST services and maybe storage.

What does such a program normally do? Most of the time it stays idle waiting for some I/O (Input/Output) operations to be performed somewhere else. This is called I/O bound processing, since the amount of processing that can actually be performed in a unit of time is limited by the capacity of I/O to respond fast.

I/O operations are, for reasons of physics, orders of magnitude slower than the operations performed by CPUs. A reference to main memory takes about 100 nanoseconds. A round trip within the same data center takes about 250,000 nanoseconds, and round trips between different regions take more than 2,000,000 nanoseconds (see how latency has evolved and a recent Microsoft Azure report). As a result, the processing of a single request ends up wasting most of the CPU cycles just waiting for I/O operations to complete.
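To make the gap concrete, here is a small sketch in Go that turns the figures quoted above into ratios. The constant and function names are made up for this illustration, and the numbers are the approximate ones from the text, not measurements.

```go
package main

import "fmt"

// Approximate latencies quoted in the text, in nanoseconds.
const (
	mainMemoryNs  = 100
	sameDCNs      = 250_000
	crossRegionNs = 2_000_000
)

// memoryRefsPer returns how many main-memory references would fit
// in the time spent waiting for a single round trip of waitNs nanoseconds.
func memoryRefsPer(waitNs int) int {
	return waitNs / mainMemoryNs
}

func main() {
	fmt.Println("memory references per same-data-center round trip:", memoryRefsPer(sameDCNs))
	fmt.Println("memory references per cross-region round trip:", memoryRefsPer(crossRegionNs))
}
```

With these figures, one round trip inside a data center costs as much time as 2,500 memory references, and a cross-region round trip as much as 20,000.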

In distributed architectures concurrency is crucial if you want to optimise the use of infrastructure and minimise its costs

So, what does this mean for our computing power? It means that it will remain under-utilised, unless we do something, unless we make sure that more than one request can be managed by our CPU “at the same time”. And this is exactly what concurrency is all about.

This is not a new problem. Traditionally, in the Java world, this was the task of the Application Servers. But Application Servers are not a good fit for distributed and horizontally scalable architectures. And this is where the likes of Node and Go can come to the rescue.

Concurrency and parallelism are similar but different concepts. Here I use concurrency, since this is what really matters in this context.

Node and concurrency

Node is single-threaded. So everything runs in one single thread (well… almost everything, but that’s not relevant in our context). So how can it support concurrency? The secret is that Node is also non-blocking on I/O operations.

In Node, when you run an I/O operation, the program does not stop to wait for the I/O response. Rather, it provides the system with a function, the so-called “callback”, which represents what has to be done when the I/O returns, and then immediately moves on to the next operation. When the I/O completes, the callback is executed, resuming the logical processing of your program.

So, in a request/response scenario, we trigger an I/O operation and we free up the Node thread, so that another request can be immediately taken care of by the same Node instance.

In the above example, the first request Req1 runs some initial logic (the first dark green bar) and then starts an I/O operation (I/O operation 1.1) specifying the function that will have to be called when I/O completes (the function cb11).

At that point, the processing of Req1 halts and Node can start processing another request, e.g. Req2. When I/O Operation 1.1 completes, Node is ready to resume the processing of Req1 and will invoke cb11. cb11 will itself start another I/O operation (I/O Operation 1.2) passing cb12 as callback function, which will be invoked when the second I/O operation completes. And so on until Req1 processing ends and the response Resp1 is sent back to the client.

In this way, with a single thread, Node can serve many requests at the same time, i.e. concurrently. The non-blocking model is the key to concurrency in Node.

Being single-threaded, though, means that a Node instance cannot use more than one core (for multi-core scenarios it is possible to use Node clusters, but going down this path inevitably adds some complexity to the overall solution).

Another aspect to note is that the non-blocking model implies an asynchronous style of programming, which at the beginning may be hard to reason about and, unless properly managed, can lead to complicated code, the so-called “callback hell”.

Go and concurrency

The Go approach to concurrency is based on goroutines, which are lightweight threads managed by the Go runtime that communicate with each other via channels.

Programs can launch many goroutines and the Go runtime will take care of scheduling them on the CPU cores available to Go, according to its optimised algorithm. Goroutines are not Operating System tasks: they require far fewer resources and can be spawned very fast in very high numbers (there are several reports of Go running hundreds of thousands, even millions, of goroutines concurrently).
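As a rough illustration of how cheap goroutines are, the following sketch launches one goroutine per item and waits for all of them with a `sync.WaitGroup`. The function `sumSquares` is a made-up example, not something from a library, and the work each goroutine does is deliberately trivial.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares launches one goroutine per number in 1..n.
// Each goroutine writes the square of its number into its own slot,
// so no locking is needed; the WaitGroup tells us when all are done.
func sumSquares(n int) int {
	results := make([]int, n)
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = (i + 1) * (i + 1)
		}(i)
	}
	wg.Wait()

	total := 0
	for _, v := range results {
		total += v
	}
	return total
}

func main() {
	// 100,000 goroutines is an arbitrary number chosen for illustration.
	fmt.Println(sumSquares(100_000))
}
```

Spawning the same number of Operating System threads would typically be far more expensive, which is the point the paragraph above makes.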

Go is also non-blocking, but this is all done behind the scenes by the Go runtime. For instance, if a goroutine fires a network I/O operation, its state is changed from “running” to “waiting” and the Go runtime scheduler picks another goroutine for execution.

So, from a concurrency perspective, this is similar to what Node does, but with two main differences:

it does not require any callback mechanism and the code flows pretty much as in normal sequential synchronous logic, which is usually easier to reason about
it is multi-threaded and can seamlessly leverage all CPU cores made available to the Go runtime

In the above example, the Go runtime has 2 cores available. All processors are used to serve incoming requests. Each incoming request is processed by a goroutine.

For instance, Req1 is processed by goroutine gr1 on Core1. When gr1 issues an I/O operation, the Go runtime scheduler moves gr1 to “waiting” state and starts processing another goroutine. When the I/O operation completes, gr1 is put in “runnable” state and the Go scheduler will resume its execution as soon as possible.

A similar thing happens with Core2. So, if we look at a single core, we have a picture similar to that of Node. The switch of goroutine state (from “running” to “waiting” to “runnable” to “running” again) is performed by the Go runtime under the hood and the code is a simple flow of statements to be performed sequentially, which is different from the callback-based mechanism imposed by Node.
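The flow described above can be sketched as follows. This is a minimal illustration, not a real server: `fakeIO` is a hypothetical stand-in for a database or REST call and simply sleeps, and `handleRequest` and `serveAll` are names invented for this example. The point is that the handler reads as plain sequential code, while the runtime parks each goroutine during its waits.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeIO is a hypothetical stand-in for a blocking I/O call.
// While it sleeps, the goroutine is parked and the core is free for others.
func fakeIO(d time.Duration) {
	time.Sleep(d)
}

// handleRequest is written as ordinary sequential code: no callbacks.
func handleRequest(id int) string {
	fakeIO(20 * time.Millisecond) // first I/O operation of this request
	fakeIO(20 * time.Millisecond) // second I/O operation of this request
	return fmt.Sprintf("Resp%d", id)
}

// serveAll processes n requests, one goroutine per request,
// and returns the responses together with the total elapsed time.
func serveAll(n int) ([]string, time.Duration) {
	start := time.Now()
	responses := make([]string, n)
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			responses[i] = handleRequest(i + 1)
		}(i)
	}
	wg.Wait()
	return responses, time.Since(start)
}

func main() {
	responses, elapsed := serveAll(50)
	fmt.Println(len(responses), "responses in", elapsed)
}
```

Handled one after the other, 50 requests with 40 ms of waiting each would take about two seconds; handled concurrently, the total should stay close to the duration of a single request, since the waits overlap.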

In addition to all of the above, Go provides simple and powerful mechanisms for communication and synchronisation among goroutines, based on channels and mutexes, which allow smooth orchestration of different goroutines.
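Here is a small sketch of both mechanisms working together, under assumptions made only for this example: a channel hands work items to a fixed number of worker goroutines, and a `sync.Mutex` protects a map they all update. The function `countWords` and its parameters are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// countWords distributes words to worker goroutines over a channel
// and tallies them in a map guarded by a mutex.
func countWords(words []string, workers int) map[string]int {
	jobs := make(chan string)
	counts := make(map[string]int)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The loop ends when the jobs channel is closed and drained.
			for word := range jobs {
				mu.Lock()
				counts[word]++
				mu.Unlock()
			}
		}()
	}

	for _, word := range words {
		jobs <- word
	}
	close(jobs) // signal the workers that no more work is coming
	wg.Wait()   // wait until every worker has finished
	return counts
}

func main() {
	counts := countWords([]string{"go", "node", "go", "java", "go"}, 3)
	fmt.Println(counts["go"], counts["node"], counts["java"])
}
```

The channel carries the data between goroutines, while the mutex covers the one piece of shared state; in many programs the map could instead be owned by a single goroutine, which would remove the need for the mutex altogether.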

There is more than concurrency though

Now that we have seen how concurrency is naturally supported by Node and Go, we can look at other reasons why it is worth considering them as effective tools to add to our toolbox.

Node and the Holy Grail of one language for Front End and Back End

JavaScript and TypeScript dominate the world of Front End. It is almost impossible to imagine a software shop that has to build Front End software without using JavaScript/TypeScript extensively.

But what if you also need to build the Back End? With Node you can leverage the same language, the same constructs and the same ideas (asynchronous programming) to build the Back End as well. Even in the serverless space Node plays a central role, having been the first platform to be supported by all major Cloud providers for their FaaS offerings (Function as a Service, e.g. AWS Lambda, Google Cloud Functions and Azure Functions).

The ease of switching between Front End and Back End may be one of the reasons for the incredible success of Node, which has led to a vast ecosystem of packages (there is a Node package for practically everything) and a vibrant community.

At the same time, not all types of Back End processing are efficiently supported by Node. For instance, CPU-intensive logic is not a good fit for Node given its single-threaded nature. Therefore you should not fall into the trap of “one language fits all”.

Node, with the enormous amount of packages available in its ecosystem, can also be seen as “the Far West of programming”, a place where quality and safety of what you import has to be constantly checked (read this fictional story for a feeling of the risks). But this is probably true anytime we leverage external libraries: the broader the ecosystem is, the higher the attention level has to be.

Still, in many cases, especially for I/O bound concurrent scenarios, Node can be a good choice and can help make the most of the JavaScript/TypeScript skills that may already be in house.

Simplicity and Performance rediscovered with Go

Go is simple. There are 25 reserved words and the language specs are 84 pages including extensive examples (if printed as pdf from the official site). Just for comparison, the Java specs 2000 edition was over 350 pages.

Simplicity makes it easy to learn. Simplicity helps in writing code which is easy to understand and maintain. Often with Go there is just one way to do what needs to be done; this is not meant to frustrate creativity, but rather to simplify the life of whoever has to read and understand the code.

Simplicity can also feel limiting. Concepts like generics or exceptions were left out of the original language since the authors did not consider them necessary (generics were eventually added in Go 1.18). On the other hand, some really useful tools such as Garbage Collection are part of the core design.

Go is also about runtime performance and efficient use of resources. Go is a strongly typed, compiled language that can be used to build programs which run fast and efficiently, especially in scenarios where we can leverage the power of multi-core concurrency.

Go also produces small self-contained binaries. A Docker image containing a Go executable can be significantly smaller than one containing an equivalent Java program, because Java requires a JVM to run while Go executables are standalone (according to benchmarks, the size of an optimised Docker image of a “Hello World” application is 85 MB for Java 8 and 7.6 MB for Go, an order of magnitude difference). The size of images is important: smaller images speed up Build and Pull times, reduce network bandwidth requirements and improve control over security.

The other “new kids” that are in town

It is not only Node and Go. Recent years have seen the arrival of other technologies that promise benefits over the traditional monolithic platforms.

Rust. It is an open source language, backed by Mozilla, which presented it in 2010 (so it is one year younger than Node and Go). The main goal of Rust is to deliver performance comparable to C/C++ with a much safer programming model, i.e. with a lower probability of stumbling into nasty runtime bugs. The new way of thinking that Rust introduces, specifically around owning and borrowing memory, is often considered challenging in terms of learning curve. If performance is critical, though, then Rust is definitely an option to be considered.

Kotlin. It is a language that runs on the JVM, developed by JetBrains and released in 2016 (in 2017 Google announced its support for Kotlin as an official language for Android). It is more concise than Java and embeds in its original design concepts like functional programming and coroutines, which places it among the modern languages. It can be seen as a natural evolution of Java, with a low entry barrier for developers coming from that world.

GraalVM. This is a promising new approach to Java and other JVM-based languages. GraalVM Native Image allows Java code to be compiled to native executable binaries. This can produce smaller images and improve performance both at startup and at execution time. The technology is still pretty young (released in 2019) and, at the moment, it shows some limitations. Given the popularity of Java, it is likely to see significant improvements as it evolves towards maturity.

Conclusions

The software world is evolving fast. New solutions are always popping up. Some never get to the forefront; some enjoy a period of hype and then fade from the mainstream. Choosing the right one is not easy and is always an exercise in balancing new capabilities against battle-tested stability and available expertise.

Node and Go have definitely proved to be viable technologies at enterprise level. They have the potential to bring significant benefits compared to traditional OO languages, specifically by improving the efficiency of containerized distributed applications. They are backed by great communities and have wide ecosystems.

While enterprises must continue to use and support traditional platforms such as Java, given the sheer size of their production code base, it is highly advisable that they also start embracing relatively new tools, such as Node and Go, if they want to reap all the benefits of modern distributed computing.