Service discovery in microservice architecture

Many years ago, the easiest way to lose contact with a friend was to change your phone number without telling them.

The same applies to services in a microservice architecture. Two services may be happily talking to each other until one of them moves to another IP address.

What is service discovery

Service discovery is about finding the network location of a service provider.

Why do we need it

If a team is maintaining physical servers, a static configuration file is usually enough, because the addresses rarely change.

However, if you are running in the cloud, your services may have dynamic network locations due to restarts, failures and scaling. Maintaining a configuration file by hand is simply not feasible.

What are the components

Service discovery involves three parties: the service provider, the service consumer and the service registry.

The service provider registers itself with the service registry when it enters the system, and deregisters itself when it leaves.
The service consumer gets the location of a provider from the registry, and then talks to the provider.
The service registry maintains the latest locations of the providers.

There are many existing service discovery tools to use. But what if we want to build our own?

Designing service discovery

Since a service registry basically maintains key-value pairs (provider name, provider locations), Redis can be a good choice. Let’s simulate the service discovery process with Redis as the registry.

When a service provider, inventory_service, registers itself with the registry, we use SADD to add its location to a set stored under the service name.

When a service consumer queries for the location of inventory_service, we can either use SMEMBERS to get all of the locations, or pick one at random with SRANDMEMBER.

When an instance of inventory_service deregisters itself, we use SREM to remove its location from the set.
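The three steps above can be sketched in Python. To keep the example runnable without a Redis server, it uses a small in-memory stand-in whose methods mirror the Redis set commands; the class name InMemoryRegistry and the addresses are made up for illustration, and a real setup would issue the commands shown in the comments against Redis instead.

```python
import random


class InMemoryRegistry:
    """Hypothetical stand-in for Redis: one set of locations per service name."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, member):
        # Redis equivalent: SADD key member
        self._sets.setdefault(key, set()).add(member)

    def smembers(self, key):
        # Redis equivalent: SMEMBERS key
        return set(self._sets.get(key, set()))

    def srandmember(self, key):
        # Redis equivalent: SRANDMEMBER key (None when the key does not exist)
        members = self._sets.get(key)
        return random.choice(sorted(members)) if members else None

    def srem(self, key, member):
        # Redis equivalent: SREM key member
        members = self._sets.get(key, set())
        members.discard(member)
        if not members:
            self._sets.pop(key, None)  # Redis removes a set once it is empty


registry = InMemoryRegistry()

# Provider side: two instances of inventory_service register themselves.
registry.sadd("inventory_service", "10.0.0.1:8080")
registry.sadd("inventory_service", "10.0.0.2:8080")

# Consumer side: fetch every location, or pick one at random.
all_locations = registry.smembers("inventory_service")
one_location = registry.srandmember("inventory_service")

# Provider side: one instance leaves and deregisters itself.
registry.srem("inventory_service", "10.0.0.1:8080")
remaining = registry.smembers("inventory_service")
```

Using a set rather than a list means that a provider which registers twice (for example after a retry) still appears only once.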

But there are complications to handle:

The service may not deregister itself when it’s gone, for example after a crash. The registry then hands an invalid address to the consumer. To tackle this problem, the service provider needs to send a heartbeat periodically (say, every 5 seconds). If the provider hasn’t sent a heartbeat for some time, the registry assumes that the provider is dead and deregisters it.
Querying the registry before every call to a provider? That would place too much load on the registry and add latency to every request. It’s better to keep a copy of the locations in the consumer itself.
If the copy is kept in the consumer, how do we notify the consumer about changes in the providers? There are two ways to do it. 1) The consumer polls the registry for the latest version. Since locations usually don’t change very often, this works well enough. The drawback is that the consumer may use stale locations between polls. 2) The pub/sub pattern. It delivers location updates more promptly, but it ties up an additional thread in the consumer.
Sending back all the data of a provider may not be necessary. We can keep a global version number for the provider data, so that a consumer only needs to update its local copy when the version has been incremented.
Single point of failure. If the service registry (e.g. the Redis instance we are using here) is down, all consumers and providers are affected. To alleviate this, we can use a distributed store as the service registry, such as ZooKeeper, etcd or Consul.
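Heartbeat expiry and version-based polling can be sketched together in a few lines of Python. This is an in-memory illustration rather than a production design: the class names, the 15-second timeout (three missed 5-second heartbeats) and the addresses are assumptions, and in a real system the consumer would read the version and locations over the network.

```python
import time


class HeartbeatRegistry:
    """Hypothetical registry: last heartbeat per location plus a global version."""

    def __init__(self, ttl_seconds=15.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds  # assumed timeout, not from any real tool
        self.clock = clock
        self.version = 0                # incremented whenever a location set changes
        self._last_seen = {}            # service name -> {location: heartbeat time}

    def heartbeat(self, service, location):
        """Register the location on first contact, otherwise refresh its timestamp."""
        locations = self._last_seen.setdefault(service, {})
        if location not in locations:
            self.version += 1
        locations[location] = self.clock()

    def evict_expired(self):
        """Deregister every location whose last heartbeat is older than the TTL."""
        now = self.clock()
        for locations in self._last_seen.values():
            expired = [loc for loc, seen in locations.items()
                       if now - seen > self.ttl_seconds]
            for loc in expired:
                del locations[loc]
                self.version += 1

    def lookup(self, service):
        return sorted(self._last_seen.get(service, {}))


class CachingConsumer:
    """Keeps a local copy and refetches only when the registry version has moved."""

    def __init__(self, registry, service):
        self.registry = registry
        self.service = service
        self.cached_version = None
        self.locations = []

    def poll(self):
        """Meant to be called periodically; returns True if the copy was refreshed."""
        if self.registry.version == self.cached_version:
            return False
        self.locations = self.registry.lookup(self.service)
        self.cached_version = self.registry.version
        return True


# A fake clock keeps the walk-through deterministic.
now = [0.0]
registry = HeartbeatRegistry(ttl_seconds=15.0, clock=lambda: now[0])
consumer = CachingConsumer(registry, "inventory_service")

registry.heartbeat("inventory_service", "10.0.0.1:8080")
registry.heartbeat("inventory_service", "10.0.0.2:8080")
first_poll = consumer.poll()    # version changed, so the cache is refreshed
second_poll = consumer.poll()   # nothing changed, so the cache is kept

now[0] = 10.0
registry.heartbeat("inventory_service", "10.0.0.2:8080")  # only this one is alive
now[0] = 20.0
registry.evict_expired()        # 10.0.0.1:8080 has been silent for 20 seconds
third_poll = consumer.poll()    # consumer.locations is now ["10.0.0.2:8080"]
```

A single global version is the simplest choice, but it means that a change in any service makes every consumer refetch. A version per service would be a natural refinement.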

Client-side discovery or server-side discovery

Client-side discovery: the service consumer keeps all the locations of the providers and load balances its requests across them. Pros: the registry is the only additional component. Cons: you need to implement a service discovery client for each language or framework used in your system.
Server-side discovery: the consumer sends requests to a load balancer, which queries the registry and decides which provider location to forward each request to. Pros: it is language and framework agnostic. Cons: you now need to manage another moving part, the load balancer.
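To make the client-side option concrete, here is a minimal sketch of the load-balancing part of a discovery client, using simple round-robin selection. The class name RoundRobinClient and the addresses are hypothetical, and round-robin is just one possible strategy. With server-side discovery, the same logic would live in the load balancer instead of in every consumer.

```python
import itertools


class RoundRobinClient:
    """Hypothetical client-side helper: holds locations and rotates through them."""

    def __init__(self, locations):
        self.update(locations)

    def update(self, locations):
        """Replace the cached locations, e.g. after a poll or a pub/sub message."""
        self._locations = list(locations)
        self._cycle = itertools.cycle(self._locations)

    def next_location(self):
        """Return the location the next request should be sent to."""
        if not self._locations:
            raise LookupError("no known provider locations")
        return next(self._cycle)


client = RoundRobinClient(["10.0.0.1:8080", "10.0.0.2:8080"])
picks = [client.next_location() for _ in range(3)]

client.update(["10.0.0.2:8080"])  # 10.0.0.1:8080 was deregistered
pick_after_update = client.next_location()
```

This is the piece that has to be reimplemented for every language or framework in the system, which is the main cost of client-side discovery.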

Conclusion

This article explained how service discovery works in a microservice architecture. It should help readers understand and debug service discovery when they work with open source tools like Netflix Eureka.

References and recommended reading: