Unlocking Microservices Reliability With ACID and the Outbox Pattern

Written by lookingforere | Published 2024/03/25
Tech Story Tags: microservice-architecture | outbox | microservice-patterns | kafka-connect | transactional-outbox | golang-application | event-driven-architecture | acid-transactions

TLDRIn distributed systems, ensuring atomicity and consistency across data updates and message queue dispatches poses challenges. The traditional Two-Phase Commit (2PC) method is often impractical due to its blocking nature and recovery complexities after failures. The Outbox Pattern provides an alternative solution, leveraging an additional "Outbox" table within the service's database. This table acts as a buffer, facilitating atomic write operations to the database and ensuring reliable dispatch of messages to a message queue. The Outbox Pattern enhances efficiency and reliability in data synchronization, offering a robust mechanism for maintaining data integrity and supporting seamless communication between distributed services.via the TL;DR App

Hi There!

So, I believe each of us has encountered a situation when, after modifying an entity in the database, it's necessary to notify other systems about this change. For example, upon accepting an order, we need to inform all related services and teams about the arrival of a new order.

The first solution that comes to mind is straightforward: have each service or database polled at regular intervals to check for new orders or status updates. But what if the logic becomes too complex and needs to be segmented? This is where we turn to the event-driven approach, where we broadcast what happened with our objects.

Event-driven architecture (EDA) is an architectural pattern that allows systems to asynchronously exchange messages or events. The main advantage of EDA is its ability to increase system scalability and flexibility by minimizing direct dependencies between its components.

This facilitates easier adaptation to changes and growth, ensures higher resilience, and simplifies the integration of

diverse services.

However, a challenge arises in synchronizing data between the database and the messaging system. If we first record data about a new order in the database and then attempt to send a message to Kafka, we cannot guarantee that the message will actually be sent. A potential producer error could lead to the loss of critical notifications.

Attempting to wrap these operations in a single transaction (writing to the database and sending to Kafka, followed by a transaction commit) faces fundamental difficulties, as Kafka operates outside the transaction context of databases.

This necessitates an alternative solution that ensures atomicity and delivery guarantees for messages without compromising the system's performance and reliability. Here is where the Outbox pattern comes to the rescue.

https://microservices.io/patterns/data/transactional-outbox.html?embedable=true

The Outbox pattern involves creating a special table (outbox) in the database, where an event intended for the messaging system is recorded within the same transaction as the main data changes. Thus, both operations - changing the state in the main table and creating an event in the outbox table - are guaranteed to be executed together.

Subsequently, a dedicated connector (e.g., Kafka Connect) asynchronously reads data from the outbox table and sends it to Kafka, thereby ensuring the delivery of events to the end recipients.

This approach achieves high reliability and data consistency without adding undue complexity to the system's architecture or compromising its performance. Let’s review it.


The Problem

ACID is an acronym describing key transaction properties in databases:

  1. Atomicity: A transaction is either executed in full or not executed at all.

  2. Consistency: A transaction transitions the database from one consistent state to another.

  3. Isolation: The execution of one transaction doesn't impact the execution of others.

  4. Durability: Completed transactions persist in the database, even in case of system failures.

In distributed systems, the requirement to synchronize data across multiple services is a common challenge. The conventional Two-Phase Commit (2PC) approach is often deemed suboptimal due to its sluggish and blocking nature. Moreover, dealing with data integrity recovery in the event of a system failure poses significant difficulties.

The Outbox Pattern emerges as a remedy to address the need for atomicity in writing operations to the database and dispatching messages to a message queue. The core of this solution involves incorporating an additional table within the service's database, referred to as the "Outbox." This table acts as a buffer, storing messages destined for dispatch to a message queue.

How the Outbox Pattern Works:

  1. Local Transaction:

    • When performing a business operation, data changes are written to the respective database tables.

    • A message intended for other services is placed in the Outbox table within the same transaction.

  2. Message Publication:

    • Pollers periodically scan the Outbox table for new messages.

    • Each message from the Outbox table is published to the corresponding message queue.

    • After successful publication, the message is either deleted from the Outbox table or marked as sent.

Let’s Code

Let's break down this process step by step with a simple example, utilizing Go. You can find the source code and additional materials at the following link.

Step 1: Setting Up the Order Service

We'll start by creating a service for handling orders. This service will be responsible for receiving orders and recording them in the database. A key component will be implementing the "Outbox" pattern to ensure that every change in order status is reliably transmitted to Kafka.

For our case, we need to create only one method:

curl -X POST
  http://localhost:8080/api/v1/create-order
  -H 'Content-Type: application/json'
  -d '{
    "name": "John",
    "surname": "Doe",
    "details": {
      "code": 123,
      "operation_code": 456,
      "transaction_code": "sdfsdfsdfs"
    }
  }'

The core concept of this service involves writing to both the orders table and the outbox table within a single transaction. For instance, in Go, this would be implemented as follows:

func (r *OrderRepository) CreateOrder(userUUID string, order domain.Order) (*int32, error) {
	det := order.Details
	var id int32

	tx, err := r.db.Begin()
	if err != nil {
		return nil, err
	}

	err = tx.QueryRow(
		query,
		order.Name, order.Surname, det.Code, det.OperationCode, det.TransactionCode, StatusCreated).Scan(&id)

	if err != nil {
		tx.Rollback()
		return nil, err
	}

	// Save to outbox
	err = r.createOutbox(userUUID, order)
	if err != nil {
		tx.Rollback()
		return nil, err
	}

	return &id, tx.Commit()
}

As a result, we’ll expect messages like this:

Step 2: Configuring Kafka JDBC Connector

Next, we'll set up the Kafka JDBC Connector to automatically transfer messages from the outbox table to Kafka. This allows other services to subscribe to these events and react to changes in order statuses in real-time.

For the Kafka connector, I'll be using the JDBC connector. Find a good way for you to install it:

https://docs.confluent.io/kafka-connectors/jdbc/current/source-connector/overview.html?embedable=true#install-the-jdbc-sink-connector

After installation, just call connector API:

curl http://localhost:8083/connector-plugins 

We’ll expect to see:

[{"class":"io.confluent.connect.jdbc.JdbcSinkConnector","type":"sink","version":"10.7.5"},
{"class":"io.confluent.connect.jdbc.JdbcSourceConnector","type":"source","version":"10.7.5"},
{"class":"org.apache.kafka.connect.mirror.MirrorCheckpointConnector","type":"source","version":"7.6.0-ccs"},
{"class":"org.apache.kafka.connect.mirror.MirrorHeartbeatConnector","type":"source","version":"7.6.0-ccs"},
{"class":"org.apache.kafka.connect.mirror.MirrorSourceConnector","type":"source","version":"7.6.0-ccs"}]

Next, we need to create settings for the source.

Be careful at this stage, as we are providing credentials for our connector. I recommend creating separate credentials with read-only permissions for the outbox table so that this service can only observe changes. Overall, there are plenty of best practices in this regard, but we won't delve into them here.

{
  "name": "jdbc-source-connector",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://postgres:5432/postgres",
    "connection.user": "postgres",
    "connection.password": "postgres",
    "topic.prefix": "microshop.order-",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "table.whitelist": "outbox",
    "poll.interval.ms": "5000",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": "false",
    "transforms": "createKey",
    "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.createKey.fields": "user_id"
  }
}

Set "table.whitelist": "outbox" - our table.

Set "topic.prefix": "microshop.order-" - our topic.

Set "transforms.createKey.fields": "user_id" - our message key.

More about settings and fields: here

And send it.

curl -X POST -H "Content-Type: application/json" --data @jdbc-source-config.json http://localhost:8083/connectors

And that's it! Now, you just need to create an order, and check if everything is working and if messages are being recorded in the tables. To view what we have in Kafka, I use Kafka UI for such tasks, which allows us to quickly see what we wrote there and if it's working.

Step 3: Creating the Background Processing Service

We'll create a background service that listens for messages from Kafka and executes the appropriate processing logic. For instance, this service might be responsible for updating order statuses or initiating delivery processes.

Full schema:

The last service is a background (order-background-service). In our case, it’s just a simple service that can consume messages from Kafka.

...
	err = c.SubscribeTopics([]string{"microshop.order-outbox"}, nil)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Failed to subscribe to topics: %s\n", err)
		os.Exit(1)
	}

...

	ev := c.Poll(cfg.Poll)
	if ev == nil {
	  continue
	}

	switch e := ev.(type) {
		case *kafka.Message:

		// Create an instance of Outbox to store the unmarshaled data
		var outbox domain.Outbox

		// Unmarshal the JSON message into the Outbox struct
		err := json.Unmarshal(e.Value, &outbox)
		if err != nil {
			fmt.Println("Error decoding message:", err)
			return
		}
		fmt.Println("Payload:", outbox.Payload)

		_, err = c.CommitMessage(e)
		if err != nil {	
			fmt.Printf("Failed to commit message: %v\n", err)
		} 

Be Careful!

It is essential to pause here and reflect on the following: as we can observe, entries are now being recorded in tables atomically, ensuring ACID compliance. However, our pipeline has been expanded to include both a Kafka connector and Kafka itself. It is crucial to remember the delivery guarantees that Kafka promises.

Apache Kafka offers developers various levels of message delivery guarantees, enabling a balance between performance and reliability according to application requirements. These guarantees are based on the concepts of atomicity, durability, and isolation. Here are the primary delivery guarantees provided by Kafka:

  1. At Least Once: This foundational guarantee ensures a message is delivered at least once. It implies that, in the event of failures or retries, a message may be delivered or processed more than once. This guarantee is facilitated by retry mechanisms in case of errors and acknowledgments from consumers.

  2. At Most Once: In this mode, messages can be lost but will not be delivered more than once. This is the fastest and least reliable mode, as it does not account for retries in case of errors. This approach may be acceptable for scenarios where the loss of some messages is not critical.

  3. Exactly Once: This is the most complex and desirable guarantee, ensuring that each message is delivered and processed “exactly once”, eliminating duplication or loss. Kafka has supported "exactly once" delivery since version 0.11, using a combination of techniques such as idempotent producers and transactions to ensure atomicity across various topics and consumers.

The choice among these guarantees depends on your application's requirements for reliability and performance. If you need more information about it just let me know.


In conclusion, implementing an Outbox pattern alongside a reliable Kafka connector like the JDBC connector offers a robust solution for orchestrating data changes across microservices.

This approach ensures atomicity and consistency when updating multiple systems, significantly reducing the chances of data inconsistencies.

This method serves as a crucial first step toward an Event-Driven Architecture (EDA). By introducing a reliable messaging system, like Kafka, and seamlessly integrating it with a transactional database, we establish a foundation for distributing events across the microservices landscape.

This approach becomes particularly valuable when considering the need for analytics, as all relevant events are readily available for consumption. Analytics teams can effortlessly retrieve and process these events, contributing to the generation of valuable insights and reports.

In essence, the Outbox pattern, coupled with Kafka integration, not only addresses immediate concerns of data consistency but also lays the groundwork for a scalable, event-driven ecosystem that is essential for modern microservices architectures. This facilitates the seamless flow of information between services, unlocking the potential for enhanced analytics and real-time decision-making.

If you're interested in learning more about advanced topics such as managing deletions from the Outbox table and further insights into working with this setup, consider subscribing for future updates.

Your subscription will help gauge the interest in this topic and guide the direction of upcoming articles. Stay tuned for more in-depth explorations into the intricacies of microservices architectures and event-driven systems!

Useful Links

Kafka UI: click here

Kafka JDBC connector: click here

Outbox pattern: click here


Written by lookingforere | Lead software engineer | Mentor | Try to become a rockstar
Published by HackerNoon on 2024/03/25