An Intro to Services Oriented Architecture for Internal Tools

The Bezos API Mandate was a strategy to transform Amazon’s inner piping by requiring all products and services to be consumable via a standardized API. In other words, Amazon’s internal system needed to be transformed into a Services Oriented Architecture (SOA). Once an SOA is implemented, information flows much more freely throughout an organization. A robust SOA (for both internal and third-party data sources) should be an engineering priority for teams that want to foster stable, secure internal application development.

As it did at Amazon, implementing an SOA across your products and data sources will unlock the power of internal tooling. An SOA lets developers build dashboards, workflow automation, asynchronous tasks, and more in a consistent way.

Often, there are challenges associated with building a robust internal SOA. In this article, I discuss common challenges and strategies for architecting this type of system.

What is a Services Oriented Architecture (SOA)?

There are myriad articles and YouTube videos that detail SOAs, so I won’t spend much time here. For the sake of this article, it is helpful to think of an SOA as a standardized set of APIs for the internal and third-party data sources used within an organization. For example, with an SOA, an internal database has an API for reading and writing data to a particular table. So does a file storage system such as S3, Azure File Storage, or Google Drive. Instead of writing code that interfaces directly with these systems, you interface with internal APIs, which handle credentials and communication with the underlying services.

Example of an SOA

import os
import boto3

# route, request, and user_has_auth are assumed to come from your web
# framework and your shared permission layer.
@route('/upload_file_to_s3', methods=['POST'])
def upload_file_to_s3(bucket, file_name, file_bytes):
  # Centralized permission check
  is_authorized = user_has_auth(request.get('api_key'))

  if is_authorized:
    # Consumer doesn't need to worry about keys/credentialing
    s3_client = boto3.client(
      's3',
      aws_access_key_id=os.environ.get('s3_access_key'),
      aws_secret_access_key=os.environ.get('s3_secret_access_key'),
      aws_session_token=os.environ.get('s3_session_token'))
    # Store the raw bytes in the bucket under the given file name
    s3_client.put_object(Bucket=bucket, Key=file_name, Body=file_bytes)
    return 'OK', 200

  # The API key failed the permission check
  return 'Unauthorized', 401

With this example, a developer who needs to write to S3 would call this API instead of hard-coding the S3 upload logic into their codebase. This architecture is powerful because:

- Developers can interface with various products and services without needing help from the product’s creator. They also don’t need to handle the underlying service credentials (see the code example above).
- The service centralizes this logic in a single location with appropriate error handling, permission checking, etc. If the logic needs updating, the change happens in one place.
- The service can be used by multiple programs, codebases, and programming languages without repeated code (all the consumer needs is the URL and credentials).
- The service can scale up or down in isolation from the applications interfacing with it.
- It is much easier to set up observability and audit logging by monitoring a single endpoint.

# Using SOA in my code:
customer_file_stored = requests.post(f'{api_url}/upload_file_to_s3', data=data)

# No SOA in my codebase. This code, plus setup, config, and credentials,
# would need to be set up repeatedly in separate codebases.
s3_client = boto3.client('s3')
customer_file_stored = s3_client.upload_file(file_name, bucket, object_name)

With services set up for all data sources (both internal and external), developers in your organization can rapidly access those services when building internal tools. All a developer needs is their internal API key. With an SOA, there is less overhead, fewer security risks (such as hardcoded passwords or credentials), better observability, and less friction for the developer, which saves them time.

For an engineering organization, the goal should be to set up services for all internal and third-party data sources with a single permission management layer.

# Example SOA - Each item here is a service with an API to interface.
Internal
- MySQL
- Internal Processes (kick off a script, async task, etc.)
- DevOps / Deployments
Third-Party Data Sources
- Data Warehouse (Snowflake, S3, etc.)
- CRM
- Slack
- Datadog / Monitoring
- Google Drive

Architecting SOA Infrastructure for Internal Tools

When architecting your SOA, a few key areas pose challenges.

Permission and Credential Management

For your services, you need to have a way to identify whether the client requesting data from a service route has the proper permissions. Ideally, this permission layer is a single level of abstraction which applies to all of your services. Having a single API key per person unlocks their ability to work with services instead of needing credentials for each data source they want to interface with.

Generally, a robust SOA requires two types of permission layers.

Service Level Permissions

When an api_key / credential is passed, a validation takes place to verify that this user has permission to access the requested endpoint. Mainly, the service checks if the API key is valid. If the API key is invalid, a single set of error codes is returned, which doesn’t need to be coded in at the endpoint level. For larger orgs, API keys should auto-generate and sync with your active directory.
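One way to keep this check out of individual endpoints is a shared decorator. The following is a minimal sketch, not a definitive implementation: it assumes requests are plain dictionaries, and the `API_KEYS` store, the `validate_api_key` decorator, and the sample keys are all hypothetical names introduced for illustration.

```python
from functools import wraps

# Hypothetical in-memory key store (an assumption for this sketch); a real
# service would look keys up in a database or directory service.
API_KEYS = {
    "key-steve": {"first_name": "Steve", "user_groups": ["devops", "backend"], "is_admin": False},
    "key-bob": {"first_name": "Bob", "user_groups": ["marketing"], "is_admin": False},
}


def validate_api_key(handler):
    """Reject requests with a missing or unknown API key before the handler runs."""
    @wraps(handler)
    def wrapper(request, *args, **kwargs):
        user = API_KEYS.get(request.get("api_key"))
        if user is None:
            # One shared error response, so endpoints never hand-code it
            return {"error": "Invalid API key"}, 401
        # Attach the user's metadata so the route can apply its own rules
        enriched = {**request, "user_metadata": user}
        return handler(enriched, *args, **kwargs)
    return wrapper


@validate_api_key
def get_customer(request, customer_id):
    # The endpoint body contains no key-validation logic of its own
    caller = request["user_metadata"]["first_name"]
    return {"customer_id": customer_id, "requested_by": caller}, 200
```

Calling `get_customer({"api_key": "key-bob"}, 7)` reaches the handler, while a request with an unknown or missing key gets the shared 401 response without the endpoint doing anything.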

Route Level Custom Permissions — User Groups

Once a request has been funneled through the permission validation layer, user metadata is passed to the endpoint being queried. Here is an example of user metadata:

user_a = {
  'first_name': 'Steve',
  'last_name': 'Wozniak',
  'user_groups': ['devops', 'backend'],
  'is_admin': False,
  # ...
}

user_b = {
  'first_name': 'Bob',
  'last_name': 'Dylan',
  'user_groups': ['marketing'],
  'is_admin': False,
  # ...
}

With this user metadata provided, you can build custom logic into your routes. The most useful feature here is built-in User Groups. As an example, let’s say we are writing a service for a customer DB table. We want team members in marketing to be able to read data but not write data (read-only).

# Write Endpoint
@route('/update_customer', methods=['POST'])
def update_customer(customer_id):
  # Passed by permission validation layer
  user_metadata = request.get('user_metadata')
  # Marketing is read-only, so block writes from that group
  if 'marketing' in user_metadata.get('user_groups'):
    return 'Unauthorized', 401
  ...

# Read Endpoint (no group check, so every valid user can read)
@route('/get_customer', methods=['GET'])
def get_customer(customer_id):
  return sql.query.get(customer_id).json()  # db query logic

With these two levels of permission management, your organization can build robust permission and credential management. A user groups feature is a requirement here. Note that this paradigm requires a way to add users to user groups within the organization.
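As a rough illustration of the group-management piece, here is a minimal sketch of a registry that adds and removes users from groups and produces the metadata a route would check. Everything in it is an assumption for illustration: `UserGroupRegistry`, `READ_ONLY_GROUPS`, and `can_write_customers` are hypothetical names, and a real system would back this with a database or directory service rather than in-memory state.

```python
from collections import defaultdict


class UserGroupRegistry:
    """Hypothetical in-memory store mapping user IDs to group names."""

    def __init__(self):
        self._groups_by_user = defaultdict(set)

    def add_user_to_group(self, user_id, group):
        self._groups_by_user[user_id].add(group)

    def remove_user_from_group(self, user_id, group):
        # discard() is a no-op if the user was never in the group
        self._groups_by_user[user_id].discard(group)

    def user_metadata(self, user_id):
        # Shape mirrors the user metadata handed to routes
        return {"user_id": user_id, "user_groups": sorted(self._groups_by_user[user_id])}


# Groups that may read customer data but not write it (assumed for this sketch)
READ_ONLY_GROUPS = {"marketing"}


def can_write_customers(user_metadata):
    """Return True unless the user belongs to a read-only group."""
    return not READ_ONLY_GROUPS.intersection(user_metadata.get("user_groups", []))


registry = UserGroupRegistry()
registry.add_user_to_group("steve", "devops")
registry.add_user_to_group("steve", "backend")
registry.add_user_to_group("bob", "marketing")
```

A write route could then call `can_write_customers(registry.user_metadata("bob"))` and reject the request when it returns False. Keeping the rule in one helper means a change to the read-only groups does not require touching each route.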

Other Considerations

1. Documentation

For your services, keeping documentation up to date is crucial. One of the key purposes of an SOA is to reduce friction and the number of stakeholders devs need to go through. If docs are out of date, this benefit is greatly diminished.

If you have feedback on ways to ensure docs are kept up to date with any code changes, please feel free to leave it in the comments!

2. Observability

A large benefit of SOAs is a single point of observability: who is querying the data, what the request was, what was served back, and so on. Moreover, this single point can be used to create a robust audit log.
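To make the audit log idea concrete, here is a minimal sketch of a decorator that records each call at the service boundary. It assumes requests are plain dictionaries carrying a `user_id`, and `AUDIT_LOG` and `audited` are hypothetical names for this example; a real service would ship entries to a log store instead of a Python list.

```python
import time
from functools import wraps

# Hypothetical sink for audit entries (an assumption for this sketch)
AUDIT_LOG = []


def audited(handler):
    """Record who called the endpoint, what they asked for, and what was served."""
    @wraps(handler)
    def wrapper(request, *args, **kwargs):
        body, status = handler(request, *args, **kwargs)
        AUDIT_LOG.append({
            "timestamp": time.time(),
            "endpoint": handler.__name__,
            "caller": request.get("user_id"),
            "args": list(args),
            "status": status,
            "response": body,
        })
        return body, status
    return wrapper


@audited
def get_customer(request, customer_id):
    return {"customer_id": customer_id}, 200


# Every call through the service leaves one entry behind
get_customer({"user_id": "bob"}, 42)
```

Because every consumer goes through the same service, a single wrapper like this covers all callers, which would not be possible if each codebase talked to the data source directly.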

In a future blog post, I’ll discuss how WayScript X is helping dev teams set up and maintain SOAs for their products and services.
