Building Maintenance Mode for your API Gateway Using Terraform

Written by omryhay | Published 2020/07/11
Tech Story Tags: terraform | aws | aws-api-gateway | cloud-computing | env0 | amazon | api | apis

TLDR: When the API is in maintenance, it should return a “503 Service Unavailable” response. This led us to the following solution: create a new API Gateway dedicated to maintenance mode. This API uses a mock integration and returns the same response for all endpoints through a proxy resource. The mocked API coexists with our real API, and we use a custom domain to switch the base path mapping between the real API and the mocked one. We use the ANY method together with “{proxy+}” to cover all resources with a simple configuration.

In my previous blog post, I went through how to create a maintenance mode page for your application and how to implement it using Terraform and GitHub Pages. But the website is just one part of an application, and often there’s also a public (or private) API that needs a maintenance mode of its own. Let’s see how we can do that using Terraform on API Gateway.

Architecture

As previously mentioned, we host all of our infrastructure on AWS, with a clear separation between the Application and the API:
  1. The front end is a React application, hosted on S3 with CloudFront as a CDN.
  2. The backend services are mostly serverless, using AWS Lambda, with API Gateway managing our public API.
  3. DNS is managed by Route 53.

Implementing a Solution

Beyond the requirements for our webpage maintenance mode, we had one new one: when the API is in maintenance, it should return a “503 Service Unavailable” response with an informative message. This led us to the following solution:
  1. Create a new API Gateway that will be dedicated to the maintenance mode
  2. This API will use a mock integration
  3. It will return the same response for all endpoints using a proxy resource
  4. This mocked API will coexist with our real API.
  5. We will use a custom domain and switch the base path mapping between the real API and the mocked one.
Based on that, let’s see the Terraform code in action.

Mocked API Gateway

In the code below, we use the ANY method together with “{proxy+}” to cover all resources with a simple configuration, and a response template to generate the “503 Service Unavailable” response with a clear message for the end user.
resource "aws_api_gateway_rest_api" "maintenance_mode_api" {
  name        = "maintenance-mode-API"
  description = "This is my Maintenance mode API"
}

resource "aws_api_gateway_resource" "maintenance_mode_resource" {
  rest_api_id = aws_api_gateway_rest_api.maintenance_mode_api.id
  parent_id   = aws_api_gateway_rest_api.maintenance_mode_api.root_resource_id
  path_part   = "{proxy+}"
}

resource "aws_api_gateway_method" "maintenance_mode_method" {
  rest_api_id   = aws_api_gateway_rest_api.maintenance_mode_api.id
  resource_id   = aws_api_gateway_resource.maintenance_mode_resource.id
  http_method   = "ANY"
  authorization = "NONE"
}

resource "aws_api_gateway_integration" "maintenance_mode_integration" {
  rest_api_id          = aws_api_gateway_rest_api.maintenance_mode_api.id
  resource_id          = aws_api_gateway_resource.maintenance_mode_resource.id
  http_method          = aws_api_gateway_method.maintenance_mode_method.http_method
  type                 = "MOCK"
  timeout_milliseconds = 29000
  passthrough_behavior = "WHEN_NO_MATCH"
  request_templates = {
    "application/json" = <<EOF
    {"statusCode": 503}
EOF
  }
}

resource "aws_api_gateway_method_response" "maintenance_mode_response_503" {
  rest_api_id = aws_api_gateway_rest_api.maintenance_mode_api.id
  resource_id = aws_api_gateway_resource.maintenance_mode_resource.id
  http_method = aws_api_gateway_method.maintenance_mode_method.http_method
  status_code = "503"
}

resource "aws_api_gateway_integration_response" "maintenance_mode_integration_response" {
  rest_api_id = aws_api_gateway_rest_api.maintenance_mode_api.id
  resource_id = aws_api_gateway_resource.maintenance_mode_resource.id
  http_method = aws_api_gateway_method.maintenance_mode_method.http_method
  status_code = aws_api_gateway_method_response.maintenance_mode_response_503.status_code

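  # The integration must exist before its response can be created
  depends_on = [aws_api_gateway_integration.maintenance_mode_integration]

  # Returns a static JSON body with the maintenance message; no backend is called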
  response_templates = {
    "application/json" = <<EOF
#set($inputRoot = $input.path('$')) { "message" : "Sorry for the inconvenience but we are performing some maintenance at the moment. We will be back online shortly!" }
EOF
  }
}

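resource "aws_api_gateway_deployment" "maintenance_mode_deployment" {
  # Wait for the integration response too, so the deployed snapshot includes the 503 mapping
  depends_on = [
    aws_api_gateway_integration.maintenance_mode_integration,
    aws_api_gateway_integration_response.maintenance_mode_integration_response,
  ]
  rest_api_id = aws_api_gateway_rest_api.maintenance_mode_api.id
  stage_name  = "prod"
}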
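One caveat worth noting: a greedy “{proxy+}” resource matches every path below the root, but not the root path (“/”) itself. If callers may hit the bare domain during maintenance, the same mock setup can be repeated on the root resource. The following is a minimal sketch under that assumption; the resource names ending in `_root` are hypothetical and not part of the original template.

```hcl
# Sketch only: ANY method on the root resource ("/"), which {proxy+} does not match.
# All "*_root" names are hypothetical additions for illustration.
resource "aws_api_gateway_method" "maintenance_mode_method_root" {
  rest_api_id   = aws_api_gateway_rest_api.maintenance_mode_api.id
  resource_id   = aws_api_gateway_rest_api.maintenance_mode_api.root_resource_id
  http_method   = "ANY"
  authorization = "NONE"
}

resource "aws_api_gateway_integration" "maintenance_mode_integration_root" {
  rest_api_id = aws_api_gateway_rest_api.maintenance_mode_api.id
  resource_id = aws_api_gateway_rest_api.maintenance_mode_api.root_resource_id
  http_method = aws_api_gateway_method.maintenance_mode_method_root.http_method
  type        = "MOCK"

  # Tells the mock integration which status code to select
  request_templates = {
    "application/json" = jsonencode({ statusCode = 503 })
  }
}

resource "aws_api_gateway_method_response" "maintenance_mode_response_503_root" {
  rest_api_id = aws_api_gateway_rest_api.maintenance_mode_api.id
  resource_id = aws_api_gateway_rest_api.maintenance_mode_api.root_resource_id
  http_method = aws_api_gateway_method.maintenance_mode_method_root.http_method
  status_code = "503"
}

resource "aws_api_gateway_integration_response" "maintenance_mode_integration_response_root" {
  # The integration must exist before its response can be created
  depends_on = [aws_api_gateway_integration.maintenance_mode_integration_root]

  rest_api_id = aws_api_gateway_rest_api.maintenance_mode_api.id
  resource_id = aws_api_gateway_rest_api.maintenance_mode_api.root_resource_id
  http_method = aws_api_gateway_method.maintenance_mode_method_root.http_method
  status_code = aws_api_gateway_method_response.maintenance_mode_response_503_root.status_code

  response_templates = {
    "application/json" = jsonencode({
      message = "Sorry for the inconvenience but we are performing some maintenance at the moment. We will be back online shortly!"
    })
  }
}
```

If you add these resources, the deployment's `depends_on` list should include them as well, so that the deployed stage contains the root method.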
Now we can control maintenance mode through a custom domain: the base path mapping points to the real API during normal operation and to the mocked API when maintenance mode is on.
resource "aws_api_gateway_domain_name" "domain_name" {
  certificate_arn = var.certificate_arn
  domain_name     = var.domain_name
}

resource "aws_api_gateway_base_path_mapping" "base_mapping" {
  api_id      = var.is_maintenance_mode ? aws_api_gateway_rest_api.maintenance_mode_api.id : aws_api_gateway_rest_api.api.id
  stage_name  = var.is_maintenance_mode ? aws_api_gateway_deployment.maintenance_mode_deployment.stage_name : aws_api_gateway_deployment.deployment.stage_name
  domain_name = aws_api_gateway_domain_name.domain_name.domain_name
  base_path   = "/"
}
Also in AWS, we want to ensure that Route53 points to the custom domain we’ve created. Unlike the application maintenance mode, here it is the base path mapping on the custom domain that switches maintenance mode on and off, not the DNS record.
resource "aws_route53_record" "api_dns_record" {
  zone_id = var.route53_zone_id
  name    = var.route53_suffix_domain_name
  type    = "A"
  alias {
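    # certificate_arn creates an edge-optimized domain, so alias to its CloudFront distribution.
    # For a REGIONAL domain, use regional_domain_name and regional_zone_id instead.
    name                   = aws_api_gateway_domain_name.domain_name.cloudfront_domain_name
    zone_id                = aws_api_gateway_domain_name.domain_name.cloudfront_zone_id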
    evaluate_target_health = false
  }
}
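The snippets above reference several input variables. A minimal sketch of how they might be declared is shown here; the variable names come from the code above, while the types, the default, and the descriptions are assumptions (only `is_maintenance_mode` is clearly a boolean, since it drives a conditional expression).

```hcl
# Sketch only: names are taken from the snippets above; types, default,
# and descriptions are assumptions for illustration.
variable "is_maintenance_mode" {
  type        = bool
  default     = false
  description = "When true, the custom domain maps to the mocked maintenance API"
}

variable "certificate_arn" {
  type        = string
  description = "ARN of an existing ACM certificate for the custom domain"
}

variable "domain_name" {
  type        = string
  description = "Custom domain name for the API"
}

variable "route53_zone_id" {
  type        = string
  description = "ID of an existing Route53 hosted zone"
}

variable "route53_suffix_domain_name" {
  type        = string
  description = "Record name to create in the hosted zone"
}
```

Defaulting `is_maintenance_mode` to `false` means a plain deployment always serves the real API, and maintenance mode has to be switched on explicitly.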
⚠️ Note that this Terraform code creates neither the Route53 hosted zone nor the SSL certificate; you need to set those up as appropriate for your own environment ⚠️

Deployment

Now that our system is fully configured, all I have to do is set the maintenance mode Terraform variable to true or false and deploy the environment (in our case via the env0 UI).
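If you deploy with the plain Terraform CLI rather than a UI, one way to flip the switch is a small variables file. This is a sketch; the file name `maintenance.tfvars` is hypothetical.

```hcl
# maintenance.tfvars (hypothetical file name)
# Points the custom domain's base path mapping at the mocked API.
is_maintenance_mode = true
```

Running `terraform apply -var-file=maintenance.tfvars` would turn maintenance mode on, and a subsequent `terraform apply -var="is_maintenance_mode=false"` would switch back to the real API. Because only the base path mapping changes, both APIs stay deployed the whole time.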
The complete template source code, including all the Terraform code, can be found in this GitHub repo. We hope you find this useful, or that it gives you ideas for other ways to use Terraform in your deployment workflows.

About env0

env0 lets your team manage their own environments in AWS, Azure and Google, governed by your policies and with complete visibility & cost management. You can learn more about env0 here and you can also try it out yourself.
Feel free to drop us your thoughts below!

Published by HackerNoon on 2020/07/11