How to Use AWS Lambda Authorizer for Flexible and Scalable Web Service Authorization

Introduction

On one of the projects I worked on, there were 8 services that used Auth0 for front-end authentication and a rotated static token for back-end authentication. Despite this, the main authentication and authorization single sign-on (SSO) system was Microsoft Azure Active Directory (AD), which was used for accessing company email, the corporate VPN, AWS infrastructure, etc.

The reason Azure AD wasn’t used exclusively for user identification has more to do with the history of the project.

Implementation

Original Auth0 Flow

Even though a hardcoded token was periodically rotated, there was still a risk that a developer or system user could access the system after leaving the company.

While Auth0 was mentioned as part of our setup, my company predominantly utilized Microsoft Azure as a Single Sign-On (SSO) platform, and we aimed to integrate it further.

Let's break down what occurs on the front-end part of the process:

The initial step involves landing on the authentication page.
After entering the login, password, and two-factor authentication code, the application redirects us back to the service, appending a “?code=” parameter to the URL.
The front-end library then exchanges this code for an access token. It's important to note that to exchange the code for an access token, the application must be configured to allow this without needing a client secret key.
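The redirect and exchange steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual front-end code. It assumes the provider uses the authorization code flow with PKCE (which is what lets a browser app exchange the code without a client secret). The client ID, redirect URI, and verifier values are hypothetical placeholders.

```javascript
// Extract the "?code=" parameter that the identity provider appends on redirect.
function extractAuthCode(redirectUrl) {
  const code = new URL(redirectUrl).searchParams.get('code');
  if (!code) {
    throw new Error('No authorization code found in redirect URL');
  }
  return code;
}

// Build the form body for exchanging the code for an access token.
// With PKCE, a code_verifier is sent instead of a client secret.
function buildTokenExchangeBody({ clientId, code, redirectUri, codeVerifier }) {
  return new URLSearchParams({
    grant_type: 'authorization_code',
    client_id: clientId,
    code,
    redirect_uri: redirectUri,
    code_verifier: codeVerifier,
  });
}

// Hypothetical values, for illustration only.
const code = extractAuthCode('https://app.example.com/callback?code=abc123&state=xyz');
const body = buildTokenExchangeBody({
  clientId: 'my-client-id',
  code,
  redirectUri: 'https://app.example.com/callback',
  codeVerifier: 'random-verifier-generated-before-redirect',
});
console.log(body.toString());
```

In practice the front-end library performs both steps and POSTs this body to the provider's token endpoint.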

As for the communication between the front-end and back-end: despite the OAuth authentication steps described above, the application sends a predefined token, which the back-end checks for an exact match. This step is pretty standard.

New Microsoft Azure AD Flow

To enhance security and add greater flexibility to our system configuration — with the ultimate goal of getting rid of Auth0 and fully integrating Azure AD into existing infrastructure — I implemented a Lambda Authorizer approach.

From the AWS doc point of view:

Technically, we could validate the token in the back-end without changing the infrastructure or adding an extra authorization layer. That could be even more secure: instead of obtaining an access token on the front-end, we would first acquire an OAuth authorization code, which is then exchanged for an access token in the API back-end.

Although this process might seem simpler, it's important to remember that we needed to accommodate an authentication flow for 8 different services.

Once it’s done with a Lambda authorizer for one service, it can be applied to other services by simply placing API Gateway with the Lambda authorizer in the middle of the request path.

Additionally, we required functionality to validate user permissions, determining whether users are authorized to access a service or to make requests to specific endpoints of a given service.

ENV/Service	Reporting service	Advertisement service
dev	✅	✅
prod	✘	✅

All of this can be managed within a single Lambda function.
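The permission table above could be represented inside the function as a simple lookup. This is only a sketch: the service keys and the `isAllowed` helper are hypothetical names, and in the real flow the matrix would be derived from the identity provider rather than hardcoded.

```javascript
// Permission matrix mirroring the table above:
// service -> environments the user may access.
const permissions = {
  reporting: { dev: true, prod: false },
  advertisement: { dev: true, prod: true },
};

// Returns true only when the service/environment pair is explicitly allowed.
function isAllowed(matrix, service, env) {
  return Boolean(matrix[service] && matrix[service][env] === true);
}

console.log(isAllowed(permissions, 'reporting', 'prod'));     // false
console.log(isAllowed(permissions, 'advertisement', 'prod')); // true
```

Anything that is not explicitly marked as allowed, including unknown services or environments, is treated as denied.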

Infrastructure Configuration and Challenges

Normally, Route 53 is used to point a domain's A record straight to the server where the website or service lives.

In my case, the A record was pointing to an AWS Application Load Balancer (ALB), set up to handle secure (HTTPS) connections on port 443, using an SSL certificate. The problem was that this certificate was only valid for hostnames matching *.my-service-domain.com.

But, when I used API Gateway as a middleman between Route 53 and the ALB, the ALB rejected connections because they were coming from API Gateway, not from the allowed domain name in the certificate.

To fix this, I moved SSL certificate validation to the API Gateway level instead. However, I still needed to make sure the ALB would only accept connections from API Gateway. A big issue is that API Gateway doesn't have a fixed IP address, so I couldn't configure a security group that restricts the ALB to connections from that address.

So, I replaced the ALB with a Network Load Balancer (NLB), which can work with the API Gateway VPC link.

Also, I had different setups for development and production environments, needing separate paths like /dev and /prod for API Gateway. But, when setting up /dev and /prod in API Gateway, it gave me URLs like this: https://apiId123456.execute-api.us-west-2.amazonaws.com/dev and https://apiId123456.execute-api.us-west-2.amazonaws.com/prod.

Since Route 53 A records can't handle URLs with paths in them, I couldn't use these URLs directly.

The solution was to use API Gateway's custom domains feature. This way, I could make unique URLs for each environment, like:

custdomainForDev.execute-api.us-west-2.amazonaws.com
custdomainForProd.execute-api.us-west-2.amazonaws.com

Then, I could use these URLs as A record values in Route 53, solving my problem.

Lambda Authorizer

API Gateway expects a policy returned by the Lambda function that allows or denies endpoints access.
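Before looking at a full template, here is a minimal sketch of the response shape: a principal identifier plus an IAM policy document for the execute-api:Invoke action. The `generatePolicy` helper name and the `reports` path are illustrative assumptions; the principal and ARN values are placeholders.

```javascript
// Build the smallest useful authorizer response:
// who the caller is, and whether they may invoke the given resource.
function generatePolicy(principalId, effect, resource) {
  return {
    principalId,
    policyDocument: {
      Version: '2012-10-17',
      Statement: [
        {
          Action: 'execute-api:Invoke',
          Effect: effect, // 'Allow' or 'Deny'
          Resource: resource,
        },
      ],
    },
  };
}

const example = generatePolicy(
  'user|a1b2c3d4',
  'Allow',
  'arn:aws:execute-api:us-east-1:1234567899:apiId1234/dev/GET/reports'
);
console.log(JSON.stringify(example, null, 2));
```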

Let’s have a look at the example https://github.com/amazon-archives/serverless-app-examples/blob/master/javascript/api-gateway-authorizer-nodejs/index.js (archived, but can be used as a template).

'use strict';

console.log('Loading function');

class AuthPolicy {
  constructor(principal, awsAccountId, apiOptions) {
    this.awsAccountId = awsAccountId;
    this.principalId = principal;
    this.version = '2012-10-17';
    this.pathRegex = new RegExp('^[/.a-zA-Z0-9-*]+$');
    this.allowMethods = [];
    this.denyMethods = [];

    this.restApiId = apiOptions && apiOptions.restApiId ? apiOptions.restApiId : '*';
    this.region = apiOptions && apiOptions.region ? apiOptions.region : '*';
    this.stage = apiOptions && apiOptions.stage ? apiOptions.stage : '*';
  }

  static HttpVerb = {
    GET: 'GET',
    POST: 'POST',
    PUT: 'PUT',
    PATCH: 'PATCH',
    HEAD: 'HEAD',
    DELETE: 'DELETE',
    OPTIONS: 'OPTIONS',
    ALL: '*',
  };

  setStage(stage) {
    this.stage = stage;
  }

  addMethod(effect, verb, resource, conditions) {
    if (verb !== '*' && !Object.prototype.hasOwnProperty.call(AuthPolicy.HttpVerb, verb)) {
      throw new Error(`Invalid HTTP verb ${verb}. Allowed verbs in AuthPolicy.HttpVerb`);
    }

    if (!this.pathRegex.test(resource)) {
      throw new Error(`Invalid resource path: ${resource}. Path should match ${this.pathRegex}`);
    }

    let cleanedResource = resource.startsWith('/') ? resource.substring(1) : resource;
    const resourceArn = `arn:aws:execute-api:${this.region}:${this.awsAccountId}:${this.restApiId}/${this.stage}/${verb}/${cleanedResource}`;

    if (effect.toLowerCase() === 'allow') {
      this.allowMethods.push({ resourceArn, conditions });
    } else if (effect.toLowerCase() === 'deny') {
      this.denyMethods.push({ resourceArn, conditions });
    }
  }

  getEmptyStatement(effect) {
    return {
      Action: 'execute-api:Invoke',
      Effect: effect[0].toUpperCase() + effect.substring(1).toLowerCase(),
      Resource: []
    };
  }

  getStatementsForEffect(effect, methods) {
    const statements = [];
    if (methods.length > 0) {
      const statement = this.getEmptyStatement(effect);

      methods.forEach(method => {
        if (!method.conditions || method.conditions.length === 0) {
          statement.Resource.push(method.resourceArn);
        } else {
          const conditionalStatement = this.getEmptyStatement(effect);
          conditionalStatement.Resource.push(method.resourceArn);
          conditionalStatement.Condition = method.conditions;
          statements.push(conditionalStatement);
        }
      });

      if (statement.Resource.length > 0) {
        statements.push(statement);
      }
    }

    return statements;
  }

  allowAllMethods() {
    this.addMethod('allow', '*', '*', null);
  }

  denyAllMethods() {
    this.addMethod('deny', '*', '*', null);
  }

  allowMethod(verb, resource) {
    this.addMethod('allow', verb, resource, null);
  }

  denyMethod(verb, resource) {
    this.addMethod('deny', verb, resource, null);
  }

  allowMethodWithConditions(verb, resource, conditions) {
    this.addMethod('allow', verb, resource, conditions);
  }

  denyMethodWithConditions(verb, resource, conditions) {
    this.addMethod('deny', verb, resource, conditions);
  }

  build() {
    if ((!this.allowMethods || this.allowMethods.length === 0) &&
      (!this.denyMethods || this.denyMethods.length === 0)) {
      throw new Error('No statements defined for the policy');
    }

    const policy = {
      principalId: this.principalId,
      policyDocument: {
        Version: this.version,
        Statement: [
          ...this.getStatementsForEffect('Allow', this.allowMethods),
          ...this.getStatementsForEffect('Deny', this.denyMethods)
        ]
      }
    };

    return policy;
  }
}

export const handler = (event, context, callback) => {
  const token = event.authorizationToken;

  let authContext;
  if (process.env.IS_OFFLINE) { // for local development
    authContext = JSON.parse(event.body);
  } else {
    authContext = event;
  }

  console.log('authContext', authContext);

  // [The token validation logic should be here]

  const principalId = 'user|a1b2c3d4';

  const apiOptions = {};
  const tmp = authContext.methodArn.split(':');
  const apiGatewayArnTmp = tmp[5].split('/');
  const awsAccountId = tmp[4];
  apiOptions.region = tmp[3];
  apiOptions.restApiId = apiGatewayArnTmp[0];
  apiOptions.stage = '*';

  const policy = new AuthPolicy(principalId, awsAccountId, apiOptions);

  policy.allowAllMethods();

  const authResponse = policy.build();

  authResponse.context = {
    key: 'value',
    number: 1,
    bool: true
  };

  callback(null, authResponse);
};

The result of a function test run

{
  "type": "TOKEN",
  "authorizationToken": "12345678901234567890123456789012345678901234567890",
  "methodArn": "arn:aws:execute-api:us-east-1:1234567899:apiId1234/dev/POST/{proxy+}"
}

will be

{
  "principalId": "user|a1b2c3d4",
  "policyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Action": "execute-api:Invoke",
        "Effect": "Allow",
        "Resource": [
          "arn:aws:execute-api:us-east-1:1234567899:apiId1234/*/*/*"
        ]
      }
    ]
  },
  "context": {
    "key": "value",
    "number": 1,
    "bool": true
  }
}

This tells API Gateway that the user is authorized to access any stage (environment), with any request method, on any endpoint of the apiId1234 API.
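To make the relationship between the incoming methodArn and the wildcard resource in the policy more concrete, here is a small sketch that splits the ARN into its parts and rebuilds it with wildcards. The helper names `parseMethodArn` and `buildResourceArn` are hypothetical and are not part of the AWS template.

```javascript
// Split "arn:aws:execute-api:<region>:<account>:<apiId>/<stage>/<verb>/<resource...>"
// into its components.
function parseMethodArn(methodArn) {
  const [, , , region, awsAccountId, apiPart] = methodArn.split(':');
  const [restApiId, stage, verb, ...resourceParts] = apiPart.split('/');
  return { region, awsAccountId, restApiId, stage, verb, resource: resourceParts.join('/') };
}

// Rebuild a resource ARN; stage, verb, and resource default to wildcards.
function buildResourceArn({ region, awsAccountId, restApiId }, stage = '*', verb = '*', resource = '*') {
  return `arn:aws:execute-api:${region}:${awsAccountId}:${restApiId}/${stage}/${verb}/${resource}`;
}

const parsed = parseMethodArn('arn:aws:execute-api:us-east-1:1234567899:apiId1234/dev/POST/{proxy+}');
console.log(parsed.stage, parsed.verb, parsed.resource); // dev POST {proxy+}
console.log(buildResourceArn(parsed)); // arn:aws:execute-api:us-east-1:1234567899:apiId1234/*/*/*
```

Narrowing the policy is then a matter of passing a concrete stage, verb, or path instead of the defaults.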

Optionally, the authorizer response can include a context object containing additional information, which API Gateway can pass to the integration back-end, as described in the AWS documentation.
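One detail worth keeping in mind: according to the AWS documentation, context values must be strings, numbers, or booleans, so arrays and objects cannot be passed as-is. A minimal sketch of one way to handle this is to serialize nested values to JSON strings. The `toAuthorizerContext` helper and the field names are hypothetical.

```javascript
// Keep primitive values, serialize arrays/objects to JSON strings,
// and drop null/undefined entries.
function toAuthorizerContext(values) {
  const context = {};
  for (const [key, value] of Object.entries(values)) {
    const type = typeof value;
    if (type === 'string' || type === 'number' || type === 'boolean') {
      context[key] = value;
    } else if (value !== null && value !== undefined) {
      context[key] = JSON.stringify(value);
    }
  }
  return context;
}

const context = toAuthorizerContext({
  userId: 'a1b2c3d4',
  groups: ['Reporting-service.dev.approved'],
  isAdmin: false,
  missing: undefined,
});
console.log(context);
```

The back-end would then need to parse the `groups` string back into an array.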

Certainly, the above description serves as just a framework, since it lacks the token and permissions validation logic. Let’s add some now.

Authorization

Please note that the code below is a simplified version of the app's authorization logic.

First, add a token validation function to the authorizer:

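// Validates the access token by calling Microsoft Graph with it:
// a 200 response means the token was accepted.
async function validateToken(token) {
  const res = await fetch('https://graph.microsoft.com/v1.0/me', {
    headers: {
      Authorization: `Bearer ${token}`
    }
  });

  return res.status === 200;
}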

Here, token is the value API Gateway passes to the Lambda function. The function checks that the token is valid and that the session still exists.
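API Gateway passes the header value to the function unchanged, so if clients send it as "Bearer <token>", the prefix has to be stripped before validateToken adds its own. Whether that applies depends on how the clients are configured, so treat the following as a sketch under that assumption. The `extractBearerToken` helper is a hypothetical name.

```javascript
// Return the raw token from an Authorization header value,
// or null when the value is missing or malformed.
function extractBearerToken(authorizationToken) {
  if (typeof authorizationToken !== 'string') return null;
  const value = authorizationToken.trim();
  const match = value.match(/^Bearer\s+(\S+)$/i);
  if (match) return match[1];
  // No "Bearer" prefix: accept the value only if it is a single token without spaces.
  return value.length > 0 && !/\s/.test(value) ? value : null;
}

console.log(extractBearerToken('Bearer abc.def.ghi')); // abc.def.ghi
console.log(extractBearerToken('abc.def.ghi'));        // abc.def.ghi
console.log(extractBearerToken('Basic dXNlcjpwYXNz')); // null
```

A null result can be treated as an immediate deny without calling Microsoft Graph at all.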

Next, modify the handler to validate the token. Because validateToken is asynchronous, the handler must be declared async for the await below to work.

if (await validateToken(token)) {
  policy.allowAllMethods();
} else {
  policy.denyAllMethods();
}
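One design choice to consider here is what happens when the validation call itself fails, for example on a network error. A minimal sketch of a "fail closed" wrapper is shown below: any error results in a deny. The `decideEffect` helper is hypothetical, and the validator is passed in as a parameter so the logic can be exercised without calling Microsoft Graph.

```javascript
// Resolve to 'Allow' only when the validator explicitly returns true.
// Any thrown error (network failure, timeout, bad response) results in 'Deny'.
async function decideEffect(validate, token) {
  try {
    return (await validate(token)) === true ? 'Allow' : 'Deny';
  } catch (err) {
    console.error('Token validation failed:', err.message);
    return 'Deny';
  }
}

// Example with a stand-in validator that simulates an outage.
decideEffect(async () => { throw new Error('network down'); }, 'some-token')
  .then((effect) => console.log(effect)); // Deny
```

In the handler, the result would select between policy.allowAllMethods() and policy.denyAllMethods().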

More Fine-Grained Authorization

Next, extract the groups that the user belongs to. Microsoft Graph exposes them through the "List appRoleAssignments granted to a user" endpoint:

GET https://graph.microsoft.com/v1.0/me/appRoleAssignments

Authorization: Bearer token

Response:

{
  "value": [
    {
      "id": "12345667-12345678891233455678-12345667",
      "deletedDateTime": null,
      "appRoleId": "00000000-0000-0000-0000-000000000000",
      "createdDateTime": "2021-02-02T04:22:45.9480566Z",
      "principalDisplayName": "Reporting-service.dev.approved",
      "principalId": "12344567-1234-1234-123-12345678900",
      "principalType": "Group",
      "resourceDisplayName": "dxprovisioning-graphapi-client",
      "resourceId": "12344567-1234-1234-123-12345678900"
    },
    {
      "id": "12345667-12345678891233455678-12345667",
      "deletedDateTime": null,
      "appRoleId": "00000000-0000-0000-0000-000000000000",
      "createdDateTime": "2021-02-02T04:22:45.9480566Z",
      "principalDisplayName": "Advertisement-service.dev.approved",
      "principalId": "12344567-1234-1234-123-12345678900",
      "principalType": "Group",
      "resourceDisplayName": "dxprovisioning-graphapi-client",
      "resourceId": "12344567-1234-1234-123-12345678900"
    },
    {
      "id": "12345667-12345678891233455678-12345667",
      "deletedDateTime": null,
      "appRoleId": "00000000-0000-0000-0000-000000000000",
      "createdDateTime": "2021-02-02T04:22:45.9480566Z",
      "principalDisplayName": "Advertisement-service.prod.approved",
      "principalId": "12344567-1234-1234-123-12345678900",
      "principalType": "Group",
      "resourceDisplayName": "dxprovisioning-graphapi-client",
      "resourceId": "12344567-1234-1234-123-12345678900"
    }
  ]
}

Let us add some code to handle that:

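// Returns the display names of the groups assigned to the signed-in user.
async function getUserGroups(token) {
  const res = await fetch('https://graph.microsoft.com/v1.0/me/appRoleAssignments', {
    headers: {
      Authorization: `Bearer ${token}`
    }
  });
  if (!res.ok) {
    throw new Error(`Graph request failed with status ${res.status}`);
  }
  // fetch() returns a Response object; the JSON body has to be parsed explicitly
  const data = await res.json();
  return data.value.map(group => group.principalDisplayName);
}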

// const userGroups = await getUserGroups(token);
// hardcoded here for visibility
const userGroups = [
  "Advertisement-service.dev.approved",
  "Advertisement-service.prod.approved",
  "Reporting-service.dev.approved"
];

userGroups.forEach(group => {
  if (group.includes("Advertisement-service")) {
    const env = group.split('.')[1]; // Extracts 'dev' or 'prod'
    policy.setStage(env);
    policy.allowMethod(AuthPolicy.HttpVerb.ALL, "/advertisement/*");
  }
});

// Explicitly allow access to dev endpoints for Reporting-service
if (userGroups.includes("Reporting-service.dev.approved")) {
  policy.setStage('dev');
  policy.allowMethod(AuthPolicy.HttpVerb.ALL, "reporting/*");
}

// Explicitly deny access to production endpoints for Reporting-service if not in the prod approved group
if (!userGroups.includes("Reporting-service.prod.approved")) {
  policy.setStage('prod');
  policy.denyMethod(AuthPolicy.HttpVerb.ALL, "reporting/*");
}
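With 8 services, writing a separate branch per service gets repetitive. As a sketch of a more generic variant, the allow rules can be derived from the group names themselves. This assumes every group follows the "<Service>-service.<env>.approved" pattern seen in the sample response and that each service's path prefix is its lowercased name; both are assumptions, and the helper names are hypothetical.

```javascript
// Parse "<Service>-service.<env>.approved" into { service, env };
// return null for anything that does not match the pattern.
function parseGroup(groupName) {
  const match = /^([A-Za-z]+)-service\.(dev|prod)\.approved$/.exec(groupName);
  return match ? { service: match[1].toLowerCase(), env: match[2] } : null;
}

// Turn a list of group names into allow rules, ignoring unrelated groups.
function allowRulesFromGroups(groups) {
  return groups
    .map(parseGroup)
    .filter((parsed) => parsed !== null)
    .map(({ service, env }) => ({ stage: env, verb: '*', resource: `/${service}/*` }));
}

const rules = allowRulesFromGroups([
  'Advertisement-service.dev.approved',
  'Reporting-service.dev.approved',
  'Some-other-group',
]);
console.log(rules);
```

Each rule could then be applied with policy.setStage(rule.stage) followed by policy.allowMethod(rule.verb, rule.resource).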

The resulting policy, shown below, has fine-grained permissions. To reduce the number of authorizer invocations, API Gateway can cache the policy for 0 to 3600 seconds (60 minutes), where 0 disables caching. This value should be chosen carefully, taking the access token expiration time into account.

{
  "principalId": "user|a1b2c3d4",
  "policyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Action": "execute-api:Invoke",
        "Effect": "Allow",
        "Resource": [
          "arn:aws:execute-api:us-east-1:1234567899:apiId1234/dev/*/advertisement/*",
          "arn:aws:execute-api:us-east-1:1234567899:apiId1234/prod/*/advertisement/*",
          "arn:aws:execute-api:us-east-1:1234567899:apiId1234/dev/*/reporting/*"
        ]
      },
      {
        "Action": "execute-api:Invoke",
        "Effect": "Deny",
        "Resource": [
          "arn:aws:execute-api:us-east-1:1234567899:apiId1234/prod/*/reporting/*"
        ]
      }
    ]
  },
  "context": {
    "key": "value",
    "number": 1,
    "bool": true
  }
}
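Returning to the caching trade-off mentioned before the policy: a cached Allow keeps being served until the cache entry expires, even if the token expires in the meantime. The sketch below only illustrates that arithmetic. The `worstCaseOverrunSeconds` helper is hypothetical, and it assumes the remaining token lifetime at validation time is known.

```javascript
// Worst case: the policy is cached at the moment of validation and served for
// the full TTL, even if the token expires before the cache entry does.
function worstCaseOverrunSeconds(cacheTtlSeconds, tokenRemainingSeconds) {
  if (cacheTtlSeconds < 0 || cacheTtlSeconds > 3600) {
    throw new RangeError('TTL must be between 0 and 3600 seconds');
  }
  return Math.max(0, cacheTtlSeconds - Math.max(0, tokenRemainingSeconds));
}

console.log(worstCaseOverrunSeconds(300, 120));  // 180
console.log(worstCaseOverrunSeconds(300, 3000)); // 0
console.log(worstCaseOverrunSeconds(0, 10));     // 0
```

With a 300-second TTL and a token that has 120 seconds left, access could continue for up to 180 seconds after expiration, which is the kind of gap to weigh when picking the TTL.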

Conclusion

We have discussed a flexible approach to user authorization for accessing resources, utilizing any logic and provider not natively supported by AWS. Although our implementation uses Azure AD, this method could easily accommodate any other authorization service, including a custom one.

This approach offers several benefits:

It provides a unified authentication and authorization point for both the front-end and back-end.

It offers flexibility in configuring permissions.

It scales: once implemented for one service, it can be applied to many others.