Packing the AWS SDK in Your Deployment Artefact: Does It Help Your Infrastructure?

A version of the AWS SDK is always bundled into the Lambda runtime for your language. So the conventional wisdom says you don’t need to include it in your deployment artefact.

I’m here to tell you otherwise.

But first, let’s examine the argument for not packing the AWS SDK.

Deployment artefacts are smaller

This means deployment is faster. That said, the bulk of the deployment time is spent in CloudFormation rather than in uploading the artefact, so the actual time saving is limited and depends on your internet speed.

It also means you are less likely to hit the 75GB regional code storage limit. That’s fair, but I feel this risk can be mitigated with the lambda-janitor.

Lastly, you can actually see and edit your code in the Cloud9 editor! Alright, this is pretty useful when you just wanna quickly check something out. Although when you’re building a production service, you wouldn’t want to rely on the AWS console, ever. You certainly wouldn’t be making code changes directly in the console!

So what’s wrong with relying on the built-in version of the AWS SDK?

Old versions

The version of the built-in AWS SDK is often much older than the latest. This means it doesn’t have the latest security patches and bug fixes, or support for new features. Known vulnerabilities can exist in both the AWS SDK itself and its transitive dependencies.

This can also manifest itself in subtle differences in the behaviour or performance of your application. Or as bugs that you can’t replicate when you invoke the function locally. Both are problems that are difficult to track down.
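One way to make these differences visible is to log the SDK version your function actually sees at runtime. The snippet below is a minimal sketch, assuming a Python runtime; the helper name installed_version and the handler shape are illustrative and not taken from any particular project. It uses only the standard library, so it also runs on a machine where boto3 is not installed.

```python
from importlib import metadata
from typing import Dict, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the version of `package` found on the import path, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def handler(event, context) -> Dict[str, Optional[str]]:
    # Hypothetical handler: report which SDK versions this environment resolves.
    # Anything printed here would end up in the function's logs when run in Lambda.
    versions = {name: installed_version(name) for name in ("boto3", "botocore")}
    print(f"SDK versions in use: {versions}")
    return versions
```

Running the same handler locally and in the deployed function lets you compare the two version sets side by side.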

Invalidates tests

If you rely on tests that execute the Lambda code locally, then the fact that the runtime has a different version of the AWS SDK would invalidate your tests. Personally, I think you should always have end-to-end tests as well, which run against the deployed system and check that everything works as expected when running inside AWS.
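As an illustration, an end-to-end test could compare the SDK versions reported by the deployed function against the versions pinned in your project. The sketch below assumes exact pins in a requirements.txt-style file; the function names parse_pins and find_drift, and the version numbers, are placeholders rather than anything prescribed by AWS or a specific framework.

```python
from typing import Dict, Optional, Tuple


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Collect exact pins ("name==version") from requirements.txt-style text."""
    pins: Dict[str, str] = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_drift(
    pins: Dict[str, str], actual: Dict[str, Optional[str]]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (pinned, actual)} for each pinned package that differs."""
    return {
        name: (pinned, actual.get(name))
        for name, pinned in pins.items()
        if actual.get(name) != pinned
    }


# Illustrative values only: what you pinned vs. what the deployed function reported.
pins = parse_pins("boto3==1.9.238  # pinned\nbotocore==1.12.238\n")
reported = {"boto3": "1.9.200", "botocore": "1.12.238"}
drift = find_drift(pins, reported)
print(drift)
```

A test can then fail whenever the drift dictionary is non-empty, which surfaces a version mismatch before it shows up as a hard-to-reproduce bug.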

No immutable infrastructure

AWS can update the version of the built-in AWS SDK without notice. This can introduce subtle differences to your application’s behaviour without any action from you.

A while back, AWS updated the version of boto in the Python runtime. Unfortunately, the version they upgraded to had a bug. It caused many people’s functions to suddenly break. This affected Bustle amongst others and many hours were wasted on debugging the issue. As you can imagine, it wasn’t the easiest problem to track down!

Summary

In summary, my arguments against using the built-in AWS SDK are:

1. The built-in AWS SDK is often out-dated and missing security patches and bug fixes.
2. It invalidates integration tests since the runtime uses a different version of the AWS SDK to what was tested.
3. AWS can update the built-in AWS SDK without notice.

Ultimately, I think the tradeoff is between convenience and safety. And I, for one, am happy to sacrifice small conveniences for immutable infrastructure.

What about layers?

Quite a few of you have mentioned Lambda layers as a solution. Yes, you can ship the AWS SDK via layers instead. It would address points 1 and 3 above, but it introduces other challenges. I discussed those challenges in this post, so please give that a read.

Granted, some of those challenges have been tackled. For example, the Serverless framework now supports layers when you invoke functions locally, and you can introduce semantic versioning to layers by wrapping them in SAR apps. However, it remains a tooling problem if you want to run unit or integration tests before deploying your functions to AWS. In both cases, you’d still need a copy of the AWS SDK locally. For a dependency that is relatively small (a few MBs) and should be updated frequently, such as the AWS SDK, I don’t feel layers are worth the hassle they bring.

Hi, my name is Yan Cui. I’m an AWS Serverless Hero and the author of Production-Ready Serverless. I have run production workloads at scale in AWS for nearly 10 years and I have been an architect or principal engineer in a variety of industries ranging from banking, e-commerce and sports streaming to mobile gaming. I currently work as an independent consultant focused on AWS and serverless.

Originally published at theburningmonk.com on September 30, 2019