Performance Tuning Alexa Skills using AWS Lambda

I’m an Alexa Development Champion and O’Reilly Author on Alexa. I create new Alexa skills every month, monitor the usage of those already running, and update them as needed.

AWS Lambda is a great service, and the default way to host the logic that sits behind skills on the popular Amazon Alexa platform. One of my Alexa skills that uses it gets popular at this time of year, especially from August through October. It’s called Hurricane Center, and it serves data from the National Hurricane Center on your Amazon Alexa device.

Recently with Hurricanes Harvey and Irma, it’s been seeing peak traffic, hitting new all-time usage highs. Given that it’s using the Serverless model, it automatically scales with the load, so I don’t have to do anything in support of the spike. It does make me take a look at my monthly AWS invoice, just to make sure I stay on top of the related costs. Here is my analysis from performance tuning the Hurricane Center skill, and the cost implications.

Performance Tuning AWS Lambda

One of the settings chosen when provisioning a Lambda function is its memory allocation. It’s controlled by a slider (see below) on the advanced settings tab of the console, and by default, starts out at 128MB. This setting can be adjusted without any coding changes to the Lambda function. Just save, and the platform will redeploy the function with the new capacity.
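The same change can also be scripted instead of made with the slider. Here is a minimal sketch, assuming the boto3 SDK and AWS credentials are already set up; the helper name `set_lambda_memory` and the function name `hurricane-center` are hypothetical placeholders, not the real configuration of my skill.

```python
def set_lambda_memory(client, function_name, memory_mb):
    """Ask Lambda to redeploy `function_name` with a new memory size in MB.

    `client` is expected to be a boto3 Lambda client, i.e. boto3.client("lambda").
    Returns the memory size that Lambda reports back after the update.
    """
    response = client.update_function_configuration(
        FunctionName=function_name,
        MemorySize=memory_mb,
    )
    return response["MemorySize"]


# Usage (needs boto3 and valid credentials; the function name is a placeholder):
#   import boto3
#   set_lambda_memory(boto3.client("lambda"), "hurricane-center", 512)
```

Passing the client in as an argument keeps the helper easy to try out against a stub before pointing it at a real account.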

By changing the memory allocation from the default of 128MB to 512MB (Lambda allocates CPU in proportion to memory, so this is roughly a 4x CPU change as well), the performance improves dramatically. Here is the graph of response time in CloudWatch showing daily average performance of the calls to this function. The change was made on August 14th. Previously the performance was in the 300ms range; it’s now averaging in the 100ms range, a 3X improvement! The traffic spikes I’ve received in the last two weeks haven’t made a dent in this either, as performance has stayed quick at around 100ms.

What is the Cost Impact?

A good question to ask anytime we throw extra resources at an application is what the cost impact will be. Something that seems to be inexpensive in relative terms can quickly add up.

The pricing structure that AWS has on Lambda is quite generous. This is especially true given the relative usage of an Alexa skill. A free tier exists on accounts, and the limits for staying under it are 1M requests, as well as 400,000 GB-seconds of compute time in a month. It’s the second measure that we are impacting by going with a higher memory allocation, as the volume of transactions stays the same.

Let’s do the Math for this Change

The impact of going from 128MB to 512MB doesn’t mean that the cost goes up by 4X. The unit of measure is the memory allocated multiplied by the transaction time, billed in increments of 100ms. Given that we see a 3X reduction in response time, the two changes almost offset each other.

Scenario #1 — Small (128MB) version

10k executions x 300ms x 128MB = 384 GB-seconds.

Scenario #2 — Larger (512MB) version

10k executions x 100ms x 512MB = 512 GB-seconds.

So the net result of quadrupling the memory is a 33% jump in consumption (512 vs. 384 GB-seconds), and my skill gets a 3X performance improvement. BOOM!
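For anyone who wants to plug in their own numbers, the arithmetic for the two scenarios can be sketched in a few lines of Python. The function name `billed_gb_seconds` is my own, and the default of 1 GB = 1000 MB simply matches the rounding used in the scenarios; AWS’s pricing examples divide by 1024, which gives 375 and 500 GB-seconds instead and the same ratio.

```python
import math


def billed_gb_seconds(executions, duration_ms, memory_mb, mb_per_gb=1000):
    """Estimate billed GB-seconds for a batch of identical invocations.

    Duration is rounded up to the next 100ms increment, as described in the
    article. mb_per_gb=1000 follows the article's arithmetic; pass 1024 to
    follow the convention in AWS's own pricing examples.
    """
    billed_ms = math.ceil(duration_ms / 100) * 100
    return executions * billed_ms * memory_mb / (1000 * mb_per_gb)


small = billed_gb_seconds(10_000, 300, 128)   # Scenario #1
large = billed_gb_seconds(10_000, 100, 512)   # Scenario #2
increase_pct = round((large / small - 1) * 100)

print(small, large, increase_pct)  # 384.0 512.0 33
```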

Don’t tune beyond 100ms

Going beyond 512MB is possible as the service supports sizes up to 1.5GB. There is “fine print” in the pricing model for Lambda highlighting that the duration of each transaction gets rounded up to the nearest 100ms. The implication is that once you get below 100ms in performance, any increase in memory will just grow the overall usage. Here’s the scenario matching the example above.

Scenario #3 — Largest (1.5GB) version

10k executions x 65ms (rounded up to 100ms) x 1.5GB = 1500 GB-seconds.

The net result is that tripling the memory may lead to improved performance, but the usage cost also goes up by about 3X (1500 vs. 512 GB-seconds).
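To see the rounding effect side by side, here is a rough sketch that runs all three scenarios through the same 100ms rounding. The memory sizes and durations are the ones quoted in the scenarios; the names `billed_ms` and `usage_table` are made up for illustration, and 1.5GB is treated as 1500MB to match the arithmetic in the scenarios.

```python
import math

# (memory in MB, average duration in ms) taken from the three scenarios
MEASUREMENTS = [(128, 300), (512, 100), (1500, 65)]


def billed_ms(duration_ms, increment_ms=100):
    """Round a duration up to the next billing increment."""
    return math.ceil(duration_ms / increment_ms) * increment_ms


def usage_table(measurements, executions=10_000):
    """Return (memory_mb, billed_ms, gb_seconds) for each measurement."""
    rows = []
    for memory_mb, duration_ms in measurements:
        ms = billed_ms(duration_ms)
        gb_seconds = executions * ms * memory_mb / 1_000_000
        rows.append((memory_mb, ms, gb_seconds))
    return rows


for memory_mb, ms, gb_seconds in usage_table(MEASUREMENTS):
    print(f"{memory_mb}MB billed at {ms}ms -> {gb_seconds} GB-seconds")
```

The 65ms run is billed exactly like a 100ms run, so the extra memory in the largest configuration shows up purely as extra usage.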

How much does my bill go up?

Now comes the absurd part. The first 400k GB-seconds per month is free, so unless the skill that uses this Lambda function gets insanely popular, it will stay under the free tier. Let’s do the math on how popular the Alexa skill would be to hit this level. (example assumes scenario #2 above)

33k executions/day x 30 days/month = 1.0M executions/month.

33k executions/day x 30 days/month x 100ms at 512MB = roughly 50k GB-seconds.
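The same comparison against the free tier can be sketched as a quick script. The two limits are the ones quoted earlier (1M requests and 400,000 GB-seconds per month); the function name `monthly_usage` and the 1 GB = 1000 MB simplification are my own assumptions, carried over from the scenario arithmetic.

```python
FREE_REQUESTS = 1_000_000      # free-tier requests per month
FREE_GB_SECONDS = 400_000      # free-tier GB-seconds per month


def monthly_usage(executions_per_day, billed_ms=100, memory_mb=512, days=30):
    """Return (requests, gb_seconds) for one month at a steady daily volume."""
    requests = executions_per_day * days
    gb_seconds = requests * billed_ms * memory_mb / 1_000_000
    return requests, gb_seconds


requests, gb_seconds = monthly_usage(33_000)
print(f"requests:   {requests} ({requests / FREE_REQUESTS:.0%} of free tier)")
print(f"GB-seconds: {gb_seconds} ({gb_seconds / FREE_GB_SECONDS:.0%} of free tier)")
```

At this volume the request allowance is almost used up while the compute allowance is only about an eighth consumed, which is why volume is the number to watch.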

So the thing to watch out for with this skill is not the memory, but rather the transaction volume. If you’re not familiar with Alexa skill usage, 33k executions/day is extremely high, and it’s only the top 1% of skills on the platform that could see volume along these lines. When developing Alexa skills, don’t hesitate to bump up the memory: it improves performance, and at these volumes it stays within the free tier!