Alexa, tell me how to build a skill

Written by slobodan | Published 2017/07/02
Tech Story Tags: amazon-echo | aws-lambda | alexa | nodejs | tutorial


Amazon Alexa is one of the most popular voice assistants available on the market today. Building Alexa skills is fun and it can be beneficial to your product as a cool feature or another marketing channel.

This article is a tutorial on how to design and build an Alexa skill using Node.js and AWS Lambda.

Why would you build an Alexa skill?

With tens of millions of Alexa-enabled devices sold by September 2017, according to Jeff Bezos¹, and more than 25,000 available skills², Alexa is one of the most popular voice assistants on the market.

According to Jeff Bezos, Amazon sold tens of millions of Alexa-enabled devices by September 2017

Alexa skills can add value to your product, either as new features or as an additional marketing channel. They can also be a fun weekend side project that gives your living room superpowers: just imagine telling Alexa to turn on your Christmas tree lights in front of your non-programmer friends.

A good indicator of the growth rate of Alexa-enabled devices is the number of daily downloads of the Alexa app³ on the Google Play store, shown in the next image.

Number of daily Alexa app downloads on Google Play store in 2017

Let’s build a skill

Because you started reading an article called “Alexa, tell me how to build a skill,” I am sure you are more interested in building skills than in download numbers, so let’s jump to the interesting bits.

This article is a tutorial on how to build an Alexa skill, and it will cover the following:

  • How to define and design a simple Alexa skill.
  • How to program a skill using Node.js.
  • How to host and connect the skill with your Alexa.

The question is: what skill should we build?

I know you can come up with a lot of great ideas, but because crypto currencies are probably the top buzzword today, let’s create a simple skill that will be able to tell you the exchange rate of a crypto currency.

It should also be able to tell you the amount of crypto currency you can get for a specified amount of euros: for example, “How many Litecoins can I get for 1000 euros?”

Meet Crypto, your first Alexa skill

Crypto is a simple Alexa skill that can tell you the current price of Bitcoin, Ethereum, and Litecoin.

Crypto, an Alexa skill for current crypto currency exchange rate

Skill design

The most important part of a good Alexa skill is its interaction design. Interaction design describes how you talk to the skill and how natural the communication is.

As opposed to web or desktop applications, most Alexa skills (except some skills on the Echo Show device) don’t have visual feedback that will guide users through your skill features. Instead, you need to guide the user through the skill using voice. Every reply from your skill needs to tell the user clearly what the next options are.

A detailed look at interaction design is beyond the scope of this tutorial as it would require much more than an article. Fortunately, there are many good resources on the internet. As a good starting point, read Amazon’s official voice design guide at https://developer.amazon.com/designing-for-voice/.

Getting back to our Crypto skill, there are two different scenarios that we want to cover:

  1. Asking for the current crypto currency exchange rate.
  2. Asking for the amount of crypto currency you can get for a certain amount of euros.

For the current crypto currency exchange rate, we need the name of the crypto currency in the command:

  • “Alexa, ask Crypto for the Bitcoin price”
  • “Alexa, ask Crypto how much is Litecoin”
  • “Alexa, ask Crypto for the current Ethereum value”

For the amount of a crypto currency for a certain amount of euros, we can use the following commands:

  • “Alexa, ask Crypto how many Bitcoins can I get for 1,000 euros”
  • “Alexa, ask Crypto for the amount of Ethereums for 1,500 euros”
  • “Alexa, ask Crypto for the number of Litecoins I can get for 500 euros”

Now that we have an idea about what we want to build, let’s see how Alexa skills work.

How does the skill work?

An Alexa skill is basically a small application that interacts with Alexa via an HTTP webhook or AWS Lambda function.

The flow of an Alexa skill request includes the following:

  • Alexa-enabled device (for example, Amazon Echo or Amazon Echo Dot).
  • Amazon Alexa, a smart personal assistant in the cloud.
  • Your AWS Lambda function or HTTP webhook, which receives the request from Alexa and replies with text that will be transformed to voice, audio files, and Echo Show content.

Amazon Alexa is not a device, it’s an intelligent personal assistant in the cloud, made popular by the Amazon Echo and the Amazon Echo Dot devices.

The full flow of an Alexa command looks like the following image:

Flow of the Alexa command

In this tutorial we’ll use AWS Lambda because it doesn’t require a server setup and it’s free for a skill such as this one⁴. If you want to learn more about the advantages and disadvantages of AWS Lambda and serverless in general, see the introduction to serverless here⁵.

Anatomy of an Alexa skill invocation

Some of the main characteristics of Amazon Alexa, and voice assistants in general, are:

  • Alexa has a built-in natural language processing (NLP) engine that will parse the request and pass the data in JSON format.
  • Voice assistants typically require a wake word or phrase — a sound that will tell them to expect a command immediately after.

Alexa is command-based and a typical command consists of the following:

  • The wake word puts Alexa into an attentive state. The default wake word is “Alexa,” but it can be changed in the device settings. At the time of writing, the available wake words are “Alexa,” “Amazon,” “Echo,” and “Computer.”
  • The launch phrase tells Alexa to trigger a certain skill. Launch phrases include “ask,” “launch,” “start,” “show,” and many others.
  • The invocation name is the name of the skill you want to trigger.
  • The utterance tells Alexa what the skill should do (unless your launch phrase is “start” or “launch”). Static utterances would not give you much flexibility, so Alexa allows you to add dynamic parts to the command, called slots.

When a user invokes the skill, Alexa parses the command into JSON and passes the parsed data to your AWS Lambda function or HTTPS webhook.
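To make the parsed data more concrete, here is a minimal sketch of what a trimmed-down request for “Alexa, ask Crypto for the Bitcoin price” might look like, plus a small helper that pulls out the parts a skill usually needs. The IDs, the locale and the `summarizeRequest` helper are made up for illustration, and the intent and slot names assume the interaction model defined later in this tutorial. Real requests contain more fields than shown here.

```javascript
'use strict'

// A trimmed-down, hypothetical IntentRequest shaped like the JSON Alexa sends.
// The IDs and locale are placeholders, not real values.
const sampleEvent = {
  version: '1.0',
  session: { new: true, sessionId: 'SessionId.example', attributes: {} },
  request: {
    type: 'IntentRequest',
    requestId: 'EdwRequestId.example',
    locale: 'en-US',
    intent: {
      name: 'GetPrice',
      slots: { Coin: { name: 'Coin', value: 'Bitcoin' } }
    }
  }
}

// Hypothetical helper: reduces a request to its type, intent name and slot values.
// A LaunchRequest has no intent, so both the intent and the slots may be missing.
function summarizeRequest(event) {
  const request = event.request
  const intent = request.intent || {}
  const slots = {}
  Object.keys(intent.slots || {}).forEach(name => {
    slots[name] = intent.slots[name].value
  })
  return { type: request.type, intent: intent.name || null, slots: slots }
}

console.log(summarizeRequest(sampleEvent))
```

The helper libraries used later in this tutorial do this kind of parsing for you, so you rarely need to touch the raw request directly.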

The following image shows how the Alexa skill invocation works:

Prerequisites for this tutorial

Besides a skill idea and the will to program that skill, there are a few more items you’ll need to be able to follow this tutorial:

  • Amazon account (because you’ll need to set up the skill on Amazon’s developer portal)
  • AWS account (because your Alexa skill will be deployed on AWS Lambda)
  • Configured AWS credentials (see this guide⁶ for more info)
  • Node.js (version 4 or more, v6.11 is recommended), and a basic knowledge of Node.js (because this guide is using it for the skill development)

If you have all of the prerequisites, then let’s jump to the setup of your skill.

Skill setup

The next step is to create and configure your Alexa skill.

To do so, go to https://developer.amazon.com/alexa-skills-kit, sign in with your Amazon account and click on the “Start a Skill” button.

This will open the “Skill information” section of Alexa skill configuration. In this section, set the name of your skill to “Crypto Bot” and invocation name to “crypto.” The rest of the fields should stay the same (skill type should be “Custom Interaction Model” and all global fields should be turned off).

Then save and click on the Next button to go to the “Interaction Model” section.

The interaction model is a set of rules that defines the way the user interacts with your skill. As a part of the interaction model, you need to define the Intent Schema, a list of intents that your Alexa skill commands will be parsed to, and sample utterances; that is, sample phrases that will match each of the intents.

The intent schema should be in JSON format and it should define an array of intents, each with a name, and an optional list of dynamic parts — slots.

Our skill has two intents: GetPrice and AmountInfo. Both intents should have a coin name as a slot, which we’ll call Coin, with its type set to COIN_TYPES (we’ll define it in a minute). The AmountInfo intent should also have a number slot for an amount in euros. This slot can be named Amount and it should use AMAZON.NUMBER, which is one of the built-in slot types. For more info about built-in slot types, see the documentation.

Your intent schema should look like the following JSON snippet:

{
  "intents": [
    {
      "intent": "GetPrice",
      "slots": [
        { "name": "Coin", "type": "COIN_TYPES" }
      ]
    },
    {
      "intent": "AmountInfo",
      "slots": [
        { "name": "Coin", "type": "COIN_TYPES" },
        { "name": "Amount", "type": "AMAZON.NUMBER" }
      ]
    }
  ]
}

Because the Coin slot type is not a built-in type, you’ll need to add it as a custom slot type in the “Custom Slot Types” section. Give it the same name you used in the intent schema (COIN_TYPES) and add the following data as values (they need to be line-separated):

Bitcoin
Litecoin
Ethereum

After adding custom slot values, click on the Add button to save them.

The last part of the Interaction Model configuration is sample utterances. Sample utterances are also line-separated. They always start with the intent name, followed by the sample phrase, excluding the wake word and skill name. Sample utterances should also contain slots, which are specified using slot names (from your intent schema) in curly braces, for example {Coin}. Slot types are case sensitive.

You should add at least 2–3 utterances for each intent type.

Your sample utterances should look like the following text snippet:

GetPrice How much is {Coin}
GetPrice What is the price for {Coin}
GetPrice Current {Coin} value
GetPrice {Coin} value

AmountInfo How many {Coin} can I get for {Amount} euros
AmountInfo Amount of {Coin} for {Amount} euros
AmountInfo Number of {Coin} for {Amount} euros
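Because the intent schema and the sample utterances are edited in two separate text fields, it is easy to mistype an intent or slot name. The following is a rough sketch of a local sanity check you could run before pasting them into the developer portal. The `checkUtterances` helper is hypothetical and not part of any Alexa tooling; it only verifies that each line starts with a known intent and uses slot names declared for that intent.

```javascript
'use strict'

// The same intent schema as above, as a JavaScript object
const intentSchema = {
  intents: [
    { intent: 'GetPrice', slots: [{ name: 'Coin', type: 'COIN_TYPES' }] },
    {
      intent: 'AmountInfo',
      slots: [
        { name: 'Coin', type: 'COIN_TYPES' },
        { name: 'Amount', type: 'AMAZON.NUMBER' }
      ]
    }
  ]
}

const sampleUtterances = [
  'GetPrice How much is {Coin}',
  'AmountInfo How many {Coin} can I get for {Amount} euros'
]

// Hypothetical helper: returns a list of human-readable problems (empty if none)
function checkUtterances(schema, utterances) {
  // Map each intent name to the slot names it declares
  const slotsByIntent = Object.create(null)
  schema.intents.forEach(item => {
    slotsByIntent[item.intent] = (item.slots || []).map(slot => slot.name)
  })

  const problems = []
  utterances.forEach(line => {
    const intentName = line.split(' ')[0]
    const knownSlots = slotsByIntent[intentName]
    if (!knownSlots) {
      problems.push(`Unknown intent "${intentName}" in: ${line}`)
      return
    }
    // Collect every {SlotName} placeholder used in the phrase
    const used = line.match(/\{([^}]+)\}/g) || []
    used.forEach(placeholder => {
      const slotName = placeholder.slice(1, -1)
      if (knownSlots.indexOf(slotName) < 0) {
        problems.push(`Unknown slot "${slotName}" in: ${line}`)
      }
    })
  })
  return problems
}

console.log(checkUtterances(intentSchema, sampleUtterances))
```

The comparison is deliberately case sensitive, so `{coin}` would be reported as unknown when the schema declares `Coin`.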

Now that you have built the interaction model, click Next to save it. Saving can take up to a few seconds because Alexa will automatically train itself with the provided interaction model.

After the model is saved, you’ll be transferred to the “Configuration” section. This section requires a connection to your Lambda function or webhook. Because you don’t have the Lambda function yet, let’s pause at this step and create and deploy the code for your skill first.

Coding your skill

Code for the Lambda function is a simple Node.js module. This means that you need to create a folder (for example crypto-skill), enter that folder, and initialize a new npm project in it.

Because the skill is simple, it can fit into a single file, so create a skill.js file in your crypto-skill folder.

This file is a standard Node.js file, with one important note — deploying on AWS Lambda requires your main file to do exports.handler instead of module.exports. So, the base of your skill.js file should look like this:

'use strict'

function cryptoSkill(event, context, callback) {
  // Do something with `event` and use `callback` to reply
}

exports.handler = cryptoSkill
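For orientation, here is a minimal sketch of what such a handler could look like without any helper library. It replies through the Lambda callback with a plain-text message in the Alexa response format (version 1.0). The `buildSpeechResponse` and `bareSkill` names and the reply texts are invented for this example; the rest of the tutorial uses libraries that build this JSON for you.

```javascript
'use strict'

// Builds a plain-text reply in the Alexa response format (version 1.0)
function buildSpeechResponse(text, shouldEndSession) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: text },
      shouldEndSession: shouldEndSession
    }
  }
}

// Hypothetical bare handler: greets on launch, apologizes for anything else
function bareSkill(event, context, callback) {
  const type = event && event.request && event.request.type
  if (type === 'LaunchRequest') {
    // Keep the session open so the user can follow up with a question
    return callback(null, buildSpeechResponse('Hello from Crypto.', false))
  }
  callback(null, buildSpeechResponse('Sorry, I did not understand that.', true))
}

exports.handler = bareSkill
```

Writing the response object by hand works, but it gets verbose once you add reprompts, cards, or session attributes, which is the main reason for using the modules listed next.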

There are many ways to build a skill, but to keep everything as simple as we can, we’ll use the following Node.js modules:

  • [alexa-skill-kit](https://www.npmjs.com/package/alexa-skill-kit) — a simple Node.js library that simplifies building Alexa skills for AWS Lambda.
  • [alexa-message-builder](https://www.npmjs.com/package/alexa-message-builder) — a class for creating rich Alexa replies, with reprompts, cards, etc.
  • [cryptocompare](https://www.npmjs.com/package/cryptocompare) — a JavaScript API for https://www.cryptocompare.com.
  • [node-fetch](https://www.npmjs.com/package/node-fetch) — a light-weight module that brings window.fetch to Node.js. This module is required by cryptocompare.

To install all the dependencies, run the following command:

npm install alexa-skill-kit alexa-message-builder cryptocompare node-fetch --save

The next step is to integrate the Alexa Skill Kit module in your handler function. This module accepts three arguments: event that triggered AWS Lambda, Lambda context, and a callback function that will be invoked with the parsed event.

You can also use Alexa Message Builder to build nice, readable replies. For full documentation, see https://github.com/stojanovic/alexa-message-builder.

In the Alexa Skill Kit callback, check whether the event is a GetPrice or AmountInfo intent and handle each of them.

You should also check if the event is LaunchRequest and give initial instructions about your skill if it is.

LaunchRequest is one of the Alexa request types (see full list of types here). It is sent when the user invokes your skill without providing a specific intent, for example, by saying “Alexa start Crypto.”

At this point, your code should look like this:

const alexaSkillKit = require('alexa-skill-kit')
const MessageTemplate = require('alexa-message-builder')

function cryptoSkill(event, context, callback) {
  alexaSkillKit(event, context, (message) => {
    // Check for LaunchRequest first: a launch request carries no intent,
    // so message.intent may be undefined and reading its name would throw
    if (message.type === 'LaunchRequest') {
      // Answer to `Alexa, start Crypto` command
      return new MessageTemplate()
        .addText(`Hello from crypto currency bot.
          I can give you the info about bitcoin, litecoin and ethereum prices.
          How can I help you today?
          You can say:
          What is the current Bitcoin price?
          Or How many Ethereums can I get for 100 euros?
        `)
        .addReprompt(`You can say:
          What is the current Bitcoin price?
          Or How many Ethereums can I get for 100 euros?
        `)
        .keepSession()
        .get()
    }

    if (message.intent.name === 'GetPrice') {
      // Get the price for selected crypto currency
    }

    if (message.intent.name === 'AmountInfo') {
      // Get an amount of crypto currency that user can get for specified amount of euros
    }
  })
}

Then program both GetPrice and AmountInfo intents using the cryptocompare module.

For example, code for the GetPrice intent can look like this:

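// `cc` is the cryptocompare module; `tokens` maps coin names to ticker symbols
const tokens = {
  bitcoin: 'BTC',
  litecoin: 'LTC',
  ethereum: 'ETH'
}

const token = message.intent.slots.Coin.value

if (Object.keys(tokens).indexOf(token.toLowerCase()) < 0) {
  return 'I can give you the info only for bitcoin, litecoin and ethereum'
}

// Look up the symbol with the lowercased name, because Alexa can send "Bitcoin"
return cc.price(tokens[token.toLowerCase()], 'EUR')
  .then(prices => `Current price of ${token} is ${prices.EUR} euros.`)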
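The AmountInfo intent can follow the same pattern. The sketch below is one possible way to do it, not the original implementation: it keeps the arithmetic in a small pure function, `amountInfoReply` (a hypothetical name), so it can be checked without calling the price API. It assumes the Amount slot value arrives as a string, the price is expressed in euros per coin, and four decimals are enough for the reply. The wording of the replies is also an assumption.

```javascript
'use strict'

const tokens = {
  bitcoin: 'BTC',
  litecoin: 'LTC',
  ethereum: 'ETH'
}

// Hypothetical helper: builds the AmountInfo reply from an already fetched price.
// coin         - coin name from the Coin slot, for example "Bitcoin"
// euros        - value of the Amount slot (a string, or undefined if not recognized)
// pricePerCoin - current price of one coin in euros
function amountInfoReply(coin, euros, pricePerCoin) {
  const key = String(coin).toLowerCase()
  if (!Object.prototype.hasOwnProperty.call(tokens, key)) {
    return 'I can give you the info only for bitcoin, litecoin and ethereum'
  }
  const amount = Number(euros)
  if (!(amount > 0) || !(pricePerCoin > 0)) {
    return 'Please tell me a valid amount in euros.'
  }
  const coins = amount / pricePerCoin
  return `You can get ${coins.toFixed(4)} ${coin} for ${amount} euros.`
}

// Inside the AmountInfo branch it could be used like this, assuming `cc` is the
// cryptocompare module and `message` is the parsed Alexa request:
//
//   const slots = message.intent.slots
//   return cc.price(tokens[slots.Coin.value.toLowerCase()], 'EUR')
//     .then(prices => amountInfoReply(slots.Coin.value, slots.Amount.value, prices.EUR))

console.log(amountInfoReply('Bitcoin', '1000', 8000))
```

Keeping the calculation separate from the network call also makes it easier to add automated tests later.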

After adding the code for both intents, your skill.js file should look like this in the end:

skill.js

Deploying the skill

We’ll use Claudia.js to deploy the skill. Claudia.js is a Node.js tool that simplifies the deployment of AWS Lambda functions. This section assumes you have Claudia.js installed globally, as explained in this guide⁶, but you can put each of the following commands in NPM scripts and install Claudia.js as a dev dependency.

To deploy your Lambda function, run the claudia create command. This command requires specifying the following:

  • AWS region: Alexa supports Asia Pacific (Tokyo), EU (Ireland), US East (N. Virginia), and US West (Oregon) regions. We’ll use Ireland (eu-west-1).
  • Handler, which is the name of your handler file without the .js extension, followed by .handler. In our case, it should be skill.handler because the handler file is skill.js and it exports handler.
  • Version, because Alexa skills don’t allow usage of the default latest version. You can name your version whatever you want, but my recommendation is to name it simply skill.

The full command should look like this:

claudia create --region eu-west-1 --handler skill.handler --version skill

When the command executes, your Lambda function will be deployed. The next step is to allow Alexa to trigger your function. To do so, run the following command (just make sure you are using the same version you used in the claudia create command):

claudia allow-alexa-skill-trigger --version skill

This command will return JSON that looks like this:

{
  "Sid": "Alexa-1234567890123",
  "Effect": "Allow",
  "Principal": {
    "Service": "alexa-appkit.amazon.com"
  },
  "Action": "lambda:InvokeFunction",
  "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:crypto-skill-medium:skill"
}

The most important information in this response is “Resource,” which contains the Lambda function Amazon Resource Name (ARN) that you need to finalize the Alexa skill configuration.
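If you want to double-check that the ARN points to the region and version you expect before pasting it into the skill configuration, a small sketch like the following can split it into its parts. The `parseLambdaArn` helper is hypothetical, and the ARN is the placeholder one from the example output above, not a real function.

```javascript
'use strict'

// The policy statement printed by the previous command (placeholder values)
const statement = {
  Sid: 'Alexa-1234567890123',
  Effect: 'Allow',
  Principal: { Service: 'alexa-appkit.amazon.com' },
  Action: 'lambda:InvokeFunction',
  Resource: 'arn:aws:lambda:eu-west-1:123456789012:function:crypto-skill-medium:skill'
}

// Hypothetical helper: splits a Lambda function ARN of the form
// arn:aws:lambda:<region>:<account>:function:<name>[:<version or alias>]
function parseLambdaArn(arn) {
  const parts = String(arn).split(':')
  const looksValid = parts.length >= 7 &&
    parts[0] === 'arn' && parts[2] === 'lambda' && parts[5] === 'function'
  if (!looksValid) {
    throw new Error(`Not a Lambda function ARN: ${arn}`)
  }
  return {
    region: parts[3],
    accountId: parts[4],
    functionName: parts[6],
    version: parts[7] || null
  }
}

console.log(parseLambdaArn(statement.Resource))
```

For this tutorial, the region should be eu-west-1 and the version should match the one passed to the claudia create command.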

Connecting and testing your skill

Now that you have the Lambda function ARN, go back to the “Configuration” section of the skill creation form (https://developer.amazon.com/alexa-skills-kit).

Select “AWS Lambda ARN” as “Service Endpoint Type” and enter the ARN from the previous command as the default endpoint.

Answer with “No” to the “Provide geographical region endpoints?” question and click on the Next button.

This will save your skill configuration and take you to the “Test” section. In this section, you’ll see the “Service Simulator,” which can be used to test your skill.

Type “What is the Bitcoin price?” into “Service Simulator” and you’ll see JSON containing text similar to “Current price of Bitcoin is 9011.37 euros.” in the “Service Response.”

Congratulations, you have successfully built an Alexa skill 🎉

If your Alexa is connected to the same Amazon account, then try to say:

Alexa, start Crypto.

or

Alexa, ask Crypto what is the Bitcoin price?

It should answer with text similar to “Current price of Bitcoin is 9011.37 euros.”

Further improvements

Now that you have built an initial Alexa skill, here are a few next steps you might want to try out:

  • Add historic prices — Crypto skill should be able to tell you the price of Bitcoin yesterday or on any other specified day
  • Add better code structure by splitting the commands to the separate files
  • Add automated tests to the code of your skill
  • Submit the skill to the Alexa store by filling out the “Publishing Information” and the “Privacy & Compliance” sections
  • Build another skill

I might cover these topics in one of the next articles but, in the meantime, you can get some ideas on how to do those tasks in the “Serverless Apps with Node and Claudia.js” book.

Alexa limitations

After a few experiments with Alexa skills, you’ll realize that the platform still has certain limitations.

Some of the most important limitations are:

  • You cannot update custom slots or intents dynamically or with a script. This sounds logical, but it is also a limitation because you cannot pull the data from your API. For example, if you want to add a new crypto currency, then the only way is to edit the custom slot manually in the Alexa Developer dashboard.
  • Alexa doesn’t recognize your voice, which means that anyone who can talk to your Alexa can trigger all your skills.
  • You cannot chain the intents within a session out of the box. For example, if you have a few intents that are followed by a yes-no question, both “Yes” and “No” will be intents that you define just once, and then you need to implement the logic inside each of those intents that will respond depending on a previous command stored in the session attributes. For an example of session usage see this article⁷.
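To illustrate the last point, here is a rough sketch of how a yes-no follow-up could be routed with session attributes. It works on the raw request and response JSON rather than the helper libraries used earlier. The `lastQuestion` attribute, the `offerLitecoin` value, the helper names and the placeholder reply texts are all invented for this example; AMAZON.YesIntent and AMAZON.NoIntent are built-in intents that would also need to be added to the intent schema.

```javascript
'use strict'

// Builds a reply that also stores attributes for the next request in the session
function buildReply(text, sessionAttributes, endSession) {
  return {
    version: '1.0',
    sessionAttributes: sessionAttributes,
    response: {
      outputSpeech: { type: 'PlainText', text: text },
      shouldEndSession: endSession
    }
  }
}

// Hypothetical dispatcher: "Yes" and "No" are defined once, and their meaning
// depends on which question is recorded in the session attributes
function handleIntent(event) {
  const attributes = (event.session && event.session.attributes) || {}
  const intentName = event.request.intent.name

  if (intentName === 'GetPrice') {
    // Ask a follow-up question and remember that it is pending
    return buildReply(
      'Bitcoin price placeholder. Do you also want the Litecoin price?',
      { lastQuestion: 'offerLitecoin' },
      false
    )
  }
  if (intentName === 'AMAZON.YesIntent' && attributes.lastQuestion === 'offerLitecoin') {
    return buildReply('Litecoin price placeholder.', {}, true)
  }
  if (intentName === 'AMAZON.NoIntent' && attributes.lastQuestion === 'offerLitecoin') {
    return buildReply('OK, goodbye.', {}, true)
  }
  // A "Yes" or "No" without a pending question has nothing to refer to
  return buildReply('Sorry, I am not sure what you are answering.', {}, true)
}
```

Alexa sends the attributes from one response back with the next request in the same session, which is what lets the second call see `lastQuestion`.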

Fortunately, Amazon is constantly improving the platform, so the list of limitations is smaller with each new update or device.

The initial version of this skill was built as part of a live coding session at NoSlidesConf 2017 in Bologna. To watch the video of my “Alexa, start the presentation” session, visit: https://www.youtube.com/watch?v=D-eUnlaqUTw.

Many thanks to my friends Aleksandar Simović and Milovan Jovičić for help and feedback on the article.

Learn more about Alexa skills and serverless

If you want to learn more about building Alexa skills, chatbots for multiple platforms, and serverless applications, you should check the book called “Serverless Apps with Node and Claudia.js” that I am writing with Aleksandar Simovic for Manning Publications. More info about the book is available here: https://www.manning.com/books/serverless-apps-with-node-and-claudiajs.

This book will teach you:

  • What serverless is and why it is important.
  • How to build a real world serverless API using Node and Claudia.js.
  • How to connect your API with a serverless database (AWS DynamoDB) and add authentication.
  • How to debug and test serverless applications.
  • How to build chatbots for FB, SMS (using Twilio) and Alexa skills.
  • And much more…

The first chapter is free and you can read it here: https://livebook.manning.com/#!/book/serverless-apps-with-node-and-claudiajs/chapter-1/

References

¹ Press release: Amazon.com Announces Third Quarter Sales up 34% to $43.7 Billion (link)

² Amazon Echo 2017 review: Alexa isn’t niche anymore (link)

³ Number of Alexa app downloads (link)

⁴ AWS Lambda pricing (link)

⁵ Serverless Apps with Node and Claudia.js book, chapter 1: Introduction to serverless with Claudia (link)

⁶ Installing and configuring Claudia.js (link)

⁷ An Example of Sessions with Amazon Alexa Skills (link)


Published by HackerNoon on 2017/07/02