Serverless Loose Ends

I amassed a few tidbits of nice information while playing with Serverless over time. These are all too small for a full post of their own, so I decided to outline a few of them in a single one. This time, I will go over two subjects: making simple project deployment faster and using the same S3 bucket for all your projects. Here is how I ended up fixing these two Serverless loose ends.

Make simple project deployment faster

Most projects for this blog have the same shape: they require no external dependencies except the AWS SDK and serverless. Both of these are useless at runtime: serverless is dev-only and the AWS SDK is already provided in the Lambda runtime.

My issue is that serverless seems to remove development dependencies from node_modules instead of installing production dependencies into a clean directory. The side effect is that removing all those small files takes quite a bit of time, and all that work ends with an empty directory anyway. For example, deploying a simple project would take almost 6 minutes.

I decided to look around and found a forum post trying to fix this issue (see here). The fix described in the post is to simply tell serverless not to care about your dependencies and exclude the node_modules folder completely. In serverless.yml, that looks something like this (a sketch; newer framework versions use package.patterns instead of exclude):
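
```yaml
package:
  exclude:
    - node_modules/**
```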

Once this was done, deployment time went down quite a bit, to less than 4 minutes.

Using a single S3 bucket for multiple projects

By default, serverless will generate an S3 bucket per project. This might not be an issue for small businesses or a simple playground, but it can become one later: Amazon has put a soft limit of 100 S3 buckets per account, which can be increased to a maximum of 1,000 (source: link). This might seem like a lot, but you could also be using buckets for plenty of other things. Of course, this limit can be bypassed by creating new AWS accounts, but that becomes a nightmare once you start accessing resources cross-account.

Following a forum post (here), I discovered the deploymentBucket option. This option goes in your provider section and contains the ARN of the bucket to use. For example, I use something like the following in my playground (a sketch; the real file is in my repository):
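
```yaml
provider:
  name: aws
  # the interesting line: reuse one shared bucket instead of a generated one
  deploymentBucket: ${cf:ServerlessRoot.ServerlessDeploymentBucketArn}
```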

I wanted to make things simple for me and not have to hardcode anything. At the same time, I wanted to play with CloudFormation, so I decided to expose the bucket through a CloudFormation stack. In the previous sample, the name I use for the deployment bucket is ${cf:ServerlessRoot.ServerlessDeploymentBucketArn}. This tells serverless to get the CloudFormation stack called ServerlessRoot. Then, it extracts the output named ServerlessDeploymentBucketArn from it.

The stack I deployed can be found here, along with a quick helper script to deploy it. Right now, it contains only the deployment bucket, but maybe I’ll add some IAM stuff in there too.

Serverless Plugin for S3 security and Git status

I’ve been using Serverless at work for a little while. While doing that, I’ve been exploring it even more on my own time. There are a few things that I want in all my projects and don’t want to do manually every time. Amongst other things, I want my S3 buckets to never be public and I want some kind of git status on my lambdas. I know there are plenty of Serverless plugins that do just that, but I wanted to learn how to write a plugin myself. Here is how I ended up writing a Serverless Plugin for S3 security and Git status.

As always, the code I worked on for this sample can be found on my GitHub.

Plugin goals

Basically, I want my plugin to act as a master plugin for my other projects. I want it to do the following:

  • Store the SHA, branch, and user that deployed
  • Ensure that all my S3 buckets can’t be made public (using PublicAccessBlockConfiguration)
  • Ensure that all my log groups have a retention policy
  • Validate the stage that I’m using for deployment to select environment configuration

Since I wanted to see how to set up custom options, I allowed the user to whitelist some S3 buckets so they do not receive the block.

Writing the plugin

Plugin bootstrap

Creating the base of the plugin is simple enough. Still, I decided to use serverless to create the skeleton for me using serverless create --template plugin. Running this command will create an index.js file containing the boilerplate for the plugin. You will then need to build a package around that file.
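
The generated boilerplate is roughly shaped like this (paraphrased from memory, not the exact template output):

```js
'use strict';

// A serverless plugin is a class whose constructor receives the serverless
// object and the CLI options, plus a hooks map wiring lifecycle steps to functions.
class MyPlugin {
  constructor(serverless, options) {
    this.serverless = serverless;
    this.options = options;

    this.hooks = {
      'before:package:initialize': this.beforePackage.bind(this),
    };
  }

  beforePackage() {
    this.serverless.cli.log('Hello from my plugin!');
  }
}

module.exports = MyPlugin;
```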

Registering to the hooks

Integrating your plugin into Serverless’s workflow is done through hooks. Everything is a step and you can hook before or after any step. You can even create new ones when you add commands. Finding documentation on these steps is quite hard. The best documentation of the flows I could find is on this gist.

After investing quite a bit of time, I found the two places where I needed to hook my plugin. The first one was after the package:initialize step. In my code, hooking to this step can be found here. Using this hook, I validate that the configuration used makes sense for me.

The second hook I needed is used to inject into the resources. This needed to be far enough into the flow that the resources have already been created, but early enough that the CloudFormation templates have not yet been finalized. I found that hooking before the package:finalize step gets that done. This part can be found here in my code.
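
Wired into the constructor, the two hooks look something like this (handler names are illustrative):

```js
// inside the plugin constructor
this.hooks = {
  // validate my configuration early in the packaging flow
  'after:package:initialize': this.validateConfiguration.bind(this),
  // inject resources once they exist, before the templates are finalized
  'before:package:finalize': this.injectResources.bind(this),
};
```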

Finding all generated resources

Serverless can store your resources in two different places depending on how they got created.

If a resource is added through the resources section of serverless.yml, it will end up in serverless.service.resources.Resources. An example of this is my public assets S3 bucket.

If a resource is generated by Serverless itself, for example, a deployment bucket, it will end up in serverless.service.provider.compiledCloudFormationTemplate.Resources.

To make handling these two types of resources easier, I wrote a quick wrapper to iterate on both of these.
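
The wrapper amounts to something like this (a sketch; the exact code is in my repository):

```js
// Yield every resource, whether declared in serverless.yml
// or generated by the framework itself.
function* allResources(serverless) {
  const declared =
    (serverless.service.resources && serverless.service.resources.Resources) || {};
  const generated =
    serverless.service.provider.compiledCloudFormationTemplate.Resources || {};

  for (const [name, resource] of [
    ...Object.entries(declared),
    ...Object.entries(generated),
  ]) {
    yield { name, resource };
  }
}
```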

Using custom options from serverless.yml

To make my plugin a bit more usable, I wanted to see how to interact with options. These are added in the custom section of your serverless.yml file.

In your plugin code, you’ll be able to access the configured values through the serverless object. For example, in my code, I access serverless.service.custom.core.skippedBuckets here. I use this value to skip some buckets and allow them to remain public.
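
Reading that value defensively looks something like this (a sketch, not my exact code):

```js
// Return the list of buckets the plugin should leave untouched.
function getSkippedBuckets(serverless) {
  const custom = serverless.service.custom || {};
  const core = custom.core || {};
  return core.skippedBuckets || [];
}
```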

Getting the git repository status

I wanted to keep my code as self-contained as possible. In order to do so, I wanted to refrain from using shell commands to get the git information. I looked at NodeGit, and it was exactly what I needed. Sadly, installing the package on my Raspberry Pi ended up taking way too long, so I decided to simply use shell commands.

I ended up using four commands to get the information I needed. These can be found here in my code; nothing too fancy there.
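
The gist of it is something like the following (a sketch; the exact commands are in my repository):

```js
const { execSync } = require('child_process');

// run a git command and return its trimmed output
const git = (args) => execSync(`git ${args}`).toString().trim();

const sha = git('rev-parse HEAD');
const branch = git('rev-parse --abbrev-ref HEAD');
const user = git('config user.name');
const dirty = git('status --porcelain') !== '';
```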

Using the plugin

In order to use the plugin, you’ll need to do two things. The first one is to add the dependency to your package.json. In my case, I decided to simply use a file path to refer to the project. Once this is done, you’ll need to register the plugin in your serverless.yml file.
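
Assuming the plugin package is named serverless-core-plugin (a hypothetical name), the two steps look something like this:

```json
{
  "devDependencies": {
    "serverless-core-plugin": "file:../serverless-core-plugin"
  }
}
```

```yaml
plugins:
  - serverless-core-plugin
```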

If your plugin requires any special configuration, you’ll be able to add it in the custom section of your serverless.yml file. In my case, I added a whitelist entry for one of my S3 buckets. This was done by adding a section along these lines (the bucket name is illustrative):
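
```yaml
custom:
  core:
    skippedBuckets:
      # logical name of the bucket allowed to stay public
      - PublicAssetsBucket
```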


Service Discovery Through Lambda Extension

I started to read more and more about serverless architecture and the power of AWS Lambda. Amongst other things, Lambda extensions caught my eye. In a few words, these allow you to run code before or after your Lambda executes, and also during the initialization or shutdown of your Lambda. There are already quite a few extensions in the wild. The first scenario I wanted to explore was a poor man’s Service Discovery Through Lambda Extension.

As always, the code associated with this experiment is available on my GitHub repository.

Lambda extensions

Lambda extensions are a relatively new thing: they were introduced at the end of 2020. In order to keep things simple, Amazon built extensions on top of layers. Basically, you’ll need a special directory structure to signal to the runtime that your layer is an extension. Then, you register your extension like any other lambda layer.

I could not find any official documentation on that directory structure and had to rely on samples to figure it out. It turns out that you will need the following structure in your ZIP for your layer:

  • extensions: this directory needs to contain the executable that will start your extension
  • myExtensionName: this directory will contain your extension code
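
Concretely, the layer ZIP ends up shaped like this (myExtensionName is a placeholder):

```
extension.zip
├── extensions/
│   └── myExtensionName    <- executable that starts the extension
└── myExtensionName/
    └── index.js           <- the extension code itself
```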

It is worth noting that nothing forces you to use a specific language to write your extension. You could write your extension in Go and still use it with a nodejs lambda.

Basically, you can see the extension as an external process that runs in the same context as your lambda, a little like a Kubernetes sidecar.

My actual extension

Service discovery is always a complex topic in a microservices system. Therefore, the extension I wrote tries to solve this problem. It allows a deployment tool to insert services, their version, and ARN into DynamoDB. Users of the extension can then set an environment variable named NEEDED_SERVICES on their lambda. This environment variable supports a comma-separated list of services. It even handles a quick notation, MyService==1.2.0, to pin a specific version instead of the latest one.
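
Parsing that variable boils down to something like this (a sketch):

```js
// e.g. NEEDED_SERVICES="UserService,BillingService==1.2.0"
const neededServices = (process.env.NEEDED_SERVICES || '')
  .split(',')
  .filter(Boolean)
  .map((entry) => {
    const [name, version] = entry.split('==');
    return { name, version: version || 'latest' };
  });
```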

When the extension is initialized, it does a few things:

  1. Read the environment variable to get the needed services
  2. Get the ARNs for the services from DynamoDB
  3. Write a file containing the discovered services
  4. The lambda itself can then read the file and know how to contact the other services

Of course, none of this is usable as-is; it is simply me playing. There are better ways of doing what this extension does.

Writing the extension

Entry point

The entry point of your extension is an executable file found in the extensions/ directory of your layer’s ZIP file. Make it easy for yourself and name it the same as the directory containing your code. A sample of this executable can be found here. The name of this executable will be important later, so choose it well. Remember to mark that file as executable, or your extension won’t work.
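
Since layers are extracted under /opt at runtime, the executable can be as small as a launcher script (a sketch, reusing the placeholder name):

```bash
#!/bin/bash
# extensions/myExtensionName: hand off to the nodejs code shipped in the layer
exec node /opt/myExtensionName/index.js
```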

Registering the extension

One of the first things to do in your extension is to register it with the Lambda service. This is done through an HTTP call; for example, the register call in my extension can be found here. There are two points to note in this call:

  • The event types to listen for; the only valid values are INVOKE and SHUTDOWN.
  • Most importantly, the Lambda-Extension-Name header needs to be exactly the same string as the name of your executable from the extensions directory

If you are registering for SHUTDOWN events and are getting errors like ShutdownEventNotSupportedForInternalExtension, chances are you haven’t passed the right extension name in the register call.
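
Stripped down, the register call looks something like this (a sketch using node-fetch; the real call is in my repository):

```js
const fetch = require('node-fetch');

async function register(extensionName) {
  const res = await fetch(
    `http://${process.env.AWS_LAMBDA_RUNTIME_API}/2020-01-01/extension/register`,
    {
      method: 'POST',
      headers: {
        // must match the executable name from the extensions directory
        'Lambda-Extension-Name': extensionName,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ events: ['INVOKE', 'SHUTDOWN'] }),
    }
  );
  // this identifier must be sent on every subsequent call
  return res.headers.get('lambda-extension-identifier');
}
```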

Polling for events

Once your extension is registered, you will be able to poll for events associated with it. Doing so is, again, done through an HTTP call and can be found here in my playground. You can then do a good old while ... true to poll and process the events, like here. In my example, I do not care about invocations, so only the shutdown event is handled.
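
The polling loop itself is little more than this sketch:

```js
const fetch = require('node-fetch');

async function pollEvents(extensionId) {
  while (true) {
    // this call blocks until the next event is available
    const res = await fetch(
      `http://${process.env.AWS_LAMBDA_RUNTIME_API}/2020-01-01/extension/event/next`,
      { headers: { 'Lambda-Extension-Identifier': extensionId } }
    );
    const event = await res.json();
    if (event.eventType === 'SHUTDOWN') {
      break; // clean up and let the process exit
    }
    // INVOKE events are ignored in this experiment
  }
}
```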

Remember to be a good neighbor and handle SIGINT and SIGTERM. These two signals could be received in some cases and you should shut down your extension. Handling process signals in nodejs is done trivially using process.on(signal, function), for example: here.
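
A minimal version looks like this:

```js
['SIGINT', 'SIGTERM'].forEach((signal) =>
  process.on(signal, () => {
    // shut down cleanly when asked to
    process.exit(0);
  })
);
```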

Using the extension

Amazon made it really easy to use extensions: they are configured as layers on a lambda, so any deployment tool can add one. In my case, I decided to use serverless, so adding the layer is done when declaring the function (see this section); it boils down to something like this (the ARN is a placeholder):
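
```yaml
functions:
  myFunction:
    handler: handler.main
    layers:
      - arn:aws:lambda:us-east-1:123456789012:layer:myExtensionName:1
```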


Deploying an AWS lambda for Alexa using Serverless

I’ve played with deploying AWS lambdas manually, and I’ve played with SAM. Now it is time to play with Serverless. The good thing about Serverless is that it is cloud-agnostic: it can be used to deploy to a lot more platforms than just AWS. Of course, all the Alexa skill linking will be AWS-specific. So here is how I ended up deploying an AWS lambda for Alexa using Serverless. As always, the code of the associated project can be found on my GitHub page.

Getting ready to use Serverless

Again, I had issues installing Serverless, pretty much the same as with SAM. Basically, it is not pre-built for a Raspberry Pi.

The fix, again, is to bypass the installer and install using npm: npm install serverless --global.

Running Serverless on a Raspberry Pi

Besides the installation of the framework, I could not find any issues running it. After generating a test event using SAM, I could even invoke the function locally with a command along these lines (the function name is a placeholder):
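
```bash
serverless invoke local --function alexa-skill --path event.json
```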

Preparing the project

Bootstrapping the project was quick and simple:

  1. run serverless (no arguments), which prompts a few questions and generates a basic project
  2. copy all files from my original project to the newly created directory
  3. delete the useless deploy scripts from before

Writing Serverless template

Writing the deployment template for Serverless was way easier than using SAM, which is a bit odd considering SAM is built by AWS and only targets AWS. The two issues I hit when using SAM were fixed trivially.

The first one, selecting a retention policy for the log group, is as simple as adding a parameter (a sketch of the relevant lines):
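
```yaml
provider:
  name: aws
  # number of days to keep the lambda logs
  logRetentionInDays: 14
```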

The second one, linking to the Alexa skill, is again done using a simple parameter (the skill ID below is a placeholder):
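
```yaml
functions:
  alexa:
    handler: handler.main
    events:
      - alexaSkill: amzn1.ask.skill.xx-xx-xx-xx
```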

Deploying everything

Deploying the skill is done through the serverless command: serverless deploy --region us-east-1. This command will create a CloudFormation stack for your function.

Removing your stack is again done directly through the serverless command: serverless remove --region us-east-1.

Comparison with SAM

Overall, Serverless seems easier to use and has more features than SAM. Of course, my only testing of the frameworks is these few posts, so it is possible that things change over time.

Deploying an AWS lambda for Alexa using SAM

After manually deploying my lambdas during my previous posts (for example, this post), I decided it was time to look at the automation available. The first tool I wanted to look at was the AWS Serverless Application Model (SAM). The setup is quite straightforward, but a few points still warrant documentation since I run everything from a Raspberry Pi. Here is how I ended up deploying an AWS lambda for Alexa using SAM. As always, the full example can be found on my GitHub.

Getting ready to use SAM

The official documentation from AWS uses brew to install the SAM CLI. Sadly, brew only works on Intel processors, so this route was a no-go on my Raspberry Pi. After a bit of searching, I found that you can also install the python module manually through pip: pip install aws-sam-cli.

Installing it manually made everything work for me. Maybe I already had the other dependencies installed; I’m not sure.

Limitations due to Raspberry Pi

The greatest limitation I could find was that I could not run the lambda locally using SAM. When trying to invoke the lambda locally through sam local invoke, I simply get:

Error: Could not find amazon/aws-sam-cli-emulation-image-nodejs12.x:rapid-1.16.0 image locally and failed to pull it from docker.

This is a side effect of the docker image used being available only for linux/amd64. Since I don’t plan on writing tests for this lambda, this limitation is of no consequence to me.

Preparing the project

I decided to go with sam init in order to bootstrap the project. The command line helps you generate a basic template; I went with something that seemed really simple. I then replaced the hello-world directory with the source code from my previous project. No changes had to be made to the source code to get it running.

Writing the SAM template

Handling the lambda in the template was easy enough. Two parts of the deployment ended up being harder: log management and linking to a specific Alexa skill.

In order to fix the log management issue, I followed this post. Basically, I had to freeze the name of the function and manually create the AWS::Logs::LogGroup. The following snippet shows the relevant parts of the template for this fix (reconstructed; the full template is in my repository):
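
```yaml
Resources:
  AlexaFunction:
    Type: AWS::Serverless::Function
    Properties:
      # freezing the name makes the log group name predictable
      FunctionName: alexa-skill-function
      CodeUri: hello-world/
      Handler: app.lambdaHandler
      Runtime: nodejs12.x

  AlexaFunctionLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub /aws/lambda/${AlexaFunction}
      RetentionInDays: 7
```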

The second issue, linking to a specific Alexa skill, was fixed in roughly the same way, following this post. Again, the fix is basically manually handling a resource associated with the function, this time an AWS::Lambda::Permission. The following snippet shows the relevant parts of the template for this fix (again reconstructed; the skill ID is a placeholder):
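
```yaml
Resources:
  AlexaSkillPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !GetAtt AlexaFunction.Arn
      Principal: alexa-appkit.amazon.com
      # restrict invocations to this one Alexa skill
      EventSourceToken: amzn1.ask.skill.xx-xx-xx-xx
```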

Deploying everything

In theory, deploying everything should be as trivial as running sam deploy. Sadly, IAM got in the way. After trying quite a bit, I decided to be a bad boy and give way too much access to my user. I’ll play with the permissions in a clean way another day.

Once you have deployed your newly created lambda, you’ll be able to find all associated resources on the CloudFormation page. This page is also where you’ll be able to delete your stack if you need to. In order for the SAM stack to be deleted cleanly, I had to manually delete the contents of a few S3 buckets.


Making AWS Lambda deploy faster using layers

I’ve been playing with AWS Lambdas for a little while, at the same time as learning NodeJS. Let’s say that I’ve uploaded quite a few new versions of my test Lambdas. Even with a small code base and minimal dependencies, I always feel like deploying is slow. Here is how I ended up making AWS Lambda deploy faster using layers. I decided to update the scripts I was using for my previous post, therefore everything can be found in this repository. More precisely, update_dependencies.sh shows an example of how to deploy a new version of a layer and update your lambda.

The problem

I timed a few parts of my deploy script and found out that the zipping process was taking quite a bit of time.

This is a whole 15 seconds to deploy a change to the lambda. You might think that this is millions of files, images, and such. But it is merely 1900 files: the Alexa SDK. Okay, some of that might be my fault; I’m using a Raspberry Pi to develop my things.

Fixing the issue

I already knew about Lambda layers since I’d used HashiCorp’s Vault AWS Lambda extension in a previous job. I was not sure if that concept could be used to store libraries. Turns out it can!

Using layers might even allow you to directly edit your code in the web interface of the lambda if it is small enough.

The first thing that I did was to manually create the layer one time through the web page. This gives you an ARN and makes things easier later. The zip file that I uploaded simply contained a random file of mine.

Uploading the layer version

Once you have the ARN, you’ll be able to use the AWS CLI to upload the new version of the layer. In order to do so, you’ll need the aws lambda publish-layer-version command. In my case, I invoke it roughly like so (the layer name and paths are simplified placeholders):
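
```bash
aws lambda publish-layer-version \
  --layer-name alexa-dependencies \
  --zip-file fileb://dependencies.zip \
  --compatible-runtimes nodejs12.x
```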

The one ambiguous part is the creation of the zip file. In my case, the only thing included in the zip file will be libraries; the documentation explains the directory structure that needs to be used. Basically, for a nodejs package that is not tied to a specific runtime version, you’ll need to put all your dependencies in a nodejs/node_modules directory in the zip file. There are ways to make this specific to a given runtime; the documentation explains these.

I decided to make it simple to create the dependency tree: create a temporary directory with a sub-directory named nodejs, and run npm install in there to get the dependencies. Once this is done, zip everything starting at the root of the temporary directory.
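
In script form, that is roughly (a sketch, not the exact update_dependencies.sh):

```bash
# stage the dependencies under nodejs/node_modules, as the documentation requires
mkdir -p /tmp/layer/nodejs
cp package.json /tmp/layer/nodejs/
(cd /tmp/layer/nodejs && npm install --production)

# zip everything starting at the root of the staging directory
(cd /tmp/layer && zip -r dependencies.zip nodejs)
```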

For the next step, I will need the version number of the new layer. Luckily, the output of the command used to publish the layer gives you that information. The output is JSON, so I decided to use jq to extract the version. Using that great tool, extracting the version is as simple as piping the result of the publish command into jq .Version.
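
Putting the two together (a sketch):

```bash
VERSION=$(aws lambda publish-layer-version \
  --layer-name alexa-dependencies \
  --zip-file fileb://dependencies.zip | jq .Version)
```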

Updating the function to use the new version

Do note that updating the function at this point is probably a bad idea for production code. Any instance of the lambda will use the new dependencies, and might simply break.

Since I’m only playing with all this and do not mind if my lambdas break, I decided to update the functions directly in the script that uploads a new layer version. It can be done using update-function-configuration (the function name and ARN below are placeholders):
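
```bash
aws lambda update-function-configuration \
  --function-name my-alexa-skill \
  --layers "arn:aws:lambda:us-east-1:123456789012:layer:alexa-dependencies:${VERSION}"
```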


Storing data in Alexa-triggered Lambda

My latest project is to be able to somewhat control my Raspberry Pi with my Alexa devices. While playing around, I ended up wanting to store data associated with the Amazon account. I decided to explore storing data with two retention policies: data kept for the session and data kept forever. Here is how I ended up storing data in an Alexa-triggered Lambda for those two scenarios. As always, the source code used for this experiment can be found in my GitHub repository.

Identifying the user

I saw two interesting identifiers: the userId and the personId. According to the documentation, userId is the identifier associated with the account of the user. On the other hand, the personId represents the human who executed the command. Therefore, multiple personIds can be associated with the same userId.
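
With the Alexa SDK, both identifiers can be read from the request envelope; something like this sketch (the person block only exists when a voice profile recognized the speaker):

```js
// inside an intent handler
const { requestEnvelope } = handlerInput;
const userId = requestEnvelope.session.user.userId;
// only present when the speaker was recognized through a voice profile
const person = requestEnvelope.context.System.person;
const personId = person ? person.personId : undefined;
```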

Alexa Skill setup

In order to be able to test both storage policies, I needed a few different intents. I decided to keep it simple and went with the following:

  • Add: adds one to a session counter and returns the new value
  • Current: returns the value of the counter
  • Forget: resets the counter
  • Persist: stores the value of the counter in a persistent database
  • Restore: restores the value of the counter from the persistent database

Lambda environment setup

As with all my other projects, one of my main goals here was to learn a little something. Therefore I decided to try a few new technologies this time around. My previous test with Alexa was done using Python and handled HTTP calls directly.

In order to shake things up, I decided to go with Node.js and the Alexa SDK. This meant a new language and using the official SDK instead of raw HTTP queries.

For the persistent storage, I wanted something that was easy to use, simple, and, most importantly, free. I decided to go with DynamoDB.

Handling session storage

The session storage is available through the JSON payload received and returned by your lambda. Using the Alexa SDK makes managing these attributes easy through the attribute manager.

In order to read a value from the session, you can use the attribute manager (sketched below with a hypothetical counter attribute):
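
```js
// inside an intent handler: read back the session attributes
const attributes = handlerInput.attributesManager.getSessionAttributes();
const counter = attributes.counter || 0;
```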

You can write to the session through the same object (same assumptions):
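
```js
// update the attribute and store it back into the session
attributes.counter = counter + 1;
handlerInput.attributesManager.setSessionAttributes(attributes);
```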

Getting a new Alexa session

While testing this skill, I ended up having to reset my session a few times. The simplest way to do so is to say: Alexa, exit. This will give you a new sessionId and reset all stored session attributes.

Handling persistent storage

Reading and writing to DynamoDB is almost as easy as writing to the session storage.

In order to write to the database, you need to build the parameters and pass them to the put function of an instance of AWS.DynamoDB.DocumentClient. The following is a sketch of that flow (table and key names are illustrative):
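
```js
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();

// inside an async handler: persist the counter for this user
const params = {
  TableName: 'alexa-counters',
  Item: {
    id: userId, // partition key: the Alexa userId
    counter: counter,
  },
};
await documentClient.put(params).promise();
```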

You can then read the value following the same pattern: build the params and pass those to the get function of the instance.
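
Which gives something like this (same assumptions):

```js
// inside an async handler: restore the counter for this user
const result = await documentClient
  .get({ TableName: 'alexa-counters', Key: { id: userId } })
  .promise();
const counter = result.Item ? result.Item.counter : 0;
```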

AWS Lambda triggered by catch-all Alexa Skill

For my most recent project, I wanted a way to control my Raspberry Pi from my Amazon Echo. I already figured out how to connect to my Raspberry Pi from AWS in a secure way in my last post. Now it was time to play with the other end: writing an Alexa Skill. As a first step in that direction, I wanted a catch-all Alexa skill that calls an AWS Lambda. This post outlines how I got an AWS Lambda triggered by a catch-all Alexa Skill.

AWS Setup

The first thing needed for this whole project is an AWS account. In order to create one, follow the instructions on the AWS Portal.

At the time of this writing, AWS Lambda has a free tier allowing 1,000,000 free requests per month and up to 3.2 million seconds of compute time per month. There is no reason why this should not be enough for this project. But do watch out: you can easily incur costs if you add other services.

Once you have created your account, you will need to generate an access key for your user. It is a bad idea to create an access key for your root user. You should look into how to secure your AWS account, but for the sake of this quick project, following this guide will give you what you need.

Alexa Developer Console Setup

In the Developer Console, you will create a new Skill. That new skill will use a custom model and backend resources that you will provision yourself. No need to select any template, we will be doing all the work ourselves.

The Alexa Skill

I won’t go over everything that needs to be done to configure the Skill. The most important parts I could find to achieve what I wanted were:

  1. Create a new Intent; in this Intent, add one slot of type AMAZON.SearchQuery
  2. In the Tools > Account Linking section, you will need to disable account linking (first option)

Every time you make changes, remember to hit the Build button; this validates your setup. You can also go into the Certification section, which allows you to run validations on your configuration and gives pointers on how to fix issues.

The AWS Lambda

The next step is to create your AWS Lambda. In order to do so, simply go to the AWS Lambda home page and hit Create Function. We will be writing our function from scratch, using a Python 3.7 runtime.

The code we will be using for this lambda resides in this repository. It does not do much: it outputs any slots it could find to the logs of your lambda. But it can at least show that the linking worked correctly.

Deploying the code can be done directly through the Lambda UI or using the publish.sh script in the above repository. Once deployed, you can use the Test button directly in the Lambda UI in the AWS portal to trigger a run of the code. You will need to enter the content of the lambda_test.json file when asked for a test configuration.

Linking the Skill to the Lambda

In order to link everything, you will need two pieces of information:

  1. The ARN of the Lambda you created
    You can get this by going to the AWS Portal and selecting your Lambda; the ARN will be at the top right
  2. Your Skill ID
    You can get this from the Amazon Developer portal, by going into your skill and then into the Endpoint section

The linking needs to be done in both directions:

  1. In the Amazon Developer portal, in the endpoint section of your skill, you will need to enter the Lambda ARN into the default region field
  2. In the AWS portal, after selecting your Lambda, you will need to add a new trigger of type Alexa Skills Kit and enter the Skill ID of your skill

Testing Everything

There are multiple ways to test this setup:

  1. Use an Echo device
  2. Use the Amazon Alexa app on your smartphone
  3. Use the test UI in the Amazon Developer portal

I strongly suggest using the Amazon Developer portal, since this removes the ambiguity of speech-to-text. In order to access this UI, simply go to the Amazon Developer portal, select your skill, and go into the Test section at the top. There you will be able to enter text directly into the simulator.
