Serverless Loose Ends

I amassed a few useful tidbits while playing with Serverless over time. They are all too small to deserve a real post, so I decided to outline a few in a single one. This time, I will go over two subjects: making simple project deployment faster and using the same S3 bucket for all your projects. Here is how I ended up fixing these two Serverless loose ends.

Make simple project deployment faster

Most projects for this blog have the same shape: they require no external dependency except the AWS SDK and serverless. Both of these are completely useless at runtime: serverless is dev-only and the AWS SDK is already provided in the Lambda runtime.

My issue is that it looks like serverless removes development dependencies instead of installing the production ones in a clean directory. The side effect is that it takes quite a bit of time to remove all those small files, and all that work only ends up with an empty directory anyway. For example, deploying a simple project would take almost 6 minutes.

I decided to look around and found a forum post trying to fix this issue (see here). The fix described in the post is to simply tell serverless not to care about your dependencies and to exclude the node_modules folder completely.
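A minimal sketch of that kind of configuration, written as a serverless.js object (the same keys apply in serverless.yml). The service name is a placeholder, and the sketch assumes a Serverless version that still accepts package.exclude; newer releases express the same thing with package.patterns.

```javascript
// Hypothetical minimal config; only the `package` section matters here.
const serverlessConfig = {
  service: 'simple-project', // placeholder name
  provider: { name: 'aws' },
  package: {
    // Do not spend time working out and pruning development dependencies.
    excludeDevDependencies: false,
    // Leave node_modules out of the deployment artifact entirely.
    // Assumption: on newer releases this would be `patterns: ['!node_modules/**']`.
    exclude: ['node_modules/**'],
  },
};

module.exports = serverlessConfig;
```

This only makes sense when nothing in node_modules is needed at runtime, which is the case for the projects described above.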

Once this was done, deployment time went down quite a bit, to less than 4 minutes.

Using a single S3 bucket for multiple projects

By default, serverless will generate an S3 bucket per project. This might not be an issue for small businesses or a simple playground, but it can become one later. Amazon has put a soft limit of 100 S3 buckets per account. This limit can be increased to a maximum of 1000 (source: link). This might seem like a lot, but you could also be using buckets for a lot of other things. Of course, this limit can be bypassed by creating new AWS accounts, but it becomes a nightmare once you start accessing resources cross-account.

Following a forum post (here), I discovered the deploymentBucket option. This needs to be added to your provider section and needs to contain the ARN of the bucket to use. In my playground, the value comes from a CloudFormation output, as described next.

I wanted to make things simple for me and not have to hardcode anything. At the same time, I wanted to play with CloudFormation, so I decided to expose the bucket through a CloudFormation stack. The value I use for the deployment bucket is ${cf:ServerlessRoot.ServerlessDeploymentBucketArn}. This tells serverless to look up the CloudFormation stack called ServerlessRoot and then extract the output named ServerlessDeploymentBucketArn from it.
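As a minimal sketch of the setting described above, here is the provider section written as a serverless.js object (the same keys apply in serverless.yml). The service name is a placeholder, and parseCfReference is a hypothetical helper that only illustrates how the reference splits into a stack name and an output key; it is not how Serverless resolves variables internally. One caveat worth checking against your Serverless version: the documentation describes deploymentBucket as a bucket name, so the stack output may need to hold the name rather than a full ARN.

```javascript
// Hypothetical config fragment; single quotes keep ${...} literal in JavaScript.
const serverlessConfig = {
  service: 'playground', // placeholder name
  provider: {
    name: 'aws',
    // Resolved by Serverless at deploy time from the outputs of the ServerlessRoot stack.
    deploymentBucket: '${cf:ServerlessRoot.ServerlessDeploymentBucketArn}',
  },
};

// Illustration only: split a ${cf:StackName.OutputKey} reference into its two parts.
function parseCfReference(value) {
  const match = /^\$\{cf:([^.}]+)\.([^}]+)\}$/.exec(value);
  if (!match) return null;
  return { stackName: match[1], outputKey: match[2] };
}

module.exports = serverlessConfig;
```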

The stack I deployed can be found here, along with a quick helper script to deploy it. Right now, it contains only the deployment bucket, but maybe I’ll add some IAM stuff in there too.

Serverless Plugin for S3 security and Git status

I’ve been using Serverless at work for a little while. While doing that, I’ve been exploring it even more on my own time. There are a few things that I want on all my projects and don’t want to do manually every time. Amongst other things, I want my S3 buckets to never be public and I want some kind of git status on my lambda. I know there are plenty of Serverless plugins to do just that, but I wanted to learn how to write a plugin. Here is how I ended up writing a Serverless Plugin for S3 security and Git status.

As always, the code I worked on for this sample can be found on my GitHub.

Plugin goals

Basically, I wanted my plugin to be used as a master plugin for my other projects. I want it to do the following:

  • Store the SHA, branch, and user that deployed
  • Ensure that all my S3 buckets can’t be made public (using PublicAccessBlockConfiguration)
  • Ensure that all my log groups have a retention policy
  • Validate the stage that I’m using for deployment to select environment configuration

Since I wanted to see how to set up custom options, I allowed the user to whitelist some S3 buckets so that they do not receive the block.

Writing the plugin

Plugin bootstrap

Creating the base of the plugin is simple enough. Still, I decided to use serverless to create the skeleton for me using serverless create --template plugin. Running this command will create an index.js file containing the boilerplate for the plugin. You will then need to build a package around that file.

Registering to the hooks

Integrating your plugin into Serverless’s workflow is done through hooks. Everything is a step and you can hook before or after any step. You can even create new ones when you add commands. Finding documentation on these steps is quite hard. The best documentation of the flows I could find is on this gist.

After investing quite a bit of time, I found the two places where I needed to hook my plugin. The first one was after the package:initialize step. In my code, hooking to this step can be found here. Using this hook, I validate that the configuration used makes sense for me.

The second hook I needed is used to inject into the resources. It had to be late enough in the flow that the resources are already created, but early enough that the CloudFormation templates are not yet generated. I found that hooking before the package:finalize step gets that done. This part can be found here in my code.
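The two hooks above can be sketched as a bare plugin class. This is only an outline under assumptions: the class name, the list of allowed stages, and the method names are hypothetical, and the resource patching is left empty.

```javascript
// Hypothetical list of stages this plugin accepts.
const ALLOWED_STAGES = ['dev', 'prod'];

class CorePlugin {
  constructor(serverless, options) {
    this.serverless = serverless;
    this.options = options || {};

    // Serverless calls these functions when it reaches the matching lifecycle step.
    this.hooks = {
      'after:package:initialize': () => this.validateStage(),
      'before:package:finalize': () => this.patchResources(),
    };
  }

  // Fail early when the stage is not one we know how to configure.
  validateStage() {
    const stage = this.options.stage || this.serverless.service.provider.stage;
    if (!ALLOWED_STAGES.includes(stage)) {
      throw new Error(`Unknown stage "${stage}", expected one of: ${ALLOWED_STAGES.join(', ')}`);
    }
    return stage;
  }

  // Resources exist at this point but the templates are not written yet.
  patchResources() {
    // Changes to the generated resources would go here.
  }
}

module.exports = CorePlugin;
```

Throwing from a hook is enough to abort the packaging, so an invalid stage stops the deployment before anything is uploaded.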

Finding all generated resources

Serverless can store your resources in two different places depending on how they got created.

If a resource is added through the resources section of serverless.yml it will end up in serverless.service.resources.Resources. An example of this is my public assets S3 bucket.

If a resource is generated by Serverless itself, for example, a deployment bucket, it will end up in serverless.service.provider.compiledCloudFormationTemplate.Resources.

To make handling these two types of resources easier, I wrote a quick wrapper to iterate over both of them.
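A wrapper of that kind could look like the following sketch. The function names are hypothetical and this is not the code from the repository; it only relies on the two locations named above. The second function shows one possible use: giving every log group a retention policy, with 30 days as an arbitrary default.

```javascript
// Yield [logicalId, resource] pairs from both places Serverless keeps resources.
function* allResources(serverless) {
  const service = serverless.service;
  const sources = [
    // Resources declared in the resources section of serverless.yml.
    service.resources && service.resources.Resources,
    // Resources generated by Serverless itself.
    service.provider.compiledCloudFormationTemplate &&
      service.provider.compiledCloudFormationTemplate.Resources,
  ];
  for (const resources of sources) {
    for (const [logicalId, resource] of Object.entries(resources || {})) {
      yield [logicalId, resource];
    }
  }
}

// Example use: add a retention policy to log groups that do not have one yet.
function ensureLogRetention(serverless, days = 30) {
  for (const [, resource] of allResources(serverless)) {
    if (resource.Type !== 'AWS::Logs::LogGroup') continue;
    resource.Properties = resource.Properties || {};
    if (resource.Properties.RetentionInDays === undefined) {
      resource.Properties.RetentionInDays = days;
    }
  }
}
```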

Using custom options from serverless.yml

To make my plugin a bit more usable, I wanted to see how to interact with options. These are added in the custom section of your serverless.yml file.

In your plugin code, you’ll be able to access the configured values through the serverless object. For example, in my code, I access serverless.service.custom.core.skippedBuckets here. I use this value to skip some buckets and allow these to be public.
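Here is a minimal sketch of how such an option could drive the public access block. It assumes that skippedBuckets holds CloudFormation logical IDs (it could just as well hold bucket names), and the function name is hypothetical.

```javascript
// Block public access on every S3 bucket except those listed in custom.core.skippedBuckets.
function blockPublicBuckets(serverless, resources) {
  const custom = serverless.service.custom || {};
  // Assumption: the list contains logical IDs of the buckets to leave alone.
  const skipped = (custom.core && custom.core.skippedBuckets) || [];

  for (const [logicalId, resource] of Object.entries(resources)) {
    if (resource.Type !== 'AWS::S3::Bucket' || skipped.includes(logicalId)) continue;
    resource.Properties = resource.Properties || {};
    resource.Properties.PublicAccessBlockConfiguration = {
      BlockPublicAcls: true,
      BlockPublicPolicy: true,
      IgnorePublicAcls: true,
      RestrictPublicBuckets: true,
    };
  }
  return resources;
}
```

Defaulting to an empty list keeps the plugin usable in projects that do not define the custom section at all.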

Getting the git repository status

I wanted to keep my code as self-contained as possible, so I wanted to refrain from using shell commands to get the git information. I looked at NodeGit and it was exactly what I needed. Sadly, installing the package on my Raspberry Pi ended up taking way too long, and I decided to simply use shell commands.

I ended up using 4 commands to get the information I needed. These can be found here in my code, nothing too fancy around there.
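For illustration, here is one plausible set of four git commands wrapped in a small helper. These are an assumption, not necessarily the commands used in the repository, and the function and field names are hypothetical. The command runner is injectable so the helper can be exercised without a git checkout.

```javascript
const { execSync } = require('child_process');

// Default runner: execute a shell command and return its trimmed output.
const runShell = (command) => execSync(command, { encoding: 'utf8' }).trim();

// Collect the SHA, branch, user, and whether the working tree has uncommitted changes.
function getGitStatus(run = runShell) {
  return {
    sha: run('git rev-parse HEAD'),
    branch: run('git rev-parse --abbrev-ref HEAD'),
    user: run('git config user.name'),
    // --porcelain prints one line per changed file, so empty output means a clean tree.
    dirty: run('git status --porcelain') !== '',
  };
}
```

The resulting object could then be stored on the lambda, for example as environment variables or tags.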

Using the plugin

In order to use the plugin, you’ll need to do two things. The first one is to add the dependency to your package.json. In my case, I decided to simply use a file path to refer to the project. Once this is done, you’ll need to register the plugin in your serverless.yml file.

If your plugin requires any special configuration, you’ll be able to add it into the custom section of your serverless.yml file. In my case, I added a whitelist entry for one of my S3 buckets under the custom section.
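Put together, the registration and the whitelist could look like the sketch below, written as a serverless.js object (the same keys apply in serverless.yml). The plugin name, its relative path, and the bucket identifier are all placeholders; only the custom.core.skippedBuckets key comes from the text above.

```javascript
// package.json side (shown as a comment), using a file path as the dependency:
//   "devDependencies": { "serverless-core-plugin": "file:../serverless-core-plugin" }

// Hypothetical serverless config using the plugin.
const serverlessConfig = {
  service: 'my-service', // placeholder name
  provider: { name: 'aws' },
  // Plugins are referenced by their package name.
  plugins: ['serverless-core-plugin'],
  custom: {
    core: {
      // Buckets listed here are not given the public access block.
      skippedBuckets: ['PublicAssetsBucket'],
    },
  },
};

module.exports = serverlessConfig;
```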

Service Discovery Through Lambda Extension

I started to read more and more about serverless architecture and the power of AWS Lambda. Amongst other things, Lambda extensions caught my eye. In a few words, they allow you to run code before or after your Lambda executes, and also during its initialization or shutdown. There are already quite a few extensions in the wild. The first scenario I wanted to explore was a poor man’s service discovery through a Lambda extension.

As always, the code associated with this experiment is available on my GitHub repository.

Lambda extensions

Lambda extensions are a relatively new thing: they were introduced at the end of 2020. In order to keep things simple, Amazon built extensions on top of layers. Basically, you’ll need a special directory structure to signal to the runtime that your layer is an extension. Then, you register your extension like any other lambda layer.

I could not find any official documentation on that directory structure and had to rely on samples to figure it out. It turns out that you will need the following structure in your ZIP for your layer:

  • extensions: this directory needs to contain the executable that will start your extension
  • myExtensionName: this directory will contain your extension code

It is worth noting that nothing forces you to use a specific language to write your extension. You could write your extension in Go and still use it in a Node.js lambda.

Basically, you can see the extension as an external process that runs in the same context as your lambda. A little like a Kubernetes Sidecar.

My actual extension

Service discovery is always a complex topic in a microservices system, so the extension I wrote tries to solve this problem. It allows a deployment tool to insert services, their version, and their ARN into DynamoDB. Users of the extension can then set an environment variable named NEEDED_SERVICES on their lambda. This environment variable supports a comma-separated list of services. It even handles a quick notation, MyService==1.2.0, to get a specific version instead of the latest one.

When the extension is initialized, it does a few things:

  1. Read the environment variable to get the needed services
  2. Get the ARNs for the services from DynamoDB
  3. Write a file containing the discovered services
  4. The lambda can then read the file and knows how to contact the other services
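Steps 1 and 3 can be sketched as follows. The function names, the shape of the parsed entries, and the output file format are assumptions made for illustration; the DynamoDB lookup from step 2 is left out.

```javascript
const fs = require('fs');

// Parse "ServiceA,MyService==1.2.0" into [{ name, version }]; a null version means "latest".
function parseNeededServices(raw) {
  return (raw || '')
    .split(',')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .map((entry) => {
      const [name, version] = entry.split('==');
      return { name: name.trim(), version: version ? version.trim() : null };
    });
}

// Write the discovered services as JSON so the lambda can read them later.
function writeDiscoveredServices(filePath, services) {
  fs.writeFileSync(filePath, JSON.stringify(services, null, 2));
}
```

In an extension, the input would come from process.env.NEEDED_SERVICES, and the file would need to live somewhere both processes can reach, such as /tmp.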

Of course, none of this is usable as-is; it is simply me playing. There are better ways of doing what this extension does.

Writing the extension

Entry point

The entry point of your extension is an executable file found in the extensions/ directory of your layer’s ZIP file. Make it easy for yourself and name it the same as the directory containing your code. A sample of this executable can be found here. The name of this executable will be important later, so choose it well. Remember to mark that file as executable, or your extension won’t work.

Registering the extension

One of the first things to do in your extension is to register it to the lambda service. This is done through an HTTP call. For example, the register call in my extension can be found here. There are two points to note in this call:

  • The event types to listen for: the only valid values are INVOKE and SHUTDOWN.
  • Most importantly, the Lambda-Extension-Name header needs to be exactly the same string as the name of your executable in the extensions directory.

If you are registering for SHUTDOWN events and are getting errors like ShutdownEventNotSupportedForInternalExtension, you haven’t passed the right extension name in the register call.
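A register call along those lines could look like this sketch. It uses the Extensions API endpoint exposed through the AWS_LAMBDA_RUNTIME_API environment variable and assumes a runtime with a global fetch (Node.js 18 or later); the function name and the injectable fetchImpl parameter are my own additions for illustration.

```javascript
// Register an external extension and return the identifier needed for later calls.
async function registerExtension(extensionName, fetchImpl = globalThis.fetch) {
  const baseUrl = `http://${process.env.AWS_LAMBDA_RUNTIME_API}/2020-01-01/extension`;
  const response = await fetchImpl(`${baseUrl}/register`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // Must be exactly the file name of the executable in extensions/.
      'Lambda-Extension-Name': extensionName,
    },
    // Only SHUTDOWN is requested here; INVOKE is the other valid value.
    body: JSON.stringify({ events: ['SHUTDOWN'] }),
  });
  if (!response.ok) {
    throw new Error(`Extension registration failed with status ${response.status}`);
  }
  // Lambda returns the identifier in a response header.
  return response.headers.get('lambda-extension-identifier');
}
```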

Polling for events

Once your extension is registered, you will be able to poll for events associated with it. Doing so is, again, done through an HTTP call and can be found here in my playground. You can then use a good old while (true) loop to poll and process the events, like here. In my example, I do not care about invocations. Therefore, only the shutdown event is handled.
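The polling loop can be sketched like this, assuming the identifier returned by the register call and a runtime with a global fetch. The function name and the onShutdown callback are hypothetical.

```javascript
// Poll the Extensions API until a SHUTDOWN event arrives, ignoring everything else.
async function processEvents(extensionId, onShutdown, fetchImpl = globalThis.fetch) {
  const url = `http://${process.env.AWS_LAMBDA_RUNTIME_API}/2020-01-01/extension/event/next`;
  while (true) {
    // This request stays pending until Lambda has an event for the extension.
    const response = await fetchImpl(url, {
      method: 'GET',
      headers: { 'Lambda-Extension-Identifier': extensionId },
    });
    const event = await response.json();
    if (event.eventType === 'SHUTDOWN') {
      await onShutdown(event);
      return;
    }
    // Any other event (such as INVOKE) is ignored; loop and wait for the next one.
  }
}
```

Asking for the next event is also how the extension signals that it is done with the current one.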

Remember to be a good neighbor and handle SIGINT and SIGTERM. These two signals could be received in some cases, and you should shut down your extension when they are. Handling process signals in Node.js is trivially done using process.on(signal, function), for example: here.
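A small sketch of that signal handling follows; the helper name and its callback are hypothetical.

```javascript
// Call onSignal with the signal name when the process is asked to stop.
function installSignalHandlers(onSignal) {
  for (const signal of ['SIGINT', 'SIGTERM']) {
    process.on(signal, () => onSignal(signal));
  }
}
```

A typical use would be installSignalHandlers(() => process.exit(0)), with any cleanup done just before exiting.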

Using the extension

Amazon made it really easy to use extensions: they are configured as layers on a lambda. Therefore, you can add an extension through any deployment tool. In my case, I decided to use serverless, so adding the layer is done when declaring the function, see this section.
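As a rough sketch, a layer and a function using it could be declared as below, written as a serverless.js object (the same keys apply in serverless.yml). The service, layer, function, and path names are placeholders. The Ref relies on the Serverless convention of naming a layer resource after its capitalized key followed by LambdaLayer.

```javascript
// Hypothetical config: one layer holding the extension and one function that uses it.
const serverlessConfig = {
  service: 'discovery-demo', // placeholder name
  provider: { name: 'aws' },
  layers: {
    // Directory zipped as the layer; it must contain the extensions/ directory.
    discovery: { path: 'layer' },
  },
  functions: {
    hello: {
      handler: 'handler.hello',
      environment: { NEEDED_SERVICES: 'Billing,MyService==1.2.0' },
      // The layer key "discovery" becomes the logical ID DiscoveryLambdaLayer.
      layers: [{ Ref: 'DiscoveryLambdaLayer' }],
    },
  },
};

module.exports = serverlessConfig;
```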
