Playing with a native C extension for Node.js

I’ve always felt like a good way to fully understand a language is to play with the native interface it exposes. I’ve already done a post playing with a native C Ruby extension, and now that I’m using Node.js more I felt like I should dig into it. Here is how I ended up playing with a native C extension for Node.js. I did not have a real goal while playing around, so I did the exact same thing I did for Ruby. As always, the code is hosted on my GitHub repository!

Why a native extension

Node.js is a great tool for event-based concurrency. Having a garbage collector and a single-threaded event loop makes it easy to get into and fast. But like any tool, it is not perfect.

For example, Node.js is not the best fit for CPU-bound computations: JavaScript code runs on a single thread per event loop, so a long computation blocks everything else waiting on that loop. There is also the case of wanting to interface with existing native libraries or with the OS.

For all these reasons, and many others, a native extension could be useful.

Needed setup

There are multiple tools and frameworks that can be used to build a native library for Node.js. In the end, most end up wrapping around N-API, the Node.js native API. I decided to use node-gyp.

The initial repository setup was quite trivial:

  1. Create (and fill) the binding.gyp file
  2. Run npm init

The binding.gyp file is used to configure how node-gyp will build the native part of the extension. In my case, I went with something simple:

Once you have the file in your directory, npm init will recognize it and do the rest of the setup.

Loading the native interface can be tricky. Luckily, there is a package for that! The library is called node-bindings (published on npm as bindings). Using this library, you can load your native bundle with one simple line: require('bindings')('nci.node').

This will find your native bundle (in my case nci.node) and just load it. No need to think about release vs debug, about platforms, or anything else.

Module management

Exposing your module is done through the NAPI_MODULE macro. The macro takes a function as its second argument; this function is called to initialize the module. The initialization function needs to return everything that is exported from the module. For example, the following registers a new module with the function Init as initializer: NAPI_MODULE(NODE_GYP_MODULE_NAME, Init).

In my case, I add the newly created class definition to the existing exports and return that. I also take this time to store a few global variables that I’ll need to reuse in various calls.

Creating the class definition is simple enough: register the class by giving it a name and a constructor callback and then export it. For example, I went with the following:

Class/instance management

There are two Node.js syntaxes that can get your constructor called: new MyClass() and a plain MyClass() call without new.

I decided to support only the first version simply because it is a bit easier to do so. In order to distinguish the two versions, you’ll need to check the target of the call (new.target, which N-API exposes through napi_get_new_target). Basically, if the target is set, you are in the new MyClass() version. Doing so is pretty easy:

Once in the constructor, I get and validate the first argument, expecting an int. Once this is done, I simply create the native instance and wrap it like so:

Wrapping the instance is also the moment where you define a cleanup callback. That callback will be called when the object is garbage collected.

Once the object is wrapped, you’ll need to register all functions on the instance. In order to do so, simply repeatedly call napi_create_function and napi_set_named_property to create and add the function to the object.

Accessing the native object

In the function callbacks, you’ll need to access your native object. In order to do so, you’ll need to get the call information and then unwrap the target object (this) like so:

Loading the module

If you are using node-bindings like me, loading the library is done simply through: require('bindings')('nci.node'). It is a good idea to wrap that into your own module so that your users do not have to do it themselves.


A tale of C++ native Ruby and RAII

While I was playing with my last project, I wanted to gather the execution time of a few native functions. Doing so in C is a little bit painful: it requires quite a bit of code, temporary variables, and such. One really powerful idiom I liked using for this kind of task in C++ is Resource Acquisition Is Initialization (RAII). This post covers two subjects: using RAII to time function execution and using rake-compiler with C++ for native Ruby. So here is a tale of C++ native Ruby and RAII.

What is RAII

In a class, RAII is represented in a really simple way: the constructor acquires something (file handle, lock, etc.) and the destructor releases it.

An example of this is the std::ofstream class. An instance of this class acquires a handle on the given file and the destructor releases it. Therefore, the following is completely valid:

This idiom ensures that the underlying resource gets released correctly in all cases. There is no need to catch exceptions and rethrow them in order to manually close the file: it will be done when the stack unwinds.
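To make the std::ofstream case concrete, here is a minimal sketch (not code from the original project). The function names write_then_fail and read_first_line, and the idea of throwing on purpose, are assumptions made only for illustration. The point is that the file is flushed and closed by the destructor even though the function exits through an exception.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: writes one line to `path`, then throws to simulate a failure.
// The std::ofstream destructor still flushes and closes the file while the stack unwinds.
void write_then_fail(const std::string& path) {
    std::ofstream out(path);   // the constructor acquires the file handle
    out << "hello RAII\n";
    throw std::runtime_error("simulated failure");
    // No explicit close(): the destructor of `out` releases the handle.
}

// Hypothetical helper: reads back the first line of `path`
// (returns an empty string if the file cannot be read).
std::string read_first_line(const std::string& path) {
    std::ifstream in(path);
    std::string line;
    std::getline(in, line);
    return line;
}
```

Calling write_then_fail inside a try/catch block and then reading the file back with read_first_line should return the written line, since the exception is caught and the stack is unwound through the destructor.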

Using RAII to time function execution

In order to use RAII to time function execution, I went with a really simple flow:

  • The constructor gets the start time
  • The destructor gets the end time and prints the elapsed time

The code using the previous flow can be found here:

Using this class becomes super simple: create an instance and let it go out of scope. An example of this can be found in this file:
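As an illustration of the flow described above, here is a minimal sketch of such a timer. It is not the code from the repository: the class name ScopedTimer, the label argument, the choice of std::chrono::steady_clock, and the sum_to function are all assumptions made for this example.

```cpp
#include <chrono>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical RAII timer: the constructor records the start time,
// the destructor records the end time and prints the elapsed time.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto end = std::chrono::steady_clock::now();
        const auto us =
            std::chrono::duration_cast<std::chrono::microseconds>(end - start_).count();
        std::cerr << label_ << " took " << us << " us\n";
    }

    // Copying a timer would report the same scope twice, so forbid it.
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical function being timed: the timer reports when it goes out of scope.
long sum_to(long n) {
    ScopedTimer timer("sum_to");
    long total = 0;
    for (long i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

With this shape, timing a function only costs one extra line at the top of its body, and the elapsed time is reported on every exit path, including early returns and exceptions.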

Using rake-compiler with C++

I was expecting some major differences when using C++ instead of C with rake-compiler. Turns out that the tool does most of the heavy lifting. I only needed to put the C++ files in the directory and they got built magically.

There were only a few things that I needed to care about:

  • Ensure that the entrypoints are in an extern "C" block in order for the method signatures to be valid
  • Define a typedef to enforce C signatures whenever you call a C Ruby function that requires a function pointer
  • In order to manage memory correctly, remember to delete any allocated instances
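The three points above can be sketched without the Ruby headers. Everything in this example is a stand-in: Device, c_free_func, c_api_release, and the device_* functions are hypothetical names, and c_api_release only imitates a C library call that takes a function pointer. It is not Ruby's API.

```cpp
// Hypothetical native class wrapped by the extension.
class Device {
public:
    explicit Device(int pin) : pin_(pin) { ++live_count; }
    ~Device() { --live_count; }
    int pin() const { return pin_; }
    static int live_count;  // number of instances currently allocated
private:
    int pin_;
};
int Device::live_count = 0;

extern "C" {

// Function pointer type with C linkage: the shape a C API expects for a callback.
typedef void (*c_free_func)(void*);

// Stand-in for a C library call that receives a function pointer (NOT Ruby's API).
void c_api_release(void* data, c_free_func free_func) {
    if (data != nullptr) {
        free_func(data);
    }
}

// Entry points with C linkage, so their names are not mangled.
void* device_alloc(int pin) { return new Device(pin); }
int device_pin(void* data) { return static_cast<Device*>(data)->pin(); }

// Every instance created with new must be released with delete.
void device_free(void* data) { delete static_cast<Device*>(data); }

}  // extern "C"
```

In a real extension the C side would be Ruby itself: the entry point would be the Init_ function of the extension, and the function pointer would be handed to Ruby's own registration calls instead of c_api_release.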

Using rake-compiler to build native Ruby extensions

Having renewed my love for my Raspberry Pi, I wanted to make a quick library to access a temperature sensor through Ruby. This sensor uses a single-bus format, and one communication lasts about 4ms. I felt like this was good enough of an excuse to play with the native layer of Ruby. Here is how I ended up using rake-compiler to build native Ruby extensions.

Quick disclaimer: most of what I did was heavily inspired by a RubyGems guide. I’ll try to point out things that I had most issues with or that I found particularly interesting. All the code associated with this post is hosted on my GitHub project.

Adding a dependency on an external library

Adding a dependency on an external library was one of the things that were not completely clear. In my case, I needed to use libgpiod. In order to do so, I had to modify extconf.rb to add the following lines:

These two lines ensure that the generated Makefile will include the right search directories for the headers and link to the right library.

Controlling the native classes’ modules

I ended up with a really simple solution for handling my modules: the module is created in Ruby, and then fetched in C and used there. In the sample project, the module creation is done in the main Ruby file and then used when initializing the C module.

Getting the module was done easily: rb_const_get(rb_cObject, rb_intern("NCI")). Once this was done, creating the class was done using the handy rb_define_class_under: rb_define_class_under(mNCI, "NCINativeDevice", rb_cObject). The third parameter to this last function is the superclass, in this case the Object class.

Associating a C structure with the Ruby instance

Three functions come into play when using a C structure: nci_native_device_alloc, nci_native_device_init, and nci_native_device_free.

The first function (_alloc) is used to allocate an empty C structure. This function is not the initializer: it will not get any of the arguments passed on initialization. The important part of the function is the call to Data_Make_Struct. This macro will allocate the given C structure, but won’t fill it with your values for you.

The second function (_init) is the true initializer. It will receive the arguments that are passed in the Ruby code. In order to store these values in the C structure, you need to extract the C pointer from the instance. This is done using Data_Get_Struct. Weirdly enough, this won’t return the pointer to the C structure; it stores it in the last argument.

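The last function (_free) is the most important of all. It is responsible for deallocating whatever you allocated. One caveat about the C structure allocated for you by Data_Make_Struct: the system only frees it on its own when RUBY_DEFAULT_FREE is passed as the free function. When you pass your own function, as is done here, Ruby calls it instead, so it should also release the structure itself (for example with xfree).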
