May 31, 2021

Locking and Synchronization for Node.js

When building software that is accessed by multiple actors concurrently, such as APIs and web services in general, you sometimes need to restrict access to a resource so that only a limited number of actors, maybe even just one, can use it at a time.

In multi-threaded systems, it can be important to have exclusive access to a non-threadsafe resource, for example.

Even in Node.js, which runs your JavaScript on a single thread, asynchronous operations can interleave, so you might want to be sure a specific resource is accessed only once at a time, or that outgoing requests are capped so you don't run into networking issues.

Let's start by thinking about the mechanisms we have to synchronize concurrent access to shared resources, then get to some examples!

Semaphore

A semaphore is a data structure initialized with a predefined positive integer value. Each time you access it by acquiring a lock, the value decreases by one. Completing your operation and releasing the lock will increase it again. Once the semaphore hits zero, no further access will be granted and any caller will be blocked until running processes release their acquired locks.

Semaphores are incredibly useful for capping concurrency at a desired level: If you're sending network requests and want to prevent too many sockets from being used, you can add a semaphore, acquire a lock before sending the request, and release it once you're done. Initializing the semaphore with a value of 5, for example, would allow up to five concurrent requests to run at each point in time.

Mutex

A special case of the semaphore, a mutex (short for mutual exclusion) is a semaphore with an initial value of one, which makes it useful as a simple locking structure, allowing only one operation at a time.

Node.js and synchronization

While Go offers mutexes through the built-in sync package (and semaphores through the golang.org/x/sync/semaphore module), Node.js offers no such functionality out of the box. Thankfully, async-mutex is an amazing package that fills this gap.

Let's start with an example of preparing a shared resource, but limiting the number of concurrent invocations so the preparation is not run more than once at a time. For this, we're going to use a simple Mutex:

import { Mutex } from 'async-mutex';

// While global values are discouraged in
// production code, they serve here as a
// shared source of state.
const mutex = new Mutex();
let sharedResource = null;

async function prepareResource() {
  await mutex.runExclusive(async () => {
    // Initialize our resource (details elided)
    sharedResource = { /* ... */ };
  });
}

As you can see, alongside the shared resource we instantiated a new mutex, which naturally has to be shared as well; otherwise, we could not limit concurrent access. Before performing any action on the shared resource, we attempt to acquire a lock.

While this could be done manually, it is recommended to use the runExclusive helper, which always releases the lock once the operation completes. If you choose the manual path, make sure the lock is released on every code path, including errors; otherwise, your application will stall the first time you run into a case that is not handled correctly.
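For illustration, here is a minimal sketch of the manual path (the function name is just for this example): acquire resolves to a releaser function once the lock is ours, and the try/finally block guarantees it is called even when the work throws.

import { Mutex } from 'async-mutex';

const mutex = new Mutex();

async function manualLocking() {
  // acquire() resolves once the lock is ours,
  // returning a function that releases it again.
  const release = await mutex.acquire();
  try {
    // ... work with the shared resource ...
  } finally {
    // Always release, even if the work above throws
    release();
  }
}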

For semaphores, let's imagine an example where we want to limit the number of concurrent requests.

import { Semaphore } from 'async-mutex';

const maxConcurrentRequests = 10;
const semaphore = new Semaphore(maxConcurrentRequests);

async function performRequest(url) {
  // 'url' stands in for whatever arguments
  // your request actually needs.
  // Acquire access before doing anything!
  return semaphore.runExclusive(async () => {
    // Dispatch the network request
    return fetch(url);
  });
}

In this example, we create a semaphore with an initial value of 10. We can now continue and invoke the performRequest function up to ten times before we have to wait for any pending request to finish.

Using semaphores for more than one concurrent actor and mutexes for exclusive access makes it incredibly easy to create guarantees about how shared resources are accessed.

Benefits of async-mutex

Now that we've covered some of the core features of async-mutex, let's take a second to talk about the remaining features, which are similarly useful!

When you know that all waiting (or pending) locks should be canceled, for example when shutting down your application or after a request is terminated, you can simply call .cancel() on your mutex or semaphore. This will reject all pending promises with E_CANCELED.
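As a minimal sketch, a caller waiting on the mutex can detect the cancellation by checking for E_CANCELED, which async-mutex exports (the function name here is purely illustrative):

import { Mutex, E_CANCELED } from 'async-mutex';

const mutex = new Mutex();

async function guardedOperation() {
  try {
    await mutex.runExclusive(async () => {
      // ... critical section ...
    });
  } catch (e) {
    if (e === E_CANCELED) {
      // Our wait for the lock was canceled,
      // e.g. because the application is shutting down
    }
  }
}

// Somewhere in your shutdown logic:
mutex.cancel();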

If you would like to limit the time your application waits for a lock to be acquired, you can use withTimeout to make your mutex or semaphore reject the pending promise with E_TIMEOUT after a specified period, if no lock could be acquired.

import { Mutex, Semaphore, withTimeout, E_TIMEOUT } from 'async-mutex';

const mutexWithTimeout = withTimeout(new Mutex(), 100);
const semaphoreWithTimeout = withTimeout(new Semaphore(5), 100);

If your application uses custom errors, or you want to expose the error to your users, you can supply your own error:

// Mutexes with custom errors
const mutex = new Mutex(new Error('fancy custom error'));
const mutexWithTimeout = withTimeout(
  new Mutex(),
  100,
  new Error('new fancy error')
);

// Semaphores with custom errors
const semaphore = new Semaphore(2, new Error('fancy custom error'));
const semaphoreWithTimeout = withTimeout(
  new Semaphore(5),
  100,
  new Error('new fancy error')
);

If you want to know early on whether a resource is locked, call isLocked() on your mutex or semaphore. This can help you decide whether to abort an operation before you even try to acquire a lock that would require you to wait. In scenarios with limited time or strict rate limits, this can be useful.
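As a sketch, one hypothetical way to use this for an early exit:

import { Mutex } from 'async-mutex';

const mutex = new Mutex();

async function timeSensitiveWork() {
  if (mutex.isLocked()) {
    // Someone else holds the lock; bail out
    // instead of queueing up and waiting.
    return;
  }
  await mutex.runExclusive(async () => {
    // ... time-sensitive work ...
  });
}

Note that this check is only a hint: between calling isLocked and actually acquiring the lock, another caller may still sneak in, so treat it as an optimization rather than a guarantee.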

If you're fine with receiving an error, you can also use tryAcquire(semaphoreOrMutex), which immediately rejects pending promises with E_ALREADY_LOCKED if no lock can be acquired. If needed, you can also pass a customized error as the second argument to tryAcquire.
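Here is a minimal sketch of that fail-fast pattern, checking for the E_ALREADY_LOCKED error that async-mutex exports (again, the function name is just for illustration):

import { Mutex, tryAcquire, E_ALREADY_LOCKED } from 'async-mutex';

const mutex = new Mutex();

async function failFastOperation() {
  try {
    await tryAcquire(mutex).runExclusive(async () => {
      // ... work that only runs if the lock is free ...
    });
  } catch (e) {
    if (e === E_ALREADY_LOCKED) {
      // Fail fast instead of waiting in line
    }
  }
}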

Careful when synchronizing

Locks are great: they make your concurrent logic easier to reason about, but they come at a cost. Locking shared resources naturally leads to bottlenecks when demand is higher than what your semaphore allows. With a mutex, any more than one concurrent request means waiting. For some applications, this may be useful by design; for others, it can lead to long stretches your application just spends waiting.

Implementing strategies to fail early or handle cases where no lock can be acquired in an acceptable timeframe may help alleviate any problems caused by congested locks.

And when you do not use runExclusive but decide to go with manual locking, always, always, always make sure to release the lock once you're done. Way too many times have I investigated why an application would just stall for no apparent reason, only to find out later that a lock wasn't released. If you don't have a good reason to do otherwise, use runExclusive and save yourself some trouble.

Also, note that we've talked about instance-level locks in this post. If you're dealing with a deployment of multiple instances and processes and want to enforce concurrency limits across all of them, you need a distributed locking mechanism (Redis, Consul, etc.).