Adventures in Async Calls and bcrypt

We use bcrypt to store passwords in Luma Health. In the NodeJS world there are two common libraries used to do this, bcryptjs, which is a pure JavaScript implementation and bcrypt, which is a wrapper on a C++ library.

We had originally used the pure JS version since it was helpful when we upgraded NodeJS versions that linked libraries didn’t have different binary versions, which caused long recompile and npm install times.

Our original implementation (mostly out of laziness) used the synchronous version of the JS library. Since bcrypt is a cost-of-compute type of algorithm, any time we came under high login login (e.g. the start of the day), we started to run CPU hot and then start to time connections as the cost of doing password hashes started to starve out other work happening on our REST servers.

We diagnosed this using Clinic and generating flamegraphs of the system under standard load patterns and it became clear that the bcrypt work was sucking up all the oxygen in the NodeJS process.

Our first fix was to just move to the async methods of the bcryptjs library, which would mean we wouldn’t block the main event loop doing bcrypt’s hashes. Unforutnatly, this didn’t lead to as much of a performance imprvoement as we were hoping to get becuase after digging in to the implementation of bcryptjs (<3 open source), it turns out the main difference between the async version and sync version is that the library would do one round of blowfish per callback. It was definitly an improvement but it still consumed a lot of time within the main NodeJS process.

We then looked at moving to the C++ based module and again it had two different functions, one sync and another async. The sync one would run the hashing functions in C++, which is faster, but even better, the async version would run the hashing function in an nan::AsyncWorker.

Result? About a 30% increase in response times through our front ends a lot smoother load management under high load. Moral of the story? Always use async even when it’s tempting / easy / lazy to use a sync version of a function in NodeJS. The graph from the very beginning shows the improvements right after we deployed the new code to production.

Unbounded Range Queries in Mongo

In Luma Health, often times we have to do queries against collections that have 50, 60, 100M+ records — as you’d expect, well thought through queries and good indexes are the building blocks to querying these types of collections.

In today’s example, we had a collections in production that contains a DATE field where we have to do range queries (e.g. DATE > something, DATE < something). We started to notice a large number of the underlying API calls that hit that table were getting logged to our slow execution monitoring. Specifically in one service, we started seeing about 500 slow logs per 5 minute interval.

We spent some time looking through the Mongo query planner, digging in to the DB queries the API calls were making and and few found a few examples like this:

    date: { $lte: endDate },
    endDate: { $gte: startDate }

Now both the date and endDate fields were in a proper compound index that the Mongo query planner was using, but when looking through the execution stats, the query planner was canonicalizing each end of the ranges as date: {$lte: endDate, $gte: Infinity }. Yikes! All the hard word in indexing, query design, etc, went out the window — when the query executed, Mongo had to pull two entire ranges and then intersect them in memory rather than through the index.

We quickly fixed the queries and as you’d expect, much happier production monitoring. In the graph below you see around 6am the daily load starts to pick up and the the slow logs pick up in frequency. We deployed the change about 915am and like magic, the slow logs go back down to zero.

Moral of the story: unbounded range queries can lead to very unintended performance consequences.

Performance Implications When Comparing Types in Node.js

Like in any language that is weakly typed, you can’t avoid the fact that performing comparisons across types will cost you CPU cycles.

Consider the following code which does a .filter on an array of 5M entries, all of which are Numbers:

let arrOfNumbers = Array(5000000).fill(1);
console.time('eqeq-number')
arrOfNumbers.filter(a => a == 1)
console.timeEnd('eqeq-number')
console.time('eqeqeq-number')
arrOfNumbers.filter(a => a === 1)
console.timeEnd('eqeqeq-number')

On my Mac, they’re roughly equivalent, with a marginal difference in the performance in the eqeq and eqeqeq case:

eqeq-number: 219.409ms
eqeqeq-number: 225.197ms

I would have assumed that the eqeqeq would have been faster given there’s no possibility of data type coercion, but it’s possible the VM knew everything was a number in the array and the test value, so, meh, about the same.

Now, for the worst case scenario, consider this following code: the same .filter, but the array is now full of 5M strings of the value “1”:

let arrOfStrings = Array(5000000).fill('1');
console.time('eqeq-string')
arrOfStrings.filter(a => a == 1)
console.timeEnd('eqeq-string')
console.time('eqeqeq-string')
arrOfStrings.filter(a => a === 1)
console.timeEnd('eqeqeq-string')

The eqeq costs about the same as the original example with the weakly typed Number to Number comparison, but now the eqeqeq is significantly faster:

eqeq-string: 258.572ms
eqeqeq-string: 72.275ms

In this case it’s clear to see that the eqeqeq case doesn’t have to do any data coercion since the types don’t match, the evaluation is automatically false without having to muck the String to a Number. If you were to continue to mess around and have the .filters compare eqeq and eqeqeq to a String ‘1’ the results again are the same as the first few tests.

Conclusion? Same the VM work if you can. This is a really obtuse example as the eqeqeq can quickly shortcut the comparison to “false” since the types don’t match, but anywhere you can save effort when working on large data sets, it’s helpful to do so, and typing is an easy win when you can take it.