Part 2 – NodeJS: The good, the bad and the Javascript

Following up on my previous post, NodeJS: The good, the bad and the Javascript – Part 1, let’s now talk a bit about the bad things that may happen when developing NodeJS applications.

The bad

Memory Leaks

Memory leaks – at least to me – are the black holes of the black holes. They can be so deeply nested in the application code that, even with a flashlight and night vision goggles, finding them can become a chore.

We had a memory leak that haunted us for a couple of weeks. We eventually found the cause, so I’m quite happy to share the story here.

First of all, let me get this out of the way:

Most memory leaks in Javascript can be avoided by properly declaring your variables. Be it within your module file or within your function scope, declaring them at the proper scope level will ensure they are discarded when not needed anymore.

Best of all, Mocha can detect leaked global variables every time you run your tests: just enable the --check-leaks option, as I mentioned in my previous post.

However, there will be situations where you’re not entirely sure what’s going on and the memory leak does not seem to be related to your code or dependencies at all. We had a case that was causing our process to halt at a certain point: the memory would spike, CPU usage would increase dramatically and eventually the process would be killed.

We are using Sequelize to manage our domain model, database connectivity, migrations and the like. It seemed like a good tool: it is quite mature, supports the popular database engines and seems to be the go-to solution for people starting with ORMs in NodeJS.

Our model contained multiple joins and nested relationships, which made Sequelize generate a massive join query. The result set was so large that it was far beyond what the process could handle.

To pinpoint the line where the issue was happening, our tech lead used node-heapdump, a tool that writes a snapshot file which can later be inspected in Chrome. The cool thing about it is that, if you have more than one snapshot, you can compare the differences between them.
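For anyone who wants to try the snapshot-and-compare workflow without installing anything, here is a rough sketch. Note that it does not use node-heapdump: it swaps in the v8.writeHeapSnapshot function that ships with newer Node versions, and the dumpHeap helper and file labels are made up for illustration.

```javascript
const v8 = require('v8');
const os = require('os');
const path = require('path');

// Hypothetical helper: writes a heap snapshot into the temp directory and
// returns the path of the generated file. The call is synchronous and can
// take a while on a large heap.
function dumpHeap(label) {
  const file = path.join(os.tmpdir(), label + '-' + Date.now() + '.heapsnapshot');
  return v8.writeHeapSnapshot(file);
}

const before = dumpHeap('before');
// ... exercise the code path you suspect is leaking ...
const after = dumpHeap('after');

console.log('Snapshots written to:', before, after);
```

Both files can then be loaded in the Memory tab of Chrome DevTools, where the comparison view shows which objects appeared between the two snapshots.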

Eventually we wrote some code around the issue to get the queries executed the way we wanted but, after some research, we felt that Sequelize was not really the right choice.

Although it provides strict modelling, the way its queries are constructed doesn’t really scream performance. BookshelfJS, on the other hand, executes queries the way we expect: one at a time for each model object, using the result of the previous query as input for the next one. Unfortunately it is lacking on the modelling side, forcing us to rely on other tools to validate the JSON objects being inserted.

To find out more about memory leaks, I would strongly suggest reading the excellent four-part article on Profiling Node.JS applications by Will Villanueva.
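The "one query at a time" idea can be illustrated with a small sketch. This is not Bookshelf’s API: the in-memory tables and the findWhereIn helper are hypothetical stand-ins for a database, and everything is kept synchronous for brevity, whereas real queries would be asynchronous.

```javascript
// Hypothetical in-memory tables standing in for a real database.
const tables = {
  authors: [{ id: 1, name: 'Ada' }, { id: 2, name: 'Linus' }],
  posts: [
    { id: 10, authorId: 1, title: 'First' },
    { id: 11, authorId: 1, title: 'Second' },
    { id: 12, authorId: 2, title: 'Third' }
  ],
  comments: [
    { id: 100, postId: 10, body: 'Nice' },
    { id: 101, postId: 12, body: 'Thanks' }
  ]
};

// Hypothetical query helper: one small "WHERE column IN (values)" lookup.
function findWhereIn(table, column, values) {
  return tables[table].filter(function (row) {
    return values.indexOf(row[column]) !== -1;
  });
}

// Three small queries, each fed by the result of the previous one,
// instead of a single join across all three tables.
function loadAuthorWithPosts(authorId) {
  const author = findWhereIn('authors', 'id', [authorId])[0];
  const posts = findWhereIn('posts', 'authorId', [author.id]);
  const postIds = posts.map(function (post) { return post.id; });
  const comments = findWhereIn('comments', 'postId', postIds);
  return { author: author, posts: posts, comments: comments };
}

const loaded = loadAuthorWithPosts(1);
console.log(loaded.posts.length, loaded.comments.length); // 2 1
```

Each step returns only the rows needed for the next one, so no single result set grows with the product of all the joined tables.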

Dependency Injection

Angular brought the wonderful world of dependency injection into Javascript. It was quite easy to follow the framework rules and write tests against your controllers. With NodeJS things don’t go quite so smoothly, but there are still some good ways around it.

Every module you write can be required by any other module. The main issue is that the path of the module you are requiring is relative to the module you’re writing, which leads to situations like require('../../../../../myModule'): not only ugly but pretty confusing.

While googling around I found that Bran van der Meer has already solved this issue in a multitude of ways. My favourite from the list is the last one: using a global wrapper, which goes like this (credit to @a-ignatov-parc):

global.rootRequire = function(name) {
    return require(__dirname + '/' + name);
}

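Then, provided the wrapper is defined in a file at the root of your project (__dirname resolves to the directory of the file that defines it), use it like this: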

var dependency = rootRequire('app/models/dependency');

When it comes to testing, things actually get worse because you don’t want to test the required modules. For that you can use a tool named rewire, which allows you to redefine a required module’s functions. It’s quite handy and will enable you to do your unit testing but, depending on the number of dependencies your module has, you may find yourself rewriting so many functions that your unit test will be as long as the module you are testing.

At that point you should ask yourself: should I really continue down that path?

Sometimes you do but sometimes all you need is a little refactor. Pull the logic out and test it individually: you will end up with smaller, maintainable modules and will be able to mock things at the appropriate level without leading to a test that is pure mock.

Another approach for dependencies is to bend the rules a bit and do things a bit differently: instead of requiring something at the module level, you can pass that dependency in on the method call. Your tests may not change much, but at least you won’t need to rewire anything and you will end up with a better test for your module and clear stubs.
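As a rough sketch of passing the dependency on the call, consider the example below. The greetUser function and the stub are hypothetical names made up for illustration; in real code the second argument would be whatever module you would otherwise have required at the top of the file.

```javascript
// Hypothetical module function: the lookup is passed in on the call
// instead of being required at the top of the file.
function greetUser(userId, findUser) {
  const user = findUser(userId);
  return user ? 'Hello, ' + user.name + '!' : 'Hello, stranger!';
}

// In a test, a plain stub stands in for the real database-backed lookup.
function stubFindUser(id) {
  return id === 42 ? { id: 42, name: 'Ada' } : null;
}

console.log(greetUser(42, stubFindUser)); // Hello, Ada!
console.log(greetUser(7, stubFindUser));  // Hello, stranger!
```

The trade-off is that callers now have to supply the dependency, but the test needs nothing beyond an ordinary function.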

Dependency Management

package.json is where your dependencies live. You have dev dependencies, peer dependencies, optional dependencies… the lot. It works pretty well until you decide to blow away your node_modules folder and install everything again and suddenly things don’t work so well.

The issue is related to locking your dependencies down. Things move quickly in the Node world and, in the space of 4 months, Sequelize went from version 2 to version 3 and multiple minor revisions. That broke a few things and we had to find a way to lock stuff down.

npm shrinkwrap is what you want for that. It generates a new file, npm-shrinkwrap.json, that records the versions installed in your node_modules folder and, when you execute npm install again, npm will install from the shrinkwrap file and not from package.json.

It has a few “gotchas” though: installing and removing dependencies becomes a three-step process, as you have to install the dependency (saving it to package.json), then regenerate the shrinkwrap file and finally commit the new file.

If you have anything extra in your node_modules folder, shrinkwrap will complain and you will have to start from the top again.

If there are dependencies missing from your node_modules folder, shrinkwrap will complain and you will have to start from the top again.

With shrinkwrap you don’t have to commit your node_modules folder but if you ever have to do that for whatever reason, a good way to keep your node_modules folder free from junk is ModClean. It strips out all useless files like readmes, tests, examples, build files, etc. Worth a look.

And since we are on the topic of dependencies, native dependencies are quite a nuisance. What runs beautifully on your Mac may not run on Linux distributions because you’re missing development libraries. And if you want to run on Windows, be prepared for a world of pain as you will have to install .NET and a bunch of other libraries that may be needed by your project.

A good piece of advice is to be very careful with the dependencies you pick: native dependencies will run faster but they need extra libraries. Pure Javascript solutions will very likely run slower, but you don’t have to concern yourself with operating systems.

Too much out there

Finally, the last downside: there’s just too much out there. It’s not that hard to pick, as you just need to make sure that whatever you pick is actively maintained and has good backing but, sometimes, small libraries written by “unknowns” are very good and do a better job.

One trick is to have clear criteria. Picking recently maintained dependencies has done the trick for us most of the time, although some updates broke things because they were not backwards compatible. Widely used libraries like Express and Restify were also very good and have great community backing.

Ultimately it will take some installing and uninstalling to clearly identify what works and what doesn’t but don’t be discouraged by that: sometimes hidden gems are worth the effort.

Source of blog: http://tarciosaraiva.com/2015/06/18/nodejs-the-good-the-bad-and-the-javascript-part-2/
