Introduction

Node has changed quite a bit since I first wrote this book. In those days, npm was not bundled with Node, libuv didn't exist, streams were much less commonly used, the zlib, cluster and domain modules didn't exist, and DNS and SSL were slower.

It's only been about a year since I released the first edition, but I felt like I needed to do a second edition.

I wanted to expand on a couple of topics:

  • Streams. Node 0.10 introduces a new Streams API, and ever since 0.4 added .pipe(), streams have been steadily gaining steam. I did correctly identify them as a core construct, but now I feel like they deserve their own chapter.
  • npm and packaging for the web. When I started writing the book, npm was a third-party add-on to Node that you had to install separately. Nowadays, npm is bundled with Node, and it is a major source of happiness for me, so it needed its own chapter. Additionally, packaging code for the browser has become an issue that more people pay attention to, and I felt like it was worth giving it a more detailed treatment (partly based on my book on single page apps).
  • Indoctrination. I realized later on that people who'd read my book would still do a lot of silly things, so I figured that it was worth spending a bit of time on the small niceties.

Basics

Learn about the event loop, asynchronous coding style, the gotchas around scope rules and "this", the ES5 functions that you didn't use because of IE, how OOP is done in JS and most importantly, the basic control flow structures for asynchronous code.

Node concepts

I highly recommend that you read chapter 7 on control flow. It's still my favorite chapter in this book.

What are the Node-specific concepts that someone needs to know?

  • modules
  • events
  • streams

Control flow

Illustration:

| --> later
|
V now
  • the fact that you don't really need a bunch of fancy control flow patterns, given that most task definition is local and there is only one pattern: parallel execution with limited concurrency
  • the task queue pattern
  • other concurrency issues:
    • preventing duplicate requests to an object
    • shared concurrency queue
    • retrying requests and timing out requests
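
A minimal sketch of the "parallel execution with limited concurrency" pattern from the list above. The helper name limited() and the error-first task signature are my assumptions for illustration, and the sketch assumes limit is at least 1. It is one way to write this, not the only one.

```javascript
// limited() is a hypothetical helper name, not part of Node core.
// Each task is a function(callback) that calls callback(err, result) once.
function limited(limit, tasks, done) {
  var results = [];
  var running = 0;   // tasks started but not yet finished
  var next = 0;      // index of the next task to start
  var finished = 0;
  var failed = false;

  if (tasks.length === 0) { return done(null, results); }

  function launch() {
    // Start tasks until the limit is reached or nothing is left to start
    while (!failed && running < limit && next < tasks.length) {
      start(next++);
    }
  }

  function start(index) {
    running++;
    tasks[index](function(err, result) {
      running--;
      if (failed) { return; }
      if (err) { failed = true; return done(err); }
      results[index] = result;   // keep results in task order
      finished++;
      if (finished === tasks.length) { return done(null, results); }
      launch();
    });
  }

  launch();
}

// Usage: five fake tasks, at most two running at any time
var tasks = [1, 2, 3, 4, 5].map(function(n) {
  return function(callback) {
    setTimeout(function() { callback(null, n * 2); }, 10);
  };
});

limited(2, tasks, function(err, results) {
  if (err) { throw err; }
  console.log(results);
});
```

Setting the limit to 1 gives series execution, and setting it to the number of tasks gives fully parallel execution, which is why one pattern covers most cases.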

Neat fns

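// remove falsy values
['a', false, 'b', undefined, 'c', null, NaN, 'd'].filter(Boolean)
// returns ["a", "b", "c", "d"]

// drop adjacent duplicates; the array must be sorted first
function uniq(){
  var prev;
  return function(i){
    var isDup = (i === prev);
    prev = i;
    return !isDup;
  }
}
// Usage: ['a','b','a'].sort().filter(uniq())
// returns ["a", "b"]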


// find matching keys by regexp
var re = /c/;
['a', 'b', 'c'].filter(re.test, re)
// returns ["c"]

// find matching keys by object
var inc = { b: true };
['a', 'b', 'c'].filter(inc.hasOwnProperty, inc)
// returns ["b"]

// negate
function negate(func) {
  return function() {
    return !func.apply(this, arguments);
  };
}

// filter array by regexp
var re = /c/;
['a', 'b', 'c'].filter(negate(re.test), re)
// returns ["a", "b"]

// filter array by object
var x = { b: true };
['a', 'b', 'c'].filter(negate(x.hasOwnProperty), x)
// returns ["a", "c"]


var dep = { a: 'aa', b: 'vv' };
// exclude object keys: keep every key of dep that is not in the exclusion object
var whitelist = Object.keys(dep).filter(negate(Object.prototype.hasOwnProperty), { a: true });
JSON.parse(JSON.stringify(dep, whitelist));
// returns { b: "vv" }

// Quick way to track down a global variable leak in Node.js
// (replace 'name' with the name of the leaking variable):

Object.defineProperty(global,'name', { set: function() { console.trace(); }});

Streams

Unix streams:

a | b | c | d

Node streams:

a.pipe(b).pipe(c).pipe(d)
  • getting data into writable streams
  • writing your own streams
  • using object mode

Writing a module as a stream:

#!/usr/bin/env node

process.stdin.pipe(require('../index.js')()).pipe(process.stdout);

File system

  • sync is ok, most of the time
  • file handles are a limited resource
  • the path-walking, task definition, task execution pattern: don't try to do all the things at once
    • antipattern: the directory walker function (traverseDirectory(startDir, onFile))
  • use a shared task queue if you need to constrain parallelism across a complex app
  • debugging file handle leaks
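
A rough sketch of the path-walking, task definition, task execution pattern from the list above, kept as three separate steps. The function names (listFiles, defineTasks, runSeries) are hypothetical, and reading the file size stands in for whatever work you actually need to do per file.

```javascript
var fs = require('fs');
var path = require('path');

// Step 1: path walking. Sync calls are fine here; returns a flat list of files.
function listFiles(dir) {
  var files = [];
  fs.readdirSync(dir).forEach(function(name) {
    var full = path.join(dir, name);
    // lstat does not follow symlinks, so symlink loops cannot recurse forever
    if (fs.lstatSync(full).isDirectory()) {
      files = files.concat(listFiles(full));
    } else {
      files.push(full);
    }
  });
  return files;
}

// Step 2: task definition. One function per file; nothing runs yet.
function defineTasks(files) {
  return files.map(function(file) {
    return function(done) {
      fs.stat(file, function(err, stat) {
        if (err) { return done(err); }
        done(null, { file: file, size: stat.size });
      });
    };
  });
}

// Step 3: task execution. One task at a time, so only one fs call is pending.
function runSeries(tasks, done) {
  var results = [];
  (function next(i) {
    if (i === tasks.length) { return done(null, results); }
    tasks[i](function(err, result) {
      if (err) { return done(err); }
      results.push(result);
      next(i + 1);
    });
  })(0);
}

// Usage (the directory is just an example):
// runSeries(defineTasks(listFiles('./lib')), function(err, results) {
//   if (err) { throw err; }
//   console.log(results);
// });
```

Because the list of files exists before any work starts, the execution step can be swapped for a queue with a concurrency limit without touching the walker, which is the point of not doing everything inside one traverseDirectory-style callback.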

Node core is small:

  • the basic philosophy is that core should not contain "nice to have" functionality. Things that can be implemented in many ways are pushed out to modules.
  • This means that for some functionality you are expected to install a module, for example:
    • the mkdirp module is often used for mkdir -p
    • the rimraf module is often used for rm -rf

Events

  • using EventEmitter (or microee)
  • using a mixin to extend a class with ee
  • once
  • microee.when
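
A small sketch of the first and third items above using Node's built-in EventEmitter. The Job class is made up for the example, and it uses inheritance via util.inherits rather than a mixin. microee is not shown here.

```javascript
var EventEmitter = require('events').EventEmitter;
var util = require('util');

// Hypothetical class that gains on/once/emit by inheriting from EventEmitter
function Job(name) {
  EventEmitter.call(this);
  this.name = name;
}
util.inherits(Job, EventEmitter);

Job.prototype.finish = function() {
  this.emit('done', this.name);
};

var job = new Job('build');
var log = [];

// on() listeners run on every emit; once() listeners run on the first only
job.on('done', function(name) { log.push('on:' + name); });
job.once('done', function(name) { log.push('once:' + name); });

job.finish();
job.finish();

console.log(log);
// log is now ['on:build', 'once:build', 'on:build']
```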

Parallel execution

  • child_process.fork
  • on('message')
  • passing file descriptors like sockets

Advanced concurrency

  • continuously updatable queue
  • stratified execution
  • dealing with concurrency conflicts using an event emitter (e.g. two concurrent fs.read operations or http.get operations)

Error handling

  1. Return errors rather than throwing when writing async functions
  2. Wrap all JSON.parse calls in try ... catch

Throwing errors is not a reliable way of handling errors if asynchronous processing is involved.

Always pass the error as the first argument to the callback.

Do not ignore errors in your own code. You will need to handle errors eventually, so you might as well get started immediately.
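
A short sketch that puts both rules together. readJSON is a hypothetical helper name. The points it illustrates are that the async function never throws, that JSON.parse is wrapped in try ... catch, and that any error comes back as the first callback argument.

```javascript
var fs = require('fs');

// Hypothetical helper: read a file and parse it as JSON.
function readJSON(filename, callback) {
  fs.readFile(filename, 'utf8', function(err, text) {
    if (err) { return callback(err); }   // pass I/O errors along
    var data;
    try {
      data = JSON.parse(text);           // throws SyntaxError on bad input
    } catch (e) {
      return callback(e);                // return the error, do not throw
    }
    // Called outside the try block, so an exception thrown by the
    // callback itself is not mistaken for a parse error
    callback(null, data);
  });
}

// Usage: handle the error first, then use the result
readJSON('./package.json', function(err, pkg) {
  if (err) { return console.log('could not load:', err.message); }
  console.log(pkg.name);
});
```

A throw inside the readFile callback would happen long after the caller's own try ... catch has exited, which is why throwing is not reliable once asynchronous processing is involved.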

What things are becoming increasingly important?

  • Testing
  • Packaging and sharing code between front- and backend

Control flow

  • rather than doing the tedious passing around of results as some random parameter, just use lexical scoping

Two pictures:

|  first
|  ---> later, when some task is done
|  second
V

vs:

| first
 \ second
 / third
V

Modules

  • underscore
  • mocha
  • express
  • jade

Searching for modules

npmjs.org or npm search zip

Installing modules

Run:

npm install archey

Create:

var archey = require('archey');

archey();

Installing a module globally

npm install -g archey

Run:

archeyjs

Creating your own module

npm init

Show the steps.

Tracking dependencies using package.json

npm install --save archey

Publishing modules on npm

npm publish

Basics

Best of npm

npm install --save

npm ls

npm scripts:

"scripts": {
  "test": "mocha --reporter spec test"
}

Notable: prepublish, pretest, prestart.

npm link:

  • run npm link in the package folder you want to link
  • run npm link <name> to add a symlink in another folder (typically your app or other code which uses the development version of the code)

Node standard library

Specialized topics