Requiring Modules in CouchDB

Thanks to node.js, using the same code on the server and client is easier than ever. What about running application code in the database? If you have a function that is expensive to compute, but always returns the same result for the same input, then you could get a significant speedup by having your database cache the results of the computation ahead of time. This is especially true if you’re using CouchDB, which uses incremental MapReduce to make view computation very efficient.

Our realtime app at Getable is backed by CouchBase, a derivative of CouchDB. CouchBase and CouchDB share the same replication protocol, but differ in several important ways. One such difference is that CouchBase does not support require, while CouchDB does. One major caveat with CouchDB’s require is that you have to push the modules you want up as design documents. Since forcing humans to resolve dependency trees is outlawed in the Geneva Conventions, we need a better way of doing this.

Enter Browserify’s standalone option. We can use it to create a string of code that is easily prepended to the map function in our views. Browserify’s standalone option takes a string, which is the name of the exported module. Critically, the export will be added to the global object if require is not available, which is the case in CouchBase views. Therefore, if you create a browserify bundle with the {standalone: 'mymodule'} option and prepend that string to your map function, you will now have global.mymodule available for use. The one gotcha is that the global object does not exist in CouchBase views either, so it must be initialized ahead of the bundle.

Create an entry script that just exports the module you want:

// entry.js
module.exports = require('mymodule')

Then bundle it with the standalone option and initialize global ahead of time:

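// bundler.js
var browserify = require('browserify')

var bundle = browserify({standalone: 'mymodule'})
bundle.add('entry.js')
bundle.bundle(function (err, src) {
  if (err) throw err
  // Initialize global first, since CouchBase views do not define it
  var prelude = 'var global={};' + src.toString()
})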

Now, if you have a map function, you can just insert this prelude right after the function header:

function mapFunction (doc) {  
  // prelude goes here!
  emit(doc._id, global.mymodule(doc))
}

As an implementation note, we have a small script that uses Function.toString() to manage our design documents. It turns our map functions into strings, searches for the use of application logic, and browserifies the appropriate standalone bundle for each function. It’s less prone to failure than manual updates, and makes the experience just a bit more magical.
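
The splicing step could be sketched roughly as follows. This is not the script described above: the helper name injectPrelude and the hand-written stand-in for the bundle are illustrative assumptions, and the sketch only handles map functions written as a plain function (doc) { ... }.

```javascript
// Hypothetical helper: splice a prelude into a map function's source text.
// Assumes a plain `function (doc) { ... }`, so the first "{" opens the body.
function injectPrelude(fn, prelude) {
  var source = fn.toString();
  var bodyStart = source.indexOf('{') + 1;
  return source.slice(0, bodyStart) + '\n' + prelude + '\n' + source.slice(bodyStart);
}

// Stand-in for the browserify output; a real standalone bundle is much longer.
var fakeBundle = 'global.mymodule = function (doc) { return doc.name.toUpperCase(); };';
var prelude = 'var global={};' + fakeBundle;

function mapFunction(doc) {
  emit(doc._id, global.mymodule(doc));
}

// The string that would be stored as the view's map function.
var viewSource = injectPrelude(mapFunction, prelude);
```

Because the prelude lands inside the function body, the global it declares stays local to the map function and never leaks into the view server.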


The “One Year Later” update: we’ve seen vastly improved performance by pushing these standalone bundles up under the lib key.

Jankproof Javascript

Javascript is getting faster all the time, but things like long lists of complex cells are always going to be expensive to compute. To solve this, I wrote the unjank module last week. It helps you do expensive things in Javascript without causing the user experience to suffer.

I’m happy to report that after a week of production use, it’s clear that this technique is a significant improvement over what we were doing before for several reasons.

Device Agnostic

It doesn’t matter how fast or slow the task is; unjank benchmarks it on-the-fly and runs it as quickly as the device will allow. This means that your application is jank-free on all devices without you having to come up with magic numbers that determine how quickly a task should run.

Smooth Scrolling

An unexpected discovery was that kinetic scrolling in Webkit works very well even if the page is getting longer during the scroll. This means that if your user is scrolling down a long list as it is being rendered with unjank, they will not perceive it as slow at all. Webkit preserves the momentum of the scroll and keeps going as the page gets longer.

Aborting Tasks

The ability to abort an ongoing task is critical because most tasks are initiated by a user action. For example, if you have two tabs that have a long list each, quickly switching between the tabs will eventually crash the application unless the rendering of the lists is aborted when the tab becomes inactive.
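
To make the idea concrete, here is a rough sketch of time-boxed batching with an abort handle. It is not unjank's actual API or implementation: runInBatches, its options, and the 8 ms default budget are all assumptions made for illustration.

```javascript
// Hypothetical sketch, NOT unjank's real API: call `task` once per item in
// time-boxed batches, yielding to the event loop between batches.
function runInBatches(items, task, options) {
  options = options || {};
  var budgetMs = options.budgetMs || 8; // assumed time budget per batch
  var schedule = options.schedule || function (fn) { setTimeout(fn, 0); };
  var done = options.done || function () {};
  var index = 0;
  var aborted = false;

  function batch() {
    var start = Date.now();
    // Work until this batch's measured time exceeds the budget or we are aborted.
    while (!aborted && index < items.length && Date.now() - start < budgetMs) {
      task(items[index], index);
      index += 1;
    }
    if (aborted) return;
    if (index < items.length) schedule(batch);
    else done();
  }

  schedule(batch);

  return {
    abort: function () { aborted = true; }
  };
}
```

In a browser you might pass requestAnimationFrame as the schedule option, and call abort() on the returned handle when a tab becomes inactive.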

Conclusion

I’m going to be using unjank a lot more going forward, especially where lists are involved. I pulled up the Getable app to experience it pre-unjank, and it has that signature lagginess associated with web apps, despite our use of requestAnimationFrame. With unjank, our longest lists no longer cause the browser to stutter — a small step out of the uncanny valley of hybrid apps.

Tips For Debugging JavaScript Race Conditions

Race conditions are tedious to debug, even in a single-threaded language like JavaScript. Here are three tips I hope you never need.

1. Breaking On DOM Changes

When dealing with nested views and asynchronous renders, it’s common to run into issues where a subview fails to appear reliably.

To debug these issues, I like using Chrome’s ability to break on DOM changes. Right click on a node in the Elements panel of your dev tools and you’ll be presented with different types of breakpoints you can set.

dom-breakpoint-screenshot.png

2. Spying On Object Properties

A particularly frustrating problem that results from improperly cloning objects is when properties change their values seemingly at random.

// One source of this problem is devious .toJSON methods
// that don't clone their source object
var myModel = new Backbone.Model({uhoh: {nested: 'somevalue'}})
var copy = myModel.toJSON()
copy.uhoh.nested = 'changed'
myModel.get('uhoh').nested == 'changed' // true: the model changed through the copy

View on jsfiddle

Here the source of the mischief is clear, but you won’t always be so lucky. If the code that manipulates the object is in some asynchronous callback three modules deep, it’s not trivial to figure out where the source of the problem is, since there are so many ways to manipulate the value:

// The many different ways to cause this problem
myModel.get('uhoh').nested = 'changed'  
myModel.toJSON().uhoh.nested = 'changed'  
myModel.attributes.uhoh.nested = 'changed'  

You can handle all these cases by defining a setter on the uhoh object with a debugger breakpoint in it:

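var uhoh = myModel.attributes.uhoh
uhoh._nested = uhoh.nested // preserve the current value
uhoh.__defineSetter__('nested', function(val) {
  debugger
  uhoh._nested = val
})
uhoh.__defineGetter__('nested', function() { return uhoh._nested })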

View on jsfiddle

Now you’ll be able to look at the call stack no matter how that property was changed.
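
The same idea can be sketched with the standard Object.defineProperty, which also keeps reads working through a getter. The helper spyOnProperty and its callback signature are illustrative names, not part of Backbone or any library.

```javascript
// Hypothetical helper: intercept writes to obj[key] while keeping reads working.
function spyOnProperty(obj, key, onSet) {
  var value = obj[key]; // keep the current value in a closure
  Object.defineProperty(obj, key, {
    configurable: true,
    enumerable: true,
    get: function () { return value; },
    set: function (next) {
      onSet(next, value); // swap in a `debugger` statement here when hunting a bug
      value = next;
    }
  });
}

var model = { uhoh: { nested: 'somevalue' } };
var writes = [];
spyOnProperty(model.uhoh, 'nested', function (next, previous) {
  writes.push(previous + ' -> ' + next);
});

model.uhoh.nested = 'changed';
```

Any code path that holds a reference to the same uhoh object goes through the setter, so the call stack at that point shows who made the change.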

3. Making It Worse

Debugging is hard because it’s the inverse of programming: “Given the code, pinpoint the source of this behavior”. This is especially true when dealing with race conditions, because the behavior doesn’t happen all the time. This is why when all else fails, I solve the inverse problem: writing code that makes the bug happen reliably.

If you can make the issue 100% reproducible, not only have you gained insight into the root cause of the problem, you’ve also shortened the debugging cycle.

In general, this is what I do:

  1. Locate the sections of code that manipulate the misbehaving data
  2. Wrap those sections in setTimeout with large-ish intervals to force a deterministic order of operations
  3. Permute the order of operations until the problem is 100% reproducible
  4. Permute the order of operations until the problem is gone

I like to use prime numbers as intervals to reduce the odds that multiple sections of code run in close temporal proximity to each other because they happened to be repeating in phase.
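
The steps above can be sketched like this. The helper forceOrder, the step names, and the specific delays are made up for illustration; the timer argument is only there so the resulting order can be checked without waiting.

```javascript
// Hypothetical helper: run each named step after its own delay, so the
// interleaving is fixed by the delays rather than by chance.
function forceOrder(steps, delays, timer) {
  timer = timer || setTimeout; // injectable so the order can be inspected
  Object.keys(steps).forEach(function (name) {
    timer(steps[name], delays[name]);
  });
}

var log = [];
var steps = {
  fetch: function () { log.push('fetch'); },
  render: function () { log.push('render'); }
};

// Prime delays in milliseconds; swap the two values to try the other order.
forceOrder(steps, { fetch: 307, render: 101 });
```

With these delays render always runs before fetch, so if that ordering is the one that triggers the bug, it will now trigger every time.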


I hope these tips save you as much time as they’ve saved me.

Sleepsort

Sleepsort might be a joke, but its implementation demonstrates three very important things about javascript: closures, asynchronous functions, and variable hoisting.

It is therefore an excellent example for a beginner to study to understand javascript better.

The Naive Implementation

function sleepsort (input) {  
  for(var i=0; i<input.length; ++i) {
    setTimeout(function () {
      console.log(input[i]);
    }, input[i] * 1000);
  }
};

sleepsort([3,1,2]);  

Output:

undefined  
undefined  
undefined  

What happened here? Why did we get three undefined elements?

If we modify the example a little bit, the reason is clear:

function sleepsort (input) {  
  for(var i=0; i<input.length; ++i) {
    setTimeout(function () {
      console.log(input[i]);
    }, input[i] * 1000);

    console.log('i is ' + i);
  }
};

sleepsort([3,1,2]);  

Output:

i is 0  
i is 1  
i is 2  
undefined  
undefined  
undefined  

The for loop kept running, incrementing the value of i. By the time the callback functions ran, the value of i was 3, which was out of bounds of the input array.

This demonstrates closure, which is the ability of a function to “remember” the scope it was created in, and access the variables in that scope.
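
A smaller example may help isolate closure from the timing issue. The names makeCounter and next are illustrative only.

```javascript
// Each call to makeCounter creates a new scope with its own `count`.
function makeCounter() {
  var count = 0;
  // The returned function "remembers" count even after makeCounter returns.
  return function () {
    count += 1;
    return count;
  };
}

var next = makeCounter();
var first = next();  // 1
var second = next(); // 2

// A second counter gets a separate scope, so it starts over.
var other = makeCounter();
var otherFirst = other(); // 1
```

The callbacks in sleepsort work the same way, except that they all share a single i instead of each getting their own.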

The Naive Fix

A beginner would likely try to fix the problem with this code:

function sleepsort (input) {  
  for(var i=0; i<input.length; ++i) {
    var j = i; // copy i into j.

    setTimeout(function () {
      console.log(input[j]);
    }, input[j] * 1000);
  }
};

sleepsort([3,1,2]);  

Output:

2  
2  
2  

Why did we get three 2s this time? The reason is hoisting. All var declarations in javascript are “hoisted” to the top of the function they are in. Since a var is scoped to its enclosing function rather than to the enclosing block, our code was equivalent to this:

function sleepsort (input) {  
  var j;

  for(var i=0; i<input.length; ++i) {
    j = i; // copy i into j.

    setTimeout(function () {
      console.log(input[j]);
    }, input[j] * 1000);
  }
};

sleepsort([3,1,2]);  

Unlike languages like Java and C, the for loop did not have its own scope. Thus, j was last assigned the value of 2 before the loop ended, and that was the element that was printed three times.
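
Hoisting can be observed directly with a short sketch like the one below. The function name hoistingDemo is illustrative.

```javascript
function hoistingDemo() {
  // Reading j here does not throw, because the declaration below is hoisted.
  // Only the declaration moves; the assignment stays where it was written.
  var before = j;

  for (var i = 0; i < 1; ++i) {
    var j = 'set inside loop';
  }

  // j is still visible after the loop, since the loop has no scope of its own.
  return [before, j];
}

var result = hoistingDemo(); // [undefined, 'set inside loop']
```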

The Real Fix

The real solution to this problem is to create a new scope. By wrapping our timeout code in an immediately invoked function expression (IIFE), we create a variable j in a new scope and assign it the current value of i. Since j is in a new scope, it is untouched by future iterations of the loop.

function sleepsort (input) {  
  for(var i=0; i<input.length; ++i) {
    (function (j) {
      setTimeout(function () {
        console.log(input[j]);
      }, input[j] * 1000);
    })(i);
  }
};

sleepsort([3,1,2]);  

Output:

1  
2  
3  
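
Where ES2015 is available, declaring the loop variable with let should give the same result without a wrapper function, because each iteration gets its own binding. The sketch below collects callbacks instead of using timers so the difference shows up immediately; the function names are illustrative.

```javascript
// With var, every callback shares one i, which is 3 once the loop ends.
function callbacksWithVar(input) {
  var callbacks = [];
  for (var i = 0; i < input.length; ++i) {
    callbacks.push(function () { return input[i]; });
  }
  return callbacks;
}

// With let, each iteration has its own i, so each callback sees its own index.
function callbacksWithLet(input) {
  var callbacks = [];
  for (let i = 0; i < input.length; ++i) {
    callbacks.push(function () { return input[i]; });
  }
  return callbacks;
}

function runAll(callbacks) {
  return callbacks.map(function (callback) { return callback(); });
}

var varResults = runAll(callbacksWithVar([3, 1, 2])); // [undefined, undefined, undefined]
var letResults = runAll(callbacksWithLet([3, 1, 2])); // [3, 1, 2]
```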

Optimization

While optimizing sleepsort is somewhat of a laughing matter, creating functions inside a loop is almost always a bad idea. The following code is equivalent and resolves this problem:

function sleepsort (input) {

  function sort (j) {
    setTimeout(function () {
      console.log(input[j]);
    }, input[j] * 1000);
  };

  for(var i=0; i<input.length; ++i) {
    sort(i);
  }
};

sleepsort([3,1,2]);  

I hope that helped your understanding of closure, asynchrony, and hoisting!

Is this really O(n)?

No, it’s actually using your operating system’s insertion sort.

This code is all available on github and of course, npm.