Isomorphisms Explained

I haven’t seen a good introduction to isomorphisms since they came into vogue last year, so I decided to write one.

Here goes!

A Parallel Postulate

Try defining “distance”. Most of us would say something like “how far two things are from each other”. Unfortunately, that makes answering a question like “how far is Alice from Bob?” very difficult, because there are several perfectly valid answers: Ten meters. One thousand centimeters. If Alice is ten meters away from Bob, is Bob negative ten meters away from Alice? Can I measure a zig-zag line from Alice to Bob?

To avoid ambiguity, a mathematician would use a definition like “Distance is the minimum number of one meter rulers, laid end-to-end, needed to touch both A and B”. It’s interesting how fragile yet robust these definitions are. Remove any piece of the definition and it falls apart. The word “minimum” ensures that the shortest path from A to B is measured. The use of one meter rulers ensures that distance is always an integer (no decimals or fractions). The answer will always be a positive number because you can’t have a negative number of rulers.

Now let’s change our perspective. We zoom really far out and see that Alice and Bob are on opposite sides of the Earth. Well, now “distance” depends on whether we are allowed to lay rulers through the center of the Earth. What if Alice and Bob were on the surface of a balloon? If we defined distance as the smallest angle between them from the center of the balloon, we’d have a unique answer even if the balloon inflates or deflates.

Here’s the main takeaway: the word “distance” doesn’t have any inherent meaning. Its meaning depends on our perspective, and if we are creative enough with our perspective, multiple meanings can exist at the same time. Keep this idea in mind, as it will return in the conclusion of this post.

I’m going to explain three types of isomorphism in this post, in order of complexity. The first is Graph Isomorphism. The second is Group Isomorphism. The third is Gödel Numbering.

Graph Isomorphism

I find that I learn best when the same thing is explained in at least two different ways. We’re going to start with graph isomorphisms because we can approach them both visually and theoretically.

A graph is a tuple (V, E) of a set of vertices V and a set of edges E. An edge is a tuple (Vi, Vj) where Vi and Vj are vertices. The vertices Va and Vb are adjacent if and only if there exists an edge (Va, Vb).

The graphs G and H are isomorphic if there is an isomorphism between them. An isomorphism between G and H is a bijection that maps vertices in G to vertices in H such that if two vertices are adjacent in G, the vertices that they are mapped to are also adjacent in H.

It’s easier to understand this with pictures. Two graphs are isomorphic if you can arrange their vertices and edges in such a way that they look exactly the same.

graph-isomorphism-1-1-1-2.png

Figures 1.1 and 1.2 are isomorphic. You can verify this visually by rotating the second figure ninety degrees clockwise and verifying that it looks exactly the same as the first figure. The other way is to define a function f(x) that meets the strict requirements we laid out above. Here is such a function f(x):

f(1) = d
f(2) = b
f(3) = c
f(4) = a

We now verify that this function meets our requirements. The first rule is that f(x) must be a bijective function. Bijective functions are also called bijections. A bijection is both injective and surjective. Injective functions map every member of their input set to a unique member of their output set. Surjective functions map at least one member of their input set to every member of their output set. Take a moment to understand this with the help of the following diagram, and then convince yourself that f(x) is a bijection.

graph-isomorphism-function-types.png

The second and final rule is that vertices that were adjacent in 1.1 must be adjacent in 1.2 after going through f(x). For succinctness, we’ll write “Vertex 1 is adjacent to vertex 2 and vertex 3” as 1 ADJ 2, 3. Convince yourself that f(x) preserves adjacency with the help of the table below.

1 ADJ 2, 3    => d ADJ b, c
2 ADJ 1, 3    => b ADJ d, c
3 ADJ 1, 2, 4 => c ADJ d, b, a
4 ADJ 3       => a ADJ c
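
If you would rather have a computer do the convincing, here is a small sketch in JavaScript (with the edge lists transcribed from the figures) that checks both requirements for our f(x):

// Edge lists transcribed from figures 1.1 and 1.2
var G = [[1, 2], [1, 3], [2, 3], [3, 4]]
var H = [['d', 'b'], ['d', 'c'], ['b', 'c'], ['c', 'a']]

// The candidate mapping from above
var f = {1: 'd', 2: 'b', 3: 'c', 4: 'a'}

function hasEdge (edges, a, b) {
  return edges.some(function (e) {
    return (e[0] === a && e[1] === b) || (e[0] === b && e[1] === a)
  })
}

function isIsomorphism (f, G, H) {
  var outputs = Object.keys(f).map(function (k) { return f[k] })
  // Rule one: no two inputs share an output, and since both graphs have
  // four vertices, injective here implies surjective
  if (new Set(outputs).size !== outputs.length) return false
  // Rule two: every edge of G must map to an edge of H; equal edge
  // counts ensure non-adjacent vertices stay non-adjacent too
  return G.length === H.length && G.every(function (e) {
    return hasEdge(H, f[e[0]], f[e[1]])
  })
}

console.log(isIsomorphism(f, G, H)) // true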

That’s all there is to it. Simple? Not quite. Can you tell if these two graphs are isomorphic?

graph-isomorphism-1-3-1-4.png

As it turns out, it’s pretty difficult to figure out if two graphs are isomorphic once they get large enough. How about these two?

graph-isomorphism-1-5-1-6.png

There is no easy way to tell if two graphs are isomorphic. Figures 1.3 and 1.4 are isomorphic, but look nothing alike. Counting degrees (the number of edges each vertex has) does not work either — figures 1.5 and 1.6 both have degrees 2, 2, 3, 3, yet they are not isomorphic. Why is that? In order to prove that two graphs are not isomorphic, you need to show that no isomorphism could possibly exist between them. This is easy if their vertex degrees do not match, but as we just saw, it is possible for vertex degrees to match when no isomorphism is possible.

For more about graph isomorphism, I recommend this blog post by Jeremy Kun.

Group Isomorphism

A group is a tuple (S, P) where S is a set of things and P is a function P(a, b) that operates on things a and b in S and returns another element of S. Two groups (G, *) and (H, +) are isomorphic if there is an isomorphism between them. The * and + are just symbols for operations on elements in the groups. They don’t necessarily mean multiplication or addition, though it is less confusing to choose a symbol that makes sense for whatever you’re doing.

The rules for group isomorphisms look similar to those for graph isomorphisms:

  1. There is a bijection P that maps elements in G to elements in H
  2. P preserves group operations: P(a * b) = P(a) + P(b)

There isn’t a great way I know of to visualize this, so we’ll work through a few examples to understand what’s going on.

Degrees and Radians

Let’s show that angles in degrees and angles in radians are isomorphic under addition. Define the function P(d) = d * PI / 180. This function converts angles in degrees to angles in radians. We’re going to show that it’s an isomorphism. First we need to show that it is a bijection, using proofs by contradiction. I’m only going to write this part of the proof out for this example, because it is quite lengthy and fairly obvious.

Show That P Is Injective

  1. Assume that P is not injective. That implies that at least one of the following is true:
    1. P is not defined for some input. This is not possible since P accepts all real numbers as input, and never divides by zero.
    2. P maps two inputs to the same output. Assume that this is possible. This implies that there exists some x and y where x != y such that P(x) = P(y). Substitute to get x * PI / 180 = y * PI / 180. Simplify to get x = y. This is a contradiction, so P cannot map two inputs to the same output.
  2. Since neither of the above is true, P must be injective.

Show That P Is Surjective

Assume that P is not surjective. That implies that there is a real number y such that there is no real number x where P(x) = y. Substitute to get x * PI / 180 = y. Rearrange to get x = y * 180 / PI. We now have a function C(y) that always returns a real number x satisfying P(x) = y, for any y. This contradicts our assumption, so P must be surjective.

Show That P Preserves Group Operations

We need to show that P(a + b) = P(a) + P(b), where + stands for addition.

P(a + b) = (a + b) * PI / 180
         = (a * PI / 180) + (b * PI / 180)  // via the distributive property
         = P(a) + P(b)  // substitute back into definition of P

Phew, that’s it! We’ve shown that angles in degrees and angles in radians are isomorphic under addition.

Odds And Evens

Are odd integers isomorphic to even integers under addition? Let’s give it a shot with this function: P(x) = x + 1. As you can see, given an odd number, it returns an even number. To show that it is an isomorphism, we first need to show that it is a bijection. You can do this now, or just take my word for it so we can move on.

Let’s look at whether this function preserves addition.

P(1 + 3) = P(4) = 5
P(1) + P(3) = 2 + 4 = 6

Oops, we’ve found a counterexample that violates rule number two. Can you think of a different way of mapping the odds to the evens that will preserve addition? Is there an operator besides addition that would work with the same mapping function?
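
While you think about that, here is a small sketch for spot-checking rule number two against any candidate mapping. It samples pairs rather than proving anything, and the tolerance is there because floating point addition rounds:

// Spot-check P(a + b) = P(a) + P(b) on a few sample pairs
function preservesAddition (P, pairs) {
  return pairs.every(function (pair) {
    var a = pair[0]
    var b = pair[1]
    // Tolerance covers floating point rounding
    return Math.abs(P(a + b) - (P(a) + P(b))) < 1e-12
  })
}

var deg2rad = function (d) { return d * Math.PI / 180 }
var oddToEven = function (x) { return x + 1 }

console.log(preservesAddition(deg2rad, [[0, 90], [45, 45], [10, 20]])) // true
console.log(preservesAddition(oddToEven, [[1, 3]])) // false: 5 vs 6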

Gödel Numbering

This is a very special sort of isomorphism that links formal systems with number theory. Formal systems are systems of thought. They have an alphabet, a grammar, a set of axioms, and a set of inference rules.

Formal systems let you encode statements like “5 is prime” in strings like ~Ea:Eb:(SSa*SSb)=SSSSS0, and prove the statement by using strict rules to manipulate strings. We’re going to start with a simple formal system and work our way up.

The EASY System

The EASY system is so named because it has four symbols in its alphabet: E, A, S, and Y. The grammar for this system says that “All Es appear before As, all As appear before Ss, and all Ss appear before Ys”. In other words, they appear in the order EASY.

A formula is a string made with the formal system’s alphabet. These are formulas: EASS, SSSS, EY. These are not: XKCD, KISS, JUMP.

A formula is well-formed if it obeys the formal system’s grammar. ESYY and AAA are well-formed formulas. YES and EYES are not.

Let’s introduce some axioms and inference rules. Axioms are well-formed formulas that form the basis of all other theorems of the system. The EASY system has one axiom, A. It also has the following inference rules:

  1. A => AYY
  2. AYYY => S
  3. AS => E
  4. SY => EA

Using these inference rules, we can derive more theorems from the axioms:

A
AYY (Rule 1)
AYYYY (Rule 1)
AYYYYYY (Rule 1)
SYYY (Rule 2)
EAYY (Rule 4)
EAYYYY (Rule 1)
ESY (Rule 2)

All of these formulas are theorems because they can be derived from the formal system’s axioms using its inference rules. Note that formulas might not be well-formed, and well-formed formulas might not be theorems. This distinction is important! EY is a well-formed formula of the EASY system, but is it a theorem? Try deriving it from the axiom using the inference rules. How can you be sure that it isn’t a theorem?
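
Short of a real proof, you can at least gather evidence mechanically. Here is a sketch that applies the four rules everywhere they match, breadth-first, keeping every theorem up to a length cap; EY never shows up:

var rules = [
  ['A', 'AYY'],
  ['AYYY', 'S'],
  ['AS', 'E'],
  ['SY', 'EA']
]

// Apply every rule at every position where it matches
function step (s) {
  var out = []
  rules.forEach(function (rule) {
    var from = rule[0]
    var to = rule[1]
    for (var i = 0; (i = s.indexOf(from, i)) !== -1; i++) {
      out.push(s.slice(0, i) + to + s.slice(i + from.length))
    }
  })
  return out
}

// Breadth-first enumeration of theorems up to maxLen characters
function theorems (axiom, maxLen) {
  var seen = new Set([axiom])
  var queue = [axiom]
  while (queue.length > 0) {
    step(queue.shift()).forEach(function (t) {
      if (t.length <= maxLen && !seen.has(t)) {
        seen.add(t)
        queue.push(t)
      }
    })
  }
  return seen
}

console.log(theorems('A', 12).has('EY')) // false

This is evidence, not proof: a derivation could grow past the cap and shrink back down. To be sure, you have to step outside the system and find a property that the axiom has, that every rule preserves, and that EY lacks. Watching what each rule does to the combined count of As and Ss is a good place to start.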

The question “is EY a theorem of the EASY system” can be generalized to “is X a theorem of the Y system”. As Y gets more powerful, the questions we can ask get more interesting. The EASY system does not appear to do anything remotely useful because it is not isomorphic to something else we already know about. This is important: meaning is something that humans attach to formal systems. Formal systems are rather dumb systems of thought based on mechanically manipulating strings using pedantic rules. Let’s look at a formal system that we can attach some meaning to.

Arithmetic

Before we can express statements like “5 is prime” we need to be able to do some basic math. Let’s create a new formal system, the POES system.

The alphabet of our new system is S, 0, p, e.

The grammar of this system is {x}p{y}e{z}. The placeholders that look like {x} represent any string of zero or more Ss followed by a single 0.

The axiom of this system is 0p0e0.

The rules of our system are:

1. {x}p{y}e{z}  => S{x}p{y}eS{z}

2. {x}p{y}e{z}  => {x}pS{y}eS{z}

Now let’s derive some theorems from the axiom:

0p0e0
S0p0eS0 (Rule 1)
S0pS0eSS0 (Rule 2)

This formal system looks nothing like the EASY system, but it works the exact same way. You have a set of axioms, and you can derive theorems from those axioms by applying some rules.

Although it is based on the same principles as the EASY system, the POES system is more interesting because it happens to be isomorphic to a tiny part of number theory. Here’s how you can interpret theorems of this system:

  • 0 means zero
  • S means “successor of” (so SSSSS0 means five)
  • p means “plus”
  • e means “equals”

An example:

0p0e0      =>  0+0=0
S0p0eS0    =>  1+0=1
S0pS0eSS0  =>  1+1=2

If you were given a string S0p0e0, would you be able to tell if it was a theorem of the system? It fits the grammar of the system, but in order to tell if it is a theorem, we need to show that we can derive it from the axioms. Now that you know that this system models addition, you might be tempted to say “no way, because 1+0 does not equal 0”. However, that would be using human intuition to work outside the system. You can’t jump to conclusions in a formal system; you have to mechanically prove that a string is a theorem by starting from an axiom and applying rules until you get the same string.

Is it possible to show that a string is not a theorem of a formal system?
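
For the POES system, it is. Both rules strictly grow a formula (each application adds two characters), so a bounded search settles the question. Here is a sketch, with the rules written as string rewrites:

// Apply both POES rules to a formula like S0p0eS0
function applyRules (s) {
  var parts = s.split(/[pe]/)
  var x = parts[0]
  var y = parts[1]
  var z = parts[2]
  return [
    'S' + x + 'p' + y + 'eS' + z, // Rule 1
    x + 'pS' + y + 'eS' + z       // Rule 2
  ]
}

// Every rule grows the string, so we can stop expanding a derivation
// once it gets longer than the target
function isTheorem (target) {
  var queue = ['0p0e0']
  while (queue.length > 0) {
    var s = queue.shift()
    if (s === target) return true
    applyRules(s).forEach(function (t) {
      if (t.length <= target.length) queue.push(t)
    })
  }
  return false
}

console.log(isTheorem('S0pS0eSS0')) // true:  1+1=2
console.log(isTheorem('S0p0e0'))   // false: 1+0=0

The same trick does not work on the EASY system, whose rules can shrink a string; that is why “how can you be sure?” was a harder question back there.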

Logic

In order to express a statement like “five is prime”, we need a more powerful formal system. One such system is Typographical Number Theory, or TNT. Explaining the entire system is beyond the scope of this post, but I’ll explain how to understand the example string from the beginning of this section.

~Ea:Eb:(SSa*SSb)=SSSSS0

From the left: ~ means “not”. E{x}: means “there exists {x} such that”. a and b are simply variables in the statement. * means “multiply”, and the parentheses () are just for grouping.

Put together, this string says “There do not exist a and b such that (2 + a) * (2 + b) = 5”. In other words, five is prime.

It bears repeating that strings by themselves have no meaning. We find meaning in strings through isomorphism. The meaning in TNT comes from propositional calculus and number theory. However, you do not need to know any of that to work with this formal system.

It takes practice in order to turn statements like “16 is a power of 2” into TNT, but it can be done. The more complex the thought, the larger the resulting string. For example, you could try to prove Fermat’s Last Theorem by turning it into a gigantic string of TNT, and trying to derive it from the axioms.

Here is Ray Toal’s summary of TNT if you would like further reading.

Gödel Numbering

We’re finally ready to understand a very special type of isomorphism. Earlier we saw how otherwise meaningless strings like S0pS0eSS0 can acquire meaning through isomorphism. Now we’ll see how these strings can acquire an additional layer of meaning through a second isomorphism.

Computers store strings by encoding them into numbers. One popular encoding is ASCII. In ASCII, the lowercase letter a is encoded as the number 97. Assuming that we stay within the ASCII character set, we can encode any string into a number. Here is how the string S0pS0eSS0 can be encoded as the number 83,048,112,083,048,101,083,083,048:

S   0   p   S   0   e   S   S   0
083 048 112 083 048 101 083 083 048

Recall that we derived this string from an axiom of the POES system using inference rules. I’ve reproduced the proof below, this time with the ASCII encoding of each step.

0p0e0       048,112,048,101,048
S0p0eS0     083,048,112,048,101,083,048
S0pS0eSS0   083,048,112,083,048,101,083,083,048

Hey, this looks like the beginning of an isomorphism! We have two sets of things, and a way of mechanically converting between them. Specifically, we have a bijection between strings of a formal system and natural numbers.
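
Here is that bijection as a sketch. BigInt keeps the arithmetic honest, since these numbers quickly outgrow ordinary JavaScript numbers:

// Concatenate the zero-padded three digit ASCII code of each character
function godelNumber (s) {
  var digits = s.split('').map(function (c) {
    return String(c.charCodeAt(0)).padStart(3, '0')
  }).join('')
  return BigInt(digits)
}

// Pad back out to a multiple of three digits, then decode each triple
function decode (n) {
  var digits = n.toString()
  while (digits.length % 3 !== 0) digits = '0' + digits
  var out = ''
  for (var i = 0; i < digits.length; i += 3) {
    out += String.fromCharCode(Number(digits.slice(i, i + 3)))
  }
  return out
}

console.log(godelNumber('S0pS0eSS0')) // 83048112083048101083083048n
console.log(decode(godelNumber('S0pS0eSS0'))) // S0pS0eSS0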

To complete the isomorphism, we also need to convert our inference rules so that they work on numbers.

Gödel’s key insight was that each step of a formal system is isomorphic to some sequence of arithmetic operations. You can multiply by ten to shift a string to the left, or divide by ten to shift a string to the right. You can add to insert new characters, and subtract to remove old ones.

For example, one of the inference rules of the EASY system was A => AYY. If we encode A as 1 and Y as 2, then this rule could be expressed as “If x is equal to 1, and x is an EASY number, then x*100 + 2*10 + 2 is an EASY number”.

This strategy works even for more complex rules. Recall that the rules of the POES system are:

1. {x}p{y}e{z}  => S{x}p{y}eS{z}

2. {x}p{y}e{z}  => {x}pS{y}eS{z}

The trick to translating more complex inference rules is to use multiple variables and constraints. For example, rule two can be translated into this:

  1. If j*10^(m+9) + 112*10^(m+6) + p*10^(m+3) + 101*10^m + n is a POES number where the following is true:
    • n < 10^m
    • p < 10^3
  2. then j*10^(m+15) + 112*10^(m+12) + 83 * 10^(m+9) + p*10^(m+6) + 101*10^(m+3) + 83*10^m + n is a POES number

Let’s see this rule in action!

Input string and number:
  S0p0eS0  =>   83,048,112,048,101,083,048

Fit the components of the input formula:
                         83,048 => n
                    101,000,000 => 101*10^m where m=6
                 48,000,000,000 => p*10^(m+3) where p=48
            112,000,000,000,000 => 112*10^(m+6)
     83,048,000,000,000,000,000 => j*10^(m+9) where j=83,048

Verify that the conditions are true:
  n < 10^m   =>  83,048 < 10^6  =>  true
  p < 10^3   =>  48 < 10^3      =>  true

Create the components of the output formula:


             83,048,000,000,000,000,000,000,000 => j*10^(m+15)
                    112,000,000,000,000,000,000 => 112*10^(m+12)
                         83,000,000,000,000,000 => 83*10^(m+9)
                             48,000,000,000,000 => p*10^(m+6)
                                101,000,000,000 => 101*10^(m+3)
                                     83,000,000 => 83*10^m
                                         83,048 => n


Sum it all up to get:
   83,048,112,083,048,101,083,083,048  =>  S0pS0eSS0
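
Here is that rule as a function on BigInts, a sketch that never looks at the string. Like the rule above, it assumes {y} is a single character:

// Apply POES rule two entirely with arithmetic
function applyRule2 (N) {
  // Search for the split point: m is the number of digits in {z}
  for (var m = 1n; 10n ** m < N; m++) {
    var n = N % 10n ** m                      // {z}
    var e = (N / 10n ** m) % 1000n            // should be 101 ("e")
    var p = (N / 10n ** (m + 3n)) % 1000n     // {y}, a single character
    var plus = (N / 10n ** (m + 6n)) % 1000n  // should be 112 ("p")
    var j = N / 10n ** (m + 9n)               // {x}
    if (e === 101n && plus === 112n && j > 0n) {
      return j * 10n ** (m + 15n) + 112n * 10n ** (m + 12n) +
             83n * 10n ** (m + 9n) + p * 10n ** (m + 6n) +
             101n * 10n ** (m + 3n) + 83n * 10n ** m + n
    }
  }
  return null // the rule does not apply
}

console.log(applyRule2(83048112048101083048n)) // 83048112083048101083083048n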

It takes some practice, but the key is this: we can take any formal system and talk about it using the natural numbers and simple arithmetic — even TNT! What a wonderful new type of isomorphism. What can we do with it?

A Strange Loop

Can TNT make statements about itself? For example, can we create a string of TNT that says “S is a theorem of TNT”?

Using Gödel numbering, we can! To make things easier to follow, we’ll define two operations, tnt2nt and nt2tnt. They stand for “TNT to Number Theory” and vice versa.

The operation tnt2nt represents the tedious work of converting from formal systems to number theory. For example, tnt2nt('S0p0eS0') would spit out the number 83,048,112,048,101,083,048, and tnt2nt(Rules) would spit out the arithmetic equivalents of the inference rules. Remember that axioms are strings, so you can also pass them through tnt2nt().

The operation nt2tnt represents the far more difficult task of converting statements of number theory into TNT. For example, nt2tnt('Five is a prime number') would produce ~Ea:Eb:(SSa*SSb)=SSSSS0. nt2tnt('Four is an even number') would produce Ea:SS0*a=SSSS0. We can even use nt2tnt() to turn “There exist numbers a, b, and c where c < 10^b such that a = 101*10^(b+3) + 99*10^b + c” into a massive string of TNT. Note the similarity of this string to the previous example!

With these two operations defined, we now have a way to create TNT strings that talk about TNT:

  1. S is a theorem of TNT
  2. The string S can be derived from axioms A following the rules R
  3. The number tnt2nt(S) can be produced from the numbers tnt2nt(A) with the arithmetic operations tnt2nt(R)
  4. nt2tnt('The number tnt2nt(S) can be produced from the numbers tnt2nt(A) with the arithmetic operations tnt2nt(R)')

This mind-blowing result is what happens when you have not one, but two isomorphisms working together. We’ve created a string of TNT that talks about number theory, and through number theory, talks about other TNT strings.

In Conclusion

To summarize: an isomorphism is a bijection that preserves relationships.

  • When graphs are isomorphic to each other, the bijection is between vertices, and adjacency is preserved.
  • When groups are isomorphic to each other, the bijection is between items, and group operations are preserved.
  • The POES system is isomorphic to a tiny piece of number theory, where the bijection is between strings like 0pS0eS0 and formulas like a+b=c, and the properties of addition are preserved.
  • Formal systems and number theory are isomorphic to each other, where the bijection is between strings and natural numbers, and inference rules are preserved.

Make Reversible Decisions

I recently learnt about the “Rule of Three” from this tweet by Jacob. It says that code should be copied once, and extracted into a procedure only the third time it has to be used. After further thought, I realized that this simple programming rule is a domain-specific manifestation of a more general decision-making guideline.

Let’s go back to computing for a brief moment, since many of you reading this are programmers. Abstracting too early is more dangerous than it seems. When abstraction is done too early, it increases the complexity of the product before a complete understanding of the problem has been obtained. As the decision was made with insufficient information, it is more likely to be wrong than right. As more abstractions are layered on top of these bad decisions, it becomes more and more difficult to backtrack as time goes on. Furthermore, as the size of a team grows, it becomes difficult to switch out core infrastructure without stalling the entire team’s progress.

This idea of delaying abstraction can be generalized to all decision making.

In general, decisions that are difficult to reverse should be made as late as possible. Decisions that are easily reversible are great because they are a thinly veiled version of “heads I win, tails you lose”. When you guess correctly, you win, and when you guess wrongly, you get cheap information that can inform your next decision. For example, startups frequently make easily reversible decisions as part of their search for a repeatable business model.

A Framework For Decision Making

  1. If you don’t have to make the decision now, wait
  2. If you have to make a decision now, do something you can undo

The rest of this blog post takes this framework for decision making and applies it to decisions that get progressively more important. I’ll begin with a harmless code quality issue developers can relate to, and end with infrastructure decisions that affect the competitiveness of a business. If you’re not a programmer, I’d suggest scrolling down to “Decision Making For Managers” and starting from there.

Decision Making For Programmers

Functions As Comments

Functions are for defining reusable logic. With very few exceptions, if you are not going to reuse something, don’t make it a function.

Some of you might be wondering why anyone would create a function that only gets called in one place. As a TA for an introductory C++ course, I saw this happen quite often, and the most common reason I heard was that teachers required code to look well organized. Novices then pick up this bad habit of “organizing” code and continue making pointless methods in their own projects and work.

That is how we end up with code that looks like this:

class Car {

public:

Car () {  
  Person owner = createOwner();
  Chassis chassis = createChassis();
  Key key = createKey(chassis);
}

private:

Person createOwner () {
  return Person("Jimbo");
}

Chassis createChassis () {
  return Chassis("Honda Accord");
}

Key createKey (Chassis chassis) {
  return Key(chassis);
}

};

These “do the thing” methods don’t increase the robustness or maintainability of the code, they just slow down the person reading it. This is not abstraction, this is using function names to annotate blocks of code that are usually so simple they don’t need any annotation to begin with.

This is a fairly benign example of knee-jerk decision making. Abstraction for the sake of abstraction is, at best, annoying. At worst, it’s a source of bugs.

Recommendation: Unless you get paid per line of code you write, if your function is private and only called in one place, you probably should just use a comment. The “rule of three” is a good guideline here.

Frameworks Are Glaciers

Asynchrony is a powerful way to improve the performance of code. While async APIs are slower by nature, total throughput of the program may be improved because of reduced blocking.

Asynchrony, like multi-threading, is not a silver bullet. It has the potential to introduce insidious bugs, and one that I have been dealing with comes from attempting to speed up a synchronous API by doing some of its work asynchronously.

I’ve been working with a view system that has gotten just about everything right. It’s modular, has just the right amount of convention without being draconian, and keeps its API lean enough that I hardly ever have to pull up the documentation to do something.

Many months back a small change was made to improve the speed of rendering views: elements were now appended to the DOM in batches using requestAnimationFrame to stop the browser from locking up when rendering expensive components like long list views.

This addition has caused quite a headache. Render tasks used to have a well defined beginning and end. Now, some components in the view hierarchy may not actually be on-screen when their render method returns. This is an issue in an event-driven system where renders are being triggered all the time. Without knowing whether a view is actually done rendering, you get weirdness like duplicate components being appended when two render events are triggered in quick succession. There are also more sinister problems, like memory leaks that result from nodes being detached from the DOM while hot code still retains references to them.
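
Here is a stripped-down illustration of the failure mode (a sketch, not our actual view code):

// A render that defers its DOM append to the next animation frame
var view = { container: document.body }

function render () {
  var el = document.createElement('div')
  el.textContent = 'list item'
  requestAnimationFrame(function () {
    // This runs *after* render has already returned
    view.container.appendChild(el)
  })
  return el
}

// Two render events fire in quick succession. Neither caller can tell
// that the first append is still pending, so both elements land in the
// DOM and the list item is duplicated.
render()
render()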

Later on, we realized that appending wasn’t even causing the slowdown in the first place. The problem was in parsing large amounts of template code repeatedly, and creating too much DOM at a time. Unfortunately, since we made a bad decision at the framework level, we now had an ecosystem of modules depending on the flawed API, often employing hacks to smooth over the problems bubbling up from the framework.

In hindsight, we should have resisted the pressure to find a quick fix for our users, and taken the time to properly understand the performance issue before we modified a critical piece of our infrastructure. Nevertheless, the lean API will make reverting the bad behavior quite easy, and the bulk of the work will be updating dependents that relied on the misbehaving methods.

Recommendations: Be extra careful when tinkering with lower levels of abstraction, and even more so if there is an ecosystem on top of it. Be as restrictive as possible with your API.

Decision Making For Managers

Organize Your Data

Poor abstraction doesn’t just happen in code. It’s even worse when it happens with data.

Every part of a business is affected by how your data is modeled. Poorly modeled data puts constraints on user interface design, bogs down developers with technical debt, and cripples the ability of the business to take advantage of new opportunities.

Document-oriented (NoSQL) databases have been growing in popularity lately, and for good reason: relational databases are suboptimal for managing semi-structured data. But while there are good use cases for a document database, they are also being used in situations where a relational database is a much better fit.

The highly abstracted nature of document databases creates an illusion of freedom and speed, and the early stages of a software project will speed by without talk of schemas and migrations. However, from my experience the vast majority of data is relational, and document databases struggle to match the performance and simplicity of an RDBMS when burdened with highly relational data.

Using a document database from the start is therefore a prime example of making a difficult to reverse decision too early. Before core questions like “is our data relational?” have been answered, the developer has already climbed so high up the tower of abstraction that they will encounter significant resistance trying to return to a technology that better fits their needs.

Concretely, the permissiveness of document storage leads to the team accumulating massive amounts of tacit knowledge about their data structure that cannot be easily codified as a schema. This makes moving to an RDBMS later extremely expensive, since it requires well defined models and relationships. Furthermore, being able to work directly with plain objects leads to poorly defined or nonexistent interfaces between application code and the database. Since there is no clean interface between the application and the database, a prerequisite for changing the database is defining such an interface and performing a major refactor on the application. For these reasons, starting with a document database can lead to technical debt and switching costs snowballing out of control.

In contrast, it is far simpler to move from a restrictive data model to a more permissive one. Simply turn each row of an SQL database into JSON and throw it into document storage.
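
That direction really is close to mechanical. A sketch, where the _id and type conventions are just one way to do it:

// Every relational row is already a flat document
function rowToDocument (table, row) {
  return Object.assign({ _id: table + ':' + row.id, type: table }, row)
}

console.log(rowToDocument('users', { id: 7, name: 'Jimbo' }))
// { _id: 'users:7', type: 'users', id: 7, name: 'Jimbo' }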

Recommendation: Unless you are modeling data that is very well understood and clearly non-relational, resist the temptation to go with a document database, and start with an RDBMS. It is the reversible choice.

Defensive Outsourcing

With the rise of SaaS and module ecosystems it is easier than ever to outsource everything from payroll to server operations. Outsourcing is trading money for time, so it makes sense if you expect to see a net gain in productivity.

Outsourcing comes with its own problems. The goals of your partner are seldom aligned with yours. For example, while you would prefer minimizing switching costs, it is to your partner’s advantage to maximize them. Sometimes, these switching costs are not apparent until you try to break up with a provider.

We were using Cloudant as our database service for a while. To our pleasant surprise, we added a major feature in a few hours by building on top of Cloudant’s integrated search. This came back to haunt us when we grew tired of the terrible performance and excessive downtime.

We went back to self-hosting our database, but this meant figuring out how to implement the search functionality ourselves. Since our app is quite portable, we did not encounter many issues moving it to a different host.

Outsourcing also happens in code. Module ecosystems like npm make it easy for anyone to publish reusable bits of code. It is tempting to rely on someone else’s code, but building on top of someone else’s work without doing your due diligence can come back to haunt you.

When we adopted prova last year as our test runner, we were won over by its beautiful reporting interface. We started building on top of it despite its immaturity, and all was good for a few months. Later we realized that prova had been ignoring and silencing uncaught exceptions. We even caught prova silently skipping entire test cases because of a syntax error! With thousands of assertions already written, it was demoralizing to find out that we couldn’t trust our test runner.

The good news is that prova uses the same unopinionated API that tape uses. Since our test suite is portable, we will be able to switch test frameworks quite painlessly. This wouldn’t have been possible if we started with a very opinionated solution.

Recommendation: Have an exit strategy when building on top of something you’ve outsourced. Keep things portable, so you can undo any bad infrastructure decisions.

Wisdom Is More Valuable Than Reputation

It is not uncommon to see a new technology gain massive adoption. Regardless of the credentials of the body pushing the technology, the risk of adoption is inversely proportional to the age of the technology. This is the same reason why refactors are usually a bad idea: old solutions often look ugly because they have been painstakingly patched to handle edge cases.

In a previous example, I talked about how my experience with Cloudant has been less than ideal. I should mention that Cloudant is an IBM company, a huge name in IT enterprise, and one of the reasons why we trusted them. Another tech behemoth is Microsoft, whose new Windows Azure service has been unreliable enough that companies like K2 are afraid of committing fully. At the same time, K2 has demonstrated some smart decision making by keeping some of its capacity on AWS. This makes the move to Windows Azure an easily reversible decision, should they decide that it fails to meet their needs.

For another example of early adoption gone wrong, here is Khan Academy’s experience as an early adopter of Swift, a promising new technology created by brilliant people under the direction of the well respected Chris Lattner of LLVM fame. 20,000 lines of code later, here is what Andy Matuschak has to say:

In terms of problems, the tooling is not there, and that’s not a dig on the engineers. I happen to think, actually, that the Swift engineering team is one of Apple’s finest engineering teams, but it’s not done yet. It’s full of bugs, it crashes too often. It generates faulty code occasionally. Most critically for us, it’s just really slow. Really, really slow.

Andy Matuschak, A Generation of Lifelong Learners

Since new technology is so unpredictable, it is wise to use a mature solution today, and consider migrating in the future.

Recommendation: Reputation is not a substitute for maturity. Be cautious when adopting new technology, especially if your core competencies as a business depend on them.

Conclusion

Leaning Tower.jpeg

I’ve chosen these examples to show how making reversible decisions can lead to competitive advantage. Like any framework, there are exceptions. Risk usually comes with a commensurate reward; those who made their riches in the early days of the Internet certainly aren’t regretting being early adopters.

Nevertheless, the next time you are faced with a decision, consider the reversibility of your choices, and see if it leads you to a better conclusion. I certainly could have benefited from this thinking a year ago.


Thank you Oscar, Jacob, Matt, Chris, and Dan for reviewing my drafts.

A Faster Horse: The Future Of Web Development

Let’s talk about the rise and fall of technologies. There’s a neat graphical representation of the life-cycle of a technology called an S-Curve:

s-curve.jpeg

JavaScript is enjoying tremendous success right now, so I’m not going to bore you with the metrics. Exponential growth is fantastic, and as someone who uses JavaScript at almost every layer of the tech stack, I couldn’t be happier for my community. It’s certainly made my job as a web developer much easier.

Now, what about the future of JavaScript are you most excited about? I can think of a few off the top of my head:

  • New language features: ES6 generators, template strings, SIMD, …
  • Package manager/module ecosystem upgrades: parameterized scripts, private repositories, …
  • Framework updates: Angular 2, koa, …

Now, these are all quite exciting. New language features let us write more expressive code, and do faster computations. New frameworks help us write more robust applications. npm is amazing, and it is certainly the innovation that made node so successful, so any improvement to it is just icing on the cake.

However, these are all incremental improvements. We’re still sliding up the same S-Curve, and we’re going to reach maturity eventually because all technologies experience diminishing returns. You’re optimistic to a fault if you think that HTML+CSS+JavaScript is the holy grail of web development, and that we can’t do better. As much as we love our tools, we have to accept that they are far from perfect.

multiple-s-curves.jpeg

S-Curves don’t exist in isolation. Something else is on the horizon, and it’s not going to be an incremental improvement. This is why I think it was fantastic that TJ made a high-profile jump to a different S-Curve. Go has its own set of problems, but that’s not the point; he recognized the limits of the tools at hand, and tried something different. It’s far easier to pour more effort into the tools you are already familiar with than it is to try something completely different.

Pick any technology out there and you’ll find someone who can wax poetic about how it’s better than what you’re using right now. It doesn’t matter if you think they’re right or wrong; listen like you’re wrong, because eventually, you will be wrong. Eventually, you will be the cranky administrator who still believes that JSP is the holy grail of web development. Eventually, you will have to do something insane like write a new VM for the aging technology you’re locked into. Eventually, you will still be concatenating strings like this when everyone else is using the + operator.

What is next-generation web development going to look like? I don’t know, but I do have a small wish list:

  • Lower-level access to rendering: Lose the HTML/CSS, and go straight to the canvas
  • Multithreading support: We’re close to the limit of single-core performance, and even cell phones have multiple cores now
  • Lower-level access to memory: This is a complement to multithreading, and it’s nice to not have to rely on garbage collection if you know what you’re doing
  • Static verification: This should be an option for applications where correctness is important
  • Better error handling: This is a real pain in node right now

What do you want to see next?

web-dev-s-curves.jpeg

Requiring Modules in CouchDB

Thanks to node.js, using the same code on the server and client is easier than ever. What about running application code in the database? If you have a function that is expensive to compute, but always returns the same result for the same input, then you could get a significant speedup by having your database cache the results of the computation ahead of time. This is especially true if you’re using CouchDB, which uses incremental MapReduce to make view computation very efficient.

Our realtime app at Getable is backed by CouchBase, a derivative of CouchDB. CouchBase and CouchDB share the same replication protocol, but differ in several important ways. One such difference is that CouchBase does not support require, while CouchDB does. One major caveat with CouchDB’s require is that you have to push the modules you want up as design documents. Since forcing humans to resolve dependency trees is outlawed in the Geneva Conventions, we need a better way of doing this.

Enter Browserify’s standalone option. We can use it to create a string of code that is easily prepended to the map function in our views. Browserify’s standalone option takes a string, which is the name of the exported module. Critically, the export will be added to the global object if require is not available, which is the case in CouchBase views. Therefore, if you create a browserify bundle with the {standalone: 'mymodule'} option and prepend that string to your map function, you will now have global.mymodule available for use. The one gotcha is that the global object does not exist in CouchBase views either, so it must be initialized ahead of the bundle.

Create an entry script that just exports the module you want:

// entry.js
module.exports = require('mymodule')

Then bundle it with the standalone option and initialize global ahead of time:

// bundler.js
var browserify = require('browserify')

var bundle = browserify({standalone: 'mymodule'})
bundle.add('entry.js')
bundle.bundle(function (err, src) {
  if (err) throw err
  var prelude = 'var global={};' + src.toString()
})

Now, if you have a map function, you can just insert this prelude right after the function header:

function mapFunction (doc) {  
  // prelude goes here!
  emit(doc._id, global.mymodule(doc))
}

As an implementation note, we have a small script that uses Function.toString() to manage our design documents. It turns our map functions into strings, searches for the use of application logic, and browserifies the appropriate standalone bundle for each function. It’s less prone to failure than manual updates, and makes the experience just a bit more magical.
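
A simplified sketch of that idea (buildView is a made-up helper name, and the real script also figures out which bundle each function needs):

// Stringify a map function and splice the prelude in right after the
// function header
function buildView (mapFunction, prelude) {
  var src = mapFunction.toString()
  return { map: src.replace('{', '{\n' + prelude) }
}

function mapFunction (doc) {
  emit(doc._id, global.mymodule(doc))
}

// prelude is the 'var global={};' + bundle string from bundler.js above
var view = buildView(mapFunction, prelude)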


The “One Year Later” update: we’ve seen vastly improved performance by pushing these standalone bundles up under the lib key.

Jankproof Javascript

JavaScript is getting faster all the time, but things like long lists of complex cells are always going to be expensive to compute. To solve this, I wrote the unjank module last week. It helps you do expensive things in JavaScript without causing the user experience to suffer.

I’m happy to report that after a week of production use, it’s clear that this technique is a significant improvement over what we were doing before for several reasons.

Device Agnostic

It doesn’t matter how fast or slow the task is; unjank benchmarks it on-the-fly and runs it as quickly as the device will allow. This means that your application is jank-free on all devices without you having to come up with magic numbers that determine how quickly a task should run.
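
unjank’s real API aside, the core of the idea looks something like this sketch: time the work as it happens, and size each chunk so it fits in one frame. The 10ms budget, rows, and renderRow below are placeholders:

// Sketch of the idea, not unjank's actual API: run one chunk of work
// per animation frame, and let a running benchmark pick the chunk size
function runChunked (items, doOne, targetMs) {
  var i = 0
  var perChunk = 1 // start small; the benchmark grows this on fast devices
  var aborted = false

  function frame () {
    if (aborted || i >= items.length) return
    var start = performance.now()
    for (var end = Math.min(i + perChunk, items.length); i < end; i++) {
      doOne(items[i])
    }
    var elapsed = Math.max(performance.now() - start, 0.1)
    // Re-estimate how many items fit into the per-frame time budget
    perChunk = Math.max(1, Math.floor(perChunk * targetMs / elapsed))
    requestAnimationFrame(frame)
  }

  requestAnimationFrame(frame)
  return { abort: function () { aborted = true } }
}

var task = runChunked(rows, renderRow, 10) // aim for ~10ms of work per frame
task.abort() // e.g. when the tab holding the list becomes inactive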

Smooth Scrolling

An unexpected discovery was that kinetic scrolling in WebKit works very well even if the page is getting longer during the scroll. This means that if your user is scrolling down a long list as it is being rendered with unjank, they will not perceive it as slow at all. WebKit preserves the momentum of the scroll and keeps going as the page gets longer.

Aborting Tasks

The ability to abort an ongoing task is critical because most tasks are initiated by a user action. For example, if you have two tabs that have a long list each, quickly switching between the tabs will eventually crash the application unless the rendering of the lists is aborted when the tab becomes inactive.

Conclusion

I’m going to be using unjank a lot more going forward, especially where lists are involved. I pulled up the Getable app to experience it pre-unjank, and it has that signature lagginess associated with web apps, despite our use of requestAnimationFrame. With unjank, our longest lists no longer cause the browser to stutter — a small step out of the uncanny valley of hybrid apps.