The Nature of Computing Innovation
What constitutes our technologies, and how do they evolve?
January 21, 2025
In his book The Nature of Technology, W. Brian Arthur describes technology as combinations of other technologies. This description poses a couple of questions for computing.
First, take a software program like Git. To figure out what technologies Git is made of, perhaps all we have to do is inspect its dependencies? Surprisingly, Git has few real dependencies.¹
Second, note how Arthur’s description of technology is recursive — where does it end? His answer is in elemental technologies. These are technologies which deliver an effect based on a natural phenomenon:
That certain objects — pendulums or quartz crystals — oscillate at a steady given frequency is a phenomenon. Using this phenomenon for time keeping constitutes a principle, and yields from this a clock.
— W. Brian Arthur, The Nature of Technology
If we re-examine Git not by its software dependencies, but by the algorithms and data structures in its source, we get closer.² Consider Git’s recursive three-way merge feature:
- Recursive three-way merge is built from Diff3 and a DAG of commits (sketched after this list);
- Commit objects associate metadata with a top-level tree;
- Tree objects are directory-like structures which associate filenames with sub-trees and hashes;
- Hashes are SHA-1 hashes of blobs;
- Blobs consist of a header string and DEFLATE-compressed binary data.
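The DAG of commits in the first item is what makes the merge recursive: when two branches have more than one best common ancestor (a criss-cross history), Git merges those ancestors into a virtual one before running Diff3. A minimal sketch of finding the merge bases, assuming a toy dict-of-parents encoding of the commit graph:

```python
def ancestors(parents: dict[str, list[str]], commit: str) -> set[str]:
    """Every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen

def merge_bases(parents: dict[str, list[str]], a: str, b: str) -> set[str]:
    """Common ancestors of a and b that aren't ancestors of another common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}

# A criss-cross history: D and E each merged B and C at different times.
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["B", "C"]}
print(merge_bases(parents, "D", "E"))  # {'B', 'C'}: two bases, so the recursive
                                       # strategy first merges B and C into a
                                       # virtual ancestor for the diff3 step
```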
Diff3 reconciles two edits of a common ancestor based upon their structural similarity. SHA-1 mixes its input through rounds of modular arithmetic and bitwise operations to create unpredictable-but-deterministic values. Lossless compression algorithms like DEFLATE work by exploiting statistical redundancy in data.
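Two of those phenomena are easy to see in code. A minimal sketch of how a blob gets its object id and its on-disk form, using only Python's standard hashlib and zlib modules (the file layout inside .git/objects is omitted):

```python
import hashlib
import zlib

def git_blob(content: bytes) -> tuple[str, bytes]:
    """Build a Git blob: header + content, SHA-1 for the id, DEFLATE for storage."""
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()   # the unpredictable-but-deterministic name
    packed = zlib.compress(store)           # smaller thanks to statistical redundancy
    return oid, packed

oid, packed = git_blob(b"hello world\n")
print(oid)  # should match `git hash-object --stdin` for the same input:
            # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```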
In other words:
A technology is a programming of phenomena to our purposes.
— W. Brian Arthur, The Nature of Technology
A Computing Ontology
We can model Git’s merging ability as an ontology of technologies:
Note the hierarchy: the arrows indicate “using” relationships. Git’s recursive three-way merge uses a version of Diff3 that uses Myers’ difference algorithm.³
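One way to make the "using" relationships concrete is a dictionary of edges. The node names below are illustrative rather than the diagram's exact labels; the point is that a technology's full parts list falls out of a transitive walk:

```python
# "X uses Y" edges, following the hierarchy described above (illustrative labels).
USES = {
    "recursive three-way merge": ["diff3", "commit DAG"],
    "diff3": ["Myers difference algorithm"],
    "commit DAG": ["commit object"],
    "commit object": ["tree object"],
    "tree object": ["SHA-1", "blob"],
    "blob": ["DEFLATE"],
}

def components(tech: str) -> set[str]:
    """Everything `tech` transitively uses."""
    out, stack = set(), list(USES.get(tech, []))
    while stack:
        t = stack.pop()
        if t not in out:
            out.add(t)
            stack.extend(USES.get(t, []))
    return out

print(components("recursive three-way merge"))
# (order varies) {'diff3', 'Myers difference algorithm', 'commit DAG',
#  'commit object', 'tree object', 'SHA-1', 'blob', 'DEFLATE'}
```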
Innovation’s Frontier
Zooming out from Git, imagine that we have modeled all computing technologies as an ontology. The frontier of computing innovation, then, is the set of all unique combinations of nodes in the Computing Ontology.
Git’s speed, correctness guarantees and efficient use of space were essential to its success.⁴ Without the availability of suitable elemental technologies, “Git” would have been beyond the technology frontier.
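As a toy illustration of that definition, the frontier can be read as the combinations nobody has built yet. Both the node list and the "already realized" set below are made up for the example:

```python
from itertools import combinations

nodes = ["diff3", "SHA-1", "DEFLATE", "commit DAG", "schema-aware diff"]
realized = {frozenset({"diff3", "commit DAG"}), frozenset({"SHA-1", "DEFLATE"})}

# Pairwise slice of the frontier: combinations of existing nodes not yet tried.
frontier = [set(pair) for pair in combinations(nodes, 2)
            if frozenset(pair) not in realized]
print(len(frontier), "untried pairwise combinations")  # 8 of the 10 possible pairs
```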
The Computing Ontology can also help us define terms:
- Invention is the discovery of a new phenomenon and an algorithm and/or data structure to harness it. A new node is added.
- Innovation is a novel combination of existing nodes to accomplish a new effect; new child nodes are added.
- Configuration is the software engineering we do every day: creating apps, fixing bugs, and so on. It creates or modifies child nodes, but they’re unlikely to have descendants.
In terms of the number of descendants: invention > innovation > configuration. This is why the dictionary (a hash table of key-value pairs) is an important data structure, but the config struct for any particular app is not.
Disruption
What would it take to displace Git? Discovering a better algorithm that uses the same data structures wouldn’t be enough, as Git itself could adopt it. For example, Git can already use SHA-256 in place of SHA-1 for new repositories (git init --object-format=sha256).
It seems that disruption would require an innovation that creates a fundamentally better way of doing version control, such as using a schema-aware difference algorithm. If no such technologies exist yet, disruption is beyond the frontier.
But that’s not all:
I am struck that innovation emerges when people are faced by problems — particular, well-specified problems.
— W. Brian Arthur, The Nature of Technology
It’s not enough for something to be technically feasible; somebody has to identify the problem or opportunity in order to exploit it.
When Larry McVoy created BitKeeper, he invented the concept of distributed version control. Later, when he stopped providing BitKeeper to the Linux kernel project for free, he created a problem which spawned two new version control systems: Mercurial and Git.⁵
Where might we see disruption soon? The Infrastructure-as-Code paradigm looks vulnerable. Last year, Brian Grant (Kubernetes architect) wrote an excellent series on IaC which identifies fundamental issues with how tools like Terraform work. Several competitors have emerged; a notable example is System Initiative, led by Adam Jacob, creator of Chef. What will happen if they succeed?
We can model the effects of disruption using the Computing Ontology; the disrupted technology and its descendant nodes will be discarded in favor of a new hierarchy of nodes. As the new technology evolves,⁶ its changing components will contribute more nodes to the Ontology. The effects of those changes will ripple out, driving further adaptations and disruptions.
Arthur calls this destructive cascade an “avalanche”. It’s not just Terraform that is at risk, but also its descendants: projects like OpenTofu and services like Scalr. The cloud service providers stationed higher up the mountain will be spared.
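A sketch of that cascade over the same kind of “uses” edges; the nodes are the ones named above, and the edges are my own simplification:

```python
# "X uses Y" edges (simplified): disrupting Y sweeps away everything built on it.
USES = {
    "OpenTofu": ["Terraform"],
    "Scalr": ["Terraform"],
    "Terraform": ["cloud provider APIs"],
}

def avalanche(uses: dict[str, list[str]], disrupted: str) -> set[str]:
    """Every technology that transitively uses the disrupted one."""
    dependents: dict[str, set[str]] = {}
    for user, parts in uses.items():
        for part in parts:
            dependents.setdefault(part, set()).add(user)
    swept, stack = set(), [disrupted]
    while stack:
        for d in dependents.get(stack.pop(), ()):
            if d not in swept:
                swept.add(d)
                stack.append(d)
    return swept

swept = avalanche(USES, "Terraform")
print(swept)                            # {'OpenTofu', 'Scalr'} go with it
print("cloud provider APIs" in swept)   # False: the providers upstream are spared
```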
1. E.g., in Git 2.47.1 many dependencies are optional.
2. Git Internals - Git Objects describes some of these.
3. Git uses the linear-space variant, which occasionally makes for some weird diffs.
4. Linus explained this in an early talk at Google. Git was so much faster than the rest that it changed how users actually did version control.
5. The coincidence of pejoratives as project names is striking.
6. Arthur describes several forces which drive adaptations in existing technologies: internal replacement, lock-in and adaptive stretch, and structural deepening.