Distributed Consensus 2020

It’s not what you think it is.

A few weeks ago, I led my connections to a single-question survey, asking what the currently most utilized distributed consensus algorithms is. This isn’t the largest group in the world, but it’s also a fairly mixed bunch: a majority is in the tech industry, but almost as many are not. Some or old guard, some entered the field only a while ago. Some are more research inclined, some more practically oriented.

I figure it makes a decent survey group.

Signs at Foley Square

Figure: “Signs at Foley Square” by TNLNYC is licensed under CC BY-SA 2.0

Without further ado, let’s look at the answers.

Raft and Paxos dominate

Raft takes up 36% of the answers, and Paxos 29%. Raft is the more modern of the two, designed to be simpler than Leslie Lamport’s venerable Paxos algorithm.

Both algorithms target non-Byzantine fault tolerant consensus, though may be extended to cover Byzantine failure modes. In either case, they’re best suited for reaching consensus in smallish groups of nodes.

Paxos is used in some of Google and Amazon’s cloud services, Ceph and Cassandra. On the other hand, Raft’s predominant user is etcd, but also has implementations from other cloud vendors. In either case, cloud is the usage of choice.

A similar algorithm, Zookeeper's ZAB, was nominated by about 7% of respondents. Since all three share similarities in their workings and their use cases, they can be considered the “cloud group” of algorithms. This group together takes up well in excess of two thirds of the answers with a total of 72%.

So how do we know how often these algorithms are actually utilized?

We can start by looking at Netcraft's monthly webserver survey. We’re assuming that anything that’s deployed in the cloud has at least one public, web-based service. That’s probably not a correct assumption, but adjusting to real-world usage should be a question of adjusting by a few percentage points.

As Netcraft points out, their method for determining the number of computers used in web services is fairly accurate, counting in the region of 2/3 to 3/4 of computers in data centers. Taking numbers from the February 2020 survey, then, we can start by estimating the number of cloud machines in use to be in the region of 9,564,965 / 2 * 3 to 9,564,965 / 3 * 4 or 12,753,287 to 14,347,447. Let’s round that to 13.5M.

It’s not exactly a given that all of those 13.5M computers participate in distributed consensus groups. But we’re also in the year in which Kubernetes has been declared the winner of the container wars, and new deployments at least tend to be heavily containerized. It’s probably reasonably safe to assume that data center computers participating in a distributed consensus group will hover in the region of the 10M mark.

Blockchain

The next most often mentioned algorithm is blockchain. At nearly 21% total, it’s still a healthy chunk of my readers that consider this to be the most utilized distributed consensus algorithm.

I suspect the number is so high because a younger generation of engineers will not have grown up with Paxos, but left university encountering Byzantine fault tolerant distributed consensus with the proof of work blockchain powering Bitcoin.

About a third of people submitting Blockchain as the answer were thinking that Practical Byzantine Fault Tolerance was the most utilized algorithm, despite Bitcoin’s dominance of the sector.

Again, how to get at numbers?

This is harder to get a grasp on. Etherscan tracks about 7.7K nodes as I’m writing it, with Ethereum generally considered to be one of the most popular Blockchain implementations. Bitnodes counts just over 10K for Bitcoin at the same time.

Assuming these two are the largest players in the field, I suspect it’s safe to say that the total number of participating nodes across all Blockchains is not going to exceed the 1M mark. For our purposes, it’s enough to guesstimate that Blockchain is about an order of magnitude smaller than traditional consensus algorithms of the “cloud group”.

Other Mentions

With the cloud and blockchain groups arriving at 93% of the answers, what’s left?

It turns out, quite a few people mentioned databases as a mechanism for distributing consensus. While it’s true that each database client will agree with all others on the current application state, there is no consensus making mechanism in the database itself. Even if the database were to be replicated, in database clusters one node always serves as the source of truth, with the others merely copying this state.

In a more abstract sense, however, it’s still true that this kind of “consensus” is oftentimes all that’s required to distribute work in a networked system.

In a similar vein, one answer was DNS, the Domain Name System. Architecturally speaking, it’s not much different from a replicated database, but it is very true that this database operates at a massive scale.

And the Winner is…

… not even mentioned.

Well. I admit to cheating somewhat. But I was cheating for a reason.

One of the more interesting things about this mini survey was to see by how much the world overvalues blockchain’s contribution to distributed consensus mechanisms.

If 10M cloud machines are to be held against 1M blockchain machines, then the cloud makes up over 90% of the total. And that is assuming that blockchain participants are not managed in a cloud as well, but the that the two are completely disparate groups – which is frankly unlikely.

Compare that to the 72%/21% of answers, and it’s fairly clear that blockchain is more prevalent in the minds of people than in actual use. Chalk it up to good marketing, eh?

So I cheated by phrasing the question in just an open enough way to let people send in replies that reflect the mind share of the algorithm more than its actual use. Contrasting that to a best effort guess at an actual utilization is interesting, at least to me.

But I was also cheating in a different way. Because there exists another distributed consensus algorithm that is widely used today, and it’s unlike Paxos and unlike blockchain.

At a best guess, it is used by in excess of 30 billion devices on a daily basis.

That’s billion with a B. It’s also three orders of magnitude more devices than our cloud estimate. It’s run in Antarctica. It’s run under the sea. It’s run in space.

It’s also a true consensus algorithm, in which each node establishes its version of the global state based on the information it gets from other nodes – not by merely copying it, but by algorithmically excluding obvious deviations from the global agreement. It’s fault-tolerant to a fairly high degree.

But what’s more interesting is that it’s also so fundamental to the functioning of devices that it’s all but guaranteed that any node that runs a cloud group consensus algorithm or blockchain also runs this algorithm.

More after the break.

So where does this 30 billion number come from? Well, to be fair it’s probably an outdated estimate. And it doesn’t really look at all the devices out there. The number could vary by quite a bit. The source is this Statista estimate on IoT devices over the years, and for 2020 it suggests 30 billion, where a more readable source mentions 20 billion.

So what’s this amazing algorithm you ask?

It goes by three letters: NTP.

Network Time Protocol

In the Network Time Protocol, each node may publish their own interpretation of the current agreed-upon UTC time upon request, and request other node’s interpretation in turn. A fairly sophisticated set of algorithms excludes statistical outliers to avoid false state.

And you can be sure these days that every internet connected device participates in the protocol even if only as a client. Even if devices are not connected to the internet, they may participate in a variant of NTP for keeping their view of time up-to-date, because so many other operations require a shared understanding of time.

For example, exchanging cryptographic authentication tokens via JWT requires the nodes to honour timestamps; without a reasonably accurate shared understanding of the UTC time, other services that the devices may offer could not be used safely.

Definition of Consensus

I can practically hear some of my readers complain that this stretches the definition of consensus considerably. And that’s true, which is part of the point of asking the question in the first place.

But let’s break this down a little.

  • In the cloud group as in blockchain as in NTP, each node finds their own version of the truth about the shared state.
  • Each algorithm group takes pains to exclude false statements about the shared state (fault tolerance).
  • Each algorithm group effectively implements a voting mechanism or majority consensus (though the approaches are very different).

The major differences between NTP on the one, and the others on the other hand is that by whichever means, the others reach a definitive shared agreement of their global state. Whether N nodes vote on the authoritative leader or on the shared state directly, once the vote comes in, it points to the truth.

NTP is different in that it accepts deviation from the global state – but reduces the deviation to the point where it has no practical meaning for everyday use cases.

Distributed consensus doesn’t have to be perfect, it just has to be good enough for the use-case.

Though in this case, “good enough” is pretty damn good.


Published on March 9, 2020