💾 Archived View for vi.rs › decentralization captured on 2020-11-07 at 01:46:44. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Part 1: Decentralization

▄▀█ ll networks begin as only one thing; one neuron, one cell, one chip, one

█▀█ computer, or one user. One entity alone is not a network, but it is the

starting point for understanding the unified theory of decentralization. One

entity is fully sovereign; it has no connections to anything else that might

influence or control it. One entity in isolation is empowered to act however it

wants and to strive for whatever results it seeks.

When one entity connects to another, however, the behavior of one affects

the other. Some form of agreement must be struck between them that dictates

what is allowed and what isn’t. In computer networks these agreements go by

lots of different names: access control lists, community standards, etc. It is

these operating agreements that users follow, or submit to, that have a

profound effect on the value, utility, and autonomy of the overall system. To

better understand decentralization you must first think of operating agreements

as either coming from the top down or the bottom up. Distributed systems

originally implied a bottom-up system but, with mobile app networks, that’s no

longer true.

The term “network” usually conjures up the mental picture of many individual

entities connected by some communication mechanism and working together to

accomplish a given task. The term “distributed system” describes networks where

the primary functions of the network are performed by the nodes in the network

and not by a set of central servers. Email is a distributed system; Spotify is

not.

There is a distinct difference between a distributed system and a decentralized

system. All decentralized systems are distributed systems, but not all

distributed systems are decentralized.

┏━━━━━━━━─━━━─━━─━──────────┐
┃                           │
┃  Distributed              │
╽                           │
╿    ┏━━━─━━─━───────┐      │
╿    ╽ Decentralized │      │
│    └──────────────━┛      │
│                           │
└──────────────────────────━┛

Up until now, “decentralized” was an adjective applied to many distributed

systems that aren’t actually decentralized (e.g., Git, Secure Scuttlebutt,

Bitcoin). To better understand the difference between distributed and

decentralized, we must break down distributed systems into the functional

pieces that all distributed systems must possess to function. Each functional

piece solves a particular problem. In all, there are nine different problems,

and each has at least one novel solution, though most have many. All solutions fall

somewhere along the spectrum from fully centralized to fully decentralized.

For a distributed system to be called “decentralized,” it must solve all

nine problems using decentralized solutions.

It is common to hear people say that “decentralized” describes what a

distributed system is not instead of what it is. However, when using the word

“decentralized” they typically mean something more than just the organization

of the network. To them it implies a partitioning of the services, governance,

and overall power structure to prevent any one entity, or user, from

controlling others in the system. It then follows that a fully decentralized

system—among many other things—atomizes the power structure to the smallest

possible unit and distributes it out to the edges where it is under direct user

control. A single user or node is sovereign in this kind of power structure.

Two users connected are still sovereign if neither user can dictate rules upon

the other; this is decentralized. The two users lose their “user sovereignty”

if they must submit to “community guidelines” that prevent them from saying

certain words or sharing certain ideas; this is not decentralized.

For the last few years, the term “self-sovereign” has been used to describe a

system that is fully decentralized. Since sovereignty is a consequence of the

underlying system structure and not of the user themselves, I prefer the term

“user sovereignty” as it more accurately describes a system’s design and how it

shapes the bottom-up operating agreement for the network.

This shift in thinking suggests that the term “decentralized,” in the realm of

distributed systems, is defined as the following:

Decentralization is the direction in which user sovereignty increases.

It applies not only to the governance but also to the structure and function of

the nodes and network. Moving towards decentralization increases user

sovereignty and is the harder thing to accomplish since it goes against all

authoritarian impulses. Censorship is only possible when a system sacrifices

user sovereignty to build a permissioned publishing platform like Twitter and

Facebook. Enforcing “community guidelines” that dictate what content can be

published is only possible by greatly reducing user sovereignty. Fully

decentralized systems do nothing to control what the users can publish but

instead give users the ability to filter what and who they are exposed to; that

is decentralized; that increases user sovereignty.

All distributed systems experience significant “centralization pressure”

because centralization is profitable. Just a cursory glance at Facebook,

Twitter, and GitHub makes it seem like the more centralized a system is, the

greater the potential for profit. And so it is! There is a mountain of money to

be made by turning users into captives and farming them like domesticated

animals; gathering their data to sell and carving up their attention to market.

The cost to the user is their sovereignty with side effects that often reach

into the real world. If you say the wrong thing on Twitter, you will be banned

and you may also lose your job and/or bank account. In China, the

social credit system is fully centralized with zero user sovereignty. Its goal

is to keep the Chinese people in virtual jail cells enforced by the people

themselves out of fear of real world consequences.

User sovereignty matters. It matters as much as our right to speak freely and

to gather peaceably and protest our government. In a decentralized system,

users are free to join and leave at will and take their data with them in a

portable format. They have absolute control over what data is shared with

others and the system as a whole as well as the ability to completely delete

their data at any time. This includes metadata, such as with whom and when

they connected in the system. To give in to centralization pressure is lazy and

immoral. Decentralization requires conviction and virtue. To centralize shows

disregard for the users a system hopes to serve. All systems architects should

start with the goal of maximum user sovereignty, then make smart and

conscientious compromises to decentralization only when absolutely necessary

and be fully transparent about the cost to users’ sovereignty.

To be fully decentralized is to maximize user sovereignty in all solutions for

the nine problems. Choosing a centralized solution for just one of the nine

problems causes a loss of user sovereignty and moves the distributed system

away from being fully decentralized. Similarly, if a solution to any of the

nine problems is left out of the system design, this also reduces the system’s

decentralization; Bitcoin in particular suffers from this kind of reduction in

user sovereignty. It only takes one centralized solution—or in Bitcoin’s case,

non-solution—to open up an opportunity for a corporation or government to

“capture” the community of users for financial and/or strategic control

reasons. In worst-case scenarios, “corporate capture” can present an

existential threat to the independence of the system and sovereignty of the

users by tying them to an all-encompassing centralized platform that serves as

a gatekeeper for user access.

So what are the nine problems of decentralization? They are:

Discovery
Introduction
Coherence
Public Services
Trust
Privacy
Coordination
Membership
Persistent State
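
The rule that a system is fully decentralized only when every one of the nine problems has a decentralized solution can be written down as a small checklist. What follows is a minimal sketch in Python, not a formal model: the names Problem, Solution, weak_points, and is_fully_decentralized are my own illustrative choices, and collapsing the centralized-to-decentralized spectrum into three values is a simplifying assumption.

```python
from enum import Enum


class Problem(Enum):
    DISCOVERY = "Discovery"
    INTRODUCTION = "Introduction"
    COHERENCE = "Coherence"
    PUBLIC_SERVICES = "Public Services"
    TRUST = "Trust"
    PRIVACY = "Privacy"
    COORDINATION = "Coordination"
    MEMBERSHIP = "Membership"
    PERSISTENT_STATE = "Persistent State"


class Solution(Enum):
    # Assumption: a three-value simplification of what is really a spectrum.
    CENTRALIZED = "centralized"
    DECENTRALIZED = "decentralized"
    MISSING = "missing"  # the system leaves the problem unsolved


def weak_points(design):
    """Return the problems a design does not solve in a decentralized way.

    design maps Problem -> Solution; a problem absent from the mapping is
    treated as MISSING, since a non-solution also reduces sovereignty.
    """
    return [p for p in Problem
            if design.get(p, Solution.MISSING) is not Solution.DECENTRALIZED]


def is_fully_decentralized(design):
    """True only when all nine problems have a decentralized solution."""
    return not weak_points(design)


# A hypothetical design: decentralized everywhere except discovery,
# with no answer at all for coordination.
design = {p: Solution.DECENTRALIZED for p in Problem}
design[Problem.DISCOVERY] = Solution.CENTRALIZED
del design[Problem.COORDINATION]
print([p.value for p in weak_points(design)])
```

Each entry returned by weak_points is a place where, by the argument above, a third party could step in and capture the users.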

What follows is a brief discussion of each one. This document does not cover

the different solutions to problems, centralized or decentralized. The purpose

is to present an overview of each problem so that we can build a new way of

looking at distributed systems and settle on a new unified theory of

decentralization.

Discovery

All distributed systems start with just one node and one user. Until there are

two users that are connected, we can’t start to call it a distributed system.

When new users wish to join a distributed system by connecting to other users,

they have to solve the discovery problem. Finding the IP address or the domain

name or the user name of another user to connect to has lots of solutions. The

most common and easiest solution is to use a centralized server where users get

the information needed to initiate connections to other users. This is the

model that Twitter and Facebook and nearly all social platforms use. Oddly

enough, this is also the system that Git users use via GitHub and Secure

Scuttlebutt users use via public “pub” servers. Stranger still is the fact that

Bitcoin uses hard-coded addresses of Bitcoin seed nodes that act like

centralized servers for discovery purposes. Building a fully decentralized

discovery solution is an ongoing research topic. There are a few solutions but

they are difficult to use and some have privacy issues. For instance, DNS

queries are traditionally sent unencrypted and are visible to network observers.

Introduction

Once users have connected on the network level, they need to exchange

(cryptographic) credentials with each other to establish their identities. This

is the introduction problem. Is the introduction anonymous, pseudonymous,

or public? If the credentials exchanged are tied to actual people or

organizations, how are those credentials verified? If the credentials are

anonymous or pseudonymous, how will users be identified in subsequent

connections? The solution for introduction in a distributed system has many

critical consequences that affect the solutions for other problems like trust,

privacy, coordination, and membership. The introduction problem may be the

hardest of all of the problems simply based on the observation that most

distributed systems don’t provide a solution and rely on external, out-of-band

services for introduction.

Coherence

After two users have discovered each other and are introduced, the connection

between them eventually closes, and one or both users will go offline. Very

few end users stay online all of the time. Of those few who do, only a tiny

fraction keep a static address or other stable means for connecting to them

again. The coherence problem focuses on how users reconnect with each other

after they go offline and wish to rejoin the network.

The world today is made up of frequently disconnected and mobile users who move

around the Internet topologically. Their IP addresses change often, as does

their firewalled status. It is common for a user to be behind a firewall at

home and work. But while commuting, the user might be using a non-firewalled

IPv6 connection via a mobile device. Solutions for the coherence problem must

accommodate this reality and keep users connected despite the constant churn

and chaos of their network status.

Public Services

The reason users join a distributed system is to take advantage of the public

services provided to them and to have access to the other users. A public

service is presented the same way to every user. Whether it is creating or

consuming content, all distributed systems exist to provide public services to

users. Facebook exists to communicate photos and messages between friends and

family. Providing these in a decentralized way is a very difficult problem to

solve while protecting user sovereignty. Existing solutions have various

tradeoffs with efficiency and privacy. One such solution is query flooding,

which was common in early p2p file sharing systems. It doesn’t scale well but

does a good job of preserving a user’s privacy. Later designs routed queries

and began trading user privacy for efficiency.
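
To make the scaling cost of query flooding concrete, here is a minimal sketch in Python. It simulates flooding over an in-memory graph rather than a real network, and the function name flood_query, the ttl hop limit, and the example graph are all assumptions made for illustration, not the design of any particular file sharing system.

```python
from collections import deque


def flood_query(graph, origin, has_item, ttl):
    """Flood a query outward from origin and count the traffic it causes.

    graph:    dict mapping each node to a list of its neighbors
    has_item: set of nodes able to answer the query
    ttl:      maximum number of hops the query may travel
    Returns (nodes_that_answered, messages_sent).
    """
    seen = {origin}
    hits = set()
    messages = 0
    queue = deque([(origin, ttl)])
    while queue:
        node, hops_left = queue.popleft()
        if node in has_item:
            hits.add(node)
        if hops_left == 0:
            continue  # the query dies here
        for neighbor in graph[node]:
            messages += 1  # every forward costs a message, even a duplicate
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops_left - 1))
    return hits, messages


# Hypothetical five-node network; only node "e" holds the item.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
print(flood_query(graph, "a", {"e"}, ttl=2))
print(flood_query(graph, "a", {"e"}, ttl=3))
```

In this toy network the item sits three hops from the origin, so a hop limit of two finds nothing and a limit of three finds it at the cost of more messages. Message count grows with the number of links reached rather than with the number of answers, which is why flooding scales poorly.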

Trust

Trust in systems relies upon the solution for introduction and, possibly, on

the public services presentation if authentication is done using a distributed

identity solution. In short, the trust problem comes down to being certain of

whom you are talking to (i.e., authentication) and that the data you are receiving is

both private and unmodified (i.e., confidentiality and integrity). Combining

those two creates trust. Trust in an interaction between users is a function of

how much you trust the other user and the risk inherent in the transaction.

This perfectly models how trust is handled in human-scale social networks and

is intuitive for even novice users.

To solve the trust problem in a decentralized way, the current state of the art

is to use a hybrid top-down/web-of-trust model where relationships are pairwise

and distributed but utilize verifiable credentials from trusted institutions to

give trust to the authentication portion. A new and exciting development is the

use of public proofs-of-work such as a user’s contributions to an open source

software project as the basis for trust. This opens up the possibility for

anonymous users to use their consistent, high-quality contributions to

well-known and trusted projects as the trust anchor for their authentication. This

is the same as saying, “you don’t need to know who I am, however, I can prove

to you that I am the same user that has been the maintainer of a notable part

of the Linux kernel project for years.” Authentication by reputation. This maps

nicely to the familiar social trust we all rely upon in everyday life.

Privacy

Privacy is probably the easiest problem to solve in a decentralized system.

This problem has received the most attention in the last 20 years leading to

the Tor Project, I2P, and mixnets, along with end-to-end encryption and

zero-round-trip, perfect-forward-secrecy protocols like Noise.

By layering solutions at each level of the OSI stack and using cryptography and

zero-knowledge proofs pervasively, it is easy to prevent correlation and

tracking methods used to de-anonymize users from their traffic. Any

decentralized system that takes privacy seriously must prevent IP packet

tracking through the use of onion routing and/or mixnets. Ultimately it must

rely on pairwise identifiers to prevent other users from colluding to track and

de-anonymize the users they are talking to. Then the whole system must be

designed to never store or transmit any personal information and, instead, use

zero-knowledge proofs and verifiable claims to implement authentication and

authorization based on what a user is, not who they are. Any system that

traffics in knowledge about a user can ultimately disclose that information and

compromise the user’s privacy, threatening their sovereignty.
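
The idea behind pairwise identifiers can be sketched in a few lines of Python. This is only an illustration of the unlinkability property, under the assumption that a user holds one private master secret and derives a separate identifier for each peer with HMAC-SHA256; the function name pairwise_id and the peer labels are hypothetical, and real systems would more likely use a separate key pair per relationship.

```python
import hashlib
import hmac
import secrets


def pairwise_id(master_secret: bytes, peer_label: bytes) -> str:
    """Derive an identifier that is stable for one peer and differs per peer.

    Assumption: master_secret never leaves the user's device. Without it,
    two peers comparing the identifiers they were shown cannot tell that
    both belong to the same user.
    """
    return hmac.new(master_secret, peer_label, hashlib.sha256).hexdigest()


master = secrets.token_bytes(32)  # generated once and kept private

id_shown_to_bob = pairwise_id(master, b"bob")
id_shown_to_charlie = pairwise_id(master, b"charlie")

print(id_shown_to_bob != id_shown_to_charlie)         # different per peer
print(id_shown_to_bob == pairwise_id(master, b"bob"))  # stable per peer
```

Because each peer only ever sees its own identifier, colluding peers have nothing to join on unless some other layer leaks a shared value such as an IP address.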

Coordination

The coordination problem has three parts: communication, collaboration, and

corroboration. Communication asks whether the users of the system can conduct all

system functions using the main communication channels of the system. In a lot

of distributed systems, the answer is often “no” because of the authentication

piece of the trust problem. Systems such as Bitcoin require that some

communication happens outside of the main transaction and block sharing

network.

When two users of Bitcoin wish to transfer bitcoins, the recipient must

communicate to the sender the destination bitcoin address. This out-of-band

(OOB) communication presents many technical problems for users. The challenges

are serious enough that the authors of Bitcoin invented a special

binary-to-text encoding system called Base58Check to minimize the opportunity

for errors when sending and transcribing bitcoin addresses. There is also the

challenge of man-in-the-middle attacks, in which an attacker misdirects

bitcoins by substituting their own address for the one sent by the real

recipient.
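
For readers who have not seen Base58Check, here is a minimal sketch in Python of how it guards against transcription errors. It is written from the commonly documented scheme (a version byte, the payload, and the first four bytes of a double SHA-256 as checksum, written in a 58-character alphabet that leaves out 0, O, I, and l); the function names are my own and this is not code from Bitcoin itself.

```python
import hashlib

# Alphabet without the easily confused characters 0, O, I, and l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def _checksum(data: bytes) -> bytes:
    """First four bytes of SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def b58check_encode(version: int, payload: bytes) -> str:
    data = bytes([version]) + payload
    data += _checksum(data)
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = ALPHABET[rem] + encoded
    # Each leading zero byte is written as the character "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + encoded


def b58check_decode(text: str):
    """Return (version, payload); raise ValueError on any detected typo."""
    num = 0
    for ch in text:
        num = num * 58 + ALPHABET.index(ch)  # ValueError on excluded characters
    pad = len(text) - len(text.lstrip("1"))
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    data = b"\x00" * pad + body
    if len(data) < 5 or _checksum(data[:-4]) != data[-4:]:
        raise ValueError("bad Base58Check checksum")
    return data[0], data[1:-4]


address = b58check_encode(0x00, bytes(20))  # version 0 with an all-zero payload
print(address)
print(b58check_decode(address))
```

The checksum catches accidental typos, but it does nothing against the substitution attack described above: an attacker's own address carries a perfectly valid checksum.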

Requiring OOB communication to use a system opens it to significant

centralization pressure by ceding control over that part of the communication

regime to an outside solution provider. Outside solution providers usually

invest in centralized solutions for automating and streamlining the OOB

portions of system communication. This is why Coinbase exists. This is also

partially why GitHub exists. Relying on centralized third parties to enable the

full capabilities of a distributed system, such as Bitcoin (Coinbase) and Git

(GitHub), hurts the overall decentralization of the system. It ultimately

affects its independence and limits its resistance to attack because these

central systems are really in control of user access.

Collaboration asks whether the nodes can work together to provide a public

service such as search. Whether the service is a search function or packet

routing, it is difficult to design a fully decentralized solution for

collaboration without affecting privacy. This is an area of active research.

The last part of coordination is corroboration. Corroboration asks whether the

nodes share data with each other that supports decentralized solutions for

other problems. Reputation systems fall under this part of coordination. How a

reputation system is designed directly affects trust, privacy, membership, and

potentially even coherence and discovery. There has been some research in

decentralized corroboration. However, most systems designers find the problem

too difficult and instead build centralized solutions.

Membership

Like coordination, the membership problem has many facets to it. If a system is

designed for user sovereignty, then participation is entirely at the

discretion of the user. Fully decentralized systems have no way of preventing

an arbitrary user from participating in the system. Therefore, they allow users

to create cliques that have isolated, private communication and interactions.

No non-member can participate in the clique or even observe that the clique

exists.

Membership isn’t just about group formation and protection. It also deals with

preventing the correlation of the nodes in a group. The fact that a group of

users are associated and communicate is often many times more valuable to an

attacker than the information communicated. Fully decentralized systems allow

for the formation of these groups without disclosing to any observers who is

connecting to whom. Alice must be able to join with Bob and Charlie in such a

way that Mallory cannot observe the group formation, nor can she enumerate all

of the members of the group.

Persistent State

The true value of any distributed system is to maintain some persistent state

for the system as a whole. For instance, distributed file systems such as

Tahoe-LAFS store a set of files spread out over a number of nodes such that

the failure of a subset of nodes does not affect the availability of the

data. Bitcoin distributes a copy of the Bitcoin blockchain to all of the

full nodes in the network and therefore has a fully decentralized persistent state

solution. To be fully decentralized, all nodes in a network must be able to

reproduce the full persistent state set. When applied to distributed systems

other than blockchains, it is possible to create “metastability” where no one

node is online all of the time but enough nodes are part of the system that

the probability that at least one is online at any given time approaches

certainty.
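
The metastability claim can be checked with a little arithmetic. The sketch below is in Python and rests on a strong simplifying assumption that is mine, not part of the argument above: every node is online independently with the same probability p. Under that assumption the chance that at least one of n nodes is online is 1 - (1 - p)^n. The function names are illustrative.

```python
import math


def prob_at_least_one_online(n: int, p: float) -> float:
    """Chance that at least one of n nodes is online.

    Assumes each node is online independently with probability p.
    """
    return 1.0 - (1.0 - p) ** n


def nodes_needed(p: float, target: float) -> int:
    """Smallest n whose availability reaches target (0 < p < 1, 0 < target < 1)."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


# Hypothetical numbers: each node is online only 10% of the time.
for n in (1, 10, 44, 100):
    print(n, prob_at_least_one_online(n, 0.10))

print(nodes_needed(0.10, 0.99))
```

Real nodes are not independent, since they share time zones, power grids, and network providers, so these figures are best read as an optimistic bound rather than a prediction.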

Conclusion

Those are the nine problems. So how do they fit into user sovereignty? For any

distributed system the programmers/architects have to decide how they will

solve each one. There are a number of solutions for these problems with some

being more centralized and power-asymmetric and others that are decentralized

and user-sovereign. For a system to be fully user-sovereign it has to have a

decentralized solution for every problem.

There is one other thing that is really interesting about this model: it

predicts massive economic/business opportunities. When a popular distributed

system does not have a solution for one of the problems, it creates an opening

for a corporation to capture the users and make money. Companies may follow the

principles of user sovereignty and offer paid edge services that won’t threaten

the overall system or violate the users’ trust. However, companies may also

build centralized walled gardens and threaten the autonomy of the users and the

entire ecosystem that depends on the system.

If you are concerned about the latter, design for the former.

▪ .

█▀▀ █░█ █▀▀ █▀▀ █▀█ █▀ █

█▄▄ █▀█ ██▄ ██▄ █▀▄ ▄█ ▄

. ▛ ╿ ▋

▎ ▪ ╵ ▎ .

▏ ╵ ▏

▏

▎
