💾 Archived View for vi.rs › decentralization captured on 2020-11-07 at 01:46:44. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
▄▀█ ll networks begin as only one thing; one neuron, one cell, one chip, one
█▀█ computer, or one user. One entity alone is not a network, but it is the
starting point for understanding the unified theory of decentralization. One
entity is fully sovereign, it has no connections to anything else that might
influence or control it. One entity in isolation is empowered to act however it
wants to strive for whatever results it seeks.
When one entity connects to another, however, the behavior of one affects
the other. Some form of agreement must be struck between them that dictates
what is allowed and what isn’t. In computer networks these agreements go by
lots of different names: access control lists, community standards, etc. It is
these operating agreements that users follow, or submit to, that have a
profound effect on the value, utility, and autonomy of the overall system. To
better understand decentralization you must first think of operating agreements
as either coming from the top down or the bottom up. Distributed systems
originally implied a bottom-up system but, with mobile app networks, that’s no
longer true.
The term “network” usually conjures up the mental picture of many individual
entities connected by some communication mechanism and working together to
accomplish a given task. The term “distributed system” describes networks where
the primary functions of the network are performed by the nodes in the network
and not a set of central servers; email is a distributed system; Spotify is
not.
There is a distinct difference between a distributed system and a decentralized
system. All decentralized systems are distributed systems, but not all
distributed systems are decentralized.
┏━━━━━━━━─━━━─━━─━──────────┐
┃ │
┃ Distributed │
╽ │
╿ ┏━━━─━━─━───────┐ │
╿ ╽ Decentralized │ │
│ └──────────────━┛ │
│ │
└──────────────────────────━┛
Up until now, “decentralized” was an adjective applied to many distributed
systems that aren’t actually decentralized (e.g., Git, Secure Scuttlebutt,
Bitcoin, etc.). To better understand the difference between distributed and
decentralized, we must break down distributed systems into the functional
pieces that all distributed systems must possess to function. Each functional
piece solves a particular problem. In all, there are nine different problems
and each has at least one novel solution, though most have many. All solutions
fall somewhere along the spectrum from fully centralized to fully decentralized,
and for a distributed system to be called “decentralized,” it must solve all
nine problems using decentralized solutions.
It is common to hear people say that “decentralized” describes what a
distributed system is not instead of what it is. However, when using the word
“decentralized” they typically mean something more than just the organization
of the network. To them it implies a partitioning of the services, governance,
and overall power structure to prevent any one entity, or user, from
controlling others in the system. It then follows that a fully decentralized
system—among many other things—atomizes the power structure to the smallest
possible unit and distributes it out to the edges where it is under direct user
control. A single user or node is sovereign in this kind of power structure.
Two users connected are still sovereign if neither user can dictate rules upon
the other; this is decentralized. The two users lose their “user sovereignty”
if they must submit to “community guidelines” that prevent them from saying
certain words or sharing certain ideas; this is not decentralized.
For the last few years, the term “self-sovereign” has been used to describe a
system that is fully decentralized. Since sovereignty is a consequence of the
underlying system structure and not of the user themselves, I prefer the term
“user sovereignty” as it more accurately describes a system’s design and how it
shapes the bottom-up operating agreement for the network.
This shift in thinking suggests that the term “decentralized,” in the realm of
distributed systems, is defined as the following:
It applies not only to the governance but also to the structure and function of
the nodes and network. Moving towards decentralization increases user
sovereignty and is the harder thing to accomplish since it goes against all
authoritarian impulses. Censorship is only possible when a system sacrifices
user sovereignty to build a permissioned publishing platform like Twitter and
Facebook. Enforcing “community guidelines” that dictate what content can be
published is only possible by greatly reducing user sovereignty. Fully
decentralized systems do nothing to control what the users can publish but
instead give users the ability to filter what and who they are exposed to; that
is decentralized; that increases user sovereignty.
All distributed systems experience significant “centralization pressure”
because centralization is profitable. Even a cursory glance at Facebook,
Twitter, and GitHub makes it seem like the more centralized a system is, the
greater the potential for profit. And so it is! There is a mountain of money to
be made by turning users into captives and farming them like domesticated
animals; gathering their data to sell and carving up their attention to market.
The cost to the user is their sovereignty with side effects that often reach
into the real world. If you say the wrong thing on Twitter, you can be banned,
and you may lose your job or bank account as well. In China, the
social credit system is fully centralized with zero user sovereignty. Its goal
is to keep the Chinese people in virtual jail cells enforced by the people
themselves out of fear of real world consequences.
User sovereignty matters. It matters as much as our right to speak freely and
to gather peaceably and protest our government. In a decentralized system,
users are free to join and leave at will and take their data with them in a
portable format. They have absolute control over what data is shared with
others and the system as a whole as well as the ability to completely delete
their data at any time. This includes metadata, such as with whom and when
they connected in the system. To give in to centralization pressure is lazy and
immoral. Decentralization requires conviction and virtue. To centralize shows
disregard for the users a system hopes to serve. All systems architects should
start with the goal of maximum user sovereignty first, then make smart and
conscientious compromises to decentralization only when absolutely necessary
and be fully transparent about the cost to users’ sovereignty.
To be fully decentralized is to maximize user sovereignty in all solutions for
the nine problems. Choosing a centralized solution for just one of the nine
problems causes a loss of user sovereignty and moves the distributed system
away from being fully decentralized. Similarly, if a solution to any of the
nine problems is left out of the system design, this also reduces the system’s
decentralization; Bitcoin in particular suffers from this kind of reduction in
user sovereignty. It only takes one centralized solution—or in Bitcoin’s case,
non-solution—to open up an opportunity for a corporation or government to
“capture” the community of users for financial and/or strategic control
reasons. In worst case scenarios, “corporate capture” can present an
existential threat to the independence of the system and sovereignty of the
users by tying them to an all-encompassing centralized platform that serves as
a gatekeeper for user access.
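So what are the nine problems of decentralization? They are discovery,
introduction, coherence, public services, trust, privacy, coordination,
membership, and persistent state.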
What follows is a brief discussion of each one. This document does not cover
the different solutions to each problem, centralized or decentralized. The
purpose is to present an overview of the problems so that we can build a new
way of looking at distributed systems and settle on a new unified theory of
decentralization.
All distributed systems start with just one node and one user. Until there are
two users that are connected, we can’t start to call it a distributed system.
When new users wish to join a distributed system by connecting to other users,
they have to solve the discovery problem. Finding the IP address or the domain
name or the user name of another user to connect to has lots of solutions. The
most common and easiest solution is to use a centralized server where users get
the information needed to initiate connections to other users. This is the
model that Twitter and Facebook and nearly all social platforms use. Oddly
enough, this is also the system that Git users use via GitHub and Secure
Scuttlebutt users use via public “pub” servers. Stranger still is the fact that
Bitcoin uses hard-coded IP addresses of Bitcoin seed nodes that act like
centralized servers for discovery purposes. Building a fully decentralized
discovery solution is an ongoing research topic. There are a few solutions but
they are difficult to use and some have privacy issues. For instance,
conventional DNS queries are sent unencrypted and are visible to anyone on the
network path.
Once users have connected on the network level, they need to exchange
(cryptographic) credentials with each other to establish their identities. This
is the introduction problem. Is the introduction anonymous, pseudonymous,
or public? If the credentials exchanged are tied to actual people or
organizations, how are those credentials verified? If the credentials are
anonymous or pseudonymous, how will users be identified in subsequent
connections? The solution for introduction in a distributed system has many
critical consequences that affect the solutions for other problems like trust,
privacy, coordination, and membership. The introduction problem may be the
hardest of all of the problems simply based on the observation that most
distributed systems don’t provide a solution and rely on external, out-of-band
services for introduction.
After two users have discovered each other and are introduced, the connection
between them is closed eventually, and one or both users will go offline. Very
few end users stay online all of the time. Of those few who do, only a tiny
fraction keep a static address or other stable means for connecting to them
again. The coherence problem focuses on how users reconnect with each other
after they go offline and wish to rejoin the network again.
The world today is made up of frequently disconnected and mobile users who move
around the Internet topologically. Their IP addresses change often as well as
their firewalled status. It is common for a user to be behind a firewall at
home and work. But while commuting, the user might be using a non-firewalled
IPv6 connection via a mobile device. Solutions for the coherence problem must
accommodate this reality and keep users connected despite the constant churn
and chaos of their network status.
The reason users join a distributed system is to take advantage of the public
services provided to them and to have access to the other users. A public
service is presented the same way to every user. Whether it is creating or
consuming content, all distributed systems exist to provide public services to
users. Facebook exists to communicate photos and messages between friends and
family. Providing these in a decentralized way is a very difficult problem to
solve while protecting user sovereignty. Existing solutions have various
tradeoffs with efficiency and privacy. One such solution is query flooding,
which was common in early p2p file sharing systems. It doesn’t scale well but
does a good job of preserving a user’s privacy. Later designs routed queries
and began trading user privacy for efficiency.
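To make the tradeoff concrete, here is a minimal sketch of query flooding over a
toy network. The graph, the `flood_query` function, and the `ttl` parameter are
illustrative assumptions and are not taken from any particular file sharing
system. Each node only hands the query to its direct neighbours, which is the
property that helps privacy, and the message counter shows why it scales poorly.

```python
from collections import deque

def flood_query(graph, origin, has_item, ttl):
    """Flood a query outward from origin, at most ttl hops.

    Returns the nodes that hold the item and the number of messages sent.
    """
    hits = []
    seen = {origin}
    frontier = deque([(origin, ttl)])
    messages = 0
    while frontier:
        node, hops_left = frontier.popleft()
        if has_item(node):
            hits.append(node)
        if hops_left == 0:
            continue  # the query expires here and is not forwarded
        for neighbour in graph[node]:
            messages += 1  # every forward costs a message, duplicates included
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, hops_left - 1))
    return hits, messages

# Hypothetical five-node network; only node "e" holds the requested item.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
print(flood_query(graph, "a", lambda node: node == "e", ttl=2))
print(flood_query(graph, "a", lambda node: node == "e", ttl=3))
```

Raising `ttl` by one hop is what finds the item here, and it also raises the
message count, including messages sent to nodes that had already seen the query.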
Trust in systems relies upon the solution for introduction and, possibly, on
the public services presentation if authentication is done using a distributed
identity solution. In short, the trust problem comes down to being certain of
whom you are talking to (i.e., authentication) and that the data you are
receiving is both private and unmodified (i.e., confidentiality and
authenticity). Combining those two creates trust. Trust in an interaction
between users is a function of how much you trust the other user and the risk
inherent in the transaction.
This perfectly models how trust is handled in human-scale social networks and
is intuitive for even novice users.
To solve the trust problem in a decentralized way, the current state of the art
is to use a hybrid top-down/web-of-trust model where relationships are pairwise
and distributed but utilize verifiable credentials from trusted institutions to
give trust to the authentication portion. A new and exciting development is the
use of public proofs-of-work such as a user’s contributions to an open source
software project as the basis for trust. This opens up the possibility for
anonymous users to use their consistent, high-quality contributions to
well-known and trusted projects as the trust anchor for their authentication. This
is the same as saying, “you don’t need to know who I am, however, I can prove
to you that I am the same user that has been the maintainer of a notable part
of the Linux kernel project for years.” Authentication by reputation. This maps
nicely to the familiar social trust we all rely upon in everyday life.
Privacy is probably the easiest problem to solve in a decentralized system.
This problem has received the most attention in the last 20 years, leading to
the Tor Project, I2P, and mixnets, along with end-to-end encryption and
zero-round-trip protocols with perfect forward secrecy, like Noise.
By layering solutions at each level of the OSI stack and using cryptography and
zero-knowledge proofs pervasively, it is easy to prevent correlation and
tracking methods used to de-anonymize users from their traffic. Any
decentralized system that takes privacy seriously must prevent IP packet
tracking through the use of onion routing and/or mixnets. Ultimately it must
rely on pairwise identifiers to prevent other users from colluding to track and
de-anonymize the users they are talking to. Then the whole system must be
designed to never store or transmit any personal information and, instead, use
zero-knowledge proofs and verifiable claims to implement authentication and
authorization based on what a user is, not who they are. Any system that
traffics in knowledge about a user can ultimately disclose that information and
compromise the user’s privacy, threatening their sovereignty.
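One way to picture pairwise identifiers is a user who derives a different
identifier for every peer from a single local secret. The sketch below is an
assumption made for illustration: the `pairwise_id` function, the peer labels,
and the choice of HMAC-SHA256 are hypothetical, and a real system might instead
generate an independent key pair per relationship.

```python
import hashlib
import hmac
import secrets

def pairwise_id(master_secret: bytes, peer_label: str) -> str:
    """Derive an identifier that is stable for one peer and differs per peer."""
    mac = hmac.new(master_secret, peer_label.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()

# Alice keeps one secret locally and never shares it (hypothetical setup).
alice_secret = secrets.token_bytes(32)

id_for_bob = pairwise_id(alice_secret, "bob")
id_for_carol = pairwise_id(alice_secret, "carol")

# Bob sees the same identifier every time Alice reconnects to him.
assert id_for_bob == pairwise_id(alice_secret, "bob")
# Bob and Carol cannot match their records of Alice by comparing identifiers.
assert id_for_bob != id_for_carol
```

This only removes the identifier as a correlation handle. Network addresses,
timing, and message content can still link the two relationships, which is why
the layering described above still matters.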
The coordination problem has three parts: communication, collaboration, and
corroboration. Communication asks whether the users of the system conduct all
system functions using the main communication channels of the system. In many
distributed systems, the answer is “no” because of the authentication
piece of the trust problem. Systems such as Bitcoin require that some
communication happens outside of the main transaction and block sharing
network.
When two users of Bitcoin wish to transfer bitcoins, the recipient must
communicate to the sender the destination bitcoin address. This out-of-band
(OOB) communication presents many technical problems for users. The challenges
are serious enough that the authors of Bitcoin invented a special
binary-to-text encoding system called Base58Check to minimize the opportunity
for errors when sending and transcribing bitcoin addresses. There is also the
challenge of man-in-the-middle attacks, in which an attacker misdirects
bitcoins by substituting their own address for the one sent by the real
recipient.
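For readers unfamiliar with Base58Check, the following is a minimal sketch of
the encoding as it is commonly described: a version byte, the payload, and a
four-byte checksum taken from a double SHA-256, all written in an alphabet that
omits the easily confused characters 0, O, I, and l. The function names are my
own, and this is an illustration rather than Bitcoin's reference code.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def _checksum(data: bytes) -> bytes:
    # First four bytes of SHA-256 applied twice.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]

def b58check_encode(version: int, payload: bytes) -> str:
    data = bytes([version]) + payload
    data += _checksum(data)
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    # Each leading zero byte is written as a leading "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out

def b58check_decode(text: str):
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)  # ValueError for characters outside the alphabet
    pad = len(text) - len(text.lstrip("1"))
    data = b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(data) < 5 or _checksum(data[:-4]) != data[-4:]:
        raise ValueError("checksum mismatch: address was mistyped or altered")
    return data[0], data[1:-4]

# Round trip with a made-up 20-byte payload.
address = b58check_encode(0, bytes(range(1, 21)))
assert b58check_decode(address) == (0, bytes(range(1, 21)))
```

The checksum catches transcription mistakes, but it does nothing against the
substitution attack: an attacker's address is a perfectly valid Base58Check
string.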
Requiring OOB communication to use a system opens it to significant
centralization pressure by ceding control over that part of the communication
regime to an outside solution provider. Outside solution providers usually
invest in centralized solutions for automating and streamlining the OOB
portions of system communication. This is why Coinbase exists. This is also
partially why GitHub exists. Relying on centralized third parties to enable the
full capabilities of a distributed system, such as Bitcoin (Coinbase) and Git
(GitHub), hurts the overall decentralization of the system. It ultimately
affects its independence and limits its resistance to attack because these
central systems are really in control of user access.
Collaboration is whether the nodes can work together to provide a public
service such as search. Whether the service is a search function or packet
routing, it is difficult to design a fully decentralized solution for
collaboration without affecting privacy. This is an area of active research.
The last part of coordination is corroboration. Corroboration is whether the
nodes share data with each other that supports decentralized solutions for
other problems. Reputation systems fall under this part of coordination. How a
reputation system is designed directly affects trust, privacy, membership, and
potentially even coherence and discovery. There has been some research in
decentralized corroboration. However, most systems designers find the problem
too difficult and instead build centralized solutions.
Like coordination, the membership problem has many facets to it. If a system is
designed for user sovereignty, then participation is entirely at the
discretion of the user. Fully decentralized systems have no way of preventing
an arbitrary user from participating in the system. Therefore, they allow users
to create cliques that have isolated, private communication and interactions.
No non-member can participate in the clique or even observe that the clique
exists.
Membership isn’t just about group formation and protection. It also deals with
preventing the correlation of the nodes in a group. The fact that a group of
users are associated and communicate is often many times more valuable to an
attacker than the information communicated. Fully decentralized systems allow
for the formation of these groups without disclosing to any observers who is
connecting to whom. Alice must be able to join with Bob and Charlie in such a
way that Mallory cannot observe the group formation, nor can she enumerate all
of the members of the group.
The true value of any distributed system is to maintain some persistent state
for the system as a whole. For instance, distributed file systems such as
Tahoe-LAFS store a set of files spread out over a number of nodes such that
the failure of a subset of nodes does not affect the availability of the
data. Bitcoin distributes a copy of the Bitcoin blockchain to every full
node in the network and therefore has a fully decentralized persistent state
solution. To be fully decentralized, all nodes in a network must be able to
reproduce the full persistent state set. When applied to distributed systems
other than blockchains, it is possible to create “metastability” where no one
node is online all of the time but enough nodes are part of the system that
the probability that at least one is online at any given time approaches
certainty.
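The metastability claim can be put in numbers. The sketch below assumes, for
simplicity, that every node is online independently with the same probability
`p`; real nodes are correlated by time zone and habit, so treat the result as
an optimistic estimate. The function names are illustrative.

```python
import math

def availability(n: int, p: float) -> float:
    """Probability that at least one of n independent nodes is online."""
    return 1.0 - (1.0 - p) ** n

def nodes_needed(p: float, target: float) -> int:
    """Smallest n whose availability reaches target, for 0 < p < 1 and 0 < target < 1."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))

# If each node is online only 10% of the time, how many nodes give 99% availability?
n = nodes_needed(0.10, 0.99)
print(n, availability(n, 0.10))
```

Because the chance that all nodes are offline shrinks geometrically with `n`,
even rarely online nodes add up to a state that is almost always reachable.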
Those are the nine problems. So how do they fit into user sovereignty? For any
distributed system the programmers/architects have to decide how they will
solve each one. There are a number of solutions for these problems with some
being more centralized and power-asymmetric and others that are decentralized
and user-sovereign. For a system to be fully user-sovereign it has to have a
decentralized solution for every problem.
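The rule can be written down as a small check. This is a sketch under stated
assumptions: the three-valued `Solution` type flattens what is really a
spectrum, the problem names follow my reading of the sections above, and the
example design is hypothetical rather than a description of any real system.

```python
from enum import Enum

class Solution(Enum):
    MISSING = 0        # the design leaves the problem unsolved
    CENTRALIZED = 1
    DECENTRALIZED = 2

NINE_PROBLEMS = (
    "discovery", "introduction", "coherence", "public services", "trust",
    "privacy", "coordination", "membership", "persistent state",
)

def sovereignty_gaps(design: dict) -> list:
    """List the problems where a design gives up user sovereignty."""
    return [
        problem for problem in NINE_PROBLEMS
        if design.get(problem, Solution.MISSING) is not Solution.DECENTRALIZED
    ]

def is_fully_decentralized(design: dict) -> bool:
    return not sovereignty_gaps(design)

# Hypothetical design: decentralized everywhere except two problems.
design = {problem: Solution.DECENTRALIZED for problem in NINE_PROBLEMS}
design["discovery"] = Solution.CENTRALIZED
del design["introduction"]  # no solution provided at all

print(is_fully_decentralized(design))
print(sovereignty_gaps(design))
```

A missing solution counts against the design exactly like a centralized one,
which mirrors the argument that a non-solution invites outside capture.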
There is one other thing that is really interesting about this model: it
predicts massive economic/business opportunities. When a popular distributed
system does not have a solution for one of the problems, it creates an opening
for a corporation to capture the users and make money. Companies may follow the
principles of user sovereignty and offer paid edge services that won’t threaten
the overall system or violate the users’ trust. However, companies may also
build centralized walled gardens and threaten the autonomy of the users and the
entire ecosystem that depends on the system.
▪ .
.
█▀▀ █░█ █▀▀ █▀▀ █▀█ █▀ █
█▄▄ █▀█ ██▄ ██▄ █▀▄ ▄█ ▄
. ▛ ╿ ▋
▎ ▪ ╵ ▎ .
▏ ╵ ▏
▏
▎