💾 Archived View for zaibatsu.circumlunar.space › ~krixano › phlog › Internet_P2P_ZeroNet_Explanation… captured on 2020-09-24 at 01:57:09.
-=-=-=-=-=-=-
This is a collection of back-to-back comments (on the
ZeroNet Ted Talk Video [1]) that I wrote to a person who
had questions about ZeroNet. She seemed to not be very
technical, which is why I explain a lot of things, starting
from the Internet, how information is transferred over the
internet, DNS servers and their use, then going into P2P
and ZeroNet, then explaining Public Key Cryptography and
how it's used in ZeroNet, and finishing on User content in
ZeroNet and the concept of users on the platform. This
explanation I gave is probably not perfect or 100%
accurate, but I did the best I could with my knowledge.
ZeroNet is not really solely about anonymity, it just
*allows* this. Firstly, the internet obviously is the
ability for computers to communicate to each other
through the air (wifi) and wires spread throughout the
world, and through "routers" (not necessarily the exact
same as the router you have in your home, but basically
almost the same), which are just computers that "route"
(send off in the correct direction) information so they
end up going to the correct final destination - another
computer. Really, the "cloud" doesn't exist. The cloud
is just storing your information on servers that are
accessible via the internet - these computers/servers are
often duplicated/backed-up to multiple computers/servers in
case of computer failures/problems. The web is not the same
as the internet. The web is really just "documents" (more
accurately, text files of HTML code) on servers that people
who use a computer can "download" by telling that server
to send over the website "documents" via the internet
(you certainly don't need the internet to do this, btw)
- this is what your browser does. The problem is, then,
how browsers know what computer they must get information
from based on the URL - that's provided by DNS (Domain
Name System) Servers - which are just more computers that
store domain names (the main part of a URL) and the IP
address associated with each name. The IP address can then
be used by these routers throughout the world to know the
final destination of the information. Your browser first
consults a DNS server - it knows the IP address of the DNS
server automatically usually given by your ISP - to get
the IP address of the server, then sends a "request" over
the internet to that server to tell that server to send
back a "document", which is the webpage, and your browser
eventually gets this webpage/document and displays it.
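Here's a rough sketch of those two steps (look up the IP,
then ask that server for the document) in Python. It's a toy
model, not real networking: the domain name, the IP address,
and the DNS_TABLE / SERVERS dictionaries are all made up for
illustration.

```python
from urllib.parse import urlsplit

# Toy "DNS server": maps a domain name to an IP address (made-up values).
DNS_TABLE = {"example.test": "203.0.113.7"}

# Toy "internet": maps an IP address to the documents that server stores.
SERVERS = {"203.0.113.7": {"/index.html": "<h1>Hello</h1>"}}


def resolve(domain):
    """Ask the toy DNS table which IP address belongs to a domain name."""
    return DNS_TABLE[domain]


def fetch(url):
    """Do what a browser does: resolve the name, then request the document."""
    parts = urlsplit(url)
    ip = resolve(parts.hostname)      # step 1: DNS lookup
    return SERVERS[ip][parts.path]    # step 2: request the document


page = fetch("http://example.test/index.html")
print(page)
```

Notice that if the one entry in SERVERS disappears, the page
can't be fetched at all.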
But there's a big problem with this. This is where ZeroNet
and P2P come in. The problem is these servers or groups
of servers that store the "documents" for the webpages
of a website are single-points-of-failure. There's only
one server or group of servers that send out the webpages
to users. A government, ISP, Hacker, etc. can easily shut
down this server (or group of servers). This has happened
often with Hackers DDoSing websites. This has happened with
governments shutting down websites. This has likely also
happened with ISP's shutting down websites, for example
if you aren't paying for a business package with your ISP.
The way ZeroNet handles this is by using P2P. First,
we must understand *exactly* what P2P is and how it
works. At first, it's simple, but there are some more
complicated problems with it that have been solved in
various ways. Let's start with the simple. Previously,
we had URL's that send a request to a *specific* server
(its IP address given by a DNS server) to tell it
to send us a webpage, which our browser receives and
displays for us. But P2P works differently. It starts
with *one* computer. It has, for example, a movie. They
use a program to allow people to get this movie with
torrent clients. They get a *magnet* link (the equivalent
of Website URLs, but for torrents). When people use this
magnet link with their torrent "client" (client = a program
that first sends a request over the internet to a server,
receives the information requested from the server, and
does something with that information. Ex: a browser -
chrome, firefox, IE, Edge, etc), their client looks for
"peers" of this file - peers are other computers that have
previously downloaded the file with their torrent client
and are allowing people to download it from them. Since
this one computer is the only one that currently has the
file, this person downloads the file from that computer.
Then, once they download it, they become a "seeder" -
another computer that can send other computers this same
file. Therefore, there are now 2 "seeders". The computer
that originally had the file, and this other computer
that downloaded it from that original computer. When more
computers ask for the file, they can download it from
*either one* of these computers. When they are finished,
they become a "seeder" - other computers can download
the file from them. Essentially, users are clients and
servers at the same time. But there's a problem - How do
new downloaders know which computers have the file they
are looking for? This would be essentially equivalent to
a DNS server for URLs on the web. Well - this problem is
solved in multiple ways. One of them is called "trackers"
- they are servers that *track* the IP addresses of the
computers that are seeding a file. When a person starts
seeding a file, they tell this tracker that they are
doing so, so that the tracker can store that computer's
IP address. When new downloaders want to download the
file, they ask this tracker server what IP addresses are
currently seeding the file. They get the IP address, send
a request to that computer telling it to send the file
to them, then they receive it. Then, they become seeders,
tell the tracker they are now seeding, and other computers
who want the file can download from them. Additionally,
there can be multiple tracker servers, and your client
would ask for IPs from *all* of them.
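The tracker idea can be sketched in a few lines of Python.
This is only a toy model under my own assumptions: the
Tracker class, its announce/get_peers methods, and the IP
addresses are hypothetical names for illustration, not the
real BitTorrent tracker protocol.

```python
from collections import defaultdict


class Tracker:
    """Toy tracker: remembers which IP addresses are seeding which file."""

    def __init__(self):
        self.seeders = defaultdict(set)

    def announce(self, file_id, ip):
        # A seeder tells the tracker "I am seeding this file".
        self.seeders[file_id].add(ip)

    def get_peers(self, file_id):
        # A downloader asks "who is seeding this file?"
        return set(self.seeders.get(file_id, ()))


def find_peers(trackers, file_id):
    """Ask *all* known trackers and combine their answers."""
    peers = set()
    for tracker in trackers:
        peers |= tracker.get_peers(file_id)
    return peers


t1, t2 = Tracker(), Tracker()
t1.announce("movie", "198.51.100.1")   # the original computer
t2.announce("movie", "198.51.100.2")   # someone who downloaded it earlier
peers = find_peers([t1, t2], "movie")

# After downloading, the new computer announces itself as a seeder too.
t1.announce("movie", "198.51.100.3")
```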
But there's another problem - one that's also
been solved. Trackers end up having a similar
single-point-of-failure problem, where a tracker can easily
be shut down. Sure, you can still use other trackers, but
you'd also need to find a way to get other trackers. So
there's a way to reduce reliance on a central tracker
server - Peer Exchange. When your torrent client gets IPs
of computers seeding a file, your client can store the IP
addresses of each of these peers.
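A small sketch of storing peers and swapping them with
another peer, in Python. The Peer class and its method names
are hypothetical and only illustrate the idea; the real Peer
Exchange messages look different.

```python
class Peer:
    """Toy peer that remembers the IP addresses of other peers."""

    def __init__(self, ip):
        self.ip = ip
        self.known_peers = set()

    def learn(self, ips):
        # Store new addresses, but never list ourselves as a peer.
        self.known_peers |= set(ips) - {self.ip}

    def exchange_with(self, other):
        # Each side shares the peers it knows about, plus itself.
        mine = self.known_peers | {self.ip}
        theirs = other.known_peers | {other.ip}
        self.learn(theirs)
        other.learn(mine)


a = Peer("198.51.100.1")
b = Peer("198.51.100.2")
a.learn(["198.51.100.3"])   # a got this address from a tracker
a.exchange_with(b)          # b now learns it without asking any tracker
```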
Because of this, you can directly tell any peer you find
about the other peers you have stored. This way these
other computers don't need to necessarily use the tracker
server to know about these peers - you are exchanging
peers between peers you know about. Peer Exchange is
also called PEX. Now, this certainly reduces usage of
trackers, but doesn't completely get rid of it. Remember
that trackers are kinda single points of failure. When a
tracker goes down, you have one less source for finding
seeders of a file you want to download. So, we need a
way to get peers without relying on one central server
(or a collection of central servers). This is where DHT
comes in. Before, we were able to solve the problem of
clients relying on *one* server for a website or file, to
relying on multiple servers for a file by making a user who
downloads a file also a server. The same idea is used by
DHT with trackers. DHT makes users tracker servers. But the
essential component is that the list of seeders of files is
"distributed" - split up and copied, with overlap, across
these other trackers (users that act as trackers on DHT are
called *nodes*). No single *node* has to hold the *whole*
list, but each part of it is held by several nodes. If one
node goes down, you can easily use another node *to get the
same information*. You can read more
about DHT elsewhere, as it'd be too much to explain here.
So, how does ZeroNet use these ideas, and why? Firstly,
ZeroNet uses the concept of torrents, trackers, and Peer
Exchange, to ensure that there's no single point of failure
for websites, and so that you can get the exact same
website from a peer of that website. With this, a hacker,
government, ISP, etc. cannot take down a website by taking
down the original server. Instead, they would have to take
down every single peer - which is very unlikely to happen
as long as there are enough peers/seeders of the website.
So, when a person visits a website for the first time,
the ZeroNet *client* (which acts like a torrent client
with additional features specific to ZeroNet) asks a
tracker for a list of IPs of computers that are currently
seeding all of the files of the Website (Called "Zites"
in ZeroNet for ZeroNet Site). Then, when they are done
downloading this zite, they become a seeder which other
people can download from. Of course they also tell the
tracker they are seeding the zite. But, there are several
problems with this - which have of course been solved.
Firstly, how does one start a website on ZeroNet? They
create the website, then they tell a tracker they are
seeding that new "zite". When other people want to download
the zite, they download it from that original computer
(as long as it's online), then they become a seeder for
other people to download from.
Next, How does the original owner update a zite? They
make the changes to the zite, then they ask the tracker
for peers, and they tell these peers (of the zite) there's
an update to the zite, then it sends out the updated file
to these peers. Then, these peers get the updated files
and send them on to other peers (of the zite) that they
find. In this way, the updated files are distributed
throughout all of the peers of a zite currently on the
network.
What if a peer was offline? When they go online, they check
peers of the website that are currently online to see if
there has been an update. If so, they ask peers for the
updated files.
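The update flow above (push to peers, peers pass it on,
offline peers catch up later) can be sketched like this in
Python. It's a simplified model under my own assumptions:
the ZitePeer class and the plain version counter are
hypothetical, and signature checks are left out entirely.

```python
class ZitePeer:
    """Toy peer of one zite, holding a version number and the zite files."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.version = 0
        self.files = {}
        self.neighbours = []

    def receive_update(self, version, files):
        # Ignore updates while offline, or if we already have this version.
        if not self.online or version <= self.version:
            return
        self.version, self.files = version, dict(files)
        # Pass the update on to the peers we know about.
        for peer in self.neighbours:
            peer.receive_update(version, files)

    def come_online(self):
        # Ask online peers whether they have something newer.
        self.online = True
        for peer in self.neighbours:
            if peer.online and peer.version > self.version:
                self.version, self.files = peer.version, dict(peer.files)


owner, p1, p2 = ZitePeer("owner"), ZitePeer("p1"), ZitePeer("p2")
owner.neighbours = [p1]
p1.neighbours = [owner, p2]
p2.neighbours = [p1]

p2.online = False                                  # p2 misses the update
owner.receive_update(1, {"index.html": "new text"})
version_while_offline = p2.version                 # still the old version

p2.come_online()                                   # p2 catches up from p1
```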
Now, we are going to get into how the verification
component works. Firstly, the problem is what if peers
change the files of a website they are seeding? Other
projects use hashes to ensure that the file you are
downloading matches that hash. ZeroNet has to send
*multiple* files and has to allow the owner of the zite to
*update* the zite. So, ZeroNet uses what's called "public
key cryptography" - this is highly based on math tricks
(for example, prime numbers or elliptic curves) and is
quite clever...
So, I'm dumbing this down greatly because it would take too
long to explain. But the point of public key cryptography
is that you have a "public key" and a "private key". These
keys are seemingly random numbers and letters. The idea
is that you can use your private key with, for example,
a message to your fans to encrypt this message (this is
normally called "signing" the message). Since
you encrypted the message with your private key, the
message can only be decrypted with your public key. If
your private key is kept completely private, fans know
that when they decrypt the message with your public key
(your public key is public, hence the name) that it came
from your private key - because no other private key can
encrypt the message and still allow your public key to
decrypt it. If fans want to send a message *only to you*,
they can encrypt the message with your public key and they
know that only your private key can decrypt the message -
so only you can see that message. ZeroNet relies on this
idea. There are multiple methods of doing this encrypting
and decrypting, RSA being the most popular afaik. BitCoin
uses a different one, based on elliptic curves (ECDSA),
and this is what ZeroNet means by "BitCoin Cryptography":
it uses the same method as BitCoin does. You can find more
info about cryptography and
public key cryptography with this Khan Academy series:
https://www.khanacademy.org/computing/computer-science/cryptography
Firstly, ZeroNet URLs are derived from a public key. The zite owner
keeps the private key to themselves. When they make
the zite, they create a "Digital Signature" - a Digital
Signature, in the case of ZeroNet, is a hash of *one*
file (more about this later) encrypted with the private
key. The public key can decrypt this signature to get
the hash of this one file. This one file is called the
"content.json file". It stores the hashes of all of the
other files of the zite as well as other information about
the zite. Whenever there's an update, a new signature is
created. When people download the zite from peers, they
check this signature.
So what are hashes and why are they useful in signatures? A
hash is a way to convert a file into a relatively small
(not really small by counting standards) number. Each
unique file has, for practical purposes, its own unique
"hash". Therefore, if you
know the hash of a file, you can check that the file hasn't
been modified by hashing it and checking that hash against
the hash you already know. By decrypting this hash with the
public key of the zite, we know it must come from the zite
owner - this is how a signature works. The client can then
compare this hash with the hash computed by the client.
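Here's a toy sketch of the whole check in Python: hash the
files, list the hashes in one file, sign that file, and
verify it on the downloading side. Big caveats: this uses
textbook RSA with tiny primes (61 and 53), which is
completely insecure and is *not* the method ZeroNet uses. It
only matches the "encrypt the hash with the private key,
decrypt with the public key" picture above. The layout of
the content file and all function names are my own
simplification.

```python
import hashlib
import json

# Toy RSA key pair from the primes 61 and 53. Insecure, illustration only.
N, E, D = 3233, 17, 2753   # public key: (N, E); private key: D


def file_hash(data):
    """Hash a file's bytes into a fixed-size hex string (SHA-256)."""
    return hashlib.sha256(data).hexdigest()


def hash_as_number(data):
    # Shrink the hash so it fits the tiny toy key (real keys are far larger).
    return int(file_hash(data), 16) % N


def sign(data, private_key=D):
    """'Encrypt' the hash with the private key: this is the signature."""
    return pow(hash_as_number(data), private_key, N)


def verify(data, signature, public_key=E):
    """'Decrypt' the signature with the public key and compare the hashes."""
    return pow(signature, public_key, N) == hash_as_number(data)


# Owner side: list the hash of every file in one file, then sign that file.
files = {"index.html": b"<h1>My zite</h1>"}
listing = {"files": {name: file_hash(data) for name, data in files.items()}}
content_json = json.dumps(listing, sort_keys=True).encode()
signature = sign(content_json)

# Downloader side: check the signature first, then each file's hash.
signature_ok = verify(content_json, signature)
listed = json.loads(content_json)["files"]
file_ok = file_hash(files["index.html"]) == listed["index.html"]
tampered_ok = file_hash(b"<h1>Hacked</h1>") == listed["index.html"]
```

The point of signing just the one listing file is that every
other file is covered through its hash in that listing.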
But, there are some more problems (which have been solved,
of course). The current setup that I've just described only
allows for information by the owner of the zite to be in
a zite. How do we allow users to add information to the
zite as well? And how exactly are users handled? This is
where ZeroNet differentiates itself from other projects,
namely Beaker Browser and IPFS. ZeroNet achieves this
using files, aggregation of the information in these files,
and more public key cryptography for users.
So, the main problems with allowing users to add
information to a zite is 1.) distributing (copying
completely to all peers of the zite) a database that can
be added to by multiple peers *without peers overwriting
each other*, and 2.) Verifying that user information isn't
modified by other people.
To further explain problem 1 - If there was, for example,
one file that anyone can write to to add user information
to a zite, if two people modify it at the same time and
then tell other peers to get this newly updated file and
send it throughout the network, there will be two different
versions of this file. The version that will win out will
be the one that was sent out to peers last - which means
the other version will be overwritten or just disappear
and their information will not be added to the file. The
way ZeroNet fixes this is by designating a file per "user"
of the zite (more about users later).
Then, this user would just update their own file, so there
can never be two versions of that file trying to be sent
over the network.
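To make the overwrite problem concrete, here's a tiny Python
sketch comparing one shared file with one file per user. The
names and the "last one to arrive wins" function are just my
illustration of the idea described above.

```python
def last_writer_wins(arrivals):
    """One shared file: whichever version arrives last replaces the rest."""
    current = None
    for version in arrivals:
        current = version
    return current


# One shared file: Alice and Bob both start from the same copy and add a post.
original = ["welcome"]
alice_copy = original + ["alice: hi"]
bob_copy = original + ["bob: hello"]
shared_result = last_writer_wins([alice_copy, bob_copy])
# shared_result is Bob's copy, so Alice's post has disappeared.

# One file per user: each user only ever replaces their own file.
user_files = {}
user_files["alice"] = ["hi"]
user_files["bob"] = ["hello"]
all_posts = sorted(post for posts in user_files.values() for post in posts)
```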
But that's where problem 2 comes in, how do peers know that
a "user"s file wasn't manipulated by someone other than the
user. This is where the distinction of the user comes in. A
user in ZeroNet is just like users on regular websites, but
there are very important distinctions. Firstly, each user
has a public and private key (the private key of course
being kept private only by the person who owns the user).
Secondly, ZeroNet relies on these public and private keys
instead of passwords. Why? Because there's no central
server to store user information, firstly. Secondly,
it's more secure. Thirdly, it can use the private keys
to create signatures. The way this works is the same as
how zite owners can create a signature with their private
key. The signature is a hash of *one* file encrypted with
the private key. This file contains the hash of the file
that contains the user information that user added. When
it's updated, a new signature is created, the user tells
peers about the update, sends the update to those peers,
and those peers send to other peers until the whole network
has the updated file. Once that happens, all the user
information is taken from all these user files and put
into a database, which is *not* seeded (shared/distributed
over the network - the user files already handle this)
but is accessible from the website as a regular database
that can be queried with SQL.
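The last step (collecting all the user files into a local
database you can query with SQL) might look roughly like
this in Python with SQLite. The file contents, the "posts"
field, and the table layout are hypothetical; I'm only
showing the idea of aggregating per-user files.

```python
import json
import sqlite3

# Pretend these are the per-user files that were downloaded from peers.
user_files = {
    "alice": json.dumps({"posts": ["Hi everyone"]}),
    "bob": json.dumps({"posts": ["Hello", "Nice zite"]}),
}

# The database lives only on this computer; it is rebuilt from the files.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE post (user TEXT, body TEXT)")

for user, raw in user_files.items():
    for body in json.loads(raw)["posts"]:
        db.execute("INSERT INTO post VALUES (?, ?)", (user, body))

# The zite can now ask questions with ordinary SQL.
rows = db.execute(
    "SELECT user, COUNT(*) FROM post GROUP BY user ORDER BY user"
).fetchall()
```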
So that's how ZeroNet achieves websites that are both
dynamic and distributed (meaning multiple peers have the
zite in its entirety). The benefits are that it eliminates
the single-point-of-failure (there's multiple peers that
you can get a website from, rather than just one) which
reduces the likelihood of a hacker, government, ISP,
etc. shutting down a website on ZeroNet. This same
concept is applied to various other projects, including
federation.
But, there's one more component that I haven't talked about
yet: anonymity. ZeroNet achieves this by relying on other
projects that implement this. The one ZeroNet currently
uses is Tor, but Cjdns and I2P can also be used in the
future (Cjdns also takes care of additional problems as
well - other routing problems). To explain a little of how
Tor works, we should go back to the very first comment
I made, where I said how information over the internet
is "routed" by "routers" to the correct destination. A
router is a computer that sends off your information to
another router, or if it can send directly to the desired
computer, that computer. With this, every single router
your information goes to knows the IP address of the
computer that sent it. Tor is able to make sure all of
these computers don't know the IP address that sent the
information - so there's no way to locate you as they won't
be able to get your IP address. I'm not going to explain
how all of this works because I'm not very knowledgeable in
this area, but I can tell you that this concept is called
"Onion Routing", which hopefully makes it easier to look
up more information on this.
Now that I've explained all of this, there are additional
things I'd like to make sure people note about ZeroNet:
1.) There are blocklists in case you accidentally go to
a zite that's bad. You can add to a blocklist and you can
share it. When it's updated, the people who are using the
blocklist get the updates. This is like blocklists for
torrent clients.
2.) You can "mute" users - your client will *never*
download or seed the files/user information of that user.
3.) The zites you seed are only the zites you download. You
have a choice in which zites you download. This is no
worse or better than the internet, where the exact same
thing applies. The only difference is the zites you do
download you seed to other people. If you no longer want
to seed a zite, you can delete the zite from your computer.
4.) ZeroNet does *not* use BitCoin. It uses the same
*method of cryptography* that BitCoin uses.
5.) Tor is not required, but useful.
6.) ZeroNet does not yet use DHT, but this is planned.
7.) The main difference between ZeroNet and other projects
is how it allows for *dynamic* websites (users having the
ability to post on zites).
A few more things. Trackers keep track of which peers
are serving what zites. *But*, how can you get to know
the addresses/URLs of these zites? This works *exactly*
like the regular web. I can put a link/url to another
website in this comment. When you click on this link, your
browser asks the DNS server the IP address associated with
the link, then sends a request to the server associated
with the IP address to send the webpage back. The
browser eventually receives this webpage and displays
it. This is one example of sharing a link to a website
with someone. The link isn't the content. By sharing the
link on here, I'm not also sharing all of the content -
you still need to ask the server associated with the link
to send back the webpage.
Google doesn't search the internet or the web. What it
does is search *what it knows about the web*. Google
can only search the list of URLs it has stored on its
servers. So how does Google actually get a list of all
these websites? Well, it does what is called "crawling". It
looks at all of the websites it currently knows about
searching for links to new websites and stores them. Then
it looks at those websites and stores new links it finds
there. This process continues constantly to make sure the
index is continually updated. Google also lets you send
them links.
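Crawling can be sketched in a few lines of Python. This is a
toy version that walks a made-up link graph (the WEB
dictionary and the site names are invented for illustration)
instead of downloading real pages.

```python
from collections import deque

# Toy web: each site lists the sites it links to (made-up names).
WEB = {
    "a.test": ["b.test", "c.test"],
    "b.test": ["c.test"],
    "c.test": ["a.test", "d.test"],
    "d.test": [],
}


def crawl(start):
    """Follow links from a known site and index every site that is found."""
    index = []
    seen = {start}
    queue = deque([start])
    while queue:
        site = queue.popleft()
        index.append(site)
        for link in WEB.get(site, []):
            if link not in seen:      # only visit each site once
                seen.add(link)
                queue.append(link)
    return index


found = crawl("a.test")
```

A site that nobody links to (and nobody submits) never ends
up in the index, which is why the index is only "what the
crawler knows about".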
ZeroNet suffers from the same problem. We have zite
addresses. Your ZeroNet Client asks a tracker for the peers
of a given zite address, and then it downloads the content
from those peers if needed. However, we still need a way
of knowing what the zite addresses are, just like we need
to know the address of a website on the regular internet
to view it. You can tell people the Zite address, post
it on websites or zites, or have a crawler that crawls
ZeroNet Zites and keeps an index of these addresses that
can then be made into a ZeroNet search engine that people
can search. There are several ZeroNet search engines
(Kaffiene, Horizon, EulerFinder), & many of them crawl.
[1] https://www.ted.com/talks/tamas_kocsis_the_case_for_a_decentralized_internet/discussion