
Gemini Archiving and WARC

Dr. Otto Skrzyk drskrzyk at tilde.team

Fri Sep 4 06:23:21 BST 2020

- - - - - - - - - - - - - - - - - - - 

On Thu, Sep 03, 2020 at 11:54:08PM -0400, Caranatar wrote:

This seems like an incredibly cynical and myopic take. It's also
expected that everything on the internet will track you, will be
constantly expanded for the purpose of commercialization instead of user
experience, etc.... Yet Gemini purposefully rejects those notions in
favor of something better. The idea that the same shouldn't apply here
is odd.
-caranatar

Calling it myopic is a bit harsh and probably misses a point that you put forward as support: one of the selling points of Gemini is that it rejects complexity and some of the concerns of a more commercialized internet. One of those concerns is the potential for misuse of the information or infrastructure beyond the intent of the content creator or host. That, or the right to retract that information.

You'll have to forgive me for seeing some irony in someone with a riseup.net email address speaking against someone offering advice about taking caution in what you post on the internet. Riseup exists in large part because others share this "cynical and myopic take."

Regardless, the issues being brought up here seem to circle around content control and archival ethics, and less around the protocol.

Tom writes:
On Wed, 02 Sep 2020 01:23:22 +0000
acdw <acdw at acdw.net> wrote:
On 2020-09-01 (Tuesday) at 23:43, Charles E. Lehner
<cel at celehner.com> wrote:
Hi Gemini List,
Has anyone thought about, or implemented, archiving of Gemini
content/traffic?
WARC (Web ARChive)¹ is a standard format used for web archiving. It
uses text headers for metadata like in HTTP and email. It looks to
me like WARC could be adapted for Gemini. The WARC spec supports
multiple URI schemes, although it doesn't specify any other than
http/https, ftp, and dns². Bespoke formats could also be used, of
course, or just downloading files wget-style, but using a standard
format could allow for interop with "the WARC ecosystem"³.
Archive Team⁴ has also worked on archiving non-HTTP protocols like
FTP⁵ and Gopher⁶.
I think there is an opportunity for people to maintain high-quality
archives of Gemini content, like what the Internet Archive⁷ and
archive.today⁸ do for the HTTP(S) Web. Now is a good time to start,
while many of the original Gemini hosts⁹ are still online.
Regards,
Charles E. Lehner
¹ https://en.wikipedia.org/wiki/Web_ARChive
² https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.1/#ftp-scheme
³ https://www.archiveteam.org/index.php?title=The_WARC_Ecosystem
⁴ https://www.archiveteam.org/
https://en.wikipedia.org/wiki/Archive_Team
⁵ https://www.archiveteam.org/index.php?title=FTP
⁶ https://www.archiveteam.org/index.php?title=Gopher
⁷ https://en.wikipedia.org/wiki/Internet_Archive
https://archive.org/
⁸ https://archive.today
https://en.wikipedia.org/wiki/Archive.today
⁹ gemini://gemini.circumlunar.space/servers/
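
To make the WARC suggestion above concrete, here is a minimal sketch of how one captured Gemini response might be wrapped in a WARC/1.1 "response" record. The version line, the named header fields, and the CRLF framing follow the WARC format. The function name build_warc_record and the Content-Type value "application/gemini; msgtype=response" are assumptions made for illustration only; the WARC spec defines no media type for Gemini traffic.

```python
import uuid
from datetime import datetime, timezone

CRLF = b"\r\n"


def build_warc_record(target_uri: str, payload: bytes, when: datetime) -> bytes:
    """Serialize one WARC/1.1 'response' record around raw captured bytes.

    `payload` is the response exactly as received from the server,
    i.e. the Gemini header line followed by the body.
    """
    headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        # Assumption: hypothetical media type, modelled on the
        # "application/http; msgtype=response" value used for HTTP captures.
        ("Content-Type", "application/gemini; msgtype=response"),
        ("Content-Length", str(len(payload))),
    ]
    head = b"WARC/1.1" + CRLF
    for name, value in headers:
        head += f"{name}: {value}".encode("utf-8") + CRLF
    # A blank line ends the header; two CRLFs terminate the record.
    return head + CRLF + payload + CRLF + CRLF


# Example capture: a Gemini success header ("20 <meta>") plus a short body.
raw = b"20 text/gemini\r\n# Hello\n"
record = build_warc_record(
    "gemini://example.org/",
    raw,
    datetime(2020, 9, 4, 5, 23, 21, tzinfo=timezone.utc),
)
```

Storing the raw response bytes unmodified mirrors how HTTP captures are kept in WARC files. Whether existing WARC tooling would accept a gemini:// target URI and this content type is untested here.
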
I personally think this is a great idea, but I know some might not be
so on-board with it. I'm thinking of solderpunk's post (in their
gopherhole, actually):
gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/the-individual-archivist-and-ghosts-of-gophers-past.txt
So is there a way to opt-out of archiving for publishers? Some in the
community might want to know about it, though I personally am of the
opinion that if you've published it, it's now the property of the
commons.
Once you publish something to the internet there is no retracting it.
This is one of the first things I was taught the first time I used the
net. Alongside never using your real name on the net unless you're
publishing something.
--
sent from emacs using mu4e

-- 
Dr. Otto Skrzyk
gemini   : gemini://tilde.team/~drskrzyk
web      : https://drskrzyk.tilde.team/
mastodon : @docskrzyk at hackers.town
