Gemini Archiving and WARC

Hi Gemini List,

Has anyone thought about, or implemented, archiving of Gemini content/traffic?

WARC (Web ARChive) [1] is a standard format used for web archiving. It uses
text headers for metadata like in HTTP and email. It looks to me like WARC
could be adapted for Gemini. The WARC spec supports multiple URI schemes,
although it doesn't specify any other than http/https, ftp, and dns [2].
Bespoke formats could also be used, of course, or just downloading files
wget-style, but using a standard format could allow for interop with "the
WARC ecosystem" [3].
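
As a rough illustration of the idea, a captured Gemini response could be
wrapped in a WARC-style record along these lines. This is only a sketch, not
something the WARC spec defines: the function name build_warc_record, the
example.org capsule, and the Content-Type value "application/gemini;
msgtype=response" (modeled on the one WARC uses for HTTP) are assumptions
made up for the example.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri: str, payload: bytes, content_type: str) -> bytes:
    """Wrap raw response bytes in a WARC/1.1-style 'response' record."""
    headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    # Version line, then named fields, then a blank line before the block.
    head = "WARC/1.1\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # A record ends with two CRLFs after the content block.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


# A Gemini response as it appears on the wire: "<status> <meta>\r\n" plus body.
raw = b"20 text/gemini\r\n# Hello\n"

# Hypothetical content type; WARC does not define one for Gemini.
record = build_warc_record(
    "gemini://example.org/",
    raw,
    "application/gemini; msgtype=response",
)
print(record.decode("utf-8"))
```

Storing the response header line together with the body, as WARC does for
HTTP, would keep the status code and MIME type alongside the content.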

Archive Team [4] has also worked on archiving non-HTTP protocols like FTP [5] and Gopher [6].

I think there is an opportunity for people to maintain high-quality
archives of Gemini content, like what the Internet Archive [7] and
archive.today [8] do for the HTTP(S) Web. Now is a good time to start, while
many of the original Gemini hosts [9] are still online.

Regards,
Charles E. Lehner

[1] https://en.wikipedia.org/wiki/Web_ARChive
[2] https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.1/#ftp-scheme
[3] https://www.archiveteam.org/index.php?title=The_WARC_Ecosystem
[4] https://www.archiveteam.org/
    https://en.wikipedia.org/wiki/Archive_Team
[5] https://www.archiveteam.org/index.php?title=FTP
[6] https://www.archiveteam.org/index.php?title=Gopher
[7] https://en.wikipedia.org/wiki/Internet_Archive
    https://archive.org/
[8] https://archive.today
    https://en.wikipedia.org/wiki/Archive.today
[9] gemini://gemini.circumlunar.space/servers/
