[spec] [tech] Companion Specification Proposal for Metadata

Gary Johnson lambdatronic at disroot.org

Thu Feb 25 23:56:10 GMT 2021

- - - - - - - - - - - - - - - - - - - 

Omar Polo <op at omarpolo.com> writes:

> Thanks for putting into words exactly what I had in mind, way better
> than I could ever do. Your proposal is exactly what I was trying to
> describe in the other thread.
>
> I loved your proposal, but only until here. I think that what follows
> is overly-complicated by the fact that you're trying to provide a way to
> define the meaning of the metadata, something that can be avoided, at
> least in the scope of Gemini.

Hi Omar. I'm not sure I follow you here. Could you provide an example?

My proposal did not (intentionally) associate any meaning with particular metadata fields. I merely wanted to provide a human-readable, Gemtext-format syntax for associating metadata (the bulleted list attribute:value pairs) with resources on a capsule (indicated by link lines).
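A minimal sketch of how a bot might read such a file is shown here in Python. It rests on assumptions that are not spelled out in this message: that each bullet has the form `* attribute: value`, split at the first colon, and that a bulleted list applies to the run of link lines immediately before it. The function name `parse_metadata` and the sample paths are hypothetical.

```python
def parse_metadata(text):
    """Map each link target to the attribute/value pairs listed beneath it.

    Assumed syntax (not confirmed by the proposal text quoted here):
      => /some/path optional label
      * attribute: value
    """
    result = {}
    current = []      # targets of the most recent run of link lines
    in_links = False  # True while we are still inside that run
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("=>"):
            parts = line[2:].split(None, 1)
            if not parts:
                continue  # "=>" with no target: ignore
            if not in_links:
                current = []  # a new run of link lines starts here
                in_links = True
            current.append(parts[0])
            result.setdefault(parts[0], {})
        elif line.startswith("* ") and ":" in line:
            in_links = False
            key, value = line[2:].split(":", 1)
            # Bullets seen before any link line have no target and are dropped.
            for target in current:
                result[target][key.strip()] = value.strip()
    return result


sample = "=> /poems/ode.gmi An ode\n* author: op\n* tags: music punk-rock\n"
metadata = parse_metadata(sample)
```

No meaning is attached to any key here; the parser only records whatever attribute names the author chose.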

Do you have an alternative format that you would like to propose for discussion?

> Let's keep the metadata generic. We'll then start using common keys
> because, well, they're widespread (like Author, Date, ...) or expressive
> enough (`Tags: music punk-rock' is pretty self-explanatory), while still
> allowing authors to add extra fields whenever they want, if they feel
> like it (there are people writing poetry, maybe they want to add metadata
> about the metrics? or about a particular style?)

We are in agreement here. I do not mean to prescribe a list of standardized metadata attributes in this companion spec. My examples used a few that I made up on the spot (i.e., author, last-modified, copyright, tags). I'll leave deciding on "the right set" of attributes to those who actually intend to use metadata.

> (as others pointed out several times in the past, $DOCUMENT_ROOT is not
> something set in stone. We have single-user capsules, multi-user
> capsules with different URL styles -- example.com/~op/ vs
> example.com/users/op vs ... -- etc)

That's a fair point, and one that John Cowan raised in his response as well. Thanks for reminding me of this. In that case, we should discuss how to remedy this issue.

One approach could be to keep the metadata.gmi file at each capsule's document root as I originally proposed. This should be well-defined on a per-capsule basis even on a server hosting multiple capsules in the common pubnix style. It is simply the toplevel directory of your personal capsule (i.e. ~/public_gemini or equivalent for user capsules and whatever server-level document root is specified by the admin who launched it).

This would put the burden on metadata bots to try and find these metadata.gmi files at the appropriate paths under a multi-hosting domain.

Without additional server-provided information, the bots may simply resort to brute force checking every directory path on the domain for a .metadata.gmi file, which could lead to a lot of dead-end network requests.

Instead, I can think of (at least) two ways the server could help the bot.

1. BAD: Aggregate Metadata Up

Even though the visiting bot doesn't know which paths lead to the document roots of our users' capsules, the Gemini server does. At startup time, a metadata-exporting Gemini server could check each user's document root for a .metadata.gmi file. Any that are found could be concatenated together to form a single toplevel gemini://cool.capsule.com/.metadata.gmi file.

However, in order for this to work correctly, the server would need to apply two transformations to each user-level metadata.gmi file before concatenation:

1. All link lines would need to be prefixed by the URL path that the server assigns to that capsule's document root (e.g., /~someuser/).

2. To prevent errant bulleted list attributes at the top of one user's metadata.gmi file (with no prior link lines) from being erroneously applied to the final link lines of the previous metadata.gmi in the concatenation sequence, a single link line for the current capsule's document root (e.g., => /~someuser/) would need to be prepended to the front of each user-level metadata.gmi file prior to concatenation.

These are relatively simple text transformations, but they do place additional burden on server authors, so this isn't my favorite option.
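To give a sense of how small these transformations are, here is a rough Python sketch. It assumes link targets inside a user's file are paths relative to that user's document root (targets containing "://" are left untouched), and that the server can supply a mapping from URL prefix to file contents. The function name `aggregate` and the sample data are hypothetical.

```python
def aggregate(user_files):
    """Concatenate user-level metadata into one toplevel document.

    user_files maps a URL prefix such as '/~someuser/' to the text of that
    user's metadata file (both supplied by the server; assumed here).
    """
    out = []
    for prefix, text in user_files.items():
        # Transformation 2: a guard link line for this capsule's root, so
        # leading bullets cannot attach to the previous user's links.
        out.append("=> " + prefix)
        for line in text.splitlines():
            if line.startswith("=>"):
                parts = line[2:].split(None, 1)
                if parts and "://" not in parts[0]:
                    # Transformation 1: prefix the target with the URL path
                    # the server assigns to this capsule's document root.
                    target = prefix + parts[0].lstrip("/")
                    label = " " + parts[1] if len(parts) > 1 else ""
                    line = "=> " + target + label
            out.append(line)
    return "\n".join(out) + "\n"


combined = aggregate({
    "/~someuser/": "* author: x\n=> poems/ode.gmi Ode\n* tags: a\n",
})
```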

2. GOOD: Allow Metadata to Link to Other Metadata

In this case, we just extend the metadata.gmi parsing rules for bots to say that if any of the link lines that they read in end with .metadata.gmi, then these can and should be followed for further metadata about parts of this site. This doesn't require any other changes to the companion spec as written except for that note.
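On the bot side, that extra rule could look roughly like the Python sketch below. The Python standard library has no Gemini client, so the network request is passed in as a hypothetical `fetch(path)` callable that returns the file's text or None. Link targets are assumed to be paths on the same host, and the name `collect_metadata_files` is made up for illustration.

```python
import posixpath


def collect_metadata_files(fetch, start="/.metadata.gmi"):
    """Follow links ending in .metadata.gmi, breadth first.

    fetch(path) is a caller-supplied (hypothetical) function returning the
    text at that path on the capsule, or None if it does not exist.
    Returns a dict mapping each metadata file's path to its text.
    """
    seen, queue, pages = set(), [start], {}
    while queue:
        path = queue.pop(0)
        if path in seen:
            continue  # guards against metadata files linking in a cycle
        seen.add(path)
        text = fetch(path)
        if text is None:
            continue
        pages[path] = text
        for line in text.splitlines():
            if not line.startswith("=>"):
                continue
            parts = line[2:].split()
            if parts and parts[0].endswith(".metadata.gmi"):
                # Resolve relative targets against the current file's
                # directory; absolute paths are kept as they are.
                base = posixpath.dirname(path)
                queue.append(posixpath.join(base, parts[0]))
    return pages


site = {
    "/.metadata.gmi": "=> /~a/.metadata.gmi\n=> /index.gmi\n* author: root\n",
    "/~a/.metadata.gmi": "=> poem.gmi\n* author: a\n=> /.metadata.gmi\n",
}
pages = collect_metadata_files(site.get)
```

The `seen` set matters: nothing stops two metadata files from linking to each other, and a bot should not loop when they do.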

To make this work, at startup time a metadata-exporting Gemini server could check each user's document root for a .metadata.gmi file. For each such file that is found, the server can append a new link line pointing to that metadata.gmi file (relative to the server's toplevel document root) to its own toplevel $DOCUMENT_ROOT/.metadata.gmi if it exists. If a toplevel $DOCUMENT_ROOT/.metadata.gmi file doesn't exist, the server can create one containing just the links to the users' .metadata.gmi files.

Note that this doesn't even have to happen at server start time. Instead, the server could program $DOCUMENT_ROOT/.metadata.gmi as a dynamic endpoint that checks for user-level .metadata.gmi files whenever it is called, thereby making users' metadata available as soon as the user publishes it to their capsule with no need for a server restart. (This is by far my favorite option.)
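A dynamic endpoint of this kind could be sketched in Python as follows. It assumes the server already knows which URL prefix maps to which user document root on disk, and that its own toplevel metadata (if any) is passed in as text. The function name `render_toplevel_metadata` and its parameters are hypothetical, not part of any existing server.

```python
from pathlib import Path


def render_toplevel_metadata(capsule_roots, own_text=""):
    """Build the body served for $DOCUMENT_ROOT/.metadata.gmi on request.

    capsule_roots maps a URL prefix (e.g. '/~someuser/') to the filesystem
    directory of that user's document root (assumed known to the server).
    own_text is the server's own toplevel metadata, or '' if it has none.
    """
    lines = own_text.splitlines()
    for prefix, root in sorted(capsule_roots.items()):
        # Checked on every call, so newly published user metadata shows up
        # without a server restart.
        if (Path(root) / ".metadata.gmi").is_file():
            lines.append("=> " + prefix.rstrip("/") + "/.metadata.gmi")
    return "\n".join(lines) + "\n"
```

Because the appended link lines come after any existing bulleted attributes, none of the server's own attributes end up applied to them.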

Okay, I think I've answered all your points. What do you think?

Best, Gary

-- 
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

Why is HTML email a security nightmare? See https://useplaintext.email/

Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html