💾 Archived View for rawtext.club › ~sloum › geminilist › 005682.gmi captured on 2024-02-05 at 11:08:10. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Metadata Without A Proposal

nothien at uber.space

Fri Feb 26 12:59:55 GMT 2021

- - - - - - - - - - - - - - - - - - - 

Philip Linde <linde.philip at gmail.com> wrote:

2. Must not be English-specific.
What is the preferable alternative? We could use numbers to indicate
element type, but ultimately numbers are dependent on numeral systems,
which depend on language and culture.

... Did you read the rest of the e-mail? I listed specific ways in which we can support date, author, and license metadata, using existing formats and conventions, none of which use English (for licenses, instead of SPDX license identifiers, Petite Abeille has neatly suggested using links, although I don't agree with their format).

If instead of using English directly, we define opaque strings of
characters for the tags, such that the tag "author" consistently means
"author", we really achieve the same thing. That is a simple solution
that is language independent.
Or we could use emoji, although I believe most computer users in the
world would have a harder time typing out a given emoji than a given
opaque, ASCII- and English-compatible string.

So you want to force non-English Gemini writers to use English words? When (seeing my original proposal) it's unnecessary? Imagine if you had to end every Gemini document with the magic incantation "ίĭئٻɨƁϸͶѠGڧ". That doesn't seem fun.

3. Must be machine-parsable.
We should consider the difference between needs and wants here. If I
have no interest in specifying another license to use my work than what
is implied from my sharing it, that doesn't necessarily mean I don't
want to specify date or author, so perhaps all or most elements should
be optional.

The sole purpose of giving a fixed format to metadata is so that it is machine parsable; all other metadata can simply be stated using natural language. And yes, if you read the rest of my e-mail, you would notice that everything in it is completely optional.

4. Should affect presentation.
gemtext as a whole is about separating content from presentation.
Some of the earlier metadata proposals referred to metadata for
presentation, e.g. to specify a color to view the text in. This is
against the spirit of gemtext/Gemini (if not the spec).
Agreed, but as I understand it you do *not* want it to affect
presentation.

Yep, typo.

5. Must be difficult to extend.
...
What do you propose that prevents conventional use from dictating
reality? And why is it important that the specification can not be
extended? Unlike e.g. text/gemini, if a client doesn't support some
superset of the tags initially specified, there is no degradation. If
in the future we want to extend a meta data format to support e.g.
specifying where, in addition to when, it was written, the clients
that don't support it shouldn't suffer from it.
The only important concern to me is that there is a canonical
description of tags. That description can be extended indefinitely as
far as I'm concerned, for as long as the original meanings of the
initial set of supported tags aren't changed or overloaded by newer
tags.

Non-extensibility is a fundamental part of the spirit of Gemini. We want to prevent metadata from being used for all but the specified purposes so that it is not misused in the future. Consider, for example, a 'color' metadata key that had been suggested early on in the original metadata thread. We want to prevent these kinds of misuses from happening at all. Notice that my proposal-not-proposal handles each metadata field on a case-by-case basis; there is no way provided to handle additional fields. In addition, I've stated that other metadata fields, which don't have to be known to search engines, can use an arbitrary, capsule-specific convention, so that you can use additional metadata fields internally.

I think that instead of defining ourselves what fields are important
we should start from a standard, e.g. DCMI with the element set
defined in IETF RFC 5013.
With that as a basis, if there is no suitable format already, we can
define a human readable, text-compatible data format and a
corresponding text/xyz MIME type. Then, a text/gemini document that
feels like supplying additional metadata can link to a metadata file
which the server serves with the above MIME type. A client that does
not support the MIME type should defer to serving unknown text/* types
as plain text. A client that does support it can localize the
elements, including things like names and date and time formats. If
the client is a crawler, it should find the linked metadata document
as a matter of its normal operation because it is linked from the
document.

This has a few problems:

1. It is extensible. As I've argued above, we don't want extensibility. This would mean that we have to have a very strict format for this metadata file, and given how few fields are really necessary to be machine-parsable, this would be a very small file. With my proposal, we can embed all the necessary metadata into the existing files.

2. The keys specified in IETF RFC 5013 are English-specific. As I've explained in my original mail, this is not sustainable for non-English Gemini clients and writers, as either the writers are forced to use English (bad), or the clients are forced to support the same keywords across a /lot/ of languages (bad).

Personally I don't think this is a standard I would use either way.
It's mostly for the benefit of robots that there's a point in
formalizing information like this. Humans can interpret such
information as indicated in the document itself in a much wider
variety of formats. It's not my intention, primarily, to serve robots.

Many gemlogs use the gmisub format, which is essentially providing date metadata. There are uses, and making your content understandable to 'robots' will also make it understandable to the users behind them. One particularly helpful area that my proposal-not-proposal provides for is basic search engine filtering (by date, author, and license).
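
To make the date case concrete, here is a rough Python sketch of how a crawler might pull dates out of gmisub-style link lines (a link whose label starts with a YYYY-MM-DD date) and filter on them. It is only an illustration under assumptions: the names dated_entries and filter_by_date are made up for this example, the separator stripping is a guess at common practice, and author and license are not covered because gmisub says nothing about them.

```python
import re
from datetime import date

# A gemtext link line whose label begins with an ISO 8601 date, e.g.
# "=> gemini://example.org/post.gmi 2021-02-26 Some title"
DATED_LINK = re.compile(r"^=>\s*(\S+)\s+(\d{4}-\d{2}-\d{2})(?:\s+(.*))?$")


def dated_entries(gemtext):
    """Return (date, url, title) tuples for dated link lines.

    Lines inside preformatted blocks are ignored.
    """
    entries = []
    preformatted = False
    for line in gemtext.splitlines():
        if line.startswith("```"):
            # Toggle preformatted mode; the toggle line itself is never a link.
            preformatted = not preformatted
            continue
        if preformatted:
            continue
        match = DATED_LINK.match(line)
        if not match:
            continue
        url, day, title = match.groups()
        try:
            parsed = date.fromisoformat(day)
        except ValueError:
            # Shaped like a date but not a real one, e.g. 2021-13-40.
            continue
        # Drop a leading separator such as " - " or ": " before the title.
        entries.append((parsed, url, (title or "").lstrip(" -:").rstrip()))
    return entries


def filter_by_date(entries, since, until):
    """Keep entries whose date lies in the inclusive range [since, until]."""
    return [entry for entry in entries if since <= entry[0] <= until]
```

Note that nothing here depends on an English keyword: the only fixed parts are the link marker and the numeric date.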

~aravk | ~nothien