💾 Archived View for rawtext.club › ~sloum › geminilist › 005713.gmi captured on 2023-11-14 at 09:54:42. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Oliver Simmons oliversimmo at gmail.com
Fri Feb 26 21:25:57 GMT 2021
- - - - - - - - - - - - - - - - - - -
On Fri, 26 Feb 2021 at 10:51, <nothien at uber.space> wrote:
I've lost track of the currently raging metadata thread entirely, and so
I've started this as a new post.
Good choice :)
3. Must be machine-parsable.
Search engines, archivers, and other crawler-style clients need to be
attended to. Some of the information they need is: date, author, and
license.
Every form of somewhat organised info is "machine readable"; only sentences and stuff aren't (although machine learning is getting really good, but I don't think anyone wants to use that).
5. Must be difficult to extend.
Again, this comes from the general Gemini philosophy that anything
that can be misused will be misused. This rules out lots of current
proposals because they specify tags, and the usage of tags can only be
controlled by convention, which is subject to change.
We use a text-based format, so this is semi-bogus. I can easily add stuff, such as styling, to my documents *without* a tag format and make software to support it - without extending the spec. Gemini wants to be "non-extensible", but having freeform text breaks that. This is an unfixable problem though, and just a side effect of what Gemini is.
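A quick Python sketch of that point: a client can layer its own styling convention on top of plain text lines without any change to the spec. The `*emphasis*` convention and the `render_line` name are invented here purely for illustration; nothing in gemtext defines them.

```python
import re

# Invented convention: *text* inside a plain text line means emphasis.
EMPHASIS = re.compile(r"\*(\S(?:[^*]*\S)?)\*")

BOLD, RESET = "\x1b[1m", "\x1b[0m"  # ANSI escape codes for a terminal

def render_line(line):
    """Render one gemtext line, honouring the invented *emphasis* convention."""
    # Leave link, heading, list and quote lines untouched.
    if line.startswith(("=>", "#", "* ", ">")):
        return line
    # Wrap each emphasised run in bold on / bold off codes.
    return EMPHASIS.sub(lambda m: BOLD + m.group(1) + RESET, line)

print(render_line("I can add styling *without* a tag format."))
```

Nothing stops any two clients from agreeing on a convention like this, which is why freeform text can never be fully "non-extensible".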
## Dates
My proposal with dates is to use what we already have - the gmisub companion spec. [...]
Search engines
and crawlers can still choose to include date information based on when
they last crawled the page.
This would only really work for things that are looking at sites as a whole, mainly search engines. My issue with these metadata-in-a-separate-location ideas is that they create additional work, and network requests, to get the info about one file. Also, for more one-off things with dates, creating gmisub stuff for it is slightly overboard.
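To make the extra-request cost concrete, here is a rough Python sketch of how a client might recover the date of one page from a gmisub-style index. It assumes the companion spec's convention of link lines whose label begins with a YYYY-MM-DD date. The `dates_from_index` name and the example.org URLs are made up for illustration, and the index text is hard-coded where a real client would need a separate network request.

```python
import re
from datetime import date

# Link line whose label starts with an ISO date, e.g.
# "=> gemini://example.org/post.gmi 2021-02-26 Some title"
DATED_LINK = re.compile(r"^=>\s*(\S+)\s+(\d{4})-(\d{2})-(\d{2})(?:\s|$)")

def dates_from_index(index_text):
    """Map each dated link URL in a gemtext index to its date."""
    found = {}
    for line in index_text.splitlines():
        m = DATED_LINK.match(line)
        if not m:
            continue  # undated link or ordinary text line
        url = m.group(1)
        year, month, day = (int(g) for g in m.group(2, 3, 4))
        try:
            found[url] = date(year, month, day)
        except ValueError:
            continue  # looks like a date but is not a real calendar day
    return found

# Hypothetical index page; fetching it is the additional request.
INDEX = """# Example capsule
=> gemini://example.org/metadata.gmi 2021-02-26 Thoughts on metadata
=> gemini://example.org/about.gmi About this capsule
"""

dates = dates_from_index(INDEX)
print(dates.get("gemini://example.org/metadata.gmi"))  # the dated entry
print(dates.get("gemini://example.org/about.gmi"))     # the undated one-off page
```

The one-off page has no date unless its author also maintains an index entry for it.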
## Licenses
This is really nice, I didn't know there was a convention for it.
## Authors
There are two possibilities I see with author metadata: either take it
from the license line, discussed above, or extend the gmisub spec to
also allow for an optional author field.
See above about gmisub. The licence line makes the most sense to me. However, not everyone adds licenses (meaning they keep full copyright by default), and they may still want their name on it; the current licence-first method doesn't work in this case.
- Oliver Simmons
-- DBAD
(`- name` is how I sign my emails and stuff when I remember.) There are probably many other ways this could be done; the above was just a quickly typed example.
In the example I have pointed out a second issue - licenses that aren't in SPDX. I'm not entirely sure what SPDX is, but from a quick search it appears it doesn't contain the DBAD license (which is what I personally use for stuff I really don't care about).
=> https://dbad-license.org/
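As a rough illustration of how a crawler could read a sign-off like the example above, here is a Python sketch. It assumes an informal convention of a trailing `- name` line, optionally followed by a `-- licence` line, which is not part of any spec. `parse_signoff` is a made-up name, and `SAMPLE_SPDX_IDS` is a tiny hand-picked sample, not the real SPDX list.

```python
# Tiny hand-picked sample for illustration; NOT the full SPDX licence list.
SAMPLE_SPDX_IDS = {"MIT", "CC0-1.0", "CC-BY-4.0"}

def parse_signoff(text):
    """Return (author, licence) from trailing '- name' / '-- licence' lines.

    Either value is None when its line is missing.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    author = licence = None
    # A licence line, if present, is the last non-empty line.
    if lines and lines[-1].startswith("-- "):
        licence = lines.pop()[3:].strip()
    # The author line comes right before it (or last, if there is no licence).
    if lines and lines[-1].startswith("- "):
        author = lines[-1][2:].strip()
    return author, licence

doc = "Some post body.\n\n- Oliver Simmons\n-- DBAD\n"
author, licence = parse_signoff(doc)
print(author, licence, licence in SAMPLE_SPDX_IDS)

# A name with no licence still parses, which a licence-first line cannot express.
print(parse_signoff("Another post.\n- Oliver Simmons\n"))
```

A parser like this has to decide what to do with a licence name it does not recognise; here it is simply passed through as free text.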
## Other Fields
Clearly, other fields aren't supported by this. If you want to place
additional metadata in your content, then I suggest writing it in
natural language. If it is absolutely necessary to have it
machine-parsable (so that it can be specially understood by e.g. search
engines) then we can talk about that here on the ML, but others have
argued against e.g. tags because they allow easily manipulating search
results. Expect resistance.
Agreed on this. Tag metadata formats are just a catch-all, and catch-alls are typically bad.
## Conclusion
I don't think we need a 'metadata proposal' to achieve the goals we're
looking for. The format conventions are already mostly in place; we
just need to formalize them.
I agree that the catch-all metadata proposals are unneeded; I think we should stop with them. I also think we should start calling them catch-all metadata or something similar, since there's a distinction between a generic format that allows any metadata and dedicated formats for individual pieces of metadata, such as dates, authors and licenses.
- Oliver Simmons