💾 Archived View for lonelysilo.ca › rfc › gemini-semantics.gmi captured on 2022-07-16 at 13:24:31. Gemini links have been rewritten to link to archived content


Gemini Semantics

RFC for semantics (RDF) on Gemini resources.

Abstract

Computers inherited the need to categorize from their human creators. Ontology is a branch of philosophy that studies the categorization of all things known to humans. Ontologies can be described in ways that permit both humans and computers to reason about concepts and draw conclusions from them. In particular, there is an approach to ontology building that uses statement triples of Subject, Predicate and Object, which was popularized on the Internet by [RDF] and the Semantic Web. The goal of this Request For Comments (RFC) is to consider how to express ontologies in Gemini resources using the existing protocol and syntax while maintaining Gemini's guiding principles: readability by humans, simplicity and flexibility.

RDF

Introduction

There are two fundamental assumptions behind RDF. First, all knowledge can be encoded as Subject-Predicate-Object (SPO) triples. Second, the items in a triple can be identified by special URIs known as "concept URIs." This document, for example, is a resource that is addressable by a URI. Its concept URI stands for the idea behind the resource, and may in fact be the same URI that you are using to read this document right now. Other people or computers might also come here someday and build more knowledge.

Subject     Predicate  Object
-----------------------------------
My Cat      Is A       cat
Jane's Cat  Age        10
Gem-Sem-RFC Is A       Creative Work

Every SPO triple has a subject, which is the resource being described. The subject can be any resource, addressable or not. Most subjects of interest are either addressed with a concept URI or can be reached by traversing multiple triples to one that is.

Triples describe some aspect of the subject resource, but there is no fixed list of what those aspects are. Instead, RDF makes what is called an open-world assumption. Anyone in the world can define their own predicates with URIs and anyone else can use them in their triples. There are certain common and fundamental predicates. Some of them are covered by this document, while many others are discovered as you learn more about RDF and see more examples.

Finally, there is the object portion of a triple. The object assigns the value of the predicate for the subject. Objects can be literal values like text, numbers or dates. An object can also be another resource, which indicates a relationship between the two resources.

Subject                                           Predicate                                         Object
---------------------------------------------------------------------------------------------------------------------------------------
gemini://example.com/my-cat                       http://www.w3.org/1999/02/22-rdf-syntax-ns#type   gemini://ontologies.com/animals/cat
gemini://example.com/janes-cat                    gemini://ontologies.com/animals/age               10
gemini://lonelysilo.ca/rfc/gemini-semantics.gmi   http://www.w3.org/1999/02/22-rdf-syntax-ns#type   http://schema.org/CreativeWork

All of the URIs in this example are conceptual, but some can actually be viewed in your browser or even parsed by a computer to learn more.
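
As a rough illustration, the table above can be held in memory as plain (subject, predicate, object) tuples. This is only a sketch in Python and not part of the proposal; the names TRIPLES and objects_of are made up for the example.

```python
# Sketch only: the three example triples as (subject, predicate, object) tuples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

TRIPLES = [
    ("gemini://example.com/my-cat", RDF_TYPE, "gemini://ontologies.com/animals/cat"),
    ("gemini://example.com/janes-cat", "gemini://ontologies.com/animals/age", "10"),
    ("gemini://lonelysilo.ca/rfc/gemini-semantics.gmi", RDF_TYPE, "http://schema.org/CreativeWork"),
]


def objects_of(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


if __name__ == "__main__":
    print(objects_of(TRIPLES, "gemini://example.com/janes-cat",
                     "gemini://ontologies.com/animals/age"))
```

Literal values are kept as strings here because, as noted under Limitations, this design does not carry explicit data types.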

Expressing Triples as Gemini Links

The simplest and most verbose way to express a triple in Gemini is as a link: the link's URI is the subject and the link text carries the predicate and the object.

http://www.w3.org/1999/02/22-rdf-syntax-ns#type gemini://ontologies.com/animals/cat

gemini://ontologies.com/animals/age 10

http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://schema.org/CreativeWork

This form is verbose, difficult for humans to read and doesn't fit very well with Gemini conventions. It does conform to the Gemini specification, though, and it can be parsed with a very simple extension to a Gemini parser. This form is best left as theoretical and not implemented. Other file formats, such as N-Triples, are better suited to raw triple data like this, which is meant for machine processing rather than human reading.
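
To give a feel for how small that parser extension could be, here is a hedged Python sketch. It assumes the raw link line has the shape "=> subject predicate object", which is a guess at the underlying gemtext, and it uses a crude "contains ://" test to tell URIs from literals. The function names link_to_triple and to_ntriples are invented for the example.

```python
def link_to_triple(line):
    """Split an assumed '=> subject predicate object' link line into a triple."""
    if not line.startswith("=>"):
        raise ValueError("not a gemtext link line")
    # The URL comes first; the rest of the line is the link text.
    parts = line[2:].strip().split(None, 2)
    if len(parts) != 3:
        raise ValueError("expected a subject, a predicate and an object")
    return tuple(parts)


def to_ntriples(triple):
    """Render one triple as a line of N-Triples."""
    subject, predicate, obj = triple
    if "://" in obj:
        # Assumption: anything that looks like a URI is a resource.
        rendered = f"<{obj}>"
    else:
        # Everything else becomes a plain string literal.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {rendered} ."


if __name__ == "__main__":
    line = "=> gemini://example.com/janes-cat gemini://ontologies.com/animals/age 10"
    print(to_ntriples(link_to_triple(line)))
```

The same triple is a single short line in N-Triples, which is why that format is the more natural home for raw data of this kind.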

Vocabulary terms

Let's make the triples a little easier to read by introducing some vocabulary.

[rdf:type] [cat]

[age] 10

[rdf:type] [schemaOrg:CreativeWork]

The predicates here are written using a reference syntax that resembles an existing Gemini convention and a Markdown construct. The square brackets refer to a nicknamed link further down in the document where the full details reside.

(rdf)

(age)

(cat)

Here the full URIs of the rdf and animal vocabularies are declared and the nicknames are assigned using the parentheses. The nicknames can be referenced using square brackets, which logically includes the URI at the point of the reference. This is what happened with the animal age concept URI in the previous section that serves as a predicate for one of the triples.

For some vocabularies, such as rdf, there are a large number of concepts at the URI and it is possible that you will use several in your Gemini resource. Instead of repeating many very similar URIs, the prefix URI can be declared once and a composed reference syntax used at each reference, with a colon ":" as the delimiter. The resulting reference URI is formed by taking the URI of the nickname before the colon and appending the name after the colon. The colon itself is discarded as it is only a delimiter. Note the importance of the trailing hash "#" or slash "/" in the vocabulary URI if composed references are being used.
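
The resolution rule can be sketched in a few lines of Python. The VOCAB table below is an assumption about what the nicknamed links point to (the rdf prefix is the standard RDF namespace; the age and cat URIs come from the earlier table), and resolve is a hypothetical helper name.

```python
# Assumed nickname declarations, as a reader might collect them from the links.
VOCAB = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "age": "gemini://ontologies.com/animals/age",
    "cat": "gemini://ontologies.com/animals/cat",
}


def resolve(reference, vocab):
    """Expand '[nick]' or '[nick:name]' into a full URI."""
    if not (reference.startswith("[") and reference.endswith("]")):
        raise ValueError("expected a [reference]")
    # partition() leaves name empty when there is no colon.
    nick, _colon, name = reference[1:-1].partition(":")
    if nick not in vocab:
        raise KeyError(f"undeclared nickname: {nick}")
    # Plain concatenation: the colon is dropped and nothing is inserted,
    # so the prefix URI must already end in '#' or '/'.
    return vocab[nick] + name


if __name__ == "__main__":
    print(resolve("[rdf:type]", VOCAB))
    print(resolve("[age]", VOCAB))
```

Because the rule is plain concatenation, a prefix declared without its trailing "#" would silently produce a different, wrong URI ending in "...syntax-nstype".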

Vocabulary triples

Vocabularies themselves are concepts and can be the subject of additional triples. This can be used to assert certain facts about the vocabulary at the time that the Gemini resource was written, such as the version or last modified date. This can be useful for tracking down the version of the vocabulary that matches the assumptions made in the Gemini resource.

(rdf) [schemaOrg:version] 1.1

Additional pairs of predicates and objects can be added onto the same link. Note that the nickname of the link itself can be used in those pairs. Nicknames do not need to be declared before they are used in the Gemini resource.

(rdf) [schemaOrg:version] 1.1 [rdf:type] [owl:Ontology]
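
One possible way to read such a link text is to take the nickname from the parentheses and then pair up what follows. The Python sketch below makes assumptions that go beyond what is stated here: a literal object runs until the next opening square bracket, and predicates are always written as references. parse_label is a hypothetical name.

```python
import re

# Optional display name, then "(nickname)", then the predicate/object pairs.
LABEL = re.compile(r"^\s*(.*?)\s*\(([^)]+)\)\s*(.*)$")
# A token is either a [reference] or a run of literal text up to the next bracket.
TOKEN = re.compile(r"\[[^\]]*\]|[^\[\]\s][^\[\]]*")


def parse_label(label):
    """Return (nickname, [(predicate, object), ...]) for one link text."""
    match = LABEL.match(label)
    if not match:
        raise ValueError("link text has no (nickname)")
    _name, nickname, rest = match.groups()
    tokens = [token.strip() for token in TOKEN.findall(rest)]
    if len(tokens) % 2:
        raise ValueError("predicates and objects must come in pairs")
    return nickname, list(zip(tokens[0::2], tokens[1::2]))


if __name__ == "__main__":
    print(parse_label("(rdf) [schemaOrg:version] 1.1 [rdf:type] [owl:Ontology]"))
```

The references are returned unresolved, so nicknames declared later in the resource can still be looked up once the whole document has been read.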

Grouping Facts Together with Headings

With the extended link syntax of the previous section it is possible to describe more than one triple related to the subject of the link, but it is limited and probably only suitable for providing a very small number of facts about vocabularies. For subjects that have many facts there is another way that aligns better with the Gemini structure.

In semantic Gemini, headings themselves can be the subject of triples. The predicates and objects for that subject are written in the form of a list, to which you can naturally add a very large number of facts. Using the vocabulary referencing techniques described in the previous section, the list will render in a compact way in a Gemini browser.

My Cat (my-cat)

Jane's Cat (janes-cat)

(friends)

Nicknames for headings

You might have noticed a few things from the example in the previous section. First, there are nicknames for headings. This allows you to refer to those resources in other sections as either objects or even predicates. The URI for each of these resources is the URI of the current resource followed by a hash "#" and the nickname, in case they will be linked from other resources.

There is one heading that may not have a nickname and that is the top-level one. Its URI is implicitly the URI of the Gemini resource itself. References to the top-level principal resource are made using either the full URI or the special empty reference "[]", an empty set of square brackets.
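
The two rules for heading URIs are small enough to write down directly. This is a Python sketch with invented helper names (heading_uri, resolve_local), and gemini://example.com/cats.gmi is a made-up resource URI.

```python
def heading_uri(resource_uri, nickname=""):
    """URI of a heading; the top-level heading is the resource itself."""
    return f"{resource_uri}#{nickname}" if nickname else resource_uri


def resolve_local(reference, resource_uri):
    """Resolve '[]' or '[nickname]' against the current resource."""
    if not (reference.startswith("[") and reference.endswith("]")):
        raise ValueError("expected a [reference]")
    # An empty reference leaves an empty nickname, which maps to the resource.
    return heading_uri(resource_uri, reference[1:-1])


if __name__ == "__main__":
    base = "gemini://example.com/cats.gmi"  # hypothetical resource
    print(resolve_local("[my-cat]", base))
    print(resolve_local("[]", base))
```

A complete reader would also have to decide whether a nickname refers to a heading or to a vocabulary link; that lookup order is not settled by this sketch.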

Names and descriptions

Semantic resources can have names (rdfs:label) and descriptions (dcterms:description). A Gemini heading also has a name, which is the text of the heading itself minus the nickname. The first paragraph after the heading is its description. These are two implicit triples generated for each section in a Gemini resource. It's also natural to read and write them this way.

(dcterms)

(rdfs)

Types

There is one further optimization made for specifying the type of a resource. A list item with only a single reference is interpreted as stating the type of the heading resource, with an implied rdf:type predicate, like this.

Bob's Cat (bobs-cat)
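
Putting the heading rules together, a section could be turned into triples roughly as follows. This Python sketch assumes a raw section of the shape "## Name (nickname)", then a paragraph, then "* " list items; the sample text, the resource URI and the name section_triples are all invented for illustration, and references are left unresolved to keep it short.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DCTERMS_DESCRIPTION = "http://purl.org/dc/terms/description"

# A heading of any level followed by "Name (nickname)".
HEADING = re.compile(r"^#+\s*(.*?)\s*\(([^)]+)\)\s*$")


def section_triples(lines, resource_uri):
    """Turn one heading section into (subject, predicate, object) triples."""
    match = HEADING.match(lines[0])
    if not match:
        raise ValueError("expected a heading with a (nickname)")
    name, nickname = match.groups()
    subject = f"{resource_uri}#{nickname}"
    triples = [(subject, RDFS_LABEL, name)]  # implicit name
    described = False
    for line in lines[1:]:
        text = line.strip()
        if not text:
            continue
        if text.startswith("* "):
            parts = text[2:].split(None, 1)
            if len(parts) == 1:
                # A lone reference states the type.
                triples.append((subject, RDF_TYPE, parts[0]))
            else:
                triples.append((subject, parts[0], parts[1]))
        elif not described and not text.startswith("=>"):
            # The first paragraph is the implicit description.
            triples.append((subject, DCTERMS_DESCRIPTION, text))
            described = True
    return triples


if __name__ == "__main__":
    sample = ["## Bob's Cat (bobs-cat)", "A grey tabby.", "", "* [cat]", "* [age] 7"]
    for triple in section_triples(sample, "gemini://example.com/cats.gmi"):
        print(triple)
```

Four triples come out of five short lines of text, two of them without the author writing any predicate at all.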

Metadata

The Greek word "meta" means "after." It's the data you read later on to fill in the background of what you have been reading. It also means that the metadata doesn't need to be decided right away. It can be defined later on after a set of resources are crafted and the edge cases better understood. Unlike low-level computer programming, this space can be much more fluid and organic to suit the needs of the humans.

Limitations

The design of the semantic Gemini resources, like the design of Gemini itself, is intentionally simplified from the more powerful designs of RDF/XML, Turtle and other semantic web resource types. There are certain constructs that will not be possible here, such as explicit data types for object values. Luckily, URIs can link to other file formats and transfer protocols. Also, Gemini is capable of serving virtually any kind of file.

Search Considerations

Existing Gemini search engines, such as GUS, are able to index and search tokens of text in a resource and find relevant matches based on those terms. Searching for a typical nickname (e.g. rdf:type), or for the vocabulary URL and the term (http://...rdf-syntax... and type), should find resources that have a fact using that predicate. Adding the search term of the object or vocabulary will further narrow the search to the resources of interest. Combined with the back-link search capabilities, there is a great deal of power available there already.

With the design of the new Semantic Gemini, a new type of search engine should be possible that would support graph traversal logic to join data spanning multiple triples and finer-grained filtering of object values. RDF permits search engines to find not only the links between resources, but also the nature of those links. Existing SPARQL engines could also be updated to support the Gemini protocol and the parsing of semantic Gemini resources as part of their data gathering.
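
As a small taste of what joining triples and filtering on object values means, here is a Python sketch over an in-memory list. It is nowhere near a SPARQL engine; the age of "3" and the type of Jane's cat are invented data, and older_than is a hypothetical helper.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CAT = "gemini://ontologies.com/animals/cat"
AGE = "gemini://ontologies.com/animals/age"

TRIPLES = [
    ("gemini://example.com/my-cat", RDF_TYPE, CAT),
    ("gemini://example.com/my-cat", AGE, "3"),            # invented value
    ("gemini://example.com/janes-cat", RDF_TYPE, CAT),    # invented fact
    ("gemini://example.com/janes-cat", AGE, "10"),
]


def subjects_of_type(triples, type_uri):
    """All subjects that have an rdf:type triple pointing at type_uri."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == type_uri}


def older_than(triples, type_uri, min_age):
    """Join the type triples with the age triples, then filter on the value."""
    typed = subjects_of_type(triples, type_uri)
    return sorted(
        s for s, p, o in triples
        if s in typed and p == AGE and int(o) > min_age
    )


if __name__ == "__main__":
    print(older_than(TRIPLES, CAT, 5))
```

A plain text search can find pages that mention "cat" and "10", but only a join like this can tell that both facts are about the same subject.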

Summary

Semantic Gemini resources open the world of semantics to this new space without reinventing it entirely, while incorporating existing standards and conventions into the design. This will permit the introduction of new ontologies from existing schemas so that humans and machines may interpret them in a consistent way. Integrations with existing ontological web data through bridging technologies, such as the Wikipedia bridges, will be able to benefit from this design since the semantic data can now be represented and linked in Gemini. Given the lightweight nature of Gemini, it is also possible that semantics may flourish here in new areas that were more difficult in the more cumbersome systems of the World Wide Web, but only time will tell.

(schemaOrg) [schemaOrg:version] 11.0