💾 Archived View for soviet.circumlunar.space › rwl › gemlog › 2022-03-15-on-web-annotations.gmi captured on 2023-05-24 at 18:38:57. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

On Web annotations

I recently discovered Web Annotations:

https://www.w3.org/annotation/

The basic idea is very simple. An annotation is just a document that
is in some sense "about" another document, or a part of another
document. Thus annotations could be used for replies to blog posts, or
comments on news articles, or reviews of products, or marginalia for
historical texts. Any sort of commentary on any sort of document can
theoretically be represented as an annotation.

Of course, you've always been able to write a page of HTML containing
text about something else on the Web, with a link to that other thing.
What makes annotations different is that this semantic relationship
(this document is *about* that one) is encoded in a specified,
JSON-based format, so that software can discover the relationship and
do interesting things with it. This makes annotations part of the
Semantic Web.

An annotation has a *target*, which is the document (or document part)
that the annotation is "about"; and it has a *body*, which is the
content of the annotation. Both are identified by IRIs, a
generalization of URLs (a body can alternatively be given inline as a
plain string). A few other fields carry metadata, like the type,
format, and language of the body and target. This is a simple, very
general data model that can represent pretty much any kind of
connection between texts.
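
To make this concrete, here is a minimal sketch in Python of what such an annotation might look like once serialized as JSON. The field names (`@context`, `type`, `body`, `target`, `TextualBody`) follow the W3C data model as I understand it; the helper `make_annotation` and the example.org URL are hypothetical, made up for illustration.

```python
import json


def make_annotation(target_url, body_text, language="en"):
    """Build a minimal annotation as a plain dict (hypothetical helper)."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        # The body is the content of the annotation, here embedded as text.
        "body": {
            "type": "TextualBody",
            "value": body_text,
            "format": "text/plain",
            "language": language,
        },
        # The target is the document the annotation is "about".
        "target": target_url,
    }


annotation = make_annotation(
    "https://example.org/post.html",  # assumed URL, for illustration only
    "A reply to this post.",
)
print(json.dumps(annotation, indent=2))
```

A body that lives elsewhere on the Web would simply be given as a URL string in place of the embedded `TextualBody` object.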

An important advantage of annotations over plain hyperlinks is that
they can identify *parts* of the target document in a fairly
fine-grained way, using selectors that are already common Web
technologies (e.g. CSS selectors). Thus annotations are particularly
useful for making commentary on longer documents.
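
As a rough illustration of selectors, the sketch below builds two targets that point at part of a document: one with a `CssSelector` and one with a `TextQuoteSelector` (both are selector types defined in the data model). The function names and URLs are hypothetical, and `find_quote` is a deliberately naive resolver for plain text, not what a real client would do with HTML.

```python
def css_target(source_url, css):
    """Target the element(s) of source_url matched by a CSS selector."""
    return {
        "source": source_url,
        "selector": {"type": "CssSelector", "value": css},
    }


def quote_target(source_url, exact, prefix="", suffix=""):
    """Target a quoted passage, optionally disambiguated by its context."""
    selector = {"type": "TextQuoteSelector", "exact": exact}
    if prefix:
        selector["prefix"] = prefix
    if suffix:
        selector["suffix"] = suffix
    return {"source": source_url, "selector": selector}


def find_quote(text, selector):
    """Return (start, end) of the quoted passage in text, or None.

    Naive approach: look for prefix + exact + suffix as one literal string.
    """
    prefix = selector.get("prefix", "")
    suffix = selector.get("suffix", "")
    index = text.find(prefix + selector["exact"] + suffix)
    if index < 0:
        return None
    start = index + len(prefix)
    return start, start + len(selector["exact"])


text = "We accumulate knowledge by having an ongoing conversation with past work."
target = quote_target(
    "https://example.org/essay.txt",  # assumed URL
    "ongoing conversation",
    prefix="having an ",
)
span = find_quote(text, target["selector"])
print(span, text[span[0]:span[1]])
```

Either dict can be used as the `target` of an annotation in place of a bare URL.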

What are annotations good for?

For me, the most exciting applications of annotations fall broadly in
the category of collaborative research.

Research that consists of commentary on other research is an ancient
and important cultural practice: there are many texts of ancient Greek
philosophy, for example, that were lost, and we only know about them
because they were quoted or paraphrased by later authors writing their
own commentary. We accumulate knowledge by having an ongoing
conversation with past work.

But commentary is limited by the problem of *forward search*. How do
you find out what other people have already said and done in response
to a given piece of work? How can you gather all the existing
commentary about a particular document, or part of that document? This
has historically been very difficult, and remains difficult today,
though tools like Google Scholar have started to make it easier.

Web Annotations could change this. When a Web server serves up a
document, it can specify a URL for an annotation container for that
document. Using this mechanism, your browser can find annotations that
have already been made, and allow you to create new ones.
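
Here is a small sketch of how a client might discover that container. As I read the Protocol specification, the server advertises it in an HTTP `Link` header whose `rel` is `http://www.w3.org/ns/oa#annotationService`; that detail is worth checking against the spec. The function name, the example header, and its URLs are all assumptions, and the parser is deliberately simple: it does not handle every legal `Link` header.

```python
import re

# Link relation that, as I understand the Protocol spec, marks the container.
ANNOTATION_SERVICE_REL = "http://www.w3.org/ns/oa#annotationService"

# One "<url>; param; param" entry of a Link header (simplified grammar).
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
_REL_RE = re.compile(r'rel\s*=\s*"?([^";]+)"?', re.IGNORECASE)


def find_annotation_container(link_header):
    """Return the annotation container URL from a Link header, or None."""
    for match in _LINK_RE.finditer(link_header):
        url, params = match.group(1), match.group(2)
        rel = _REL_RE.search(params)
        # A rel value may hold several space-separated relation types.
        if rel and ANNOTATION_SERVICE_REL in rel.group(1).split():
            return url
    return None


# Hypothetical header a server might send along with a document.
header = (
    '<https://example.org/style.css>; rel="stylesheet", '
    '<https://example.org/annotations/>; '
    'rel="http://www.w3.org/ns/oa#annotationService"'
)
print(find_annotation_container(header))
```

A real client would read the header from the HTTP response for the document, then fetch existing annotations from the container and post new ones to it.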

Here's an example. Suppose a historical text has been newly scanned,
and the scans are published on the Web. Different researchers can then
create:

transcriptions
translations
comments or notes on passages in the text
links to other documents (e.g. research articles) that discuss the text

as Web Annotations. They can all do this asynchronously, in the course
of their normal research. But as they publish those annotations, others
will be able to see them and benefit from that work. That will save
effort on tedious tasks like transcription and translation. It will
make it easier to discover relevant research that has already been
done. And it will help colleagues with common interests find each other.
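
The "seeing each other's work" part amounts to a forward search over published annotations. Below is a minimal sketch, assuming the annotations have already been fetched as a list of dicts; the sample data, URLs, and function names are invented for illustration. It handles the cases where a target is a bare URL string, an object with a `source`, or a list of either.

```python
def target_sources(annotation):
    """Return the set of document URLs that an annotation is about."""
    targets = annotation.get("target", [])
    if not isinstance(targets, list):
        targets = [targets]
    sources = set()
    for target in targets:
        if isinstance(target, str):
            sources.add(target)
        elif isinstance(target, dict):
            source = target.get("source") or target.get("id")
            if source:
                sources.add(source)
    return sources


def annotations_about(annotations, document_url):
    """Forward search: every annotation whose target is document_url."""
    return [a for a in annotations if document_url in target_sources(a)]


scan = "https://example.org/scans/page-1.jpg"  # assumed URL of a scanned page
published = [
    # A transcription, with the body given inline as a string.
    {"type": "Annotation", "bodyValue": "Transcribed text of page 1", "target": scan},
    # A link to a research article discussing one passage of the scan.
    {
        "type": "Annotation",
        "body": "https://example.org/articles/on-page-1.html",
        "target": {"source": scan, "selector": {"type": "FragmentSelector", "value": "xywh=0,0,100,50"}},
    },
    # An annotation on a different page, which should not be returned.
    {"type": "Annotation", "bodyValue": "Note on page 2", "target": "https://example.org/scans/page-2.jpg"},
]

for found in annotations_about(published, scan):
    print(found.get("bodyValue") or found.get("body"))
```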

As someone who does this kind of research, I'm excited to see how this
will develop. There are still lots of problems to solve, but there's
now a standard to build on.

Links

If you're interested in learning more, here are the relevant
specifications by the W3C, which are fairly readable:

The Data Model

The Protocol

The W3C Annotations Working Group

Here are two clients demonstrating the idea, though I don't think
either of them is using the actual W3C protocol or data model:

Hypothesis

Genius