💾 Archived View for gemi.dev › gemini-mailing-list › 000364.gmi captured on 2024-08-19 at 00:24:25. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Unadorned Gemtext instead of syndication formats

📧 Messages: 8
🗣️ Authors: 4
📅 First Message: 2020-09-08 17:31
📅 Last Message: 2020-09-10 00:04

1. easeout (a) tilde.team (easeout (a) tilde.team)

📅 Sent: 2020-09-08 17:31
📧 Message 1 of 8

Recently on the mailing list and in #gemini we've been talking about
syndication, as in RSS / Atom. I want to begin by summarizing some of
our thoughts.

# Existing formats

RSS and Atom are well-supported formats. They are XML, but you can
probably import a parser / generator rather than having to write one
yourself.

JSON Feed is a well-supported format, and is regarded as simpler than
RSS / Atom. Even if you had to write your own parser, you would be able
to build off a JSON parser because those are available for basically
every language. They are also often simpler to build on than an XML
parser.

RSS, Atom, and JSON Feed are not specific to Gemini. They are commonly
served over other protocols, not just Gemini. And
existing feed readers generally expect feeds to be in one of these
formats. So if you want your site to be syndicated in the existing
internet feed reader ecosystem, using one of these formats is the price
of entry.

CAPCOM expects Atom, so Atom became a common way to publish
Gemini blogs. This works, and does not need to change unless an
alternative would be markedly better.

# Making syndication easier

However, some of us are concerned with the complexity of creating a feed
alongside creating Gemini content. If one of our goals is to make Gemini
content easy to publish, making it easy to syndicate is an attractive
goal too. And it is jarring to go from writing Gemtext, which is
remarkably simple and straightforward, to writing Atom XML, or to
finding a static site generator tool that will do it for you. This feels
like a cost. It might remind us of the bloat on the web we are trying to
do without. Or it might be a barrier to entry for users who would be
confident publishing Gemtext, but not confident publishing Atom, or for
that matter, a web page.

To address the fact that creating a feed is a few steps more difficult
than creating a Gemtext page, one idea is to make a new feed format
which is written in Gemtext and conforms to a pattern. While it's true
this would be easy to write, there is no existing parser ecosystem for
it, so it would be effectively Gemini-only. If you wanted to syndicate
to the broader internet, you would still need to publish something like
Atom. Furthermore, although Gemtext would be easier to parse than Atom
or JSON Feed if you had to start from plain text and work your way up,
existing formats have off-the-shelf parsers in many languages, so you
are not starting from the bottom. With a brand new format based on
Gemtext, you would be.

So we are not all sold on the idea of a Gemtext-based feed format.
However, one positive characteristic of such a format is that it would
be browsable as a regular Gemtext page. And that got us thinking:

If regular Gemtext pages are easy to parse, can we just treat any
Gemtext page as a feed? In fact, yes we can. You can poll plain Gemtext
pages, not feed format files, and detect the links on the page that are
new. And we don't need a Gemtext feed format pattern, because the
pattern of link lines beginning with "=>" is enough.
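
As a rough illustration of how little parsing that takes, here is a minimal sketch of a link-line filter. The function name gemtext_links is made up for this example, and it is not taken from the proof of concept. It assumes the usual Gemtext rules: link lines start with "=>", and lines between preformatting toggle lines (three backticks) are literal text rather than links.

```shell
# Print only the link lines of a Gemtext document read from stdin.
# gemtext_links is a made-up name for this sketch.
gemtext_links() {
  awk '
    BEGIN { fence = sprintf("%c%c%c", 96, 96, 96) }   # three backticks
    { sub(/\r$/, "") }                                # tolerate CRLF line endings
    index($0, fence) == 1 { pre = !pre; next }        # toggle preformatted mode
    !pre && /^=>/ { print }                           # link line outside preformatted text
  '
}
```

Something like `gemtext_links < index.gmi` would then print just the links of a saved page.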

# Subscribing to any Gemtext page

To get a feel for that, I did a proof of concept that you can try out:

=> https://github.com/kconner/gemini-subscription-cli Proof of Concept

This repository contains a Makefile and a shell script. The script takes
a list of Gemini URLs to fetch, which are expected to be Gemtext pages.
It filters them down to just the link lines. It compares the set of link
lines with a previously-fetched set of link lines, and it identifies and
emits those that are new. It saves the new total set as the next run's
previous set for comparison.
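
The comparison step could look roughly like the sketch below. This is not the script from the repository, just one way to do it with sort and comm. The function name new_links is hypothetical, and saving the current set (rather than a union of everything ever seen) as the next run's cache is an assumption on my part.

```shell
# Read the current link lines on stdin, print the ones that are not in
# the cache file given as $1, then replace the cache with the current set.
# new_links is a made-up name for this sketch.
new_links() {
  cache=$1
  current=$(mktemp) || return 1
  LC_ALL=C sort -u > "$current"          # current set, sorted for comm
  [ -f "$cache" ] || : > "$cache"        # first run: empty previous set
  LC_ALL=C comm -13 "$cache" "$current"  # lines found only in the current set
  mv "$current" "$cache"                 # becomes the previous set next time
}
```

For example, `grep '^=>' page.gmi | new_links cached-links.gmi` prints every link on the first run and only the additions on later runs. One consequence of caching just the current set is that a link which disappears and later returns is reported as new again.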

If you want, you can clone the repo and then try it out.

Subscribe to some pages:

$ echo 'gemini://gemini.circumlunar.space:1965/servers/' >> subscribed-urls
$ echo 'gemini://tilde.team/~easeout/glog/' >> subscribed-urls

Fetch all the links, since we don't have a prior set to compare to:

$ make
fetched: gemini://gemini.circumlunar.space:1965/servers/
fetched: gemini://tilde.team/~easeout/glog/
wrote: new-links.gmi
=> gemini://gus.guru/known-hosts Gemini hosts known to GUS
=> gemini://gemini.conman.org
=> gemini://zaibatsu.circumlunar.space
=> gemini://carcosa.net
...

Later, update again and identify what's new. (To test right away, you
can delete a few lines from cached-links.gmi.)

$ make
fetched: gemini://gemini.circumlunar.space:1965/servers/
fetched: gemini://tilde.team/~easeout/glog/
wrote: new-links.gmi
=> gemini://breadpunk.club
=> 2020-09-07-re-a-wordpress-confession.gmi 7 September 2020: Re: A Wordpress Confession

# Where does this functionality belong?

This proof of concept demonstrates that any Gemini page can already be
treated like a feed. This means that, if you don't mind not
participating in the broader internet feed reader ecosystem, and you
don't mind the occasional page redesign creating noise in subscription
results, you don't have to make a feed file at all. Syndication is
zero-cost.

In hindsight, why does RSS exist? What is "Really Simple" about it?
Well, it's simpler than HTML. If HTML was easy enough to parse, you
could just subscribe to web pages and not need feed files. But Gemtext
is already easy to parse. We shouldn't let our experience on the web
mislead us into thinking a page, like your blog index page, is not already a
subscribable list of links.

It's been pointed out this is not very different from Spacewalk, which
(I think?) works by just polling Gemini pages and updating when they
change, as opposed to when their links change. Both of these are useful
and easy ways to subscribe to any Gemtext page.
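
To make the difference between the two approaches concrete, here is a small sketch, not Spacewalk's actual code. Both function names are made up. The first fingerprints the whole page, so any edit counts as a change. The second fingerprints only the link lines, so edits to the surrounding prose go unnoticed. For brevity it uses a plain grep and does not skip preformatted blocks.

```shell
# Fingerprint of the whole page: changes on any edit.
page_fingerprint() {
  cksum < "$1"
}

# Fingerprint of the link lines only: changes when links are added,
# removed, or relabelled, but not when other text is edited.
links_fingerprint() {
  grep '^=>' "$1" | LC_ALL=C sort -u | cksum
}
```

Storing either value between runs and comparing it with the fresh one is enough to tell whether a subscription has something new.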

# Closing

Gemtext authors could feel like they are not burdened with creating a
feed file if browsers, feed readers, and aggregators allowed subscribing
to Gemtext pages directly. When as an author you understand that a feed
file is unnecessary, the pressure to create a feed file is lifted. But
for that to happen, you have to see direct Gemtext subscription working
in the wild. One way that could work is for a browser to have a way to
treat a bookmark like a feed subscription. I also think if CAPCOM would
accept Gemtext pages as a feed URL option, that would go a long way.

2. Meff (meff (a) meff.me)

📅 Sent: 2020-09-08 18:17
📧 Message 2 of 8

easeout at tilde.team writes:

> However, some of us are concerned with the complexity of creating a feed
> alongside creating Gemini content. If one of our goals is to make Gemini
> content easy to publish, making it easy to syndicate is an attractive
> goal too. And it is jarring to go from writing Gemtext, which is
> remarkably simple and straightforward, to writing Atom XML, or to
> finding a static site generator tool that will do it for you. This feels
> like a cost. It might remind us of the bloat on the web we are trying to
> do without. Or it might be a barrier to entry for users who would be
> confident publishing Gemtext, but not confident publishing Atom, or for
> that matter, a web page.

At this point in time, how many folks can write Gemtext but aren't
confident writing HTML? Yes, for those that hand-author feeds, it's
certainly a bit of a bother writing XML by hand (I've screwed it up
myself a few times), but I'm not sure the Gemini community is at the
point where we have users that are confident with Gemtext but not with
HTML.

> # Subscribing to any Gemtext page
>
> To get a feel for that, I did a proof of concept that you can try out:
>
> => https://github.com/kconner/gemini-subscription-cli Proof of Concept
>
> This repository contains a Makefile and a shell script. The script takes
> a list of Gemini URLs to fetch, which are expected to be Gemtext pages.
> It filters them down to just the link lines. It compares the set of link
> lines with a previously-fetched set of link lines, and it identifies and
> emits those that are new. It saves the new total set as the next run's
> previous set for comparison.

If there's just a shell script parsing text and making socat commands, what's
stopping an author from using `jq` or `xidel` on the output to parse
JSON and XML respectively? I realize in code it's not that simple, but
most languages in general use have JSON and XML parsers easily available
and generally well documented.
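
For the JSON side, a sketch of what that could look like with jq is below. It assumes the feed follows the JSON Feed layout, with an "items" array whose entries may carry "url" and "title" fields, and it prints the items as Gemtext link lines. The function name jsonfeed_links is made up for this example.

```shell
# Print the items of a JSON Feed (read from stdin) as Gemtext link lines.
# Items without a "url" field are skipped.
jsonfeed_links() {
  jq -r '.items[]
         | select(.url != null)
         | "=> " + .url + (if .title then " " + .title else "" end)'
}
```

Something like `jsonfeed_links < feed.json` would then give output that can be compared between runs in the same way as link lines taken from a Gemtext page.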

> This proof of concept demonstrates that any Gemini page can already be
> treated like a feed. This means that, if you don't mind not
> participating in the broader internet feed reader ecosystem, and you
> don't mind the occasional page redesign creating noise in subscription
> results, you don't have to make a feed file at all. Syndication is
> zero-cost.

One of my favorite parts of Gemini is the attitude that it works with the
rest of the net, as opposed to trying to cloister itself like the Web
does. One of my pet peeves with the web is how it tries to reinvent
everything. Long-lived connections, push protocols, things that TCP has
had _forever_.

>
> In hindsight, why does RSS exist? What is "Really Simple" about it?
> Well, it's simpler than HTML. If HTML was easy enough to parse, you
> could just subscribe to web pages and not need feed files. But Gemtext
> is already easy to parse. We shouldn't let our experience on the web
> mislead us into thinking a page, like your blog index page, is not already a
> subscribable list of links.

Small aside: Semantically annotated tags have been a dream for the W3C
for quite some time now, and annotating feed links and other tags is the
purpose behind Semantic Web initiatives. The web has definitely tried to
allow in-band feeds.

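> Gemtext authors could feel like they are not burdened with creating a
> feed file if browsers, feed readers, and aggregators allowed subscribing
> to Gemtext pages directly. When as an author you understand that a feed
> file is unnecessary, the pressure to create a feed file is lifted. But
> for that to happen, you have to see direct Gemtext subscription working
> in the wild. One way that could work is for a browser to have a way to
> treat a bookmark like a feed subscription. I also think if CAPCOM would
> accept Gemtext pages as a feed URL option, that would go a long way.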

I still feel like this is NIH, but I'm not a CAPCOM or Spacewalk author
so a decision like this isn't my place. As long as we continue to have
Atom/RSS support, that's all I'd like. I guess if there is a broader
move to Gemtext feeds, there should be pretty simple methods available
to convert those into Atom-compatible feeds.

- meff

3. Jacob Moody (moody (a) posixcafe.org)

📅 Sent: 2020-09-08 18:45
📧 Message 3 of 8

I talked about this a bit on #gemini but thought I could collect some of 
my points here for posterity.

On 9/8/20 12:31 PM, easeout at tilde.team wrote:
> RSS and Atom are well-supported formats. They are XML, but you can
> probably import a parser / generator rather than having to write one
> yourself.

Part of the beauty of Gemini to me is that there is really no bar to
entry in terms of what the system has to have before you start. There is
TLS, but I think this encryption requirement is more than reasonable
(even if I have reservations about the design of TLS). Requiring
something that uses XML would limit the ecosystems that Gemini could
exist on. I loved seeing the adoption in otherwise limited software
stacks like Plan 9, where the complexity of things like XML parsing has
prevented a readily usable library from existing on the platform.

> This proof of concept demonstrates that any Gemini page can already be
> treated like a feed. This means that, if you don't mind not
> participating in the broader internet feed reader ecosystem, and you
> don't mind the occasional page redesign creating noise in subscription
> results, you don't have to make a feed file at all. Syndication is
> zero-cost.

This is fantastic. Just one page could be used both as a human-readable
index and as something like a feed. Having a single page for both is, I
think, really in the spirit of how I interpreted this protocol. The fact
that you could whip up something this small to process feeds like this
is a testament to the simplicity of this type of system already.

> Gemtext authors could feel like they are not burdened with creating a
> feed file if browsers, feed readers, and aggregators allowed subscribing
> to Gemtext pages directly. When as an author you understand that a feed
> file is unnecessary, the pressure to create a feed file is lifted. But
> for that to happen, you have to see direct Gemtext subscription working
> in the wild. One way that could work is for a browser to have a way to
> treat a bookmark like a feed subscription. I also think if CAPCOM would
> accept Gemtext pages as a feed URL option, that would go a long way.

I don't think something like this would stop people from using something
like Atom as their own feed system, which I find more than fine. If
servers would like to, they could even offer an automated way of
creating those types of feeds based on something like the Gemtext
implementation. I would expect this type of feature to evolve over time
for whichever feed system is chosen, regardless. For me personally, I
think this satisfies the 'complexity when you want it and none if you
don't' attitude that I find attractive. What I am trying to avoid here
is the expectation that servers need to provide an XML-based feed and
the expectation that clients should parse it.

Thanks,
Moody

4. easeout (a) tilde.team (easeout (a) tilde.team)

📅 Sent: 2020-09-08 22:27
📧 Message 4 of 8

On Tue, Sep 08, 2020 at 11:17:42AM -0700, Meff wrote:

> At this point in time, how many folks can write Gemtext but aren't
> confident writing HTML? Yes, for those that hand-author feeds, it's
> certainly a bit of a bother writing XML by hand (I've screwed it up
> myself a few times), but I'm not sure the
> Gemini community is at the point where we have users that are confident
> with Gemtext but not with HTML.

> If there's just a shell script parsing text and making socat commands, what's
> stopping an author from using `jq` or `xidel` on the output to parse
> JSON and XML respectively? I realize in code it's not that simple, but
> most languages in general use have JSON and XML parsers easily available
> and generally well documented.

It's true that we early adopters are comfortable enough to handle this,
and that publishing Atom or RSS has distinct advantages. But I hope we
get to the point where it's not just us early adopters. If we can make
it simpler for newcomers, we might get there sooner.

I think for Gemini to grow and thrive and have new content that sustains
it, the ecosystem needs to:

- Convert web readers to Gemini readers, so there is more audience
- Convert Gemini readers to Gemini publishers, so there is more content
- Convert Gemini publishers to Gemini hosts, so there are more domains

That's a funnel, basically. At each stage, some fraction of people will
stop and not move to the next stage. The simpler we make each step, the
farther through the funnel the average user will get. So I want to push
on all of these things. Today I'm attempting to make the syndication
part of publishing simpler.

There's also nothing wrong with continuing to publish Atom or RSS. I can
think of two big advantages. First, you would still have to use a feed
format in order to syndicate beyond Gemini. Second, a feed lets you
declare exactly which links you want to publish, which is nicer for your
readers, so readers should prefer it when it's offered. In other words,
incentives still exist to make the upgrade. I just don't think there is
a cost to building clients that can subscribe to your Gemtext page even
if you haven't made a feed.

> One of my favorite parts of Gemini is the attitude that it works with the
> rest of the net, as opposed to trying to cloister itself like the Web
> does. One of my pet peeves with the web is how it tries to reinvent
> everything. Long-lived connections, push protocols, things that TCP has
> had _forever_.

I feel that. Do you think that, if we merely allowed RSS or Atom, and
did not lean into them as a standard, we would be heading toward a
compatibility problem because a smaller fraction of Gemini pages would
have RSS or Atom feeds? If so, could that be offset by the way
simplicity might encourage more users to create content?

> I still feel like this is NIH, but I'm not a CAPCOM or Spacewalk author
> so a decision like this isn't my place. As long as we continue to have
> Atom/RSS support, that's all I'd like. I guess if there is a broader
> move to Gemtext feeds, there should be pretty simple methods available
> to convert those into Atom-compatible feeds.

I wouldn't say we're inventing anything. You may have understood this
already, but I'm not sure whether I was clear: I'm not suggesting we
create a new feed format that happens to have syntax in common with
Gemtext. I'm suggesting that Gemtext, because it contains links, already
is a feed if you interpret it that way.

But anyway, yes, I think aggregators and feed readers should continue to
use RSS and Atom and JSON Feed and would not want that to change.

(Sorry for the repeat email; I failed to send it to the list address.)

5. easeout (a) tilde.team (easeout (a) tilde.team)

📅 Sent: 2020-09-08 22:30
📧 Message 5 of 8

On Tue, Sep 08, 2020 at 01:45:27PM -0500, Jacob Moody wrote:
> I don't think something like this would stop people from using something
> like atom or so on as their own feed system, which I find more than fine. If
> servers would like to, they could even offer an automated way of creating
> those types of feeds based on something like the Gemtext implementation. I
> would consider this type of feature to evolve over time for whichever feed
> system is chosen regardless. For me personally I think this satisfies the
> 'complexity when you want it and none if you don't' attitude that I find
> attractive. What I am trying to avoid here is the expectation that servers
> need to provide an XML-based feed and the expectation that clients should
> parse it.

Agreed, RSS et al are great and we shouldn't get rid of them. But they
could be viewed as an upgrade path from a basic but functional starting
point.

(Sorry for the repeat message; I failed to send to the list address.)

6. Paper (paper (a) tilde.institute)

📅 Sent: 2020-09-09 20:17
📧 Message 6 of 8

I am an author of a feed reader and I thought about implementing this
after I read this, but there are some problems:

How should my feed reader behave if there are links to different paths
on the page? I think I should reject all links from a different domain
and from a different protocol, but there could be a link to the same
domain, but to a different path.

=> / back
=> /gemlog/1/ post 1
=> /gemlog/2/ post 2
=> /contact/ contact to me

Here, it is visible which ones are the blog entries and which ones are
something else, but a program would have to guess. I don't think it
would be a good idea to hardcode that everything with gemlog in the path
is a feed item, that would cause too much confusion. Also, different
clients would behave differently adding to the confusion.
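
For what it's worth, the same-domain, same-protocol rule described above can be sketched as follows. All three function names are made up for this example. The URL resolution is deliberately simple: it handles absolute URLs, paths starting with "/", and plain relative paths. It does not normalise "../" segments, compare ports, or treat scheme-only URLs such as mailto: specially. As noted, it cannot tell which of the remaining links are blog entries.

```shell
# Scheme and host part of a URL, e.g. gemini://example.org
url_origin() {
  printf '%s\n' "$1" | sed 's|^\([a-zA-Z][a-zA-Z0-9+.-]*://[^/]*\).*$|\1|'
}

# Resolve a link URL ($2) against the URL of the page it appeared on ($1).
resolve_link() {
  base=$1 url=$2
  origin=$(url_origin "$base")
  case $url in
    *://*) printf '%s\n' "$url" ;;              # already absolute
    /*)    printf '%s%s\n' "$origin" "$url" ;;  # absolute path on the same host
    *)     case $base in
             "$origin") printf '%s/%s\n' "$origin" "$url" ;;     # page URL has no path
             *)         printf '%s/%s\n' "${base%/*}" "$url" ;;  # relative to the page directory
           esac ;;
  esac
}

# Read link lines on stdin; print those on the same scheme and host as
# the page ($1), with their URLs made absolute.
same_site_links() {
  page=$1
  site=$(url_origin "$page")
  sed 's/^=>[[:space:]]*/=> /' | while read -r marker url label; do
    [ "$marker" = "=>" ] || continue
    [ -n "$url" ] || continue
    abs=$(resolve_link "$page" "$url")
    case $abs in
      "$site"|"$site"/*) printf '=> %s%s\n' "$abs" "${label:+ $label}" ;;
    esac
  done
}
```

Fed the four example links above with a page URL of gemini://example.org/gemlog/ (a placeholder host), this keeps all four, including "back" and "contact to me", which is exactly the ambiguity in question.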

It is a great idea, but I think we could instead write a script or a
CLI program which would output an Atom, RSS, or JSON Feed document for a
given Gemini URL. This program would do the necessary guessing and
parsing and output something that can be understood by all RSS libraries
and clients. Some clients I know support running a command to get a
feed.
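
A very rough sketch of the output half of such a program is below. It assumes the link lines have already been fetched and made absolute, and it does no guessing at all: every link becomes an entry. The function name links_to_atom is made up, "Unknown" is a placeholder author, and because link lines carry no dates, every entry reuses the one timestamp passed in. The result has not been checked against a feed validator.

```shell
# Turn absolute Gemtext link lines on stdin into a minimal Atom feed.
# Arguments: feed URL (used as the feed id), feed title, RFC 3339 timestamp.
links_to_atom() {
  FEED_ID=$1 FEED_TITLE=$2 FEED_UPDATED=$3 awk '
    function esc(s) {                       # escape XML special characters
      gsub(/&/, "\\&amp;", s); gsub(/</, "\\&lt;", s)
      gsub(/>/, "\\&gt;", s);  gsub(/"/, "\\&quot;", s)
      return s
    }
    BEGIN {
      updated = ENVIRON["FEED_UPDATED"]
      print "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
      print "<feed xmlns=\"http://www.w3.org/2005/Atom\">"
      print "  <id>" esc(ENVIRON["FEED_ID"]) "</id>"
      print "  <title>" esc(ENVIRON["FEED_TITLE"]) "</title>"
      print "  <updated>" esc(updated) "</updated>"
      print "  <author><name>Unknown</name></author>"
    }
    /^=>/ {
      sub(/\r$/, "")                        # tolerate CRLF line endings
      sub(/^=>[ \t]*/, "")                  # drop the link marker
      if ($1 == "") next                    # marker with no URL
      url = $1
      label = $0
      sub(/^[^ \t]+[ \t]*/, "", label)      # what follows the URL is the label
      if (label == "") label = url
      print "  <entry>"
      print "    <id>" esc(url) "</id>"
      print "    <title>" esc(label) "</title>"
      print "    <updated>" esc(updated) "</updated>"
      print "    <link href=\"" esc(url) "\"/>"
      print "  </entry>"
    }
    END { print "</feed>" }
  '
}
```

A call might look like `links_to_atom gemini://example.org/glog/ 'My glog' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" < links.gmi`, where the URL, title, and file name are placeholders.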

Or we could just use gemfeed to generate Atom for us.

7. easeout (a) tilde.team (easeout (a) tilde.team)

📅 Sent: 2020-09-09 23:54
📧 Message 7 of 8

On Wed, Sep 09, 2020 at 04:17:43PM -0400, Paper wrote:
> I am an author of a feed reader and I thought about implementing this
> after I read this, but there are some problems:
> 
> How should my feed reader behave if there are links to different paths
> on the page? I think I should reject all links from a different domain
> and from a different protocol, but there could be a link to the same
> domain, but to a different path.
> 
> => / back
> => /gemlog/1/ post 1
> => /gemlog/2/ post 2
> => /contact/ contact to me
> 
> Here, it is visible which ones are the blog entries and which ones are
> something else, but a program would have to guess. I don't think it
> would be a good idea to hardcode that everything with gemlog in the path
> is a feed item, that would cause too much confusion. Also, different
> clients would behave differently adding to the confusion.

I'm suggesting only a bare bones, dead simple, worse-is-better,
get-what-you-can-with-what's-on-hand version of syndication. I would
just collect all links on the page and not devise an intelligent way to
filter them, as if we were scraping web pages for only the good stuff. I
agree that would lead to divergent behavior among clients.

So let it be just the simplest possible implementation. The upgrade path
to a nicer experience would be for the author to publish RSS or Atom in
order to select just the right links for readers.

> It is a great idea, but I think we could instead write a script or a
> cli-program which would output Atom/RSS/Json feed for a gemini URL
> given. This program would do the necessary guessing and parsing and
> output something that can be understood by all RSS libraries and
> clients. Some clients I know support running a command to get a feed.
> 
> Or we could just use gemfeed to generate Atom for us.

Generating Atom et al is the kind of thing many Gemini authors are
already doing (like me!), and those folks certainly don't need to worry
about this idea. When a feed is offered, I think users will prefer to
use it.

But if this takes off, new authors who haven't taken the step of
generating a feed will be more tied into the conversation already. For
instance, I was not going to get any blog replies or readership until I
did the work to publish Atom so I could then submit to CAPCOM. An 80%
solution would have helped me get engaged as a newcomer. It would also
help future users as Gemini's audience slowly broadens to include a less
technical crowd.

One more benefit for us early adopters: If we lower the barrier to entry
for others, it will mean more content for us to read and engage with.

8. easeout (a) tilde.team (easeout (a) tilde.team)

📅 Sent: 2020-09-10 00:04
📧 Message 8 of 8

On Wed, Sep 09, 2020 at 07:54:12PM -0400, easeout at tilde.team wrote:

> I'm suggesting only a bare bones, dead simple, worse-is-better,
> get-what-you-can-with-what's-on-hand version of syndication. I would
> just collect all links on the page and not devise an intelligent way to
> filter them, as if we were scraping web pages for only the good stuff. I
> agree that would lead to divergent behavior among clients.

In hindsight I think I was unclear, so let me try to restate:

I would just collect every link on the page. I would not devise an
intelligent way to filter them. That would be like scraping web pages
with guesses and approximations, which we don't need to do. I would not
want clients to compete on differentiated Gemtext-subscribing behavior
when RSS / Atom are the right answer for an upgrade path.
