💾 Archived View for gemi.dev › gemini-mailing-list › 000364.gmi captured on 2024-03-21 at 17:27:18. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Recently on the mailing list and in #gemini we've been talking about syndication, as in RSS / Atom. I want to begin by summarizing some of our thoughts.

# Existing formats

RSS and Atom are well-supported formats. They are XML, but you can probably import a parser / generator rather than having to write one yourself.

JSON Feed is a well-supported format, and is regarded as simpler than RSS / Atom. Even if you had to write your own parser, you would be able to build off a JSON parser, because those are available for basically every language. They are also often simpler to build on than an XML parser.

RSS, Atom, and JSON Feed are commonly served over not just Gemini, but other protocols. And existing feed readers generally expect feeds to be in one of these formats. So if you want your site to be syndicated in the existing internet feed reader ecosystem, using one of these formats is the price of entry.

CAPCOM expects Atom, so Atom became a common way to publish Gemini blogs. This works, and does not need to change unless an alternative would be markedly better.

# Making syndication easier

However, some of us are concerned with the complexity of creating a feed alongside creating Gemini content. If one of our goals is to make Gemini content easy to publish, making it easy to syndicate is an attractive goal too. And it is jarring to go from writing Gemtext, which is remarkably simple and straightforward, to writing Atom XML, or to finding a static site generator tool that will do it for you. This feels like a cost. It might remind us of the bloat on the web we are trying to do without. Or it might be a barrier to entry for users who would be confident publishing Gemtext, but not confident publishing Atom, or for that matter, a web page.
To address the fact that creating a feed is a few steps more difficult than creating a Gemtext page, one idea is to make a new feed format which is written in Gemtext and conforms to a pattern. While it's true this would be easy to write, there is no existing parser ecosystem for it, so it would be effectively Gemini-only. If you wanted to syndicate to the broader internet, you would still need to publish something like Atom.

Furthermore, although parsing Gemtext is not as difficult as Atom or JSON Feed if you had to start from plain text and work your way up, existing formats have off-the-shelf parsers in many languages, so you are not starting from the bottom. But with a brand new format based on Gemtext, you would have to. So we are not all sold on the idea of a Gemtext-based feed format.

However, one positive characteristic of such a format is that it would be browsable as a regular Gemtext page. And that got us thinking: If regular Gemtext pages are easy to parse, can we just treat any Gemtext page as a feed?

In fact, yes we can. You can poll plain Gemtext pages, not feed format files, and detect the links on the page that are new. And we don't need a Gemtext feed format pattern, because the pattern of link lines beginning with "=>" is enough.

# Subscribing to any Gemtext page

To get a feel for that, I did a proof of concept that you can try out:

=> https://github.com/kconner/gemini-subscription-cli Proof of Concept

This repository contains a Makefile and a shell script. The script takes a list of Gemini URLs to fetch, which are expected to be Gemtext pages. It filters them down to just the link lines. It compares the set of link lines with a previously-fetched set of link lines, and it identifies and emits those that are new. It saves the new total set as the next run's previous set for comparison.

If you want, you can clone the repo and then try it out.
Subscribe to some pages:

$ echo 'gemini://gemini.circumlunar.space:1965/servers/' >> subscribed-urls
$ echo 'gemini://tilde.team/~easeout/glog/' >> subscribed-urls

Fetch all the links, since we don't have a prior set to compare to:

$ make
fetched: gemini://gemini.circumlunar.space:1965/servers/
fetched: gemini://tilde.team/~easeout/glog/
wrote: new-links.gmi
=> gemini://gus.guru/known-hosts Gemini hosts known to GUS
=> gemini://gemini.conman.org
=> gemini://zaibatsu.circumlunar.space
=> gemini://carcosa.net
…

Later, update again and identify what's new. (To test right away, you can delete a few lines from cached-links.gmi.)

$ make
fetched: gemini://gemini.circumlunar.space:1965/servers/
fetched: gemini://tilde.team/~easeout/glog/
wrote: new-links.gmi
=> gemini://breadpunk.club
=> 2020-09-07-re-a-wordpress-confession.gmi 7 September 2020: Re: A Wordpress Confession

# Where does this functionality belong?

This proof of concept demonstrates that any Gemini page can already be treated like a feed. This means that, if you don't mind not participating in the broader internet feed reader ecosystem, and you don't mind the occasional page redesign creating noise in subscription results, you don't have to make a feed file at all. Syndication is zero-cost.

In hindsight, why does RSS exist? What is "Really Simple" about it? Well, it's simpler than HTML. If HTML was easy enough to parse, you could just subscribe to web pages and not need feed files. But Gemtext is already easy to parse. We don't need our experience on the web to mislead us into thinking a page, like your blog index page, is not already a subscribable list of links.

It's been pointed out this is not very different from Spacewalk, which (I think?) works by just polling Gemini pages and updating when they change, as opposed to when their links change. Both of these are useful and easy ways to subscribe to any Gemtext page.
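The link-diffing idea described above can be sketched in a few lines. This is only an illustration, not the code from the proof-of-concept repository: the function names `link_lines` and `new_links` are hypothetical, and skipping preformatted blocks and normalizing whitespace are assumptions made here so that cosmetic edits do not show up as new links.

```python
def link_lines(gemtext: str) -> list[str]:
    """Return the link lines of a Gemtext page, normalized to '=> URL LABEL'."""
    links = []
    in_preformatted = False
    for line in gemtext.splitlines():
        if line.startswith("```"):
            # Toggle lines open and close preformatted blocks.
            in_preformatted = not in_preformatted
            continue
        if in_preformatted or not line.startswith("=>"):
            continue
        parts = line[2:].strip().split(maxsplit=1)
        if not parts:
            continue  # a bare "=>" with no URL is not a usable link
        links.append("=> " + " ".join(parts))
    return links


def new_links(current: list[str], cached: set[str]) -> list[str]:
    """Return links in `current` that are not in `cached`, in page order."""
    seen = set(cached)
    fresh = []
    for link in current:
        if link not in seen:
            seen.add(link)
            fresh.append(link)
    return fresh


# Example: the second fetch of a page has one link the first did not.
first_fetch = "# Servers\n=> gemini://gemini.conman.org\n=> gemini://carcosa.net\n"
second_fetch = first_fetch + "=> gemini://breadpunk.club\n"
cached = set(link_lines(first_fetch))
print(new_links(link_lines(second_fetch), cached))
# prints ['=> gemini://breadpunk.club']
```

After each run, the caller would store the union of the cached set and the current links, which corresponds to saving "the new total set" for the next comparison.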
# Closing

Gemtext authors could feel like they are not burdened with creating a feed file if browsers, feed readers, and aggregators allowed subscribing to Gemtext pages directly. When as an author you understand that a feed file is unnecessary, the pressure to create a feed file is lifted. But for that to happen, you have to see direct Gemtext subscription working in the wild. One way that could work is for a browser to have a way to treat a bookmark like a feed subscription. I also think if CAPCOM would accept Gemtext pages as a feed URL option, that would go a long way.
easeout at tilde.team writes:

> However, some of us are concerned with the complexity of creating a feed
> alongside creating Gemini content. If one of our goals is to make Gemini
> content easy to publish, making it easy to syndicate is an attractive
> goal too. And it is jarring to go from writing Gemtext, which is
> remarkably simple and straightforward, to writing Atom XML, or to
> finding a static site generator tool that will do it for you. This feels
> like a cost. It might remind us of the bloat on the web we are trying to
> do without. Or it might be a barrier to entry for users who would be
> confident publishing Gemtext, but not confident publishing Atom, or for
> that matter, a web page.

At this point in time, how many folks can write Gemtext but aren't confident writing HTML? Yes, for those that hand-author feeds, it's certainly a bit of a bother writing XML by hand (I've screwed it up myself a few times), but I'm not sure the Gemini community is at the point where we have users that are confident with Gemtext but not with HTML.

> # Subscribing to any Gemtext page
>
> To get a feel for that, I did a proof of concept that you can try out:
>
> => https://github.com/kconner/gemini-subscription-cli Proof of Concept
>
> This repository contains a Makefile and a shell script. The script takes
> a list of Gemini URLs to fetch, which are expected to be Gemtext pages.
> It filters them down to just the link lines. It compares the set of link
> lines with a previously-fetched set of link lines, and it identifies and
> emits those that are new. It saves the new total set as the next run's
> previous set for comparison.

If there's just a shell script parsing text and making socat commands, what's stopping an author from using `jq` or `xidel` on the output to parse JSON and XML respectively? I realize in code it's not that simple, but most languages in general use have JSON and XML parsers easily available and generally well documented.
> This proof of concept demonstrates that any Gemini page can already be
> treated like a feed. This means that, if you don't mind not
> participating in the broader internet feed reader ecosystem, and you
> don't mind the occasional page redesign creating noise in subscription
> results, you don't have to make a feed file at all. Syndication is
> zero-cost.

One of my favorite parts of Gemini is the attitude that it works with the rest of the net, as opposed to trying to cloister itself like the Web does. One of my pet peeves with the web is how it tries to reinvent everything. Long-lived connections, push protocols, things that TCP has had _forever_.

> In hindsight, why does RSS exist? What is "Really Simple" about it?
> Well, it's simpler than HTML. If HTML was easy enough to parse, you
> could just subscribe to web pages and not need feed files. But Gemtext
> is already easy to parse. We don't need our experience on the web to mislead
> us into thinking a page, like your blog index page, is not already a
> subscribable list of links.

Small aside: Semantically annotated tags have been a dream for the W3C for quite some time now, and annotating feed links and other tags is the purpose behind Semantic Web initiatives. The web has definitely tried to allow in-band feeds.

> Gemtext authors could feel like they are not burdened with creating a
> feed file if browsers, feed readers, and aggregators allowed subscribing
> to Gemtext pages directly. When as an author you understand that a feed
> file is unnecessary, the pressure to create a feed file is lifted. But
> for that to happen, you have to see direct Gemtext subscription working
> in the wild. One way that could work is for a browser to have a way to
> treat a bookmark like a feed subscription. I also think if CAPCOM would
> accept Gemtext pages as a feed URL option, that would go a long way.

I still feel like this is NIH, but I'm not a CAPCOM or Spacewalk author so a decision like this isn't my place. As long as we continue to have Atom/RSS support, that's all I'd like. I guess if there is a broader move to Gemtext feeds, there should be pretty simple methods available to convert those into Atom-compatible feeds.

- meff
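As a rough illustration of the kind of conversion mentioned above, here is a minimal sketch that renders a list of links as an Atom document using only the Python standard library. The function name `links_to_atom` and its parameters are hypothetical. It assumes the links have already been extracted and made absolute, and that the caller supplies the title, the author name, and a timestamp, since a plain Gemtext page carries none of these in machine-readable form.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM = "http://www.w3.org/2005/Atom"


def links_to_atom(page_url, title, author, links, updated):
    """Render (url, label) pairs as an Atom feed and return it as a string.

    `updated` must be a timezone-aware datetime. It is reused for every
    entry because the source page has no per-link dates (an assumption).
    """
    ET.register_namespace("", ATOM)  # serialize Atom as the default namespace
    stamp = updated.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    def child(parent, name, text=None, **attrs):
        element = ET.SubElement(parent, f"{{{ATOM}}}{name}", attrs)
        element.text = text
        return element

    feed = ET.Element(f"{{{ATOM}}}feed")
    child(feed, "title", title)
    child(feed, "id", page_url)
    child(feed, "updated", stamp)
    child(feed, "link", href=page_url)
    child(child(feed, "author"), "name", author)
    for url, label in links:
        entry = child(feed, "entry")
        child(entry, "title", label or url)  # unlabeled links use their URL
        child(entry, "id", url)
        child(entry, "updated", stamp)
        child(entry, "link", href=url)
    return ET.tostring(feed, encoding="unicode")


feed_xml = links_to_atom(
    "gemini://example.org/glog/",
    "Example glog",
    "example author",
    [("gemini://example.org/glog/post-1.gmi", "Post 1")],
    datetime(2020, 9, 8, tzinfo=timezone.utc),
)
print(feed_xml)
```

Giving every entry the same timestamp is the weakest part of this sketch. A real converter would more likely record when each link was first seen and use that instead.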
I talked about this a bit on #gemini but thought I could collect some of my points here for posterity.

On 9/8/20 12:31 PM, easeout at tilde.team wrote:

> RSS and Atom are well-supported formats. They are XML, but you can
> probably import a parser / generator rather than having to write one
> yourself.

Part of the beauty of Gemini to me is that there is really no bar to entry in terms of what the system has to have before you start. There is TLS, but I think having this encryption requirement is more than reasonable (even if I have reservations about the design of TLS). Requiring something that uses XML would limit the ecosystems that Gemini could exist on. I loved seeing the adoption in otherwise limited software stacks like plan9, where the complexity of things like XML parsing has prevented a readily usable library from existing on the platform.

> This proof of concept demonstrates that any Gemini page can already be
> treated like a feed. This means that, if you don't mind not
> participating in the broader internet feed reader ecosystem, and you
> don't mind the occasional page redesign creating noise in subscription
> results, you don't have to make a feed file at all. Syndication is
> zero-cost.

This is fantastic. A single page could serve as both a human-readable index and something like a feed. Having a single page for both is, I think, really in the spirit of how I interpreted this protocol. The fact that you could whip up something this small to process feeds like this is a testament to the simplicity of this type of system already.

> Gemtext authors could feel like they are not burdened with creating a
> feed file if browsers, feed readers, and aggregators allowed subscribing
> to Gemtext pages directly. When as an author you understand that a feed
> file is unnecessary, the pressure to create a feed file is lifted. But
> for that to happen, you have to see direct Gemtext subscription working
> in the wild. One way that could work is for a browser to have a way to
> treat a bookmark like a feed subscription. I also think if CAPCOM would
> accept Gemtext pages as a feed URL option, that would go a long way.

I don't think something like this would stop people from using something like Atom as their own feed system, which I find more than fine. If servers would like to, they could even offer an automated way of creating those types of feeds based on something like the Gemtext implementation. I would expect this type of feature to evolve over time for whichever feed system is chosen regardless. For me personally, I think this satisfies the 'complexity when you want it and none if you don't' attitude that I find attractive. What I am trying to avoid here is the expectation that servers need to provide an XML-based feed and the expectation that clients should parse it.

Thanks,
Moody
On Tue, Sep 08, 2020 at 11:17:42AM -0700, Meff wrote:

> At this point in time, how many folks can write Gemtext but aren't
> confident writing HTML? Yes, for those that hand-author feeds, it's
> certainly a bit of a bother writing XML by hand (I've screwed it up
> myself a few times), but I'm not sure the
> Gemini community is at the point where we have users that are confident
> with Gemtext but not with HTML.

> If there's just a shell script parsing text and making socat commands, what's
> stopping an author from using `jq` or `xidel` on the output to parse
> JSON and XML respectively? I realize in code it's not that simple, but
> most languages in general use have JSON and XML parsers easily available
> and generally well documented.

It's true that us early adopters are comfortable enough to handle this, and that publishing Atom or RSS has distinct advantages. But I hope we get to the point where it's not just us early adopters. If we can make it simpler for newcomers, we might get there sooner.

I think for Gemini to grow and thrive and have new content that sustains it, the ecosystem needs to:

- Convert web readers to Gemini readers, so there is more audience
- Convert Gemini readers to Gemini publishers, so there is more content
- Convert Gemini publishers to Gemini hosts, so there are more domains

That's a funnel, basically. At each stage, some fraction of people will stop and not move to the next stage. The simpler we make each step, the farther through the funnel the average user will get. So I want to push on all of these things. Today I'm attempting to make the syndication part of publishing simpler.

There's also nothing wrong with continuing to publish Atom or RSS. I can think of two big advantages:

- You would still have to use a feed format in order to syndicate beyond Gemini.
- You could declare exactly which links you wanted to publish, which is nicer for your readers, so readers should prefer it when it's offered.

In other words, incentives still exist to make the upgrade. I just don't think there is a cost to building clients that can subscribe to your Gemtext page even if you haven't made a feed.

> One of my favorite parts of Gemini is the attitude that it works with the
> rest of the net, as opposed to trying to cloister itself like the Web
> does. One of my pet peeves with the web is how it tries to reinvent
> everything. Long-lived connections, push protocols, things that TCP has
> had _forever_.

I feel that. Do you think that, if we merely allowed RSS or Atom, and did not lean into them as a standard, we would be heading toward a compatibility problem because a smaller fraction of Gemini pages would have RSS or Atom feeds? If so, could that be offset by the way simplicity might encourage more users to create content?

> I still feel like this is NIH, but I'm not a CAPCOM or Spacewalk author
> so a decision like this isn't my place. As long as we continue to have
> Atom/RSS support, that's all I'd like. I guess if there is a broader
> move to Gemtext feeds, there should be pretty simple methods available
> to convert those into Atom-compatible feeds.

I wouldn't say we're inventing anything. You may have understood this already, but I'm not sure whether I was clear: I'm not suggesting we create a new feed format that happens to have syntax in common with Gemtext. I'm suggesting that Gemtext, because it contains links, already is a feed if you interpret it that way.

But anyway, yes, I think aggregators and feed readers should continue to use RSS and Atom and JSON Feed, and I would not want that to change.

(Sorry for the repeat email; I failed to send it to the list address.)
On Tue, Sep 08, 2020 at 01:45:27PM -0500, Jacob Moody wrote:

> I don't think something like this would stop people from using something
> like atom or so on as their own feed system, which I find more than fine. If
> servers would like to, they could even offer an automated way of creating
> those types of feeds based on something like the Gemtext implementation. I
> would expect this type of feature to evolve over time for whichever feed
> system is chosen regardless. For me personally I think this satisfies the
> 'complexity when you want it and none if you don't' attitude that I find
> attractive. What I am trying to avoid here is the expectation that servers
> need to provide an xml based feed and the expectation that clients should
> parse it.

Agreed, RSS et al are great and we shouldn't get rid of them. But they could be viewed as an upgrade path from a basic but functional starting point.

(Sorry for the repeat message; I failed to send to the list address.)
I am an author of a feed reader and I thought about implementing this after I read this, but there are some problems:

How should my feed reader behave if there are links to different paths on the page? I think I should reject all links from a different domain and from a different protocol, but there could be a link to the same domain, but to a different path.

=> / back
=> /gemlog/1/ post 1
=> /gemlog/2/ post 2
=> /contact/ contact to me

Here, a human can see which ones are the blog entries and which ones are something else, but a program would have to guess. I don't think it would be a good idea to hardcode that everything with gemlog in the path is a feed item; that would cause too much confusion. Also, different clients would behave differently, adding to the confusion.

It is a great idea, but I think we could instead write a script or a CLI program which would output an Atom/RSS/JSON feed for a given Gemini URL. This program would do the necessary guessing and parsing and output something that can be understood by all RSS libraries and clients. Some clients I know support running a command to get a feed.

Or we could just use gemfeed to generate Atom for us.
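The rule described above, rejecting links to a different domain or protocol, can be sketched as follows. This is a hypothetical helper, not taken from any existing feed reader. One assumption is worth flagging: Python's `urljoin` only resolves relative references for schemes it knows about, so the sketch registers `gemini` in the `urllib.parse` scheme lists first.

```python
from urllib import parse

# urljoin leaves relative references unresolved for unknown schemes,
# so register "gemini" once (guarded to stay idempotent).
for registry in (parse.uses_relative, parse.uses_netloc):
    if "gemini" not in registry:
        registry.append("gemini")


def same_site_links(page_url, targets):
    """Resolve link targets against page_url; keep same-scheme, same-host ones."""
    base = parse.urlsplit(page_url)
    kept = []
    for target in targets:
        absolute = parse.urljoin(page_url, target)
        parts = parse.urlsplit(absolute)
        if parts.scheme == base.scheme and parts.hostname == base.hostname:
            kept.append(absolute)
    return kept


targets = ["/", "/gemlog/1/", "/gemlog/2/", "/contact/",
           "https://example.org/", "gemini://other.example/"]
for url in same_site_links("gemini://example.org/index.gmi", targets):
    print(url)
```

With the example page above, the two off-site links are dropped, but "/" and "/contact/" still pass the filter. That is exactly the guessing problem described: a domain and protocol check alone cannot tell blog entries from navigation.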
On Wed, Sep 09, 2020 at 04:17:43PM -0400, Paper wrote:

> I am an author of a feed reader and I thought about implementing this
> after I read this, but there are some problems:
>
> How should my feed reader behave if there are links to different paths
> on the page? I think I should reject all links from a different domain
> and from a different protocol, but there could be a link to the same
> domain, but to a different path.
>
> => / back
> => /gemlog/1/ post 1
> => /gemlog/2/ post 2
> => /contact/ contact to me
>
> Here, it is visible which ones are the blog entries and which ones are
> something else, but a program would have to guess. I don't think it
> would be a good idea to hardcode that everything with gemlog in the path
> is a feed item, that would cause too much confusion. Also, different
> clients would behave differently adding to the confusion.

I'm suggesting only a bare bones, dead simple, worse-is-better, get-what-you-can-with-what's-on-hand version of syndication. I would just collect all links on the page and not devise an intelligent way to filter them, as if we were scraping web pages for only the good stuff. I agree that would lead to divergent behavior among clients. So let it be just the simplest possible implementation. The upgrade path to a nicer experience would be for the author to publish RSS or Atom in order to select just the right links for readers.

> It is a great idea, but I think we could instead write a script or a
> cli-program which would output Atom/RSS/Json feed for a gemini URL
> given. This program would do the necessary guessing and parsing and
> output something that can be understood by all RSS libraries and
> clients. Some clients I know support running a command to get a feed.
>
> Or we could just use gemfeed to generate Atom for us.

Generating Atom et al is the kind of thing many Gemini authors are already doing (like me!), and those folks certainly don't need to worry about this idea. When a feed is offered, I think users will prefer to use it.

But if this takes off, new authors who haven't taken the step of generating a feed will be more tied into the conversation already. For instance, I was not going to get any blog replies or readership until I did the work to publish Atom so I could then submit to CAPCOM. An 80% solution would have helped me get engaged as a newcomer. It would also help future users as Gemini's audience slowly broadens to include a less technical crowd.

One more benefit for us early adopters: If we lower the barrier to entry for others, it will mean more content for us to read and engage with.
On Wed, Sep 09, 2020 at 07:54:12PM -0400, easeout at tilde.team wrote:

> I'm suggesting only a bare bones, dead simple, worse-is-better,
> get-what-you-can-with-what's-on-hand version of syndication. I would
> just collect all links on the page and not devise an intelligent way to
> filter them, as if we were scraping web pages for only the good stuff. I
> agree that would lead to divergent behavior among clients.

In hindsight I think I was unclear, so let me try to restate:

I would just collect every link on the page. I would not devise an intelligent way to filter them. That would be like scraping web pages with guesses and approximations, which we don't need to do. I would not want clients to compete on differentiated Gemtext-subscribing behavior when RSS / Atom are the right answer for an upgrade path.