💾 Archived View for gemini.patatas.ca › posts › roll_your_own_rss.gmi captured on 2024-03-21 at 14:50:21. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
2024-02-21
It's a fair question. There are loads of RSS feed-building services out there, and I'm sure they do the job just fine. But when it comes to my website, I like taking the amateur DIY route as much as I can.
Masochism, you say? Sure, maybe a little. But it's also really satisfying to get your hands dirty sometimes. There's a meaningful perspective shift that happens as you begin to demystify some of this stuff: you start seeing yourself less as a consumer when it comes to tech, and more like a handyperson, a participant in a process.
Besides, in this case, "really simple" is right there in the name. How hard could it be?
As it turns out, not very.
So here's how I built a basic RSS feed for my website, and as a bonus, a few RSS buttons you can add to your own site.
--:--
An RSS feed is defined by an XML document saved somewhere in your site's directory structure. For this example, let's make a directory '/rss' with an empty file 'rss.xml' inside it.
Next, we'll open rss.xml and start editing. I highly recommend typing everything yourself instead of copy-pasting. The first couple of lines describe the document itself:
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
After that, we define some required information about our site. You can find your country/region's language code here:
https://www.rssboard.org/rss-language-codes
<channel>
  <title>Your Website's Name</title>
  <link>https://your.website.tld/</link>
  <description>Describe your website in one or two sentences</description>
  <language>en-ca</language>
You probably want your site's logo to appear with your feed too.
According to the RSS 2.0 specification, the image's title and link should in practice match the channel's title and link from before, and your image needs to be a PNG, JPEG or GIF. The link element is where the image links to if clicked:
  <image>
    <title>Your Website's Name</title>
    <url>https://your.website.tld/images/yourlogo.png</url>
    <link>https://your.website.tld/</link>
  </image>
There are other (optional) elements you can add here, e.g. copyright information, contact emails, publication & update time/date, content rating, etc. To find out more about these, see:
https://www.rssboard.org/rss-specification
And if you're curious like I was, here's a detailed look at the ARPANET-era RFC822 time & date format:
https://whitep4nth3r.com/blog/how-to-format-dates-for-rss-feeds-rfc-822/
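If you'd rather not format these dates by hand, here's a minimal sketch in Python (my pick for illustration; nothing in this guide depends on it). It leans on the standard library's email.utils.format_datetime, which writes dates in the RFC 822 style that <pubDate> expects. The helper name rss_pub_date and the UTC-5 offset are assumptions made up for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def rss_pub_date(dt: datetime) -> str:
    """Format a timezone-aware datetime for an RSS <pubDate> element."""
    if dt.tzinfo is None:
        # A naive datetime would leave the timezone ambiguous, so refuse it.
        raise ValueError("give the datetime an explicit timezone")
    return format_datetime(dt)

# Hypothetical example: midnight on 21 Feb 2024 at UTC-5 (Eastern Standard Time)
eastern = timezone(timedelta(hours=-5))
print(rss_pub_date(datetime(2024, 2, 21, tzinfo=eastern)))
# prints: Wed, 21 Feb 2024 00:00:00 -0500
```

The numeric offset -0500 says the same thing as the EST abbreviation; both are allowed by the date format.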
Here's where we can start adding the actual feed items. I'll use the web version of this post for the example below, and save myself some time once I'm ready to publish:
  <item>
    <title>Roll Your Own RSS Feed</title>
    <link>https://thedabbler.patatas.ca/pages/roll_your_own_rss.html</link>
    <guid>https://thedabbler.patatas.ca/pages/roll_your_own_rss.html</guid>
    <pubDate>Wed, 21 Feb 2024 00:00:00 EST</pubDate>
    <description>A step-by-step guide to creating your own RSS feed from scratch</description>
    <category>syndication</category>
  </item>
The <title>, <link>, and <description> elements should be self-explanatory, and the links in the previous section explain the date/time format.
That leaves the <guid>. Does it always have to be the same as the <link>? Short answer: not always.
Long answer: The idea here is that the guid (globally unique identifier) helps aggregators determine if an item is new or not. In theory, the guid can be any string of characters, as long as that string is unique.
This comes with a caveat though. According to the RSS 2.0 spec, there's an optional attribute of the <guid> element that's called 'isPermaLink', and its default value is "true". So if your guid is some arbitrary string, make sure isPermaLink is set to "false", or feed aggregators will think it's supposed to be a URL:
<guid isPermaLink="false">thedabbler-blog-post-7254</guid>
I don't need this though, so I just copy-pasted the URL.
Lastly, there's the <category> element(s). Put as many as you like. Treat them like keywords.
Continue adding items to your feed, and when you're done, close out the file by adding </channel> and </rss> at the end.
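Once hand-typing items has lost its charm, the same structure can be generated. The following is only a sketch, assuming Python and its built-in xml.etree.ElementTree module; make_item is a hypothetical helper, not part of any RSS tooling. It writes the elements described above and sets isPermaLink="false" only when the guid is something other than the link.

```python
import xml.etree.ElementTree as ET

def make_item(title, link, pub_date, description, categories=(), guid=None):
    """Build one RSS <item>; a custom guid is marked isPermaLink="false"."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    guid_el = ET.SubElement(item, "guid")
    if guid is None:
        # Reuse the URL as the guid; isPermaLink defaults to "true".
        guid_el.text = link
    else:
        guid_el.text = guid
        guid_el.set("isPermaLink", "false")
    ET.SubElement(item, "pubDate").text = pub_date
    ET.SubElement(item, "description").text = description
    for name in categories:
        ET.SubElement(item, "category").text = name
    return item

item = make_item(
    "Roll Your Own RSS Feed",
    "https://thedabbler.patatas.ca/pages/roll_your_own_rss.html",
    "Wed, 21 Feb 2024 00:00:00 EST",
    "A step-by-step guide to creating your own RSS feed from scratch",
    categories=["syndication"],
)
ET.indent(item)  # pretty-printing only; needs Python 3.9 or newer
print(ET.tostring(item, encoding="unicode"))
```

A side benefit of building elements this way is that ElementTree escapes characters like '&' in text for you when it serializes.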
--:--
At this point, you should have a functional, standards-compliant RSS feed. Validate it at either one of these locations:
https://validator.w3.org/feed/
https://www.rssboard.org/rss-validator/
(note: if you're getting 'XML parsing error' at this step, read on to the end of this post to learn about CDATA and escape characters). You can also try subscribing to it with your feed reader.
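For a quick local check before reaching for the online validators, here's a rough sketch in Python using the standard library's XML parser. It only tests whether the file is well-formed XML (the usual cause of an 'XML parsing error'), not whether it's valid RSS. The function name check_feed and the two sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET

def check_feed(xml_text: str) -> str:
    """Return "ok" if the text is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The message includes the line and column where parsing stopped.
        return f"XML parsing error: {err}"
    return "ok"

good = "<rss version='2.0'><channel><title>Tips &amp; Advice</title></channel></rss>"
bad = "<rss version='2.0'><channel><title>Tips & Advice</title></channel></rss>"

print(check_feed(good))  # prints: ok
print(check_feed(bad))   # reports where the unescaped '&' tripped the parser
```

To check a real feed, you'd pass in the contents of your rss.xml instead of the sample strings.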
If this all works, you could definitely just put a link button on your page and call it a day. There are a couple other nice additions worth considering, though.
First off: there are browser plugins that can automatically let you know if there's an RSS feed available for the page you're browsing. Putting a regular link on your site won't trigger this - but putting something like this in the <head> of your page's HTML will:
<link rel="alternate" type="application/rss+xml" title="Your Site's RSS" href="https://your.website.tld/rss/rss.xml">
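To see why the <head> tag matters, here's a sketch of roughly how a tool might discover the feed. I'm assuming the general approach here; I haven't read any particular plugin's code. It uses Python's built-in html.parser, and the class name FeedLinkFinder and the sample page are invented for the illustration.

```python
from html.parser import HTMLParser

class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate" type="application/rss+xml"> tags."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        mime = (a.get("type") or "").lower()
        if tag == "link" and "alternate" in rel and mime == "application/rss+xml" and a.get("href"):
            self.feeds.append(a["href"])

page = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="Your Site's RSS" href="https://your.website.tld/rss/rss.xml">
</head><body><a href="/rss/rss.xml">RSS</a></body></html>"""

finder = FeedLinkFinder()
finder.feed(page)
print(finder.feeds)  # prints: ['https://your.website.tld/rss/rss.xml']
```

Note that the ordinary <a> link in the body is ignored: only the <link> tag with the right rel and type gets picked up.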
Second: If you're using a feed reader a lot, you probably appreciate having the full text of articles delivered to the reader. Less distraction, less page-loading, less advertising, pop-ups, etc. I searched the web using terms like "rss feed include fulltext", but couldn't find any instructions for how to get this to work. I even cracked open the XML files of a couple of blogs I follow, just to see if I could figure it out that way, but only got part-way there.
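Eventually I discovered I'd gone down a blind alley; including full text in a feed isn't part of RSS proper. It comes from the 'content' module, a namespaced extension that was originally defined for RSS 1.0. So to add full text, we first need to declare that namespace on the <rss> tag, along with the Atom namespace, which we'll use for one extra element: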
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
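and add the following inside <channel>, somewhere before the first <item> element. It's an Atom element that tells readers the feed's own address, so the href should point at the feed itself: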
<atom:link href="http://thedabbler.patatas.ca/rss.xml" rel="self" type="application/rss+xml"/>
Now we're ready to start adding fulltext to our items.
Well, almost. One more bit of housekeeping first.
If you only want plain text without HTML markup, the <description> element can already carry that, but it would be a lot cooler (and a lot more readable) to render the HTML in the feed reader. So, let's use <content:encoded>. (Note: this needs to go after the item's <description> element.)
But if we simply copy-paste the HTML from our post in, it won't render. Why? Because the 'content:encoded' element doesn't stop the feed reader's XML parser from treating our HTML as part of the XML document's own markup. Every time it sees a character like '&' or '<', it tries to read what follows as an XML entity or tag.
The solution is to wrap our HTML inside another layer called CDATA, which tells the reader that everything inside should be interpreted as a string of characters, instead of as code:
<content:encoded>
  <![CDATA[
    -your post's HTML-
  ]]>
</content:encoded>
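If you end up generating this part with a script, here's a small sketch in Python of the wrapping step. The helper name content_encoded and the sample HTML are hypothetical. The one wrinkle it handles is a literal "]]>" inside the post, which would otherwise end the CDATA section early; the usual workaround is to split it across two CDATA sections.

```python
import xml.etree.ElementTree as ET

def content_encoded(html: str) -> str:
    """Wrap post HTML in a CDATA section inside <content:encoded>."""
    # A literal "]]>" would close the CDATA section early, so split it in two.
    safe = html.replace("]]>", "]]]]><![CDATA[>")
    return f"<content:encoded><![CDATA[{safe}]]></content:encoded>"

post_html = "<div><p>Tips & tricks, plus a stray ]]> for good measure</p></div>"
snippet = content_encoded(post_html)

# Round trip: parse the snippet (with the content namespace declared) and
# confirm a reader would get back exactly the HTML we put in.
doc = f'<item xmlns:content="http://purl.org/rss/1.0/modules/content/">{snippet}</item>'
parsed = ET.fromstring(doc)
print(parsed[0].text == post_html)  # prints: True
```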
Once this is done, go ahead and check your feed again to verify it's still valid.
One thing that might still cause (minor) errors is if you have any special characters in your text that aren't part of the HTML markup. My understanding is that it's possible they'll get ignored by a modern browser, but not by someone's feed reader.
So if you didn't already, be sure to use
&amp; instead of &
&lt; instead of <
&gt; instead of >
in your posts wherever those characters aren't part of the HTML markup itself.
The same thing goes for other places in this XML document, like the description text or post titles: use those escape characters and avoid errors!
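If your titles and descriptions pass through a script on their way into the feed, the escaping can be automated. A minimal sketch, assuming Python: the standard library's xml.sax.saxutils.escape replaces exactly these three characters. The sample strings are just for illustration.

```python
from xml.sax.saxutils import escape

title = "Another Post Containing Tips & Advice"
print(escape(title))
# prints: Another Post Containing Tips &amp; Advice

print(escape("if a < b && b > c"))
# prints: if a &lt; b &amp;&amp; b &gt; c
```

One thing to watch: only escape text once. Running already-escaped text through escape again turns &amp; into &amp;amp;.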
--:--
The full document should end up looking something like this:
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>The Dabbler</title>
    <link>https://thedabbler.patatas.ca/</link>
    <description>
      Web home of @smallpatatas. Writing about whatever strikes my fancy. Most often that means some combination of: the Fediverse, beginner coding and server administration, music-making, politics, cooking, art, economics, philosophy, technology.
    </description>
    <language>en-ca</language>
    <image>
      <title>The Dabbler</title>
      <url>https://thedabbler.patatas.ca/images/favicon.png</url>
      <link>https://thedabbler.patatas.ca/</link>
    </image>
    <atom:link href="http://thedabbler.patatas.ca/rss.xml" rel="self" type="application/rss+xml"/>
    <item>
      <title>Roll Your Own RSS Feed</title>
      <link>https://thedabbler.patatas.ca/pages/roll_your_own_rss.html</link>
      <guid>https://thedabbler.patatas.ca/pages/roll_your_own_rss.html</guid>
      <pubDate>Wed, 21 Feb 2024 00:00:00 EST</pubDate>
      <description>A step-by-step guide to creating your own RSS feed from scratch</description>
      <category>syndication</category>
      <category>RSS</category>
      <category>tutorial</category>
      <content:encoded>
        <![CDATA[
          <div>
            <p>An infinitely recursive loop of code</p>
            <p>More HTML from the post</p>
          </div>
        ]]>
      </content:encoded>
    </item>
    <item>
      <title>Another Post Containing Tips &amp; Advice</title>
      <link>https://thedabbler.patatas.ca/pages/another_post_url</link>
      <guid>https://thedabbler.patatas.ca/pages/another_post_url</guid>
      <pubDate>Sat, 17 Feb 2024 00:00:00 EST</pubDate>
      <description>Some tips &amp; advice for doing other cool stuff on the web</description>
      <category>cool stuff</category>
      <category>neat things</category>
      <content:encoded>
        <![CDATA[
          <div>
            <p>Some HTML from the fun cool post</p>
            <p>More HTML from the rad &amp; wonderful post</p>
          </div>
        ]]>
      </content:encoded>
    </item>
  </channel>
</rss>
Once your feed is up and running, reward your hard work by grabbing one of these vintage 'chicklets', or generating your own at
https://sikosis.com/tools/chicklet/index.php
and adding it to your page. I'll add more as I find them.
What better way to signal your DIY-web street cred to the world?
../images/chiclets/radiofeed.png
../images/chiclets/rss-2.0-2.png
../images/chiclets/rss2.0-feed.png
../images/chiclets/rss-2.0.png
../images/chiclets/rss-button2.png
../images/chiclets/rss-button3.png
../images/chiclets/rss-button.png
../images/chiclets/rss_icon_11x11.png
../images/chiclets/rss_icon_15x15.png
--:--