💾 Archived View for ew.srht.site › en › 2022 › 20220917-musings-on-smolzine-2.gmi captured on 2024-05-26 at 14:39:39. Gemini links have been rewritten to link to archived content
I wrote:
And what is in this list now? OK, I can look at the 203 entries. But I can also try to extract the domain names of the capsules and see whether some are mentioned a lot more than others.
/en/2022/20220915-musings-on-smolzine.gmi
kelbot was quick to point out that my shell one-liner seemed a bit strange, since pollux.casa and flounder were indeed referenced in SmolZINE multiple times. He was right, of course. I had not "normalized" the domains to their last two fields. So, try it again, Sam:
So let's explain the resulting one-liner in pieces. All issues of SmolZINE are files in the current directory; from them, grep the gemini links. Be sure to suppress the file name (-h). Also be sure to anchor the search at the beginning of the line with '^':
grep -h '^=> gemini://' smolzine-issue-*.gmi |
The resulting lines start with the marker for links and a blank (optional but apparently always present in this output). So I ask good ol' awk to just print the second field:
awk '{print $2;}' |
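To see what these first two stages produce, here is a small sketch on a made-up sample file. The file name smolzine-issue-99.gmi and the links in it are assumptions for illustration only; they are not taken from a real issue.

```shell
# Hypothetical sample issue; file name and links are made up for illustration.
tmpdir=$(mktemp -d)
cat > "$tmpdir/smolzine-issue-99.gmi" <<'EOF'
# SmolZINE sample
=> gemini://alice.flounder.online/ Alice's capsule
=> gemini://bob.pollux.casa/log/ Bob's gemlog
=> https://example.org/ not a gemini link
Some text mentioning => gemini://ignored.example/ mid-line
EOF

# Stage 1: keep only lines that start with a gemini link (-h drops file names).
# Stage 2: print the second whitespace-separated field, which is the URL.
urls=$(grep -h '^=> gemini://' "$tmpdir"/smolzine-issue-*.gmi | awk '{print $2;}')
printf '%s\n' "$urls"

rm -r "$tmpdir"
```

Only the two lines that begin with the link marker survive. The https link and the link mentioned in the middle of a line are dropped by the '^' anchor and the gemini:// prefix.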
Now I have a list of all the gemini:// entries. In order to extract the highest two parts of the domain name of each entry, I ask good ol' sed to massage the strings.