💾 Archived View for bbs.geminispace.org › u › oldernow › 12069 captured on 2023-12-28 at 16:43:51. Gemini links have been rewritten to link to archived content


-=-=-=-=-=-=-

Comment by ☯️ oldernow

Re: "Lagrange plugins?"

In: s/Lagrange

@skyjake Thanks for the "MIME hooks" documentation pointer!

> However, I doubt you'll want to use this. The filters you register via MIME hooks are unconditionally run on every matching page. This would make every Gemtext page with links load extremely slowly as it has to finish checking each link before showing the filtered results. Furthermore, you'd run the risk of overburdening slower servers by sending them too many requests at once.

I'm not understanding the "at once" part of that, as I'm running gemget in a loop, with each invocation not starting until the one before it completes. Or would hitting a server several times in a row, each request spaced two seconds apart, still be considered impolite, as it were?

FWIW, I made improvements to the aforementioned Lua script: it now uses the exit status returned by os.execute() instead of parsing gemget's stderr output from io.popen(). (I think there's a way to get a command's exit status down the io.popen() path, but I'm not remembering the details, and a quick grep of old scripts didn't reveal anything.) I also noticed gemget has a "--connect-timeout" option. Two seconds will probably cause me to miss out on some links unnecessarily, but then I tend toward impatience and probably wouldn't want to wait on such links regardless of the content, given that my Gemini travels have me rough-estimating that maybe 1% of links overall ever lead to something that "really does it for me":

#! /usr/bin/env lua
for line in io.stdin:lines() do
	local url,title = string.match(line, '^=>%s*(%S*)%s*(.*)$')
	if url then
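		-- Single-quote the URL so shell metacharacters in it (such as & or ;) are not interpreted
		local quoted_url = "'" .. url:gsub("'", "'\\''") .. "'"
		local exit_okay, exit_termination_type, exit_status = os.execute('gemget --connect-timeout 2 ' .. quoted_url .. ' 1>/dev/null 2>/dev/null')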
		if exit_status == 1 then
			print('DEAD ' .. line)
		else
			print(line)
		end
	else
		print(line)
	end
end
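
On the io.popen() exit-status question above: in Lua 5.2 and later, calling close() on a handle created by io.popen() returns the same three values as os.execute(). A minimal sketch, assuming a POSIX shell and Lua 5.2+ (Lua 5.1 does not report the status this way); the helper name run is made up for illustration:

```lua
-- Sketch: read a command's output AND its exit status through io.popen().
-- Assumption: Lua 5.2 or later, where close() on a popen handle returns
-- the same values as os.execute(): ok, "exit" or "signal", and the code.
local function run(cmd)
	local pipe = assert(io.popen(cmd, 'r'))
	local output = pipe:read('*a')      -- everything the command wrote to stdout
	local ok, how, code = pipe:close()  -- the command's status is only known after close()
	return output, ok, how, code
end

-- Demo command instead of gemget so the sketch runs anywhere with a POSIX shell
local out, ok, how, code = run('echo hello; exit 3')
print(out, ok, how, code)
```

Since io.popen() only reads the command's stdout, capturing gemget's stderr this way would need a `2>&1` appended to the command string.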

I ran it against Antenna for kicks:

time gemget gemini://warmedal.se/~antenna/ | test-gemini-links 2>&1 | tee test.gmi

That took 1 minute 7 seconds, checking 86 links, eight of which were considered "DEAD". But seven of those were for irrelevant relative URLs. The one that wasn't was a good link that simply took longer than two seconds to pursue.
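
The relative links that showed up as "DEAD" could be resolved against the page's own URL before being checked. A simplified sketch, not a full RFC 3986 resolver (no "../" or "./" handling); resolve is a hypothetical helper, and the file names in the usage lines are invented:

```lua
-- Simplified relative-link resolution against a base URL.
-- Assumption: the base has the form scheme://authority/path; dot segments,
-- queries and fragments on the base are not handled.
local function resolve(base, ref)
	if ref:match('^%a[%w+.%-]*:') then return ref end   -- already absolute
	local scheme, authority, path = base:match('^(%a[%w+.%-]*)://([^/]*)(.*)$')
	if not scheme then return nil end                   -- base itself is unusable
	if ref:sub(1, 2) == '//' then                       -- scheme-relative link
		return scheme .. ':' .. ref
	end
	if ref:sub(1, 1) == '/' then                        -- root-relative link
		return scheme .. '://' .. authority .. ref
	end
	local dir = path:match('^(.*/)') or '/'             -- base path up to its last "/"
	return scheme .. '://' .. authority .. dir .. ref
end

-- Hypothetical links, resolved against the Antenna URL used above
print(resolve('gemini://warmedal.se/~antenna/', 'log.gmi'))
print(resolve('gemini://warmedal.se/~antenna/', '/about.gmi'))
```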

I'm not sure I even need to use the "MIME hooks" feature given I can run the likes of the above, and then start Lagrange with the resulting file (test.gmi in this case) as a URL argument, and a little scripting could make that happen for me.

BTW, I love how Lagrange simply adds a tab for subsequent command line invocations if there's already an instance running!

☯️ oldernow

Nov 30 · 4 weeks ago

6 Later Comments ↓

☯️ oldernow · Nov 30 at 14:48:

Ooops... my previous comment somehow led to the initial posting having two titles... not sure why... I'm not going to try to edit it out because I can imagine Murphy's Law having a field day with my not completely understanding editing post titles/segments... :-)

☯️ oldernow · Nov 30 at 14:54:

And, of course, now I'm realizing not all links are going to be gemini:// links, and of course gemget fails on them, making all https:// links look bad... :-) It never takes long to re-remember why I eventually couldn't do software development anymore.... :-) So I guess I'll play with using cURL against https:// links to at least cover that case, which I imagine to be the most frequent non-gemini:// case....
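
One way to cover that case is to pick the checking command by URL scheme. A rough sketch, assuming curl is installed; check_command and shell_quote are hypothetical helpers, the gemget invocation is the one from the script earlier in this thread, and the curl flags used (--silent, --head, --fail, --connect-timeout, --output) are standard curl options:

```lua
-- Wrap a string in single quotes so the shell treats it as one literal word.
local function shell_quote(s)
	return "'" .. (s:gsub("'", "'\\''")) .. "'"
end

-- Return the shell command that tests a URL, or nil when the URL is
-- relative or uses a scheme this sketch does not know how to test.
local function check_command(url)
	local scheme = url:match('^(%a[%w+.%-]*):')
	scheme = scheme and scheme:lower()
	if scheme == 'gemini' then
		return 'gemget --connect-timeout 2 ' .. shell_quote(url) .. ' 1>/dev/null 2>/dev/null'
	elseif scheme == 'http' or scheme == 'https' then
		-- HEAD request only; --fail turns HTTP error responses into a non-zero exit
		return 'curl --silent --head --fail --connect-timeout 2 --output /dev/null ' .. shell_quote(url)
	end
	return nil
end

print(check_command('gemini://example.org/'))
print(check_command('https://example.org/'))
print(check_command('relative/page.gmi'))
```

A nil result means "not testable here", so the caller can print such lines unchanged rather than marking them DEAD. Some HTTP servers answer HEAD differently from GET, so a failed HEAD is only a hint that the link is bad.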

🚀 skyjake · Nov 30 at 15:17:

It is probably best to run this as a separate tool/script as you've noted. It is effectively a little crawler, after all.

An interesting related feature in Lagrange could be the creation of an offline archive by saving contents of a capsule and all of its linked pages up to a chosen depth. That would also determine if any links are bad.

When it comes to making too many requests, multiple ones spaced out a few seconds apart are already too much for the lowest-end hardware running servers in Geminispace.
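
One way for a link checker to stay within that kind of limit is to cap how many links per host it tests in a single run. A minimal sketch, assuming a budget of one check per host; MAX_PER_HOST and may_check are made-up names, and the host pattern does not handle bracketed IPv6 addresses:

```lua
-- Assumption: at most this many link checks per host in one run.
local MAX_PER_HOST = 1
local checks_per_host = {}

-- Return true if this URL's host still has budget left, counting the attempt.
-- Relative URLs (no scheme://host part) are never checked.
local function may_check(url)
	local host = url:match('^%a[%w+.%-]*://([^/:]+)')
	if not host then return false end
	host = host:lower()                 -- host names are case-insensitive
	checks_per_host[host] = (checks_per_host[host] or 0) + 1
	return checks_per_host[host] <= MAX_PER_HOST
end
```

Links for which may_check() returns false would be printed unchanged instead of being tested, trading completeness for fewer requests to any one server.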

🍵 michaelnordmeyer · Dec 01 at 07:37:

What happened to Gemini's paradigm "one page, one request?"

☯️ oldernow · Dec 01 at 16:29:

Now that I know the "two finger tap" trick to properly display a working context menu in Lagrange, exercising my "check links" code goes like this:

@skyjake that makes me wonder whether there's already (or might be in the future..) the means to customize the "context menu"? It would be nice to be able to add an action that calls a user script to process and return a modified version of the page being viewed - in place. Then the above list could be reduced to just the first bullet.

🚀 skyjake · Dec 01 at 16:44:

At the moment, there is no way for one to add actions to the context menu. It's a nice idea, though. I'll add it to my list of things to do.

Original Post

🌒 s/Lagrange

Lagrange plugins? — Lagrange plugins? (Apologies in advance for being too lazy to scan Lagrange documentation, and for likely using clunky terminology in what follows.) Does Lagrange support adding something akin to a plugin that Lagrange would pass the gemtext about to be displayed, but then *instead* render the gemtext modified/output/passed back by the plugin? I ask, because I played a bit with a simple "detect bad links" idea, which at the moment is this Lua script: [preformatted] In...

💬 oldernow · 8 comments · Nov 29 · 4 weeks ago