Hello Geminauts,

Earlier today I added proxy support to the master branch of gemget[1]. This
allows you to specify a host like 1.2.3.4, or example.com:3000, that will be
sent the Gemini request instead of the host specified in the URL. This kind
of proxying is explicitly supported by the Gemini spec; you can read more
about it in section 2.

I will also be adding proxy support to Amfora, so that if enabled, all
requests will go through the proxy. However, to my knowledge, no server
exists that can act as a general proxy. I'd be happy to hear about any, if
I've missed something.

I foresee that they could be used in the same way Squid[2] is for the web,
as a cache. I think this could work even better for Gemini than it does for
the web, as documents are more compact and much less dynamic. They could
also be used for privacy, as many web proxies are.

I have to wonder whether anyone would actually use this. What do you all
think of this workflow/feature? Is it potentially useful? I'm vaguely
interested in writing such a server, but I'd be happy if anyone else beats
me to the punch. Keep in mind that such a server should also cache and
serve TLS certs, so that they match the domain being requested.

1: https://github.com/makeworld-the-better-one/gemget/commit/23048d309ed7f75cab79bb5daecc4a60f3b1587f
2: http://www.squid-cache.org/

makeworld

P.S. See the FetchWithHost function in go-gemini for a code example of how
the proxying works.
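In essence, the client just dials the proxy's address while still sending
the full absolute URL in the request line. A rough, simplified sketch in Go
of that idea (not the actual go-gemini code, and with the TOFU certificate
checking a real client would do omitted):

    package main

    import (
        "crypto/tls"
        "fmt"
        "io"
        "os"
    )

    // fetchViaProxy connects to the proxy's address, but sends the
    // original absolute URL in the request line, per section 2 of the spec.
    func fetchViaProxy(proxy, rawURL string) error {
        conf := &tls.Config{
            InsecureSkipVerify: true, // a real client does TOFU checking here
            MinVersion:         tls.VersionTLS12,
        }
        conn, err := tls.Dial("tcp", proxy, conf) // e.g. "example.com:3000"
        if err != nil {
            return err
        }
        defer conn.Close()
        // A Gemini request is just the absolute URL followed by CRLF.
        if _, err := fmt.Fprintf(conn, "%s\r\n", rawURL); err != nil {
            return err
        }
        _, err = io.Copy(os.Stdout, conn) // response header line, then body
        return err
    }

    func main() {
        if err := fetchViaProxy("127.0.0.1:1965", "gemini://example.com/"); err != nil {
            fmt.Fprintln(os.Stderr, err)
        }
    }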
colecmac at protonmail.com writes:

> I have to wonder whether anyone would actually use this. What do you
> all think of this workflow/feature? Is it potentially useful?

I do think a caching proxy would be potentially useful, especially as few
clients do caching themselves. I could also see it being useful for offline
use (the client spiders through the local proxy while online, then reads
offline through the local proxy).

The tricky bit is, of course, cache expiration. We don't provide as much
useful information to the cache as HTTP does. You'll need to use heuristics
not only for how long to cache, but also to provide a way to force a
refresh, which isn't covered in the spec.

--
+-----------------------------------------------------------+
| Jason F. McBrayer                  jmcbray at carcosa.net  |
| A flower falls, even though we love it; and a weed grows,  |
| even though we do not love it.                  -- Dogen   |
+-----------------------------------------------------------+
On Sun Aug 9, 2020 at 7:55 PM CEST, Jason McBrayer wrote:

> colecmac at protonmail.com writes:
>
> > I have to wonder whether anyone would actually use this. What do you
> > all think of this workflow/feature? Is it potentially useful?
>
> I do think a caching proxy would be potentially useful, especially as
> few clients do caching themselves. I could also see it being useful for
> offline use (the client spiders through the local proxy while online,
> then reads offline through the local proxy).
>
> The tricky bit is, of course, cache expiration. We don't provide as much
> useful information to the cache as HTTP does. You'll need to use
> heuristics not only for how long to cache, but also to provide a way to
> force a refresh, which isn't covered in the spec.

If nothing else, I think there's a small, useful role for a very simple
caching proxy which just maintains an in-memory cache of everything
visited which is valid for, say, five minutes. The reason this is useful
is that navigating Gemini capsules often involves extensive use of the
"back button" (and, at least for me on AV-98, the "up" command, which gets
you from example.com/foo/bar/baz to example.com/foo/bar/ even in the
absence of an explicit link, and the "root" command, which gets you from
example.com/foo/bar/baz to example.com), which often entails loading the
same page repeatedly in a relatively short window of time. A very simple,
dumb local cache would cut down on transactions for this.

For people like me who often read something in Gemini/Gopherspace, then
want to reference it a few days later but cannot remember where they read
it, a proxy which maintained full-text search of everything visited in the
past month or so would be *super* handy, but I have no idea how to build
such a thing.

I'm kind of attracted to the idea of small, simple, do-one-thing-well
proxies which can be chained together like "filter" programs in a
pipeline... but I guess the TLS overhead would stack up quickly,
encouraging a kind of highly configurable "Swiss army knife" proxy
instead. Not as pretty, but potentially very useful...

Cheers,
Solderpunk
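P.S. For concreteness, the cache itself is barely more than a map with
timestamps. A rough, untested Go sketch (all the names are made up):

    package cache

    import (
        "sync"
        "time"
    )

    type entry struct {
        body    []byte
        fetched time.Time
    }

    // Cache is a dumb in-memory store: responses are keyed by URL and
    // expire after a fixed TTL (say, five minutes).
    type Cache struct {
        mu  sync.Mutex
        ttl time.Duration
        m   map[string]entry
    }

    func New(ttl time.Duration) *Cache {
        return &Cache{ttl: ttl, m: make(map[string]entry)}
    }

    // Get returns a cached response, evicting it if it has expired.
    func (c *Cache) Get(url string) ([]byte, bool) {
        c.mu.Lock()
        defer c.mu.Unlock()
        e, ok := c.m[url]
        if !ok || time.Since(e.fetched) > c.ttl {
            delete(c.m, url)
            return nil, false
        }
        return e.body, true
    }

    // Put stores a freshly fetched response.
    func (c *Cache) Put(url string, body []byte) {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.m[url] = entry{body: body, fetched: time.Now()}
    }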
"Solderpunk" <solderpunk at posteo.net> writes: > For people like me who often read something in Gemini/Gopherspace, > then want to reference a few days later but cannot remember where they > read it, a proxy which maintained full-text search of everything > visited in the past month or so would be *super* handy, but I have no > idea how to build such a thing. I wrote such a thing for HTTP 1.0 back around 2000. The performance was terrible, and it quickly became obsolete (I wasn't up to implementing HTTP 1.1 features like chunked transfer at the time), and the spread of SSL put a final nail in its coffin. But it was fun at the time. -- +-----------------------------------------------------------+ | Jason F. McBrayer jmcbray at carcosa.net | | A flower falls, even though we love it; and a weed grows, | | even though we do not love it. -- Dogen |
> If nothing else, I think there's a small, useful role for a very simple
> caching proxy which just maintains an in-memory cache of everything
> visited which is valid for, say, five minutes. The reason this is
> useful is that navigating Gemini capsules often involves extensive use
> of the "back button" (and, at least for me on AV-98, the "up" command,
> which gets you from example.com/foo/bar/baz to example.com/foo/bar/
> even in the absence of an explicit link, and the "root" command, which
> gets you from example.com/foo/bar/baz to example.com), which often
> entails loading the same page repeatedly in a relatively short window
> of time. A very simple, dumb local cache would cut down on transactions
> for this.

I think this is often best relegated to the client, but I see how it's
nice for the client not to have to worry about it, and to have this other
proxy instead. I was originally picturing something longer-lived, so that
it would still be useful alongside any caches the client already has, but
I'm not sure how much sense that makes.

> For people like me who often read something in Gemini/Gopherspace, then
> want to reference it a few days later but cannot remember where they
> read it, a proxy which maintained full-text search of everything
> visited in the past month or so would be *super* handy, but I have no
> idea how to build such a thing.

Me neither, but that definitely seems cool. I'm sure Natpen would have an
idea of how to do this :). It's also a nice way to keep history for
clients that don't do that (all of them?). The full-text search also
doubles as an archive of all the sites you visit.

makeworld
> For people like me who often read something in Gemini/Gopherspace,
> then want to reference it a few days later but cannot remember where
> they read it, a proxy which maintained full-text search of everything
> visited in the past month or so would be *super* handy, but I have no
> idea how to build such a thing.

I assume downloading every page into a local cache and running something
like https://blevesearch.com/ on top of it when a search request comes in
would work reasonably well. I assume this wouldn't require _that much_
space, as most Gemini pages (that I have encountered so far) are small,
and text is really compressible if the need comes up. Changes to the
cached pages could be stored as deltas/patches, so that might be
interesting for historical preservation too, assuming most pages don't
change that much, that often. And all this can happen without leaving
Gemini, like how GUS works.

> I'm kind of attracted to the idea of small, simple, do-one-thing-well
> proxies which can be chained together like "filter" programs in a
> pipeline... but I guess the TLS overhead would stack up quickly,

Since I am not exactly knowledgeable about TLS and other low-level
protocols (beyond knowing how to open a socket), how much would the TLS
overhead be if the proxy was hosted on the same machine as the client? I
assume modern CPUs can easily deal with the encryption work at reasonable
speeds, and the connection being local would probably get rid of the
majority of the network overhead.

Also, another idea to throw into the pile: a "master" proxy that accepts
TLS connections, but delegates everything else to small filter scripts.
That way we can get rid of the TLS and networking overhead of stacking
multiple proxies on top of each other, while keeping the flexibility and
simplicity of the pipeline approach. The one drawback I can think of is
that the scripts would no longer be regular proxies on their own, and if
someone wanted to use even just one of these scripts, they would need to
install the entire pipeline framework.

But at that point, why not build the pipelines into clients? Assuming
these scripts work via "stdin -> (script magic) -> stdout", even shell
clients could run them by piping their "gemini receive" command into
them. Long-running scripts could probably just be accessed via wrappers
with netcat, or something else I haven't thought of yet.

Anyway, these are all just some ideas that I am throwing out here. I
might even try building some of them if there's any interest, but I am
pretty sure there will be some issues I haven't thought of. Let me know
what you all think!

--
Have a nice (day|night|week(end)?)
~ Ecmel B. Canlıer ~
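P.S. A very rough illustration of the indexing side (untested; the page
struct is made up, and the calls follow bleve's own examples):

    package main

    import (
        "fmt"
        "log"

        "github.com/blevesearch/bleve"
    )

    // page is a made-up document shape; the proxy would index one of
    // these for every text/gemini response it relays.
    type page struct {
        URL  string
        Body string
    }

    func main() {
        mapping := bleve.NewIndexMapping()
        index, err := bleve.New("visited.bleve", mapping)
        if err != nil {
            log.Fatal(err)
        }
        err = index.Index("gemini://example.com/foo.gmi", page{
            URL:  "gemini://example.com/foo.gmi",
            Body: "Some interesting text I will forget I ever read.",
        })
        if err != nil {
            log.Fatal(err)
        }
        // Later: "where did I read that?"
        query := bleve.NewMatchQuery("interesting")
        result, err := index.Search(bleve.NewSearchRequest(query))
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(result)
    }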
On Mon Aug 10, 2020 at 11:31 PM CEST, colecmac at protonmail.com wrote:

> I think this is often best relegated to the client, but I see how it's
> nice for the client not to have to worry about it, and to have this
> other proxy instead. I was originally picturing something longer-lived,
> so that it would still be useful alongside any caches the client
> already has, but I'm not sure how much sense that makes.

Yeah, this wouldn't be too hard to do in the client. I guess I think of
it the same way I think of poor little rarely-used Agena: it isn't
necessarily hard to add Gopher support to a Gemini client, but also,
what's the point of a dozen different client authors doing the same thing
a dozen different times when it fits so naturally into the proxy concept?

But, sure, for a tiny in-memory, short-lived cache perhaps this argument
is not so strong, certainly less so than for something larger and
longer-lived. I do think that gets tricky, though; it's hard to know when
to invalidate the cache without servers having any way to signal it.

Cheers,
Solderpunk
On Tue Aug 11, 2020 at 12:14 AM CEST, Ecmel Berk Canlıer wrote:

> But at that point, why not build the pipelines into clients? Assuming
> these scripts work via "stdin -> (script magic) -> stdout", even shell
> clients could run them by piping their "gemini receive" command into
> them. Long-running scripts could probably just be accessed via wrappers
> with netcat, or something else I haven't thought of yet.

This is a nice idea. AV-98 almost supports it already: you can pipe the
body of Gemini responses to arbitrary pipelines based on their MIME type,
which allows for reformatting text and things like that, but nothing in
the pipeline gets access to the URL where the resource came from, which
would be required to make a lot of useful applications work. But it
probably wouldn't be hard at all to pass that information along via
environment variables. Hmm...

Cheers,
Solderpunk
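P.S. Something like this rough, untested Go sketch is what I have in mind
(the GEMINI_URL variable name is made up, not anything AV-98 defines):

    package main

    import (
        "bytes"
        "fmt"
        "os"
        "os/exec"
    )

    // runFilter pipes a response body through a user-defined shell
    // command, passing the source URL in an environment variable so the
    // filter can act on it.
    func runFilter(filter, url string, body []byte) ([]byte, error) {
        cmd := exec.Command("/bin/sh", "-c", filter)
        cmd.Env = append(os.Environ(), "GEMINI_URL="+url)
        cmd.Stdin = bytes.NewReader(body)
        var out bytes.Buffer
        cmd.Stdout = &out
        cmd.Stderr = os.Stderr
        if err := cmd.Run(); err != nil {
            return nil, err
        }
        return out.Bytes(), nil
    }

    func main() {
        // Toy usage: upper-case a page, with its URL visible to the filter.
        out, err := runFilter("tr a-z A-Z", "gemini://example.com/", []byte("hello\n"))
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            return
        }
        os.Stdout.Write(out)
    }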
> Keep in mind that such a server should also cache and serve TLS certs,
> so that they match the domain being requested.

I have changed this, as I don't think it makes sense anymore. Currently,
with go-gemini v0.8.1, the proxy does not need to have a TLS cert that
matches the proxied resource. It just needs to serve a cert that matches
its own hostname.

https://github.com/makeworld-the-better-one/go-gemini/commit/3d39237a5e912067bf436cf032bf2cc9f42e47ed

Thanks,
makeworld
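P.S. On the client side, this just means verifying the proxy's certificate
against the proxy's own hostname, ignoring the hostname in the requested
URL. A minimal sketch (it uses standard CA verification for brevity, where
a real Gemini client would do TOFU instead):

    package main

    import (
        "crypto/tls"
        "log"
    )

    func main() {
        proxyHost := "proxy.example.com" // the host we dial, not the URL's host

        conf := &tls.Config{
            // Check the cert against the proxy's own name; the hostname
            // in the requested URL plays no part in verification.
            ServerName: proxyHost,
        }
        conn, err := tls.Dial("tcp", proxyHost+":1965", conf)
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()
        // ... send the request for the proxied URL over conn as usual ...
    }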
On Tue, 2020-08-11 at 20:18 +0200, Solderpunk wrote:

> AV-98 almost supports it already: you can pipe the body of Gemini
> responses to arbitrary pipelines based on their MIME type, which allows
> for reformatting text and things like that, but nothing in the pipeline
> gets access to the URL where the resource came from, which would be
> required to make a lot of useful applications work. But it probably
> wouldn't be hard at all to pass that information along via environment
> variables. Hmm...

I wanted that in order to edit pages from within AV-98 as well. I'd like
to define a command that takes the current text and the current URL,
calls $EDITOR, and reposts the result somewhere (based on the current
URL).

Cheers
Alex
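P.S. The editing half of that could look something like the rough sketch
below (in Go for illustration, though AV-98 itself is Python; the function
name is made up, and the reposting step is left out since it depends on
the current URL and the server):

    package main

    import (
        "fmt"
        "io/ioutil"
        "os"
        "os/exec"
    )

    // editBody writes the current page to a temp file, opens $EDITOR on
    // it, and returns the edited text; reposting it (based on the
    // current URL) is left to the client.
    func editBody(body []byte) ([]byte, error) {
        f, err := ioutil.TempFile("", "edit-*.gmi")
        if err != nil {
            return nil, err
        }
        defer os.Remove(f.Name())
        if _, err := f.Write(body); err != nil {
            return nil, err
        }
        f.Close()
        editor := os.Getenv("EDITOR")
        if editor == "" {
            editor = "vi"
        }
        cmd := exec.Command(editor, f.Name())
        cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
        if err := cmd.Run(); err != nil {
            return nil, err
        }
        return ioutil.ReadFile(f.Name())
    }

    func main() {
        edited, err := editBody([]byte("# My page\n\nOld text.\n"))
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            return
        }
        os.Stdout.Write(edited)
    }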