πŸ’Ύ Archived View for gemi.dev β€Ί gemini-mailing-list β€Ί 000317.gmi captured on 2023-11-04 at 12:40:55. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Proxying

colecmac@protonmail.com <colecmac (a) protonmail.com>

Hello Geminauts,

Earlier today I added proxy support to the master branch of gemget[1].
This allows you to specify a host like 1.2.3.4, or example.com:3000,
that will be sent the Gemini request instead of the host specified in
the URL.

This kind of proxying is explicitly supported by the Gemini spec; you can read
more about it in section 2.

I will also be adding proxy support to Amfora, so that if enabled,
all requests will go through the proxy.
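As a rough sketch of how this works at the protocol level (see go-gemini's
FetchWithHost for the real implementation): the client opens its TLS connection
to the proxy instead of the origin, but still sends the full URL of the resource
on the request line, so the proxy knows what to fetch. The hostnames below are
placeholders, and the skipped certificate verification reflects Gemini's usual
TOFU model rather than anything gemget specifically does:

```go
package main

import (
	"bufio"
	"crypto/tls"
	"fmt"
	"net"
	"strings"
	"time"
)

// requestLine builds the single-line Gemini request: the full URL plus CRLF.
// The proxy learns the real target from this URL, not from the connection.
func requestLine(rawURL string) string {
	return rawURL + "\r\n"
}

// fetchViaProxy sends the Gemini request for rawURL to proxyAddr (host:port)
// instead of the host named in the URL, and returns the response header line.
func fetchViaProxy(proxyAddr, rawURL string) (string, error) {
	conn, err := tls.DialWithDialer(
		&net.Dialer{Timeout: 5 * time.Second}, "tcp", proxyAddr,
		&tls.Config{
			// Gemini clients commonly use TOFU rather than CA
			// validation, so verification is skipped in this sketch.
			InsecureSkipVerify: true,
		})
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(requestLine(rawURL))); err != nil {
		return "", err
	}
	header, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimRight(header, "\r\n"), nil
}

func main() {
	// example.com:3000 is a placeholder proxy; gemini://example.org/ the target.
	header, err := fetchViaProxy("example.com:3000", "gemini://example.org/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("response header:", header)
}
```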

However, to my knowledge, no such server exists that can act as a general proxy.
I'd be happy to hear about any, if I've missed something. I foresee that they
could be used in the same way Squid[2] is for the web, as a cache. I think this
could work even better for Gemini than it does for the web, as documents are more
compact and much less dynamic.

They could also be used for privacy, as many web proxies are.

I have to wonder whether anyone would actually use this. What do you all think of
this workflow/feature? Is it potentially useful? I'm vaguely interested in writing
such a server, but I'd be happy if anyone else beats me to the punch. Keep in mind
that such a server should also cache and serve TLS certs, so that they match the
domain being requested.


1: https://github.com/makeworld-the-better-one/gemget/commit/23048d309ed7f75cab79bb5daecc4a60f3b1587f
2: http://www.squid-cache.org/


makeworld

P.S. See the FetchWithHost function in go-gemini for a code example of how
the proxying works.

Link to individual message.

Jason McBrayer <jmcbray (a) carcosa.net>

colecmac at protonmail.com writes:

> I have to wonder whether anyone would actually use this. What do you
> all think of this workflow/feature? Is it potentially useful?

I do think a caching proxy would be potentially useful, especially as
few clients do caching themselves. I could also see it useful for
offline use (client spiders through local proxy while online, client reads
offline through local proxy).

The tricky bit is, of course, cache expiration. We don't provide as much
useful information to the cache as HTTP does. You'll need to use
heuristics not only for how long to cache, but also to provide a way to
force a refresh, which isn't covered in the spec.


-- 
+-----------------------------------------------------------+
| Jason F. McBrayer                    jmcbray at carcosa.net  |
| A flower falls, even though we love it; and a weed grows, |
| even though we do not love it.            -- Dogen        |


Solderpunk <solderpunk (a) posteo.net>

On Sun Aug 9, 2020 at 7:55 PM CEST, Jason McBrayer wrote:
> colecmac at protonmail.com writes:
>
> > I have to wonder whether anyone would actually use this. What do you
> > all think of this workflow/feature? Is it potentially useful?
>
> I do think a caching proxy would be potentially useful, especially as
> few clients do caching themselves. I could also see it useful for
> offline use (client spiders through local proxy while online, client
> reads offline through local proxy).
>
> The tricky bit is, of course, cache expiration. We don't provide as much
> useful information to the cache as http does. You'll need to use
> heuristics not only for how long to cache, but to provide a way to force
> a refresh, which isn't covered in the spec.

If nothing else, I think there's a small, useful role for a very simple
caching proxy which just maintains an in-memory cache of everything
visited which is valid for, say, five minutes.  The reason this is
useful is that navigating Gemini capsules often involves extensive use
of the "back button" (and, at least for me on AV-98, the "up" command,
which gets you from example.com/foo/bar/baz to example.com/foo/bar/
even in the absence of an explicit link, and the "root" command, which gets
you from example.com/foo/bar/baz to example.com), which often entails
loading the same page repeatedly in a relatively short window of time.
A very simple dumb local cache would cut down on transactions for this.

For people like me who often read something in Gemini/Gopherspace, then
want to reference a few days later but cannot remember where they read
it, a proxy which maintained full-text search of everything visited in
the past month or so would be *super* handy, but I have no idea how to
build such a thing.

I'm kind of attracted to the idea of small, simple, do-one-thing-well
proxies which can be chained together like "filter" programs in a
pipeline...but I guess the TLS overhead would stack up quickly,
encouraging a kind of highly configurable "Swiss army knife" proxy
instead.  Not as pretty, but potentially very useful...

Cheers,
Solderpunk


Jason McBrayer <jmcbray (a) carcosa.net>

"Solderpunk" <solderpunk at posteo.net> writes:

> For people like me who often read something in Gemini/Gopherspace,
> then want to reference a few days later but cannot remember where they
> read it, a proxy which maintained full-text search of everything
> visited in the past month or so would be *super* handy, but I have no
> idea how to build such a thing.

I wrote such a thing for HTTP 1.0 back around 2000. The performance was
terrible, it quickly became obsolete (I wasn't up to implementing
HTTP 1.1 features like chunked transfer at the time), and the spread of
SSL put the final nail in its coffin. But it was fun at the time.

-- 
+-----------------------------------------------------------+
| Jason F. McBrayer                    jmcbray at carcosa.net  |
| A flower falls, even though we love it; and a weed grows, |
| even though we do not love it.            -- Dogen        |


colecmac@protonmail.com <colecmac (a) protonmail.com>

> If nothing else, I think there's a small, useful role for a very simple
> caching proxy which just maintains an in-memory cache of everything
> visited which is valid for, say, five minutes. The reason this is
> useful is that navigating Gemini capsules often involves extensive use
> of the "back button" (and, at least for me on AV-98, the "up" command,
> which gets you from example.com/foo/bar/baz to example.com/foo/bar/
> even in the absence of an explicit link, and the "root" command, which gets
> you from example.com/foo/bar/baz to example.com), which often entails
> loading the same page repeatedly in a relatively short window of time.
> A very simple dumb local cache would cut down on transactions for this.

I think this is often best relegated to the client, but I see how it's nice
for the client not to have to worry about it, and to have this other proxy instead.
I was originally picturing something longer lived, so that it would still be
useful along with any caches the client already has, but I'm not sure how much
sense that makes.

> For people like me who often read something in Gemini/Gopherspace, then
> want to reference a few days later but cannot remember where they read
> it, a proxy which maintained full-text search of everything visited in
> the past month or so would be super handy, but I have no idea how to
> build such a thing.

Me neither, but that definitely seems cool. I'm sure Natpen would have an
idea on how to do this :). It's also a nice way to keep history for clients
that don't do that (all of them?). The full text search also doubles as an
archive of all the sites you visit.


makeworld


Ecmel Berk CanlΔ±er <me (a) ecmelberk.com>

> For people like me who often read something in Gemini/Gopherspace,
> then want to reference a few days later but cannot remember where
> they read it, a proxy which maintained full-text search of everything
> visited in the past month or so would be *super* handy, but I have no
> idea how to build such a thing.

I assume downloading every page into a local cache and running
something like https://blevesearch.com/ on top of it when a search
request comes in would work reasonably well.

I assume this wouldn't require _that much_ space, as most Gemini pages
(that I have encountered so far) are small, and text is really
compressible if the need comes up.

Changes to the cached pages could be stored with deltas/patches, so
that might be interesting for historical preservation too, assuming
most pages don't change that much, that often.

And all this can happen without leaving Gemini, like how GUS works.
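bleve is a full Go search library with its own index formats and query types;
as a much cruder illustration of the same idea, a proxy could maintain a naive
inverted index over the pages it caches, mapping words to URLs. The sketch
below uses only the standard library and is not how bleve works internally:

```go
package main

import (
	"fmt"
	"strings"
)

// Index maps lowercase words to the URLs of cached pages containing them.
type Index struct {
	words map[string]map[string]bool
}

func NewIndex() *Index {
	return &Index{words: make(map[string]map[string]bool)}
}

// Add tokenizes a cached page very crudely and records each word.
func (ix *Index) Add(url, text string) {
	for _, w := range strings.Fields(strings.ToLower(text)) {
		w = strings.Trim(w, ".,;:!?\"'()")
		if w == "" {
			continue
		}
		if ix.words[w] == nil {
			ix.words[w] = make(map[string]bool)
		}
		ix.words[w][url] = true
	}
}

// Search returns the URLs of pages containing every query word.
func (ix *Index) Search(query string) []string {
	var result []string
	terms := strings.Fields(strings.ToLower(query))
	if len(terms) == 0 {
		return result
	}
	for url := range ix.words[terms[0]] {
		ok := true
		for _, t := range terms[1:] {
			if !ix.words[t][url] {
				ok = false
				break
			}
		}
		if ok {
			result = append(result, url)
		}
	}
	return result
}

func main() {
	ix := NewIndex()
	ix.Add("gemini://example.com/log.gmi", "thoughts on caching proxies")
	ix.Add("gemini://example.org/notes.gmi", "notes on TLS overhead")
	fmt.Println(ix.Search("caching proxies"))
}
```

A real deployment would want stemming, ranking, and on-disk persistence, which
is exactly what bleve provides.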

> I'm kind of attracted to the idea of small, simple, do-one-thing-well
> proxies which can be chained together like "filter" programs in a
> pipeline...but I guess the TLS overhead would stack up quickly,

Since I am not exactly knowledgeable on TLS and other low-level
protocols (except knowing how to open a socket), how much would the TLS
overhead be if the proxy was hosted on the same machine as the client?
I assume modern CPUs can easily deal with the encryption work at
reasonable speeds, and the connection being local would probably get rid
of the majority of the network overhead.

Also another idea to throw into the pile: a "master" proxy that accepts
TLS connections, but delegates everything else into small filter
scripts. That way we can get rid of the TLS and networking overhead of
stacking multiple proxies on top of each other, while keeping the
flexibility and simplicity of the pipeline approach. The one drawback I
can think of is the scripts would no longer be regular proxies on their
own, and if someone wanted to use even just one of these scripts, they
would need to install the entire pipeline framework.

But at that point, why not build the pipelines into clients? Assuming
these scripts work via "stdin -> (script magic) -> stdout", even shell
clients could run them by piping their "gemini receive" command into
them. Long-running scripts could probably just be accessed via wrappers
with netcat, or something else I haven't thought of yet.
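A filter script in this model is just a program that rewrites a response body
between stdin and stdout. As a hedged example of the shape such a script might
take, here is a trivial Go filter that annotates gemtext heading lines with
their depth; the transformation itself is arbitrary, chosen only to show the
stdin -> (script magic) -> stdout structure:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// transform is the "script magic": here it just annotates heading lines.
func transform(line string) string {
	if strings.HasPrefix(line, "#") {
		depth := len(line) - len(strings.TrimLeft(line, "#"))
		return fmt.Sprintf("[h%d] %s", depth, line)
	}
	return line
}

func main() {
	// stdin -> transform -> stdout, one gemtext line at a time.
	sc := bufio.NewScanner(os.Stdin)
	out := bufio.NewWriter(os.Stdout)
	defer out.Flush()
	for sc.Scan() {
		fmt.Fprintln(out, transform(sc.Text()))
	}
}
```

Because it speaks only stdin/stdout, the same binary works equally well inside
a "master" proxy's pipeline or piped directly from a shell client.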

Anyway, these are all just some ideas that I am throwing here. I might
even try building some of them if there's any interest, but I am pretty
sure there will be some issues I haven't thought of. Let me know what
you all think!

-- 
Have a nice (day|night|week(end)?)
~ Ecmel B. Canlıer ~


Solderpunk <solderpunk (a) posteo.net>

On Mon Aug 10, 2020 at 11:31 PM CEST, colecmac at protonmail.com wrote:

> I think this is often best relegated to the client, but I see how it's
> nice for the client not to have to worry about it, and to have this
> other proxy instead. I was originally picturing something longer lived,
> so that it would still be useful along with any caches the client
> already has, but I'm not sure how much sense that makes.

Yeah, this wouldn't be too hard to do in the client.  I guess I think of
it the same way I think of poor little rarely-used Agena: it isn't
necessarily hard to add Gopher support to a Gemini client, but
what's the point of a dozen different client authors doing the same
thing a dozen different times when it fits so naturally into the proxy
concept?

But, sure, for a tiny in-memory short-lived cache perhaps this argument
is not so strong, certainly less so than for something larger and
longer lived.  I do think that gets tricky, though: it's hard to know
when to invalidate the cache without servers having any way to signal
it.

Cheers,
Solderpunk


Solderpunk <solderpunk (a) posteo.net>

On Tue Aug 11, 2020 at 12:14 AM CEST, Ecmel Berk Canlıer wrote:

> But at that point, why not build the pipelines into clients? Assuming
> these scripts work via "stdin -> (script magic) -> stdout", even shell
> clients could run them by piping their "gemini receive" command into
> them. Long-running scripts could probably just be accessed via wrappers
> with netcat, or something else I haven't thought of yet.

This is a nice idea.  AV-98 almost supports it already, you can pipe the
body of Gemini responses to arbitrary pipelines based on their MIME
type, which allows for reformatting text and things like that, but
nothing in the pipeline gets access to the URL where the resource came
from, which would be required to make a lot of useful applications work.
But it probably wouldn't be hard at all to pass that information along
via environment variables.  Hmm...
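Passing the URL along via environment variables could look like the sketch
below. The variable name GEMINI_URL is an invention for illustration only
(AV-98 defines no such variable, and AV-98 itself is Python; Go is used here
for consistency with the ecosystem's other tools):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// runFilter pipes body through a shell command, exposing the resource's
// URL in the child's environment so the filter can make use of it.
func runFilter(command, url, body string) (string, error) {
	cmd := exec.Command("/bin/sh", "-c", command)
	cmd.Env = append(os.Environ(), "GEMINI_URL="+url)
	cmd.Stdin = strings.NewReader(body)
	out, err := cmd.Output()
	return string(out), err
}

func main() {
	// The filter here just prepends a link back to the page's own URL.
	out, err := runFilter(
		`printf '=> %s source\n' "$GEMINI_URL"; cat`,
		"gemini://example.com/post.gmi",
		"# A post\n",
	)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Print(out)
}
```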

Cheers,
Solderpunk


colecmac@protonmail.com <colecmac (a) protonmail.com>

> Keep in mind that such a server should also cache and serve TLS certs, so
> that they match the domain being requested.

I have changed this, as I don't think it makes sense anymore. Currently with
go-gemini v0.8.1, the proxy does not need to have a TLS cert that matches the
proxied resource. It just needs to serve a cert that matches its own hostname.

https://github.com/makeworld-the-better-one/go-gemini/commit/3d39237a5e912067bf436cf032bf2cc9f42e47ed

Thanks,
makeworld


Alex Schroeder <alex (a) gnu.org>

On Tue, 2020-08-11 at 20:18 +0200, Solderpunk wrote:
> AV-98 almost supports it already, you can pipe the
> body of Gemini responses to arbitrary pipelines based on their MIME
> type, which allows for reformatting text and things like that, but
> nothing in the pipeline gets access to the URL where the resource came
> from, which would be required to make a lot of useful applications work.
> But it probably wouldn't be hard at all to pass that information along
> via environment variables.  Hmm...

I wanted that to edit pages from within AV-98 as well. I'd like to
define a command that takes the current text, and the current URL,
calls $EDITOR, and reposts it somewhere (based on the current URL).

Cheers
Alex

