
Heads up about a Gemini client @ 198.12.83.123

1. Sean Conner (sean (a) conman.org)


  It's not threatening my server or anything, but whoever is responsible
for the client at 198.12.83.123, your client is currently stuck in the
Redirection From Hell test and has been for some time.  From the length of
time, it appears to be running autonomously: perhaps a leftover thread, an
autonomous client that doesn't read robots.txt, or one that didn't follow
the spec carefully enough.

  Anyway, just a heads up.

  -spc

2. Sean Conner (sean (a) conman.org)

It was thus said that the Great Sean Conner once stated:
> 
>   It's not threatening my server or anything, but whoever is responsible
> for the client at 198.12.83.123, your client is currently stuck in the
> Redirection From Hell test and has been for some time.  From the length of
> time, it appears to be running autonomously: perhaps a leftover thread, an
> autonomous client that doesn't read robots.txt, or one that didn't follow
> the spec carefully enough.
> 
>   Anyway, just a heads up.
> 
>   -spc

  Sorry, this was sent to an additional address by mistake, and said other
address has no relation to what the client is doing.

  -spc

3. colecmac (a) protonmail.com (colecmac (a) protonmail.com)

(Resent to the list, not just spc)

> 198.12.83.123

$ dig -x 198.12.83.123 +short
phlox.titmouse.org.

https://phlox.titmouse.org/about.html mentions a Discordian by
the name of "Benedict T. Eyen, the T stands for Teeth." Unfortunately
there's no email address, so we can't contact him. Curiously, there
is a Gemini server running on that domain, but accessing it just gives
me Proxy Request Refused.

makeworld

4. Sean Conner (sean (a) conman.org)

It was thus said that the Great Sean Conner once stated:
> 
>   It's not threatening my server or anything, but whoever is responsible
> for the client at 198.12.83.123, your client is currently stuck in the
> Redirection From Hell test and has been for some time.  From the length of
> time, it appears to be running autonomously: perhaps a leftover thread, an
> autonomous client that doesn't read robots.txt, or one that didn't follow
> the spec carefully enough.
> 
>   Anyway, just a heads up.
> 
>   -spc

  So the client in question was most likely a web proxy.  I'm not sure what
site, nor the software used, but it did respond to a Gemini request with
"53 Proxy Request Refused" so there *is* a Gemini server there.  And the
fact that it made 137,060 requests before I shut down my own server tells
me it was an autonomous agent that no one was watching.  Usually, I may see
a client hit 20 or 30 times before it stops.  Not this one.

  Now granted, my server is a bit unique in that I have tests set up
specifically for clients to test against, and several of them involve
infinite redirects.  And yes, that was 137,060 *unique* requests.

  So first up, Solderpunk, if you could please add a redirection follow
limit to the specification and make it mandatory.  You can specify some
two, heck, even three digit number to follow, but please, *please*, add it
to the specification and *not* just the best practices document to make
programmers aware of the issue.  It seems like it's too easy to overlook
this potential trap (I see it often enough).
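
  A minimal sketch of such a limit (assuming a hypothetical fetch_once()
helper that performs a single Gemini request and returns the status, meta
line, and body) might look like:

```
# Sketch only: a redirect-following loop with a hard cap, assuming a
# hypothetical fetch_once(url) that performs one Gemini request and
# returns (status, meta, body).
MAX_REDIRECTS = 5          # any small, fixed limit will do

def fetch(url, fetch_once):
    seen = set()
    for _ in range(MAX_REDIRECTS + 1):
        if url in seen:
            raise RuntimeError("redirect loop at " + url)
        seen.add(url)
        status, meta, body = fetch_once(url)
        if status.startswith("3"):     # 30/31: meta holds the new URL
            url = meta
            continue
        return status, meta, body
    raise RuntimeError("too many redirects")
```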

  Second, had the proxy in question fetched robots.txt, I had this area
specifically marked out:

User-agent: *
Disallow: /test/redirehell 

  I have that for a reason, and had the autonomous client in question read
it, this wouldn't have happened in the first place.  Even if you disagree
with this, it may be difficult to stop an autonomous agent once the user of
said web proxy has dropped the web connection.  I don't know, I haven't
written a web proxy, and this is one more thing to keep in mind when writing
one.  I think it would be easier to follow robots.txt.
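
  As a sketch of how little that takes (gemini_get() here is a hypothetical
helper that fetches a URL and returns the response body as text):

```
# Sketch only: honour a capsule's robots.txt before autonomous fetches,
# assuming a hypothetical gemini_get(url) that returns the body as text.
from urllib import robotparser
from urllib.parse import urlsplit

def allowed(url, gemini_get, agent="*"):
    host = urlsplit(url).netloc
    try:
        robots = gemini_get("gemini://" + host + "/robots.txt")
    except Exception:
        return True                 # no robots.txt: nothing is disallowed
    rp = robotparser.RobotFileParser()
    rp.parse(robots.splitlines())
    return rp.can_fetch(agent, url)
```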

  -spc (To the person who called me a dick for blocking a web proxy---yes,
	there *are* reasons to block them)

5. Robert "khuxkm" Miles (khuxkm (a) tilde.team)

November 29, 2020 9:25 PM, "Sean Conner" <sean at conman.org> wrote:

> It was thus said that the Great Sean Conner once stated:
> 
>> It's not threatening my server or anything, but whoever is responsible
>> for the client at 198.12.83.123, your client is currently stuck in the
>> Redirection From Hell test and has been for some time. From the length of
>> time, it appears to be running autonomously: perhaps a leftover thread, an
>> autonomous client that doesn't read robots.txt, or one that didn't follow
>> the spec carefully enough.
>> 
>> Anyway, just a heads up.
>> 
>> -spc
> 
> So the client in question was most likely a web proxy. I'm not sure what
> site, nor the software used, but it did respond to a Gemini request with
> "53 Proxy Request Refused" so there *is* a Gemini server there. And the
> fact that it made 137,060 requests before I shut down my own server tells
> me it was an autonomous agent that no one was watching. Usually, I may see
> a client hit 20 or 30 times before it stops. Not this one.
> 
> Now granted, my server is a bit unique in that I have tests set up
> specifically for clients to test against, and several of them involve
> infinite redirects. And yes, that was 137,060 *unique* requests.
> 
> So first up, Solderpunk, if you could please add a redirection follow
> limit to the specification and make it mandatory. You can specify some
> two, heck, even three digit number to follow, but please, *please*, add it
> to the specification and *not* just the best practices document to make
> programmers aware of the issue. It seems like it's too easy to overlook
> this potential trap (I see it often enough).
> 
> Second, had the proxy in question fetched robots.txt, I had this area
> specifically marked out:
> 
> User-agent: *
> Disallow: /test/redirehell
> 
> I have that for a reason, and had the autonomous client in question read
> it, this wouldn't have happened in the first place. Even if you disagree
> with this, it may be difficult to stop an autonomous agent once the user of
> said web proxy has dropped the web connection. I don't know, I haven't
> written a web proxy, and this is one more thing to keep in mind when writing
> one. I think it would be easier to follow robots.txt.
> 
> -spc (To the person who called me a dick for blocking a web proxy---yes,
> there *are* reasons to block them)

I recently wrote a gemini to web proxy as a simple side-project to see how 
easy it would be to create, and one thing I implemented that I feel should 
be a standard for web proxies is not handling redirects internally. If you 
tell my gemini proxy to request a page that offers a redirect (say, the 
next page link for LEO), it will send you back a small web page saying 
"hey, the site at this URL wants to send you to this other URL, do you 
want to follow that redirect or nah?" (not exact wording but you get my 
drift). That is, if you attempt to access the Redirection from Hell test 
using my proxy, each and every redirect would be a "confirm redirect" page 
served to the user. After about 20 pages, you'd think the user would catch 
on. That being said, my gemini proxy is not linked anywhere on my website
(and if it were in a place I would link publicly, I would use robots.txt
to prevent web crawlers from accessing it), so perhaps I'm not the target
of this message.
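
Roughly, the idea is something like this sketch (not the proxy's actual
code; the "/proxy?url=" route name is made up for the example):

```
# Sketch only: when a Gemini response is a 3x redirect, hand the decision
# back to the browser instead of following it.  The "/proxy?url=" route
# name is made up for this example.
from html import escape
from urllib.parse import quote

def confirm_redirect_page(current_url, target_url):
    return (
        "<p><code>" + escape(current_url) + "</code> wants to send you to "
        "<code>" + escape(target_url) + "</code>.</p>"
        '<p><a href="/proxy?url=' + quote(target_url, safe="") + '">'
        "Follow the redirect</a> or use your back button.</p>"
    )
```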

I still maintain that a proxy is a direct agent of a user, and not an 
automated client. Proxy authors should use robots.txt on the web side to 
block crawlers from accessing the proxy, but proxies shouldn't have to follow robots.txt.

It's actually easier to just write your web proxy in such a way that this 
doesn't happen to you.

Just my two cents,
Robert "khuxkm" Miles

6. Michael Lazar (lazar.michael22 (a) gmail.com)

On Sun, Nov 29, 2020 at 4:15 PM <colecmac at protonmail.com> wrote:
>
> (Resent to the list, not just spc)
>
> > 198.12.83.123
>
> ? dig -x 198.12.83.123 +short
> phlox.titmouse.org.
>
> https://phlox.titmouse.org/about.html mentions a Discordian by
> the name of "Benedict T. Eyen, the T stands for Teeth." Unfortunately
> there's no email address, so we can't contact him. Curiously, there
> is a Gemini server running on that domain, but accessing it just gives
> me Proxy Request Refused.
>
> makeworld

I'm not trying to call anyone out, but since this situation has come up on the
mailing list a few times, it's helpful to look at the TLS cert.

```
openssl s_client -connect 198.12.83.123:1965
CONNECTED(00000003)
depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
verify return:1
depth=0 CN = lignumvitae.org
verify return:1
---
Certificate chain
 0 s:/CN=lignumvitae.org
   i:/C=US/O=Let's Encrypt/CN=Let's Encrypt Authority X3
 1 s:/C=US/O=Let's Encrypt/CN=Let's Encrypt Authority X3
   i:/O=Digital Signature Trust Co./CN=DST Root CA X3
```

gemini://lignumvitae.org/ returns the same proxy error, but they're using a
Let's Encrypt cert, so presumably they have an HTTPS server running too.

https://lignumvitae.org/ redirects to https://gj.libraryoferis.org/

From there, I recognized library of eris as one of the early (and totally
awesome) gemini servers, so I tried

gemini://libraryoferis.org/

There's an email address listed on that capsule.

- Michael

7. Sean Conner (sean (a) conman.org)

It was thus said that the Great Michael Lazar once stated:
> On Sun, Nov 29, 2020 at 4:15 PM <colecmac at protonmail.com> wrote:
> >
> > (Resent to the list, not just spc)
> >
> > > 198.12.83.123
> >
> > ? dig -x 198.12.83.123 +short
> > phlox.titmouse.org.
> >
> > https://phlox.titmouse.org/about.html mentions a Discordian by
> > the name of "Benedict T. Eyen, the T stands for Teeth." Unfortunately
> > there's no email address, so we can't contact him. Curiously, there
> > is a Gemini server running on that domain, but accessing it just gives
> > me Proxy Request Refused.
> >
> > makeworld
> 
> I'm not trying to call anyone out, but since this situation has come up on the
> mailing list a few times it's helpful to look at the TLS cert.

  It never occurred to me to look at the certificate [1].

> There's an email address listed on that capsule.

  Thanks for the information.

  -spc

[1]	I had been away for hours at that point, and came back to the web
	proxy *still* going, and a bunch of ssh bots attempting to log onto
	the server, and my automatic blocking system not fully working [2],
	so I was a bit preoccupied at the time.

[2]	Checking 0.0.0.0 and 0.0.0.0/0 are two entirely different things.
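
	To illustrate the difference (using Python's ipaddress module purely
	as a demonstration):

```
import ipaddress
# 0.0.0.0 by itself names one host; 0.0.0.0/0 matches every IPv4 address.
ipaddress.ip_address("198.12.83.123") in ipaddress.ip_network("0.0.0.0/32")  # False
ipaddress.ip_address("198.12.83.123") in ipaddress.ip_network("0.0.0.0/0")   # True
```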

8. Sean Conner (sean (a) conman.org)

It was thus said that the Great Robert khuxkm Miles once stated:
> November 29, 2020 9:25 PM, "Sean Conner" <sean at conman.org> wrote:
> 
> > It was thus said that the Great Sean Conner once stated:
> > 
> >> It's not threatening my server or anything, but whoever is responsible
> >> for the client at 198.12.83.123, your client is currently stuck in the
> >> Redirection From Hell test and has been for some time. From the length of
> >> time, it appears to be running autonomously: perhaps a leftover thread, an
> >> autonomous client that doesn't read robots.txt, or one that didn't follow
> >> the spec carefully enough.
> >> 
> >> Anyway, just a heads up.
> >> 
> >> -spc
> > 
> > So the client in question was most likely a web proxy. I'm not sure what
> > site, nor the software used, but it did respond to a Gemini request with
> > "53 Proxy Request Refused" so there *is* a Gemini server there. And the
> > fact that it made 137,060 requests before I shut down my own server tells
> > me it was an autonomous agent that no one was watching. Usually, I may see
> > a client hit 20 or 30 times before it stops. Not this one.
> > 
> > Now granted, my server is a bit unique in that I have tests set up
> > specifically for clients to test against, and several of them involve
> > infinite redirects. And yes, that was 137,060 *unique* requests.
> > 
> > So first up, Solderpunk, if you could please add a redirection follow
> > limit to the specification and make it mandatory. You can specify some
> > two, heck, even three digit number to follow, but please, *please*, add it
> > to the specification and *not* just the best practices document to make
> > programmers aware of the issue. It seems like it's too easy to overlook
> > this potential trap (I see it often enough).
> > 
> > Second, had the proxy in question fetched robots.txt, I had this area
> > specifically marked out:
> > 
> > User-agent: *
> > Disallow: /test/redirehell
> > 
> > I have that for a reason, and had the autonomous client in question read
> > it, this wouldn't have happened in the first place. Even if you disagree
> > with this, it may be difficult to stop an autonomous agent once the user of
> > said web proxy has dropped the web connection. I don't know, I haven't
> > written a web proxy, and this is one more thing to keep in mind when writing
> > one. I think it would be easier to follow robots.txt.
> > 
> > -spc (To the person who called me a dick for blocking a web proxy---yes,
> > there *are* reasons to block them)
> 
> I recently wrote a gemini to web proxy as a simple side-project to see how
> easy it would be to create, and one thing I implemented that I feel should
> be a standard for web proxies is not handling redirects internally. If you
> tell my gemini proxy to request a page that offers a redirect (say, the
> next page link for LEO), it will send you back a small web page saying
> "hey, the site at this URL wants to send you to this other URL, do you
> want to follow that redirect or nah?" (not exact wording but you get my
> drift). That is, if you attempt to access the Redirection from Hell test
> using my proxy, each and every redirect would be a "confirm redirect" page
> served to the user. After about 20 pages, you'd think the user would catch
> on. That being said, my gemini proxy is not linked anywhere on my website
> (and if it were in a place I would link publicly, I would use robots.txt
> to prevent web crawlers from accessing it), so perhaps I'm not the target
> of this message.

  You, specifically, weren't the target for my last bit, but I am addressing
in general those who write web proxies for Gemini.  Your proxy's method of
handling redirects works.  I was just a bit upset that an agent out there
made 137,000 requests [1] before anyone did anything about it.

> I still maintain that a proxy is a direct agent of a user, and not an
> automated client. Proxy authors should use robots.txt on the web side to
> block crawlers from accessing the proxy, but proxies shouldn't have to
> follow robots.txt.

  I understand the argument, but I can't say I'm completely on board with it
either, because ...

> It's actually easier to just write your web proxy in such a way that this
> doesn't happen to you.

you would probably be amazed at just how often clients *don't* limit
following redirects.  Most of the time, someone is sitting there, watching
their client, stopping it after perhaps 30 seconds, fixing the first
redirect issue (redirecting back to itself), only to get trapped at the
next step.

  And to think I brought this upon myself for wanting redirects in the first
place.

  -spc (How ironic)

[1]	For the record, it was NOT placing an undue burden on my server,
	just cluttering the log files.  It only becomes an issue when the
	log file reaches 2G in size; at that point, logging stops for
	everything.

9. Sean Conner (sean (a) conman.org)

It was thus said that the Great Sean Conner once stated:
> 
> you would probably be amazed at just how often clients *don't* limit
> following redirects.  Most of the time, someone is sitting there, watching
> their client, stopping it after perhaps 30 seconds, fixing the first
> redirect issue (redirecting back to itself), only to get trapped at the
> next step.

  I want to clarify that this usually happens when a new client is being
tested.  Somehow, the web proxy in question wasn't tested.

  -spc
