💾 Archived View for gemi.dev › gemini-mailing-list › 000015.gmi captured on 2024-09-29 at 04:57:28. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

IPv6 and gemini

1. plugd (plugd (a) thelambdalab.xyz)

Hi all,

I've been having intermittent connection issues with some of the gemini
servers.  While this is of course to be expected at this point, it
seemed odd that, for instance, gemini://zaibatsu.circumlunar.space
seemed to be offline more often than not.  I finally got around to
looking at this more carefully, and noticed that using gnutls-cli to
open a connection to the IPv4 address always succeeds, whereas
connections to the IPv6 address that the resolver returns do not.
Since the network that I'm using supports IPv6 and falls back to IPv4
when necessary, this was causing a problem.
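
A minimal Python sketch of the diagnosis described above: list the addresses the resolver returns per family, then try each one in turn. The function names are hypothetical, and port 1965 (the usual Gemini port) is assumed as the default.

```python
import socket

def addresses_by_family(host, port=1965):
    """Return the resolver's answers for `host`, grouped by address family."""
    result = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        if family == socket.AF_INET:
            result["IPv4"].append(sockaddr[0])
        elif family == socket.AF_INET6:
            result["IPv6"].append(sockaddr[0])
    return result

def can_connect(address, port=1965, timeout=5.0):
    """True if a plain TCP connection to `address` succeeds within `timeout`."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect()` on every entry from `addresses_by_family("zaibatsu.circumlunar.space")` would show whether only one family is reachable.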

Furthermore, because of the requirement that clients transmit the full
URL to the server, it's not trivial to get around this by just directing
the client to the IPv4 address: the server probably won't recognise the
URL and will respond with an error code.

There seem to be four options:

1. Have clients only look at IPv4 addresses (i.e. ignore AAAA DNS
records).
 - Pro: would immediately solve the problem in this case.
 - Con: gemini gets stuck in the past.

2. Have clients do a reverse-DNS lookup when the gemini URL contains
an IP and use this to construct a URL to supply to the gemini server.
 - Pro: would allow clients to connect using URLs with literal IP
   addresses in them.
 - Cons:
    a) connecting via the hostname still wouldn't work.
    b) if a single server is hosting several gemini sites the result
       of the connection would be non-deterministic.
    
3. Have servers recognise IP URLs.
 - Pros and cons same as for 2.

4. Have servers ensure that if they have an AAAA record they also listen
for IPv6 gemini connections.
  - Pros: future-proof, no client-side changes necessary.
  - Con: some additional work necessary on the server side.

Obviously 4 is my favourite because it's less (no) work for me. :)

plugd

2. Sean Conner (sean (a) conman.org)

It was thus said that the Great plugd once stated:
> Hi all,
> 
> I've been having intermittent connection issues with some of the gemini
> servers.  While this is of course to be expected at this point, it
> seemed odd that, for instance, gemini://zaibatsu.circumlunar.space
> seemed to be offline more often than not.  I finally got around to
> looking at this more carefully, and noticed that using gnutls-cli to
> open a connection to the IPv4 address always succeeds, whereas
> connections to the IPv6 address that the resolver returns do not.
> Since the network that I'm using supports IPv6 and falls back to IPv4
> when necessary, this was causing a problem.
> 
> Furthermore, because of the requirement that clients transmit the full
> URL to the server, it's not trivial to get around this by just directing
> the client to the IPv4 address: the server probably won't recognise the
> URL and will respond with an error code.

  Worse than that:

[spc]lucy:~/projects/gemini/Lua>lua client.lua gemini://127.0.0.1/
ios:write() = name `127.0.0.1' not present in server certificate

  Of course, an option could be added to the client to overwrite the
hostname from the URL, so for example:

GenericUnixPrompt> geminiclient -h example.com gemini://127.0.0.1/

so it would use the IP address to connect, but instead of sending
'127.0.0.1' as the host, it would use the one passed in as an option.  That
might be easier said than done though [1].
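
As a rough illustration of that option in Python (not the Lua client above): dial an explicit address, but hand the URL's hostname to TLS and keep it in the request. The function names are hypothetical; port 1965 and the decision to skip certificate validation are assumptions made for the sketch.

```python
import socket
import ssl
from urllib.parse import urlsplit

def hostname_of(url):
    """Hostname part of a gemini:// URL."""
    return urlsplit(url).hostname

def request_line(url):
    """The request carries the full URL, hostname included, then CRLF."""
    return url.encode("utf-8") + b"\r\n"

def open_by_address(address, url, port=1965, timeout=10.0):
    """Connect to `address`, but present the URL's hostname to TLS and the server."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Assumption: self-signed certificates are expected, so validation is
    # switched off here. A real client would pin the certificate instead.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    raw = socket.create_connection((address, port), timeout=timeout)
    try:
        tls = context.wrap_socket(raw, server_hostname=hostname_of(url))
    except Exception:
        raw.close()
        raise
    tls.sendall(request_line(url))
    return tls
```

Something like `open_by_address("127.0.0.1", "gemini://example.com/")` would then behave like the `-h example.com` invocation shown above.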

> There seem to be four options:

  [ snip ]

> 4. Have servers ensure that if they have an AAAA record they also listen
> for IPv6 gemini connections.
>   - Pros: future-proof, no client-side changes necessary.
>   - Con: some additional work necessary on the server side.

  It depends.  On GLV-1.12556, all that takes is to use an address of "::"
to listen on all IPv4 and IPv6 interfaces, and that should work with any
modern IP stack.
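
The same idea can be sketched in Python (this is not GLV-1.12556's code). It assumes Python 3.8 or later for `socket.create_server`; the function name and the IPv4 fallback are additions made for illustration.

```python
import socket

def listen_everywhere(port=1965):
    """Listen on "::" for IPv4 and IPv6 if possible, else on IPv4 only."""
    if socket.has_dualstack_ipv6():
        try:
            return socket.create_server(("::", port),
                                        family=socket.AF_INET6,
                                        dualstack_ipv6=True)
        except OSError:
            pass  # IPv6 is compiled in but unusable on this host
    return socket.create_server(("0.0.0.0", port), family=socket.AF_INET)
```

With the dual-stack socket, IPv4 clients show up as IPv4-mapped IPv6 addresses such as `::ffff:192.0.2.1`.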

  There is also a fifth option:

	5. Have clients attempt to connect to both addresses and use the one
	that connects first.
	- Pro: should always work
	- Con: complicates the client
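
One way option 5 might look in Python, as a sketch only: every candidate address is dialled in its own thread and the first successful socket wins. This is similar in spirit to "Happy Eyeballs" but much cruder (no staggered start, no preference for IPv6). The function names are hypothetical.

```python
import socket
from concurrent.futures import ThreadPoolExecutor, as_completed

def _close_late_arrival(future):
    # A losing attempt that still connected must not leak its socket.
    if not future.cancelled() and future.exception() is None:
        future.result().close()

def connect_first(candidates, timeout=5.0):
    """Dial every (address, port) pair at once; return the first socket to connect."""
    if not candidates:
        raise OSError("no candidate addresses")
    pool = ThreadPoolExecutor(max_workers=len(candidates))
    futures = [pool.submit(socket.create_connection, addr, timeout)
               for addr in candidates]
    errors = []
    try:
        for future in as_completed(futures):
            try:
                winner = future.result()
            except OSError as exc:
                errors.append(exc)
                continue
            # Arrange for the slower attempts to be closed when they finish.
            for other in futures:
                if other is not future:
                    other.add_done_callback(_close_late_arrival)
            return winner
        raise OSError("all attempts failed: %r" % (errors,))
    finally:
        pool.shutdown(wait=False)
```

The candidates would typically be the `(address, port)` pairs taken from `socket.getaddrinfo()`.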

> Obviously 4 is my favourite because it's less (no) work for me. :)

  I like 4 as well.

  -spc

[1]	Looking at my own very simplistic client and yeah, I would have to
	use a lower level API call to do that.  Ick.

3. plugd (plugd (a) thelambdalab.xyz)


Sean Conner writes:

>> 4. Have servers ensure that if they have an AAAA record they also listen
>> for IPv6 gemini connections.
>>   - Pros: future-proof, no client-side changes necessary.
>>   - Con: some additional work necessary on the server side.
>
>   It depends.  On GLV-1.12556, all that takes is to use an address of "::"
> to listen on all IPv4 and IPv6 interfaces, and that should work with any
> modern IP stack.

I didn't want to presume. :-)

>   There is also a fifth option:
>
> 	5. Have clients attempt to connect to both addresses and use the one
> 	that connects first.
> 	- Pro: should always work
> 	- Con: complicates the client

Ideally this should already have been happening, but determining whether
a server is not listening is difficult, isn't it?  At some point you
have to simply define a timeout.  Which means that, unless you want
false negatives you have to wait a while (say a few seconds at least)
before declaring the connection a failure.  This is okay if it happens
once, but it'd be awful if this were happening for every single
selector.  Which means that clients would also have to implement some
local caching mechanism ...
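
The caching idea could be as small as the following Python sketch: remember which address last worked for a host and try it first for a while, so the timeout is paid once rather than on every request. The class name, the ten-minute lifetime, and the injectable clock are all assumptions.

```python
import time

class WorkingAddressCache:
    """Remember, per host, the address that last connected successfully."""

    def __init__(self, ttl=600.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._good = {}  # host -> (address, expiry time)

    def remember(self, host, address):
        self._good[host] = (address, self.clock() + self.ttl)

    def prioritise(self, host, candidates):
        """Return `candidates` with the remembered address, if any, moved first."""
        entry = self._good.get(host)
        if entry is None:
            return list(candidates)
        address, expires = entry
        if self.clock() >= expires or address not in candidates:
            del self._good[host]  # stale entry: fall back to resolver order
            return list(candidates)
        return [address] + [c for c in candidates if c != address]
```

A client would call `prioritise()` before connecting and `remember()` after a successful connection.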

>
>> Obviously 4 is my favourite because it's less (no) work for me. :)
>
>   I like 4 as well.

Yay!

plugd

4. solderpunk (solderpunk (a) SDF.ORG)

> >> Obviously 4 is my favourite because it's less (no) work for me. :)
> >
> >   I like 4 as well.
> 
> Yay!
>

I would like to hope that it won't be *too* much longer before 4 just
kind of happens because networking libraries will have gotten to the
point where you have to actively go out of your way to write IPv4-only
code.

Sorry that the Zaibatsu is doing a bad job of leading by example on
this front!  I was fully prepared to accept the blame for this, since
the Gegobi server was thrown together over a few evenings and I just
didn't think about IPv6.  I thought I'd quickly patch it to support
serving on both IP versions at once before making this post, but...

...it turns out the quick and easy TCP server utilities in Python's
socketserver module are (still, in 2019 for crying out loud)
hard-coded to only work with IPv4.  So it's not a quick patch, but a
slightly larger project. :(   Maybe I was a bit optimistic in my first
paragraph.
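
For what it is worth, `socketserver.TCPServer` takes its address family from the class attribute `address_family`, which defaults to `socket.AF_INET`, so a subclass may be enough. The sketch below is not Gegobi's code and has not been tried against it; the class and handler names are made up, and dual-stack behaviour depends on the platform honouring `IPV6_V6ONLY`.

```python
import socket
import socketserver

class DualStackTCPServer(socketserver.ThreadingTCPServer):
    """Threaded TCP server bound to an IPv6 socket that also accepts IPv4."""
    address_family = socket.AF_INET6
    allow_reuse_address = True

    def server_bind(self):
        # Clear IPV6_V6ONLY so IPv4 clients arrive as IPv4-mapped addresses.
        self.socket.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        super().server_bind()

class EchoHandler(socketserver.StreamRequestHandler):
    """Placeholder handler: echoes one line back to the client."""
    def handle(self):
        self.wfile.write(self.rfile.readline())
```

Usage would be along the lines of `DualStackTCPServer(("::", 1965), EchoHandler).serve_forever()`.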

Since starting the Gemini project, this is the second time I've felt
disappointed in Python's standard library, which is a rare thing.  At
gemini://mozz.us/journal/2019-08-21.txt, Michael writes about how the
ssl module can't accept a self-signed client certificate.  Even worse
than that, while it's possible to accept self-signed server
certificates, you can't get direct access to the details of that
certificate (like, say, the validity start and expiry dates).  With
CA-validated certificates you can easily get this data.  If the cert
is self-signed, you can only get an x509 encoded representation of the
cert - and there's nothing in the standard library to decode it!  The
ssl module really does seem to be designed to let people who don't
understand TLS very thoroughly write HTTPS stuff without shooting
themselves in the foot (the docs even talk explicitly about "Web
servers" instead of just "servers"!).  And, of course, I understand
why it's useful for a library with that kind of interface to exist.
But I *do* expect to be able to "go off-road" when I really want to.
This is going to make it a big pain to get a proper TOFU system in
AV-98, grumble, grumble.  When I eventually get around to writing my
own server, I'll definitely do it in something other than Python:
probably Go.
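
A TOFU check does not strictly need the decoded fields. A hedged Python sketch: hash the DER bytes that `SSLSocket.getpeercert(binary_form=True)` returns and compare against what was seen before. This sidesteps the decoding problem but does not recover the validity dates; the function names and the in-memory dict are placeholders for illustration.

```python
import hashlib

def fingerprint(der_bytes):
    """SHA-256 fingerprint of a DER-encoded certificate, as hex."""
    return hashlib.sha256(der_bytes).hexdigest()

def check_tofu(known, host, der_bytes):
    """Return 'new', 'match' or 'changed'; first sightings are stored in `known`."""
    seen = fingerprint(der_bytes)
    if host not in known:
        known[host] = seen
        return "new"
    return "match" if known[host] == seen else "changed"
```

A client would call `check_tofu(store, hostname, tls.getpeercert(binary_form=True))` right after the handshake and warn the user on "changed".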

Anyway, perhaps the most interesting thing to come out of this
conversation is the matter of using URLs with IP addresses instead of
hostnames as Gemini requests.  I'd guess a lot of existing servers
don't handle this well.  And, as mentioned, in the case of
hostname-based virtual servers, it's not entirely clear what handling
this well even means.  I wonder if it's worth explicitly disallowing
such requests?

-Solderpunk

5. Sean Conner (sean (a) conman.org)

It was thus said that the Great solderpunk once stated:
> I would like to hope that it won't be *too* much longer before 4 just
> kind of happens because networking libraries will have gotten to the
> point where you have to actively go out of your way to write IPv4-only
> code.
> 
> Sorry that the Zaibatsu is doing a bad job of leading by example on
> this front!  I was fully prepared to accept the blame for this, since
> the Gegobi server was thrown together over a few evenings and I just
> didn't think about IPv6.  I thought I'd quickly patch it to support
> serving on both IP versions at once before making this post, but...
> 
> ...it turns out the quick and easy TCP server utilities in Python's
> socketserver module are (still, in 2019 for crying out loud)
> hard-coded to only work with IPv4.  So it's not a quick patch, but a
> slightly larger project. :(   Maybe I was a bit optimistic in my first
> paragraph.

  Sad.  It's not that hard to support IPv6.  In C, just by calling
getaddrinfo() you can get both IPv4 and IPv6 addresses for a hostname (and
it supports parsing IP addresses of either family), and socket(), bind(),
listen(), connect(), etc. all support IPv6.

> Since starting the Gemini project, this is the second time I've felt
> disappointed in Python's standard library, which is a rare thing.  At
> gemini://mozz.us/journal/2019-08-21.txt, Michael writes about how the
> ssl module can't accept a self-signed client certificate.  Even worse
> than that, while it's possible to accept self-signed server
> certificates, you can't get direct access to the details of that
> certificate (like, say, the validity start and expiry dates).  With
> CA-validated certificates you can easily get this data.  If the cert
> is self-signed, you can only get an x509 encoded representation of the
> cert - and there's nothing in the standard library to decode it!  The
> ssl module really does seem to be designed to let people who don't
> understand TLS very thoroughly write HTTPS stuff without shooting
> themselves in the foot (the docs even talk explicitly about "Web
> servers" instead of just "servers"!).  And, of course, I understand
> why it's useful for a library with that kind of interface to exist.

  I suspect this was done intentionally to dissuade people from using
self-signed certificates because dragons.  Or hackers.  Or something.  Maybe
people can't validate certificates properly.  

> Anyway, perhaps the most interesting thing to come out of this
> conversation is the matter of using URLs with IP addresses instead of
> hostnames as Gemini requests.  I'd guess a lot of existing servers
> don't handle this well.  

  Remove the TLS restriction, and it's not a problem at all.  The problem is
that during the TLS negotiation, the hostname of the server you are
connecting to is sent [1] as part of the protocol.  There is a way to
extract which server is being referenced and thus, you can figure out which
set of files (or handlers) to serve the request from.

> And, as mentioned, in the case of
> hostname-based virtual servers, it's not entirely clear what handling
> this well even means.  I wonder if it's worth explicitly disallowing
> such requests?

  If you get foo.example.com as the host, service the request against this
set of content (subdirectory, etc); if bar.example.com, out of that set of
content.  That's all it really means.  The problem is getting the
information from TLS.
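
Since the client transmits the full URL, the hostname is also available from the request itself once the handshake is done. A minimal Python sketch of the lookup described above, with a hypothetical function name and a plain dict of made-up content roots:

```python
from urllib.parse import urlsplit

def content_root_for(url, roots):
    """Pick the content root for the hostname in a request URL, or None."""
    host = urlsplit(url).hostname  # urlsplit lower-cases the hostname
    return roots.get(host)
```

With `roots = {"foo.example.com": "/srv/foo", "bar.example.com": "/srv/bar"}`, a request for `gemini://127.0.0.1/` finds no entry, which is the ambiguous case raised earlier in the thread.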

  -spc

[1]	In my Lua wrapper for TLS, I have a function to connect to an
	endpoint using an address, and its signature is:

		ios = tcp.connecta(addr,hostname,to,config)

	(the last two parameters are optional).  You *have* to include a
	hostname.  libtls won't work without it.

6. solderpunk (solderpunk (a) SDF.ORG)

>   Sad.  It's not that hard to support IPv6.  In C, just by calling
> getaddrinfo() you can get both IPv4 and IPv6 addresses for a hostname (and
> it supports parsing IP addresses of either family), and socket(), bind(),
> listen(), connect(), etc. all support IPv6.

There are bindings for all of this in Python, too, in the `socket`
module.  VF-1 and AV-98 both use getaddrinfo to support connecting to
IPv6 servers.  But Agena and Gegobi both use some easy/lazy higher-level
tools in the `socketserver` module, which allow writing simple forking
or threaded TCP servers without having to bother actually writing any of
the networking code.  Obviously you sacrifice some degree of control and
performance, but I thought they'd be fine for non-critical applications
like, well, anything Gemini related in these early days!  Not handling
IPv6 is a bit of a deal-breaker, though, so I'll have to actually do
things properly...
 
>   I suspect this was done intentionally to dissuade people from using
> self-signed certificates because dragons.  Or hackers.  Or something.

Dragon hackers!

"Welcome to Gemini: here be dragons.  It's okay, they like TOFU!"

>   Remove the TLS restriction, and it's not a problem at all.  The problem is
> that during the TLS negotiation, the hostname of the server you are
> connecting to is sent [1] as part of the protocol.  There is a way to
> extract which server is being referenced and thus, you can figure out which
> set of files (or handlers) to serve the request from.

Hmm.  Are you talking about SNI (Server Name Indication) here?

-Solderpunk

7. Sean Conner (sean (a) conman.org)

It was thus said that the Great solderpunk once stated:
> 
> >   Remove the TLS restriction, and it's not a problem at all.  The problem is
> > that during the TLS negotiation, the hostname of the server you are
> > connecting to is sent [1] as part of the protocol.  There is a way to
> > extract which server is being referenced and thus, you can figure out which
> > set of files (or handlers) to serve the request from.
> 
> Hmm.  Are you talking about SNI (Server Name Indication) here?

  Probably.  I haven't looked too much into it yet.

  -spc

8. solderpunk (solderpunk (a) SDF.ORG)

On Sun, Sep 15, 2019 at 04:31:44PM -0400, Sean Conner wrote:
>   Probably.  I haven't looked too much into it yet.

Nor I, but I think that's an optional extension on TLS which we can't
rely on.

Dangerously close to veering off topic here, so expect a new thread
soon, but I've started trying to do a proper TOFU-style certificate
handling scheme in AV-98.  An immediate question is whether to remember
previously seen certificates against hostnames or IP addresses.  The
most obvious place where this could really matter is for servers (with a
single IP) serving up multiple Gemini sites under different hostnames.
If the client *doesn't* send a hostname very early on in the handshake
then it seems to me there is no way for the server to use distinct certs
per hostname.
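
When the client does send a name via SNI, Python's `ssl` module lets a server switch certificates during the handshake through `SSLContext.sni_callback`. The sketch below assumes Python 3.7 or later; the helper name is hypothetical, and each context in the mapping is assumed to have been loaded with that site's certificate beforehand.

```python
import ssl

def make_sni_callback(contexts):
    """Build an sni_callback that swaps in the context for the requested name.

    `contexts` maps hostname -> ssl.SSLContext holding that site's certificate.
    """
    def on_sni(ssl_socket, server_name, initial_context):
        chosen = contexts.get(server_name)
        if chosen is not None:
            ssl_socket.context = chosen
        # Returning None lets the handshake continue; with no SNI or an
        # unknown name, the initial context's certificate is used.
        return None
    return on_sni
```

The callback would be installed on the listening context with `default_context.sni_callback = make_sni_callback({...})`. Clients that send no name simply get the default certificate, which is the limitation described above.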

I need to do some reading, no doubt this is entirely well-trod ground in
HTTPS-land.

-Solderpunk
