💾 Archived View for gemi.dev › gemini-mailing-list › 000494.gmi captured on 2024-12-17 at 14:53:14. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

On the use of %20 instead of + in URIs

1. Adnan Maolood (me (a) adnano.co)

Section 1.2 of the Gemini specification says:

> Spaces in gemini URIs should be encoded as %20, not +.

I noticed that some servers do not properly handle +, treating it as a
literal plus sign. Since this is a recommendation and not a must,
shouldn't servers treat plus signs as spaces? Not allowing spaces makes
it slightly more difficult to use URL parsing libraries which use plus
signs.

If servers should not treat plus signs as spaces, then I think that this
portion of the specification should be changed to say:

> Spaces in gemini URIs MUST be encoded as %20, not +.

Otherwise, I think an additional clarification is in order:

> Spaces in gemini URIs should be encoded as %20, not +.
> However, software should still treat + as a space.


2. John Cowan (cowan (a) ccil.org)

On Wed, Nov 25, 2020 at 12:14 AM Adnan Maolood <me at adnano.co> wrote:

> If servers should not treat plus signs as spaces, then I think that this
> portion of the specification should be changed to say:
>
> > Spaces in gemini URIs MUST be encoded as %20, not +.
>

+1  (not to be confused with " 1").  The usual use of x-www-form-urlencoded
format, as its name indicates, has to do with forms, which Gemini clients
don't do.  (Not that they couldn't in principle, to be sure.)




John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
As you read this, I don't want you to feel sorry for me, because,
I believe everyone will die someday.
               --From a Nigerian-type scam spam

3. Philip Linde (linde.philip (a) gmail.com)

On Wed, 25 Nov 2020 00:02:35 -0500
"Adnan Maolood" <me at adnano.co> wrote:

> Section 1.2 of the Gemini specification says:
> 
> > Spaces in gemini URIs should be encoded as %20, not +.
> 
> I noticed that some servers do not properly handle +, treating it as a
> literal plus sign. Since this is a recommendation and not a must,
> shouldn't servers treat plus signs as spaces? Not allowing spaces makes
> it slightly more difficult to use URL parsing libraries which use plus
> signs.

"+" doesn't generally have a special meaning that implies synonymity
with spaces. The sentence you're referring to should be considered a
clarifying point (because it's an easy mistake to make I guess), not a
difference from the standard it adopts (RFC 3986)

-- 
Philip

4. James Tomasino (tomasino (a) lavabit.com)

On 11/27/20 11:10 AM, Philip Linde wrote:
> "+" doesn't generally have a special meaning that implies synonymity
> with spaces. The sentence you're referring to should be considered a
> clarifying point (because it's an easy mistake to make I guess), not a
> difference from the standard it adopts (RFC 3986)

Philip is correct, but to add on, the '+' specifically has a semantic 
meaning as a space in query strings and form encodings 
(application/x-www-form-urlencoded) according to RFC 1866, but NOT in the 
rest of the URL itself. If you have a URL library that is parsing +'s as 
spaces in the URL outside of a query string, that library is incorrect. 
Only %20 is valid as a space in that context.

+ signs are listed as a sub-delimiter (RFC 3986 Section 2.2) and part of 
the reserved character set, meaning they should be encoded if not doing so 
would create confusion in the URL schema. Encoding a + sign would be %2B. 
You're welcome to do that if you want to avoid confusion, but it really 
shouldn't be necessary.
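
To illustrate the point about %2B, here is a minimal Go sketch, assuming the standard net/url package. The decodeSegment helper is a hypothetical name introduced only for this example. It shows that a path segment decodes to the same text whether a literal plus is written as + or as %2B, and that + is never turned into a space there.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeSegment is a hypothetical helper: it percent-decodes one path
// segment. url.PathUnescape leaves "+" as a literal plus sign.
func decodeSegment(s string) string {
	out, err := url.PathUnescape(s)
	if err != nil {
		panic(err)
	}
	return out
}

func main() {
	// PathEscape encodes the space but leaves "+" alone.
	fmt.Println(url.PathEscape("C++ notes")) // C++%20notes

	// Both spellings of the plus sign decode to the same text.
	fmt.Println(decodeSegment("C++%20notes"))     // C++ notes
	fmt.Println(decodeSegment("C%2B%2B%20notes")) // C++ notes
}
```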


5. Adnan Maolood (me (a) adnano.co)

On Fri Nov 27, 2020 at 6:37 AM EST, James Tomasino wrote:
> On 11/27/20 11:10 AM, Philip Linde wrote:
> > "+" doesn't generally have a special meaning that implies synonymity
> > with spaces. The sentence you're referring to should be considered a
> > clarifying point (because it's an easy mistake to make I guess), not a
> > difference from the standard it adopts (RFC 3986)
>
> Philip is correct, but to add on, the '+' specifically has a semantic
> meaning as a space in query strings and form encodings
> (application/x-www-form-urlencoded) according to RFC 1866, but NOT in
> the rest of the URL itself. If you have a URL library that is parsing
> +'s as spaces in the URL outside of a query string, that library is
> incorrect. Only %20 is valid as a space in that context.
>
> + signs are listed as a sub-delimiter (RFC 3986 Section 2.2) and part of
> the reserved character set, meaning they should be encoded if not doing
> so would create confusion in the URL schema. Encoding a + sign would be
> %2B. You're welcome to do that if you want to avoid confusion, but it
> really shouldn't be necessary.

The Go URL library encodes spaces in the query string as '+'. Is this
correct behavior?
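
The behavior in question can be reproduced with a short sketch, assuming Go's standard net/url package. The formEncode and pathEncode wrappers are hypothetical names used only to keep the comparison readable.

```go
package main

import (
	"fmt"
	"net/url"
)

// formEncode wraps url.QueryEscape, which uses form-style escaping
// (a space becomes "+").
func formEncode(s string) string {
	return url.QueryEscape(s)
}

// pathEncode wraps url.PathEscape, which percent-encodes a space as %20.
func pathEncode(s string) string {
	return url.PathEscape(s)
}

func main() {
	fmt.Println(formEncode("hello world")) // hello+world
	fmt.Println(pathEncode("hello world")) // hello%20world

	// url.Values.Encode uses the same form-style escaping.
	v := url.Values{}
	v.Set("q", "hello world")
	fmt.Println(v.Encode()) // q=hello+world
}
```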


6. James Tomasino (tomasino (a) lavabit.com)

On 11/27/20 4:20 PM, Adnan Maolood wrote:
> The Go URL library encodes spaces in the query string as '+'. Is this
> correct behavior?

It's correct, but be careful:

- x-www-form-urlencoded *should* use +
- query string *may* use either + or %20 (%20 is notable in mailto link subject values)
- remainder of URL *must* use %20

That last bit is the important one for this discussion. Any spaces before 
the ? must be %20.
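
The rule for the part before the ? can be sketched in Go, assuming the standard net/url package. The buildURL and requestPath helpers and the host example.org are placeholders introduced for illustration, not part of any Gemini library.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildURL is a hypothetical helper that assembles a Gemini URL for a path
// on the placeholder host example.org.
func buildURL(path string) string {
	u := url.URL{Scheme: "gemini", Host: "example.org", Path: path}
	return u.String()
}

// requestPath is a hypothetical helper that returns the decoded path of a
// request URL. url.Parse decodes %20 to a space and leaves "+" alone.
func requestPath(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return u.Path, nil
}

func main() {
	// A space in the path is serialized as %20.
	fmt.Println(buildURL("/my notes.gmi")) // gemini://example.org/my%20notes.gmi

	p, err := requestPath("gemini://example.org/a+b%20c?x+y%20z")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", p) // "/a+b c"
}
```

The query after the ? is left undecoded in the parsed URL's RawQuery field, so how to treat a + there remains a separate decision.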


7. colecmac (a) protonmail.com (colecmac (a) protonmail.com)

(My initial email only went to Adnan, sorry.)


> The Go URL library encodes spaces in the query string as '+'. Is this
> correct behavior?

Not for Gemini, no. According to RFCs it is okay, I believe. Whether or
not this should be changed, here's how my go-gemini library handles it.
These funcs should be used in place of the "net/url" ones.


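import (
    "net/url"
    "strings"
)

// QueryEscape percent-encodes a query string for Gemini: spaces become %20
// (not "+") and literal plus signs become %2B.
func QueryEscape(query string) string {
    return strings.ReplaceAll(url.PathEscape(query), "+", "%2B")
}

// QueryUnescape decodes percent-escapes; a "+" stays a literal plus sign.
func QueryUnescape(query string) (string, error) {
    return url.PathUnescape(query)
}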
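
A quick round trip with the two functions, as a hedged sketch: the function bodies are copied from this message, while the sample input and the main function are made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// QueryEscape is copied from the message: spaces become %20 and literal
// plus signs become %2B.
func QueryEscape(query string) string {
	return strings.ReplaceAll(url.PathEscape(query), "+", "%2B")
}

// QueryUnescape is copied from the message: "+" is not treated as a space.
func QueryUnescape(query string) (string, error) {
	return url.PathUnescape(query)
}

func main() {
	escaped := QueryEscape("1 + 1 = 2")
	fmt.Println(escaped) // 1%20%2B%201%20=%202

	plain, err := QueryUnescape(escaped)
	if err != nil {
		panic(err)
	}
	fmt.Println(plain) // 1 + 1 = 2
}
```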



I hope that's helpful.


makeworld
