💾 Archived View for gemi.dev › gemini-mailing-list › 000556.gmi captured on 2024-08-19 at 01:09:54. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

[spec] What to do of fragments when there is a redirection

📧 Messages: 27
🗣️ Authors: 12
📅 First Message: 2020-12-22 09:59
📅 Last Message: 2021-01-05 23:26

1. Stephane Bortzmeyer (stephane (a) sources.org)

📅 Sent: 2020-12-22 09:59
📧 Message 1 of 27

The Gemini specification apparently does not mention what to do with
fragments when there is a redirection.

If original URI is <gemini://foobar.example/#baz> and there is a
redirect to <gemini://thing.example/doit>, should the Gemini client
consider it has to go to <gemini://thing.example/doit#baz> or to
<gemini://thing.example/doit>?

The spec says "The path, query and fragment components are allowed and
have no special meanings beyond those defined by the generic syntax."
But RFC 3986 just describes a *syntax*, it seems silent about
semantics.

Therefore, for HTTP, RFC 7231 has to describe in detail what the
client ("user agent", in HTTP parlance) has to do with
fragments. Should we consider that "in doubt, do as HTTP does"?

Link to individual message.

2. John Cowan (cowan (a) ccil.org)

📅 Sent: 2020-12-22 20:34
📧 Message 2 of 27

>
> Therefore, for HTTP, RFC 7231 has to describe in detail what the
> client ("user agent", in HTTP parlance) has to do with
> fragments. Should we consider that "in doubt, do as HTTP does"?
>

I think we should adopt the following RFC 7231 compatible rules.

0) The semantics of a fragment are defined solely by the media type of the
resource and not by the rest of the URL.

1) Clients MUST NOT send a fragment to the server.  If the server needs to
know the content of a fragment, it should be part of the path or the query
string instead.  This was a firm HTTP rule until fragments beginning with !
were invented, and IMO should be a firm Gemini rule.

2) If the client gets a redirect containing a fragment, the client MUST
apply this fragment when the redirected resource is retrieved, ignoring any
original fragment.

3) If the original URL has a fragment and the redirect doesn't, the client
MUST apply the original fragment when the redirected resource is retrieved.

4) Content authors MUST NOT put personally identifying information into
fragments, as they can be transferred from one host to another by the
operation of rule 3.
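
[Archive note, not part of the original message] Rules 1 to 3 above could be sketched in Python roughly as follows. This is only an illustration: it assumes the redirect target is an absolute URL, and the function names `request_url` and `resolve_redirect` are hypothetical, not taken from any Gemini client.

```python
from urllib.parse import urlsplit


def request_url(url: str) -> str:
    """Rule 1: the URL sent to the server never carries a fragment."""
    return urlsplit(url)._replace(fragment="").geturl()


def resolve_redirect(original: str, target: str) -> str:
    """Return the URL the client should treat as current after a redirect.

    Assumes `target` is absolute (hypothetical simplification).
    """
    old = urlsplit(original)
    new = urlsplit(target)
    if new.fragment:
        # Rule 2: a fragment in the redirect wins; the original is ignored.
        return new.geturl()
    # Rule 3: no fragment in the redirect, so the original one is carried over.
    return new._replace(fragment=old.fragment).geturl()
```

With the URLs from the first message, `resolve_redirect("gemini://foobar.example/#baz", "gemini://thing.example/doit")` yields the `#baz` variant of the target, which is also why rule 4 warns that fragments can travel from one host to another.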



John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
He who would do good to another must do it in Minute Particulars;
General Good is the plea of the scoundrel, hypocrite and flatterer:
For Art and Science cannot exist but in minutely organized Particulars.
  --William Blake, il miglior fabbro

Link to individual message.

3. Gary Johnson (lambdatronic (a) disroot.org)

📅 Sent: 2020-12-22 20:51
📧 Message 3 of 27

Stephane Bortzmeyer <stephane at sources.org> writes:

> The Gemini specification apparently does not mention what to do with
> fragments when there is a redirection.
>
> If original URI is <gemini://foobar.example/#baz> and there is a
> redirect to <gemini://thing.example/doit>, should the Gemini client
> consider it has to go to <gemini://thing.example/doit#baz> or to
> <gemini://thing.example/doit>?

This is a reasonable question. In HTTP, fragments are usually used to
denote a particular named anchor on the same page. While I support
parsing fragments from Gemini requests in my server and presenting them
to CGI-like functions, I don't include fragments in redirects to other
pages in the same way that I would leave off the query from one request
when redirecting to another page. AFAIK, both the query and fragment are
specific to a particular path on the server and don't transfer to
others.

Hopefully, Solderpunk can clarify this once and for all for us.

Best,
  Gary

-- 
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

Why is HTML email a security nightmare? See https://useplaintext.email/

Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

Link to individual message.

4. Luke Emmet (luke (a) marmaladefoo.com)

📅 Sent: 2020-12-22 23:20
📧 Message 4 of 27

On 22-Dec-2020 20:34, John Cowan wrote:
>
>     Therefore, for HTTP, RFC 7231 has to describe in detail what the
>     client ("user agent", in HTTP parlance) has to do with
>     fragments. Should we consider that "in doubt, do as HTTP does?")
>
>
> I think we should adopt the following RFC 7231 compatible rules.
>
> 0) The semantics of a fragment are defined solely by the media type of 
> the resource and not by the rest of the URL.
>
> 1) Clients MUST NOT send a fragment to the server.  If the server 
> needs to know the content of a fragment, it should be part of the path 
> or the query string instead.  This was a firm HTTP rule until 
> fragments beginning with ! were invented, and IMO should be a firm 
> Gemini rule.
>
> 2) If the client gets a redirect containing a fragment, the client 
> MUST apply this fragment when the redirected resource is 
> retrieved, ignoring any original fragment.
>
> 3) If the original URL has a fragment and the redirect doesn't, the 
> client MUST apply the original fragment when the redirected resource 
> is retrieved.
>
> 4) Content authors MUST NOT put personally identifying information 
> into fragments, as they can be transferred from one host to another by 
> the operation of rule 3.

I agree with this - that they should be client side only, but can be 
part of a persistent URL.

The question that remains, in particular, is what the semantics of the
fragment should be for our most prevalent media type - text/gemini?

I would propose that the fragment should indicate an offset to one of 
the headers in the page - but we'd need to agree how they should be 
formulated.

Some Markdown engines seem to have an established convention of 
creating the fragment name based on the header text, suitably 
normalised, e.g.

# Here is a heading

would have the associated offset fragment: "here-is-a-heading"

this approach is somewhat robust to page edits

Another approach could be to use the line index e.g. endpoint#12 means 
the twelfth line in that page. This is more fine-grained, but is a bit 
more fragile.
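
[Archive note, not part of the original message] The two approaches above can be compared with a small Python sketch. The normalisation rule (lowercase, runs of non-word characters collapsed to a hyphen) is an assumption modelled loosely on common Markdown tooling, and `heading_slug` and `find_target` are hypothetical names. Note that a heading such as "# 2020" would produce an all-digit slug that this sketch treats as a line index.

```python
import re


def heading_slug(line: str) -> str:
    """Turn a heading line into a fragment name, e.g. 'here-is-a-heading'."""
    text = line.lstrip("#").strip().lower()
    return re.sub(r"[^\w]+", "-", text).strip("-")


def find_target(fragment: str, lines):
    """Return the 0-based index of the line a fragment points at, or None."""
    if re.fullmatch(r"[0-9]+", fragment):
        # Line-index variant: '#12' means the twelfth line of the page.
        index = int(fragment) - 1
        return index if 0 <= index < len(lines) else None
    # Heading variant: first heading whose normalised text equals the fragment.
    for i, line in enumerate(lines):
        if line.startswith("#") and heading_slug(line) == fragment:
            return i
    return None
```

Inserting a paragraph above a heading changes every line index below it but leaves the slug untouched, which is the robustness difference described above.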

Then again, do we really have much of a need to do page-specific 
indexing for URLs? Most Gemini pages are quite simple and not too 
complicated. So maybe we just get by without the fragment?

  - Luke

Link to individual message.

5. Sean Conner (sean (a) conman.org)

📅 Sent: 2020-12-22 23:38
📧 Message 5 of 27

It was thus said that the Great Luke Emmet once stated:
> 
> The question that remains particularly, is what should be the semantics 
> of the fragment for our most prevalent media type - text/gemini?

  So here's a totally off-the-cuff spec I'm pulling out of my nether regions
that seems simple and somewhat robust.

  You have the following item types in Gemini:

	text lines
	link lines
	pre toggles
	heading lines
	unordered list lines
	quote lines

  All of these can be represented by a single letter 'T', 'L', 'P', 'U'
and 'Q' with the header lines as 'H1', 'H2' and 'H3'.  So, some examples:

	foo#h1.1	- first top level header
	foo#h2.4	- fourth second level header
	foo#l.9		- ninth link line
	foo#h3.2l.2	- second link after second third level header
	foo#p.2t.1	- first line past second pre-toggle

	gemini://gemini.circumlunar.space/docs/specification.gmi#h1.4h2.2h2
			- references section 3.2.2 2x (SUCCESS) section
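
[Archive note, not part of the original message] One possible reading of this scheme, sketched in Python. Several points are assumptions rather than anything stated above: every step is written as `<type>.<count>`, each step counts forward from the line selected by the previous step, blank lines and lines inside a preformatted block count as text lines, and the names `classify` and `resolve` are hypothetical.

```python
import re

STEP = re.compile(r"(h[123]|[tlpuq])\.(\d+)")
PRE_TOGGLE = "`" * 3  # the gemtext preformatting toggle


def classify(lines):
    """Label each gemtext line with the type code used by the fragment scheme."""
    kinds, in_pre = [], False
    for line in lines:
        if line.startswith(PRE_TOGGLE):
            kinds.append("p")
            in_pre = not in_pre
        elif in_pre:
            kinds.append("t")  # assumption: preformatted content counts as text
        elif line.startswith("=>"):
            kinds.append("l")
        elif line.startswith("###"):
            kinds.append("h3")
        elif line.startswith("##"):
            kinds.append("h2")
        elif line.startswith("#"):
            kinds.append("h1")
        elif line.startswith("* "):
            kinds.append("u")
        elif line.startswith(">"):
            kinds.append("q")
        else:
            kinds.append("t")
    return kinds


def resolve(fragment, lines):
    """Return the 0-based index of the addressed line, or None if unresolvable."""
    wanted = fragment.lower()
    steps = STEP.findall(wanted)
    if not steps or "".join(f"{k}.{n}" for k, n in steps) != wanted:
        return None  # not entirely made of well-formed steps
    kinds = classify(lines)
    pos = -1
    for kind, count in steps:
        remaining = int(count)
        if remaining < 1:
            return None
        for i in range(pos + 1, len(kinds)):
            if kinds[i] == kind:
                remaining -= 1
                if remaining == 0:
                    pos = i
                    break
        else:
            return None  # ran out of lines before reaching the count
    return pos
```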

  -spc (See?  Very simple)

Link to individual message.

6. Nathan Galt (mailinglists (a) ngalt.com)

📅 Sent: 2020-12-22 23:41
📧 Message 6 of 27


> On Dec 22, 2020, at 3:20 PM, Luke Emmet <luke at marmaladefoo.com> wrote:
>
> On 22-Dec-2020 20:34, John Cowan wrote:
>>
>>    Therefore, for HTTP, RFC 7231 has to describe in detail what the
>>    client ("user agent", in HTTP parlance) has to do with
>>    fragments. Should we consider that "in doubt, do as HTTP does?")
>>
>> I think we should adopt the following RFC 7231 compatible rules.
>>
>> 0) The semantics of a fragment are defined solely by the media type of
>> the resource and not by the rest of the URL.
>>
>> 1) Clients MUST NOT send a fragment to the server.  If the server needs
>> to know the content of a fragment, it should be part of the path or the
>> query string instead.  This was a firm HTTP rule until fragments beginning
>> with ! were invented, and IMO should be a firm Gemini rule.
>>
>> 2) If the client gets a redirect containing a fragment, the client MUST
>> apply this fragment when the redirected resource is retrieved, ignoring
>> any original fragment.
>>
>> 3) If the original URL has a fragment and the redirect doesn't, the
>> client MUST apply the original fragment when the redirected resource is
>> retrieved.
>>
>> 4) Content authors MUST NOT put personally identifying information into
>> fragments, as they can be transferred from one host to another by the
>> operation of rule 3.
>
> I agree with this - that they should be client side only, but can be
> part of a persistent URL.
>
> The question that remains particularly, is what should be the semantics
> of the fragment for our most prevalent media type - text/gemini?
>
> I would propose that the fragment should indicate an offset to one of
> the headers in the page - but we'd need to agree how they should be
> formulated.
>
> Some Markdown engines seems to have an established convention of
> creating the fragment name based on the header text, suitably
> normalised, e.g.
>
> # Here is a heading
>
> would have the associated offset fragment: "here-is-a-heading"
>
> this approach is somewhat robust to page edits
>
> Another approach could be to use the line index e.g. endpoint#12 means
> the twelfth line in that page. This is more finegrained, but is a bit
> more fragile.
>
> Then again, do we really have much of a need to do page-specific
> indexing for URLs? Most Gemini pages are quite simple and not too
> complicated. So maybe we just get by without the fragment?
>
> - Luke

Chrome also supports text fragments like 
`#:~:text=an%20example%20text%20fragment` , and they're trying to have 
this feature be a proper web standard. No other browser has bothered 
implementing it, though.

https://wicg.github.io/scroll-to-text-fragment/

PDFs also support `#page=42`.

https://helpx.adobe.com/acrobat/kb/link-html-pdf-page-acrobat.html

Link to individual message.

7. Petite Abeille (petite.abeille (a) gmail.com)

📅 Sent: 2020-12-22 23:46
📧 Message 7 of 27



> On Dec 23, 2020, at 00:38, Sean Conner <sean at conman.org> wrote:
> 
>  -spc (See?  Very simple)

Wow. Perhaps text/gemini should have no fragment whatsoever. No fragment 
in text/gemini, what about that? Simpler for sure.

Link to individual message.

8. Petite Abeille (petite.abeille (a) gmail.com)

📅 Sent: 2020-12-22 23:48
📧 Message 8 of 27



> On Dec 23, 2020, at 00:41, Nathan Galt <mailinglists at ngalt.com> wrote:
> 
> Chrome also supports text fragments like
> `#:~:text=an%20example%20text%20fragment` , and they're trying to have
> this feature be a proper web standard. No other browser has bothered
> implementing it, though.

I can see why no one bothered. For sure no fragment in text/gemini :)

Link to individual message.

9. Philip Linde (linde.philip (a) gmail.com)

📅 Sent: 2020-12-23 00:51
📧 Message 9 of 27

On Tue, 22 Dec 2020 18:38:45 -0500
Sean Conner <sean at conman.org> wrote:

>   So here's a totally off-the-cuff spec I'm pulling out of my nether regions
> that seems simple and somewhat robust.

That's only good until the author of the document you link to modifies
it at an inconvenient place. It's only one step removed from defining
the fragment to be a byte offset.

It might be more robust to define the fragment as referring to the
first heading line that has the fragment content as a prefix, but
that's still prone to break with document changes.

In the most flexible of worlds, the fragment is a regular expression
that matches the line it refers to and an index to select one of
potentially many matching lines, but I don't quite like that idea.
There are too many subtly different regex implementations for it to be
practical, and it fundamentally doesn't solve the problem that
expectations will change with the text content of the document.

For a good balance, one might have the fragment be an exact match of
the heading line you refer to, with simple wildcards like "*" for any
(or no) string of characters and "?" for any one character.
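
[Archive note, not part of the original message] The wildcard idea maps closely onto shell-style patterns, so a minimal Python sketch can lean on the standard `fnmatch` module. Two assumptions are made here: the pattern is compared against the heading text without its leading `#` markers, and `[` is escaped because `fnmatch` would otherwise treat it as the start of a character class, which the proposal does not mention. The name `find_heading` is hypothetical.

```python
from fnmatch import fnmatchcase


def find_heading(pattern: str, lines):
    """Return the index of the first heading whose text matches the pattern.

    '*' matches any (or no) string of characters, '?' exactly one character.
    """
    # Neutralise fnmatch's bracket syntax so only '*' and '?' are special.
    safe = pattern.replace("[", "[[]")
    for i, line in enumerate(lines):
        if line.startswith("#") and fnmatchcase(line.lstrip("#").strip(), safe):
            return i
    return None
```

Without a wildcard the match is exact, so `Setup` does not select a heading reading "Setup guide", while `Setup*` does.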

Here it seems that the separation of presentation and rendered content
in HTML is useful. You can change a heading or edit a document and
still refer to parts of it using a set of IDs that remain consistent
through the changes. Lacking that, perhaps one of these more or
less error-prone solutions is OK for Gemini, but I tend to agree with
Petite Abeille that fragments should not have a special meaning for
text/gemini.

-- 
Philip

Link to individual message.

10. John Cowan (cowan (a) ccil.org)

📅 Sent: 2020-12-23 18:39
📧 Message 10 of 27

On Tue, Dec 22, 2020 at 7:51 PM Philip Linde <linde.philip at gmail.com> wrote:

> It might be more robust to define the fragment as referring to the
> first heading line that has the fragment content as a prefix, but
> that's still prone to break with document changes.
>

Even HTML fragments break if their referents are deleted.  Nothing is
completely immune.

> For a good balance, one might have the fragment be an exact match of
> the heading line you refer to
>

That's reasonable, but I think a prefix match would suffice; that way you
aren't tempted to make headings overly short in order to keep fragments
small.

John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
In politics, obedience and support are the same thing.  --Hannah Arendt

Link to individual message.

11. William Orr (will (a) worrbase.com)

📅 Sent: 2020-12-23 22:22
📧 Message 11 of 27

Wouldn't this be dependent on the other discussion of IRIs, since gemtext 
can have arbitrary unicode? Also would require clients to NFC normalize 
the prefix/heading lines before doing the matching.
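
[Archive note, not part of the original message] A minimal Python sketch of the normalisation step mentioned here, applied to the prefix-match proposal. It assumes the fragment arrives percent-encoded as UTF-8 and, as a simplification, treats every line starting with `#` as a heading without tracking preformatted blocks. The name `find_heading_by_prefix` is hypothetical.

```python
import unicodedata
from urllib.parse import unquote


def find_heading_by_prefix(fragment: str, lines):
    """Return the index of the first heading starting with the fragment text."""
    # Decode percent-escapes, then bring both sides to NFC before comparing.
    wanted = unicodedata.normalize("NFC", unquote(fragment))
    if not wanted:
        return None
    for i, line in enumerate(lines):
        if not line.startswith("#"):
            continue
        text = unicodedata.normalize("NFC", line.lstrip("#").strip())
        if text.startswith(wanted):
            return i
    return None
```

Without the normalisation, a heading typed with a combining accent would not match a fragment that encodes the precomposed character, even though both render identically.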

23 Dec. 2020 19:40:00 John Cowan <cowan at ccil.org>:

> On Tue, Dec 22, 2020 at 7:51 PM Philip Linde <linde.philip at gmail.com> wrote:
>
>> It might be more robust to define the fragment as referring to the
>> first heading line that has the fragment content as a prefix, but
>> that's still prone to break with document changes.
>
> Even HTML fragments break if their referents are deleted.  Nothing is
> completely immune.
>
>> For a good balance, one might have the fragment be an exact match of
>> the heading line you refer to
>
> That's reasonable, but I think a prefix match would suffice; that way
> you aren't tempted to make headings overly short in order to keep
> fragments small.
>
> John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
> In politics, obedience and support are the same thing.  --Hannah Arendt

Link to individual message.

12. Luke Emmet (luke (a) marmaladefoo.com)

📅 Sent: 2020-12-23 22:44
📧 Message 12 of 27

On 23-Dec-2020 22:22, William Orr wrote:
> Wouldn't this be dependent on the other discussion of IRIs, since 
> gemtext can have arbitrary unicode? Also would require clients to NFC 
> normalize the prefix/heading lines before doing the matching.
>
> 23 Dec. 2020 19:40:00 John Cowan <cowan at ccil.org>:
>
>
>
>     On Tue, Dec 22, 2020 at 7:51 PM Philip Linde
>     <linde.philip at gmail.com> wrote:
>
>         It might be more robust to define the fragment as referring to
>         the
>         first heading line that has the fragment content as a prefix, but
>         that's still prone to break with document changes.
>
>
>     Even HTML fragments break if their referents are deleted.  Nothing
>     is completely immune.
>
>         For a good balance, one might have the fragment be an exact
>         match of
>         the heading line you refer to
>
>
>     That's reasonable, but I think a prefix match would suffice; that
>     way you aren't tempted to make headings overly short in order to
>     keep fragments small.
>

How about this scheme:

  - use the full text of the headings only as index points, encoded in a 
simple way and truncated for simplicity

The pseudocode would be:

    marker = take-left (base64 heading) 12

Example, imagine a document having headings and we want to calculate a 
match or a lookup for some heading. Let's say the heading text is:

"# This is a heading"

the marker is therefore "IyBUaGlzIGlz"

So a link to that heading would be:

gemini://server/path/to/end/point.gmi#IyBUaGlzIGlz

(or some other value, doesn't have to be 12, but feels about right)

this would have the following advantages:

1. Content-addressable, so quite robust to insertions, deletions 
elsewhere in the document, whereas the offsets/counting schemes are less 
robust. Although as others have pointed out, there is no completely 
robust mechanism that is tolerant of any change to the document.

2. Not too long, but long enough so there is a reasonable likelihood not 
to have too many false positive hits in any document, generally.

3. Easily calculated and fast

4. UI is simple - select the heading (e.g. right click or whatever the 
equivalent gesture would be in your client), and have the client tell 
you the corresponding marker

5. Works with unicode heading content

6. URL friendly
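
[Archive note, not part of the original message] The pseudocode above translates almost directly into Python. In this sketch the heading is assumed to be encoded as UTF-8 before base64 is applied, and the name `heading_marker` is hypothetical. The standard base64 alphabet is used because it reproduces the marker given in the example; it can emit `+` and `/`, and whether a URL-safe alphabet was intended is not stated.

```python
import base64


def heading_marker(heading: str, length: int = 12) -> str:
    """Return the first `length` base64 characters of the full heading line."""
    encoded = base64.b64encode(heading.encode("utf-8")).decode("ascii")
    return encoded[:length]


print(heading_marker("# This is a heading"))  # IyBUaGlzIGlz
```

Twelve base64 characters carry 72 bits, so the marker reflects only the first nine bytes of the line (here "# This is"). Headings that share those nine bytes get the same marker, which is the false-positive risk mentioned in point 2.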

Or maybe some sort of variant on something like this.

Or maybe we just live without them - we can always return to this some 
other time if there really is a pressing user need. Is there really a 
requirement for this yet?

  - Luke

Link to individual message.

13. Stephane Bortzmeyer (stephane (a) sources.org)

📅 Sent: 2020-12-24 14:07
📧 Message 13 of 27

On Tue, Dec 22, 2020 at 03:34:03PM -0500,
 John Cowan <cowan at ccil.org> wrote 
 a message of 93 lines which said:

> I think we should adopt the following RFC 7231 compatible rules.

It seems a good idea. I hope it will be included in the official
specification, or in a companion specification.

> 3) If the original URL has a fragment and the redirect doesn't, the client
> MUST apply the original fragment when the redirected resource is retrieved.

Even if it's in a different capsule?

Link to individual message.

14. Stephane Bortzmeyer (stephane (a) sources.org)

📅 Sent: 2020-12-24 14:52
📧 Message 14 of 27

On Tue, Dec 22, 2020 at 11:20:19PM +0000,
 Luke Emmet <luke at marmaladefoo.com> wrote 
 a message of 61 lines which said:

> The question that remains particularly, is what should be the
> semantics of the fragment for our most prevalent media type -
> text/gemini?

It is an interesting question but unrelated to the subject of this
thread.

> Then again, do we really have much of a need to do page-specific
> indexing for URLs? Most Gemini pages are quite simple and not too
> complicated. So maybe we just get by without the fragment?

Note there is a standard for fragments on plain text: RFC 5147
<gemini://gemini.bortzmeyer.org/rfc-mirror/rfc5147.txt>.
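
[Archive note, not part of the original message] For readers unfamiliar with RFC 5147, a rough Python sketch of its `line=` scheme follows. It is written from memory of the RFC and deliberately partial: it assumes positions count the gaps between lines starting at 0 (so `line=10,20` selects lines 11 through 20), it ignores the `char=` scheme, and it skips any integrity checks after a `;`. The name `select_lines` is hypothetical.

```python
import re

LINE_SCHEME = re.compile(r"line=(\d*)(?:(,)(\d*))?")


def select_lines(fragment: str, lines):
    """Return the lines addressed by a 'line=' fragment, or None if unsupported."""
    scheme = fragment.split(";", 1)[0]  # drop integrity checks, if any
    match = LINE_SCHEME.fullmatch(scheme)
    if match is None:
        return None
    start, comma, end = match.groups()
    if comma is None:
        # A bare position identifies a point between lines, not a range.
        if not start:
            return None
        position = int(start)
        return lines[position:position]
    if not start and not end:
        return None
    low = int(start) if start else 0
    high = int(end) if end else len(lines)
    return lines[low:high]
```

Because the addressing is purely positional, inserting a single line near the top of a document shifts what every such fragment points at.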

Link to individual message.

15. Luke Emmet (luke (a) marmaladefoo.com)

📅 Sent: 2020-12-24 15:13
📧 Message 15 of 27


On 24-Dec-2020 14:52, Stephane Bortzmeyer wrote:
>> The question that remains particularly, is what should be the
>> semantics of the fragment for our most prevalent media type -
>> text/gemini?
> It is an interesting question but unrelated to the subject of this
> thread.
Fair point, if this thread is only about the protocol aspects, not their 
application to text/gemini.

But it is still something to be sorted out if we will use them in gemini 
content.

>> Then again, do we really have much of a need to do page-specific
>> indexing for URLs? Most Gemini pages are quite simple and not too
>> complicated. So maybe we just get by without the fragment?
> Note there is a standard for fragments on plain text: RFC 5147
> <gemini://gemini.bortzmeyer.org/rfc-mirror/rfc5147.txt>.
>
Yes that might be useful for plain text, but gemini is not plain text, 
it is text/gemini. Similarly HTML is not plain text, rather text/html, 
and it specifies how fragments are identified within the HTML content, 
via <a name="..."> anchors or id attributes.

We could consider using RFC5147 for any content served as text/plain.

  - Luke

Link to individual message.

16. Stephane Bortzmeyer (stephane (a) sources.org)

📅 Sent: 2020-12-24 15:25
📧 Message 16 of 27

On Thu, Dec 24, 2020 at 03:13:36PM +0000,
 Luke Emmet <luke at marmaladefoo.com> wrote 
 a message of 27 lines which said:

> > Note there is a standard for fragments on plain text: RFC 5147
> > <gemini://gemini.bortzmeyer.org/rfc-mirror/rfc5147.txt>.
> > 
> Yes that might be useful for plain text, but gemini is not plain text, 

It is not but it is close to plain text. There have been several
interesting proposals on this list for fragment semantics. The good
thing about RFC 5147 is that it already exists.

Link to individual message.

17. Luke Emmet (luke (a) marmaladefoo.com)

📅 Sent: 2020-12-24 17:55
📧 Message 17 of 27

On 24-Dec-2020 15:25, Stephane Bortzmeyer wrote:
>>> Note there is a standard for fragments on plain text: RFC 5147
>>> <gemini://gemini.bortzmeyer.org/rfc-mirror/rfc5147.txt>.
>> Yes that might be useful for plain text, but gemini is not plain text,
> It is not but it is close to plain text. There have been several
> interesting proposals on this list for fragment semantics. The good
> thing about RFC 5147 is that it already exists.

You could say HTML is close to plain text, since it is implemented in 
text, so why not use it there?

The limitation of RFC 5147 is that it is not at all robust to edits. 
And gemini resources are not the type of media that are never 
edited, so the match is not so good IMO.

A semantic based addressing scheme would be more robust, and the most 
obvious structure within a gemini document is its heading structure.

  - Luke

Link to individual message.

18. Solderpunk (solderpunk (a) posteo.net)

📅 Sent: 2020-12-26 14:12
📧 Message 18 of 27

On Tue Dec 22, 2020 at 9:34 PM CET, John Cowan wrote:

Given the potential for "what should fragments mean/do in Gemini?" to
turn into yet another endless discussion, and given their relatively low
importance (Gopher has survived without them for 30 years and I,
personally, don't remember ever really missing them in that context),
I'm tempted to take quite seriously Petite Abeille's suggestion to
simply remove them entirely.  However, for the sake of responding
coherently to other issues raised in this thread, let's suppose
fragments hang around:

> 1) Clients MUST NOT send a fragment to the server.

I agree with this, and would be happy to make it explicit in the spec.

> 2) If the client gets a redirect containing a fragment, the client MUST
> apply this fragment when the redirected resource is retrieved, ignoring
> any
> original fragment.
>
> 3) If the original URL has a fragment and the redirect doesn't, the
> client
> MUST apply the original fragment when the redirected resource is
> retrieved.

This I'm not so clear on.  If a server receives a request for a URL
which doesn't include a fragment (which everybody seems to agree is how
things should work), why on Earth would it want to redirect to a URL
which *does* have a fragment?  The only case I can think of where this
might be useful is if multiple separate documents were merged into a
single document, with fragments pointing to the distinct subsections
which used to stand alone.  That's neat, but is it so important that we
should support it instead of doing the brutally simple thing of just
saying that redirect URLs, like request URLs, should not include
fragments, and that following a redirect involves completely discarding
the previous URL (including fragments)?  Carrying stuff across between
distinct requests feels a bit ugly to me.

Cheers,
Solderpunk

Link to individual message.

19. Petite Abeille (petite.abeille (a) gmail.com)

📅 Sent: 2020-12-26 15:35
📧 Message 19 of 27

> On Dec 26, 2020, at 15:12, Solderpunk <solderpunk at posteo.net> wrote:
> 
> simply remove them entirely

Alternatively, ignore them, but explicitly. i.e. some verbiage to the 
effect of "while fragments are allowed in IRIs, they do not have any 
special meaning in gemini".

Perhaps someone will come up with an interesting use case in the future. 

In the meantime, fragment = undefined behavior for now.

My 2¢.

Link to individual message.

20. John Cowan (cowan (a) ccil.org)

📅 Sent: 2020-12-28 03:25
📧 Message 20 of 27

On Sat, Dec 26, 2020 at 10:18 AM Solderpunk <solderpunk at posteo.net> wrote:

> > 2) If the client gets a redirect containing a fragment, the client MUST
> > apply this fragment when the redirected resource is retrieved, ignoring
> > any
> > original fragment.
>

I haven't got any idea why HTTP/1.1 prescribes this behavior either.  Your
idea of redirecting a whole page to part of another page is probably the
best explanation, but I'll do more research on it.

> 3) If the original URL has a fragment and the redirect doesn't, the
> > client
> > MUST apply the original fragment when the redirected resource is
> > retrieved.
>

This part I do understand.  Since the meaning of a fragment depends on the
media type and not the rest of the URL, then if a server is redirecting a
request to the document's new home, the client should carry over the
fragment id from the original URL.  In this case the fragment id doesn't
participate in the protocol at all.

This rule should exist even if text/gemini doesn't have a fragment
definition, because Gemini protocol can host any media type including HTML.

John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
Fundamental thinking is ha-ard.  Let's go ideology-shopping.
                        --Philosopher Barbie

John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
LEAR: Dost thou call me fool, boy?
FOOL: All thy other titles thou hast given away:
That thou wast born with.

Link to individual message.

21. Solderpunk (solderpunk (a) posteo.net)

📅 Sent: 2020-12-28 09:59
📧 Message 21 of 27

Thanks for weighing in on this thread.  Of all the outstanding issues,
this is actually the one where I have the least clear idea of what we
should do.

On Mon Dec 28, 2020 at 4:25 AM CET, John Cowan wrote:

> > 3) If the original URL has a fragment and the redirect doesn't, the
> > > client
> > > MUST apply the original fragment when the redirected resource is
> > > retrieved.
> >
>
> This part I do understand. Since the meaning of a fragment depends on
> the
> media type and not the rest of the URL, then if a server is redirecting
> a
> request to the document's new home, the client should carry over the
> fragment id from the original URL. In this case the fragment id doesn't
> participate in the protocol at all.
>
> This rule should exist even if text/gemini doesn't have a fragment
> definition, because Gemini protocol can host any media type including
> HTML.

Okay, so original URL fragments should be carried over to redirect URLs
if the redirect URL has no fragment (which I still suspect perhaps
should be the only valid case).

What about queries?

To some extent, the issue of a redirect URL having a query in it is
similar to a link in a text/gemini document having a query in it, which
was slightly controversial.  Not so much in the sense that people
thought it should be disallowed, but that it was widely believed clients
should handle such things very carefully and give the user a chance to
confirm or edit the query string contents.

Should redirect URLs be allowed to contain queries?

If an original URL has a query in it but the client gets back a redirect
URL without a query, should the original query be appended?

Cheers,
Solderpunk

Link to individual message.

22. Petite Abeille (petite.abeille (a) gmail.com)

📅 Sent: 2020-12-28 10:34
📧 Message 22 of 27

> On Dec 28, 2020, at 10:59, Solderpunk <solderpunk at posteo.net> wrote:
> 
> Thanks for weighing in on this thread.  Of all the outstanding issues,
> this is actually the one where I have the least clear idea of what we
> should do.

My 2¢: kill fragments altogether as they don't add much value, only 
headaches. Keep queries as is, including redirects. Totally legit.

Link to individual message.

23. Arav K. (nothien (a) uber.space)

📅 Sent: 2020-12-28 12:32
📧 Message 23 of 27

On Mon, Dec 28, 2020 at 10:59:33AM +0100, Solderpunk wrote:
> Should redirect URLs be allowed to contain queries?
> 
> If an original URL has a query in it but the client gets back a
> redirect URL without a query, should the original query be appended?

Only the server has the necessary information to determine whether the
redirect URL should have the query from the original URL, if any, or
not.  It could be that the query string determined the redirect, in
which case the redirect would probably not have the (same) query.  If,
however, the redirect is the same regardless of the query, then perhaps
the server needs to append the query to the redirect.  It should be up
to the server.  The client should use what the server gives it verbatim
(fragments are a separate issue which may change this).

~aravk | ~nothien

Link to individual message.

24. Côme Chilliet (come (a) chilliet.eu)

📅 Sent: 2020-12-28 12:40
📧 Message 24 of 27

On Monday 28 December 2020, 10:59:33 CET, Solderpunk wrote:
> Okay, so original URL fragments should be carried over to redirect URLs
> if the redirect URL has no fragment (which I still suspect perhaps
> should be the only valid case).

This is my opinion as well, the fragment should be carried over if the 
redirect URL has no fragment.

I also think that it should be allowed to redirect to a URL with a 
fragment, the obvious example is if several documents are merged, their 
previous URL can link to the specific part of the merged document.

Someone used HTML served over gemini as an example, I think it is a good 
thinking exercise to be able to analyse fragment handling in gemini 
protocol separately from fragment meaning in text/gemini documents.
Think about a client browsing HTML through gemini protocol, and needing to 
be able to link and redirect to anchors.

Fragments may be forbidden in the gemini request, since they should never 
be sent to the server.

> What about queries?
> 
> To some extent, the issue of a redirect URL having a query in it is
> similar to a link in a text/gemini document having a query in it, which
> was slightly controversial.  Not so much in the sense that people
> thought it should be disallowed, but that it was widely believed clients
> should handle such things very carefully and give the user a chance to
> confirm or edit the query string contents.
> 
> Should redirect URLs be allowed to contain queries?
> 
> If an original URL has a query in it but the client gets back a redirect
> URL without a query, should the original query be appended?

I think queries should be handled exactly like path segments: allowed in
the redirect URL, and not carried over to a redirect URL that has no query.

If I were redoing the tictactoe game today, I would use a query string
instead of a path segment to carry the game state. That would make it
clearer to people, and especially to crawlers (GUS and lupa both played
every possible tictactoe game while crawling), that it is the same page
with a different input.

Take the GUS backlink pages, for instance. It is useful for me to be able to
link to gemini://gus.guru/backlinks?tictactoe.lanterne.chilliet.eu
It could also make sense for my server to redirect there if I had a local
responses link and decided to switch to using GUS backlinks instead.

Côme

Link to individual message.

25. Luke Emmet (luke (a) marmaladefoo.com)

📅 Sent: 2020-12-28 13:36
📧 Message 25 of 27



On 28-Dec-2020 09:59, Solderpunk wrote:
> Okay, so original URL fragments should be carried over to redirect URLs
> if the redirect URL has no fragment (which I still suspect perhaps
> should be the only valid case).
Redirect URLs should not have a fragment, as fragments specify a client-side
action. The server won't be aware of the client's currently held fragment,
so it cannot act legitimately on it (for example, to propose an alternative
fragment). If a redirect URL does have one, the client should strip it off
if it re-applies the previous fragment.

But frankly I'm skeptical that it is good semantics for a client to
re-apply an old fragment from an out-of-date resource to a newly
redirected resource. In my view fragments are attached only to the
original URL and should not be replayed.

There are also some information leakage questions here about replaying
fragments against new resources, but they probably aren't too problematic
in practice.

> What about queries?
>
> To some extent, the issue of a redirect URL having a query in it is
> similar to a link in a text/gemini document having a query in it, which
> was slightly controversial.  Not so much in the sense that people
> thought it should be disallowed, but that it was widely believed clients
> should handle such things very carefully and give the user a chance to
> confirm or edit the query string contents.
>
> Should redirect URLs be allowed to contain queries?
>
> If an original URL has a query in it but the client gets back a redirect
> URL without a query, should the original query be appended?

No, queries are very different from fragments, which are client-side.
Queries are part of the server's URL space, like the path.

URLs with queries should not be adjusted by the client, except in
response to user input.

A server may legitimately use these to implement various kinds of state 
- for example (in relation to another thread) the language partition of 
the application. For example I could implement a weather report in 
different languages thus:

gemini://server/weather/report?english

If the server implements a redirect to a new endpoint, it should do the
work of appending any relevant parameterised query, thus:

31 gemini://server/weather/new-report?english

The client cannot know whether the query would be valid against the target
URL. Besides, the target URL may already have a query on it, which the
client should not strip off and replace with something else.
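
A minimal sketch of the server-side step described above, assuming the
server knows which new path replaces the old one. The function name
`redirect_header` and its `keep_query` parameter are hypothetical and do not
come from any real Gemini server. The only protocol facts relied on are that
status 31 is a permanent redirect and that the response header ends with
CRLF.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_header(request_url: str, new_path: str, keep_query: bool = True) -> str:
    """Build a permanent-redirect response header for `request_url`.

    The server, not the client, decides whether the original query is
    still meaningful for the new endpoint (`keep_query`).
    """
    parts = urlsplit(request_url)
    query = parts.query if keep_query else ""
    # Fragments are never part of the request, so none is emitted here.
    target = urlunsplit((parts.scheme, parts.netloc, new_path, query, ""))
    return f"31 {target}\r\n"


header = redirect_header("gemini://server/weather/report?english",
                         "/weather/new-report")
print(repr(header))
# -> '31 gemini://server/weather/new-report?english\r\n'
```

With `keep_query=False` the same call yields a target without the query,
which covers the case where the query itself determined the redirect.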

  - Luke

Link to individual message.

26. Gary Johnson (lambdatronic (a) disroot.org)

📅 Sent: 2021-01-05 18:53
📧 Message 26 of 27

Solderpunk <solderpunk at posteo.net> writes:

> Should redirect URLs be allowed to contain queries?
>
> If an original URL has a query in it but the client gets back a redirect
> URL without a query, should the original query be appended?

That's a fair question. One of the recommendations you made for server
authors was that we should redirect any URLs that don't end with a / to
their equivalent paths with / appended.

Thus a URL like    gemini://mycapsule.org/foo?somequery
should redirect to gemini://mycapsule.org/foo/?somequery
in order for the client to be able to correctly resolve any relative
URLs in links within the returned gemtext document (assuming this URL
responds with gemtext).

This is what I currently do in Space Age. If we were to mandate dropping
the query string, I suspect we would end up with some very broken
dynamic pages in Geminispace.
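
A minimal sketch of the trailing-slash redirect described above. This is an
illustration, not Space Age's actual code. The helper name
`with_trailing_slash` and the relative link `bar.gmi` are hypothetical.
Python's `urljoin` only resolves relative references for schemes listed in
`uses_relative` and `uses_netloc`, so the sketch registers `gemini` there
first.

```python
import urllib.parse
from urllib.parse import urljoin, urlsplit, urlunsplit

# Teach urljoin to resolve relative references against gemini:// URLs.
for registry in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
    if "gemini" not in registry:
        registry.append("gemini")


def with_trailing_slash(url: str) -> str:
    """Return `url` with "/" appended to its path, keeping the query."""
    parts = urlsplit(url)
    if parts.path.endswith("/"):
        return url
    return urlunsplit(parts._replace(path=parts.path + "/"))


before = "gemini://mycapsule.org/foo?somequery"
after = with_trailing_slash(before)
print(after)
# -> gemini://mycapsule.org/foo/?somequery

# The same relative link resolves differently before and after the redirect.
print(urljoin(before, "bar.gmi"))
# -> gemini://mycapsule.org/bar.gmi
print(urljoin(after, "bar.gmi"))
# -> gemini://mycapsule.org/foo/bar.gmi
```

If the redirect dropped `?somequery`, the client would land on the right
path but the dynamic page would lose its input.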

YMMV,
  Gary

-- 
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

Why is HTML email a security nightmare? See https://useplaintext.email/

Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

Link to individual message.

27. Petite Abeille (petite.abeille (a) gmail.com)

📅 Sent: 2021-01-05 23:26
📧 Message 27 of 27

> On Dec 28, 2020, at 10:59, Solderpunk <solderpunk at posteo.net> wrote:
> 
> What about queries?

Not to be messed with. The server decides.

> Should redirect URLs be allowed to contain queries?

Yes.

> If an original URL has a query in it but the client gets back a redirect
> URL without a query, should the original query be appended?

No.

In short, client should not mess around with queries, aside from moving them along.

Please do not redefine the handling & meaning of what a URL is.

Fragments are different as they refer to an anchor within a target
text/gemini document. text/gemini has no concept of anchors, therefore
fragments make no sense for text/gemini.

But they make sense for http://+text/html though.

Link to individual message.

---

Previous Thread: [users] [ANN] kiln version 0.1.0

Next Thread: [spec] IRIs, IDNs, and all that international jazz