Archived View for gemi.dev › gemini-mailing-list › 000455.gmi captured on 2024-08-19 at 00:45:45. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
According to the Gemtext specification, any line that starts with "```" is a preformatting toggle. This makes it impossible to have such a line as part of a preformatted block. I understand the design goals of Gemtext, but I believe the Markdown solution to the same problem can be lifted straight into Gemtext relatively easily: allow more than 3 backticks to open a preformatted block, and require the same number to close it as the number that opened it. This way, any possible text can be included in a preformatted block. Thoughts? The same issue exists with text lines (it's impossible to display "=>" or "#" at the beginning of a line of text), but I'm not concerned about that because having to prefix such lines with a space is not a big deal for plain text, whereas it is for code.
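The variable-length fence rule proposed above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything from the Gemtext specification or an existing client: the name `parse_blocks` and the tuple output format are assumptions made for the example.

```python
import re

# A fence line is three or more backticks at the start of the line.
FENCE = re.compile(r"^(`{3,})(.*)$")

def parse_blocks(lines):
    """Split lines into ("text", line) and ("pre", [lines]) items.

    Hypothetical rule: a block opened by N backticks (N >= 3) is closed
    only by a line that starts with exactly N backticks.
    """
    out = []
    open_len = None  # backtick count of the fence that opened the block
    buf = []
    for line in lines:
        m = FENCE.match(line)
        if open_len is None:
            if m:
                open_len = len(m.group(1))
                buf = []
            else:
                out.append(("text", line))
        elif m and len(m.group(1)) == open_len:
            out.append(("pre", buf))
            open_len = None
        else:
            # Shorter or longer fences inside the block are plain content.
            buf.append(line)
    if open_len is not None:
        out.append(("pre", buf))  # unterminated block runs to end of input
    return out

doc = ["intro", "````", "```", "code", "```", "````", "after"]
result = parse_blocks(doc)
```

With the four-backtick fence, the two three-backtick lines survive as block content without being modified, which is the property the proposal is after.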
On Mon, 9 Nov 2020 20:38:08 -0500 Ryan Westlund <rlwestlund at gmail.com> wrote: > According to the Gemtext specification, any line that starts with > "```" is a preformatting toggle. This makes it impossible to have such > a line as part of a preformatted block. I understand the design goals > of Gemtext, but I believe the Markdown solution to the same problem > can be lifted straight into Gemtext relatively easily: allow more than > 3 backticks to open a preformatted block, and require the same number > to close it as the number that opened it. This way, any possible text > can be included in a preformatted block. Thoughts? No, I don't like that idea. > The same issue exists with text lines (it's impossible to display "=>" > or "#" at the beginning of a line of text), but I'm not concerned > about that because having to prefix such lines with a space is not a > big deal for plain text, whereas it is for code. I have suggested this in IRC as I've considered that someone might have use for this, I agree with your argument, and since gemtext processing happens only by reading the first few bytes of each line, it is possible to introduce a new line format that is '\' which is escaped line, where it discards line formatting that is based on the first few characters so > \=> this is escaped will print > => this is escaped also keep in mind that '\' does not only escape the character after it, it disables line formatting, because in the example above, if '=' is escaped, '>' is right after it and can be interpreted as a quote. Now, even pre-formatted text toggle characters can be escaped by adding a rule in your parser to consider '\```' within a pre-formatted text block and translate it to literal '```'. 
Anyone else who might find my suggestion unappealing or have a better idea, feel free to share, I think this is a flaw with the gemtext document format not to have a way to escape certain characters to disable them from being interpreted mistakenly, all other formats do have a mechanism for doing this.
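A rough Python sketch of the escaped-line idea described in this message, assuming a simplified line classifier. The function name `classify`, the returned tuples, and the reduced handling of headings and lists are all assumptions for illustration; no existing parser is being described.

```python
def classify(line, in_pre=False):
    """Return (kind, text) for one line under the proposed '\\' rule."""
    if in_pre:
        # Special rule inside preformatted text: only '\```' is unescaped.
        if line.startswith("\\```"):
            return ("pre", line[1:])
        if line.startswith("```"):
            return ("toggle", line[3:])
        return ("pre", line)
    if line.startswith("\\"):
        # The backslash disables line formatting for the WHOLE line.
        # Dropping it and re-parsing the rest would turn '\=>' into a link
        # again, and skipping two characters would leave '>' as a quote.
        return ("text", line[1:])
    if line.startswith("```"):
        return ("toggle", line[3:])
    if line.startswith("=>"):
        return ("link", line[2:].strip())
    if line.startswith("#"):
        return ("heading", line.lstrip("#").strip())
    if line.startswith("* "):
        return ("list", line[2:])
    if line.startswith(">"):
        return ("quote", line[1:].lstrip())
    return ("text", line)

escaped = classify("\\=> this is escaped")   # ("text", "=> this is escaped")
literal = classify("\\```", in_pre=True)     # ("pre", "```")
```

The decision is still made from the first few bytes of each line, so the per-line processing model is unchanged; the cost is one more prefix to check.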
It was thus said that the Great Ryan Westlund once stated: > According to the Gemtext specification, any line that starts with > "```" is a preformatting toggle. This makes it impossible to have such > a line as part of a preformatted block. I understand the design goals > of Gemtext, but I believe the Markdown solution to the same problem > can be lifted straight into Gemtext relatively easily: allow more than > 3 backticks to open a preformatted block, and require the same number > to close it as the number that opened it. This way, any possible text > can be included in a preformatted block. Thoughts? > > The same issue exists with text lines (it's impossible to display "=>" > or "#" at the beginning of a line of text), but I'm not concerned > about that because having to prefix such lines with a space is not a > big deal for plain text, whereas it is for code. I was about to recommend the zero-width space, and I even created a document for this: gemini://gemini.conman.org/test/escape.gemini but I did notice that when the text is selected, the zero-width space is also selected (as it should). I have a second file gemini://gemini.conman.org/test/escape2.gemini that uses " \b" (space, backspace character) which didn't work with cut-n-paste (at least on Firefox on Mac OS-X) but if I viewed the resulting file in a terminal and then did a cut-n-paste, the output was as expected. So the question is---how often are you going to quote material where a line starts with ```? Is it often enough to worry about it? Or could you place the example as a plain text file and link to it? -spc
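The zero-width space workaround, and the copy-and-paste problem noted above, can be illustrated with a short Python sketch. The helper names `is_toggle` and `strip_guard` are made up for this example; the only assumption about clients is that they detect a toggle by checking whether the line starts with three backticks.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def is_toggle(line):
    """Assumed client behaviour: a toggle is any line starting with ```."""
    return line.startswith("```")

def strip_guard(line):
    """What a reader has to do after pasting: drop the invisible prefix."""
    return line[1:] if line.startswith(ZWSP) else line

guarded = ZWSP + "```"
shown_as_toggle = is_toggle(guarded)   # False: the line is treated as content
pasted_length = len(guarded)           # 4, not 3: the invisible character is copied too
```

The guarded line renders like a plain run of three backticks, but anything copied from it carries the extra character until it is stripped.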
On 2020-11-10 (Tuesday) at 02:08, Sean Conner <sean at conman.org> wrote: > So the question is---how often are you going to quote material where a > line starts with ```? Is it often enough to worry about it? Or could you > place the example as a plain text file and link to it? > > -spc 10000%. The escaping suggestion has been brought up a lot, but I have no idea when writing a line starting with ``` is 100% necessary, and couldn't be prefixed with a zero-width space or even a normal space. Even the various "getting started with gemtext" posts, which are the only ones that might conceivably actually have need to do it (from my understanding the escaping rules in other markup are really only there so you can talk about the markup in the markup), get by just fine without doing so. Like Sean said, if it's *really* *that* *necessary* you can just put it in a file marked text/plain. But I really doubt that it is. -- ~ acdw acdw.net | breadpunk.club/~breadw
On Mon, 9 Nov 2020 21:08:34 -0500 Sean Conner <sean at conman.org> wrote: > I was about to recommend the zero-width space, and I even created a > document for this: > > gemini://gemini.conman.org/test/escape.gemini > > but I did notice that when the text is selected, the zero-width space > is also selected (as it should). I have a second file > > gemini://gemini.conman.org/test/escape2.gemini > > that uses " \b" (space, backspace character) which didn't work with > cut-n-paste (at least on Firefox on Mac OS-X) but if I viewed the > resulting file in a terminal and then did a cut-n-paste, the output > was as expected. Here you're making assumptions about how \b will be rendered, I'm not all against your approach, but I'm not also confident that all implementations will render this properly. > So the question is---how often are you going to quote material > where a line starts with ```? Is it often enough to worry about it? > Or could you place the example as a plain text file and link to it? Indeed, you could also provide it in a separate text/plain file as acdw suggested, my point is it is just annoying that no matter what I do, I can't have a line that starts with '>', '#', '=>', '*', in gemtext, regardless of how often anyone is gonna ever need that. However, until I explicitly have an example of a use case that might require that, I'm not going to propose anything at all, even if others did, it's up to Solderpunk to approve it.
> > gemini://gemini.conman.org/test/escape2.gemini > > that uses " \b" (space, backspace character) which didn't work with > > cut-n-paste (at least on Firefox on Mac OS-X) but if I viewed the > > resulting file in a terminal and then did a cut-n-paste, the output > > was as expected. > > Here you're making assumptions about how \b will be rendered, I'm not > all against your approach, but I'm not also confident that all > implementations will render this properly. Agreed. In my client (Amfora), the \b is rendered as a space for some unknown reason. However this seems to be an issue with terminal UI toolkit used, as a simple command like printf '\btest' renders as expected on the command line, as just the word 'test'. I find using zero width spaces to be a cool hack, and using separate files to be an elegant solution. makeworld
On 11/10/20 2:41 AM, Ali Fardan wrote: > However, until I explicitly have an example of a use case that might > require that, I'm not going to propose anything at all, even if others > did, it's up to Solderpunk to approve it. I actually have encountered a need to write ``` literals in a gemtext file in the past. You can see such an example here: gemini://tilde.team/~tomasino/journal/20200601-accessibility.gmi I just used a space at the start of each line in the block. It's pretty intuitive and very easy for non-technical people to figure out.
Use whitespace before the preformatted text and give warning to your reader to be careful when copying such text. Maybe clients could do '''Maximum Common Frontal White Space Elimination''' in preformatted text blocks :P ~smlckz On 11/10/20, colecmac at protonmail.com <colecmac at protonmail.com> wrote: >> > gemini://gemini.conman.org/test/escape2.gemini >> > that uses " \b" (space, backspace character) which didn't work with >> > cut-n-paste (at least on Firefox on Mac OS-X) but if I viewed the >> > resulting file in a terminal and then did a cut-n-paste, the output >> > was as expected. >> >> Here you're making assumptions about how \b will be rendered, I'm not >> all against your approach, but I'm not also confident that all >> implementations will render this properly. > > Agreed. In my client (Amfora), the \b is rendered as a space for some > unknown reason. However this seems to be an issue with terminal UI > toolkit used, as a simple command like > > printf '\btest' > > renders as expected on the command line, as just the word 'test'. > > I find using zero width spaces to be a cool hack, and using separate files > to be an elegant solution. > > makeworld > >
So if I understand Ali Fardan's suggestion, it is to introduce a leading '\' as a signal for a new line type which could accommodate all of these issues. I admit I like that it solves both the issues I mentioned with one change, even the one I didn't think really needed a solution. The main reason I don't prefer it to my own suggestion is that it would still mean that preformatted lines might need to be altered in some way (if the preformatted lines contain "\```" or something), instead of letting you paste them in unmodified and only modify the preformatting toggle lines. And that is if escaping only considers syntax that would be interpreted anyway. If it is implemented in a context-insensitive way, meaning a line starting with "\#" is still translated to "#" inside preformatted text, it could actually make the problem worse by increasing the number of preformatted lines that must be modified. One comment on that same message: > also keep in mind that '\' does not only escape the character after it, > it disables line formatting, because in the example above, if '=' is > escaped, '>' is right after it and can be interpreted as a quote. I don't see how this is correct? A quote line must start with >, not =>. If in "\=>", the \ escapes the = so that the resulting text is =>, I don't see how it could be interpreted as a quote. Ali's comment on the idea of using zero-width space or ASCII backspace voices my own thoughts: > Here you're making assumptions about how \b will be rendered, I'm not > all against your approach, but I'm not also confident that all > implementations will render this properly. This is definitely my concern. Including invisible characters that are outside of normal consideration seems like a bad idea. In response to message from Sudipto Mallick: > Use whitespace before the preformatted text and give warning > to your reader to be careful when copying such text.
I don't like anything that requires "give warning to your reader", because that's an admission that the markup is getting in the way. A markup format shouldn't require telling your reader "I had to format this weirdly because of the markup; you need to correct it". > Maybe clients could do '''Maximum Common Frontal White Space > Elimination''' in preformatted text blocks :P But that could alter the meaning, for example if writing about indentation. For the sake of use case: I write Python tutorials in Markdown, as well as the specification for Sanemark, a variant of Markdown. Several similar issues have come up for me before with Markdown (this specific one would've been a major obstacle for the Sanemark spec if Markdown didn't implement what I suggest, because leading space is significant).
MCFWSE (hehe) wouldn't alter meaning for most of the cases if you think about it for a moment. If you have a line which is *not* indented in some codeblock which has some other lines indented, no problem: because the MCFWS (hah) is just "" because of that non-indented line. Now, if all of the lines in a codeblock is indented, then you have to add some more indent, at least a space, at the front of each line, and add another line with only that newly added indent, so that that newly added indent is removed and rest of the indent is preserved. [So complex. Uh! Anyone want to implement this?] I think that the use of Zero-Width Space to escape constructs other than code blocks is okay as they wouldn't usually be copy pasted. ~smlckz
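For anyone who does want to try it, the "maximum common frontal white space" rule described above might look like the following Python sketch. The function names are invented here, and one detail is an assumption: completely empty lines are ignored when computing the common indent, while whitespace-only lines count (which is what makes the "extra marker line" trick work).

```python
import os

def leading_ws(line):
    """Return the run of spaces and tabs at the start of a line."""
    return line[: len(line) - len(line.lstrip(" \t"))]

def strip_common_indent(lines):
    """Remove the longest whitespace prefix shared by all non-empty lines."""
    indents = [leading_ws(l) for l in lines if l != ""]
    # os.path.commonprefix compares strings character by character.
    common = os.path.commonprefix(indents) if indents else ""
    return [l[len(common):] for l in lines]

# A block whose toggle-like lines were protected with one leading space:
fenced = strip_common_indent([" ```", " code", " ```"])

# A fully indented block: one extra space everywhere, plus a marker line
# holding only that extra space, so the original four spaces survive.
kept = strip_common_indent([" ", "     a", "     b"])
```

In the second call the marker line comes out as an empty line, so the author would still have to accept a blank line in the rendered block.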
There's quite a bit to unpack here. First, let me say that the issue here is in-band signalling. You get this issue whenever you use some value (or character in this case) to signal some change in interpretation of data, and you need to use the value (or character) *as* data and not a signal. HTML has this issue as well, in that it needs a way to designate a markup tag, and it uses '<' for that (based upon its use in SGML). But if one needs to use '<' in regular text it needs to be escaped. So (again, from SGML) they use the '&' character to introduce named entities---a representation of a character that could not otherwise be typed. But that means if you want to use '&' as a character, it too, needs to be escaped. So that means in HTML, if you want to display a '&' you escape it as "&amp;" [1]. And to display a '<' you escape it as "&lt;". Gemtext does *not* have such a facility, as it complicates the processing of the text, which is something solderpunk wanted to keep simple. Escaping data complicates this (I've seen complications with the proper encoding of URLs for instance). There is no solution (aside from serving up a plain text file) that will easily solve this issue. Now on with the rest of the commentary ... It was thus said that the Great Ryan Westlund once stated: > The main reason I don't prefer it to my own suggestion is that it would > still mean that preformatted lines might need to be altered in some way > (if the preformatted lines contain "\```" or something), instead of > allowing to paste them in unmodified and only have to modify the > prefomatting toggle lines. Sorry, no way around that. I mean, one *could* use HTML entities: ``` Blah blah blah blah. And now a preformatted block of code: ``` This is a diagram ``` ``` but then the processing becomes harder as the client would then have to scan character by character, converting entities to characters. Or perhaps use the standard '\' as an escape character: ``` Blah blah blah blah.
And now a preformatted block of code: \``` This is a diagram ``\` ``` as long as at least one of the grave characters is escaped, it won't trigger block mode (on or off). But again, you have to process everything character by character to handle the '\' character. Or just decide that the following four characters at the start of a line \``` is to be presented, verbatim, as ``` and *not* trigger block mode. As mentioned earlier, one could just use more than three such characters: ```````````` Look Ma! Block mode! It's defined as ``` ... See? ```````` But it's not really defined as three, but more than three, and again, you have issues. Other possibilities---use the first non ` character as a final delimeter: ```| To define a block mode, use three grave accents in a row, with another character that doesn't appear in the text; said character will then end the block mode. For example: ```@ this is block mode @ See? | Or perhaps a sequence of characters? ```end-of-line To define a block mode, use three grave accents in a row, followed by a sequence of non-blank characters; said sequence will end the block mode: ```EOF This is block mode EOF See? end-of-line Or how about this variant: ```end-of-line Blah blah blah ```EOF This is a sample block mode ```EOF See? ```end-of-line I mean, you can go crazy with this stuff. But every option involves more processing than happens now. This is also not to say I endorse or condemn any of these methods. > For the sake of use case: I write Python tutorials in Markdown, as well as > the specification for Sanemark, a variant of Markdown. Do you know that Markdown was created by John Gruber as an easy way to create HTML pages, with shortcuts for the tags he used the most often, leaving the more obscure or harder to support tags to HTML itself? I mean, why else would his Markdown include the ability to include HTML? If he needed an image (and I don't think he includes many images) he would type the <IMG ... > tag by hand. 
The variations come when people wanted to
On Tue, 10 Nov 2020 00:19:53 -0500 Ryan Westlund <rlwestlund at gmail.com> wrote: > The main reason I don't prefer it to my own suggestion is that it > would still mean that preformatted lines might need to be altered in > some way (if the preformatted lines contain "\```" or something), > instead of allowing to paste them in unmodified and only have to > modify the prefomatting toggle lines. Indeed it is ugly, one other workaround to this is having > ``` > ``` interpreted as literal '```' to avoid altering preformatted text, and to be clear, this does not apply to the case of having multiple preformatted text blocks next to each other, this only applies to a single preformatted text block that contains nothing. > And that is if escaping only considers syntax that would be > interpreted anyway. If it implemented in a context-insensitive way, > meaning a line starting with "\#" is still translated to "#" inside > preformatted text, it could actually make the problem worse by > increasing the number of preformatted lines that must be modified. I suggested a special parser rule for preformatted text blocks that is separate for the rule used for all other text, and this rule only applies to '^\```', though I agree with your concerns. > > also keep in mind that '\' does not only escape the character after > > it, it disables line formatting, because in the example above, if > > '=' is escaped, '>' is right after it and can be interpreted as a > > quote. > > I don't see how this is correct? A quote line must start with >, not > =>. If in "\=>", the \ escapes the = so that the resulting text is =>, > I don't see how it could be interpreted as a quote. 
I'm pointing out to those writing parsers to avoid the pitfall of just skipping the character after that escape character and continuing as normal from there, I do that myself so I thought it might be worth noting, to clarify, if the parser ignores the character that is escaped and continues as normal from there, then: > \=> escaped only the '=' which means now that '>' is the first character in the line and can be interpreted as a quote block, of course, this is implementation specific, and I don't know how others have implemented their parsers.
Sean Conner wrote: >It was thus said that the Great Ryan Westlund once stated: >> The main reason I don't prefer it to my own suggestion is that it would >> still mean that preformatted lines might need to be altered in some way >> (if the preformatted lines contain "\```" or something), instead of >> allowing to paste them in unmodified and only have to modify the >> prefomatting toggle lines. > Sorry, no way around that. I mean, one *could* use HTML entities: The point of that paragraph was that my proposal makes it so that you only have to modify the preformatting toggle lines to paste in arbitrary content, not the content itself. It does achieve that. (CDATA does not because the string "]]>" ends it and there is no rule about matching the number about brackets used to open it, so a CDATA section cannot include "]]>".) For what it's worth, the reason I don't think HTML needs this as much as gemtext is because HTML already has a general escaping mechanism, and is not necessarily meant to be written by hand anyway. Ali Fardan wrote: > Indeed it is ugly, one other workaround to this is having > >> ``` >> ``` > > interpreted as literal '```' to avoid altering preformatted text, and > to be clear, this does not apply to the case of having multiple > preformatted text blocks next to each other, this only applies to a > single preformatted text block that contains nothing.. Interesting. Clarification: is this like PostgreSQL string escapes, where two consecutive toggle lines inside a block come out as a literal "```"? Or does it only apply to empty blocks (and not the same pattern within a block with other content), so that embedding a literal "```" requires four in a row (one to end the current block, then two for the literal sequence, then the fourth to open another preformatting block - assuming two adjacent blocks will be merged)?
Here's a possible solution to the problem of escapes which I don't think has been mentioned. Currently each of the line types is introduced by a 1-3 char prefix with two exceptions: the default text line type, and the preformatted line type. We can deal with the escaping problem by adding prefixes for these too. So there's a sense in which this actually makes gemtext *simpler*, more regular, even as it adds more notation. As a nice side-effect, it also lets one- and two-liner preformatted blocks take less vertical space. The prefix for text line could be just a backslash, roughly in line with the suggestion elsewhere in this thread. Examples: \#Bold# text is sometimes denoted this way. \=> HELLO! <= \* is the multiplication operator \\n refers to the newline character. \This is just an ordinary text line, the initial backslash is ignored. And for preformatted lines, it could e.g. be "`` ": `` $ make install `` ``` rain `` ``` drops `` `` ... '' she replied, speechless. That leaves the problem (if we're to be completist) of associating alt-text to a sequence of preformatted lines using this notation. We could say that such a sequence immediately following a preformatted block using "```" is considered part of the block: ``` rain drops ``` `` ``` `` ``` Alternatively (neater?) we could consider "`` " as magic even when it's within a preformatted block: ``` rain drops `` ``` `` ``` ```
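A small Python sketch of how the two extra prefixes from this message could be recognised, written only to make the proposal concrete. The name `classify_prefixed` and the `("other", ...)` fallback are assumptions; how unprefixed lines and the alt-text question would be handled is left open, as it is in the message.

```python
def classify_prefixed(line):
    """Classify one line using the two hypothetical prefixes.

    "`` "  (two backticks and a space) marks a single preformatted line.
    "\\"   marks an ordinary text line; the backslash itself is dropped.
    """
    if line.startswith("`` "):
        return ("pre", line[3:])
    if line.startswith("\\"):
        return ("text", line[1:])
    if line.startswith("```"):
        return ("toggle", line[3:])
    return ("other", line)  # existing line types, not modelled here

one_liner = classify_prefixed("`` $ make install")   # ("pre", "$ make install")
literal_fence = classify_prefixed("`` ```")          # ("pre", "```")
plain = classify_prefixed("\\=> HELLO! <=")          # ("text", "=> HELLO! <=")
```

Because "`` " and "```" differ in their third character, a literal fence can be written as a prefixed preformatted line without being mistaken for a toggle.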
On Tue, 10 Nov 2020 17:17:29 +0100 mbays at sdf.org wrote: > Here's a possible solution to the problem of escapes which I don't think > has been mentioned. > > Currently each of the line types is introduced by a 1-3 char prefix with > two exceptions: the default text line type, and the preformatted line > type. We can deal with the escaping problem by adding prefixes for these > too. So there's a sense in which this actually makes gemtext *simpler*, > more regular, even as it adds more notation. As a nice side-effect, it > also lets one- and two-liner preformatted blocks take less vertical > space. Another side effect is that a change like this would immediately invalidate nearly every text/gemini document in existence. Only documents that currently don't use plain text lines or preformatted blocks would be unaffected. Every client would also be invalidated insofar that they would display the backslashes verbatim until they were updated. That is a huge cost for a problem that's so far largely hypothetical. This looks to me like a solution waiting for a problem. Unless someone can demonstrate a real case where the obvious workarounds (e.g. leading whitespace) didn't suffice I don't think it warrants a change, certainly not one that's so fundamental. -- Philip
On 2020-11-10 (Tuesday) at 17:30, Philip Linde <linde.philip at gmail.com> wrote: > > This looks to me like a solution waiting for a problem. Unless someone > can demonstrate a real case where the obvious workarounds (e.g. > leading whitespace) didn't suffice I don't think it warrants a change, > certainly not one that's so fundamental. Hear, hear. I'm tempted to buy one of those vanity domains like willgeminisupportescaping.com that just says "NO," for easy linking when this invariably comes up again. -- ~ acdw acdw.net | breadpunk.club/~breadw
On Tue Nov 10, 2020 at 1:38 PM EDT, acdw wrote: > Hear, hear. I'm tempted to buy one of those vanity domains like > willgeminisupportescaping.com that just says "NO," for easy linking when > this invariably comes up again. willgeminisupport.com would be the more general solution to all of these questions
> willgeminisupport.com would be the more general solution to all of these > questions This seems like a valid request, here we go: gemini://when.willgemini.support/?escaping
On Tue Nov 10, 2020, René Wagner wrote: > This seems like a valid request, here we go: > gemini://when.willgemini.support/?escaping Nicely done :)
[The quoted messages below didn't go to the list, but were meant to]
Hey! I just tested both of your approaches with Kristall: > I was about to recommend the zero-width space, and I even created a > document for this: > > gemini://gemini.conman.org/test/escape.gemini > > but I did notice that when the text is selected, the zero-width space > is also selected (as it should). https://mq32.de/public/05851bad5b5e23d9e24bcaa7f259ab6de55c17d0.png Looks okay, works as expected >I have a second file > > gemini://gemini.conman.org/test/escape2.gemini > > that uses " \b" (space, backspace character) which didn't work with > cut-n-paste (at least on Firefox on Mac OS-X) but if I viewed the > resulting file in a terminal and then did a cut-n-paste, the output > was as expected. https://mq32.de/public/2a35ca5e121d2373f9390315c4857dd58c198ca8.png I think the zero-width space solution is okay, using '<SPACE><BACKSPACE>' is not, as it's not rendered correctly in anything that is not a plain terminal: vi, vim, and nano display that file as ' ^H' gedit will display ' ?' where ? is a replacement character: https://mq32.de/public/be09184d3c095251bf2f31ad2bd0685a8a2f0741.png castor displays the same replacement character: https://mq32.de/public/408e8c8b1fdc92c70d601c6def64bb5121d64754.png Regards - xq
It was thus said that the Great Felix Queißner once stated: > Hey! > > I just tested both of your approaches with Kristall: > > > I was about to recommend the zero-width space, and I even created a > > document for this: > > > > gemini://gemini.conman.org/test/escape.gemini > > > > but I did notice that when the text is selected, the zero-width space > > is also selected (as it should). > > https://mq32.de/public/05851bad5b5e23d9e24bcaa7f259ab6de55c17d0.png > Looks okay, works as expected Yes, but select the text, paste it elsewhere and see if the zero-width space character shows up. It did when I tried it. -spc