💾 Archived View for gemi.dev › gemini-mailing-list › 000030.gmi captured on 2024-08-18 at 23:07:00. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
In my last post on this list I called escape codes spaghetti without much of an explanation.

Tl;dr:
-Do we want color in text/gemini?
-If yes, how should it be standardized?

I really like that there are creative people out there looking for ways to push systems like gemini outside of what they are supposed to be able to do. And I don't want to decide (at least not now) if the escape code thingy is a "bug" or a feature in most clients.

The first question we have to ask here is if we want color support at all: I know that gemini should remain simple, and support for colors takes away a good part of this simplicity. But there are a lot of use cases for color support (code highlighting, pretty ascii art, footnotes, ...). For me these are two pretty strong arguments against each other.

So in case we want color support:

Assuming the escape code thingy is a feature: We should standardize which escape codes are allowed and how clients should handle them.

Assuming it is a "bug": Some people (me included, after thinking about it) would like to see color being used in gemini, so we have to find another way to put colors in while the text/gemini format remains easy to parse and human readable. (Correct me if I'm wrong, but that is how I interpret the text/gemini format: markup that is easy to parse but still human readable if you open it in something like a text editor.)

If I had to standardize it, I'd use custom escape sequences, similar to those already in use but optimized to be no longer than necessary, that ALWAYS end in a ';'. This is not really text editor friendly, but it avoids having to escape the escape character (I'm looking at you, XML), makes it easy to parse/strip the escape sequences, and if you don't, you at least get very short spaghetti.

In any case I'm for filtering out unwanted escape codes.

I'm looking forward to getting some other opinions on this.

Greetings
- Baschdel
It was thus said that the Great Baschdel once stated:
> In my last post on this list I called escape codes spaghetti without much of an explanation.
>
> Tl;dr:
> -Do we want color in text/gemini?
> -If yes, how should it be standardized?

Oh goody! We still haven't figured out our last question about Gemini: To reflow or not to reflow, that is the question.

> I really like that there are creative people out there looking for ways to push systems like gemini outside of what they are supposed to be able to do. And I don't want to decide (at least not now) if the escape code thingy is a "bug" or a feature in most clients. The first question we have to ask here is if we want color support at all: I know that gemini should remain simple, and support for colors takes away a good part of this simplicity. But there are a lot of use cases for color support (code highlighting, pretty ascii art, footnotes, ...). For me these are two pretty strong arguments against each other.
>
> So in case we want color support: Assuming the escape code thingy is a feature: We should standardize which escape codes are allowed and how clients should handle them. Assuming it is a "bug": Some people (me included, after thinking about it) would like to see color being used in gemini, so we have to find another way to put colors in while the text/gemini format remains easy to parse and human readable. (Correct me if I'm wrong, but that is how I interpret the text/gemini format: markup that is easy to parse but still human readable if you open it in something like a text editor.)

Ah, the so-called ANSI escape codes, which aren't at all defined by ANSI but ISO, and are technically known as ECMA-48.

First off, the control codes that *are* defined by ASCII, codes 0 to 31 [1].
There are only seven control codes that relate to text (we're excluding the unit separators, which are rarely, if ever, used):

    07  BEL  audible alarm
    08  BS   move left one space
    09  HT   horizontal tab
    0A  LF   move down one line
    0B  VT   vertical tab
    0C  FF   form feed
    0D  CR   carriage return

Of these, HT, LF and CR are the most used---the rest not so much in text files. Now technically, what we consider the backspace is actually the action of two control characters, BS (move left one space) and DEL (ignore this character---if things worked this way, an easy way to get, say, an umlaut is 'a BS "' or an n with a tilde over it by 'n BS ~'), and there are fights over how to end a line (CR, LF, or both? Technically, Microsoft got this right with both).

Then there's ECMA-48, which is what is under discussion here. It's a vast standard, but its sequences all fall under a few patterns (using a slightly modified RFC-5234 format---"a" - "z" means the range of characters between "a" and "z"):

Pattern group 1 (largely, a few exceptions in here as shown below):

    group1 = %d27 ( "`" - "~")

Pattern group 2 (largely, a few exceptions in here shown below):

    group2 = %d27 ( "@" - "_") / %d128-159 ; [3]

Pattern group 3 (these are what is popularly known as ANSI codes, and a subset of group 2):

    group3 = (%d27 "[" / %d155) *("0" - "?") *1(" " - "/") ("@" - "~")

and finally pattern group 4 (these are also a subset of group 2):

    cmd    = %d144 / %d152 / %d157 / %d158 / %d159 / %d27 ( "P" / "X" / "]" / "^" / "_")
    group4 = cmd *(%d8-13 / " " - "~") (%d156 / %d27 "\")

Group 4 is the most problematic, security wise, as its sequences define actual messages. 'ESC]' is defined as "OPERATING SYSTEM COMMAND" and 'ESC_' is "APPLICATION PROGRAM COMMAND". Not many terminal emulators support group 4, but they are defined. And there are several commands under group 3 that are problematic.
Microsoft defined 'ESC[...p' (which is in the private use area of ECMA-48) to redefine the keys on the keyboard (so beware of the text file that does 'ESC[13;"deltree c:\"p'). And there are codes that do more than just define colors: they can define fonts, move the cursor, define locked regions on the screen, set or clear tab stops, insert lines or characters on the screen, delete lines or characters on the screen, and even define print parameters. But oddly, no defined sequence to query the size of the screen (fancy that!).

Now, not all terminal emulators support all of ECMA-48, but it's a large standard (and there is plenty of room for terminal emulators to extend support with private codes).

> If I had to standardize it, I'd use custom escape sequences, similar to those already in use but optimized to be no longer than necessary, that ALWAYS end in a ';'.

Hard to enforce, and the ending in ';' is problematic with respect to the current standard, where ';' already separates parameters. For instance, to set a blinking foreground color and the background color:

    ESC[31;42;5m

> This is not really text editor friendly, but it avoids having to escape the escape character (I'm looking at you, XML), makes it easy to parse/strip the escape sequences, and if you don't, you at least get very short spaghetti.

So how would you define the control sequences? One thing I noticed with gemini://konpeito.media/ is that the default page was very hard for me to read, but that's because the page made one assumption that isn't universally true---that the default background color is black. I don't use a black background on my terminals.

> In any case I'm for filtering out unwanted escape codes.

That's what I do, only I filter out all escape codes, since I don't want the screen to mess up and my program to get confused as to what is where.
  -spc

[1] 32, or space, is special in that it can be treated as a control code (part of the unit separator group starting with 28), or as part of the graphical portion (even though it has no graphical representation). And 127 is technically not part of any control group, as it technically means "ignore me entirely" [2].

[2] It comes from paper tape, which was originally 7 bits wide, with a hole representing a 1. If there was an error, to fix it, all 7 bits were punched out (representing 127), and the reading side was known to ignore that character entirely.

[3] If you are using the UTF-8 encoding scheme, these characters will be encoded as UTF-8 codepoints, so 155 is encoded as the byte sequence 194,155, as if things weren't bad enough.
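The pattern groups in the message above translate fairly directly into regular expressions. What follows is only a sketch of a "filter out all escape codes" step, not the code of any client on this list; the names `ESCAPES` and `strip_escapes` are made up for illustration, and it assumes the text has already been decoded from UTF-8, so that code 155 appears as the single character U+009B (see footnote [3]).

```python
import re

# Group 4: string commands (codes 144, 152, 157, 158, 159 or their
# ESC "P" / "X" / "]" / "^" / "_" forms), a payload of codes 8-13 and
# printable ASCII, then the terminator (code 156 or ESC "\").
STRING_CMD = (
    r"(?:[\x90\x98\x9d\x9e\x9f]|\x1b[PX\]^_])"
    r"[\x08-\x0d -~]*"
    r"(?:\x9c|\x1b\\)"
)

# Group 3: ESC "[" or code 155, parameter characters "0"-"?",
# intermediate characters " "-"/", then one final character "@"-"~".
# (Assumption: any number of intermediate characters is accepted,
# where the pattern in the message allows at most one.)
CSI = r"(?:\x1b\[|\x9b)[0-?]*[ -/]*[@-~]"

# Groups 1 and 2: any remaining two-character ESC sequence, or a bare
# control character in the range 128-159.
TWO_CHAR = r"\x1b[@-~]|[\x80-\x9f]"

# Order matters: the longer forms must be tried before the two-character one.
ESCAPES = re.compile("|".join((STRING_CMD, CSI, TWO_CHAR)))


def strip_escapes(text: str) -> str:
    """Return text with every sequence matched by the patterns above removed."""
    return ESCAPES.sub("", text)


if __name__ == "__main__":
    print(strip_escapes("\x1b[31;42;5mwarning\x1b[0m"))
```

One limitation of this sketch: a string command that never reaches its terminator only loses its two-character introducer, so the payload stays visible as ordinary text.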
Sean Conner <sean at conman.org> writes:

> It was thus said that the Great Baschdel once stated:
>> In any case I'm for filtering out unwanted escape codes.
> That's what I do, only I filter out all escape codes, since I don't want the screen to mess up and my program get confused as to what is where.

It seems to me that the idea of putting ANSI escape codes in Gemini pages (oversimplifying here; thanks Sean for the more complete explanation) makes a number of assumptions:

1. That all Gemini clients are running on a terminal (emulator) that interprets ANSI escape codes (or the client provides enough terminal emulation to interpret these escape codes).

2. That ANSI escape codes sent by Gemini pages will only interact with your terminal in a safe way.

3. That it is desirable for untrusted content from the Internet to be able to control your terminal.

I, honestly, don't think any of these are true.

My suggestion: that the text/gemini format not be allowed to contain ANSI escape codes, even with a MIME type specifying an encoding that could contain them. If you want to send control codes, they should be sent with an appropriate MIME type. I'm not sure what that would be, actually. Is text/plain; charset=us-ascii correct? Or does it need to be application/something?

--
Jason McBrayer         | "Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                       | but stranger still is lost Carcosa."
                       | -- Robert W. Chambers, The King in Yellow
This is interesting timing. Bombadillo v1.x supported printing any escape codes it received. With 2.0.0 we eliminated that support in favor of filtering out escape codes (likely in a naive but mostly functional way): any time `\033` is encountered, anything that follows it until a `[A-Za-z]` character is reached will not be rendered (including the terminal character that ends the sequence). That is where things stand with Bombadillo's mainline release right now.

However, there has been an issue in our backlog for a while now to reinstate color. I worked on this a bit over the last week and have done the following:

I have added a "theme" (previously there was only 'normal' and 'inverse') called 'color'. This theme looks identical to the normal theme, except it will render the escape sequences. This approach allows the user to move in and out of this rendering at will. If something doesn't look right in `color` mode they can always switch to `normal` mode. Sadly, color is not compatible with `inverse` mode, since I use escape sequences to achieve the inverse effect and it is removed as soon as anyone tosses out a `\033[0m`.

I am kind of opposed to developing any syntax for text/gemini that would treat escape sequences any differently than any other text. Clients can choose how and if to implement escape codes in a way that makes sense to them and their user-base. I think most clients will be well served taking Sean's approach of filtering out escape codes. For those wanting to keep clients lightweight, a simple string replace of `\033` with any other character (maybe a box?) will make the escape codes not render, but still show document intent. This is REALLY easy to implement in just about any language.

I really like the usage of color on cat's recent page and hope to see more people using color.
I do 100% agree with Sean, though, that using color is often presumptuous, and I have run into issues where my terminal's bg color did not mesh well at all with the colors being used by an application (which is partly why bombadillo's tui does not use colors, only inversion).

I also agree with Sean (it seems to be happening a lot this time around) that getting closer to an answer on reflow eventually would be a good idea. As it stands I am wrapping but doing no reflow whatsoever.
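The two lightweight approaches described in the Bombadillo message above can each be sketched in a few lines. This is not Bombadillo's actual code (which is not shown in this thread); the function names are hypothetical, and dropping everything up to the end of the text when no letter follows the escape character is an assumption of this sketch.

```python
import re

# Naive filter: ESC (\033), then everything up to and including the first
# ASCII letter. If no letter follows, the rest of the text is dropped
# (an assumption of this sketch).
NAIVE_ESCAPE = re.compile(r"\x1b[^A-Za-z]*[A-Za-z]?")


def filter_naive(text: str) -> str:
    """Hide escape sequences using the 'ESC until first letter' rule."""
    return NAIVE_ESCAPE.sub("", text)


def show_escapes(text: str, marker: str = "\u25a1") -> str:
    """Replace each ESC with a visible marker (a box by default).

    The rest of the sequence stays as ordinary text, so nothing is
    interpreted by the terminal but the author's intent is still visible.
    """
    return text.replace("\x1b", marker)


if __name__ == "__main__":
    sample = "\x1b[1;31mhello\x1b[0m"
    print(filter_naive(sample))
    print(show_escapes(sample))
```

The naive rule works for color codes, which end in a letter such as `m`, but it cuts a string command such as `ESC ] 0 ; title` short at the first letter of its payload, so part of the payload leaks into the rendered text.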
On Thu, Dec 12, 2019 at 10:18 AM Jason McBrayer <jmcbray at carcosa.net> wrote:
> 1. That all Gemini clients are running on a terminal (emulator) that interprets ANSI escape codes (or the client provides enough terminal emulation to interpret these escape codes).
>
> 2. That ANSI escape codes sent by Gemini pages will only interact with your terminal in a safe way.
>
> 3. That it is desirable for untrusted content from the Internet to be able to control your terminal.
>
> I, honestly, don't think any of these are true.

I agree, and adding to this:

4. That gemini terminal clients write text directly to stdout.

Libraries like curses maintain their own internal screen buffer and don't allow passing through raw escape codes. Doing so would likely break any TUI display anyway, since escape codes could also change the cursor position.

My opinion is that ANSI escape codes should be neither endorsed nor prohibited in the Gemini protocol. Even prohibiting them would add additional complexity, because then servers would need to worry about what's a valid gemini document and what isn't. Treat it as a fun little easter egg and leave it at that.

- mozz
> Even prohibiting them would add additional complexity, because then servers would need to worry about what's a valid gemini document and what isn't. Treat it as a fun little easter egg and leave it at that.

Wouldn't that lead to client support fragmentation? I agree that "prohibiting" content on the protocol level would be difficult, but would we want to at least "discourage" things such as ANSI color codes that are difficult to implement and not part of the spec?

Also, maybe this is off-topic, but in my opinion, a lot of the beauty of the gemini spec is that the source code is so transparent. Once we have escape codes (and invisible characters), however, the source code is much more opaque. I wouldn't want to see source code displayed any differently by `cat` than when displayed by `vim`, for example.

Just my two cents :-)
On 12/12/2019 9:02 PM, Michael Lazar wrote:
>
> My opinion is that ANSI escape codes should be neither endorsed nor prohibited in the Gemini protocol. Even prohibiting them would add additional complexity, because then servers would need to worry about what's a valid gemini document and what isn't. Treat it as a fun little easter egg and leave it at that.
>

The way we handle this in the BBS world is as follows:

1.) If server is capable, then it checks or asks if the client is ANSI capable.

2.) If the client supports ANSI (many do, and many don't) then ANSI is served.

The level of complication is mitigated by this approach, especially with regards to client implementation.

I would note that many on this list expressed interest and were supportive of the effort, while using the approach above would not disenfranchise those not interested.

--
Bradley D. Thornton
Manager Network Services
http://NorthTech.US
TEL: +1.310.421.8268
It was thus said that the Great Bradley D. Thornton once stated:
> On 12/12/2019 9:02 PM, Michael Lazar wrote:
> >
> > My opinion is that ANSI escape codes should be neither endorsed nor prohibited in the Gemini protocol. Even prohibiting them would add additional complexity, because then servers would need to worry about what's a valid gemini document and what isn't. Treat it as a fun little easter egg and leave it at that.
>
> The way we handle this in the BBS world is as follows:
>
> 1.) If server is capable, then it checks or asks if the client is ANSI capable.
>
> 2.) If the client supports ANSI (many do, and many don't) then ANSI is served.
>
> The level of complication is mitigated by this approach, especially with regards to client implementation.
>
> I would note that many on this list expressed interest and were supportive of the effort, while using the approach above would not disenfranchise those not interested.

And thus we get user-agent strings being sent.

  -spc (I'm just saying ... )
---