I was surprised and excited today to spot a (beta) web-to-Gemini portal at https://portal.mozz.us/. By default it points at the Zaibatsu, and my first thought upon seeing it rendered was "Oh, what a shame, it messes up the bullet point list at the top of the page".

But, no, nothing is messed up here, this is precisely what I specced in 0.9.1 with the text reflowing!

I'm kind of sad about this. I was resolved to destroying the possibility of ASCII art in Gemini pages as an unfortunate "breaking eggs to make an omelette" kind of thing. I agree with the many people who have expressed frustration at Gopher content looking wonky if viewed on anything other than a 70 or 80 character wide terminal. But obviously I didn't think through the consequences of the way I specced things very carefully, because I wouldn't have been anywhere near as happy about destroying the possibility of bulleted lists. Those are a really useful and legitimate thing for text-based content to have.

So, how do we fix this?

Of course there has been plenty of talk in the past about using some full-blown lightweight markup language for Gemini, something like Markdown for example, or ratfactor's "Text Junior" (ooh, I should invite ratfactor to this list). I was a proponent of this in the beginning but I moved away from it quickly after it became clear that (i) people have strong opinions on lightweight markup languages and no choice was going to be popular with everybody, and (ii) lots of popular lightweight markup languages either lack a clear and unambiguous specification, or they have one and it's a difficult thing to write a parser for.

In the end I put in the very minimal definition of reflowability that I did because it had an existing RFC reference and didn't seem *too* onerous to implement (and, anyway, is optional). But now it seems we need something more than just that.
We could just add one more sentence saying that lines beginning with optional whitespace, a * and then at least one whitespace character should not be flowed, just like lines beginning with => should not be flowed. But that's likely the first step down a slippery slope to inventing our own ad-hoc non-standard markup language. Which perhaps we *ought* to do, but if so we ought to actually explicitly decide on it and commit to doing it properly in one go instead of designing it piecemeal. The alternative would seem to be to backtrack on reflowing and say "reflowing would be very nice, but it's not worth the cost of having to carefully specify how to handle all the edge cases necessary to preserve other very nice things like bullets, so tough cookies". I don't think that would be a popular choice, but it's still on the table at this point, IMHO. -Solderpunk
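The one-sentence rule floated above (bullet lines, like `=>` lines, are never flowed) can be sketched in a few lines of Python. This is only an illustration under assumptions, not spec text: the name `reflow`, the default width, and the choice to treat blank lines as paragraph breaks are all made up for the example.

```python
import re
import textwrap

# Assumed rule: link lines ("=>") and lines made of optional whitespace,
# a "*", and at least one whitespace character are never flowed.
FIXED = re.compile(r"^(=>|\s*\*\s)")


def reflow(text, width=40):
    """Re-wrap ordinary text to `width`, leaving fixed lines untouched."""
    out, para = [], []

    def flush():
        # Emit the paragraph collected so far, wrapped to the new width.
        if para:
            out.extend(textwrap.wrap(" ".join(para), width))
            para.clear()

    for line in text.splitlines():
        if not line.strip():        # blank line: paragraph break
            flush()
            out.append("")
        elif FIXED.match(line):     # link or bullet: keep exactly as written
            flush()
            out.append(line)
        else:                       # ordinary text: collect for re-wrapping
            para.append(line.strip())
    flush()
    return "\n".join(out)
```

Note that fixed lines are passed through even when they are wider than `width`; what a client should do with an over-long bullet is exactly the kind of edge case the message above worries about.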
It was thus said that the Great solderpunk once stated: > I was surprised and excited today to spot a (beta) web-to-Gemini > portal at https://portal.mozz.us/. By default it points at the > Zaibatsu, and my first thoughs upon seeing it rendered was "Oh, what a > shame, it messes up the bullet point list at the top of the page". > > But, no, nothing is messed up here, this is precisely what I specced > in 0.9.1 with the text reflowing! > > I'm kind of sad about this. I was resolved to destroying the > possibility of ASCII art in Gemini pages as an unfortunate "breaking > eggs to make an omlette" kind of thing. I agree with the many people > who have expressed frustration at Gopher content looking wonky if > viewed on anything other than an 70 or 80 character wide terminal. > But obviously I didn't think the consequences of the way I specced > things through very carefully, because I wouldn't have been anywhere > near as happy about destroying the possibility of bulleted lists. > Those are a really useful and legitimate thing for text-based content > to have. > > So, how do we fix this? There's always gemini://gemini.conman.org/gRFC/0001, which allows one to specify fixed or flowed, and for flowed, there's a way to mark lines that shouldn't be flowed at all. I even moved it back to a PROPOSED state. -spc
> there's a way to mark lines that
> shouldn't be flowed at all.

I assume this refers to the following part from early in RFC3676 4.1?

> If the line ends in a space, the line is flowed. Otherwise it is
> fixed.

Using a trailing space is kind of a neat solution, in the sense that it has very low visual impact. But the particular implementation above strikes me as "backward". I want the majority of my lines to be flowed. The exceptions, which would be things like bullet points, or a centred title line, would likely account for fewer than 10% of the lines in the whole document. Since they are in the minority, *they* are the lines which should require extra effort (manually adding an extra space). The majority wrapped lines should require no effort to prepare (and certainly shouldn't require me to fight against my editor's wrapping).

Adopting RFC3676 wholesale would bring in a lot of extra stuff that I don't think we have an immediate need for (like quoting related stuff). But I'm very happy to use it as inspiration.

Maybe a rule where lines ending in a space are immune to flowing could work?

-Solderpunk
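A rough Python sketch of the inverted rule suggested above, where a trailing space marks a line as immune to flowing and everything else is flowed. The function name `reflow_marked` and the width are assumptions for illustration; this is not RFC3676 behaviour (which treats the trailing space the other way round) and it ignores that RFC's quoting and space-stuffing rules entirely.

```python
import textwrap


def reflow_marked(text, width=40):
    """Flow every line except those the author ended with a space."""
    out, para = [], []

    def flush():
        # Emit the pending flowed paragraph, wrapped to the target width.
        if para:
            out.extend(textwrap.wrap(" ".join(para), width))
            para.clear()

    for line in text.split("\n"):
        if not line.strip():            # blank line: paragraph break
            flush()
            out.append("")
        elif line.endswith(" "):        # trailing space: fixed line
            flush()
            out.append(line.rstrip())   # marker is dropped for display
        else:                           # unmarked line: flowed
            para.append(line.strip())
    flush()
    return "\n".join(out)
```

With this polarity only the minority of lines (bullets, a centred title) need the extra keystroke, which is the trade-off argued for in the message.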
It was thus said that the Great solderpunk once stated: > > there's a way to mark lines that > > shouldn't be flowed at all. > > I assume this refers to the following part from early in RFC3676 4.1? > > > If the line ends in a space, the line is flowed. Otherwise it is > > fixed. Yup. > Adopting RFC3676 wholesale would bring in a lot of extra stuff that I > don't think we have an immediate need for (like quoting related stuff). > But I'm very happy to use it as inspiration. > > Maybe a rule where lines ending in a space are immune to flowing could > work? Well, Markdown has two RFCs (RFC-7763 and 7764), so one could always use that to serve up documents---they allow linking after all. Then again, expect to see requests for changes all the time as people use the protocol/text format and bump up against the limitations. I don't know what to tell you here---I'm neutral on this aspect. -spc
It was thus said that the Great solderpunk once stated:
> I was surprised and excited today to spot a (beta) web-to-Gemini
> portal at https://portal.mozz.us/.

Pretty cool, but it fails on loading any of the proposed RFCs on my site:

    gemini://gemini.conman.org/gRFC/

I think that's because I send a MIME type of

    text/gemini; charset=US-ASCII; format=flowed

which confuses it. I changed one of the documents to just return text/gemini and it loaded fine. Interesting.

> By default it points at the
> Zaibatsu, and my first thought upon seeing it rendered was "Oh, what a
> shame, it messes up the bullet point list at the top of the page".

There's a way around it---you can view it at

    https://portal.mozz.us/?url=gemini.conman.org%2Ftest%2Fbullet.gemini

Basically, I created a "bulleted list" by making each bullet point a link. Hey, it works.

-spc (It's also a decent test for parsing URLs---the gateway doesn't do a proper job actually ... )
> Well, Markdown has two RFCs (RFC-7763 and 7764), so one could always use > that to serve up documents---they allow linking after all. Then again, > expect to see requests for changes all the time as people use the > protocol/text format and bump up against the limitations. I don't know what > to tell you here---I'm neutral on this aspect. I've actually always expected that serving Markdown over Gemini could become quite popular. Some people are no doubt going to see Gemini first and foremost as "the web, stripped down" (as opposed to others who will see it as "gopher, souped up"), and in that sense Markdown is perhaps a natural partner since it seems to be the leading contender, in terms of mindshare, for "HTML, stripped down". I think if Gemini ever "gets big" in any meaningful sense (which, to be clear, I don't think likely), that usage will probably be what drives it. A nice graphical client which rendered section headers in heavier fonts and did bold and italics would offer a very nice experience, IMHO. This naturally raises the question "why not just spec text/markdown as the default response type?". I actually quite like Markdown so the notion doesn't offend me, but I do think it raises the bar a little too high for client implementation effort. Markdown is very common and there's no shortages of libraries for dealing with it, but the vast majority of them have the goal of converting it to HTML, which is no good for us. Of course one can just dump Markdown to the screen and not worry about line wrapping or section formatting or anything - Markdown is *designed* to look nice and readable as is. But Markdown also allows links to occur anywhere in the text, and providing a nice textual interface to that kind of hypertext isn't straightforward. It can be done, of course, as things like lynx show. But handling the one-link-per-line structure of Gopher or text/gemini is trivial, as so many Gopher clients show. 
With text/gemini, you can, provably, write a usable Gemini client in < 100 LOC. With Markdown, I don't think this would be possible (but anybody feel free to prove me wrong!). I also happen to think that the one-link-per-line restriction encourages very clean and usable designs.

It's possible that in the future Geminispace will be split into two "camps", a mainstream one using Markdown and graphical clients and the retrogrouch crowd using text/gemini and terminal clients. Maybe *that* is how to solve the problem of reflowing text in text/gemini - we don't, text/gemini is just plain text and if it really bothers you, well, that's the proof of which camp you're in, so start writing a client that renders Markdown!

Really just thinking out loud here...

-Solderpunk
I think many clients will opt to implement markdown parsers (heck, some may even try html parsing), but I think it would be a bad call to include it in the spec for gemini. Now that it has been brought up, I may add markdown parsing to the client I am working on (I am still in planning stages figuring out how I want it to work and the necessary structs and execution flows).

I think markdown can be rendered in cool/sensible ways in a terminal as well, so it can be a very nice way to go. I think that inline links can still be handled in the bracket number style (link[7]) that many use in gopherspace, for example. However, markdown will require the writing of an actual lexer/parser rather than just reading in lines and looking for a magic string at the beginning. Not a huge deal for some, an insurmountable challenge for others. Me personally, I think it would be fun to code it up, but only as an additional feature for my client and not as a core part of gemini.

I would definitely support a few optional things that can be rendered as a part of text/gemini. I had brought these up with solderpunk earlier in development of the spec and they, I believe, were deemed implementation details for the client... which I am mostly fine with. My only worry is that if we leave it to clients to add support for things like bold text in gemini documents we will end up with a fragmented situation where some clients support one way of doing it and others support others. I think picking the bare minimum essential styling and providing that as a part of the spec would be helpful in making text/gemini attractive without going the way of html/css/http. I think the following limited set makes sense and can be handled with basic string replacing if people don't want to parse:

1. Bold

Bold can be rendered in most terminals as bold. Clients that prefer could just string.toUpper or the like.
If this was done with opening and closing tags a substring replace (or proper parse) could just replace it with the escape sequences for bold.

2. Italic

Same situation as bold, more or less. For situations where italics is not supported by whatever viewing system, the tags could be replaced by asterisks? Like *this*.

3. Bold-Italic

Mostly the same as above, but a combination.

4. Heading (only one level)

This can be rendered a number of ways. In a terminal I would likely render this with the escape for 'inverse' text. Makes it really pop.

I think the above four would provide sufficient styling to handle most basic uses and provide a little bit of flair. If people were into the idea, we'd need to come up with how to denote those things in a text/gemini document. I am not adamant that the above needs to be included, but I think it could be nice.

As for text reflow: I am not in favor of html style text reflow (ignoring more than one space char). I think it makes sense, like gopher, to render text as provided. The exception to this would be word wrap, particularly for clients intended on width limited devices. My plan is to have word wrapping be a togglable feature in my client.

Anyway, much like solderpunk: just thinking out loud.

ps. As to bullets: since gemini is utf-8 by default bullets should be very much in play.
It also occurs to me that it would not be a crazy thing to add another magic string. If we wanted to handle things as list items that would never get reflowed:

```
--> I am a list item
I am text that would get reflowed.
--> I am another list item
--> and I complete the list, yay!
```

Something like that is an easy solution to making sure something always appears on a line of its own. Not sure how elegant it is. As other things come up the arrow style syntax magic string at the beginning can easily be modified to support various features, but is reasonably limited as well.
Brian Evans writes: > I think think many clients will opt to implement markdown parsers > (heck, some may even try html parsing), but I think it would be a bad > call to include it in the spec for gemini. Now that it has been > brought up, I may add markdown parsing to the client I am working on > (I am still in planning stages figuring out how I want it to work and > the necessary structs and execution flows). I am of two minds on this. On the one hand, I am generally of the "pave the cowpaths" school, where your RFCs ratify actual practice and pick out best practices. On the other, I would not like to see Gemini usage split early on between Markdown-implementing sites/clients and non-Markdown-implementing. Personally, what I would most like to see in a client is a large subset of Markdown (no embedded HTML, probably no inline images). But all of the CommonMark text properties, levels of headings, etc. In general, I would like for reasonable typography to be what you think of when you think of a Gemini page. I understand that this makes things significantly harder for client implementors. That's one reason why I've suggested ratfactor's Text Junior format (can someone invite ratfactor? I don't have any means of contacting them) as a standard format for Gemini. It is basically a large subset of Markdown, but fully line-oriented and stripped of parsing ambiguities. A simple client can either just cat(1) it, can fmt(1) everything that's not wrapped in a ``` ``` block, or can apply full formatting. Everything I have served on my Gemini site is legal Text Junior. It is perhaps unfortunate that despite there being Markdown parser libraries for every common language, practically all of them are focused on generating HTML rather than generating (e.g.) a parse tree that you could use in your own layout engine. 
If all of those libraries had better support for other representations, I'd honestly argue that we just use Markdown (CommonMark), because it already occupies the text-formatting analogue of the niche that Gemini is aiming for in the protocol world, IMO.

--
Jason F. McBrayer                    jmcbray at carcosa.net
The scalloped tatters of the King in Yellow must hide Yhtill forever.
Jason McBrayer writes:

> It is perhaps unfortunate that despite there being Markdown parser
> libraries for every common language, practically all of them are
> focused on generating HTML rather than generating (e.g.) a parse tree
> that you could use in your own layout engine.

I've actually looked at the internals of the Python and Common Lisp libraries for Markdown, and they both have extension points that could reasonably be hijacked for returning parse trees, for what it's worth. Still might be a bit much for someone trying to write a quick-and-dirty client.

--
Jason F. McBrayer                    jmcbray at carcosa.net
The scalloped tatters of the King in Yellow must hide Yhtill forever.
I am envisioning a Gemini client that looks a lot like this GTK Markdown reader:

https://github.com/craigbarnes/showdown/blob/master/README.md
It was thus said that the Great Brian Evans once stated:
>
> The exception to this would be word wrap, particularly for clients
> intended on width limited devices. My plan is to have word wrapping be a
> togglable feature in my client.

My gopher client has a key (F3 [1]) to reflow the current page.

-spc

[1] Why F3? Because the function keys get no love in this modern age.
On Thu, Aug 15, 2019 at 7:52 PM Sean Conner <sean at conman.org> wrote: > > It was thus said that the Great solderpunk once stated: > > I was surprised and excited today to spot a (beta) web-to-Gemini > > portal at https://portal.mozz.us/. > > Pretty cool, but it fails on loading any of the proposed RFCs on my site: > > gemini://gemini.conman.org/gRFC/ > > I think that's because I send a MIME type of > > text/gemini; charset=US-ASCII; format=flowed This has now been fixed. I'm not going to do anything with the "format=flowed" parameter because it's not in the official spec (yet), but you should now be able to specify a charset and it will be respected by the proxy. > -spc (It also a decent test for parsing URLs---the gateway doesn't do a > proper job actually ... ) > This should also be fixed now, I wasn't handling relative paths properly. Since there are currently only a handful of actual gemini servers to test against, it's difficult to discover these types of bugs when writing a client. It would be helpful if there was a playground or sandbox server that presented all of the different combinations of URL formats, charsets, response codes, etc. that are allowed by the Gemini spec. I might take a shot at starting one myself. - mozz
It was thus said that the Great Michael Lazar once stated: > On Thu, Aug 15, 2019 at 7:52 PM Sean Conner <sean at conman.org> wrote: > > > > It was thus said that the Great solderpunk once stated: > > > I was surprised and excited today to spot a (beta) web-to-Gemini > > > portal at https://portal.mozz.us/. > > > > Pretty cool, but it fails on loading any of the proposed RFCs on my site: > > > > gemini://gemini.conman.org/gRFC/ > > > > I think that's because I send a MIME type of > > > > text/gemini; charset=US-ASCII; format=flowed > > This has now been fixed. I'm not going to do anything with the "format=flowed" > parameter because it's not in the official spec (yet), but you should now be > able to specify a charset and it will be respected by the proxy. You might want to fix the parser. I changed gRFC/0002 to return a MIME type of: text/gemini; format=flowed; charset=US-ASCII; (as the order of parameters is unspecified in proper MIME parsing) and I'm getting an Internal Server Error. The other two pages still work. > > -spc (It also a decent test for parsing URLs---the gateway doesn't do a > > proper job actually ... ) > > > > This should also be fixed now, I wasn't handling relative paths properly. I still don't think it's working correctly. On the page: gemini://gemini.conman.org/test/bullet.gemini The first five links are, in order: example:bullet RFC-7595 example:bullet about:blank RFC-6694 - a non-existant link /test/bullet.gemini self-referential link The first three *should not* make a reference to my server. The last two should. > Since there are currently only a handful of actual gemini servers to test > against, it's difficult to discover these types of bugs when writing a client. > It would be helpful if there was a playground or sandbox server that presented > all of the different combinations of URL formats, charsets, response codes, etc. > that are allowed by the Gemini spec. I might take a shot at starting one myself. Not a bad idea. -spc
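As a small illustration of the parameter-order point above, here is a hedged Python sketch of splitting a Gemini response's MIME type into a type and a parameter dictionary, so that `charset` and `format` can appear in either order and a trailing `;` is tolerated. The function name `parse_meta` is hypothetical, and the sketch deliberately does not handle quoted parameter values that themselves contain `;`.

```python
def parse_meta(meta):
    """Split 'type/subtype; key=value; ...' into (mime_type, params)."""
    parts = [p.strip() for p in meta.split(";")]
    mime_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if not part:                # tolerate a trailing ';'
            continue
        key, _, value = part.partition("=")
        # Parameter names are case-insensitive; keep values as sent,
        # minus any surrounding double quotes.
        params[key.strip().lower()] = value.strip().strip('"')
    return mime_type, params


# Both orderings from the thread yield the same result.
a = parse_meta("text/gemini; charset=US-ASCII; format=flowed")
b = parse_meta("text/gemini; format=flowed; charset=US-ASCII;")
```

A client that does not understand `format` can simply ignore that key rather than failing on the whole header.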
It was thus said that the Great Michael Lazar once stated:
> > -spc (It also a decent test for parsing URLs---the gateway doesn't do a
> > proper job actually ... )
> >
>
> This should also be fixed now, I wasn't handling relative paths properly.

One more issue I just found---I went to

    https://portal.mozz.us/

I clicked on the first link under "Desiging-out-loud logs" and got:

    51 not found

because the URL sent was:

    zaibatsu.circumlunar.space//announcing-gegobi-a-gemini-gopher-bihoster.txt

(note the two '//' in the path portion).

-spc
> Since there are currently only a handful of actual gemini servers to test > against, it's difficult to discover these types of bugs when writing a client. > It would be helpful if there was a playground or sandbox server that presented > all of the different combinations of URL formats, charsets, response codes, etc. > that are allowed by the Gemini spec. I might take a shot at starting one myself. jullenxx suggested something similar to me on Mastodon a little while back. It's a really excellent idea and something I think would definitely benefit the project. Please, feel very free to set something like this up! I am planning to soon start writing something similar from the client side - it will send various requests to whatever server you point it at and check for appropriate responses. It'll do things like send malformed requests, send requests for non-Gemini resources, or for Gemini resources at other hosts, try to connect with old versions of TLS (or even SSL!), etc. These two tools should take us a long way toward producing more robust software. -Solderpunk
On Sat, Aug 17, 2019 at 03:14:03PM -0400, Jason McBrayer wrote: > That's one reason why I've suggested ratfactor's Text Junior format (can > someone invite ratfactor? I don't have any means of contacting them) as I invited ratfactor a few days ago and I can see now that he is indeed subscribed. And a good thing, too! Unable to sleep last night, I was reading old entries of his phlog on my phone and after (re)reading: gopher://sdf.org:70/0/users/ratfactor/phlog/2018-08-15-Text-has-styles gopher://sdf.org:70/0/users/ratfactor/phlog/2019-04-21-text-junior gopher://sdf.org:70/0/users/ratfactor/phlog/2019-06-09-gopher-2.0-markup I am convinced that he's thought about this question at least as long and as hard as the rest of us combined, and I'm sure he will have very valuable contributions to make to this discussion. Although I note that Text Junior as of yet has no way to deal with bulleted lists! -Solderpunk
On Sat, Aug 17, 2019 at 05:14:10PM -0400, Jason McBrayer wrote: > I am envisioning a Gemini client that looks a lot like this GTK Markdown reader: > https://github.com/craigbarnes/showdown/blob/master/README.md Yes! This is very much the kind of thing that I've thought has an outside shot at becoming popular. It's probably not a popular idea, but I'll confess anyway: my dream version of a client looking a lot like that also randomly generates a nice background/foreground colour scheme for each page, using the hostname as a seed - every page at a given server looks the same, but every different server has its own subtly unique visual identity, which the server has no way to control. -Solderpunk
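The per-host colour scheme idea above could be prototyped roughly as follows. This is a sketch only: the function name `scheme_for`, the use of SHA-256 as the seed source, and the particular lightness and saturation values are all assumptions picked for the example, not anything a real client is known to do.

```python
import colorsys
import hashlib


def scheme_for(hostname):
    """Derive a stable (background, foreground) hex pair from a hostname."""
    # A hash of the hostname acts as the seed, so the server cannot
    # influence the result and every visit gives the same colours.
    digest = hashlib.sha256(hostname.lower().encode("utf-8")).digest()
    hue = digest[0] / 256

    # Light background, dark foreground on the opposite side of the wheel.
    background = colorsys.hls_to_rgb(hue, 0.92, 0.5)
    foreground = colorsys.hls_to_rgb((hue + 0.5) % 1.0, 0.15, 0.5)

    def to_hex(rgb):
        return "#" + "".join(f"{round(c * 255):02x}" for c in rgb)

    return to_hex(background), to_hex(foreground)
```

Because only the hostname feeds the hash, every page on one server shares a scheme, which matches the behaviour described in the message.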
One caveat - this is still my vision for a client, but I noticed that this Markdown reader actually converts the Markdown to HTML, and then views it in a GTKWebView widget! That's a bit of unnecessary tech stack I'd like to avoid.
> Personally, what I would most like to see in a client is a large subset > of Markdown (no embedded HTML, probably no inline images). But all of > the CommonMark text properties, levels of headings, etc. In general, I > would like for reasonable typography to be what you think of when you > think of a Gemini page. I understand that this makes things > significantly harder for client implementors. I too would be very happy if Gemini developed a reputation for nice, functional typography. None of the features you propose strike me as problematic. I especially like the levels of headings idea. Not just for the visual aspect, but because it allows, like the Markdown browser you shared a link to, a nice navigational sidebar for large structured documents, which is very much a good thing. > That's one reason why I've suggested ratfactor's Text Junior format (can > someone invite ratfactor? I don't have any means of contacting them) as > a standard format for Gemini. It is basically a large subset of > Markdown, but fully line-oriented and stripped of parsing ambiguities. A > simple client can either just cat(1) it, can fmt(1) everything that's > not wrapped in a ``` ``` block, or can apply full formatting. Everything > I have served on my Gemini site is legal Text Junior. Text Junior seems nice and I'm not opposed to using it or something very similar to it as a basis for Gemini. But the current worst shortcoming, IMHO, of the very minimal "yeah, you can reflow stuff if you like" Gemini spec is that it wrecks nicely formatted lists, and as far as I can see TJ doesn't currently handle that either. -Solderpunk
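Here's a question which is going to sound aggressive or confrontational but is actually just me trying to help all of us (myself included) clarify our thinking and start moving in the direction of a decision:

*ahem*

Given that, thanks to the inclusion of MIME types in the response header, Gemini is already perfectly capable of serving Markdown, and given that Markdown is powerful enough to completely replicate all of the semantics currently in the text/gemini spec (i.e. it can link to other places via URL with a user-friendly label attached), what do we actually stand to gain by speccing text/gemini up as something which is, roughly, just Markdown with perhaps a few features removed and its native linking syntax replaced by our own line-based => alternative? Isn't this line of thought just leading us in the direction of substantial duplication of effort and having two redundant ways to do more or less the same thing? Isn't that, generally speaking, a pretty bad way to design things?

If the answer to "what do we actually stand to gain?" is "Hmm, not much, actually" then it seems sensible to me that we should back away from this direction.

If the answer is "We gain X, Y and Z", then we can do our best to design syntax which maximises X, Y and Z.

Either answer clarifies things for us.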
It was thus said that the Great solderpunk once stated:
> Here's a question which is going to sound aggressive or confrontational
> but is actually just me trying to help all of us (myself included)
> clarify our thinking and start moving in the direction of a decision:
>
> *ahem*
>
> Given that, thanks to the inclusion of MIME types in the response
> header, Gemini is already perfectly capable of serving Markdown, and
> given that Markdown is powerful enough to completely replicate all of
> the semantics currently in the text/gemini spec (i.e. it can link to
> other places via URL with a user-friendly label attached), what do we
> actually stand to gain by speccing text/gemini up as something which is,
> roughly, just Markdown with perhaps a few features removed and its
> native linking syntax replaced by our own line-based => alternative?
> Isn't this line of thought just leading us in the direction of
> substantial duplication of effort and having two redundant ways to do
> more or less the same thing? Isn't that, generally speaking, a pretty
> bad way to design things?

I thought that one of the overriding concepts for Gemini was the ease of implementation---that one should be able to write a client in 100 lines of <insert language here>.

Now it has certainly grown a bit. A gemini index file can have relative links:

    => subinfo/document.txt Some Text

which means that clients now have to merge the base URL with the relative URL to derive the new full URL. That complicates the client a bit, but we do get a lot of functionality for giving up some ease of implementation (or for relying upon some libraries).

> If the answer to "what do we actually stand to gain?" is "Hmm, not much,
> actually" then it seems sensible to me that we should back away from
> this direction.
>
> If the answer is "We gain X, Y and Z", then we can do
> our best to design syntax which maximises X, Y and Z.
>
> Either answer clarifies things for us.
The issue I have with Markdown is that there is no one standard for it. John Gruber created it in 2004 as a way for *him* to create HTML documents without having to write HTML (or use a clumsy HTML editor) and he had no desire to add to it (because it works for him). Since then, multiple versions have been created to address shortcomings people came across as they tried using Markdown for their own use, and as of right now, RFC-7763 and RFC-7764 list the following flavors of Markdown:

    Original          http://daringfireball.net/projects/markdown/
    MultiMarkdown     http://fletcher.github.io/MultiMarkdown-4/syntax
    Github            https://help.github.com/articles/github-flavored-markdown/
    Pandoc            http://johnmacfarlane.net/pandoc/README.html#pandocs-markdown
    Fountain          http://fountain.io/syntax
    CommonMark        http://spec.commonmark.org/
    kramdown-rfc2629  https://github.com/cabo/kramdown-rfc2629
    rfc7328           https://github.com/miekg/pandoc2rfc
    Extra             https://michelf.ca/projects/php-markdown/extra/

It's almost like "pick your poison" here.

-spc
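For reference, the "merge the base URL with the relative URL" step mentioned in this message is small when a URL library is available. The sketch below uses Python's standard `urllib.parse`; the function name `parse_link` and the `example.org` URLs are hypothetical. One real wrinkle: `urljoin` only resolves relative references for schemes listed in `urllib.parse.uses_relative` and `uses_netloc`, so `gemini` has to be registered first.

```python
import urllib.parse

# urljoin() returns the reference unchanged for schemes it does not
# know to be hierarchical, so register "gemini" before using it.
for registry in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
    if "gemini" not in registry:
        registry.append("gemini")


def parse_link(line, base_url):
    """Turn a '=> URL label' line into (absolute_url, label), else None."""
    if not line.startswith("=>"):
        return None
    fields = line[2:].strip().split(maxsplit=1)
    if not fields:
        return None
    target = urllib.parse.urljoin(base_url, fields[0])
    # Fall back to the URL itself when the line carries no label.
    label = fields[1] if len(fields) > 1 else target
    return target, label


link = parse_link("=> subinfo/document.txt Some Text",
                  "gemini://example.org/docs/index.gmi")
```

Links with a different scheme (such as `about:blank`) come back unchanged, and joining an absolute path like `/test/bullet.gemini` onto a base ending in `/` does not produce a doubled slash.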
It was thus said that the Great Michael Lazar once stated: > > Since there are currently only a handful of actual gemini servers to test > against, it's difficult to discover these types of bugs when writing a client. > It would be helpful if there was a playground or sandbox server that presented > all of the different combinations of URL formats, charsets, response codes, etc. > that are allowed by the Gemini spec. I might take a shot at starting one myself. I've made a stab at some initial tests at: gemini://gemini.conman.org/test/torture/ There are currently 19 tests (0001 through 0019) and they only cover resolving links (full URL, full path, relative path, relative path with ".." and "." components), plus parsing of the MIME type. Comments welcome, especially with tests that may be unfair or are problematic. -spc
> I've made a stab at some initial tests at: > > gemini://gemini.conman.org/test/torture/ > > There are currently 19 tests (0001 through 0019) and they only cover > resolving links (full URL, full path, relative path, relative path with ".." > and "." components), plus parsing of the MIME type. Comments welcome, > especially with tests that may be unfair or are problematic. Thanks for this! I'm happy to report that AV-98 passes with flying colours. I've started a test client which you provide a hostname and it throws various requests at that host and checks the response status against its expectations. It's not quite ready for prime time yet, though. Although, it's already made me realise something that we haven't specced any behaviour for at all: how should a server respond to an empty request? i.e. just CRLF. Is this invalid, such that it should trigger a 59 response? Some servers do this, but others seem to treat it as a request for the root document. -Solderpunk
> I thought that one of the overriding concepts for Gemini was the ease of
> implementation---that one should be able to write a client in 100 lines of
> <insert language here>.
>
> Now it has certainly grown a bit. Given that a gemini index file can have
> relative links:

I agree that all this talk of complicated markdown syntax gravely threatens this overriding concept (although one could argue maybe not, as just printing non-link lines as-is *is* an explicitly permitted alternative).

But relative URLs are really no danger to this at all, and for the record I've just updated https://tildegit.org/solderpunk/gemini-demo-1 to handle status codes of 10, and it's still under 100 lines of Python. It follows redirects, has no trouble with relative links, and uses mailcap to open non-text responses. The user interface is fairly brutalist, but in terms of taking advantage of what the protocol can do, it's fairly complete. Client certificates are the only thing it opts out of.

I should try it out on your torture test...

-Solderpunk
It was thus said that the Great solderpunk once stated: > > I've made a stab at some initial tests at: > > > > gemini://gemini.conman.org/test/torture/ > > > > There are currently 19 tests (0001 through 0019) and they only cover > > resolving links (full URL, full path, relative path, relative path with ".." > > and "." components), plus parsing of the MIME type. Comments welcome, > > especially with tests that may be unfair or are problematic. > > Thanks for this! I'm happy to report that AV-98 passes with flying > colours. I've added two new tests. Also, the index page lists all the tests currently available so you can resume from where you last left off. > Although, it's already made me realise something that we haven't specced > any behaviour for at all: how should a server respond to an empty > request? i.e. just CRLF. Is this invalid, such that it should trigger > a 59 response? Some servers do this, but others seem to treat it as a > request for the root document. It'd be an exception to not send anything. Other than that, I don't have any real preference. -spc
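Since the behaviour for an empty request is unspecified at this point in the thread, the following Python sketch only makes the two observed server behaviours explicit. The function name `handle_request`, the `empty_means_root` switch and the return shape are all assumptions made for illustration; the only detail taken from the discussion is that 59 is the status a rejecting server would send.

```python
def handle_request(raw, empty_means_root):
    """Decide what to do with one raw request line (hypothetical helper).

    Returns ("serve", target) or ("reject", 59).
    """
    # A request that is not terminated by CRLF is rejected outright.
    if not raw.endswith(b"\r\n"):
        return ("reject", 59)
    target = raw[:-2].decode("utf-8", errors="replace").strip()
    if target == "":
        # The unspecified case: just CRLF and nothing else.
        if empty_means_root:
            return ("serve", "/")
        return ("reject", 59)
    return ("serve", target)


if __name__ == "__main__":
    print(handle_request(b"\r\n", empty_means_root=False))
    print(handle_request(b"\r\n", empty_means_root=True))
```

Whichever behaviour ends up in the spec, a test client like the one described above only needs to know which of the two results to expect.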
Okay, I'm ready to cautiously, tentatively pick this question up again. Even while I've been away I've tried to mull this one over.

One particular thought I've seized upon is this: the *only* good reason to add the complexity of defining any kind of markup syntax is so that clients can wrap Gemini content at arbitrary widths without breaking all the very nice and useful and valuable formatting abilities which text/plain already provides with Zen-like simplicity. If nobody ever tried to wrap text to new widths, people could *just write* Markdown (whatever Markdown means to them) and it would work and look nice, and fancier clients could optionally render some parts of it in visually appealing ways.

Basically, text wrapping/reflowing is the complicating factor here. Text wrapping will break nice formatting of lists and tables and things absent some markup for identifying lists and tables and things, and detailed rules on how to render those items to arbitrary width.

The *only* reason that wrapping/reflowing is such a hot topic is the combination of the two facts:

i) Most people producing Gopher content format it for 70 or 80 columns, following various prescriptions of Ancient Lore (NB: there are some interesting exceptions, e.g. trnslts at rawtext.club offers all their content in the user's choice of 40, 72 or 120 chars!).

ii) More and more people consuming Gopher content are doing it on phones, tablets and other newfangled devices which can't display 70 or 80 columns neatly so that things look weird.

(At least, I assume this is the reason there's so much demand for this. If anybody thinks this is wrong, I'd love to hear it)

Point ii) above is not going to change any time soon, in fact it's only going to become increasingly common as time goes on, whether the retrogrouches amongst us like it or not. But point i) is just an arbitrary convention and so entirely mutable.
This raises the question: is there a number of characters at which text/gemini content could be, by convention, wrapped by the author such that it displays nicely on a majority of mobile devices whilst not looking ridiculous on real displays? At gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/files/text-wrapping-experiment.txt I have some text wrapped at 70, 60, 50, 40, 30 and finally 35 columns. Using Pocket Gopher on my phone, I can read text wrapped at 70 columns no worries if I hold the phone in landscape orientation. In portrait I need 35, which admittedly doesn't look quite right on my laptop screen. But I think my phone screen is atypically small, and I wonder if maybe 40 would be workable for a lot of people? Feedback from actual device users very welcome... -Solderpunk
solderpunk writes:

> This raises the question: is there a number of characters at which text/gemini content could be, by convention, wrapped by the author such that it displays nicely on a majority of mobile devices whilst not looking ridiculous on real displays?

I answered on the fediverse, but my kind-of-average phone, with DiggieDog, will show 80 in landscape or 50 in portrait.

I'm not sure the only reason wrapping is an issue is mobile devices; I would like to see good but simple typography be the expectation on Gemini (heavier than plain text, lighter than HTML). But I fully admit that this raises the implementation complexity, and my own graphical client experiment has not yet got to the point where I am trying to render markdown.

--
Jason McBrayer, jmcbray at carcosa.net
"Strange is the night where black stars rise, and strange moons circle through the skies, but stranger still is lost Carcosa."
-- Robert W. Chambers, The King in Yellow
> I answered on the fediverse, but my kind-of-average phone, with > DiggieDog, will show 80 in landscape or 50 in portrait. Thanks for the data point! > I'm not sure the only reason wrapping is an issue is mobile devices; I > would like to see good but simple typography be the expectation on > Gemini (heavier than plain text, lighter than HTML). Could you elaborate a little on what you mean by "simple typography" and how it relates to wrapping/flowing in particular? I understand you'd like things like section headings in a larger, perhaps bolder font, and the ability to bold/italicise words, etc., but unless I'm missing something (very possible!) this is all unrelated to wrapping? -Solderpunk
solderpunk writes:

> Could you elaborate a little on what you mean by "simple typography" and how it relates to wrapping/flowing in particular? I understand you'd like things like section headings in a larger, perhaps bolder font, and the ability to bold/italicise words, etc., but unless I'm missing something (very possible!) this is all unrelated to wrapping?

They are only loosely related, in that 80 columns of monospace text, while fully serviceable in a text terminal, is not good typography on a screen that's capable of displaying something more similar to a printed page.

But separating these issues: maybe we should just specify Gemini maps as max 70 columns of monospaced, unflowed text. If people want better typography, they can send text/markdown.

My apologies for any roughness in my writing today; I've got a pretty bad cold.

--
Jason McBrayer, jmcbray at carcosa.net
On September 2, 2019 12:20:24 PM UTC, Jason McBrayer <jmcbray at carcosa.net> wrote:

> But separating these issues: maybe we should just specify Gemini maps as max 70 columns of monospaced, unflowed text. If people want better typography, they can send text/markdown.

If we assume that MIME types other than text/gemini are rendered according to their own rules (e.g., markdown) then the issue of wrapping is left to text/gemini and any types that don't have inherent rules (e.g., text/plain).

I think we have just a few choices:

1. Display as-is authored.
2. Wrap everything.
3. Wrap only long lines (need to decide on arbitrary width).
4. Define markup to be explicit about wrapping vs. fixed.

#1 is the easiest to implement but leaves the same problem as gopher, and it's likely that some client authors will go rogue and implement #2 or #3 anyway. #4 adds complexity and opens a door for more and more markup.

What if we look at combination MIME types, though? What if we allowed something to be defined as a Gemini map & in markdown, or asciidoc, etc? Parsing MIME types to indicate "this is a Gemini file" first gives you links, and if the secondary type is known to the client it can give you formatting. It can fall back on "display as-is" for unknown types.

Now this doesn't necessarily solve text/plain if folks are determined to wrap it, but I think it makes Gemini map files more flexible over time without adding burden to the spec directly.
On Sun, Sep 1, 2019 at 7:05 AM solderpunk <solderpunk at sdf.org> wrote:

> Point ii) above is not going to change any time soon, in fact it's only going to become increasingly common as time goes on, whether the retrogrouches amongst us like it or not. But point i) is just an arbitrary convention and so entirely mutable.
>
> This raises the question: is there a number of characters at which text/gemini content could be, by convention, wrapped by the author such that it displays nicely on a majority of mobile devices whilst not looking ridiculous on real displays?

I think this is a valid idea that is worth seriously considering. I actually went through the same train of thought about a week ago, but I held back from posting because I wasn't entirely sure how I felt about it, and I'm simultaneously working on a proposal for a simplified markdown dialect for gemini [1].

I have an iPhone 10 XR and use the iOS gopher client by Charles Childers. My font size can fit 96 characters wide and 44 characters narrow. I could probably read a slightly smaller font, but I have trouble clicking on the links with my fat fingers.

Limiting the width below 70 characters is largely unexplored territory. This is the type of constraint that could inspire people to get creative in their page design (in a good way). Gemini pages might end up evolving their own distinct identity that distinguishes them from other text platforms. It could also open the door to experimenting with terminal client UI, for example displaying multiple pages side-by-side.

Sources say that everything from 45 to 75 characters wide is satisfactory for text readability [2]. I would probably aim for 45-50 to comfortably fit most smart phones for most people.

Perhaps the largest downside of using a fixed width format is that you lose accessibility. People with reduced vision might want/need to use a larger font than average. With re-flowed text this is easy because the text will wrap to fit their screen.
Also, I've heard that screen readers work poorly with hard line wraps, although I can't find a citation for that. [1] gemini://mozz.us/markdown/ [2] http://webtypography.net/2.1.2
> I think this is a valid idea that is worth seriously considering. I actually went through the same train of thought about a week ago, but I held back from posting because I wasn't entirely sure how I felt about it, and I'm simultaneously working on a proposal for a simplified markdown dialect for gemini [1].

I've also wondered about something along these lines. It's an intriguing idea. Some ill-defined and possibly semi-irrational fear tickles in the back of my head that it *sounds* like a simple and harmless thing to allow, but could in practice be opening some can of ambiguity worms that will come back to bite us.

> I have an iPhone 10 XR and use the iOS gopher client by Charles Childers. My font size can fit 96 characters wide and 44 characters narrow. I could probably read a slightly smaller font, but I have trouble clicking on the links with my fat fingers.

Thanks for this data point! I've gotten a few responses on Mastodon to my query as to what width various people can read on their phones. Almost everybody has said they can read 50 characters comfortably in portrait mode. One person reported 50 in one client, 46 in another. It seems like 45 columns would work for the majority of people. I'm going to format the rest of this email wrapped at 45 columns to help us all get a feel for that size.

> Limiting the width below 70 characters is largely unexplored territory. This is the type of constraint that could inspire people to get creative in their page design (in a good way). Gemini pages might end up evolving their own distinct identity that distinguishes them from other text platforms. It could also open the door to experimenting with terminal client UI, for example displaying multiple pages side-by-side.

I agree that there's a certain amount of
excitement associated with the unexplored
possibilities of a new format.
Not only would a convention of wrapping
Gemini content at 45 chars effectively remove
the need to add enough markup to the spec to
facilitate neat reflow, it could, like you
say, give the medium a distinctive look and
feel and encourage interesting new
developments like multi-column layouts.

My biggest concern at this point is that link
labels could easily exceed 45 chars and then
clients need to come up with a nice way to
wrap them...

> Perhaps the largest downside of using a fixed width format is that you lose accessibility. People with reduced vision might want/need to use a larger font than average. With re-flowed text this is easy because the text will wrap to fit their screen. Also, I've heard that screen readers work poorly with hard line wraps, although I can't find a citation for that.

This is a bit of a concern. I noticed that
dgold recently removed all ASCII art from
their phlog for accessibility reasons - see
gopher://ascraeus.org/0/phlog/030.txt.

-Solderpunk
> I agree that there's a certain amount of
> excitement associated with the unexplored
> possibilities of a new format.
> Not only would a convention of wrapping
> Gemini content at 45 chars effectively remove
> the need to add enough markup to the spec to
> facilitate neat reflow, it could, like you
> say, give the medium a distinctive look and
> feel and encourage interesting new
> developments like multi-column layouts.
>
> My biggest concern at this point is that link
> labels could easily exceed 45 chars and then
> clients need to come up with a nice way to
> wrap them...

The link format:

    => link-URL link-text

You could state that the link-text should be 45 characters or less, and if not given, then the link-URL should be truncated at 42 characters, with "..." added on the end for display (if less than 42 characters, then no need for the "..." at the end). In Lua, this would be:

    if #linkurl > 42 then
      linktext = linkurl:sub(1,42) .. "..."
    end

In a string-hostile language like C:

    char buffer[46];
    if (strlen(linkurl) > 42)
      snprintf(buffer,sizeof(buffer),"%*.*s...",42,42,linkurl);

So I don't think it's a hard thing to do.

> > Perhaps the largest downside of using a fixed width format is that you lose accessibility. People with reduced vision might want/need to use a larger font than average. With re-flowed text this is easy because the text will wrap to fit their screen. Also, I've heard that screen readers work poorly with hard line wraps, although I can't find a citation for that.
>
> This is a bit of a concern. I noticed that
> dgold recently removed all ASCII art from
> their phlog for accessibility reasons - see
> gopher://ascraeus.org/0/phlog/030.txt.

Or how about: lines that start with '=>' are links and are handled differently, lines starting with a ' ' are fixed (but should be less than N characters in length), and lines starting with any other character should be reflowed until a blank line, a line starting with '=>' or whitespace.
    So this wouldn't be reflowed.
But this would
reflow because it doesn't start with a whitespace
=> example:foo link text is handled differently.

-spc (Just some thoughts ... )
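For comparison with the Lua and C snippets above, here is a rough Python sketch of the same truncation rule, together with a minimal parser for the "=> link-URL link-text" form. The names `parse_link`, `display_label` and `MAX_LABEL` are made up for this example, and the 45/42 widths are simply the numbers floated in this message, not anything in the spec.

```python
MAX_LABEL = 45  # width suggested in the thread; an assumption, not a spec value


def parse_link(line):
    """Split a '=> URL [text]' line into (url, text); text is None if absent."""
    if not line.startswith("=>"):
        raise ValueError("not a link line")
    parts = line[2:].split(None, 1)
    if not parts:
        raise ValueError("link line has no URL")
    url = parts[0]
    text = parts[1].strip() if len(parts) > 1 else None
    return url, text


def display_label(url, text, limit=MAX_LABEL):
    """Use the link text if given; otherwise show the URL, truncated with '...'."""
    if text:
        return text
    if len(url) > limit - 3:
        return url[:limit - 3] + "..."
    return url


if __name__ == "__main__":
    url, text = parse_link("=> example:foo link text")
    print(display_label(url, text))
```

A URL-only link longer than 42 characters comes out as exactly 45 characters: the first 42 of the URL followed by "...".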
spc writes:

> Or how about: lines that start with '=>' are links and are handled differently, lines starting with a ' ' are fixed (but should be less than N characters in length), and lines starting with any other character should be reflowed until a blank line, a line starting with '=>' or whitespace.

I agree with that proposal wholeheartedly. It creates one additional simple parse case for fixed lines (that will not be truncated to N characters if they go beyond). But I do not think that implementing that additional parse case adds any major work or complexity to clients.

I am of the opinion that link lines _should_ wrap. My reason for this is that many people may want to do URL-only links:

=> gemini://this.is.a/url/to/some/place_or_other?it-has-a-data-component-that-exceeds-the-limit-of-characters

It is reasonable to want to see the whole link before following it. So, I am in favor of:

1. Regular text and links both wrap to n (tbd) characters
2. A leading space creates a non-wrapping line that will be truncated at n (tbd) characters if it exceeds that number of chars. Clients can optionally add an ellipsis.

One thing to consider with wrapping or truncating links is that many terminal based clients will implement links with numbers. Given that those numbers will be based on the number of links on a page, it will be difficult for content authors to predict what the behavior of longer lines will be (where something might wrap or get cut) in order to make room for interface elements such as:

    => A link 13 => A link [13] A link (13) A link ( 13) => A link

Any of those are reasonable and possible ways a client can represent a link on the screen. Because how clients do this will vary, things can get unpredictable. As such, what happens at what boundary should, I believe, be a part of the spec, so that people can know what to expect. They should also expect differing clients to render things mildly differently and not rely on very long lines to always appear identically.
As a side note: I agree that accessibility is a solid concern.
On 9/5/19 9:17 PM, Brian Evans wrote:

> 1. Regular text and links both wrap to n (tbd) characters
> 2. A leading space creates a non-wrapping line that will be truncated at n (tbd) characters if it exceeds that number of chars. Clients can optionally add an ellipsis.

I can't believe how simple that is. A leading space for fixed content is ridiculously simple. I love it.

Regarding accessibility, it's hard without metadata, but even so at this point we would have three distinctions:

1. Links (which could be announced as such to screen readers)
2. Regular text (which would be read as body)
3. Fixed formatting via leading space (blocks of which could be announced to screen readers and optionally read or skipped)

That's really not too bad from an accessibility POV for such a minimal protocol.
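The three distinctions above can be sketched in a few lines of Python. This is only an illustration of the leading-space proposal as discussed so far; the names `classify` and `blocks`, and the extra "blank" category for empty lines, are assumptions of the sketch rather than anything agreed in the thread.

```python
from itertools import groupby


def classify(line):
    """Return 'link', 'blank', 'fixed' or 'text' for one line (hypothetical helper)."""
    if line.startswith("=>"):
        return "link"
    if line.strip() == "":
        return "blank"
    if line[0] in " \t":
        return "fixed"  # leading whitespace marks a fixed line
    return "text"


def blocks(lines):
    """Group consecutive lines of the same kind, e.g. so that a whole fixed
    block could be announced once to a screen reader and then read or skipped."""
    return [(kind, list(group)) for kind, group in groupby(lines, key=classify)]


if __name__ == "__main__":
    doc = [
        "Some body text that a client",
        "is free to reflow.",
        "",
        "  * a bullet kept as-is",
        "  * another bullet",
        "=> gemini://example.org/ An example link",
    ]
    for kind, group in blocks(doc):
        print(kind, len(group))
```

The whole classifier is one prefix test per line, which is the simplicity being praised here; the cost is that the distinction between fixed and flowed text rests on whitespace the reader may not be able to see.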
On Thu, Sep 5, 2019 at 3:04 PM Sean Conner <sean at conman.org> wrote:

> Or how about: lines that start with '=>' are links and are handled differently, lines starting with a ' ' are fixed (but should be less than N characters in length), and lines starting with any other character should be reflowed until a blank line, a line starting with '=>' or whitespace.
>
>     So this wouldn't be reflowed.
> But this would
> reflow because it doesn't start with a whitespace
> => example:foo link text is handled differently.

I'm not a fan of using significant whitespace in markdown languages. Spot the difference between these two lines. One should be fixed width and one should be reflowed:

Beyond that, I can't help but feel that this would lead to the worst of both worlds. If I drop down to fixed width lines (say N=70) because I want to add a bullet list to my page, now clients must render at least 70 columns in order to accurately view my content. So what good does having other sections of reflowed text do for me at that point?

I do think that preformatted text blocks have their place in markdown languages for simple things like code snippets and ASCII art. The kind of stuff that you can cut off and it doesn't affect the readability of the page. But you can't extend them to replace all of the page formatting. In my opinion this needs to be an all-or-nothing decision:

1. Fixed-width text with no special syntax
2. Reflowed text with some simple flavor of markdown for styling
Michael writes:

> In my opinion this needs to be an all-or-nothing decision:
>
> 1. Fixed-width text with no special syntax
> 2. Reflowed text with some simple flavor of markdown for styling

In #1 above is the idea that there would be no wrapping and any text that went over the width limit would be truncated? I can see that being a bad user experience if someone's device does not conform easily to the size that is chosen. It is also annoying to enforce. A perfectly fine document would not be readable because its lines go to 60 instead of 45? I am fine with a fixed width, but only if wrapping or reflowing is available and declarable by the content author.

- - -

In #2 is reflowed what we are aiming for or is wrapped what we are aiming for? Both terms have been used during the ongoing conversation and I have lost track of which one people are preferring and how each person defines them.

Example:

In this example I am writing a long line and then a newline and a shorter line.

Reflow (ignores whitespace beyond single spaces):

In this example I am writing a long line and then a newline and a shorter line.

Wrap (recognizes whitespace and just wraps the long line):

In this example I am writing a long line and then a newline and a shorter line.

- - -

I am definitely still open to the idea of a simple markdown flavor, which would potentially make some of this conversation moot. That again ups the client complexity a good amount though so it would have to be **very** simple.

- - -

Tomasino writes:

> I can't believe how simple that is. A leading space for fixed content is ridiculously simple. I love it.

Right? Nice and easy. Given that fixed content will likely be more rare than wrapped or flowed content it seems like an elegant solution.
It was thus said that the Great Michael Lazar once stated: > On Sun, Sep 1, 2019 at 7:05 AM solderpunk <solderpunk at sdf.org> wrote: > > Point ii) above is not going to change any time soon, in fact it's only > > going to become increasingly common as time goes on, whether the > > retrogrouches amongst us like it or not. But point i) is just an > > arbitrary convention and so entirely mutable. > > > > This raises the question: is there a number of characters at which > > text/gemini content could be, by convention, wrapped by the author such > > that it displays nicely on a majority of mobile devices whilst not > > looking ridiculous on real displays? > > I think this is a valid idea that is worth seriously considering. I actually > went through the same train of thought about a week ago, but I held back from > posting because wasn't entirely sure how I felt about it, and I'm > simultaneously working on a proposal for a simplified markdown dialect for > gemini [1]. > > [1] gemini://mozz.us/markdown/ I read through those, and it appears that you generate an AST from the markdown-like text, then run through that AST to generate HTML. Is there code to generate text output, or is that just the file itself? -spc
It was thus said that the Great Brian Evans once stated:
>
> Tomasino writes:
> > I can't believe how simple that is. A leading space for fixed content is ridiculously simple. I love it.
>
> Right? Nice and easy. Given that fixed content will likely be more rare than wrapped or flowed content it seems like an elegant solution.

How about a full sample?

-spc (This entire email fits the specification below ... )

    A Proposed Formatting Specification
    for Gemini Index files.
    by Sean Conner

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

=> https://www.ietf.org/rfc/rfc2119.txt [RFC2119] BCP 14
=> https://www.ietf.org/rfc/rfc8174.txt [RFC8174] (update)

A Gemini index file, regardless of character encoding [1], shall only consist of the space character [2], graphic characters [3] and a limited set of control characters out of the C0 set [4]; the C1 control set [5] is outright rejected and MUST NOT appear in a Gemini index file. The following C0 control set characters are allowed:

    HT (character 9) Horizontal Tab. Classified as "whitespace"
    LF (character 10) Line Feed. Classified as "end of line marker"
    CR (character 13) Carriage Return. Classified as "end of line marker"
    SP (character 32) Space. Classified as "whitespace"

Any other C0 control character MUST NOT appear in a Gemini index file. Characters not defined as "end of line marker" or "whitespace" are considered, per this specification, to be "graphical characters".

The two characters LF and CR MUST appear in that order in a Gemini index file. It is unspecified (at this time) what should happen if a single LF or CR is encountered. Both characters together constitute the "end of line marker".
It is also unspecified (at this time) what should happen if a C0 control character not listed above, or a C1 control character is encountered in a Gemini index file.

A "line of text" is any sequence of "whitespace" and "graphical characters" followed by an "end of line marker".

A "line of text" that starts with the character sequence => is considered a "link line" and contains a link to another document. The BNF [RFC5234] for a "link line" is:

    link = mark WSP url [ WSP text ] CRLF
    mark = "=>"
    url  = %x21-7E ; see [RFC3986] for syntax
    text = %x20-FF ; see [RFC3629] for format
    CR   = %x0D
    LF   = %x0A
    CRLF = CR LF
    SP   = %x20
    HTAB = %x09
    WSP  = SP / HTAB

=> https://www.ietf.org/rfc/rfc5234 [RFC5234] BNF syntax
=> https://www.ietf.org/rfc/rfc3986 [RFC3986] URL syntax
=> https://www.ietf.org/rfc/rfc3629 [RFC3629] UTF-8 format

For maximum interoperability, the text portion (if present) should be at most 40 characters in length; if longer, it is up to the client to handle it as it sees fit. It MAY "wrap", it SHOULD "reflow", it MAY "cut off" the text. If the text portion doesn't appear, then the URL MUST be displayed as the text portion, subject to the same limitations just mentioned.

To "wrap" text, once the text has reached the right edge of the screen [6], the text resumes at the left edge, even if it cuts a word in half. Upon encountering an "end of line marker", move to the next line. For example, to "wrap" the following paragraphs:

"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum mauris leo, condimentum vitae varius at, elementum sed odio. Sed commodo felis lacinia blandit vestibulum. Duis vel sagittis massa. Maecenas sodales dui tristique velit luctus tincidunt a sit amet neque. Sed vitae velit in sapien semper accumsan. Nulla sem odio, malesuada a viverra at, tristique eu tortor.

"Quisque auctor porta enim, eget tincidunt augue cursus non. Nulla at condimentum purus. Curabitur maximus malesuada risus, at ultrices nisl luctus vel.
Nulla eget est luctus, dignissim urna vel, luctus felis. Donec facilisis malesuada porta. Nulla elementum felis ut justo sollicitudin pellentesque. Vestibulum faucibus, ipsum tincidunt volutpat lacinia, turpis libero bibendum sem, in malesuada turpis lectus et neque." with a width of 30 characters: 123456789012345678901234567890 ------------------------------ "Lorem ipsum dolor sit amet, c onsectetur adipiscing elit. Ve stibulum mauris leo, condiment um vitae varius at, elementum sed odio. Sed commodo felis la cinia blandit vestibulum. Duis vel sagittis massa. Maece nas sodales dui tristique veli t luctus tincidunt a sit amet neque. Sed vitae velit in sapi en semper accumsan. Nulla sem odio, malesuada a viverra at, tristique eu tortor. "Quisque auctor porta enim, eg et tincidunt augue cursus non. Nulla at condimentum purus. C urabitur maximus malesuada ris us, at ultrices nisl luctus ve l. Nulla eget est luctus, dign issim urna vel, luctus felis. Donec facilisis malesuada port a. Nulla elementum felis ut ju sto sollicitudin pellentesque. Vestibulum faucibus, ipsum ti ncidunt volutpat lacinia, turp is libero bibendum sem, in mal esuada turpis lectus et neque. " To "reflow" text, lines are broken at whitespace [7], where an "end of line marker" is placed to start the next line, and any existing "end of line markers" are ignored unless there are two in a row. For example, the two example paragraphs "reflowed" at 30 characters: 123456789012345678901234567890 ------------------------------ "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum mauris leo, condimentum vitae varius at, elementum sed odio. Sed commodo felis lacinia blandit vestibulum. Duis vel sagittis massa. Maecenas sodales dui tristique velit luctus tincidunt a sit amet neque. Sed vitae velit in sapien semper accumsan. Nulla sem odio, malesuada a viverra at, tristique eu tortor. "Quisque auctor porta enim, eget tincidunt augue cursus non. Nulla at condimentum purus. 
Curabitur maximus malesuada risus, at ultrices nisl luctus vel. Nulla eget est luctus, dignissim urna vel, luctus felis. Donec facilisis malesuada porta. Nulla elementum felis ut justo sollicitudin pellentesque. Vestibulum faucibus, ipsum tincidunt volutpat lacinia, turpis libero bibendum sem, in malesuada turpis lectus et neque."

If a suitable breaking point cannot be found (no whitespace or "end of line markers" found), then the line MUST be wrapped to the next line, at which the "reflow" algorithm is picked up again. An example of a paragraph that exhibits such behavior, again at 30 characters:

"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum mauris leo, condimentumvitaevariusat,elementum sed odio. Sed commodo felis lacinia blandit vestibulum. Duis vel sagittis massa. Maecenas sodales dui tristique velit luctus tincidunt a sit amet neque. Sed vitae velit in sapien semper accumsan. Nulla sem odio, malesuada a viverra at, tristique eu tortor."

    123456789012345678901234567890
    ------------------------------
    "Lorem ipsum dolor sit amet,
    consectetur adipiscing elit.
    Vestibulum mauris leo,
    condimentumvitaevariusat,eleme
    ntum sed odio. Sed commodo
    felis lacinia blandit
    vestibulum. Duis vel sagittis
    massa. Maecenas sodales dui
    tristique velit luctus
    tincidunt a sit amet neque.
    Sed vitae velit in sapien
    semper accumsan. Nulla sem
    odio, malesuada a viverra at,
    tristique eu tortor."

To "cut off", any characters past the right edge of the screen are simply discarded until the next "end of line marker":

123456789012345678901234567890 ------------------------------ "Lorem ipsum dolor sit amet, c leo, condimentum vitae varius lacinia blandit vestibulum. Duis vel sagittis massa. Maec tristique velit luctus tincidu sapien semper accumsan. Nulla eu tortor. "Quisque auctor porta enim, eg condimentum purus. Curabitur luctus vel. Nulla eget est lu facilisis malesuada porta. Nu pellentesque.
Vestibulum fauc libero bibendum sem, in malesu

A "line of text" that starts with one or more "whitespace" characters, followed by "graphical characters" is a "fixed line" and MUST NOT be "reflowed"; it MAY be "wrapped" or it MAY be "cut off". For maximum interoperability, such "fixed" lines SHOULD be 40 characters or less. The BNF for a "fixed" line:

    fixed = 1*WSP VCHAR *text CRLF
    VCHAR = %x21-FF ; see [RFC3629] for format

=> https://www.ietf.org/rfc/rfc3629 [RFC3629] UTF-8 format

It is an ambiguous condition when a line consists of only whitespace characters, and such a line SHOULD NOT appear in a Gemini index file.

A "line of text" that does not start with whitespace or the character sequence => is subject to being "reflowed" until two consecutive "end of line markers" are encountered.

It should be noted that this document follows the format given above, with no fixed line longer than 40 characters and no link text longer than 40 characters. The rest of the text can be reflowed at any given width. This should give a feeling for what such a document would look like.

    * * * * *

[1] An assumption is being made that any character encoding system used is based on US-ASCII, which defines the first 128 characters.

[2] US-ASCII character 32. It is considered both a control character (a finer grained separator alongside the unit separation characters FS (file separator), GS (group separator), RS (record separator) and US (unit separator)) and a graphic character, despite not having a graphical representation. For UTF-8, this will also include the variations on white space, such as thin spacing, zero-width spacing, etc.

[3] Any character with a visual representation, or as part of a visual representation.

[4] Characters 0 through 31.

[5] The so-called ANSI escape codes. See Wikipedia for more information.

=> https://en.wikipedia.org/wiki/C0_and_C1_control_codes#C1_controls C1 Set

[6] This assumes a "left-to-right" ordering of characters.
Other orderings of rendering text is out of scope for this document. [7] An ambitious implementation may want to break at a dash (-) or a soft hyphen (UTF-8 character \u00AD).
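The line rules above can be sketched in a few lines of Python. This is only one illustrative reading of the draft, not a reference implementation: the function names (classify, cut_off, reflow_to_width) are made up for this example, and "whitespace" is assumed to mean space or tab.

```python
def classify(line: str) -> str:
    """Classify one line (end-of-line marker already removed)."""
    if line.startswith("=>"):
        return "link"            # link lines are never reflowed
    if line == "":
        return "blank"           # second of two consecutive end-of-line markers
    body = line.lstrip(" \t")
    if body == "":
        return "ambiguous"       # whitespace only: SHOULD NOT appear
    if len(body) < len(line):
        return "fixed"           # leading whitespace: MUST NOT be reflowed
    return "flow"


def cut_off(line: str, width: int) -> str:
    """Discard everything past the right edge of the screen."""
    return line[:width]


def reflow_to_width(text: str, width: int) -> list:
    """Break at whitespace; force a break when a word cannot fit on a line."""
    lines, current = [], ""
    for word in text.split():
        while True:
            candidate = word if not current else current + " " + word
            if len(candidate) <= width:
                current = candidate
                break
            if current:
                # The word does not fit after the current text: start a new line.
                lines.append(current)
                current = ""
                continue
            # No breaking point available: wrap the word at the width.
            lines.append(word[:width])
            word = word[width:]
    if current:
        lines.append(current)
    return lines
```

Feeding the long-word sentence from the example above to reflow_to_width at 30 columns gives the same forced break after "eleme". Breaking at dashes or soft hyphens, as footnote [7] suggests, is left out of this sketch.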
From: Sean Conner <sean@conman.org> The two characters LF and CR MUST appear in that order in a Gemini index file. It is unspecified (at this time) what should happen if a single LF or CR is encountered. Both characters together constitute the "end of line marker". ------ Was there an earlier discussion on this point? That looks lifted from the gopher RFC, but is there a reason why it's applying to Gemini? It seems quite specific and arbitrary to require both these days. Did someone have a good justification already?
It was thus said that the Great James Tomasino once stated:
> From: Sean Conner <sean at conman.org>
> > The two characters LF and CR MUST appear in that order in a Gemini index
> > file. It is unspecified (at this time) what should happen if a single LF or
> > CR is encountered. Both characters together constitute the "end of line
> > marker".
>
> Was there an earlier discussion on this point? That looks lifted from the
> gopher RFC, but is there a reason why it's applying to Gemini? It seems
> quite specific and arbitrary to require both these days. Did someone have
> a good justification already?

Well, the use of CRLF is mentioned several times in the Gemini spec:

=> gopher://zaibatsu.circumlunar.space/0/~solderpunk/gemini/spec-spec.txt

(specifically sections 1.2, 1.3.1 and 1.3.5.2). It doesn't have the language MUST, SHOULD, etc, though. Also, CRLF is an Internet standard for transferring text (you'll see it in many other specifications, like SMTP and HTTP), so that's why I added that language.

I have code (used in both my gopher and gemini clients) that can deal with either a CRLF (Internet standard; Windows text files) or LF (Linux, Mac OS-X), but I think it might be easier on clients to just look for both (my code will not deal with just a CR or an LFCR for instance, not that I've come across any text with that format in a long time [1]).

-spc (Didn't expect CRLF to be a point of contention ... )

[1] I know Classic Mac (pre OS-X) and some older 8-bit computers used CR only.
On Thu, Sep 5, 2019 at 10:53 PM Brian Evans <b__m__e at mailfence.com> wrote: > Michael writes: > > In my opinion this > > needs to be an all-or-nothing decision: > > > > 1. Fixed-width text with no special syntax > > 2. Reflowed text with some simple flavor of markdown for styling > > In #1 above is the idea that there would be no wrapping and any > text that went over the width limit would be truncated? Yes, in my experience this is how most gopher clients do it. You can't assume that somebody's ASCII banner, or markdown table, or whatever custom formatting they use is going to look correct if it is reflowed or wrapped. So the safest thing to do is preserve the original line breaks and just display as many columns as you can fit in your display. > I can see that being a bad user experience if someone's device > does not conform easily to the size that is chosen. It is also > annoying to enforce. A perfectly fine document would not be > readable because their lines go to 60 instead of 45? I totally agree with you that it's a bad experience. That is why I support experimenting with making the "standard" width something like 45 characters, so that it reasonably fits in most types of displays. This would only be enforced for documents of type "text/gemini" that are specifically designed to be consumed over gemini. I'm not proposing that we dictate how servers or clients handle other types of text documents. > In #2 is reflowed what we are aiming for or is wrapped what we > are aiming for? Both terms have been used during the ongoing > conversation and I have lost track of which one people are > preferring and how each person defines them. I thought we were discussing text reflowing, as defined in section 1.3.5.3 of the gemini speculative spec. Apologies if I missed the mark on this. This thread has grown so large now that I feel like conversations are starting to go around in circles.
On Fri, Sep 6, 2019 at 1:33 AM Sean Conner <sean at conman.org> wrote:
> I read through those, and it appears that you generate an AST from the
> markdown-like text, then run through that AST to generate HTML. Is there
> code to generate text output, or is that just the file itself?
>
> -spc

The design_document.txt is the original markdown-like document that I wrote. As you suspected, here's what I ran to generate the other files:

$ ./parse.py design_document.txt > design_document.json
$ ./render.py design_document.json > design_document.html

I didn't include anything that can take the AST and render it as plain text. I suppose it would be pretty trivial to write though. That could be something that a TUI gemini client might utilize to generate a "fancy" text display like colorizing links or making title elements bold + centered.
> Tomasino writes:
> > I can't believe how simple that is. A leading space for fixed content is
> > ridiculously simple. I love it.
>
> Right? Nice and easy. Given that fixed content will likely be more rare
> than wrapped or flowed content it seems like an elegant solution.

I'm trying to be cautious and avoid jumping to a snap decision, but: wow! I think I'm more enthused about this idea than anything else which has been proposed. It's wonderfully simple, even more so than the idea which was brought up much earlier about using spaces at the *end* of the line. It is also vaguely compatible with some flavours of Markdown, where code blocks are indicated with four leading spaces - Markdown documents formatted in this way would be (in that respect) correctly rendered by Gemini clients implementing this proposal.

Bulleted lists arguably look even nicer when formatted with a leading space:

 * The first item
 * A second item
 * Item the third

I acknowledge that this solution is not perfect and e.g. won't gracefully handle bulleted lists with items which are more than one line long (not sure how big a problem this is in practice - the longer each list item gets, the more perfectly fine the whole list looks, IMHO, if each line is formatted as a paragraph of its own), but I think by now it should be obvious that there is no hope of achieving a perfect solution covering all situations while still keeping the implementation effort very low.

This proposal gets us an awful lot of bang for the buck. For use cases where it just won't cut the mustard, it might be time to consider serving text/markdown over Gemini or, heck, even text/html over HTTP via Shizaru or a similarly non-evil server. Gemini won't - can't - be the ideal tool for every job.

It might be a good idea for people to start experimenting with implementing this, so we can get an informed perspective on just how hard to implement this *really* is, and on how certain kinds of content render.

-Solderpunk
Sean Conner <sean at conman.org> writes:
> (specifically sections 1.2, 1.3.1 and 1.3.5.2). It doesn't have the
> language MUST, SHOULD, etc, though. Also, CRLF is an Internet standard for
> transferring text (you'll see it in many other specifications, like SMTP and
> HTTP), so that's why I added that language.

IMO, it makes sense to require CRLF in the plain text parts of the protocol (after requests, after the status line of a response), but I don't think that the text/gemini file format needs to have CR/LF; IMO clients should be prepared to accept either LF or CR/LF just as they would with text/plain. And maybe if we're serious about supporting old devices, clients should be prepared for bare CR, too (Classic MacOS). But it's a pain in the arse to authors to have to save text documents with non-native line endings, and I don't feel like servers need to be in the business of reformatting the content they serve.

--
Jason McBrayer         | "Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                       | but stranger still is lost Carcosa."
                       | -- Robert W. Chambers, The King in Yellow
solderpunk <solderpunk at SDF.ORG> writes:
> This proposal gets us an awful lot of bang for the buck. For use cases
> where it just won't cut the mustard, it might be time to consider
> serving text/markdown over Gemini

Yeah. I think this is probably the least bad of the really simple solutions. It keeps it simpler than using even Text Junior, but makes the main use cases easy to handle.

> It might be a good idea for people to start experimenting with
> implementing this, so we can get an informed perspective on just how
> hard to implement this *really* is, and on how certain kinds of
> content renders.

I've not progressed on Voskhod, because of being sick and just busy at home, but I'll probably implement these wrapping rules for text/gemini before I implement support for text/markdown, once I get past fighting with the build system to be able to debug stuff. It would be great for Julien to have a go at implementing this in Asuka, since that's as far as I know the only currently existing client that has an idea of its screen width.

One question: should non-wrapped lines be rendered in a fixed-width font, in a client that is capable of using both fixed-width and proportional fonts?

--
Jason McBrayer         | "Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                       | but stranger still is lost Carcosa."
                       | -- Robert W. Chambers, The King in Yellow
It was thus said that the Great solderpunk once stated: > > At > gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/files/text-wrapping-experiment.txt > I have some text wrapped at 70, 60, 50, 40, 30 and finally 35 columns. > Using Pocket Gopher on my phone, I can read text wrapped at 70 columns > no worries if I hold the phone in landscape orientation. In portrait I > need 35, which admittedly doesn't look quite right on my laptop screen. > But I think my phone screen is atypically small, and I wonder if maybe > 40 would be workable for a lot of people? Feedback from actual device > users very welcome... I just added something similar to my server: gemini://gemini.conman.org/test/testwrap.gemini It has links to texts of various widths. You can test other widths by going to: gemini://gemini.conman.org/test/wrap;77 Replace the number at the end with your preferred width (keep the semicolon though), and text will wrap at that point. The lower limit is 1 and there's no upper limit. That happens to be the only page where you can specify the width, by the way. -spc
It was thus said that the Great Jason McBrayer once stated:
> Sean Conner <sean at conman.org> writes:
>
> > (specifically sections 1.2, 1.3.1 and 1.3.5.2). It doesn't have the
> > language MUST, SHOULD, etc, though. Also, CRLF is an Internet standard for
> > transferring text (you'll see it in many other specifications, like SMTP and
> > HTTP), so that's why I added that language.
>
> IMO, it makes sense to require CRLF in the plain text parts of the
> protocol (after requests, after the status line of a response), but I
> don't think that the text/gemini file format needs to have CR/LF; IMO
> clients should be prepared to accept either LF or CR/LF just as they
> would with text/plain. And maybe if we're serious about supporting old
> devices, clients should be prepared for bare CR, too (Classic MacOS).
> But it's a pain in the arse to authors to have to save text documents
> with non-native line endings, and I don't feel like servers need to be
> in the business of reformatting the content they serve.

I can live with that. Just note that lines can end in one of three possible ways:

    CR LF   Windows
    LF      Unix (which includes Linux and Mac OS-X)
    CR      vanishingly small set of systems (classic Mac)

I don't think I've ever come across the order LF CR.

-spc
> IMO, it makes sense to require CRLF in the plain text parts of the > protocol (after requests, after the status line of a response), but I > don't think that the text/gemini file format needs to have CR/LF; IMO > clients should be prepared to accept either LF or CR/LF just as they > would with text/plain. And maybe if we're serious about supporting old > devices, clients should be prepared for bare CR, too (Classic MacOS). > But it's a pain in the arse to authors to have to save text documents > with non-native line endings, and I don't feel like servers need to be > in the business of reformatting the content they serve. I will admit that the current liberal use of CRLF throughout the Gemini spec is the result of me blindly copying from Gopher and other RFCs (as Sean mentioned, it's ubiquitous). My initial response to what you wrote above is that it makes an awful lot of sense. And, in fact, is probably a lot closer to what extant clients are actually doing. Does anybody want to make a strong counterargument, that CRLF should be strictly required in text/gemini? If not I'll update the spec-spec. -Solderpunk
It was thus said that the Great solderpunk once stated:
> > IMO, it makes sense to require CRLF in the plain text parts of the
> > protocol (after requests, after the status line of a response), but I
> > don't think that the text/gemini file format needs to have CR/LF; IMO
> > clients should be prepared to accept either LF or CR/LF just as they
> > would with text/plain. And maybe if we're serious about supporting old
> > devices, clients should be prepared for bare CR, too (Classic MacOS).
> > But it's a pain in the arse to authors to have to save text documents
> > with non-native line endings, and I don't feel like servers need to be
> > in the business of reformatting the content they serve.
>
> I will admit that the current liberal use of CRLF throughout the Gemini
> spec is the result of me blindly copying from Gopher and other RFCs (as
> Sean mentioned, it's ubiquitous). My initial response to what you wrote
> above is that it makes an awful lot of sense. And, in fact, is probably
> a lot closer to what extant clients are actually doing.
>
> Does anybody want to make a strong counterargument, that CRLF should be
> strictly required in text/gemini? If not I'll update the spec-spec.

As mentioned, I think the request and the status response should have the CRLF, but the content, even if text/gemini, should use whatever line endings are easier/standard for the content creator. So clients should expect text/* types with endings of CRLF or LF (CR will probably be rare, and I've never encountered LFCR).

-spc
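A client following the position above might split text/* content roughly like this. It is a minimal sketch, not code from any of the clients mentioned in this thread; the name split_lines is hypothetical, and treating a bare CR as a line ending is the optional "old devices" case.

```python
import re

# Try CRLF first so that a CR LF pair counts as one line ending, not two.
_EOL = re.compile(r"\r\n|\n|\r")


def split_lines(text: str) -> list:
    """Split text on CRLF, bare LF, or bare CR."""
    lines = _EOL.split(text)
    # A trailing line ending produces one empty final element; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

With this approach an LF CR sequence is read as two line endings, so it yields an empty line in between. That seems a tolerable result for a sequence nobody here reports having met in practice.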
It was thus said that the Great solderpunk once stated:
>
> I've started a test client which you provide a hostname and it throws
> various requests at that host and checks the response status against its
> expectations. It's not quite ready for prime time yet, though.

How's the test client coming along?

> Although, it's already made me realise something that we haven't specced
> any behaviour for at all: how should a server respond to an empty
> request? i.e. just CRLF. Is this invalid, such that it should trigger
> a 59 response? Some servers do this, but others seem to treat it as a
> request for the root document.

Oh, another question I have. My server treats that as an error, and I think it should be an error: while not all servers currently support multiple sites, it is possible, and without a URL to give the server name, there's no way a multi-site server will know what to serve up.

-spc
> I can't believe how simple that is. A leading space for fixed content is > ridiculously simple. I love it. > I acknowledge that this solution is not perfect and e.g. won't gracefully > handle bulleted lists with items which are more than one line long We could avoid this problem if we treat lines starting with spaces as a way to force a line break. This would cause code to not wrap, as desired, and it allows for wrapping lists simply by not putting a space in the 2nd+ lines. Example:
I've been thinking about this for a while. What's the purpose of wrapping line breaks in the first place? Why not just have super long lines for text that should be reflowed (then *soft*-wrap in the text editor)? I understand that reflowing could be complex for simple clients, but couldn't they just rely on their display mechanism's (e.g. a terminal's) wrapping system? I imagine that the answer is very obvious. I just want to make sure that I understand before spewing any more opinions :-)
On 12/15/19 4:49 AM, Aaron Janse wrote:
> What's the purpose of wrapping line
> breaks in the first place?

There's a lot of examples further back in this chain. In short:

- code snippets
- poetry
- ascii art
- tables
On Sun, Dec 15, 2019, at 4:33 PM, James Tomasino wrote:
> On 12/15/19 4:49 AM, Aaron Janse wrote:
> > What's the purpose of wrapping line
> > breaks in the first place?
>
> There's a lot of examples further back in this chain. In short:
>
> - code snippets
> - poetry
> - ascii art
> - tables

But none of these examples require reflowing text in the source code, right? In markdown, if you want text to reflow, you just let it go past 80 columns, or whatever the viewport width is. So this entire paragraph would have no newlines in its source code.

    fn main() {
        print!("But none of this text reflows because ");
        println!("it's less than the viewport width.");
    }
On 12/15/19 6:12 PM, Aaron Janse wrote:
> But none of these examples require reflowing text in the source code, right?
> In markdown, if you want text to reflow, you just let it go past 80 columns,
> or whatever the viewport width is. So this entire paragraph would have no
> newlines in its source code.
>
>     fn main() {
>         print!("But none of this text reflows because ");
>         println!("it's less than the viewport width.");
>     }

Source code can be longer than 80 columns. Also, some clients may not want to use 80 column widths. We have mobile gopher clients favoring 40 columns.
On Sun, Dec 15, 2019, at 7:32 PM, James Tomasino wrote:
> On 12/15/19 6:12 PM, Aaron Janse wrote:
> > But none of these examples require reflowing text in the source code, right?
> > In markdown, if you want text to reflow, you just let it go past 80 columns,
> > or whatever the viewport width is. So this entire paragraph would have no
> > newlines in its source code.
> >
> >     fn main() {
> >         print!("But none of this text reflows because ");
> >         println!("it's less than the viewport width.");
> >     }
>
> Source code can be longer than 80 columns. Also, some clients may not
> want to use 80 column widths. We have mobile gopher clients favoring 40
> columns.

Correct. I should never have said "80 columns." Instead, I should have said "viewport width" each time.

The problem with wrapping code when it gets wider than the viewport, as said previously on this mailing list, is that it breaks semantic meaning, especially for languages like Python.

One alternative for clients is behaving like `less`: truncate code horizontally and require the viewport to be widened in order to select/copy code with the cursor. This doesn't seem much easier to copy code from than simply widening the viewport until code is no longer wrapped.

Or, clients take the approach of Github on mobile: truncate code visually but allow it to be properly copied graphically. But, couldn't this work for wrapping code, too? Most graphical text editors will show long lines of code as wrapped but, when copied, long lines preserve their original line breaks.

Maybe I'm missing something.
I believe, once again, we are wading into territory where wrapping and reflowing are being conflated.

I do not know of any client that scrolls horizontally and I do not imagine people will begin writing them since they would have to code that from scratch rather than let their terminal handle wrapping and scrolling. So wrapping seems like it will be a part of most clients.

Wrapping, for our purposes, can be thought of as adding newlines to text to fit a width (I agree that viewport/termwidth makes sense in most situations). Reflow seems to be the opposite of wrapping: the removal of newlines and more than one space character in a row. Reflowing text necessarily creates long lines of text... that will then be wrapped.

I definitely want to find a way to make the distinction in text/gemini documents. At present, wrapping is the default that is happening for the vast majority of clients. As such, I remain of the position that wrapped content should remain the default and we should add a syntax of some form or other to declare reflowed text. I am of the opinion that this syntax should include exclusively visible characters (even if a field separator or some other mostly unused ascii char would be convenient in some respects).

Lists tend to get brought up in this conversation a good bit. The way I see it, lists work fine in the current, wrapped, geminispace. Yes, a line may wrap, but it is clear that it is still part of the above line since a new list indicator has not been seen (1, -, <bullet>, etc).

Wrapping does make Python a pain to look at... but a user can always save the document and open it in any editor they like. I would be surprised to see Python code in a text/gemini file anyway as that filetype certainly can, but is not meant to, provide code and will be predominantly used for other things. Users can always link to code files that will get rendered in whatever reader they have configured for such a purpose.
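The two operations as defined in this message can be sketched as follows. This is an illustration of the definitions only, under the assumption that a "paragraph" has already been isolated; the names reflow_paragraph and wrap are hypothetical, and textwrap is Python's standard library module.

```python
import re
import textwrap


def reflow_paragraph(text: str) -> str:
    """Reflow: remove newlines and collapse runs of whitespace to one space."""
    return re.sub(r"\s+", " ", text).strip()


def wrap(line: str, width: int) -> list:
    """Wrap: add line breaks so that no line exceeds the given width."""
    return textwrap.wrap(line, width=width)
```

Reflowing first and wrapping second is what turns hard-wrapped source into text that fits an arbitrary viewport. Wrapping alone keeps the author's line breaks and only adds new ones.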
On Sun, Dec 15, 2019, at 5:53 PM, Brian Evans wrote:
> I believe, once again, we are wading into territory where wrapping
> and reflowing are being conflated.

Wow. I absolutely did make this mistake. Thank you for clarifying.

The speculative spec states:

> In order to facilitate a comfortable user experience with simple
> clients which do not implement reflowing, authors SHOULD limit the
> width of lines to 78 characters, excluding CRLF pairs.

I got so caught up in the false dichotomy of either reflowing or 80-column text that I never questioned why source code is hard-wrapped in the first place.

> At present, wrapping is the default that is happening for the vast majority
> of clients

I agree. The spec's justification for suggesting that source code is hard-wrapped is the idea that it makes life easier for simple clients. And yet:

1. It's much easier for simple clients to wrap than to reflow
2. Afaik, most simple clients won't even need to worry about wrapping because it will already be handled by their output medium (e.g. terminal, browser).

From a while ago:

> Text reflow IS A MUST. We live in a world of wildly varying screen sizes
> these days. The trend is clearly headed towards polarization: huge 22'+
> desktop screens in one corner, small less than 7' smartphone screens in the
> other. Nothing in the middle. There's no way you're going to get decent
> presentation of non reflown text on both kinds of displays. No matter how
> many kludgy rules you instate on default width, max before forced wrap, etc.

All of this is completely possible without reflow. All we need to do is to state in the spec that text MUST NOT be hard-wrapped at 80 columns unless it's supposed to be displayed that way.

tl;dr If the spec is changed to discourage hard-wrapping source code, there would be no need for reflow in order to have text fit the width of a client's screen. This would make life much easier for both authors and readers using the gemini protocol.
Pardon my ignorance if the above is what everyone else was already thinking.
It was thus said that the Great Aaron Janse once stated:
> On Sun, Dec 15, 2019, at 5:53 PM, Brian Evans wrote:
>
> From a while ago:
> > Text reflow IS A MUST. We live in a world of wildly varying screen sizes
> > these days. The trend is clearly headed towards polarization: huge 22'+
> > desktop screens in one corner, small less than 7' smartphone screens in the
> > other. Nothing in the middle. There's no way you're going to get decent
> > presentation of non reflown text on both kinds of displays. No matter how
> > many kludgy rules you instate on default width, max before forced wrap, etc.
>
> All of this is completely possible without reflow. All we need to do is to
> state in the spec that text MUST NOT be hard-wrapped at 80 columns unless it's
> supposed to be displayed that way.

How can you determine if text is text or source code? This is text, and this

  local opts =
  {
    { "c" , "cert"     , true  , function(c) CERT  = c    end },
    { "k" , "key"      , true  , function(k) KEY   = k    end },
    { "n" , "noverify" , false , function()  NOVER = true end },
    { 'h' , "help"     , false , function()
        io.stderr:write(string.format(usage,arg[0]))
        os.exit(false,true)
      end
    },
  }

  if #arg == 0 then
    io.stderr:write(string.format(usage,arg[0]))
    os.exit(false,true)
  end

  URL = arg[getopt(arg,opts)]
  nfl.spawn(main,URL)
  nfl.client_eventloop()

is source code. Both are less than 80 characters in length. How do you determine when to wrap and when to reflow?

I would also like to remind others that not only have I proposed a way to designate reflow/verbatim text rendering:

gemini://gemi.dev/gemini-mailing-list/messages/000114.gmi

(under the SAME subject line, from back in September no less!) but I also have set up some reflowing tests on my Gemini site:

gemini://gemi.dev/gemini-mailing-list/messages/000127.gmi

If anyone would like to test their client for wrapping purposes, this link might be beneficial:

gemini://gemini.conman.org/test/wrap;1500

-spc (I like my bikesheds green, by the way ... )
On Sun, Dec 15, 2019, at 9:40 PM, Sean Conner wrote: > It was thus said that the Great Aaron Janse once stated: > > All of this is completely possible without reflow. All we need to do is to > > state in the spec that text MUST NOT be hard-wrapped at 80 columns unless > > it's supposed to be displayed that way. > > How can you determine if text is text or source code? Both are less than 80 > characters in length. What I meant to say is: my proposal is to only insert newlines into the gemini source code if those newlines are expected to be preserved by the client. > How do you determine when to wrap and when to reflow? My suggestion was to wrap in the client, never in the source code, and to never reflow. > If anyone would like to test their client for wrapping purposes, this link > might be beneficial: > > gemini://gemini.conman.org/test/wrap;1500 I did see that. Your site is always the first site I test on gemini clients :-)
Okay, I have started to re-engage with this endless discussion - slowly and, I have to admit, reluctantly. When I think about how many details there are to consider here, how many different options we have to choose among, and how absolutely incredible the power-to-weight ratio is of verbatim fixed-width text with a predefined width (I mean, really, you can: Left align text, center text, even right align text without the client having to even know what those things are!), it's incredibly tempting to echo the "reflowed text be damned!" sentiment recently expressed at mozz.us[1] and spec 40 character fixed text and just move on. But I know that'd be rash. These days I have been worrying less about catering to small-screened mobile devices and thinking more about the ability for Gemini to be self-documenting. Granted, the text/gemini format is pretty darned simple by design, but nevertheless, having really nice and clear instructions for beginners, many of whom may conceivably never have written either web or gopher content before in their life, will be important to the widespread adoption of the protocol. Imagine how much harder it would be to learn HTML if websites couldn't actually contain any code that you could copy and paste! This is exactly where Gemini is today. Yes, okay, you could put a single space at the start of a link line so it would be displayed as-is rather than treated as a link, and that would be mostly fine. But it would interfere with the ability to simply copy and paste a block of example links and have it work as-is, because the spaces at the front of at least some of the lines would get copied as well. Gopher is better than the current Gemini spec in this regard, because you can put gophermap lines in an item type 0 text file no problem and they'll just be displayed as-is. But copying and pasting that gophermap is not guaranteed to go smoothly. 
With terminal-based applications, the tabs would stand a good chance of being transformed into consecutive spaces, which would actually break them.

Let's be better than that! Let's make it possible to display, copy and paste Gemini links inside of Gemini documents, to facilitate teaching and talking about Gemini over Gemini. It seems quite natural that this should be possible.

Even if text/gemini were specced at 40 fixed-width characters with no reflow, meeting this goal would require some syntax comparable to <pre> tags in HTML, to switch off processing of Gemini links. If we're going to have that anyway, we may as well have reflowed text be the default and this <pre> syntax can do double duty by also enabling non-reflowed text for source code, poetry, etc.

I remain as committed as ever to the idea that any text/gemini markup should have the property that simple clients can just dump it right to stdout verbatim and that should be very usable. As I argued previously, this rules out any kind of syntax where <pre> lines are indicated by some per-line prefix, as that prefix would break copy-and-pastability for simple clients dumping things to stdout. This is one more reason not to use the "leading whitespace" system proposed by Sean[2] (whose detailed spec remains a very nice and useful precise definition of lots of fuzzy concepts being slung around here).

So, given this, I am pretty much settled on using an easily recognisable line to toggle this mode on or off, say:

```

This is the syntax used in the simplified Markdown proposal at mozz.us[3], and in ratfactor's "Text Junior" format[4], which others on this list have argued is a good candidate for Gemini syntax.

I have experimented with supporting this syntax, and allowing reflowing of text not enclosed by ``` lines to an arbitrary user-specified width in AV-98 (not pushed to tildegit yet, but expect it soon).
It really is not too difficult to do, so I don't think complexity of implementation is a good argument against this. Does anybody know of a programming language where lines consisting only of three consecutive single quotes happen to occur frequently? To address briefly some other points raised in this thread: Somebody suggested that the ``` syntax or similar be used to toggle on reflowed text, with fixed text being the default. I have to admit this feels really backwards to me. There are many obvious and common use cases for wanting to embed small fragments of fixed text inside a document that one would otherwise want flowed. I can't think of any case where I'd want the reverse, but of course if people can I'm happy to hear them. I am not crazy about the idea of having text/gemini "source code" consist of extremely long lines of text which are then brutally wrapped to a given width by a user's terminal. Some editors might be smart enough to present that long line wrapped at word boundaries to the author, but many won't be and this will result in a very ugly editing environment. And I'm not aware of any terminal emulator which wraps long lines at word boundaries, so this will result in an ugly reading experience for *all* sized screens. I also really don't like the idea of supporting colour in Gemini documents. I see no way to do this with a syntax which would degrade gracefully when simply dumped to stdout. This would also open the possibility of obnoxious pages with gratuitous use of colour. No, thanks! Cheers, Solderpunk [1] gemini://mozz.us/journal/2019-12-05.txt [2] gemini://gemini.conman.org/gRFC/0004 [3] gemini://mozz.us/markdown/ [4] gopher://sdf.org:70/0/users/ratfactor/phlog/2019-04-21-text-junior
It was thus said that the Great solderpunk once stated:
>
> I remain as committed as ever to the idea that any text/gemini markup
> should have the property that simple clients can just dump it right to
> stdout verbatim and that should be very usable. As I argued
> previously, this rules out any kind of syntax where <pre> lines are
> indicated by some per-line prefix, as that prefix would break
> copy-and-pastability for simple clients dumping things to stdout.
> This is one more reason not to use the "leading whitespace" system
> proposed by Sean[2] (whose detailed spec remains a very nice and
> useful precise definition of lots of fuzzy concepts being slung around
> here).

Thanks. I understand the concern about copy-n-pasting (I see issues all the time on Reddit when people don't understand to prepend each line with four spaces when posting source code).

> So, given this, I am pretty much settled on using an easily
> recognisable line to toggle this mode on or off, say:
>
> ```

I have questions. Oh, do I have questions.

```
This text is on the left.
       This text is mostly centered.
               This text is on the right.
```

But now ...

````
Note that there are four ticks.
What's the expected result?
Literal text?
Or reflowed text?
```

```
This is similar to the above,
but the block ends with four ticks,
not the expected three.
````

``` What about trailing text?
Is that allowed?
What is the expected result?
```

```
```

(You should be able to handle the above as well).

> I have experimented with supporting this syntax, and allowing
> reflowing of text not enclosed by ``` lines to an arbitrary
> user-specified width in AV-98 (not pushed to tildegit yet, but expect
> it soon). It really is not too difficult to do, so I don't think
> complexity of implementation is a good argument against this.

One thing to watch out for are lines that exceed the wrapping length with no space (or dash if you are ambitious) characters to break on.
> Does anybody know of a programming language where lines consisting only > of three consecutive single quotes happen to occur frequently? I want to say Python. I don't program in Python, but I know I have seen that syntax in some lanuage. -spc > [2] gemini://gemini.conman.org/gRFC/0004
> Does anybody know of a programming > language where lines consisting only > of three consecutive single quotes happen to > occur frequently? Just a point of clarification: markdown code fences and your examples use three backticks, not single quotes
It was thus said that the Great James Tomasino once stated: > > Does anybody know of a programming > > language where lines consisting only > > of three consecutive single quotes happen to > > occur frequently? > > Just a point of clarification: markdown code fences and your examples use > three backticks, not single quotes Ah! That's it! I knew I saw *something* similar to that in Python. -spc
On Mon, Jan 13, 2020 at 11:08:58PM +0000, James Tomasino wrote:
> > Does anybody know of a programming
> > language where lines consisting only
> > of three consecutive single quotes happen to
> > occur frequently?
>
> Just a point of clarification: markdown code fences and your examples use three backticks, not single quotes
>

Whoops, right you are! I blame years and years of LaTeX for blurring the distinction in my mind with its `unusual quotation syntax'.

Cheers,
Solderpunk
On Mon, Jan 13, 2020 at 05:55:04PM -0500, Sean Conner wrote:

It seems easiest to me (and my current experimental implementation in AV-98 does this) to make the spec quite strict in this regard: verbatim mode is toggled on/off by lines consisting of precisely 3 ticks and nothing else. Which means...

> ````
> Note that there are four ticks.
> What's the expected result?
> Literal text?
> Or reflowed text?
> ```

The first 5 lines of this will be reflowed; the final line will trigger literal text below it.

> ```
> This is similar to the above,
> but the block ends with four ticks,
> not the expected three.
> ````

Literal mode is turned on by the first line but never turned off, so there will be a literal line of four ticks at the end, and whatever comes next will be literalised, too!

> ``` What about trailing text?
> Is that allowed?
> What is the expected result?
> ```

Literal mode won't be turned on until the very end, and the first three lines will be flowed.

> ```
> ```
>
> (You should be able to handle the above as well).

Hmm, I think AV-98 will handle that just fine, hang on...yep, no problems.

> One thing to watch out for are lines that exceed the wrapping length with
> no space (or dash if you are ambitious) characters to break on.

Ah, that's an annoying edge case. I guess such lines can just be broken at exactly the viewport width, possibly with a dash at the end of all but the final line?

> I want to say Python. I don't program in Python, but I know I have seen
> that syntax in some language.

I was briefly worried about Python docstrings (before Tomasino pointed out we aren't using quotes at all!), but PEP 257 recommends always using double quotes for them anyway.

Cheers,
Solderpunk
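A minimal sketch of the strict toggle rule described in the message above, run against Sean's test cases. It assumes every line has already had its line terminator stripped; `tag_strict` and the sample lists are hypothetical names used for illustration and are not taken from AV-98.

```python
def tag_strict(lines):
    """Label each displayed line 'text' or 'verbatim' under the strict rule."""
    in_verbatim = False
    tagged = []
    for line in lines:
        if line == "```":  # exactly three backticks and nothing else
            in_verbatim = not in_verbatim
            continue  # toggle lines themselves are not displayed
        tagged.append(("verbatim" if in_verbatim else "text", line))
    return tagged


# Four ticks do not open a block; the later three-tick line does.
four_tick_open = ["````", "Note that there are four ticks.", "```", "after"]

# Trailing text after the ticks means the first line is ordinary text.
trailing_text = ["``` What about trailing text?", "Is that allowed?", "```", "after"]

# An empty block toggles on and straight off again.
empty_block = ["```", "```", "after"]

for sample in (four_tick_open, trailing_text, empty_block):
    print(tag_strict(sample))
```

In the first two samples the line called "after" ends up verbatim, which matches the behaviour described above: a malformed opening line is flowed and the closing fence switches literal mode on instead of off.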
-------- Original Message --------
From: solderpunk <solderpunk@SDF.ORG>

> verbatim mode is toggled on/off by lines
> consisting of precisely 3 ticks and nothing
> else.

Again, just clarifying for people who might not be familiar: the code fencing in markdown often allows the initial backtick trio to be followed by a filetype which aids clients in syntax highlighting. They look like so:

```bash
```javascript
```ruby

I don't suggest gemini needs that, but know that people may assume the behavior because it is familiar. It might be easier to declare lines starting with 3 backticks to be the code fence barriers regardless of what comes later. If clients want to develop extra syntax highlighting they could, but mostly it would mean people pasting code they'd use on GitHub or stackoverflow will not be surprised by broken wrapping.
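For comparison, a sketch of the looser rule suggested above, where any line that starts with three backticks counts as a fence and whatever follows on an opening fence is kept as an optional hint. The function name and the tuple layout are assumptions made for illustration; nothing here is specified anywhere in the thread.

```python
def tag_lenient(lines):
    """Return (in_verbatim, hint, line) for every displayed line."""
    in_verbatim = False
    hint = ""
    tagged = []
    for line in lines:
        if line.startswith("```"):  # anything after the ticks is tolerated
            in_verbatim = not in_verbatim
            # Remember the filetype hint only while a block is open.
            hint = line[3:].strip() if in_verbatim else ""
            continue
        tagged.append((in_verbatim, hint, line))
    return tagged


document = ["intro", "```bash", "echo hi", "```", "outro"]
for entry in tag_lenient(document):
    print(entry)
```

One consequence worth noting: under this rule a line of four backticks also toggles the mode, because it starts with three.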
On Sun, Jan 12, 2020 at 1:43 PM solderpunk <solderpunk at sdf.org> wrote: > > Okay, I have started to re-engage with this endless discussion - > slowly and, I have to admit, reluctantly. When I think about how many > details there are to consider here, how many different options we have > to choose among, and how absolutely incredible the power-to-weight > ratio is of verbatim fixed-width text with a predefined width (I > mean, really, you can: > > Left align text, > center text, > even right align text > > without the client having to even know what those things are!), it's > incredibly tempting to echo the "reflowed text be damned!" sentiment > recently expressed at mozz.us[1] and spec 40 character fixed text and > just move on. For what it's worth, I have been trying out the 40-character width thing for a while now and I'm really enjoying it! I actually find it a lot more pleasant to type vs 80 character lines. I don't know if it's because my eyes don't need to jump as far, or because it takes fewer keystrokes to move my cursor to the middle of a line... Something about it just *feels* good to type. Not to mention, pages like this [1] display perfectly on my iphone using a gemini-http proxy server. Regardless of whether you choose to adopt the ``` mode, you're still going to need to recommend a line length for authors to hard wrap their text/gemini files at. And I suggest that 40 is still worth considering for this. > Gopher is better than the current Gemini spec in this regard, because > you can put gophermap lines in an item type 0 text file no problem and > they'll just be displayed as-is. But copying and pasting that > gophermap is not guaranteed to go smoothly. With terminal-based > applications, the tabs would stand a good chance of being transformed > into consecutive spaces, which would actually break them. Let's be > better than that! 
Let's make it possible to display, copy and paste
> Gemini links inside of Gemini documents, to facilitate teaching and
> talking about Gemini over Gemini. It seems quite natural that this
> should be possible.
>
> Even if text/gemini were specced at 40 fixed-width characters with no
> reflow, meeting this goal would require some syntax comparable to
> <pre> tags in HTML, to switch off processing of Gemini links. If
> we're going to have that anyway, we may as well have reflowed text be
> the default and this <pre> syntax can do double duty by also enabling
> non-reflowed text for source code, poetry, etc.

Here are some other alternatives that might be worth considering. I do think that displaying gemini links is a valid use-case, but adding a whole new preformatted text mode only for this narrow case feels a bit heavy-handed to me. Granted, I realize there are other benefits to the preformatted mode that have already been outlined.

Option 1. Use a no-op link

Pick a URL that by convention doesn't lead anywhere useful, and then hijack the (link friendly name) portion to display your gemini link.

=># =>/about.txt About

"#" is a valid relative URL, right? Somebody else on this list *cough* sean might be able to come up with something better.

This would be displayed on most gemini clients as:

=>/about.txt About

The line would be highlighted as a link (unless clients choose to handle this special case), but otherwise it should work without any changes to the spec.

Option 2. Use text/plain

For the narrow use-case where you want to show off some examples of gemini links, stick those links in a separate text/plain document. Or just serve your whole page as text/plain. The example links can't intermingle with real gemini links in the same document, but is that really such a big deal?

How you feel about this option likely depends on which side of the fence you fall on regarding text/gemini usage.
Should text/gemini be used like HTML is on the web, with most content being written as gemini files? Or should it be more like gopher, where directories are type text/gemini but many people write their blog posts and other leaf documents as text/plain. Lately I have been leaning more towards the second interpretation. Take another example: Instead of writing a python code snippet inline in a text/gemini document, what if you instead added a link to your code snippet and served it as "text/x-python"? This feels natural to me given that other media content like images also can't be displayed inline. [1] https://portal.mozz.us/gemini/mozz.us/diagnostics/2020-01-08/notes.gmi - mozz
It was thus said that the Great Michael Lazar once stated:
>
> Option 1. Use a no-op link
>
> Pick a URL that by convention doesn't
> lead anywhere useful, and then hijack
> the (link friendly name) portion to
> display your gemini link.
>
> =># =>/about.txt About
>
> "#" is a valid relative URL, right?
> Somebody else on this list *cough* sean
> might be able to come up with something
> better.

Who? Me? Wait! Did you ask for two examples of a pronoun?

Anyway, yes '#' is a valid URL, or at least, it passes through my URL parser. And it's a cute work around.

> This would be displayed on
> most gemini clients as:
>
> =>/about.txt About
>
> The line would be highlighted as a link
> (unless clients choose to handle this
> special case), but otherwise it should
> work without any changes to the spec.
>
> Option 2. Use text/plain

Option 3, Use text/markdown. RFCs 7763 and 7764.

Other than that, I don't really have much of a horse in this particular race.

-spc
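A small sketch of how a client might split the no-op link from Option 1. It assumes the link-line layout seen in the examples in this thread ("=>", then a URL, then whitespace and an optional label); `parse_link_line` is a hypothetical helper, and Python's `urllib.parse.urlsplit` stands in for whichever URL parser a real client uses.

```python
from urllib.parse import urlsplit


def parse_link_line(line):
    """Split '=>URL label' into (url, label); return None for non-link lines."""
    if not line.startswith("=>"):
        return None
    parts = line[2:].strip().split(maxsplit=1)
    if not parts:
        return None  # '=>' with nothing after it
    url = parts[0]
    label = parts[1] if len(parts) > 1 else ""
    return url, label


url, label = parse_link_line("=># =>/about.txt About")
print(repr(url), repr(label))

# '#' parses as a relative reference with no scheme, host or path,
# i.e. a reference back to the current document.
target = urlsplit(url)
print(target.scheme == "" and target.netloc == "" and target.path == "")
```

The label comes out as the literal text `=>/about.txt About`, which is what the hack relies on; how it is decorated on screen is still up to the client.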
Solderpunk wrote recently regarding use of color in gemini documents. I would like to put in a vote to the contrary.

I know that not all clients will support color, and those that do not may not be coded to strip whatever method is used to add color (current methods revolve around vt100/ansi escape sequences). Many clients will also not be terminal based, making escape sequences nearly meaningless.

Having said all this, my two cents are that the spec should not address this whatsoever. It should be on content creators to decide whether adding said escape sequences is something that they are comfortable with. Likewise, each client should choose whether this is a detail they want to handle or not.

At present my client supports color for all protocols that it supports (gopher, gemini, finger, local files). When color mode is toggled on any escape sequences in the \033[???m series will be rendered (but no other escape sequences). It feels weird to _remove_ gemini from this feature. It feels like it should not be a part of the spec, just like it is not part of the gopher spec, and some clients support it and some don't: content creators beware.

I am running into similar worries with the wrapping and reflowing conversation. I do not want to add horizontal scrolling to my client so hard wrapping will be done at screen width. Reflowing is another matter of course.... I'm starting to feel like it just isn't worth the fuss. Since _all_ clients seem to hard wrap, just use a long line if you want flowed content. Done. No extra work required for clients. It keeps things very very simple. At present gemini is great to navigate around (what there is of it anyway) and fulfills the goal of being like gopher, but better.

Just my two cents...
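One way the "render the \033[...m series and nothing else" behaviour described above could be approximated is sketched below. This is not the code of any client mentioned in the thread; `filter_escapes` and its `allow_colour` flag are hypothetical, and only CSI-style sequences (ESC followed by "[") are considered, so other kinds of escape sequence pass through untouched.

```python
import re

# A CSI sequence: ESC [ then parameter bytes, intermediate bytes, one final byte.
CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# The colour/style subset (SGR): numeric parameters ending in 'm'.
SGR = re.compile(r"\x1b\[[0-9;]*m")


def filter_escapes(text, allow_colour=True):
    """Drop CSI sequences, keeping colour codes only when allow_colour is set."""
    def keep_or_drop(match):
        sequence = match.group(0)
        if allow_colour and SGR.fullmatch(sequence):
            return sequence
        return ""

    return CSI.sub(keep_or_drop, text)


sample = "\x1b[31mred\x1b[0m\x1b[2Jx"  # colour on, colour off, clear screen
print(repr(filter_escapes(sample)))
print(repr(filter_escapes(sample, allow_colour=False)))
```

With the flag off, the same function gives a client that never emits colour, so the choice stays with each client rather than with the spec.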
On Wed, Jan 15, 2020 at 12:03:28AM -0500, Michael Lazar wrote:
>
> Not to mention, pages like this [1]
> display perfectly on my iphone using a
> gemini-http proxy server. Regardless of
> whether you choose to adopt the ```
> mode, you're still going to need to
> recommend a line length for authors to
> hard wrap their text/gemini files at.
> And I suggest that 40 is still worth
> considering for this.

Yes, I definitely intend to include a recommendation that text/gemini content be hard wrapped at something less than the traditional 70 or 80 columns. This convention seems to be slowly taking hold in the phlogosphere (driven, delightfully, as much by retro PDA enthusiasts as smartphone users!). In writing about this earlier, I idly raised the prospect of clients for "real computers" taking text hard wrapped at ~40 characters and displaying multiple columns side by side - I actually bumped into an example of this in the wild the other day (see the end of gemini://tilde.black/users/brool/stoned.txt) and thought it looked great!

> Here are some other alternatives that
> might be worth considering. I do think
> that displaying gemini links is a valid
> use-case, but adding a whole new
> preformatted text mode only for this
> narrow case feels a bit heavy-handed to
> me. Granted, I realize there are other
> benefits to the preformatted mode that
> have already been outlined.

I agree that the ability to display literal Gemini links inside a text/gemini page probably doesn't, by itself, justify adding any kind of markup complexity. But there are indeed many other benefits, as you mentioned. I had kind of been on the fence about whether or not such complexity was worth adding mostly on the basis of those other benefits (which mostly relate to supporting mobile devices, which, even though I do use them, I kind of hate and feel reluctant to cater to).
The realisation that a verbatim text mode can also do some genuinely useful work even on a "big screen" where text reflowing never has to be done has, I think, tipped the scales for me enough to decide that this is worth it.

After all, the essential complexity cost we'd be paying for these benefits is quite low. Remembering that it will be explicitly okay for simple clients not to reflow text if they don't want to, the most anybody will be obligated to do is something like the following:

    in_verbatim = False
    for line in all_the_lines:
        if line == "```":
            in_verbatim = not in_verbatim
        elif line.startswith("=>") and not in_verbatim:
            handle_a_link(line)
        else:
            print(line)

This won't display the ``` lines and will avoid trying to parse any links which are supposed to be presented verbatim to the user (which, for the purposes of education, might be syntactically invalid and not something the client should try to parse anyway). Nobody can reasonably call the above difficult or bloated, and it's only three lines longer than the bare minimum that is currently required:

    for line in all_the_lines:
        if line.startswith("=>"):
            handle_a_link(line)
        else:
            print(line)

(as an aside, the times recently I've talked about "simple clients catting content straight to stdout" were a bit careless, because of course you need to actively extract the links as above)

If this is all it takes to make it possible for ambitious client authors to support word reflowing on mobile clients without making it impossible to include ASCII art, source code or poetry, that seems like a fair trade to me.

> Option 1. Use a no-op link
>
> Pick a URL that by convention doesn't
> lead anywhere useful, and then hijack
> the (link friendly name) portion to
> display your gemini link.
>
> =># =>/about.txt About
>
> "#" is a valid relative URL, right?

This is kind of a cute hack, I'll admit, but I worry that (especially to people that aren't intimately familiar with RFC3986!)
it's obscure and less easy to remember than the back tick syntax familiar from several other markup languages. Also...

> This would be displayed on
> most gemini clients as:
>
> =>/about.txt About

...it would display as something like:

[7] =>/about.txt About

in AV-98. And with a button styling in Castor! In general I think it's nice if clients have a little leeway in choosing how they want to present links.

> The line would be highlighted as a link
> (unless clients choose to handle this
> special case),

Which I guess would take at *least* two extra lines of code, giving this approach only a one line advantage over the ``` approach which also facilitates reflowing text (this hack is just to make it possible to display link syntax inside text/gemini documents, right?).

> Option 2. Use text/plain
>
> ...
>
> How you feel about this option likely
> depends on which side of the fence you
> fall on regarding text/gemini usage.
> Should text/gemini be used like HTML is
> on the web, with most content being
> written as gemini files? Or should it be
> more like gopher, where directories are
> type text/gemini but many people write
> their blog posts and other leaf
> documents as text/plain.
>
> Lately I have been leaning more towards
> the second interpretation. Take another
> example: Instead of writing a python
> code snippet inline in a text/gemini
> document, what if you instead added a
> link to your code snippet and served it
> as "text/x-python"? This feels natural
> to me given that other media content
> like images also can't be displayed
> inline.

This is a conversation well worth having! From my point of view, Gemini very deliberately chose to cast aside Gopher's strict distinction between menus and content, in part because so many people in Gopherspace seemed to be dissatisfied with it. So, it seems kind of a shame to just disregard this and go back to the gopher way. Of course, people are totally free to do so if they want to!
Neither the protocol nor I will object.

It is true that there seem to be quite a lot of cases in the wild where people are using text/gemini menus to link to collections of text/plain files. Sometimes this seems to be accidental, because the files actually contain attempted Gemini links, e.g.:

gemini://tilde.black/users/fox/journal/20190831-starting-out.txt

Other times I suspect this is to facilitate straightforward bihosting on Gopher and Gemini.

I personally plan to write Gemini content primarily in text/gemini, and in fact I'll probably decide how to distribute content between my Gopher and Gemini sites based in part on how well different kinds of content benefit from Gemini's ability to do in-line links. Is anybody else planning to do this? Are the people currently not doing this doing so out of conscious choice or just carrying over old Gopher habits?

As for using in-line links as a way to include un-flowed text, I did think of this option (maybe somebody else already mentioned it in the list?). It would work, but I feel like the user experience would be fairly clunky, especially for the combination of articles with frequent short code snippets (like programming tutorials) and simple terminal-based clients where going "back" to the text/gemini document from the text/x-python document wouldn't return the reader to the same point in the (potentially quite long) document from which they followed the link. I do take your point, though, that we're already forced into that clunky paradigm for images and other non-textual media...

Ultimately I don't think I like either of these alternatives as much as the ``` syntax, but I'm happy to hear others' thoughts...

Cheers,
Solderpunk
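For the "ambitious client" end of the trade-off discussed in this message, here is a rough sketch of reflowing everything outside ``` blocks while leaving verbatim lines and link lines alone. It assumes paragraphs are separated by blank lines and leans on Python's standard `textwrap` module; `render` is a hypothetical function and is not the AV-98 implementation.

```python
import textwrap


def render(lines, width=40):
    """Reflow ordinary paragraphs to `width`; pass verbatim and link lines through."""
    out = []
    paragraph = []
    in_verbatim = False

    def flush():
        # Join the collected source lines and re-wrap them as one paragraph.
        if paragraph:
            out.extend(textwrap.wrap(" ".join(paragraph), width=width))
            paragraph.clear()

    for line in lines:
        if line == "```":
            flush()
            in_verbatim = not in_verbatim
        elif in_verbatim:
            out.append(line)  # untouched, even if it looks like a link
        elif line.startswith("=>"):
            flush()
            out.append(line)  # a real client would hand this to its link handler
        elif not line.strip():
            flush()
            out.append("")
        else:
            paragraph.append(line.strip())
    flush()
    return out


page = ["one two three", "four five", "", "```", "  art   here",
        "=>not/a/link", "```", "=>/real Link"]
print("\n".join(render(page, width=10)))
```

The toggle costs the same few lines as in the minimal loop; all of the extra work is in the paragraph handling, which simple clients remain free to skip.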
On Wed, Jan 15, 2020, at 12:21 PM, solderpunk wrote: > On Wed, Jan 15, 2020 at 12:03:28AM -0500, Michael Lazar wrote: > > > > Not to mention, pages like this [1] > > display perfectly on my iphone using a > > gemini-http proxy server. Regardless of > > whether you choose to adopt the ``` > > mode, you're still going to need to > > recommend a line length for authors to > > hard wrap their text/gemini files at. > > And I suggest that 40 is still worth > > considering for this. > > Yes, I definitely intend to include a recommendation that text/gemini > content be hard wrapped May I ask why? Hard wrapping causes several issues: 1. If someone is using git, for example, to version control their gemini blog source code, they might have to re-wrap entire paragraphs in order to add one word to the beginning. 2. Many markdown documents have 500+ column lines (one paragraph per newline). If we require that Gemini source code is hard-wrapped, users couldn't copy-paste markdown into their gemini blog/site without hard-wrapping first. In fact, I think *discouraging* hard-wrapping might actually make life easier for client implementers. Hard-wrapping would require clients to un-wrap the newlines then re-wrap to the desired width. I think that's more complex than: 1. Leaving wrapping up to the output medium (e.g. terminal, web browser). 2. Implementing wrapping within the client source code (which would be done anyway if Gemini source code *can be* hard-wrapped) Oh, well. I'm vocal about this because Gemini looks really exciting to me. As someone who plans to use it, I simply want it to be what I think is as easy to use as possible :-) Cheers!
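The version-control point can be made concrete with a small experiment: add one word to the start of a paragraph and count how many lines a line-based diff reports in each format. The sample paragraph and the helper `changed_line_count` are made up for illustration; `difflib` and `textwrap` are from the Python standard library.

```python
import difflib
import textwrap


def changed_line_count(before, after):
    """Count lines reported as removed or added by a line-based diff."""
    return sum(1 for entry in difflib.ndiff(before, after)
               if entry.startswith(("- ", "+ ")))


paragraph = ("Hard wrapping causes several issues for people who keep their "
             "gemini source under version control and edit it frequently.")
# The edit: one extra word at the very beginning of the paragraph.
edited = "Honestly, " + paragraph[0].lower() + paragraph[1:]

wrapped_changes = changed_line_count(textwrap.wrap(paragraph, 40),
                                     textwrap.wrap(edited, 40))
long_line_changes = changed_line_count([paragraph], [edited])

print("hard-wrapped at 40:", wrapped_changes, "changed lines")
print("one line per paragraph:", long_line_changes, "changed lines")
```

The one-line-per-paragraph form always shows one removed and one added line; the hard-wrapped form shows at least that many and typically more, since every re-wrapped line counts as a change.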
> 2. Many markdown documents have 500+ column lines (one paragraph per
> newline). If we require that Gemini source code is hard-wrapped, users
> couldn't copy-paste markdown into their gemini blog/site without
> hard-wrapping first.

Why would they need to hard wrap it? It would be the client or terminal that would be required to follow the spec and hard wrap it, not the content creators. I think every current client hard wraps at least to window size (if not some other value). This makes it so that a user can have a 500+ column row and the client/terminal/browser/etc will wrap it for them, leaving nothing for the user to do but paste it into their document for it to be wrapped automatically. I would imagine that most clients that allow the saving of pages would retain the original document structure free of any modifications made by the wrap (though I suppose mileage may vary in that regard).

This text reflow thread has been quite the minefield over time. Who would have thought that this would be the big thing to figure out when all this started?
On Wed, Jan 15, 2020 at 12:47:27PM -0800, Aaron Janse wrote:
> > Yes, I definitely intend to include a recommendation that text/gemini
> > content be hard wrapped
>
> May I ask why?

Mostly because I want simple-as-possible clients to be viable, which means simply printing non-link lines to stdout should result in something usable. Paragraphs of text formatted as a single line are tremendously unpleasant to read when displayed this way! If this were the norm for text/gemini documents, I suspect nobody would use any client that didn't include (tedious to write!) code to wrap these lines at word breaks.

> In fact, I think *discouraging* hard-wrapping might actually make life easier
> for client implementers. Hard-wrapping would require clients to un-wrap the
> newlines then re-wrap to the desired width.

It wouldn't *require* that, the lines could simply be displayed at their hard-wrapped width, which is why we're concerned with recommending a narrow enough width that this would work on devices with narrow displays.

Basically, consider what happens with the bare minimum amount of code in either case:

A) If an entire paragraph is one line and a client doesn't have code to break that big line up at word boundaries, leaving wrapping up to the terminal results in multiple lines (potentially uncomfortably long for reading) with randomly cut-up words at their beginnings and ends.

B) If the paragraph is hard-wrapped at ~40 characters and a client doesn't have code to un-wrap and re-wrap those lines, the result is lines of a comfortable length for reading, which fit on a smartphone and don't have any randomly cut-up words at their beginnings and ends.

Given these two choices, surely B) is preferable?

Admittedly, this analysis is somewhat terminal-centric. A lot of GUI toolkits probably have text displaying widgets which will handle breaking lines at word boundaries without the developer having to give it a second thought.
On the one hand, the vast majority of extant clients are terminal-centric and I think the majority of the early adopters (being gopher folk) lead terminal-centric lives, so a terminal-centric perspective is only natural. On the other hand, Gemini isn't supposed to be only for a certain group of people, so I'm reluctant to lean on this too much... > Oh, well. I'm vocal about this because Gemini looks really exciting to me. > As someone who plans to use it, I simply want it to be what I think is as > easy to use as possible :-) I appreciate you speaking up and I'm glad you're excited! Cheers, Solderpunk
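Cases A) and B) from this message can be imitated in a few lines. The sketch assumes a terminal that breaks a long line every N characters regardless of words, which is how the behaviour is described above; `terminal_wrap` is a hypothetical stand-in for that, and `textwrap.wrap` stands in for an author hard-wrapping by hand.

```python
import textwrap


def terminal_wrap(line, columns):
    """Break a line every `columns` characters, ignoring word boundaries."""
    return [line[i:i + columns] for i in range(0, len(line), columns)]


sentence = "Paragraphs of text formatted as a single line are unpleasant to read"

# Case A: one long line, dumped as-is and left to the terminal.
case_a = terminal_wrap(sentence, 20)

# Case B: the author already hard-wrapped, so the client prints lines verbatim.
case_b = textwrap.wrap(sentence, 20)

print("\n".join(case_a))
print("-" * 20)
print("\n".join(case_b))
```

In case A the first line ends in the middle of "formatted"; in case B every line ends at a word boundary without the client doing any work.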
On Wed, Jan 15, 2020 at 10:21:54PM +0100, Brian Evans wrote: > Who would have thought that this would be the big thing to figure out when all this started? Let me assure you, I certainly didn't! :) Cheers, Solderpunk
On Wed, Jan 15, 2020, at 2:11 PM, solderpunk wrote:
> B) If the paragraph is hard-wrapped at ~40 characters and a client doesn't
> have code to un-wrap and re-wrap those lines, the result is lines of a
> comfortable length for reading, which fit on a smartphone and don't have any
> randomly cut-up words at their beginnings and ends.
>
> Given these two choices, surely B) is preferable?
>

I think that one-line-per-paragraph would be much better for mobile phones. I personally use a web browser proxy to read gemini pages when on mobile. Text wrapped at 80 columns looks horrible. Text wrapped at 40 columns looks okay, depending on my font size. But if text was one-line-per-paragraph, firefox would wrap it wonderfully with *zero* effort from the proxy author.

> Admittedly, this analysis is somewhat terminal-centric. A lot of GUI
> toolkits probably have text displaying widgets which will handle
> breaking lines at word boundaries without the developer having to give
> it a second thought.

Yeah. And the variety of clients adds a lot of nuance to how text wrapping degrades.

I can only imagine unlimited-column text catastrophically failing in two places:

1. A very wide terminal
2. A very wide web browser

However, each has a simple solution:

1. Pipe text into the `fmt` command, which intelligently wraps text
2. Use *simple* css to make the viewport narrower, then let the browser do the wrapping

Thanks, solderpunk, for being a thoughtful BDFL!

Cheers!
On Wed, Jan 15, 2020 at 3:21 PM solderpunk <solderpunk at sdf.org> wrote: > > On Wed, Jan 15, 2020 at 12:03:28AM -0500, Michael Lazar wrote: > > Here are some other alternatives that > > might be worth considering. I do think > > that displaying gemini links is a valid > > use-case, but adding a whole new > > preformatted text mode only for this > > narrow case feels a bit heavy-handed to > > me. Granted, I realize there are other > > benefits to the preformatted mode that > > have already been outlined. > > ... > > Ultimately I don't think I like either of these alternatives as much as > the ``` syntax, but I'm happy to hear others' thoughts... > > Cheers, > Solderpunk Thanks for your response. After seeing everything laid out in front of me, I think I agree with you and I like your proposal the best. I wanted to visualize what this would look like on mobile, so I hacked up my Gemini-HTTPS proxy to render the ``` markdown syntax. Here are some example pages: https://portal.mozz.us/gemini/mozz.us/files/aaatutorial.gmi?reflow=2 https://portal.mozz.us/gemini/mozz.us/files/flask_mega_tutorial_part_2.gmi?reflow=2 I think they look quite nice! - mozz
On Wed, Jan 15, 2020 at 06:41:00PM -0800, Aaron Janse wrote:

> I think that one-line-per-paragraph would be much better for mobile phones.
> I personally use a web browser proxy to read gemini pages when on mobile.
> Text wrapped at 80 columns looks horrible. Text wrapped at 40 columns looks
> okay, depending on my font size. But if text was one-line-per-paragraph,
> firefox would wrap it wonderfully with *zero* effort from the proxy author.

I am kind of reluctant to make any Gemini design decisions based on the assumption of a web browser as the user agent. I understand that in these very early days this is by far the quickest and easiest way to get into Geminispace, and for smartphones it's probably the *only* viable way. But I hope that as time moves forward the proxies will become a niche thing, serving as a "gateway drug" for real clients. This is surely a valuable role, but for regular use I think they should be "considered harmful". They represent single (or few) points of failure, they involve trusting the proxy operator not to manipulate content (Gemini isn't sophisticated enough to permit proper TLS proxying, so the web proxy is basically a MITM between client and server), proxy operators have more opportunities to log and track their users than individual Gemini servers do, and proxy users need to run vastly more complex than necessary software (i.e. web browsers, although at least Gemini proxies should typically be usable with nice alternative browsers like dillo).

(aside: some of these issues go away with proxies designed to be run locally by the user - a nice project for anybody itching for one!)

None of this is to say the web proxies are bad and that people shouldn't run them or use them - I'm very grateful to the people who have set them up!
But I don't think of them as "first class" clients, and given a choice between pushing implementation effort onto native client authors or onto web proxy authors, I will make life easier for native client authors every time.

Besides, getting a web proxy to provide beautiful wrapping if a text/gemini file is hard-wrapped at 40 chars involves nothing more than wrapping paragraphs in <p> and </p> tags. That's a significantly easier task than getting a terminal client to provide beautiful wrapping if a text/gemini file has lines thousands of characters long, which requires splitting the line into words, calculating and summing the lengths of words, etc, etc. Given the choice between making web proxy authors do a little bit more work and making native client authors do a moderate amount more work, I'm definitely going to choose the former.

> I can only imagine unlimited-column text catastrophically failing in two
> places:
>
> 1. A very wide terminal
> 2. A very wide web browser

Neither of which are terribly uncommon, right, with full-screen windows on desktops or even laptops? My "daily driver" laptop terminal is 113 chars wide. Although, even on a terminal < 80 chars wide, I kind of consider words being split across lines as pretty severe failure. I don't want to read that - it's even less pleasant than hard-wrapped 80 char lines on a mobile.

> Thanks, solderpunk, for being a thoughtful BDFL!

I'm glad you think I'm thoughtful! Sorry if I seem to be dismissing the "long lines" approach out of hand, I promise you I'm giving it a lot of thought. I'm already stressing out that I'm being unduly influenced by the fact that I use simple / old-fashioned editors and mostly write stuff that should be hard-wrapped (plain text email, gopher content, source code). From my perspective, writing "long line" content is a less pleasant experience for authors, because my editors don't work that way out of the box.
But I realise that, actually, for the majority of people that's far *more* accessible. Someone using something resembling Notepad is going to have a miserable time writing content hard-wrapped at 40 chars, while the "long line" format just happens, probably without them even realising it.

Then again, making text/gemini easy to write with "normal" editors arguably isn't worth much if the next step is anyway "now use sftp or a git push to get your content on the server". Gemini is never going to be able to support easy WYSIWYG authoring experiences akin to WordPress, so perhaps it's pointless to consider the user experience for non-technical types.

Argh! Simplicity ain't simple.

Cheers,
Solderpunk
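The "wrap paragraphs in <p> and </p> tags" step mentioned earlier in this message might look roughly like the sketch below. It assumes paragraphs in the source are separated by blank lines and deliberately ignores link lines and ``` blocks, which a real proxy would have to treat separately; `paragraphs_to_html` is a hypothetical name.

```python
import html


def paragraphs_to_html(text):
    """Turn blank-line-separated, hard-wrapped paragraphs into <p> elements."""
    paragraphs = [chunk for chunk in text.split("\n\n") if chunk.strip()]
    return "\n".join(
        # Collapse the hard line breaks so the browser can choose its own.
        "<p>" + html.escape(" ".join(chunk.split())) + "</p>"
        for chunk in paragraphs
    )


source = "one\ntwo\n\nthree < four\n"
print(paragraphs_to_html(source))
```

Once the hard line breaks are collapsed, the browser re-wraps each paragraph to whatever width the screen offers, so the 40-column source costs the proxy author very little.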
On Thu, Jan 16, 2020, at 7:10 AM, solderpunk wrote: > But I realise that, actually, for the majority of people that's far > *more* accessible. Someone using something resembling Notepad is going > to have a miserable time writing content hard-wrapped at 40 chars, while > the "long line" format just happens, probably without them even realising > it. I do admit that you've nearly convinced me that 40 chars is easiest for readers, even though I'm not sure if it's best if we consider authors, too. > Then again, making text/gemini easy to write with "normal" editors > arguably isn't worth much if the next step is anyway "now use sftp or a > git push to get your content on the server". In my experience, though, hard-wrapping text doesn't work well with git anyway. And I don't know of any text editors that re-hard-wrap automatically when the beginning of a paragraph is edited. Maybe gemini files could be authored in either wrapped or unwrapped format, then the server could hard-wrap intelligently before sending it to the reader? Would that be bad practice? Cheers!
It was thus said that the Great Aaron Janse once stated: > On Thu, Jan 16, 2020, at 7:10 AM, solderpunk wrote: > > But I realise that, actually, for the majority of people that's far > > *more* accessible. Someone using something resembling Notepad is going > > to have a miserable time writing content hard-wrapped at 40 chars, while > > the "long line" format just happens, probably without them even realising > > it. > > I do admit that you've nearly convinced me that 40 chars is easiest for > readers, even though I'm not sure if it's best if we consider authors, too. > > > Then again, making text/gemini easy to write with "normal" editors > > arguably isn't worth much if the next step is anyway "now use sftp or a > > git push to get your content on the server". > > In my experience, though, hard-wrapping text doesn't work well with git > anyway. And I don't know of any text editors that re-hard-wrap automatically > when the beginning of a paragraph is edited. > > Maybe gemini files could be authored in either wrapped or unwrapped format, > then the server could hard-wrap intelligently before sending it to the > reader? Would that be bad practice? First question---how to tell the server the width? Well, one solution: gemini://gemini.conman.org/test/wrap;80 gemini://gemini.conman.org/test/wrap;40 gemini://gemini.conman.org/test/wrap;32 gemini://gemini.conman.org/test/wrap;132 Just replace the number at the end with your preferred width. If not given: gemini://gemini.conman.org/test/wrap it will currently default to 77. I just happen to have the code to handle reflowing text at arbitrary widths, but the code to do so has to take into account a few edge cases that might not be readily apparent (what if there is no breakpoint within the mandated width?). -spc (Also note---I added that to my Gemini site last September)
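The width-in-the-URL idea described above can be sketched in a few lines. This is only an illustration, not the code running on gemini.conman.org: the function names are made up here, the default of 77 is taken from the message, and Python's standard `textwrap` module stands in for the real reflowing code. On the edge case mentioned (no breakpoint within the mandated width), `textwrap` by default simply cuts the over-long word at the width.

```python
import textwrap

DEFAULT_WIDTH = 77  # default quoted in the message above


def split_width(path, default=DEFAULT_WIDTH):
    """Split 'test/wrap;40' into ('test/wrap', 40); hypothetical helper."""
    base, sep, suffix = path.rpartition(";")
    if sep and suffix.isascii() and suffix.isdigit() and int(suffix) > 0:
        return base, int(suffix)
    # No usable ';<number>' suffix: keep the path and use the default width.
    return path, default


def wrap_paragraphs(text, width):
    """Reflow blank-line-separated paragraphs to the requested width."""
    paragraphs = text.split("\n\n")
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)


if __name__ == "__main__":
    path, width = split_width("test/wrap;40")
    print(path, width)
    print(wrap_paragraphs("word " * 30, width))
```
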
On Thu, Jan 16, 2020, at 10:27 AM, Sean Conner wrote: > First question---how to tell the server the width? Well, one solution: On one hand, I like the idea of the server doing the wrapping. It would allow viewport-specific fixed text. For example, two things that server-side wrapping would allow that are currently used in the Gemini speculative spec: - viewport-width-specific dividers (a bunch of dashes) - indented paragraphs or trailing-indent bullet points On the other hand, server-side wrapping has the following drawbacks: - broadcasting viewport width goes against Gemini privacy values - clients can no longer download a gemini blog then view it at different widths - this could open up room to server-side trickery, such as indenting fixed-width content, which I don't think is a good idea Also, speaking of privacy, does Gemini work over Tor?
On Thu, Jan 16, 2020 at 11:48:09AM -0800, Aaron Janse wrote: > On Thu, Jan 16, 2020, at 10:27 AM, Sean Conner wrote: > > First question---how to tell the server the width? Well, one solution: > > On one hand, I like the idea of the server doing the wrapping. > > It would allow viewport-specific fixed text. For example, two things > that server-side wrapping would allow that are currently used in the > Gemini speculative spec: > > - viewport-width-specific dividers (a bunch of dashes) > - indented paragraphs or trailing-indent bullet points > > On the other hand, server-side wrapping has the following drawbacks: > > - broadcasting viewport width goes against Gemini privacy values > - clients can no longer download a gemini blog then view it at > different widths > - this could open up room to server-side trickery, such as indenting > fixed-width content, which I don't think is a good idea > > Also, speaking of privacy, does Gemini work over Tor? Yeah, I don't think this does it because not only does it complicate server implementation, but it does so in a way that seeks to allow servers to learn about clients. That being said, as far as I'm concerned, literally anything at all is preferable to hard-wrapping at 40 characters -- less than a quarter of my *laptop* screen with a mid-size font; I can only imagine how it looks on an actual monitor. Limiting lines to 5-6 words would seem to limit, on a fundamental level, the types of content that can even be served over gemini. Adding an arbitrary cap to line lengths purely so that a hypothetical mobile client doesn't require 10-20 lines of wrapping code (code that Google suggests already exists in the Android SDK and merely has to be invoked) seems absurd to me, particularly when the wrapping itself is trivial, and this is entirely because word-wrapping is considered preferable to naive, occasionally-mid-word wrapping. 
The argument I'm reading is that nobody would use a client that occasionally wraps in the middle of words, and the gemini spec is explicitly geared toward making client implementations as simple as possible. I understand this, yet I think there exists a world of difference between "it should be trivial to write a client" and "it should be trivial to write a client that formats every line beautifully and that we would all love using," particularly when the point of difference between the two is whether or not some words are line-broken some of the time. This is just my two cents, but I feel like that's something that it makes more sense to leave up to the client. The actual client-side process of interacting with the server is still simple, but printing it can be done in a simple way (wrap, occasionally breaking words) or with a few lines and a couple conditions to guard edge-cases (to word-wrap). I don't know, I understand that some arguments in favor of a 40 character hard-wrap have been made from a reading level, and I understand them, but I don't think that justifies hard-capping all lines at only a few words. If you want to be able to speed-read lines, you could write a client that word-wraps at 40 characters very, very easily. You could write a client that flashes words at you one at a time quickly like those speed-reading programs if you want to take that idea to an extreme. But I feel like this is a very drastic response to a non-issue. Maybe I wouldn't feel so bad about it if the limit being suggested weren't as puny as 40 characters. But no matter what the limit is, there are still screens, or fonts, or what have you, where any hard-wrapping solution does not solve the problem. And if aesthetics are the main concern, and we can't stomach a few split words, then I really don't get why the best solution is to squash every single gemini site into the left 25% of our screens. 
I'm sure whatever flagship gemini client eventually emerges on top is going to be beautiful and not wrap in the middle of words, but I also don't think that the spec should be specced based upon the hypothetical flagship client -- client implementation can be simple, and it will no doubt remain so for gemini; but flagship client implementation is not, cannot, and will never be, for any protocol. I'm not trying to be insulting to anyone, and I get that y'all will likely disagree. I fully understand the reasons behind what everyone in favor of the 40 char limit is suggesting, but I think that practically speaking, it's not liveable for anyone trying to use gemini, and it causes more problems than it solves. idk ~ lel
On Thu, Jan 16, 2020 at 09:40:50AM -0800, Aaron Janse wrote: > In my experience, though, hard-wrapping text doesn't work well with git > anyway. And I don't know of any text editors that re-hard-wrap automatically > when the beginning of a paragraph is edited. Hmm. Neither do I and, in fact, well...I just do it manually. Which I feel very sheepish admitting because that's kind of ridiculous. But everybody writing for Gopherspace (which is many people here) must face precisely this problem, because hard-wrapping is basically compulsory there. What are other people doing, writing in "long line" form and then feeding the result to `fmt` or `par` before uploading? Cheers, Solderpunk
On Thu, Jan 16, 2020 at 12:41 PM Aaron Janse <aaron at ajanse.me> wrote: > > On Thu, Jan 16, 2020, at 7:10 AM, solderpunk wrote: > > Then again, making text/gemini easy to write with "normal" editors > > arguably isn't worth much if the next step is anyway "now use sftp or a > > git push to get your content on the server". > > In my experience, though, hard-wrapping text doesn't work well with git > anyway. And I don't know of any text editors that re-hard-wrap automatically > when the beginning of a paragraph is edited. > > Maybe gemini files could be authored in either wrapped or unwrapped format, > then the server could hard-wrap intelligently before sending it to the > reader? Would that be bad practice? > > Cheers! I see no reason we can't let each server decide this on a case-by-case basis, based on the needs of their users. If you're hosting shared gemini content for a group of non-technical users, it might make sense for your server to automatically reformat gemini files before sending them. On the other hand, if you're running a home-grown gemini server and writing all of your own content, this step might be unnecessary or invasive to your preferred workflow. If we're adopting the proposed format with ``` for preformatted blocks, the rules can be totally unambiguous for server-side line wrapping. 1. preformatted blocks are not touched by the server 2. links are not touched by the server 3. everything else can be wrapped by the server to the preferred width Content authors should not expect their non-preformatted text to have line lengths / indents / etc. preserved. - mozz
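The three rules above translate fairly directly into a filter a server could run before sending a file. What follows is a minimal sketch under assumptions, not any existing server's code: `server_wrap` is a hypothetical name, a toggle line is taken to be exactly three backticks, and each text line is wrapped on its own with Python's `textwrap` rather than being joined with its neighbours.

```python
import textwrap

FENCE = "`" * 3  # the proposed preformatted toggle line


def server_wrap(lines, width):
    """Yield lines ready to send, wrapping only ordinary text lines."""
    preformatted = False
    for line in lines:
        if line == FENCE:
            preformatted = not preformatted
            yield line  # the toggle itself is passed through unchanged
        elif preformatted or line.startswith("=>"):
            yield line  # rules 1 and 2: never touch these
        else:
            # Rule 3: wrap to the preferred width; keep blank lines.
            yield from textwrap.wrap(line, width=width) or [""]


if __name__ == "__main__":
    doc = [FENCE, "x" * 50, FENCE, "=> gemini://example.org/ A link",
           "aaa bbb ccc", ""]
    for out_line in server_wrap(doc, 7):
        print(repr(out_line))
```
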
On Thu, Jan 16, 2020, at 12:26 PM, solderpunk wrote: > But everybody writing for Gopherspace (which is many people here) must > face precisely this problem, because hard-wrapping is basically compulsory > there. And my understanding is that while Gemini isn't supposed to be a Gopher 2.0, it's supposed to make it better than Gopher. I think that non-hard-wrapping could be a big step towards this goal. lel said: > Adding an arbitrary cap to line lengths purely so that a hypothetical > mobile client doesn't require 10-20 lines of wrapping code (code that > Google suggests already exists in the Android SDK and merely has to be > invoked) seems absurd to me, particularly when the wrapping itself is > trivial, and this is entirely because word-wrapping is considered > preferable to naive, occasionally-mid-word wrapping. Honestly, even if it takes 40 lines of code, I'd rather write that than manually wrap every single gemini document I ever write. > What are other people doing, writing in "long line" form and > then feeding the result to `fmt` or `par` before uploading? For things such as email, after manually wrapping quotes, I think using `fmt` or `par` is the most efficient way to wrap text. However, this only works for text that's written once, such as emails. This doesn't work for continually edited content such as websites. > If you're hosting shared gemini content for a group of non-technical users, > it might make sense for your server to automatically reformat gemini files > before sending them I don't have a problem with people "rendering" into Gemini. I just think that the spec shouldn't make it necessary. Rendering/compiling into Gemini adds a level of indirection that hides the transparency that makes me love the protocol. Cheers!
On Wed, Jan 15, 2020 at 11:58:48PM -0500, Michael Lazar wrote: > I wanted to visualize what this would > look like on mobile, so I hacked up my > Gemini-HTTPS proxy to render the ``` > markdown syntax. Here are some example > pages: > > https://portal.mozz.us/gemini/mozz.us/files/aaatutorial.gmi?reflow=2 > https://portal.mozz.us/gemini/mozz.us/files/flask_mega_tutorial_part_2.gmi?reflow=2 > > I think they look quite nice! These look great! And I don't just mean aesthetically, they are an excellent example of the practical usefulness of being able to embed preformatted text in otherwise "normal" prose. Cheers, Solderpunk
Oops, I meant to mention two things: 1. Clients are already using TLS. I think wrapping is significantly easier than encryption. 2. Lists are tough to handle when re-flowing text. Thanks again for the thoughtfulness, solderpunk!
On Thu, Jan 16, 2020 at 3:51 PM Aaron Janse <aaron at ajanse.me> wrote: > On Thu, Jan 16, 2020, at 12:26 PM, solderpunk wrote: > > If you're hosting shared gemini content for a group of non-technical users, > > it might make sense for your server to automatically reformat gemini files > > before sending them > > I don't have a problem with people "rendering" into Gemini. I just think that > the spec shouldn't make it necessary. Rendering/compiling into Gemini adds > a level of indirection that hides the transparency that makes me love the > protocol. > > Cheers! To be clear, I agree with you that a max line length shouldn't be enforced by the spec. I support the current wording of SHOULD as opposed to MUST. I'm more talking about establishing a recommended best practice. Gemini files with lines over N characters long should still be considered "valid" and must be accepted by any gemini client. They might just look a little funky depending on how sophisticated your client is. My belief is that if we don't establish guidance on the line length, the community will informally gravitate towards one anyway. And that will almost certainly be 70-80 characters because that's what most people in this community are used to. Servers that return 500+ characters on a single line will look bad on many of the gemini clients that people write, and content authors will be pressured into conforming to maintain readability for those clients. The spec-spec as written right now says that all gemini lines should be formatted so that they can be reflowed by clients if the client chooses to do so. That spec has been almost unanimously rejected by current gemini servers, in favor of hard wrapping at 70-80 characters. - mozz
On Thu, Jan 16, 2020 at 01:09:18PM -0800, Aaron Janse wrote: > Oops, I meant to mention two things: > 1. Clients are already using TLS. I think wrapping is significantly > easier than encryption. I was about to quip that the TLS is made really easy by high-level library support while the wrapping has to be done manually, but then I thought to check and, holy heck, Python has a `textwrap` module I wasn't aware of. > 2. Lists are tough to handle when re-flowing text. One of the most appealing things to me about the "long lines" approach is that it actually makes lists with multi-line items feasible without having to define any special syntax for them. I might elaborate on this later... Cheers, Solderpunk
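As a small illustration of the point about lists above: with Python's `textwrap` module, a list item written as one long line can be wrapped with a hanging indent so that continuation lines sit under the item text. This is a sketch only; `wrap_item` is a hypothetical helper, and treating "* " as the bullet marker is an assumption, since no list syntax had been agreed at this point in the thread.

```python
import textwrap


def wrap_item(line, width):
    """Wrap one source line; bullet items get a two-space hanging indent."""
    indent = "  " if line.startswith("* ") else ""
    # textwrap.wrap returns [] for an empty line, so keep blank lines as "".
    return textwrap.wrap(line, width=width, subsequent_indent=indent) or [""]


if __name__ == "__main__":
    item = "* a list item written as one long line in the source file"
    for out_line in wrap_item(item, 24):
        print(out_line)
```
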
It was thus said that the Great solderpunk once stated: > On Thu, Jan 16, 2020 at 09:40:50AM -0800, Aaron Janse wrote: > > > In my experience, though, hard-wrapping text doesn't work well with git > > anyway. And I don't know of any text editors that re-hard-wrap automatically > > when the beginning of a paragraph is edited. > > Hmm. Neither do I and, in fact, well...I just do it manually. Which I > feel very sheepish admitting because that's kind of ridiculous. But > everybody writing for Gopherspace (which is many people here) must face > precisely this problem, because hard-wrapping is basically compulsory > there. What are other people doing, writing in "long line" form and > then feeding the result to `fmt` or `par` before uploading? Setting aside my phlog [1], the rest of the content on my gopher and Gemini servers is written with 80 columns in mind. The editor I use will wrap at a default setting of 77 (but can be changed on the fly). The last computer I used with a width of less than 80 characters was my TRS-80 Color Computer which had a width of 32 characters (and I last used regularly in 1987). Since then, all the computers I've had supported at least 80 characters [2]. For my phlog, it's a rendering of my web-based blog [3] where I use Lynx to do the conversion from HTML to plain text (with a bit of post-processing to fix intra-blog links). How do I write my blog entries? I use my regular editor and I've adopted a style a few years ago where I write ... not exactly a sentence per line, but a thought per line. Okay, an example from a previous post [4]: [=== example ===] <p>Okay, to be fair, I did find references and draft material covering the problem of pirates, but I found his stance on a 12 gauge shotgun to be “more accurate” than a hand gun to be questionable at best. “Accuracy” 
on a rolling, pitching boat in the open water is going to be questionable, regardless of choice of firearm.</p> <p>There is correspondence with yacht manufacturers, blue prints, price breakdowns (nearly $300,000 in 1982 dollars, making it nearly $800,000 in today's dollars---ouch!) and scores of articles on everything related to sailing. It also appears that Dad was trying to invent a new type of sail, as there were drawings he did and correspondence with an engineering firm. I'm not sure what I'll do with it all, but the blueprints are cool.</p> [=== end ===] Web browsers don't care about the raw formatting---they reflow it, with styling coming from HTML and CSS. And the reason I keep it very ragged like that is to make it easier to move sentences and fragments about when editing the entry. I also run the entry through a processing script to convert some shortcuts, like turning `` into “ or "1^st" into "1<sup>ST</sup>" [4] or even pulling out image sizes for the <IMG> tag. This is probably more than you wanted to know 8-) -spc [1] gopher://gopher.conman.org/1phlog.gopher [2] Okay, yes! I have an iPhone. I don't use it to browse gopherspace though. [3] http://boston.conman.org/ [4] Yes, I have a type of markup I use, but it's totally custom to me and how I write entries, but it does quite a bit. Here's a sample that includes all the features (I think I keep this up to date): https://github.com/spc476/mod_blog/blob/master/NOTES/testmsg And the script I use: https://github.com/spc476/mod_blog/blob/master/Lua/format.lua
It was thus said that the Great Sean Conner once stated: > > How do I write my blog entries? I use my regular editor and I've adopted > a style a few years ago where I write ... not exactly a sentence per line, > but a thought per line. Okay, an example from a previous post [4]: This footnote [4] should have pointed to here: http://boston.conman.org/2020/01/13.1 > Web browsers don't care about the raw formatting---they reflow it, with > styling coming from HTML and CSS. And the reason I keep it very ragged like > that is to make it easier to move sentences and fragments about when editing > the entry. I also run the entry through a processing script to convert some > shortcuts, like turning `` into “ or "1^st" into "1<sup>ST</sup>" [4] or > even pulling out image sizes for the <IMG> tag. And this footnote [4] should have been footnote [5]. I got distracted when writing the email. -spc > [5] Yes, I have a type of markup I use, but it's totally custom to me > and how I write entries, but it does quite a bit. Here's a sample > that includes all the features (I think I keep this up to date): > > https://github.com/spc476/mod_blog/blob/master/NOTES/testmsg > > And the script I use: > > https://github.com/spc476/mod_blog/blob/master/Lua/format.lua
On Thu, Jan 16, 2020 at 05:00:59PM -0500, Michael Lazar wrote: > To be clear, I agree with you that a max line length shouldn't be enforced by > the spec. I support the current wording of SHOULD as opposed to MUST. I'm more > talking about establishing a recommended best practice. Gemini files with lines > over N characters long should still be considered "valid" and must be accepted > by any gemini client. They might just look a little funky depending on how > sophisticated your client is. Oh, absolutely. All this discussion about hard wrapping at 40 chars is firmly in the realm of best practice recommendation. > The spec-spec as written right now says that all gemini lines should be > formatted so that they can be reflowed by clients if the client chooses to do so. > That spec has been almost unanimously rejected by current gemini servers, in > favor of hard wrapping at 70-80 characters. The current spec-spec says that clients can reflow long lines if they want to, but it doesn't place any obligation on authors or servers to actually provide long lines. Cheers, Solderpunk
On Thu, Jan 16, 2020, at 2:24 PM, solderpunk wrote: > The current spec-spec says that clients can reflow long lines if they > want to, but it doesn't place any obligation on authors or servers to > actually provide long lines. Hmm, even allowing reflowing sounds like it could cause incompatibility issues. Should there be standardized markup for bullet points? What about poems, something I think are much easier to find on Gopher than on the HTML web? How about markdown title lines immediately followed by text? I think that wrapping then reflowing fundamentally leads to loss of original information. I'll provide below a few things that I'm not sure how to reflow. Note that, except for the haiku, many of the lines could go past the viewport width, making fixed text mode not an option (especially for lists).

This is a haiku
Should it be reflowed or not?
I don't think it should.

- this is a bullet point
- so is this
On Thu, Jan 16, 2020 at 03:26:13PM -0800, Aaron Janse wrote: > > Hmm, even allowing reflowing sounds like it could cause incompatibility > issues. Should there be standardized markup for bullet points? What about > poems, something I think are much easier to find on Gopher than on the > HTML web? How about markdown title lines immediately followed by text? Well, a lot of this is exactly what the recently proposed ``` was supposed to help fix. Things precisely like the haiku example you provided would be enclosed within ``` lines so that they didn't get mangled by a reflowing client. A speedy recap of the history of text/gemini, excluding the link syntax debate which is thankfully past us:
Hmm, I've just realised something which might salvage this whole mess. It's possible even that what I'm about to describe is exactly what the "long line" folks have been talking about all along without my realising it. Sorry for missing it if so, but I don't think it was ever made explicit! I have always conceptualised our choice as being between two alternatives: 1. Hard-wrapped text which clients display verbatim, line-by-line, exactly the way Gopher works. 2. What I'll call "full blown reflowing", the way HTML and LaTeX work. This involves lines longer than the viewport being split up into multiple shorter lines, but also consecutive non-blank lines shorter than the viewport being joined into fewer, longer lines. Basically, this model of reflow is "paragraph based". Consecutive non-empty lines of text form clumps called paragraphs which are formatted as a whole, whether this results in more or fewer total lines compared to the "source". There is another option that I hadn't thought about until now, which is to do only the first half of 2. above. That is, lines longer than the viewport get broken up nicely at word boundaries into lines of length equal to or less than the viewport width - but that's it. Consecutive shorter lines are *not* joined together. Blank lines in the "source" are rendered, one by one, into empty vertical space. The renderer has no explicit concept of a paragraph. This allows writing things we want to look like paragraphs as individual long lines (easy for most editors, plays nicely with version control) with the knowledge they'll be nicely wrapped to the viewport width, but it doesn't break things like one word per line for emphasis, because the lines won't be sucked up and joined together, and it also doesn't break lists for the same reason (more on lists below). I kind of like this. 
Unlike the paragraph-oriented web/LaTeX model where ten consecutive newlines and two consecutive newlines are identical, this also allows us to put larger gaps between paragraphs to give the impression of pausing for thought. Neat, huh? This does rescue lists, right? A list of short items:
On Fri, Jan 17, 2020 at 01:33:12PM +0000, solderpunk wrote: > It's possible even that what I'm about to describe is exactly what the > "long line" folks have been talking about all along without my > realising it. > ... > There is another option that I hadn't thought about until now, which > is to do only the first half of 2. above. That is, lines longer than > the viewport get broken up nicely at word boundaries into lines of > length equal to or less than the viewport width - but that's it. > Consecutive shorter lines are *not* joined together. Blank lines in > the "source" are rendered, one by one, into empty vertical space. > The renderer has no explicit concept of a paragraph. Yes, this is exactly what I was trying to say. There's no need to join subsequent lines together. That's just reflowing, as far as I can tell. But you don't need to reflow in that specific way unless your content is inconveniently hard-wrapped when you want it to be resizable. This sort of reflowing leads to loss of information if the single line-breaks were intended and meaningful, so it makes more sense to me to not add those line-breaks into the content, but into their rendering on the client-side. This is the happiest I've been to see an email in a while honestly lol ~ lel
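The difference between the two models being discussed is easy to see side by side. The sketch below is an illustration only: `full_reflow` and `half_reflow` are names made up here, and Python's `textwrap` stands in for whatever wrapping routine a client would use. The paragraph-based version joins consecutive non-blank lines before wrapping; the line-based version wraps long lines but never joins short ones.

```python
import textwrap


def full_reflow(text, width):
    """Paragraph-based: join consecutive non-blank lines, then wrap."""
    out = []
    for para in text.split("\n\n"):
        out.append(textwrap.fill(" ".join(para.split("\n")), width=width))
    return "\n\n".join(out)


def half_reflow(text, width):
    """Line-based: wrap long lines, never join short ones."""
    out = []
    for line in text.split("\n"):
        # textwrap.wrap("") is [], so blank lines are kept explicitly.
        out.extend(textwrap.wrap(line, width=width) or [""])
    return "\n".join(out)


if __name__ == "__main__":
    src = "- this is a bullet point\n- so is this"
    print(full_reflow(src, 40))  # the two bullets are merged into one line
    print(half_reflow(src, 40))  # the two bullets survive as written
```

With the line-based version, short lines such as bullets or verse come back exactly as written whenever the viewport is at least as wide as the longest of them, and runs of blank lines are preserved one for one.
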
On Fri, Jan 17, 2020 at 08:58:04AM -0500, lel wrote: > > Yes, this is exactly what I was trying to say. Ah, right, sorry for not getting that. I think this whole conversation, despite Sean's best efforts to give us formally defined terms, is suffering a lot from the fact that things like "reflowing" mean different things to different people or in different contexts, so we're often talking past each other. We'd still need ``` in this system, right, to avoid e.g. long lines of source getting mangled on very narrow phone screens? So minimal rendering pseudocode looks like this:

    preformatted = False
    for line in all_the_lines:
        if line == "```":
            preformatted = not preformatted
        elif preformatted:
            print(line)
        elif line.startswith("=>"):
            handle_link(line)
        else:
            wrap_line_to_viewport(line)

Where wrap_line_to_viewport may need to be written by the user if there isn't a library function (and should print empty lines if given them). Non-minimal rendering just involves replacing that final "else" clause with more "elifs" to catch e.g. list items or headers (dispatching to different functions, like wrap_bold_line_to_viewport to display headings in bold). You can add as many or as few of those extra clauses as you like to pretty things up, as long as whatever prettiness you want doesn't depend upon anything beyond the current line. If this rendering code *is* fed text/gemini that has been hard wrapped to a width less than or equal to the viewport, that text comes back unmangled (but narrower than it otherwise could be). Are there really no catches beyond this? I'm sure there must be, but if not I guess the next thing to do is to start thinking about how difficult a "good enough" (not necessarily perfect) implementation of wrap_line_to_viewport is in most languages, and decide whether or not we think that burden is too high. Cheers, Solderpunk
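For anyone who wants to try the rendering loop from the message above, here is one way it might be made runnable. It is a sketch under assumptions: `render` collects output lines instead of printing them, `handle_link` is an optional caller-supplied callback (link lines are passed through unchanged without one), and Python's `textwrap` plays the role of wrap_line_to_viewport.

```python
import textwrap

FENCE = "`" * 3  # preformatted toggle line


def render(lines, width=40, handle_link=None):
    """Return display lines for a text/gemini document, one source line at a time."""
    out = []
    preformatted = False
    for line in lines:
        if line == FENCE:
            preformatted = not preformatted  # toggle lines are not displayed
        elif preformatted:
            out.append(line)  # shown verbatim, never wrapped
        elif line.startswith("=>"):
            out.append(handle_link(line) if handle_link else line)
        else:
            # Wrap long lines; an empty source line stays an empty output line.
            out.extend(textwrap.wrap(line, width=width) or [""])
    return out


if __name__ == "__main__":
    doc = ["Hello", FENCE, "  x   y", FENCE,
           "=> gemini://example.org Example", "", "a " * 30]
    for out_line in render(doc, width=10):
        print(out_line)
```
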
On Fri, Jan 17, 2020 at 02:27:15PM +0000, solderpunk wrote: > ... > I guess the next thing to do is to start thinking about how > difficult a "good enough" (not necessarily perfect) implementation of > wrap_line_to_viewport is in most languages, and decide whether or not we > think that burden is too high. I don't think it's too high. I just wrote up an awful implementation in the most naive way possible and it came out ~30 lines, with a good deal of that being the edge case of words longer than the viewport width (i handled this by splitting these words with hyphens at the viewport width; i don't know if there are other edge-cases that need to be handled but i can't think of any). I'm attaching it. You can execute it with:

```
python3 wrap.py [length] [text]
```

Or import it and call wrap with first parameter being the text and the second being the viewport width. I can't stress enough how "not perfect" this is (I wrote it up on my phone, of all things, in about 5 minutes, so very little thought was put into making it good) but at the very least it's doable. I know you mention that python can do this natively, but I didn't use any dynamic-typing, and all that's required for this sort of implementation is a method to split by newline and space, so really something like this could be done pretty universally. But it's still not good. ~ lel

-------------- next part --------------
A non-text attachment was scrubbed...
Name: wrap.py
Type: text/x-python
Size: 1233 bytes
Desc: not available
URL: <https://lists.orbitalfox.eu/archives/gemini/attachments/20200117/4becec65/attachment.py>
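Since the attachment did not survive in the archive, here is a rough reconstruction of the approach the message describes: split on newlines and spaces, fill lines greedily, and break words longer than the viewport with a hyphen. It is not lel's wrap.py; the details (collapsing runs of spaces, requiring a width of at least 2) are assumptions made for this sketch.

```python
def wrap(text, width):
    """Greedy word wrap; words longer than width are split with hyphens."""
    if width < 2:
        raise ValueError("width must be at least 2")
    lines = []
    for source_line in text.split("\n"):
        current = ""
        for word in source_line.split(" "):
            if not word:
                continue  # collapse runs of spaces
            # Edge case: a word that cannot fit on any line is emitted in
            # hyphenated chunks of exactly `width` characters.
            while len(word) > width:
                if current:
                    lines.append(current)
                    current = ""
                lines.append(word[: width - 1] + "-")
                word = word[width - 1:]
            if not current:
                current = word
            elif len(current) + 1 + len(word) <= width:
                current += " " + word
            else:
                lines.append(current)
                current = word
        lines.append(current)  # also emits "" for blank source lines
    return lines


if __name__ == "__main__":
    for out_line in wrap("a few short words and one extraordinarilylongword", 12):
        print(out_line)
```
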
lel said: > Yes, this is exactly what I was trying to say. Ahaha, same here. Thank you, solderpunk, for hearing us out. Sorry for not being so explicit! > then defining special > line types for headings, etc. doesn't actually add any extra burden on > simple clients. It's basically a question of how many cases you want > to handle in a switch statement... This is definitely a big appeal for me. Fancy clients could also wrap quote blocks by putting a greater-than sign on the beginning of each wrapped line, without requiring simple clients to do the same :-) > even bare minimal clients need to be able to wrap long lines to result > in readability. In order to deal with wide clients, yes. But if we did hard wrapping, we'd have to do the exact same thing (but plus reflowing) on narrow clients anyway. > I don't think it's too high. I just wrote up an awful implementation in > the most naive way possible and it came out ~30 lines Plus, most systems have bash, so worst case, someone either calls `fmt`/`par` from the client source code or pipes their client output into `fmt`/`par`. > This is the happiest I've been to see an email in a while honestly lol Definitely. Unless there is a problem I'm not seeing, this sounds very exciting! Cheers!
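The quote-block idea mentioned above could look something like this in a fancier client. Quote lines were not part of any draft at this point, so treating a leading ">" as a quote marker is purely an assumption for illustration, as is the helper name `wrap_quote`; Python's `textwrap` does the actual wrapping.

```python
import textwrap


def wrap_quote(line, width):
    """Wrap one source line, repeating '> ' on every wrapped quote line."""
    if not line.startswith(">"):
        return textwrap.wrap(line, width=width) or [""]
    body = line[1:].strip()
    wrapped = textwrap.wrap(body, width=width,
                            initial_indent="> ", subsequent_indent="> ")
    return wrapped or [">"]  # a bare ">" line stays a bare ">"


if __name__ == "__main__":
    for out_line in wrap_quote("> a quoted sentence that is longer than the viewport", 20):
        print(out_line)
```

A simple client can ignore all of this and wrap the line like any other; only the first wrapped line would then carry the marker.
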
On Fri, Jan 17, 2020 at 08:52:52AM -0800, Aaron Janse wrote: > > Definitely. Unless there is a problem I'm not seeing, this sounds very > exciting! I'm excited by this, too. Because I can't control myself I've already started writing up a rough spec for this, included below for comment. I definitely want to wait until more people have chimed in before getting too excited, and I'd especially like to hear from all, or at least most, of the authors of existing clients before making even a tentative decision. But I have to say, in writing up the below, I get a very good feeling. The description is not exactly short, but it's very *not fiddly* compared to how it would look if we were going to support "reflow in both directions". This "half reflow" approach seems to hit an amazing sweetspot of simplicity, richness and adaptability of display widths. I have figured out how to get vim to let me write this kind of content in a pleasant and easy way, so I think I'm happy to switch. And if anybody *really* hates it, they can still hard-wrap their files and wide clients following these rules will display it properly. Narrow clients will end up with a mess, but that's just the price people will have to pay for sticking to their hard-wrapping guns. Many of them will probably be happy to dismiss mobile clients out of hand, anyway! Feedback on the below very welcome! Cheers, Solderpunk --------- The text/gemini syntax is inspired by and looks visually similar to Markdown, but it is significantly simpler (and consequently less powerful). The syntax is strictly line-based: a file can be processed one line at a time and each line can be handled in a way which does not depend upon any previous or subsequent lines. The only internal state a rendering engine needs to maintain is a single boolean variable (for preformatting mode, described below). The lines of a text/gemini file come in 8 different types. 
It is possible to unambiguously determine the type of a line by only considering its first three characters. Knowing a line's type is the only information necessary to know how to handle it correctly. Just like Gemini's status code system is designed so that simple clients can ignore the second digit of the code and still function correctly, the text/gemini syntax is designed so that simple clients can treat several different types of line identically and still provide a usable representation of the document. A bare-minimum client need only recognise four different types of line. These are: 1. PREFORMATTED TOGGLE LINES Lines consisting of only and exactly three back ticks (```) are preformatted toggle lines. These lines should not be displayed to the user and are instead used to toggle preformatted mode on and off (the mode is off when the parser is initialised). When preformatted mode is on, the usual rules for identifying line types are suspended and all lines should be unconditionally identified as PREFORMATTED TEXT LINES. 2. PREFORMATTED TEXT LINES Preformatted text lines should be presented to the user in a monowidth font exactly as they appear in the text/gemini file. Lines longer than the client's viewport must not be wrapped, leading or trailing whitespace must not be removed, etc. Handling of lines longer than the viewport is client-specific. Advanced clients may display a horizontal scrollbar. Simple clients may simply truncate the line. 3. LINK LINES Lines beginning with the two characters => are link lines. We all know how these work by now. 4. TEXT LINES All other lines are TEXT LINES. They should be presented to the user in a client-specific "pleasing manner". Lines longer than the client's viewport should be wrapped into multiple lines of a suitable length. Variable width fonts may be used. Blank lines are a special case of TEXT LINES and should be reproduced in the output. 
It is important to realise that while Markdown, HTML, LaTeX and many other document markup formats are "block based" or "paragraph based", the text/gemini format is not. Consecutive, non-blank lines of text which are much shorter than the client's viewport should *not* be combined into longer lines. Equivalently, *every* newline character in a text/gemini file is significant, not only consecutive pairs of newline characters. Clients ignoring this fact will produce incorrect output. Advanced clients may also recognise the following additional line types. Simple clients may treat all line types below as TEXT LINES with no loss of essential function. 5. HEADING LINES Lines beginning with one, two or three consecutive # characters are HEADING LINES, corresponding to headings, subheadings or susubheadings respectively. The text following the # characters (with leading whitespace removed) constitute the heading and should be displayed to the user. Clients MAY choose to apply special styling to headings to distinguish them from ordinary lines. However, the primary purpose of HEADING LINES is to represent the internal structure of the document in a machine-readable way. Advanced clients can use this information to, e.g. display a hierarchically formatted "table of contents" for a long document in a side-pane, allowing users to easily jump to specific sections without excessive scrolling. Or CMS-style tools automatically generating menus or Atom/RSS feeds for a directory of text/gemini files can use the first heading in a file as a human-friendly label for links to it. 6. UNORDERED LIST ITEMS Lines beginning with a * are UNORDERED LIST ITEMS. This line type exists purely for stylistic reasons. The * may be replaced in advanced clients by a bullet symbol. Any text after the * character should be presented to the user as if it were a TEXT LINE, i.e. wrapped to fit the viewport. 
Advanced clients can take the space of the bullet symbol into account when performing wrapping and ensure that all lines of text corresponding to the item are aligned with one another. 7. ORDERED LIST ITEMS As above with obvious changes.
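
To make the "one line at a time, one boolean of state" claim in the draft above concrete, here is a minimal Python sketch of a line classifier. It is only an illustration, not a reference implementation: the names `classify` and `classify_document` are made up for this example, any line starting with # is treated as a heading without checking the "one, two or three" limit, and ORDERED LIST ITEMS are left out because the draft does not yet say what they look like.

```python
def classify(line, preformatted):
    """Return (line_type, new_preformatted_state) for a single line.

    The only state carried between lines is the `preformatted` boolean,
    as the draft describes.
    """
    if line == "```":
        # Toggle lines flip the mode and are never shown to the user.
        return "PREFORMATTED TOGGLE", not preformatted
    if preformatted:
        # While the mode is on, every other line is preformatted text.
        return "PREFORMATTED TEXT", preformatted
    if line.startswith("=>"):
        return "LINK", preformatted
    if line.startswith("#"):
        # Assumption: no check that there are at most three # characters.
        return "HEADING", preformatted
    if line.startswith("*"):
        return "UNORDERED LIST ITEM", preformatted
    return "TEXT", preformatted


def classify_document(text):
    """Classify every line of a document in a single pass."""
    preformatted = False  # the mode is off when the parser is initialised
    result = []
    for line in text.split("\n"):
        kind, preformatted = classify(line, preformatted)
        result.append((kind, line))
    return result
```

A bare-minimum client could map HEADING and UNORDERED LIST ITEM back to TEXT and still behave as the draft requires.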
> Lines consisting of only and exactly three back ticks (```) are preformatted
> toggle lines.

Hmm, this might confuse some people who are used to markdown's allowance of specifying the language:

```python
print("Hello!")
```

Plus, if we *allow* people to specify the language, maybe some clients could implement syntax highlighting. Would that be too complex?

If not, what would happen if people add text to the closing three ticks?

> Lines longer than the client's viewport must not be wrapped, leading or
> trailing whitespace must not be removed, etc. Handling of lines longer than
> the viewport is client-specific

These two sentences contradict each other, I think. Maybe state that clients MUST NOT remove trailing space, etc, but state that clients SHOULD allow readers to copy the text such that it can be pasted directly into a text editor then run (this would allow the 100-line python client to just print the lines; terminals should un-soft-wrap lines longer than the viewport when copied to the clipboard).

> They should be presented to the user in a client-specific "pleasing manner"
> Variable width fonts may be used.

I *love* that this is part of the spec!

> 6. UNORDERED LIST ITEMS

Are minus signs allowed for unordered list items? How about plus signs etc? I'm somewhat in favor of limiting bullets to asterisks and minuses.

Can top-level bullets begin with a space? What about nested bullet points?

> 7. ORDERED LIST ITEMS

I'd further specify what is/isn't allowed here. For example:
- Some people may try `1)` instead of `1.`. I personally think that spec should say that only the latter is allowed but clients may choose to *render* ordered lists as the former
- What about nested ordered bullet points. Wouldn't they start with whitespace?
- Markdown allows lists to be auto-numbered. For example, the nested bullet points would be re-numbered from 1 to 5 in markdown:
  1. one
  1. two
  5. three
  2. four
  1. five
- What about lettered lists (A-Z)? I think these would be cool to have but I doubt they'd be worth the complexity. I'd be explicit in the spec
- What about roman numerals? I don't think these should be allowed

While we could leave some stuff un-specified and see how things play out, I think there could be some value in limiting authors for the sake of preventing a scenario where making a client is so complex that very few nice clients exist (as we see in the world of web browsers).

> The lines of a text/gemini file come in 8 different types.

Hmm, I only saw 7 different types specified. I also recommend you specify that people may use greater-than-symbol quotes, which may be nested. I'd recommend that authors MUST NOT unnecessarily hard-wrap their quotes, suggesting that advanced clients MAY add a visual greater-than symbol to the beginning of each wrapped line.

Example:

> hello this is wider than the viewport

Displayed by advanced clients as:

> hello this is
> wider than the
> viewport

---

I asked a ton of questions, but I still like the direction we're going. I'd just like to bring up one more thing:

What about Gemini proxies of comment thread sites, such as hacker news (*waves to Michael Lazar's awesome HN Gopher proxy*). I think we should take those into consideration.

If we wanted to introduce new syntax (which is a bit crazy, but fun), we could use pipes like greater-than signs. We could use the exact same code for this that we would use for fancy-wrapping greater-than-sign quotes. For example, the source code would be:

| # John
| this is a comment wider than the very narrow viewport
|
| | # Joe
| | and this is a sub comment that is very very very long!
| |
| | | # You
| | | and this is a sub sub comment!
|
| | # Bob
| | and this is a sub comment

The output on a narrow viewport would be:

| # John
| this is a comment wider
| than the very narrow
| viewport
|
| | # Joe
| | and this is a sub
| | comment that is very
| | very very long!
| |
| | | # You
| | | and this is a sub
| | | sub comment!
|
| | # Bob
| | and this is a sub
| | comment

I know it's super crazy, but it sounds simple to implement. I was considering mentioning it in its own thread, but I think this is the most relevant time to bring it up.

Cheers!
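
The "same code for quotes and pipes" idea can be sketched roughly as follows in Python. This is an illustration under assumptions, not a proposed implementation: `wrap_with_prefix` and the `PREFIX` pattern are hypothetical names, the prefix is taken to be any run of `>` or `|` markers each optionally followed by one space, and wrapping is delegated to the standard `textwrap` module, which counts characters rather than display columns.

```python
import re
import textwrap

# Assumption: a prefix is one or more '>' or '|' markers, each optionally
# followed by a single space, at the very start of the line.
PREFIX = re.compile(r"^((?:[>|] ?)+)")


def wrap_with_prefix(line, width):
    """Wrap one line to `width` columns, repeating its quote/pipe prefix."""
    match = PREFIX.match(line)
    prefix = match.group(1) if match else ""
    body = line[len(prefix):]
    if not body.strip():
        # A bare prefix (a spacer line such as "|") is kept as it is.
        return [line.rstrip()]
    pieces = textwrap.wrap(body, width=max(1, width - len(prefix)))
    return [prefix + piece for piece in pieces]
```

For example, `wrap_with_prefix("> hello this is wider than the viewport", 16)` gives the three-line quote shown in the message above, and a line without any prefix is simply wrapped as ordinary text.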
It was thus said that the Great solderpunk once stated:
>
> Feedback on the below very welcome!

Red alert! Raise shields! Strap in! This is going to be a bumpy ride.

Attached is a sample document that I created that I'm trying to format per the spec below. I'm already running into trouble, else I would have not replied with this particular response.

> The text/gemini syntax is inspired by and looks visually similar to
> Markdown, but it is significantly simpler (and consequently less
> powerful). The syntax is strictly line-based: a file can be processed
> one line at a time and each line can be handled in a way which does
> not depend upon any previous or subsequent lines. The only internal
> state a rendering engine needs to maintain is a single boolean
> variable (for preformatting mode, described below).
>
> The lines of a text/gemini file come in 8 different types.

I only see 7 listed below.

> It is
> possible to unambiguously determine the type of a line by only
> considering its first three characters. Knowing a line's type is the
> only information necessary to know how to handle it correctly.

Not quite true, even according to this document. Leading white space in lists is the glaring exception here.

> Just like Gemini's status code system is designed so that simple
> clients can ignore the second digit of the code and still function
> correctly, the text/gemini syntax is designed so that simple clients
> can treat several different types of line identically and still
> provide a usable representation of the document. A bare-minimum
> client need only recognise four different types of line. These are:

Yet seven are listed.

> 1. PREFORMATTED TOGGLE LINES
>
> Lines consisting of only and exactly three back ticks (```) are
> preformatted toggle lines. These lines should not be displayed to the
> user and are instead used to toggle preformatted mode on and off (the
> mode is off when the parser is initialised). When preformatted mode is
> on, the usual rules for identifying line types are suspended and all
> lines should be unconditionally identified as PREFORMATTED TEXT LINES.
>
> 2. PREFORMATTED TEXT LINES
>
> Preformatted text lines should be presented to the user in a monowidth
> font exactly as they appear in the text/gemini file. Lines longer
> than the client's viewport must not be wrapped, leading or trailing
> whitespace must not be removed, etc. Handling of lines longer than the
> viewport is client-specific. Advanced clients may display a
> horizontal scrollbar. Simple clients may simply truncate the line.

No real problems so far.

> 3. LINK LINES
>
> Lines beginning with the two characters => are link lines. We all know
> how these work by now.

Again, no real problem.

> 4. TEXT LINES
>
> All other lines are TEXT LINES. They should be presented to the user
> in a client-specific "pleasing manner". Lines longer than the
> client's viewport should be wrapped into multiple lines of a suitable
> length. Variable width fonts may be used. Blank lines are a special
> case of TEXT LINES and should be reproduced in the output.
>
> It is important to realise that while Markdown, HTML, LaTeX and many
> other document markup formats are "block based" or "paragraph based",
> the text/gemini format is not. Consecutive, non-blank lines of text
> which are much shorter than the client's viewport should *not* be
> combined into longer lines. Equivalently, *every* newline character
> in a text/gemini file is significant, not only consecutive pairs of
> newline characters. Clients ignoring this fact will produce incorrect
> output.

Fair enough. And so far, things are okay.

> Advanced clients may also recognise the following additional line
> types. Simple clients may treat all line types below as TEXT LINES
> with no loss of essential function.

It's here we start running into trouble.

> 5. HEADING LINES
>
> Lines beginning with one, two or three consecutive # characters are
> HEADING LINES, corresponding to headings, subheadings or sub-subheadings
> respectively. The text following the # characters (with leading
> whitespace removed)

The parenthetical here is ambiguous. Does it refer to this issue?

#A title
## A title with space between the '#' and text
###     Even more white space

or

#A title
 ##A title with leading space before the '#'
    ###Even more white space

I'm thinking the former now that I'm replying, but my code deals with both cases combined, so I can handle:

 # A title
  ##  A title with spaces
    ###     Yippee! Spaces galore!

Almost---tabs (and yes, I do use tabs---I like me the tabs) are an issue and while I can handle them (I have code that will expand them up to 8 spaces) not everybody has code to deal with this. So question:

WHAT ABOUT TABS?

They WILL show up.

> constitute the heading and should be displayed to
> the user. Clients MAY choose to apply special styling to headings to
> distinguish them from ordinary lines. However, the primary purpose of
> HEADING LINES is to represent the internal structure of the document
> in a machine-readable way. Advanced clients can use this information
> to, e.g. display a hierarchically formatted "table of contents" for a
> long document in a side-pane, allowing users to easily jump to
> specific sections without excessive scrolling. Or CMS-style tools
> automatically generating menus or Atom/RSS feeds for a directory of
> text/gemini files can use the first heading in a file as a
> human-friendly label for links to it.

So another question. I have some headings like this:

### Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras sodales eget nisi quis condimentum. Donec ipsum arcu, fermentum eu ullamcorper sit amet, facilisis id nunc. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Nam tempus nulla ut dolor luctus malesuada. Suspendisse orci sem, semper at maximus non, pharetra et justo. Quisque lectus arcu, viverra ac convallis eu, vulputate ut enim. Nulla aliquam, lacus consequat suscipit facilisis, nisl tortor facilisis nisi, vel mattis eros arcu sed tellus. Duis quis lectus pellentesque, posuere dolor ut, sodales massa. Proin vel blandit mauris.

Given a screen width of 40, which of the four below should be displayed?

### Lorem ipsum dolor sit amet, consect

### Lorem ipsum dolor sit amet, cons...

### Lorem ipsum dolor sit amet,
### consectetur adipiscing elit. Cras
### sodales eget nisi quis condimentum.
### Donec ipsum arcu, fermentum eu
### ullamcorper sit amet, facilisis id
### nunc. Class aptent taciti sociosqu
### ad litora torquent per conubia
### nostra, per inceptos himenaeos. Nam
### tempus nulla ut dolor luctus
### malesuada. Suspendisse orci sem,
### semper at maximus non, pharetra et
### justo. Quisque lectus arcu, viverra
### ac convallis eu, vulputate ut enim.
### Nulla aliquam, lacus consequat
### suscipit facilisis, nisl tortor
### facilisis nisi, vel mattis eros
### arcu sed tellus. Duis quis lectus
### pellentesque, posuere dolor ut,
### sodales massa. Proin vel blandit
### mauris.

### Lorem ipsum dolor sit amet,
    consectetur adipiscing elit. Cras
    sodales eget nisi quis condimentum.
    Donec ipsum arcu, fermentum eu
    ullamcorper sit amet, facilisis id
    nunc. Class aptent taciti sociosqu
    ad litora torquent per conubia
    nostra, per inceptos himenaeos. Nam
    tempus nulla ut dolor luctus
    malesuada. Suspendisse orci sem,
    semper at maximus non, pharetra et
    justo. Quisque lectus arcu, viverra
    ac convallis eu, vulputate ut enim.
    Nulla aliquam, lacus consequat
    suscipit facilisis, nisl tortor
    facilisis nisi, vel mattis eros
    arcu sed tellus. Duis quis lectus
    pellentesque, posuere dolor ut,
    sodales massa. Proin vel blandit
    mauris.

### Lorem ipsum dolor sit amet,
consectetur adipiscing elit. Cras
sodales eget nisi quis condimentum.
Donec ipsum arcu, fermentum eu
ullamcorper sit amet, facilisis id
nunc. Class aptent taciti sociosqu ad
litora torquent per conubia nostra, per
inceptos himenaeos. Nam tempus nulla ut
dolor luctus malesuada. Suspendisse
orci sem, semper at maximus non,
pharetra et justo. Quisque lectus arcu,
viverra ac convallis eu, vulputate ut
enim. Nulla aliquam, lacus consequat
suscipit facilisis, nisl tortor
facilisis nisi, vel mattis eros arcu
sed tellus. Duis quis lectus
pellentesque, posuere dolor ut, sodales
massa. Proin vel blandit mauris.

Or is this way into the "you have *got* to be kidding!" territory?

I swear, I'm not trying to take these things to the extreme ... well, okay, I
It was thus said that the Great solderpunk once stated:
>
> Feedback on the below very welcome!

Okay, I went ahead and implemented some of this spec. I didn't bother with the list stuff as that has to be clarified. But the rest I did, and so far, the implementation wasn't *too* bad---about 50 lines, but I'm using a library to wrap text which may or may not exist in other languages. Anyway, you guys can test out a server rendered version at various widths (you can even specify the width) at:

gemini://gemini.conman.org/test/testreflow.gemini

Some notes---the link section may or may not be how clients render links---since the output is pure text, I decided to just make them look distinct. Also, that large block of text (with the header "Very long lines, one after the other") I might change (add a blank line that doesn't exist) but that would complicate the implementation (in case there *is* a blank line in the input), so I'm still thinking on that one.

But in any case, here's a sample implementation (the link to the code is in the page mentioned above) that I'm presenting for your bikeshedding excitement.

-spc (Remember---I haven't implemented lists yet ... )
On Thu, Jan 16, 2020 at 5:12 PM solderpunk <solderpunk at sdf.org> wrote:
>
> On Thu, Jan 16, 2020 at 01:09:18PM -0800, Aaron Janse wrote:
> > Oops, I meant to mention two things:
> > 1. Clients are already using TLS. I think wrapping is significantly
> > easier than encryption.
>
> I was about to quip that the TLS is made really easy by high-level
> library support while the wrapping has to be done manually, but then I
> thought to check and, holy heck, Python has a `textwrap` module I wasn't
> aware of.

Python's textwrap module is fundamentally flawed for unicode and they have no intention of ever fixing it [0]. Once you start going down the rabbit hole of CJK characters, emojis, grapheme clusters, etc. it becomes exceedingly hard to figure out how to correctly determine the width of unicode text. You can get it working 99% of the time, but there's always those fringe cases that no one thinks about until somebody files a bug report.

I don't know if this has any bearing on the discussion, but it's worth keeping in the back of your mind if you intend to make unicode a first-class citizen. I would be cautious about calling text wrapping "significantly" easier than TLS or anything else for that matter. This was actually one of the things that drew me to gopher in the first place, I could assume everything was ASCII and throw all of that complexity out the window.

[0] https://bugs.python.org/issue24665
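
To show the kind of mismatch being described, here is a small Python sketch comparing the character count with an estimated column count. The helper `display_width` is a hypothetical name invented for this example, and it is only a rough approximation: it uses the standard `unicodedata` module to count wide and fullwidth East Asian characters as two columns and to skip combining marks, and it does not attempt emoji sequences, grapheme clusters, or terminal and font differences.

```python
import unicodedata


def display_width(text):
    """Estimate how many terminal columns `text` occupies.

    Rough approximation only: wide ('W') and fullwidth ('F') characters
    count as two columns, combining marks as zero, everything else as one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn over the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width
```

For a string such as "日本語", `len` reports 3 while this estimate reports 6, so a wrapper that budgets by `len` can overrun a narrow viewport.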
On Fri, Jan 17, 2020 at 09:59:18AM -0800, Aaron Janse wrote:
> > Lines consisting of only and exactly three back ticks (```) are preformatted
> > toggle lines.
>
> Hmm, this might confuse some people who are used to markdown's allowance of
> specifying the language:
>
> ```python
> print("Hello!")
> ```
>
> Plus, if we *allow* people to specify the language, maybe some clients could
> implement syntax highlighting. Would that be too complex?

Ah, yes, Tomasino mentioned this earlier. I guess it is harmless (?) to change the definition of these lines to any which begin with ```, not which consist strictly of ```. That lets sufficiently fancy clients support this, the rest can just ignore it. These lines are never actually shown to the user, so it doesn't matter what junk comes after the ```.

> If not, what would happen if people add text to the closing three ticks?

Well, that's unambiguous. If we didn't make the above change, lines with text after the three ticks would not meet the definition of a preformatted toggle line, so a client would identify them as a text line and they would be rendered accordingly.

> > Lines longer than the client's viewport must not be wrapped, leading or
> > trailing whitespace must not be removed, etc. Handling of lines longer than
> > the viewport is client-specific
>
> These two sentences contradict each other, I think.

Hmm, okay, I guess they technically do. Clients can handle over-long lines however they want as long as they don't wrap them?

> Are minus signs allowed for unordered list items? How about plus signs etc?
> I'm somewhat in favor of limiting bullets to asterisks and minuses.

I'm fairly strongly in favour of limiting everything to exactly one way of doing it. The other day I skimmed the CommonMark spec

https://spec.commonmark.org/0.29/

to reassure myself we weren't doing too much wheel re-invention. Holy heck, there are so many different ways to do everything! Simple is best, less is more, one is enough!

> Can top-level bullets begin with a space? What about nested bullet points?

If a line begins with a space, then it doesn't begin with a *! So then it's by definition not an UNORDERED LIST ITEM. It's a TEXT LINE. I really did mean "begins with" everywhere I said it. IMHO this syntax quickly becomes really unappealing unless the task of deciding which type of line a line is remains dirt simple.

Yes, this means no nested lists. It may seem like I've gone nuts and suddenly happily let a whole bunch of complicated stuff into the spec, but I really haven't! This is still supposed to be a very simple syntax, which inevitably comes with limitations. There is no nested anything in the whole syntax. Reliable detection of nestedness in the face of even slight variation in how authors write things will require considering lines in the context of previous or subsequent lines, and that's a no-no here. I'm only willing to allow all these fun toys in if we can do it in such a way that an adequate rendering job can be done by considering each line of the file in perfect isolation, with a single pass of the entire document.

> > 7. ORDERED LIST ITEMS
>
> I'd further specify what is/isn't allowed here. For example:
> - Some people may try `1)` instead of `1.`. I personally think that spec should
> say that only the latter is allowed but clients may choose to *render*
> ordered lists as the former
> - What about nested ordered bullet points. Wouldn't they start with whitespace?
> - Markdown allows lists to be auto-numbered. For example, the nested bullet
> points would be re-numbered from 1 to 5 in markdown:
>   1. one
>   1. two
>   5. three
>   2. four
>   1. five
> - What about lettered lists (A-Z)? I think these would be cool to have but I
> doubt they'd be worth the complexity. I'd be explicit in the spec
> - What about roman numerals? I don't think these should be allowed
>
> While we could leave some stuff un-specified and see how things play out, I
> think there could be some value in limiting authors for the sake of preventing
> a scenario where making a client is so complex that very few nice clients exist
> (as we see in the world of web browsers).

Okay, I totally goofed up here in declaring the changes as "obvious". I meant "obvious to everybody who has read gemini://mozz.us/markdown/design_document and accepted it as their personal Lord and saviour". Well, actually, I read that and foolishly assumed the nice approach to ordered list items was more or less the same as standard Markdown. Turns out it's not and Michael has done very good simplifying work here.

As said above, I am strongly in favour of there being exactly one way to do things, and of identifying a line's type being brutally simple. This totally rules out letting authors actually write numbers. Actually having a number followed by a period define such a line type would also bring a very high risk of falsely identifying ordered list item lines when processing hard-wrapped text if a sentence ending in a number, like "Ten Gemini crews flew low Earth orbit (LEO) missions during 1965 and 1966." was wrapped in such a way that the final word was at the start of a line. Yes, I know, this new syntax works best when there is no hard wrapping, so that we can use lines to determine the scope of certain kinds of "specialness", but I will be VERY HAPPY if we can make the syntax robust enough that it can still be applied to rare instances of hard-wrapped content without much going wrong.

So, lines beginning with a + (not any whitespace, but a +!) are ORDERED LIST ITEM lines. Clients who want to be fancy can add a little bit of extra internal state to their rendering code and can replace the +s with incrementing numbers. It's the client's choice whether it uses 1. 2. 3. or 1) 2) 3) or i> ii> iii> or whatever else. Very fancy clients can let the user decide.

Yes, this means content authors lose precise control over how their content is rendered (while retaining precise control over the *semantics* of their content, i.e. authors decide whether an item is ordered or not). I'm not just okay with this, I'm actively happy about it. The web paradigm where readers are subordinate to authors with regards to layout is a cause of many different kinds of grief. Good riddance to it!

> Hmm, I only saw 7 different types specified.

Okay, turns out I can't count in a hurry. :) I actually just wrote N when I first wrote that sentence, then when I ran out of time to sketch this thing out I went back, did a quick, incorrect count, and changed it. In the actual spec I'll double check, and the number will be whatever it is - I didn't mean for the list I sent out to be exhaustive, although I also think we should resist the urge to add every nice little thing we can think of. It'd be great if we kept the total number to 10 or less.

> I also recommend you specify that
> people may use greater-than-symbol quotes, which may be nested.

I have no problem with quotes, but I'm not thrilled by the nesting prospect.

> that authors MUST NOT unnecessarily hard-wrap their quotes, suggesting that
> advanced clients MAY add a visual greater-than symbol to the beginning of each
> wrapped line.
>
> Example:
>
> > hello this is wider than the viewport
>
> Displayed by advanced clients as:
>
> > hello this is
> > wider than the
> > viewport

This is exactly how I'd expect advanced clients to handle this, and I think this whole idea is implicit in the design of this syntax: the start of a line indicates what kind of line it is, and the scope of that type is precisely that line. A hard-wrapped quote with a > at the start of each line is, in this syntax, actually several distinct consecutive quotes.

> What about Gemini proxies of comment thread sites, such as hacker news (*waves
> to Michael Lazar's awesome HN Gopher proxy*). I think we should take those
> into consideration.

Hmm. I would want to think a bit before I lay down a hard statement on this because I don't want to impose too much of my own ideology on Gemini, as it's supposed to be a general-purpose tool....but I am not excited by verbatim dragging of mainstream web 2.0 cultural concepts like comment threads into Gemini. The circle of Gemini early-adopters overlaps considerably with the "Small Internet" / "Slow Internet" movement, which I guess has coloured how I think about the protocol. Thus the idea of adding something into the spec specifically to support visualising deep comment threads in the web-conventional way kind of gives me the heebie-jeebies.

Cheers,
Solderpunk
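
The "little bit of extra internal state" for ordered lists mentioned above could look something like the following Python sketch. It is only one possible reading: the function name `number_ordered_items` and the `style` parameter are invented for this example, and resetting the counter on every line that is not an ordered list item is an assumption, since the message does not say when numbering should restart.

```python
def number_ordered_items(lines, style="{n}. "):
    """Replace a leading '+' with an incrementing number chosen by the client.

    `style` is the client's (or user's) choice of label, e.g. "{n}. " or "{n}) ".
    """
    counter = 0  # the extra internal state a fancy client would keep
    rendered = []
    for line in lines:
        if line.startswith("+"):
            counter += 1
            rendered.append(style.format(n=counter) + line[1:].lstrip())
        else:
            counter = 0  # assumption: any other line ends the current list
            rendered.append(line)
    return rendered
```

Calling it with `style="{n}) "` renders the same source as 1) 2) 3) instead of 1. 2. 3., which is the reader-side control over presentation described above.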
On Sat, Jan 18, 2020 at 12:02:42AM -0500, Michael Lazar wrote:
> Python's textwrap module is fundamentally flawed for unicode and they have no
> intention of ever fixing it [0]. Once you start going down the rabbit hole of
> CJK characters, emojis, grapheme clusters, etc. it becomes exceedingly hard
> to figure out how to correctly determine the width of unicode text. You can
> get it working 99% of the time, but there's always those fringe cases that
> no one thinks about until somebody files a bug report.

...

God, I hate computers.

But, many thanks for bringing this to my attention.

> I don't know if this has any bearing on the discussion, but it's worth keeping
> in the back of your mind if you intend to make unicode a first-class citizen.

Unicode is already a first-class citizen in Gemini (text/gemini is assumed to be UTF-8 if a different encoding is not explicitly provided in the response header), and I don't think I have any interest in changing that.

As for the present discussion...well, it's obvious this problem is no less of a problem under paragraph-oriented "bidirectional" reflowing. It's not obvious to me if it's less of a problem under a Gopher-style hard-wrapping to a pre-defined maximum width model....I suppose if the width of line including CJK characters is dependent upon the combination of font and terminal being used (I don't know if it is, but it seems probable) then it's not actually possible for a CJK-using author to comply with a spec like "Hard-wrap all your content at X characters"...

Hmm...

Solderpunk
From: solderpunk <solderpunk@SDF.ORG>

> Yes, this means no nested lists. It may seem like I've gone nuts and
> suddenly happily let a whole bunch of complicated stuff into the spec,
> but I really haven't! This is still supposed to be a very simple
> syntax, which inevitably comes with limitations. There is no nested
> anything in the whole syntax.

Love the + thing. If you think of lists and sublists hierarchically rather than nested then we do have a parallel, the # headers!

Ordered list:
+ List item one
+ List item 2
++ Level 2 item 1
++ Level 2 item 2

Unordered list:
On Sat, Jan 18, 2020 at 01:10:39PM +0000, James Tomasino wrote:
> From: solderpunk <solderpunk at SDF.ORG>
>
> Love the + thing. If you think of lists and sublists hierarchically rather than nested then we do have a parallel, the # headers!
>
> ...
>
> And it maintains the whole "read the first 3 characters to determine what this is" rule. No extra tabbing or spacing to get in the way and follows similar conventions to the headings.
>

Ah, now *that*'s nice! Having a consistent system makes the whole thing easier to learn and remember. Good thinking!

Cheers,
Solderpunk
On 1/18/20 8:10 AM, James Tomasino wrote:
> From: solderpunk <solderpunk at SDF.ORG>
>
>> Yes, this means no nested lists. It may seem like I've gone nuts and
>> suddenly happily let a whole bunch of complicated stuff into the spec,
>> but I really haven't! This is still supposed to be a very simple
>> syntax, which inevitably comes with limitations. There is no nested
>> anything in the whole syntax.
>
>
> Love the + thing. If you think of lists and sublists hierarchically rather than nested then we do have a parallel, the # headers!
>
> Ordered list:
> + List item one
> + List item 2
> ++ Level 2 item 1
> ++ Level 2 item 2
>
> Unordered list:
> * List 1
> ** Deeper
> *** Maximum deepitude
>
> And it maintains the whole "read the first 3 characters to determine what this is" rule. No extra tabbing or spacing to get in the way and follows similar conventions to the headings.

Hello! I've been lurking throughout the entire long history of this discussion. ("Wanted to contribute, but other commitments...etc. etc.")

I just wanted to pipe in to share my excitement about this simple 3-char rule. I *love* the simplicity of "context-free" line-based parsing.

This reminds me of several times in which I've used an "indent level + sort order" in databases to simulate complex nested tree hierarchies to make rendering to the screen absolutely trivial. For some reason, it had never occurred to me that counting the number of stars '*' at the beginning of the line logically amounted to the same thing. This is wonderful.

I'm also strongly in favor of the Maximum Deepitude rule. <3

-ratfactor
On 1/18/20 2:10 PM, James Tomasino wrote:
> Ordered list:
> + List item one
> + List item 2
> ++ Level 2 item 1
> ++ Level 2 item 2
>
> Unordered list:
> * List 1
> ** Deeper
> *** Maximum deepitude
>
> And it maintains the whole "read the first 3 characters to determine what this is" rule. No extra tabbing or spacing to get in the way and follows similar conventions to the headings.

I really like this idea combined with the "read the first 3 characters", simple and elegant!
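
The marker-counting idea quoted above can be sketched in a few lines of Python. This is just an illustration of the "first three characters" rule as proposed in the thread: `parse_list_item` is a hypothetical helper, and treating anything beyond the third marker as part of the item text is an assumption about how "Maximum deepitude" would be enforced.

```python
def parse_list_item(line):
    """Return (kind, depth, text) for a list item line, or None otherwise.

    Depth is the number of leading '*' or '+' markers found within the
    first three characters, so it is always 1, 2 or 3.
    """
    for marker, kind in (("*", "unordered"), ("+", "ordered")):
        if line.startswith(marker):
            head = line[:3]  # only the first three characters are inspected
            depth = len(head) - len(head.lstrip(marker))
            return kind, depth, line[depth:].strip()
    return None
```

A client could then indent by `depth - 1` levels without looking at any neighbouring line, which keeps the parser free of nesting state.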
On Sat, Jan 18, 2020 at 04:21:04PM +0100, Julien Blanchard wrote:
> I really like this idea combined with the "read the first 3 characters",
> simple and elegant!

Julien, as the author of multiple Gemini clients, how do you feel about the prospect of most text/gemini content having very long lines which need to be wrapped to fit the viewport?

Cheers,
Solderpunk
On 1/18/20 4:39 PM, solderpunk wrote:
> Julien, as the author of multiple Gemini clients, how do you feel about
> the prospect of most text/gemini content having very long lines which
> need to be wrapped to fit the viewport?

Both my clients try to display content as the author wrote it and I parse it line by line so I don't see any issue with the proposal as long as I can get the viewport size in my clients. Both NCurses and GTK should provide that, I'm quite sure.

Identifying what is meant by a line by its first 3 characters is great, will make the parsing even simpler! The ``` case will be a little bit more problematic as the parsing process would need to know what was parsed a few lines before (was a "tag" opened?). Line by line parsing is preferable to allow simple implementations to me, but that wouldn't be a deal-breaker either.
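
For a plain terminal client, fetching the viewport width and wrapping each line independently might look like the Python sketch below. It is an assumption-laden illustration rather than how either of the clients mentioned works: `render_text_lines` is a made-up name, the width comes from the standard `shutil.get_terminal_size` call, only ordinary text lines are handled (no links, headings or preformatted mode), and `textwrap` collapses runs of whitespace by default.

```python
import shutil
import textwrap


def render_text_lines(lines, width=None):
    """Wrap each input line on its own; lines are never joined together."""
    if width is None:
        # Ask the terminal for its size, falling back to 80 columns.
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    rendered = []
    for line in lines:
        # textwrap.wrap returns [] for a blank line, so keep one empty line
        # to preserve blank lines in the output.
        rendered.extend(textwrap.wrap(line, width=width) or [""])
    return rendered
```

Because every source line is wrapped separately, short consecutive lines stay separate, while over-long lines are broken to fit.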
On Fri, Jan 17, 2020 at 09:16:27PM -0500, Sean Conner wrote:
> Red alert! Raise shields! Strap in! This is going to be a bumpy ride.

I'd have expected nothing less from you. :) In fact, the very fact that you felt this abrupt change of direction was worth careful nit-picking rather than dismissing it with a fatal flaw up front is a pleasant surprise!

> Not quite true, even according to this document. Leading white space in
> lists is the glaring exception here.

I'll deal with this properly in the relevant part below, but there *is* no leading whitespace in lists by definition. "Begins with" means "begins with"! :)

> > 5. HEADING LINES
> >
> > Lines beginning with one, two or three consecutive # characters are
> > HEADING LINES, corresponding to headings, subheadings or sub-subheadings
> > respectively. The text following the # characters (with leading
> > whitespace removed)
>
> The parenthetical here is ambiguous. Does it refer to this issue?
>
> #A title
> ## A title with space between the '#' and text
> ###     Even more white space
>
> or
>
> #A title
>  ##A title with leading space before the '#'
>     ###Even more white space
>
> I'm thinking the former now that I'm replying, but my code deals with both
> cases combined, so I can handle:

Definitely the former. Lines with leading spaces before the # do not satisfy the definition of a HEADING LINE, which is any line that begins with #, ## or ###.

I'm open to reconsidering the leading whitespace thing. I guess I was thinking that some people might like to line up their headings in this fashion:

#   Title
##  Section
### Sub-section

and we should think of the top-level heading as "Title" and not " Title".

> Almost---tabs (and yes, I do use tabs---I like me the tabs) are an issue
> and while I can handle them (I have code that will expand them up to 8
> spaces) not everybody has code to deal with this. So question:
>
> WHAT ABOUT TABS?
>
> They WILL show up.

In headings??? I have no idea what to suggest regarding how to handle that. Is this such a deranged edge case that we feel safe letting that behaviour be client-defined?

> > constitute the heading and should be displayed to
> > the user. Clients MAY choose to apply special styling to headings to
> > distinguish them from ordinary lines. However, the primary purpose of
> > HEADING LINES is to represent the internal structure of the document
> > in a machine-readable way. Advanced clients can use this information
> > to, e.g. display a hierarchically formatted "table of contents" for a
> > long document in a side-pane, allowing users to easily jump to
> > specific sections without excessive scrolling. Or CMS-style tools
> > automatically generating menus or Atom/RSS feeds for a directory of
> > text/gemini files can use the first heading in a file as a
> > human-friendly label for links to it.
>
> So another question. I have some headings like this:
>
> ### Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras sodales eget nisi quis condimentum. Donec ipsum arcu, fermentum eu ullamcorper sit amet, facilisis id nunc. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Nam tempus nulla ut dolor luctus malesuada. Suspendisse orci sem, semper at maximus non, pharetra et justo. Quisque lectus arcu, viverra ac convallis eu, vulputate ut enim. Nulla aliquam, lacus consequat suscipit facilisis, nisl tortor facilisis nisi, vel mattis eros arcu sed tellus. Duis quis lectus pellentesque, posuere dolor ut, sodales massa. Proin vel blandit mauris.
>
> Given a screen width of 40, which of the four below should be displayed?
>
> ### Lorem ipsum dolor sit amet, consect
>
> ### Lorem ipsum dolor sit amet, cons...
>
> ### Lorem ipsum dolor sit amet,
> ### consectetur adipiscing elit. Cras
> ### sodales eget nisi quis condimentum.
> ### Donec ipsum arcu, fermentum eu
> ### ullamcorper sit amet, facilisis id
> ### nunc. Class aptent taciti sociosqu
> ### ad litora torquent per conubia
> ### nostra, per inceptos himenaeos. Nam
> ### tempus nulla ut dolor luctus
> ### malesuada. Suspendisse orci sem,
> ### semper at maximus non, pharetra et
> ### justo. Quisque lectus arcu, viverra
> ### ac convallis eu, vulputate ut enim.
> ### Nulla aliquam, lacus consequat
> ### suscipit facilisis, nisl tortor
> ### facilisis nisi, vel mattis eros
> ### arcu sed tellus. Duis quis lectus
> ### pellentesque, posuere dolor ut,
> ### sodales massa. Proin vel blandit
> ### mauris.
>
> ### Lorem ipsum dolor sit amet,
>     consectetur adipiscing elit. Cras
>     sodales eget nisi quis condimentum.
>     Donec ipsum arcu, fermentum eu
>     ullamcorper sit amet, facilisis id
>     nunc. Class aptent taciti sociosqu
>     ad litora torquent per conubia
>     nostra, per inceptos himenaeos. Nam
>     tempus nulla ut dolor luctus
>     malesuada. Suspendisse orci sem,
>     semper at maximus non, pharetra et
>     justo. Quisque lectus arcu, viverra
>     ac convallis eu, vulputate ut enim.
>     Nulla aliquam, lacus consequat
>     suscipit facilisis, nisl tortor
>     facilisis nisi, vel mattis eros
>     arcu sed tellus. Duis quis lectus
>     pellentesque, posuere dolor ut,
>     sodales massa. Proin vel blandit
>     mauris.
>
> ### Lorem ipsum dolor sit amet,
> consectetur adipiscing elit. Cras
> sodales eget nisi quis condimentum.
> Donec ipsum arcu, fermentum eu
> ullamcorper sit amet, facilisis id
> nunc. Class aptent taciti sociosqu ad
> litora torquent per conubia nostra, per
> inceptos himenaeos. Nam tempus nulla ut
> dolor luctus malesuada. Suspendisse
> orci sem, semper at maximus non,
> pharetra et justo. Quisque lectus arcu,
> viverra ac convallis eu, vulputate ut
> enim. Nulla aliquam, lacus consequat
> suscipit facilisis, nisl tortor
> facilisis nisi, vel mattis eros arcu
> sed tellus. Duis quis lectus
> pellentesque, posuere dolor ut, sodales
> massa. Proin vel blandit mauris.
>
> Or is this way into the "you have *got* to be kidding!" territory?
I > swear, I'm not trying to take these things to the extreme ... well, okay, I > *am* trying to take these things to the extreme, but only to find out where > the borders are. You're right to do so. Your last example is what would be produced by a simple client which recognises only the compulsory line types (and hence treats heading lines as text lines). I think any of the others would be reasonable outputs for an advanced client, although as a reader I would strongly prefer one of the non-truncating options. I'm not sure it makes sense to specify the correct behaviour here, any more than it makes sense to specify what kind of bullet symbol should be used for list items. > Okay, several questions here. First off, replace the '###' with '*' in > the above example---how to properly format a list item that is ridiculously > long. Well, a simple client that doesn't recognise list items as anything special will just wrap it:
On Sat, Jan 18, 2020 at 05:41:05PM +0100, Julien Blanchard wrote:

> Both my clients try to display content as the author wrote it and I
> parse it line by line so I don't see any issue with the proposal as
> long as I can get the viewport size in my clients.
> Both NCurses and GTK should provide that I'm quite sure.

Okay, great!

> The ``` case will be a little bit more problematic as the parsing
> process would need to know what was parsed a few lines before
> (was a "tag" opened?).

By "was a tag opened?", do you just mean "am I currently inside or outside a pair of ```s?"? I have found this very easy to track: you just need a single boolean variable, initialised to False at the start of the document. Every time you see a ``` line you flip its value. When processing all other line types, the first thing you do is check whether that value's true.

Cheers,
Solderpunk
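For readers who want to see the single-boolean approach spelled out, here is a minimal Python sketch. It is not taken from any actual client; the function name `classify_lines` and the `"pre"`/`"text"` labels are hypothetical, and treating any line that merely starts with three backticks as a toggle is an assumption.

```python
def classify_lines(lines):
    """Yield (kind, text) pairs; kind is 'pre' inside a ``` pair, else 'text'.

    Hypothetical helper for illustration only.
    """
    preformatted = False  # initialised to False at the start of the document
    for line in lines:
        # Assumption: a toggle line is any line starting with three backticks.
        if line.startswith("```"):
            preformatted = not preformatted  # flip the flag on every toggle line
            continue  # the toggle line itself is not displayed
        # For every other line, the flag is the first thing checked.
        if preformatted:
            yield ("pre", line)
        else:
            yield ("text", line)
```

Because the flag is the only state carried from one line to the next, the rest of the parser can stay strictly line-based.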
On 1/18/20 6:23 PM, solderpunk wrote:

> By "was a tag opened?", do you just mean "am I currently inside or
> outside a pair of ```s?". I have found this very easy to track, you
> just need a single boolean variable, initialised to False at the start
> of the document. Every time you see a ``` line you flip its value.
> When processing all other line types, the first thing you do is check
> whether that value's true.

Yes, no doubt it's relatively easy to implement, but this could open the way to more "tags" and I just hope it won't.

I'm really glad this conversation is coming to a conclusion which so far seems to satisfy everybody and we can finally move on to other topics!
On Sat, Jan 18, 2020 at 06:39:47PM +0100, Julien Blanchard wrote: > I'm really glad this conversation is coming to a conclusion which so far > seems to satisfy > everybody and we can finally move on to other topics! Me too! Although I'm still very keen to hear from jmcbray on these recent developments. I *suspect* they'll be happy. But I really don't want to rush into this. But I am amazed at how much more tractable getting away from gopher-style hard wrapping has made this whole discussion. It has made it obvious to me that the problem all along has been with determining the scope of special features. Consider bulleted lists, with hard-wrapping. Suppose we have this content, hard wrapped to 40 characters, and our viewport is 60 characters so we want to combine lines:
On Sat, Jan 18, 2020 at 09:46:47AM -0500, Dave Gauer wrote: > > Hello! I've been lurking throughout the entire long history of this > discussion. ("Wanted to contribute, but other commitments...etc. etc.") > > I just wanted to pipe in to share my excitement about this simple 3-char > rule. I *love* the simplicity of "context-free" line-based parsing. I'm so glad to hear this from you in particular! Reading your phlog convinced me that you've spent more time thinking about markup and formatting issues for plain text than most other people talking about it combined. If you think we're on a good track here, I'm feeling a lot better. Cheers, Solderpunk
On Wed, Jan 15, 2020 at 08:30:50PM +0100, Brian Evans wrote: > Solderpunk wrote recently regarding use of color in gemini > documents. > > I would like to put in a vote to the contrary. Nearly forgot to get around to replying to this, but not quite. :) When I previously declared I wasn't interested in supporting colour, I was thinking purely about defining some kind of syntax for it, like <red>this</red> (or any similar such thing). I didn't mean to say anything about ANSI escape codes. I know we're going to have to come up with some kind of official stance on those eventually, but I've given it no serious thought yet. Cheers, Solderpunk
I think we made an oversight: syntax nested within quotes. For example:

```gemini
> > It looks like things are moving along, I wonder if we forgot anything
> > that would make us need to read more than three chars to specify the
> > line type?
>
> Here's a site to demo gemini's syntax:
> => gemini://example.com Gemini Syntax Demo
>
> In the future, we can add the following features:
> * foo
> * bar

Client:
*doesn't know to make quoted link clickable*
*doesn't know to fancy-render the quoted list*
```

---

Regardless, here are my thoughts on everything else...

I definitely love:
On 1/18/20 8:05 PM, solderpunk wrote:

> I didn't mean to say anything about ANSI escape codes. I know we're
> going to have to come up with some kind of official stance on those
> eventually, but I've given it no serious thought yet.

FWIW I intend to add some stuff with ANSI colors on my Gemini-space, I have some ideas I need to try out :) I might add two links though, one for the colored version and another for the readable one if needed.

Since Bombadillo and Castor already support ANSI colors and kompeito.media is such a great place, I'm in favor of officially supporting them or at least not banning them, leaving it up to the clients to deal with them or not.
> Since Bombadillo and Castor already support ANSI colors and > kompeito.media is such a great place, I'm in favor of officially > supporting them or at least not ban them, up to the clients to deal with > them or not. Hmmm. It does seem, though, that *allowing* ANSI colors would require non-terminal clients to strip ANSI colors, which would be a PITA, expecially considering that ANSI is a hot mess (I built an ANSI parser a while ago [1]) => https://github.com/aaronjanse/i3-tmux/tree/master/vterm [1]
I thought the use case of ``` was precisely to do nothing with the next lines (i.e. don't parse them) until the marker is met again, or am I wrong here?
I meant to say "text after the ticks, but in the same line." Such as: ```this text Code code code ```or this text
On Sat, Jan 18, 2020 at 11:20:10AM -0800, Aaron Janse wrote:

> I think we made an oversight: syntax nested within quotes.
>
> Client:
> *doesn't know to make quoted link clickable*
> *doesn't know to fancy-render the quoted list*
> ```

Hmm, I really would not have had any expectation that quoted Gemini syntax would do anything at all.

What sort of context would you expect this to occur in?

> * Preformatted text toggle lines only need to *start* with three
>   backticks. Specify the significance of text after the backticks.

I think I'm probably fine with this, does anybody have objections?

> * Specify that preformatted code blocks are intended for content such as
>   ascii art and code, meaning that it should be easily copy-pasteable
>   into a text editor without needing to undergo extra steps to revert it
>   from its displayed form to its original form

Yeah, it might be a good idea to emphasise the intent to preserve copy-and-pastability.

> * Add horizontal rule lines (three+ dashes)

I guess this is harmless. It feels a bit to me like we're adding it just because Markdown has it - unlike headings and lists and even, occasionally, quotes, I don't know that I've ever seen a horizontal rule used in Gopherspace. But I don't see a good reason to disallow it.

> * Specify that ordered lists MUST use plus sign markers

Yep.

> * Specify Tomasino's nested list system

I think I'm still onboard with this, although I'm starting to wonder about how these nested lists will look when rendered by a basic client treating them as text lines. I'm not sure it degrades to something terribly readable.

> * Explicitly specify markdown syntax that is not allowed.

It feels very strange to me for a syntax specification to explicitly list stuff from a different syntax specification which isn't allowed.
I can see it being helpful to point this stuff out in a tutorial for people learning text/gemini, but in a formal specification of a markup format, it goes without saying that anything which isn't explicitly supported is unsupported.

> but maybe we could even advise clients to shame this syntax the same
> way modern web browsers are shaming non-HTTP sites?

Wouldn't doing that (all questions about whether this is appropriate behaviour aside) require writing code to detect all the stuff that we're not supporting precisely because it's a pain to write code to reliably detect it? Seems counterproductive!

> Regardless, here
> are some things that I think we should explicitly ban in text/gemini:
> ...
> * Hard-wrapping text

I don't want to explicitly ban hard-wrapped text, I don't see the need to. I think this syntax actually degrades pretty gracefully when fed hard-wrapped text that is shorter than the viewport, and that's nice. I think the vast majority of people will end up taking the long line approach because it will support a wider range of clients (especially narrow screens) and some things will render slightly nicer. If a small percentage want to stick to the old ways for whatever reason, knowing and accepting the downsides, I see no reason not to let them.

> Are we really limited to a max depth of three? Even if we allow unlimited
> depth of headers and lists, clients would only need to read the first two
> chars of a line to determine its type (unless we add horizontal rules,
> in which case we'd need to read three characters).

Good catch. Technically speaking, once a line is detected, on the basis of the first three or fewer chars, as a header or list, it can be passed to a function that handles a header line or a list line, and that function has access to the whole line.

That said, maybe we should add a limit anyway. Otherwise clients have to write totally generalised code to handle arbitrarily many levels, which could get tricky.
> Well, worst case scenario, if someone really badly wants comment threads, > maybe they could use nested quote blocks (assuming we figure that out). Well, it seems like the > syntax generalises in exactly the same way as the heading and list syntaxes. Speaking of these...what happens when a client encounters this:
Aaron Janse writes:

> Hmmm. It does seem, though, that *allowing* ANSI colors would require
> non-terminal clients to strip ANSI colors, which would be a PITA,
> especially considering that ANSI is a hot mess (I built an ANSI parser
> a while ago [1])

Currently Bombadillo has a few different modes. The normal mode removes ANSI escape codes. As I am parsing a document, if I read an `\033` character I just toggle an escape code boolean and then consume until I read an A-Za-z character (and consume that char as well). It works very quickly and handles removing them quite well. I do the same thing for the color mode for any escape codes that do not end in `m`. That said, it may not work as well for people not parsing by writing characters into a buffer char by char.

I would also argue that it would _not_ require clients to strip them. There are a few options:

1. Decide, rightly in my opinion, that if a content creator uses escape codes they are taking the chance that the codes themselves will be displayed to the eventual viewer depending on the client.

2. Do a simple find and replace on the whole document for '\033' and replace it with "ESC". While this will still leave the codes displaying to the viewer they will not actually render, thus you do not need to worry about line movement, screen clears, etc.
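The two strategies described in this message can be sketched in a few lines of Python. This is an illustration of the idea only, not Bombadillo's actual code; the function names `strip_escape_codes` and `defang_escape_codes` are hypothetical.

```python
ESC = "\033"


def strip_escape_codes(text):
    """Drop everything from an ESC up to and including the next ASCII letter."""
    out = []
    in_escape = False  # the "escape code boolean"
    for ch in text:
        if in_escape:
            # Keep consuming until a letter A-Za-z, and consume that letter too.
            if ch.isascii() and ch.isalpha():
                in_escape = False
            continue
        if ch == ESC:
            in_escape = True
            continue
        out.append(ch)
    return "".join(out)


def defang_escape_codes(text):
    """Option 2: leave the codes visible, but stop the terminal acting on them."""
    return text.replace(ESC, "ESC")
```

The first function is deliberately approximate: it assumes every escape sequence ends at its first letter, which covers common colour codes but not every control sequence a terminal understands.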
On Sat, Jan 18, 2020, at 1:33 PM, solderpunk wrote:

> Hmm, I really would not have had any expectation that quoted Gemini
> syntax would do anything at all.
>
> What sort of context would you expect this to occur in?

I often see quoted syntax (e.g. links, lists) on StackOverflow. However, I haven't seen quoted syntax in phlogs. Plus, supporting mixed, nested syntax could be a PITA to implement.

Also seen in the wild:
It was thus said that the Great Julien Blanchard once stated:
> On 1/18/20 4:39 PM, solderpunk wrote:
>
> >Julien, as the author of multiple Gemini clients, how do you feel about
> >the prospect of most text/gemini content having very long lines which
> >need to be wrapped to fit the viewport?
>
> The ``` case will be a little bit more problematic as the parsing
> process would need to know what was parsed a few lines before
> (was a "tag" opened?).

I found that to be easy to support. The code looks like:

    local literal = false

    for line in file:lines() do
      if line:match "^#" then
        ...
      elseif line:match "^```$" then
        literal = not literal -- Yup, that's all there was to it.
      elseif line:match "^=>" then
        ...
      else
        if literal then
          display_text_to_width(line)
        else
          wrap_text_to_width(line)
        end
      end
    end

-spc
It was thus said that the Great solderpunk once stated:
> On Fri, Jan 17, 2020 at 09:16:27PM -0500, Sean Conner wrote:
>
> > Red alert! Raise shields! Strap in! This is going to be a bumpy ride.
>
> I'd have expected nothing less from you. :) In fact, the very fact that
> you felt this abrupt change of direction was worth careful nit-picking
> rather than dismissing it with a fatal flaw up front is a pleasant
> surprise!

Well, I don't have very strong feelings about text formatting for Gemini. Also, anything that's been cleared up in subsequent emails will not be addressed here as there's no need to clutter things up.

> > Almost---tabs (and yes, I do use tabs---I like me the tabs) are an issue
> > and while I can handle them (I have code that will expand them up to 8
> > spaces) not everybody has code to deal with this. So question:
> >
> > WHAT ABOUT TABS?
> >
> > They WILL show up.
>
> In headings??? I have no idea what to suggest regarding how to handle
> that. Is this such a deranged edge case that we feel safe letting that
> behaviour be client-defined?

Not necessarily in headers, but they will appear in Gemini files (I'm thinking most likely in pre-formatted blocks). The thing the clients have to be aware of is that a tab takes up 1 character of storage, but can stand for up to N spaces on screen (where N is typically 8, but can be an arbitrary value per tab).

> Well, I guess a lot of your questions here are moot in light of my
> earlier post about the + syntax.

They are, and I shall update my code accordingly.

> > -spc (Will torture specs for food ...)
>
> You'll never go hungry!

Ha!

-spc
On 1/18/20 10:18 PM, Aaron Janse wrote:
> Speaking of these...what happens when a client encounters this:
>
> ** Foo
> ** Bar
> ** Baz
>
> i.e. a bunch of allegedly nested list items which are not embedded in a
> higher-level list?

Line 1: A list has started at depth 2. Display a depth 2 list item.
Line 2: A list has continued at depth 2. Display another depth 2 list item.
Line 3: A list has continued at depth 2. Display another depth 2 list item.

Pretty straightforward if we're just processing line by line. It's ever-so-slightly trickier if we're working in ordered lists. You'll want to keep an ordinal stack.

+ item 1
++ sub item 1
+++ sub-sub item 1
+++ sub-sub item 2
++ sub item 2
+++ a new sub-sub item 1
++ sub item 3
+ item 2

As you push deeper you'll want to keep references to those items at a higher level so you can continue to number when you pop back out. Upon popping, though, you can reset the deeper list ordinal. At least, that's how I'd handle it.

What constitutes the end of an ordered list? Unordered lists are "dumb" in that they just display their appropriate depth no matter what, but the ordered lists need to keep track of that state. Does a non-list line break the list? What about empty lines?

+ is this list

+ still counting

+ up to 3?
On Sat, Jan 18, 2020 at 11:13:50PM +0000, James Tomasino wrote: > What constitutes the end of an ordered list? Unordered lists are "dumb" > in that they just display their appropriate depth no matter what, but > the ordered lists need to keep track of that state. Does a non-list line > break the list? What about empty lines? > > + is this list > > + still counting > > + up to 3? Argh, excellent question. It seems like something vaguely like the notion of a block has crept in! This is a timely question, too. I'm goofing around trying to implement some of these new ideas in AV-98 (and so far I'm very pleased with the results - # headers are made bold with ANSI escape codes, and unordered lists get nice bullets and nice spacing, looks great!) and had to deal with exactly this. For testing I just decided that anything other than an ordered list line breaks a list and resets the counter. I'm open to other ideas but I worry that anything other than this is liable to be too complicated. Cheers, Solderpunk
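The ordinal stack from the quoted message, combined with the rule that anything other than an ordered list line breaks the list and resets the counter, might look roughly like the Python sketch below. It is not AV-98's code. The function name `number_ordered_lists`, the two-space indent per level, the "N." label style, and the requirement of a space after the plus signs are all assumptions made for illustration.

```python
def number_ordered_lists(lines):
    """Render '+', '++', '+++' lines as numbered items; pass other lines through."""
    counters = []  # counters[d - 1] holds the current ordinal at depth d
    out = []
    for line in lines:
        rest = line.lstrip("+")
        depth = len(line) - len(rest)  # number of leading '+' characters
        # Assumption: an ordered list line is one or more '+' followed by a space.
        if depth == 0 or not rest.startswith(" "):
            counters.clear()  # any other line (even a blank one) ends the list
            out.append(line)
            continue
        del counters[depth:]          # popping back out resets deeper ordinals
        while len(counters) < depth:  # pushing deeper starts fresh counters
            counters.append(0)
        counters[depth - 1] += 1
        out.append("  " * (depth - 1) + f"{counters[depth - 1]}. " + rest.strip())
    return out
```

Under this rule, the three-item list separated by blank lines comes out as three separate lists, each numbered "1.". Treating blank lines as transparent would need one extra condition, which is exactly the kind of block-like state the thread is wary of.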
> Le 19 janv. 2020 ? 00:22, solderpunk <solderpunk at sdf.org> a ?crit : > > ?Argh, excellent question. It seems like something vaguely like the > notion of a block has crept in! > > This is a timely question, too. I'm goofing around trying to implement > some of these new ideas in AV-98 (and so far I'm very pleased with the > results - # headers are made bold with ANSI escape codes, and unordered > lists get nice bullets and nice spacing, looks great!) and had to deal > with exactly this. For testing I just decided that anything other than > an ordered list line breaks a list and resets the counter. I'm open to > other ideas but I worry that anything other than this is liable to be > too complicated. > Do we really need ordered lists? I?m not sure the what the use case is. Couldn?t they be replaced by: ### 1. Foo ### 2. Bar Or something like that if you really wanted numbers?
It was thus said that the Great Brian Evans once stated:
> Aaron Janse writes:
> > Hmmm. It does seem, though, that *allowing* ANSI colors would require
> > non-terminal clients to strip ANSI colors, which would be a PITA,
> > especially considering that ANSI is a hot mess (I built an ANSI parser
> > a while ago [1])
>
> Currently Bombadillo has a few different modes. The normal mode removes
> ansi escape codes. As I am parsing a document if I read an `\033` character I
> just toggle an escape code boolean and then consume until I read a A-Za-z
> character (and consume that char as well). It works very quickly and handles
> removing them quite well. I do the same thing for the color mode for any
> escape codes that do not end in `m`. That said, it may not work as well for
> people not parsing by writing characters into a buffer char by char.

Having written an ECMA-48 parser (ECMA-48 being the terminal control codes everybody calls ANSI escape codes when they aren't defined by ANSI), I'd say you'll probably catch 99% of the control codes used. But the actual definition is (RFC-5234 BNF):

    CSI      = %d27 '['
             / %d155        ; ISO-8859-1 or similar
             / %d194 %d155  ; UTF-8 encoding
    param    = %d48-63      ; chars '0' through '?'
    meta     = %d32-47      ; chars ' ' through '/'
    cmd      = %d64-126     ; chars '@' through '~'
    sequence = CSI *param *meta cmd

There are other ECMA-48 sequences that could prove dangerous if not filtered for. I do have Lua code to parse these [1][2] and use them in my current gopher client to filter them out (and yes, I have come across sites that embed ECMA-48 control codes).

> 2. Do a simple find and replace on the whole document for '\033' and replace
> it with "ESC". While this will still leave the codes displaying to the viewer
> they will not actually render, thus you do not need to worry about line
> movement, screen clears, etc.
You might want to replace the following codepoints to render control codes harmless:

    0 - 31   ; C0 set, except interpret the range from 7-13 inclusive
    127      ; DEL
    128-159  ; C1 set

I say codepoints because in UTF-8, the C1 set is represented by the sequences 194 128 through 194 159.

-spc

[1] https://github.com/spc476/LPeg-Parsers/blob/master/iso/control.lua
    This handles encodings in ISO-8859-1 and similar. I have a UTF-8 one that is separate. This one just returns the escape sequence as a unit with no further parsing of the actual sequence.

[2] https://github.com/spc476/LPeg-Parsers/blob/master/iso/ctrl.lua
    This does a more complete parse of the escape sequence, to include its name (if any). Again, this is for ISO-8859-1 and similar encodings. I have another version for UTF-8.
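As a rough Python counterpart to the grammar and codepoint ranges given in this message (this is not the Lua/LPeg code linked above), the sketch below works on already-decoded text, so the single-character CSI is U+009B rather than a byte pair. The names `CSI_SEQUENCE`, `strip_csi` and `defang_controls`, and the choice of U+FFFD as the replacement character, are assumptions.

```python
import re

# sequence = CSI *param *meta cmd, with CSI being ESC '[' or the C1 character U+009B
CSI_SEQUENCE = re.compile(r"(?:\x1b\[|\x9b)[\x30-\x3f]*[\x20-\x2f]*[\x40-\x7e]")


def strip_csi(text):
    """Remove complete CSI control sequences from decoded text."""
    return CSI_SEQUENCE.sub("", text)


def defang_controls(text, replacement="\ufffd"):
    """Replace C0 (except 7-13), DEL and C1 codepoints with a harmless character."""
    def harmless(ch):
        cp = ord(ch)
        if (cp < 32 and not 7 <= cp <= 13) or cp == 127 or 128 <= cp <= 159:
            return replacement
        return ch
    return "".join(harmless(ch) for ch in text)
```

Running `strip_csi` first and `defang_controls` second removes well-formed sequences entirely and neutralises whatever control characters are left over, including escape sequences that are not CSI-based.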
On Sat, Jan 18, 2020, at 3:36 PM, Julien Blanchard wrote:
>
> > On 19 Jan 2020, at 00:22, solderpunk <solderpunk at sdf.org> wrote:
> >
> > Argh, excellent question. It seems like something vaguely like the
> > notion of a block has crept in!
> >
> > This is a timely question, too. I'm goofing around trying to implement
> > some of these new ideas in AV-98 (and so far I'm very pleased with the
> > results - # headers are made bold with ANSI escape codes, and unordered
> > lists get nice bullets and nice spacing, looks great!) and had to deal
> > with exactly this. For testing I just decided that anything other than
> > an ordered list line breaks a list and resets the counter. I'm open to
> > other ideas but I worry that anything other than this is liable to be
> > too complicated.
> >
>
> Do we really need ordered lists? I'm not sure what the use case is.

Some use cases:

1. Instructions with step numbers. Yes, references to steps may need to be changed if the items change, but that's less work than replacing both references and the bullet markers themselves.
2. Listing things (e.g. problems), then referring to them by number ("but this once again causes problem #1")
3. "There are X ways to do this... <ordered list>"

I don't know what to do about spaces between ordered lists. If there's text between items, I think manually-numbered sections should be used instead. But if it's only newlines between items, I don't see why the count should be restarted.

Cheers!
Sorry I wasn't clear, I meant what would be the use case of showing them in a special way compared to headers with a number or plain text?
-------- Original Message --------
From: Julien Blanchard <julien@typed-hole.org>

Sorry I wasn't clear, I meant what would be the use case of showing them in a special way compared to headers with a number or plain text?

----

1. Listing things
2. Quick instructions where headers are overkill
3. Track listings
4. Top 10 lists
5. The same reasons we want bullets that reflow
6. Recipes!

And so on.
solderpunk <solderpunk at SDF.ORG> writes:

> Hmm. Neither do I and, in fact, well...I just do it manually. Which I
> feel very sheepish admitting because that's kind of ridiculous. But
> everybody writing for Gopherspace (which is many people here) must
> face precisely this problem, because hard-wrapping is basically
> compulsory there. What are other people doing, writing in "long line"
> form and then feeding the result to `fmt` or `par` before uploading?

I use Emacs. I set fill-column to 72 characters, and turn on auto-fill-mode. This means that things get hard wrapped while I am writing. You can reflow a paragraph with one key: M-q, mapped to fill-paragraph or unfill-toggle.

I'm normally writing Markdown, which will get converted to HTML for my static blog, or served raw on my gopherhole.

--
Jason McBrayer         | "Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                       | but stranger still is lost Carcosa."
                       | -- Robert W. Chambers, The King in Yellow
solderpunk <solderpunk at SDF.ORG> writes:

> There is another option that I hadn't thought about until now, which
> is to do only the first half of 2. above. That is, lines longer than
> the viewport get broken up nicely at word boundaries into lines of
> length equal to or less than the viewport width - but that's it.
> Consecutive shorter lines are *not* joined together. Blank lines in
> the "source" are rendered, one by one, into empty vertical space.
> The renderer has no explicit concept of a paragraph.

If 'paragraphs' are always written as continuous long lines, this works. If they get written as hard-wrapped 80-column lines, then you get the existing issue on narrow displays. So specifying this implementation of wrapping is also a recommendation to authors to write paragraphs as continuous long lines, I guess.

It's not bad. It preserves quality 1 (ease of implementation) and 2 (some richness allowed by literal text formatting), and *may* get you 3 most of the time, as long as authors comply with the recommendation.

I'd recommend, if you go with this, to not also include the ``` literal formatting. There's going to be a breakpoint of complexity somewhere where text/gemini will not do the job you want it to, and you should be serving text/markdown or even text/html.

--
Jason McBrayer
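The "wrap long lines but don't join short lines" rule quoted in this message is small enough to sketch with Python's standard `textwrap` module. The function name `render` is hypothetical, and special handling of link lines or ``` blocks is left out to keep the example short.

```python
import textwrap


def render(lines, width):
    """Wrap each source line independently; never join consecutive lines."""
    out = []
    for line in lines:
        if len(line) <= width:
            # Short lines, including blank ones, are passed through untouched.
            # (textwrap.wrap("") returns [], which would silently drop blank
            # lines, so they must not be sent through the wrapper.)
            out.append(line)
        else:
            # Long lines are broken at whitespace (or hyphens) to fit the viewport.
            out.extend(textwrap.wrap(line, width=width))
    return out
```

Since each line is handled on its own, a document hard-wrapped at 80 columns renders unchanged on an 80-column display and gets ragged on a narrower one, which is the trade-off discussed in the reply.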
"I use Emacs. I set fill-column to 72 characters, and turn on auto-fill-mode" I author gopher content in vim with a text width of 67 using vim-pencil in hard-wrap mode. If the proposed changes get implemented then in Gemini I'll tell pencil to go into soft mode instead. Bam! Done. Recommending against hard wrapping outside of ``` fences is nice and easy.
Okay, it looks like we are not as close to a consensus as I had hoped or imagined. That's fine. I don't want to rush this process, as much as I'm looking forward to it being over. I wonder if we can make a simple incremental improvement to the spec-spec now, though, using some of the ideas that have come out of this latest round of discussion. As a reminder, the current spec-spec, version 0.9.2, basically defines text/gemini thusly:
solderpunk <solderpunk at SDF.ORG> writes:

> Would anybody *prefer* that we spec hard-wrapping to some specified
> length (80, 40, whatever) over speccing the above "long line" solution?
> Please speak up if so!

No; though I think the long-line solution is imperfect, it is strictly better than specifying hard-wrapping to a specified length.

--
Jason McBrayer
On Wed, Jan 22, 2020 at 09:31:15AM -0500, Jason McBrayer wrote:
> solderpunk <solderpunk at SDF.ORG> writes:
>
> > Would anybody *prefer* that we spec hard-wrapping to some specified
> > length (80, 40, whatever) over speccing the above "long line" solution?
> > Please speak up if so!
>
> No; though I think the long-line solution is imperfect, it is strictly
> better than specifying hard-wrapping to a specified length.

Okay, thanks!

I realised I asked "Would anybody *prefer* hard-wrapping?", but please do respond with "no" if you wouldn't, so I can tell the difference between nobody preferring it and nobody having had time to respond yet. :)

Cheers,
Solderpunk
On Wed, Jan 22, 2020, at 7:30 AM, solderpunk wrote:
> I realised I asked "Would anybody *prefer* hard-wrapping?", but please
> do respond with "no" if you wouldn't, so I can tell the difference
> between nobody preferring it and nobody having had time to respond yet.
> :)

I would prefer the spec explicitly discouraging hard-wrapping :)
On Wed, Jan 22, 2020 at 08:25:45AM -0800, Aaron Janse wrote: > I would prefer the spec explicitly discouraging hard-wrapping :) Ditto!
Solderpunk writes:

> I think this is, in fact, the smallest possible change to the current
> spec-spec which solves my original complaint without sacrificing support
> for arbitrary screen width.

I am FOR eliminating text reflow (a la RFC 1896) and AGAINST a specific hard wrap number (other than the viewport itself). Which is to say: I think your proposal is a good compromise that _does_ improve the spec and makes things more clear.

The issue of lists and such is still open, but I have come around on that slightly. Solderpunk, via e-mail, pointed out that lists and headers can provide more varied and flexible display in graphical clients... and I had not thought of things in terms of graphical clients. That makes sense to me. So long as it is not a mandate, but an optional part of the spec, I withdraw my objection.
On Wed, Jan 22, 2020 at 11:51:37PM +0100, Brian Evans wrote:

> So long as it is not a mandate, but an optional part of the spec, I
> withdraw my objection.

I don't intend anything to be mandatory except link lines, ordinary text lines and possibly ``` raw/verbatim/preformatted handling (I'm not 100% sure on that last one, I think perhaps switching to wrapped long lines as the official recommendation removes many of the justifications we had for first proposing it - though certainly not all of them).

Anything to do with headings, lists, etc. will be strictly optional, and a major factor in whether I decide to adopt any of those things will be how well they degrade when viewed on a simple terminal-based client which completely ignores all optional components.

All I've ever wanted is to permit improvements to readability or navigation in advanced (possibly, but not necessarily, graphical) clients as much as possible without interfering in any non-trivial way with the usability of incredibly simple clients.

Cheers,
Solderpunk
Okay, I'm going to update the spec-spec this weekend to replace the current RFC-1896 text wrapping with the new "wrap long lines but don't join short lines" approach, which everybody seems either to agree is an improvement or to feel indifferent about, and which nobody objected to for the past week or so either on this list or to me directly.

My plan is to replace the entirety of section 1.3.5.3 with the below. Does anybody want to suggest any minor changes to this text to remove ambiguity or anything like that?

Cheers,
Solderpunk

```
1.3.5.3 Text display

Textual content for Gopher is typically "hard-wrapped", i.e. composed of lines no longer than (typically) 80 characters. Each line of text is printed to the screen as-is. In contrast, in HTML content on the web, browsers ignore the length of lines of text and instead "reflow" text to a width appropriate for the display device - lines of text in a HTML file which are "too long" get split up, while consecutive lines which are "too short" get joined together.

Gemini adopts a strategy between these two approaches, designed to strike a balance between implementation complexity, flexibility of display width, and support for common text formatting patterns.

Lines of text in a text/gemini document which are not link lines (i.e. do not begin with "=>") which are longer than can fit on a client's display device SHOULD be "wrapped" to fit, i.e. long lines should be split (ideally at whitespace or at hyphens) into multiple consecutive lines of a device-appropriate width. Recall that text/gemini processing is strictly line-based: the above wrapping is applied to each line of text independently. Multiple consecutive lines which are shorter than the client's display device MUST NOT be combined.

Blank lines receive no special treatment: they are ordinary text lines, with a display length of zero. Thus, they fit on any client's display device and never need to be wrapped. Each individual blank line in a text/gemini document MUST be rendered by the client as an individual blank line.

In order to take full advantage of this method of text formatting, authors of text/gemini content SHOULD avoid hard-wrapping to a specific fixed width. Most text editors can be configured to "soft-wrap", i.e. to write this kind of file while displaying the long lines wrapped to fit the author's display device. Authors who insist on hard-wrapping their content MUST be aware that the content will display neatly on clients whose display device is as wide as the hard-wrapped length or wider, but will appear with irregular line widths on narrower clients.
```

On Thu, Jan 23, 2020 at 11:13:51AM +0000, solderpunk wrote:

> On Wed, Jan 22, 2020 at 11:51:37PM +0100, Brian Evans wrote:
>
> > So long as it is not a mandate, but an optional part of the spec, I
> > withdraw my objection.
>
> I don't intend anything to be mandatory except link lines, ordinary text
> lines and possibly ``` raw/verbatim/preformatted handling (I'm not 100%
> sure on that last one, I think perhaps switching to wrapped long lines
> as the official recommendation removes many of the justifications we
> had for first proposing it - though certainly not all of them).
>
> Anything to do with headings, lists, etc. will be strictly optional, and
> a major factor in whether I decide to adopt any of those things will be
> how well they degrade when viewed on a simple terminal-based client
> which completely ignores all optional components.
>
> All I've ever wanted is to permit improvements to readability or
> navigation in advanced (possibly, but not necessarily, graphical)
> clients as much as possible without interfering in any non-trivial way
> with the usability of incredibly simple clients.
>
> Cheers,
> Solderpunk
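
For anyone who wants to see how little a client needs to do under the proposed 1.3.5.3 text, here is a rough sketch of the "wrap long lines but don't join short lines" rule. It is an illustration only, not anything from the spec: the function name render_gemini and the width parameter are hypothetical, it leans on Python's standard textwrap module, and it passes link lines through untouched since the proposed text only covers non-link lines.

```python
import textwrap
from typing import List


def render_gemini(text: str, width: int) -> List[str]:
    """Return display lines for `text` on a device `width` columns wide.

    Hypothetical helper: each source line is handled independently,
    so short consecutive lines are never joined together.
    """
    display: List[str] = []
    for line in text.split("\n"):
        if line.startswith("=>"):
            # Link lines are outside the wrapping rule; pass through as-is
            # (a real client would render them however it renders links).
            display.append(line)
            continue
        # textwrap.wrap splits at whitespace and after hyphens by default.
        wrapped = textwrap.wrap(line, width=width)
        # A blank line wraps to an empty list; it must still be rendered
        # as exactly one blank line.
        display.extend(wrapped or [""])
    return display


if __name__ == "__main__":
    sample = (
        "* first bullet\n"
        "* second bullet\n"
        "\n"
        "A long paragraph written as one line, which a narrow client "
        "splits into several display lines of a device-appropriate width."
    )
    for shown in render_gemini(sample, 40):
        print(shown)
```

Because every source line is processed on its own, the bulleted list that started this discussion survives intact, while the long paragraph adapts to whatever width the client has.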
---