-=-=-=-=-=-=-
I have changed the subject, but this is a continuation of the text reflow convo. The conversation has drifted, and I felt a new subject was in order.

Sean writes:

> The issue I have with Markdown is that there is no one standard for it. John Gruber created it in 2004 as a way for *him* to create HTML documents without having to write HTML (or use a clumsy HTML editor) and he had no desire to add to it (because it works for him). Since then, multiple versions have been created to address shortcomings people came across as they tried using Markdown for their own use, and as of right now, defined in RFC-7763 and RFC-7764, are the various flavors of Markdown[...]

I agree with Sean that the lack of a standard is an issue. Additionally, most versions of markdown allow for embedded html as well as inline images. I believe the majority of us were against these things. In addition to philosophical disagreements there are issues, as Sean also writes, with the mandate that a simple client should be buildable as a weekend project.

Having said the things above, I like markdown. I think it is a good fit for what we have been floating around with this project on the whole. I think it has a lot of strengths. Given the issues above (coding-effort-wise and philosophy-wise) it feels like the only way to make things work is to define a dialect of markdown as a standard that is a part of the gemini spec (though it could have its own spec as well and just reference that in the gemini spec).

If that route were chosen, I think we would need to create parsers that return an AST in a variety of languages and make the libraries available to developers to use in their gemini projects. This would be a pretty big undertaking and I do not know that it is exactly in scope. But we do seem to keep coming back to markdown as a good option that pretty much everyone likes at least some elements of.

I'm very open to other suggestions and look forward to hearing other paths forward.
-- Sent with https://mailfence.com Secure and private email
Brian Evans writes:

> I agree with Sean that the lack of a standard is an issue. Additionally, most versions of markdown allow for embedded html as well as inline images. I believe the majority of us were against these things. In addition to philosophical disagreements there are issues, as Sean also writes, with the mandate that a simple client should be buildable as a weekend project.

I have, from the beginning, thought that Markdown was the ideal document type to serve over Gemini; it meets with my intuitions about "the Lynx web" being about the right level of presentation complexity. I have been supporting the use of Text Junior only because it is easier to implement without library support than Markdown.

Can we talk about what we would want to restrict, were we to implement Markdown in Gemini clients, and why?

1. No embedded HTML. This should be pretty obvious; we are not generating HTML, and we do not want to be supporting arbitrary HTML in the client, so there is no use-case for including embedded HTML. We should probably specify a behavior for embedded HTML that is included by mistake. Options are to strip it entirely, to extract text from it and put the text in a paragraph, or to display it raw in a ```code``` block.

2. No inline images. There are basically two reasons. The less compelling reason to me is aesthetic: we want documents to be text, and inline images in NCSA Mosaic were the start of the slippery slope that led to the current web being mostly Not Text. I consider this less compelling, because there is a legitimate use for inline images as figures in academic papers or lead images in news stories, for example. The more compelling case, to me, is request predictability and tracking avoidance. I want following a link in a Gemini client to make one (1) Gemini request and receive one (1) Gemini response. Inline images mean that a client may make a cascade of additional requests for resources that is not predictable by following the original link. Some of these requests may be to third-party servers, enabling tracking. This is something we strongly want to avoid.

The remaining reason to not use Markdown is client complexity. A simple text-mode client can basically ignore *most* of Markdown's formatting, and just display it literally. This adds no client complexity. The one thing that can't be ignored is links, and Markdown links are somewhat complex. They can appear anywhere in the text, and they can be in either [immediate](gemini://domain/path) or [reference][1] format. This is an impediment to easy implementation and I think is the biggest block to us adopting Markdown for Gemini.

[1]: gemini://domain/path

> If that route were chosen, I think we would need to create parsers that return an AST in a variety of languages and make the libraries available to developers to use in their gemini projects. This would be a pretty big undertaking and I do not know that it is exactly in scope.

I am willing to write such libraries for Python and Common Lisp. The existing libraries for both languages can be extended to provide alternative outputs. The Python library uses ElementTree internally, and a method could be added that uses the rest of the markdown library but returns an ElementTree rather than a string. The Common Lisp library can probably only produce a string (I will have to look at it further), but that string could easily be an s-expression representing a parse tree, so that's not hard to bring back into memory. There is already such a library for C (discount)[2], though I don't know what the in-memory representation is. I am willing to wrap the C library for Vala, probably.
[2]: http://www.pell.portland.or.us/~orc/Code/discount/

--
Jason McBrayer         | “Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                       | but stranger still is lost Carcosa.”
                       | -- Robert W. Chambers, The King in Yellow
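To make the link problem described above concrete, here is a minimal Python sketch of what a client would need in order to pull both link forms out of a Markdown document. It is an illustration only, not any library's API: the function name `extract_links` and the regular expressions are assumptions, and they deliberately ignore many corners of real Markdown (nested brackets, titles in links, links inside code spans).

```python
import re

# Immediate form: [label](url)
INLINE_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")
# Reference use: [label][id]
REF_LINK = re.compile(r"\[([^\]]+)\]\[([^\]]+)\]")
# Reference definition on a line of its own: [id]: url
REF_DEF = re.compile(r"^\s*\[([^\]]+)\]:\s*(\S+)\s*$", re.MULTILINE)


def extract_links(text):
    """Return (label, url) pairs for both link forms, in document order.

    Hypothetical helper for illustration; real Markdown has more cases.
    """
    # Reference ids are matched case-insensitively here (an assumption).
    definitions = {ref.lower(): url for ref, url in REF_DEF.findall(text)}
    found = []
    for match in INLINE_LINK.finditer(text):
        found.append((match.start(), match.group(1), match.group(2)))
    for match in REF_LINK.finditer(text):
        url = definitions.get(match.group(2).lower())
        if url is not None:  # skip references with no definition
            found.append((match.start(), match.group(1), url))
    found.sort()
    return [(label, url) for _, label, url in found]
```

Even this reduced version needs two passes over the document, because a reference link cannot be resolved until its definition has been seen, which may be anywhere in the text.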
On 8/18/19 5:10 PM, solderpunk wrote:

> Given that, thanks to the inclusion of MIME types in the response header, Gemini is already perfectly capable of serving Markdown, and given that Markdown is powerful enough to completely replicate all of the semantics currently in the text/gemini spec (i.e. it can link to other places via URL with a user-friendly label attached), what do we actually stand to gain by speccing text/gemini up as something which is, roughly, just Markdown with perhaps a few features removed and its native linking syntax replaced by our own line-based => alternative?

I am not much interested in forcing a markup format into Gemini at all. You have established a line-based hyperlinking that fixes some issues we faced in gopher. We have security in the protocol itself. As we start talking about the rendering of text we're getting away from the heart described in the FAQ.

I've been a voice behind the discussion of whether to reflow or not, but even that I'll admit goes a bit further than necessary. You're right about the MIME in the response header. Let Gemini stay simple. End the spec there and let clients decide what to do with the markup, if it's used at all. In another year the markup of the masses may change. We don't need to bundle that into Gemini from the beginning.
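For comparison with Markdown's in-text links, the line-based "=>" link mentioned above can be recognised one line at a time. The Python sketch below assumes a link line is "=>", optional whitespace, a URL, and then an optional label separated by whitespace; that layout and the name `parse_link_line` are assumptions made for illustration, not a quotation of the spec.

```python
def parse_link_line(line):
    """Return (url, label) if `line` is a "=>" link line, else None.

    Assumed layout: "=>" [whitespace] URL [whitespace LABEL].
    """
    if not line.startswith("=>"):
        return None
    parts = line[2:].strip().split(maxsplit=1)
    if not parts:
        return None  # a bare "=>" carries no URL, so treat it as plain text
    url = parts[0]
    # Fall back to showing the URL itself when no label is given.
    label = parts[1] if len(parts) > 1 else url
    return url, label
```

Because the decision depends only on the first two characters of a line, a client needs no lookahead and no document-wide state, which is what keeps the weekend-project client feasible.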
Tomasino writes:

> I am not much interested in forcing a markup format into Gemini at all. You have established a line-based hyperlinking that fixes some issues we faced in gopher. We have security in the protocol itself. As we start talking about the rendering of text we're getting away from the heart described in the FAQ.
>
> I've been a voice behind the discussion of whether to reflow or not, but even that I'll admit goes a bit further than necessary. You're right about the MIME in the response header. Let Gemini stay simple. End the spec there and let clients decide what to do with the markup, if it's used at all. In another year the markup of the masses may change. We don't need to bundle that into Gemini from the beginning.

I both agree and disagree with the above sentiment. I agree the basic gemini file structure that has already been preliminarily approved does solve many of the issues people find with gopher (at least when combined with the response header/mime). I like this fine and am still happy to move forward with it (though some minor markup would be neat).

With regard to markdown files specifically, I do agree that it is at the very outer edges of relevance to gemini as a spec. However, I do think that if something like this is going to have an official integration into gemini, it is best to do it with intent and figure it out early.

I had written a lot more here in support of markdown... but the more I thought about it the more it felt like a separate concern. It might be a good idea for a few developers who are planning on making feature-rich clients to get together and try to standardise on some form of markdown support, so that a community standard can arise without the need for it to be included in the gemini spec itself. If things went that way I would advocate for keeping the gemini file format (with the line based links) so that simple clients can still be built.
Jason writes:

> I am willing to write such libraries for Python and Common Lisp. The existing libraries for both languages can be extended to provide alternative outputs.

If we go the route of markdown in either an official capacity or as a community supported standard I would be happy to work on getting a go module customized for this purpose... however, I just looked and it seems that this would already do the job:

https://godoc.org/github.com/gomarkdown/markdown/ast

-- Sent with https://mailfence.com Secure and private email
> Having said the things above, I like markdown. I think it is a good fit for what we have been floating around with this project on the whole. I think it has a lot of strengths. Given the issues above (coding-effort-wise and philosophy-wise) it feels like the only way to make things work is to define a dialect of markdown as a standard that is a part of the gemini spec (though it could have its own spec as well and just reference that in the gemini spec).

I am very opposed to including a long, complicated, detailed specification of any markup language in the Gemini spec. That won't happen. Referencing another spec would be fine, in principle.

But here's a question: what would such a spec look like? It's occurred to me that I don't think Markdown or any similar language has ever been specced in any way other than specifying how to translate it to HTML. Which, as JFM has mentioned, is a whole tech stack we don't want or need.

How do you write a spec explaining how to render itemised lists in Markdown into plain text suitable for printing in a terminal? That's way fiddlier than speccing how to translate it into HTML and let the browser worry about wrapping and indenting. It's not impossible, e.g. lynx must have some system for doing this. But it doesn't sound fun to code, and it sounds even less fun to explain *how* to code. Nobody other than us is going to write code implementing it, meaning there won't be off-the-shelf libraries in any given language to do it.

> If that route were chosen, I think we would need to create parsers that return an AST in a variety of languages and make the libraries available to developers to use in their gemini projects. This would be a pretty big undertaking and I do not know that it is exactly in scope. But we do seem to keep coming back to markdown as a good option that pretty much everyone likes at least some elements of.

I like Markdown as much as anybody but this feels massively out of scope to me. We have already defined a way for Gemini to serve Markdown (just by specifying text/markdown). It's not our fault if nobody knows what that means.

-Solderpunk
> As we start talking about the rendering of text we're getting away from heart described in the FAQ.

I'm starting to feel this way, too. But let's see how I feel tomorrow. :)

> I've been a voice behind the discussion of whether to reflow or not, but even that I'll admit goes a bit further than necessary.

The reflowing question I guess has been the start of all this. It's something that many, many people have requested, unsurprisingly in this smartphone-obsessed world. And since it seemed harmless I said, "sure, let's reflow". But it's not harmless: it breaks lots of functional styling we can do with just plain text, and it seems hard to undo that harm without cranking the complexity level up considerably by defining different kinds of entity which should/should not be reflowed.

If there's no reflow, then Markdown "just works" and doesn't even need to be specced.

I have to admit, I'm now starting to wonder how essential reflow is. I have a very old phone, whose screen is very small by modern standards. You definitely couldn't buy a new smartphone today with a screen as small as mine. But if I turn it horizontally and let the screen flip, I can actually (just!) view phlog posts wrapped at 70 chars in PocketGopher no problem, without any weird wrapping artifacts. If we recommended wrapping at 60 chars just to be safe, I actually think plain text would be workable on the vast majority of mobile devices. But I'm happy to hear from people with devices where this wouldn't be true.

-Solderpunk
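As a rough illustration of the two options discussed above, the following Python sketch hard-wraps prose at a fixed width, as an author might do before publishing. The 60-column default follows the suggestion in the message; the function name `wrap_paragraphs` and the rule that paragraphs are separated by blank lines are assumptions.

```python
import textwrap


def wrap_paragraphs(text, width=60):
    """Reflow each blank-line-separated paragraph to at most `width` columns."""
    wrapped = []
    for paragraph in text.split("\n\n"):
        # Collapse existing line breaks so the paragraph can be refilled.
        joined = " ".join(paragraph.split())
        wrapped.append(textwrap.fill(joined, width=width))
    return "\n\n".join(wrapped)
```

Note that the whitespace collapsing step is exactly what destroys ASCII art and hand-aligned tables, which is the harm from client-side reflow described above; text that is pre-wrapped by the author and shown verbatim avoids it.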
> I both agree and disagree with the above sentiment. I agree the basic gemini file structure that has already been preliminarily approved does solve many of the issues people find with gopher (at least when combined with the response header/mime). I like this fine and am still happy to move forward with it (though some minor markup would be neat).

I am *super* happy with the extent to which Gemini as-specced solves common complaints/pain-points with Gopher. Lack of text reflowing is, I think, the *only* even half-way common Gopher complaint/limitation where we haven't made really very major progress. For everything else, we've knocked it out of the park.

> I had written a lot more here in support of markdown... but the more I thought about it the more it felt like a separate concern. It might be a good idea for a few developers that are planning on making feature rich clients get together and try to standardise on some form of markdown support so that a community standard can arise without the need of it being included in the gemini spec itself. If things went that way I would advocate for keeping the gemini file format (with the line based links) so that simple clients can still be built.

If people wanted to produce, as a kind of parallel project, a complete and detailed specification of "Markdown" which lacked anything gross (like embedded HTML), I think I'd be happy to have the Gemini spec say "Gemini clients which opt to support text/markdown responses should do so using such-and-such definition of Markdown", referring to the results of that project.

-Solderpunk
This is really a client concern to me and I think this should not be in the spec. If one wants to serve markdown they already have `text/markdown` at their disposal.
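A client-side sketch of that idea, in Python: read the MIME type from the response header and pick a renderer, leaving the spec out of it. The header layout assumed here (a status field, whitespace, then a MIME type with optional ";"-separated parameters) and the names `parse_header` and `choose_renderer` are illustrative assumptions, not text from the spec.

```python
def parse_header(header_line):
    """Split a response header into (status, media_type, params).

    Assumed layout: STATUS <whitespace> MIME-TYPE[; key=value ...]
    """
    fields = header_line.strip().split(None, 1)
    if not fields:
        raise ValueError("empty response header")
    status = fields[0]
    meta = fields[1] if len(fields) > 1 else ""
    media_type, *raw_params = [part.strip() for part in meta.split(";")]
    params = dict(p.split("=", 1) for p in raw_params if "=" in p)
    return status, media_type.lower(), params


def choose_renderer(media_type):
    """Map a MIME type to a renderer name (names are hypothetical)."""
    if media_type == "text/gemini":
        return "gemini"
    if media_type == "text/markdown":
        return "markdown"
    if media_type.startswith("text/"):
        return "plain"  # any other text type is shown verbatim
    return "download"
```

A simple client could leave out the "markdown" branch entirely and fall through to "plain", which matches the earlier observation that most Markdown formatting can just be displayed literally.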
From IRC:

<solene> I'm on a quite fun project I'll write about soon. I rewrote a minimalism markup system, it's markdown but only with titles and code blocks supports :D the html converter is 18 lines of awk (1 instruction / line)

For those unfamiliar, solene is an OpenBSD developer and posts here:

gopher://dataswamp.org/1/~solene

She's not working on Gemini, but this mini-markup she's developing may prove useful for folks writing clients. As we've discussed, it probably doesn't make sense to support the full markdown spec (even if you can choose one flavor). Her micro-version might be handy.

Also, many have said they don't want to translate to HTML for rendering, which makes sense for most applications. Regardless, her awk parsing will be a helpful aid to anyone looking to write their own.

I've informed her about potential Gemini applications and she said she'll be sharing her code soon. I'll post a link to the list when it comes up.

- tomasino
Solene is fast. Here's the awk parser:

https://ttm.sh/HX.awk

You can see that the code blocks she's supporting are those starting with 4 spaces, not the ``` fences. Perhaps this will inspire some of you for simple markup in your clients.

-tomasino
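For readers who want a feel for how small such a markup can be, here is a Python sketch of the same idea: titles are lines starting with "#", code blocks are runs of lines indented by four spaces, and everything else is plain text. This is not a port of Solene's awk script; the function name `parse_mini_markup` and the tuple layout are assumptions made for illustration.

```python
def parse_mini_markup(text):
    """Parse titles and 4-space code blocks into (kind, level, text) tuples.

    kind is "title", "code", or "text"; level is the number of leading
    "#" characters for titles and 0 otherwise.
    """
    blocks = []
    for line in text.splitlines():
        if line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            blocks.append(("title", level, line[level:].strip()))
        elif line.startswith("    "):
            if blocks and blocks[-1][0] == "code":
                # Consecutive code lines are merged into one block.
                blocks[-1] = ("code", 0, blocks[-1][2] + "\n" + line[4:])
            else:
                blocks.append(("code", 0, line[4:]))
        elif line.strip():
            blocks.append(("text", 0, line.strip()))
    return blocks
```

The result is a flat list rather than HTML, so a terminal client can style titles and print code blocks verbatim without pulling in any web machinery.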
Thanks a lot for sharing this with the list, Tomasino! This is very timely work by Solene.

-Solderpunk

On Wed, Aug 21, 2019 at 11:23:35AM +0000, James Tomasino wrote:

> Solene is fast. Here's the awk parser:
> https://ttm.sh/HX.awk
>
> You can see her code blocks she's supporting are those starting with 4 spaces, not the ``` fences. Perhaps this will inspire some of you for simple markup in your clients.
>
> -tomasino
It was thus said that the Great James Tomasino once stated:

> From IRC:
>
> <solene> I'm on a quite fun project I'll write about soon. I rewrote a minimalism markup system, it's markdown but only with titles and code blocks supports :D the html converter is 18 lines of awk (1 instruction / line)
>
> For those unfamiliar, solene is an openbsd developer and posts here:
> gopher://dataswamp.org/1/~solene
>
> She's not working on Gemini, but this mini-markup she's developing may prove useful for folks writing clients. As we've discussed, it probably doesn't make sense to support the full markdown spec (even if you can choose one flavor). Her micro-version might be handy.
>
> Also, many have said they don't want to translate to HTML for rendering, which makes sense for most applications. Regardless, her awk parsing will be a helpful aid to anyone looking to write their own.
>
> I've informed her about potential Gemini applications and she said she'll be sharing her code soon. I'll post a link to the list when it comes up.

Having written my own markup language, I think I can offer some notes and insights into the process. I am in no way trying to promote what I have as a standard, as it barely works for me. Also, it outputs HTML directly (I use it for blogging). Anyway.

I came up with my own markup language [1], based off OrgMode from Emacs. It's over 700 lines of Lua code [2] geared towards how I write blog entries. I love it and absolutely hate it at the same time. Mainly, I hate it because I keep forgetting what are "special" characters and what aren't, and more often than not, I end up with half the post in an italicized monospaced font. The other problem is when a block section isn't parsed properly, and I have to hunt down what corner case I ran into this time (in fact, just today [3] I fixed yet another corner case in rendering, and that happens nearly every time I use the thing).

Yes, it starts out simple. Then you want to format this, and then that, and then the other thing, and pretty soon you end up with 700+ lines of exceptions threatening to explode.

As an aside, there are a few aspects I do love about the system. One is the automatic conversion of HTML entities (with exceptions like &amp; and &lt; for technical reasons) into their UTF-8 equivalents. This lets me type those characters I can't type, like "ü" as "&uuml;". It can also handle entities like "&#9772;". The second aspect is conversion of some short-cut sequences like `` (two grave accents) into “ (typographical quotes). A third aspect is the ability to handle acronyms---that is, it converts TCP into <abbr title="Transmission Control Protocol">TCP</abbr> but on an ad-hoc, defined in file, way. [4]

Anyway, I digress. So let's define a simple markup language just to discuss some issues.

A line starting with '#' is a header. _This is italic text_. And **some bold text**. And

* a bullet list
* uses a single asterisk, bold uses two.
* This is an unordered list.

But

1 this is a numbered list
2 like this
3 see?

Otherwise, paragraphs are reflowed. Links ... um ... just for now, let's go with [[http://boston.conman.org][this format for links]].

And already there are issues, such as: for a bulleted list, are these the same?
Sean Conner writes:

> The point here isn't to try to define a format, my point is to say it's a messy, difficult job with lots of people wanting different things. Anyway, I rambled on long enough ...

Thanks. This is a very useful contribution to the discussion, and to me it suggests that maybe we are best off not defining any markup in text/gemini other than links, but noting that text MAY be reflowed at the client's discretion, meaning that you cannot rely on ASCII art or ASCII tables working.

An aside not directly related to this: I'm playing with writing a Gnome client, and my intention is for it to support text/plain, text/gemini, and text/markdown natively, as well as Gophernicus-flavored gophermaps. It's in Python, and the Python library 'mistletoe' supports returning an AST that I plan on using to do formatting in a GtkTextView.

--
Jason McBrayer         | “Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                       | but stranger still is lost Carcosa.”
                       | -- Robert W. Chambers, The King in Yellow
I installed Jetforce on gopher.black tonight and immediately ran into a very basic question. What is the _name_ of the default index file? I looked through the spec (granted not closely) and didn't see it defined. In the end I read through Michael's source for Jetforce and found that he's using ".gemini" to override the default file listing and serve an index gemini file. Is that standard? Is there a standard? That seems like a fairly basic thing we might want to include. It seems small now, but if a Gemini server were to grow large and wanted to change server software that could become a big barrier. Anyway, check out my mostly empty Gemini server: gemini://gopher.black
It was thus said that the Great James Tomasino once stated:

> I installed Jetforce on gopher.black tonight and immediately ran into a very basic question. What is the _name_ of the default index file?

It depends upon the server, I would guess. I don't see why the name of an index file would be a standard.

> I looked through the spec (granted not closely) and didn't see it defined. In the end I read through Michael's source for Jetforce and found that he's using ".gemini" to override the default file listing and serve an index gemini file.

And I use a file called 'index.gemini', modelled after Apache's 'index.html' file (the name of which is configurable so it doesn't even have to be that).

> Is that standard? Is there a standard? That seems like a fairly basic thing we might want to include. It seems small now, but if a Gemini server were to grow large and wanted to change server software that could become a big barrier.

It didn't seem to be a problem with having .htm and .html files.

On my server, if a directory contains an index.gemini, it will be served, otherwise a default page will be constructed. Additionally, if a file has an extension of ".gemini" it will be served up with a MIME type of "text/gemini". But that's on my server. Others may vary.

-spc (For instance, I've seen ".gmi" used as an extension)
I think the idea of a server defining (or allowing to be defined) the default file seems fine. Tomasino does bring up a solid point that standardizing to some degree or other on something would make migrating to other servers easier... but I don't know that it is a huge deal. Particularly if, as is the case on Sean's server software, an admin can set the default file.

I do think it would be nice if we settled on .gemini or .gmi or some such. Or at the very least do some form of limiting to those two. Since there is not a shebang in the file, the suffix can make a difference for mime sniffing (both for server and other software that might want to access gemini files). Having two isn't a big deal, but a wildly divergent or per-server naming convention feels needless since it can really be an easy single line in the spec. I don't feel crazy strongly about which, though I have a slight inclination toward gmi.

Not sure if others will feel this needs to be defined in the spec, but look forward to hearing thoughts.

-- Sent with https://mailfence.com Secure and private email
On Thu, Aug 22, 2019 at 7:38 PM James Tomasino <tomasino at lavabit.com> wrote:

> I installed Jetforce on gopher.black tonight and immediately ran into a very basic question. What is the _name_ of the default index file? I looked through the spec (granted not closely) and didn't see it defined. In the end I read through Michael's source for Jetforce and found that he's using ".gemini" to override the default file listing and serve an index gemini file.
>
> Is that standard? Is there a standard? That seems like a fairly basic thing we might want to include. It seems small now, but if a Gemini server were to grow large and wanted to change server software that could become a big barrier.

I went with ".gemini" because it was simple and made sense to me at the time. But now that you bring it up, it could just as easily be named something else and I wouldn't mind changing Jetforce if people come to a consensus.

I like the ".gmi" file extension that's being used over at carcosa.net. That file extension appears to currently only be used by some obscure GPS mapping file [1], so that shouldn't be a problem. I can't figure out why their "index.gml" file uses a different file extension though. Is this trying to follow some existing convention? Otherwise, I'm leaning toward "index.gmi".

Coincidentally, I just released a new version of Jetforce tonight [2]. One of the new features is support for recognizing files with the ".gmi" extension as text/gemini.

> Anyway, check out my mostly empty Gemini server:
>
> gemini://gopher.black

Nice! Please feel encouraged to spam the project issue tracker if you notice any bugs or think of features that you would like to see added.

[1] https://filext.com/file-extension/GMI
[2] https://github.com/michael-lazar/jetforce/releases/tag/v0.0.6
Michael Lazar writes:

> I like the ".gmi" file extension that's being used over at carcosa.net. That file extension appears to currently only be used by some obscure GPS mapping file [1], so that shouldn't be a problem. I can't figure out why their "index.gml" file uses a different file extension though. Is this trying to follow some existing convention? Otherwise, I'm leaning toward "index.gmi".

If I'm using index.gml instead of index.gmi somewhere, it's a typo or a thinko.

The situation with HTTP and HTML is that the name of the index file is not specified in any standard, and is up to the server. In Apache it's configurable, and it's common to support the first one found of index.html, index.htm, index.php, and probably even index.txt. I would expect the situation to be the same in gemini; there's no strict protocol reason that we even have to have index files at all, but I think most of us are used to them and would prefer them. It might be a convention for server authors to follow that you serve the first of index.gemini, index.gmi, and .gemini. And possibly index.txt or index.md.

--
Jason McBrayer         | “Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                       | but stranger still is lost Carcosa.”
                       | -- Robert W. Chambers, The King in Yellow
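The "serve the first one found" convention suggested above is small enough to sketch in Python. This is not code from Jetforce or any other server mentioned in the thread; the names `find_index`, `guess_mime`, and `INDEX_CANDIDATES` are hypothetical, and the candidate order simply follows the list in the message.

```python
from pathlib import Path

# Candidate index names, tried in order (configurable by the caller).
INDEX_CANDIDATES = ("index.gemini", "index.gmi", ".gemini")
# File name endings that should be served as text/gemini.
GEMINI_SUFFIXES = (".gemini", ".gmi")


def find_index(directory, candidates=INDEX_CANDIDATES):
    """Return the first existing index file in `directory`, or None."""
    for name in candidates:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None  # caller falls back to a generated directory listing


def guess_mime(path):
    """Return "text/gemini" for known endings, else None (caller decides)."""
    # Match on the whole file name so the dotfile ".gemini" is covered too.
    if Path(path).name.endswith(GEMINI_SUFFIXES):
        return "text/gemini"
    return None
```

Keeping the candidate list as a parameter reflects the point made in the thread: if the names are configurable per server, moving a site between servers only requires changing one setting.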
On 8/23/19 12:02 PM, Jason McBrayer wrote: > It might be a > convention for server authors to follow that you serve the first of > index.gemini, index.gmi, and .gemini. And possibly index.txt or > index.md. I fully support conventions from the start. It seems there was quite a wide diversion of naming right from the start. Having it as a commonly configurable choice in the servers is very helpful as well. That avoids the porting issues. Thanks for all the feedback. This is a very productive list.
On Fri, Aug 23, 2019 at 8:03 AM Jason McBrayer <jmcbray at carcosa.net> wrote:

> Michael Lazar writes:
> > I like the ".gmi" file extension that's being used over at carcosa.net. That file extension appears to currently only be used by some obscure GPS mapping file [1], so that shouldn't be a problem. I can't figure out why their "index.gml" file uses a different file extension though. Is this trying to follow some existing convention? Otherwise, I'm leaning toward "index.gmi".
>
> If I'm using index.gml instead of index.gmi somewhere, it's a typo or a thinko.

FYI, here's where I got this from:

gemini://carcosa.net/germinal

"Serves index.gml as a directory listing, if it exists"
I set up Gemini on https://tilde.black for all users tonight. We're running Jetforce there (and I wrote an OpenBSD rcctl wrapper if anyone needs one). One of the first questions I got from the community was whether it would work over Tor. I had no idea, so I added a hidden service and tried it:

HiddenServicePort 1965 127.0.0.1:1965

The results can be seen here: https://ttm.sh/gP.png

Tor doesn't work, but a direct connection does. I admit to not grokking TLS, but I suspect it's related to this. Tor uses its own certs to verify the connection, but Gemini will want to use its own. I have no idea if the protocols are compatible. If it's possible, I'd like to implement it on ~black, but I defer to the smarter people on this list.
James Tomasino writes:

> Tor doesn't work, but a direct connection does. I admit to not groking TLS, but I suspect it's related to this. Tor uses its own certs to verify the connection, but Gemini will want to use its own. I have no idea if the protocols are compatible.

I do not know much about Tor hidden services, but I don't think it's related to TLS; HTTPS tunnels over Tor without having its certificates messed up, and we're not doing anything different from HTTPS at that level. Probably it's something in the hidden service setup?

--
Jason McBrayer         | “Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                       | but stranger still is lost Carcosa.”
                       | -- Robert W. Chambers, The King in Yellow
On Fri, Aug 23, 2019 at 12:29:32PM +0000, James Tomasino wrote: > > I fully support conventions from the start. It seems there was quite a > wide diversion of naming right from the start. Having it as a commonly > configurable choice in the servers is very helpful as well. That avoids > the porting issues. > > Thanks for all the feedback. This is a very productive list. > Somebody suggested to me (and I forget exactly who, sorry), that in addition to the formal spec document it might be a good idea to have a separate document of "Gemini Best Practices" which are "only" recommendations or conventions, as opposed to strict requirements, but still relatively important to maintaining a smoothly functioning ecosystem. This seems like a good idea, and clarifying these filename conventions seems like an obvious thing to do there. I'll try to produce such a list soon! -Solderpunk
---