While I love the simplicity of gemtext, the inconsistency of the line rules is irritating. Specifically, the whitespace between the start-of-line identifier and the rest of the line. Headings? Optional. Lists? Not optional. Turns out whitespace between the "=>" and the URL of a link line is optional. I fixed a bug in Kennedy's crawler, where I was only parsing out link lines that had whitespace after the "=>". That one-character bug fix led to the crawler discovering 15% more URLs!!! Kennedy is now indexing ~315K URLs.
2 years ago · 👍 superfxchip, bavarianbarbarian
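For anyone checking their own crawler, here is a minimal sketch of the kind of bug described in the post above. It is not Kennedy's actual code: `BUGGY_LINK_RE`, `LINK_RE`, and `parse_link_line` are hypothetical names, and the patterns assume the gemtext rule that a link line is "=>", optional spaces or tabs, a URL, and then an optional label separated by whitespace.

```python
import re

# Hypothetical "before" pattern: "+" demands at least one space or tab after "=>",
# so a line like "=>gemini://example.org/" is silently skipped.
BUGGY_LINK_RE = re.compile(r"^=>[ \t]+(\S+)(?:[ \t]+(.*))?$")

# Hypothetical "after" pattern: "*" makes the whitespace optional.
# The two patterns differ by exactly one character.
LINK_RE = re.compile(r"^=>[ \t]*(\S+)(?:[ \t]+(.*))?$")


def parse_link_line(line):
    """Return (url, label) for a gemtext link line, or None if it is not one.

    label is None when the line carries no user-friendly link name.
    """
    match = LINK_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    url, label = match.group(1), match.group(2)
    label = label.strip() if label else None
    return url, (label or None)


if __name__ == "__main__":
    for sample in ("=> gemini://example.org/ Home page", "=>gemini://example.org/"):
        print(repr(sample), "->", parse_link_line(sample))
```

With the stricter pattern, the second sample line would not be recognized as a link at all, which is how a crawler could miss a chunk of URLs without ever reporting an error.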
@acidus Ah, you're right. I did see that, actually. I thought it was a bit weird, but I put more precedence on optional over "one or more whitespaces", because technically, you can have an *optional* one or more whitespaces, lol. But this should definitely be brought up on the repo: https://gitlab.com/gemini-specification/protocol/-/issues · 2 years ago
🤷 I mostly agree, but I a) regularly read RFCs and BNF grammars and b) write parsers fairly often. And I still got tripped up by @solderpunk’s “one or more whitespaces”, which sits inside optional brackets, instead of “zero or more whitespace”, especially when every single example that follows uses whitespace 🤬
I know Buran had some bugs with this and header lines too, because I filed them.
Anyway, 1 or 2 odd parts still put Gemini way ahead of most specs for sure · 2 years ago
Wut?????
Ok, I need to check my crawler too · 2 years ago
My understanding is that the design principle behind it is that you only need to examine the first 3 chars to know what line type it is when scanning. The spec is quite clear enough in my opinion. As an author though you can just always add a space after the line type marker, so there's not much you have to remember. · 2 years ago
Sorry, I just really hate this narrative that's become common that gemini or gemtext is "so wildly inconsistent and vague"... like... it's not. 95% of the protocol is very very clear, and gemtext is actually very consistent because there's like... barely anything to it. · 2 years ago
Hm, idk. I actually think it's consistent, because both headings and links have it as optional. The only oddball is lists, and there the space is required for a good reason, I think. Asterisks are common for *bolding* things or notes, etc., so the required space helps distinguish between that and lists. In fact, lists are the *only* thing where the space is required. Not even quotes require a space, and there's a specific reason given for it. That's not very inconsistent. · 2 years ago
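The rules discussed in this thread (the first three characters decide the line type, and only list items need the space) can be sketched as a small classifier. This is an illustration under my reading of the gemtext rules, not code from any real client. `classify_line` and the returned type names are hypothetical.

```python
def classify_line(line, in_preformatted=False):
    """Classify one gemtext line by looking at no more than its first 3 characters.

    in_preformatted says whether the line sits between two "```" toggle lines.
    """
    if line.startswith("```"):
        return "preformat_toggle"
    if in_preformatted:
        # Inside a preformatted block nothing else is interpreted.
        return "preformatted"
    if line.startswith("=>"):
        return "link"          # whitespace after "=>" is optional
    if line.startswith("###"):
        return "heading3"      # whitespace after the "#" marks is optional
    if line.startswith("##"):
        return "heading2"
    if line.startswith("#"):
        return "heading1"
    if line.startswith("* "):
        return "list_item"     # the space is required here
    if line.startswith(">"):
        return "quote"         # no space required
    return "text"


if __name__ == "__main__":
    for sample in ("#Title", "* item", "*bold* note", "=>gemini://example.org/", ">quoted"):
        print(repr(sample), "->", classify_line(sample))
```

Note how "*bold* note" falls through to plain text, which is the case the required space after "*" exists to protect.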