mbays mbays at sdf.org
Wed Jul 7 18:34:28 BST 2021
- - - - - - - - - - - - - - - - - - -
I checked how Lagrange handles the Byte Order Mark (BOM), and sure
enough it breaks the first line's type detection.
If we follow section 6 of RFC 3629, it looks like the right thing to do is to interpret this character as a zero-width no-break space, even when it is the first character of UTF-8-encoded gemtext. The first line should then be interpreted as a text line, whatever follows the BOM.
Another thing to clarify in the next version of the spec.
"""A protocol SHOULD forbid use of U+FEFF as a signature for those textual protocol elements that the protocol mandates to be always UTF-8, the signature function being totally useless in those cases.
A protocol SHOULD also forbid use of U+FEFF as a signature for those textual protocol elements for which the protocol provides character encoding identification mechanisms, when it is expected that implementations of the protocol will be in a position to always use the mechanisms properly. This will be the case when the protocol elements are maintained tightly under the control of the implementation from the time of their creation to the time of their (properly labeled) transmission.
[...]
When a protocol forbids use of U+FEFF as a signature for a certain protocol element, then any initial U+FEFF in that protocol element MUST be interpreted as a "ZERO WIDTH NO-BREAK SPACE"."""
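
To make the reading above concrete, here is a minimal Python sketch of what it would mean for first-line type detection. It is not Lagrange's code and not anything the spec prescribes: the names line_type and first_line_type are made up for illustration, the classifier covers only the line prefixes and ignores preformatted state, and the only assumption is that an initial U+FEFF is kept as an ordinary character rather than stripped.

```python
# Hypothetical sketch: classify the first line of a gemtext document while
# treating an initial U+FEFF as a ZERO WIDTH NO-BREAK SPACE (an ordinary
# character), as suggested by the reading of RFC 3629 section 6 above.

def line_type(line: str) -> str:
    """Classify one gemtext line by its prefix (preformatted state ignored)."""
    if line.startswith("=>"):
        return "link"
    if line.startswith("```"):
        return "preformat-toggle"
    if line.startswith("#"):
        return "heading"
    if line.startswith("* "):
        return "list"
    if line.startswith(">"):
        return "quote"
    return "text"


def first_line_type(raw: bytes) -> str:
    """Return the type of the first line of a UTF-8 gemtext body."""
    # Python's plain "utf-8" codec keeps a leading EF BB BF as U+FEFF;
    # the "utf-8-sig" codec would strip it instead.
    text = raw.decode("utf-8")
    first = text.split("\n", 1)[0].rstrip("\r")
    return line_type(first)
```

Under this reading, a body that starts with the bytes EF BB BF followed by "# Title" has a first line that begins with U+FEFF rather than "#", so it is classified as a text line; the same body without the BOM is classified as a heading. A client that would rather ignore the BOM could decode with "utf-8-sig" instead, but that is a different policy from the one quoted above.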