💾 Archived View for gemi.dev › gemini-mailing-list › 000907.gmi captured on 2024-08-31 at 18:53:59. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Hello, everybody!

I know that there is currently no way in Gemini to check the integrity of pages. However, it would be nice for this to be possible.

I think this feature is pretty much required for the protocol to be taken seriously, but that's just my opinion.

Of course, it would be best to do this in a way that doesn't break existing implementations. My ideas are:

a) A file in the same directory as the downloaded file, but with .integrity added at the end, containing a hash as "<algorithm>:<hash-in-hex>". This is definitely compatible with all existing clients, but it breaks a lot of the conventions of Gemini, such as the one-file-per-page rule.

b) Add a new MIME-type parameter, let's call it "integrity" again, that contains the hash in the same format as option a.

It's possible that this would be pointless, so if it is, let me know. This is just an idea to, for example, allow the downloading of larger files properly.

Cheers,
~almaember
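Editorial sketch: a client-side check for the "<algorithm>:<hash-in-hex>" format from option (a) might look roughly like the following. The function name `verify_integrity` and the list of accepted algorithms are assumptions made for illustration; nothing like this is specified for Gemini, and the sketch assumes the GNU coreutils hashing tools are installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Gemini spec): verify FILE against an
# integrity string of the form "<algorithm>:<hash-in-hex>".
# Returns 0 on match, 1 on mismatch, 2 on an unsupported algorithm.
verify_integrity() {
    file=$1
    spec=$2
    algo=${spec%%:*}                                        # text before the first colon
    expected=$(printf '%s' "${spec#*:}" | tr 'A-F' 'a-f')   # hex digest, lowercased
    case $algo in
        sha256|sha512|sha1|md5) tool="${algo}sum" ;;        # GNU coreutils tool names
        *) echo "unsupported algorithm: $algo" >&2; return 2 ;;
    esac
    actual=$("$tool" < "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}
```

With a sidecar file as in option (a), usage could be `verify_integrity page.gmi "$(cat page.gmi.integrity)"`, where both file names are placeholders.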
On 5/18/21 3:48 PM, Almaember wrote:
> I know that there is no way in Gemini right now to check the integrity of pages. However, it would be nice for this to be possible.

Why does page integrity matter? Shouldn't it be obvious to the person viewing a page that the client didn't download the entire page if the download failed midway? One would think that text being truncated mid-sentence would be a fairly obvious indication.

> I think this feature is pretty much required for the protocol to be taken seriously, but that's just my opinion

1. Why do you think integrity checks are necessary for Gemini to be "taken seriously"?
2. Why do you think it's necessary for Gemini to be "taken seriously"?
3. What do you think it means for Gemini to be "taken seriously"?

> It's possible that this would be pointless, so if it is let me know. This is just an idea to, for example, allow the downloading of larger files properly.

When I want to download large files over FTP and HTTP (such as ISO images for GNU/Linux distributions) I can download not only the ISO file itself, but also a checksum file (ideally generated using SHA-3, but SHA-2, SHA-1, or even MD5 will do in a pinch) that I can use to verify that I got the file I wanted in its entirety. I don't see why people who want to share large files can't do the same in Geminispace.

--
Matthew Graybosch
gemini://starbreaker.org/
https://starbreaker.org/
"The lie you tell yourself is the lie that defines you." --Isaac Magnin
> Why does page integrity matter? Shouldn't it be obvious to the person viewing a page that the client didn't download the entire page if the download failed midway? One would think that text being truncated mid-sentence would be a fairly obvious indication.

It would be good for crawlers, which don't have human intelligence and can't decide for themselves what would be obvious to us.

>> I think this feature is pretty much required for the protocol to be taken seriously, but that's just my opinion
>
> 1. Why do you think integrity checks are necessary for Gemini to be "taken seriously"?
> 2. Why do you think it's necessary for Gemini to be "taken seriously"?
> 3. What do you think it means for Gemini to be "taken seriously"?

I take back what I said; it was stupid.

> When I want to download large files over FTP and HTTP (such as ISO images for GNU/Linux distributions) I can download not only the ISO file itself, but a checksum file (ideally generated using SHA-3, but SHA-2, SHA-1, or even MD5 will do in a pinch) that I can use to verify that I got the file I wanted in its entirety. I don't see why people who want to share large files can't do the same in Geminispace.

As written above: the use case is crawlers.

Cheers,
~almaember
Almaember <almaember@disroot.org> wrote:
> This is just an idea to, for example, allow the downloading of larger files properly.

We've had discussions about other features, like including Content-Size or Content-Length information in the response, under the same inspiration of supporting large-file downloads (you can find them in the mailing list archives). The overwhelming takeaway from those discussions was that Gemini /isn't/ really designed for large-file transactions - you want protocols with dedicated features for it, like FTP or HTTP even, so use those instead (or make your own!). Gemini is oriented towards typically short text files intended for direct consumption by a human audience. Everything else, IMO, is secondary.

~aravk | ~nothien
Hello almaember,

Almaember <almaember@disroot.org> writes:
> Hello, everybody!
>
> I know that there is no way in Gemini right now to check the integrity of pages. However, it would be nice for this to be possible.

Integrity in the sense of "the file remained unchanged in transit"? TLS should take care of that. In the sense of "the file is the one that the original author intended it to be"? There are at least two attempts to deal with this. If you dare to check my capsule at

=> gemini://ew.srht.site/

you will find two links there, to openbsd-signify and NetSigil.

When I publish a post, my Makefile takes care to create the corresponding sha256 checksums. They are concatenated into one file, which is then signed using my gpg key. That's one option. The same information is packaged differently into .well-known/signature-bundle. This file is created using openbsd-signify.

There are a few threads on the mailing list, too ...

=> gemini://gemi.dev/gemini-mailing-list/messages/005550.gmi
=> gemini://gemi.dev/gemini-mailing-list/messages/005374.gmi
=> gemini://gemi.dev/gemini-mailing-list/messages/005331.gmi

Also see my first post about experimenting with this:

=> gemini://ew.srht.site/en/2020/20201217-towards-a-proper-flightlog-4.gmi

There are two parts to this, as I see it.

1. Create the checksums/signature in some agreed-upon format. Everyone editing a capsule has to do this. While a bit tedious, it can still be done manually in the shell (Unix-type environment assumed).

2. Upon user request, browsers have to check these agreed-upon locations, download the signed file, possibly download the public key, cache these things properly, and then do the verification. I am not aware that any Gemini browsers have picked this up. But of course, I would be pleased to be proven wrong :)

>snip<

Hope this helps,
~ew

--
Keep it simple!
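Editorial sketch: the two parts described above (the author creates and signs checksums, the reader verifies them) could be approximated as follows. This is not ew's actual Makefile. The function names and the file names `SHA256SUMS` and `SHA256SUMS.asc` are assumptions for illustration, and the sketch assumes GNU findutils and coreutils plus a gpg setup with a default signing key.

```shell
#!/bin/sh
# Hypothetical helpers; file names and layout are assumptions.

# Part 1a (author): hash every .gmi file under ROOT into ROOT/SHA256SUMS.
# -print0 / sort -z / xargs -0 -r are GNU extensions that keep odd file names intact.
make_sums() {
    ( cd "$1" && find . -type f -name '*.gmi' -print0 | sort -z | xargs -0 -r sha256sum > SHA256SUMS )
}

# Part 1b (author): create a detached, ASCII-armored signature of the checksum list.
sign_sums() {
    gpg --batch --yes --armor --detach-sign --output "$1/SHA256SUMS.asc" "$1/SHA256SUMS"
}

# Part 2 (reader): check the signature first, then the hashes it vouches for.
verify_sums() {
    gpg --verify "$1/SHA256SUMS.asc" "$1/SHA256SUMS" &&
        ( cd "$1" && sha256sum -c --quiet SHA256SUMS )
}
```

An author would run `make_sums capsule/ && sign_sums capsule/` before publishing. A reader who has fetched the same files and imported the author's public key would run `verify_sums capsule/`, where `capsule/` is a placeholder directory.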
On 19/05/2021 13:13, ew.gemini wrote:
> Almaember <almaember@disroot.org> writes:
>> On 19/05/2021 09:40, ew.gemini wrote:
>>> Hello almaember,
>>>
>>> Almaember <almaember@disroot.org> writes:
>>>> Hello, everybody!
>>>>
>>>> I know that there is no way in Gemini right now to check the integrity of pages. However, it would be nice for this to be possible.
>>>
>>> Integrity in the sense of "the file remained unchanged in transit"? TLS should take care of that. In the sense of "the file is the one that the original author intended it to be"? There are at least two attempts to deal with this. If you dare to check my capsule at
>>> => gemini://ew.srht.site/
>>> you will find two links there, to openbsd-signify and NetSigil.
>>>
>>> When I publish a post, my Makefile takes care to create the corresponding sha256 checksums. They are concatenated into one file, which is then signed using my gpg key. That's one option. The same information is packaged differently into .well-known/signature-bundle. This file is created using openbsd-signify.
>>>
>>> There are a few threads on the mailing list, too ...
>>> gemini://gemi.dev/gemini-mailing-list/messages/005550.gmi
>>> gemini://gemi.dev/gemini-mailing-list/messages/005374.gmi
>>> gemini://gemi.dev/gemini-mailing-list/messages/005331.gmi
>>>
>>> Also see my first post about experimenting with this:
>>> => gemini://ew.srht.site/en/2020/20201217-towards-a-proper-flightlog-4.gmi
>>>
>>> There are two parts to this, as I see it.
>>>
>>> 1. Create the checksums/signature in some agreed-upon format. Everyone editing a capsule has to do this. While a bit tedious, it can still be done manually in the shell (Unix-type environment assumed).
>>>
>>> 2. Upon user request, browsers have to check these agreed-upon locations, download the signed file, possibly download the public key, cache these things properly, and then do the verification. I am not aware that any Gemini browsers have picked this up. But of course, I would be pleased to be proven wrong :)
>>>
>>> >snip<
>>>
>>> Hope this helps,
>>> ~ew
>>
>> That would be a good way to do it (and better than the one I proposed). It would be good to be able to check the digital signatures of files automatically, with minimal user involvement.
>
> If you would like some technical inspiration, you can check out the Makefile that publishes my capsule at
> https://git.sr.ht/~ew/ew.srht.site/tree/master/item/Makefile
>
> Btw, you replied only to me, not to the list --- just in case.
>
> Cheers,
> ~ew

That's interesting. I did mean to send it to the list, but I clicked the wrong button. I will CC this one to the list, since it has the whole conversation as a quote.

Cheers,
~almaember
On Tue, May 18, 2021 at 11:33:31PM +0100, nothien@uber.space wrote:
> We've had discussions about other features, like including Content-Size or Content-Length information in the response, under the same inspiration of supporting large-file downloads (you can find them in the mailing list archives). The overwhelming takeaway from those discussions was that Gemini /isn't/ really designed for large-file transactions - you want protocols with dedicated features for it, like FTP or HTTP even, so use those instead (or make your own!). Gemini is oriented towards typically short text files intended for direct consumption by a human audience. Everything else, IMO, is secondary.

In the Gemini tradition of supporting multiple URI schemes, you could even use a BitTorrent magnet: URI. As long as you, the capsule owner, are seeding the torrent, it's just as good, and it doesn't require a third-party hosting service even if your capsule is on restricted shared hosting.

For simple files like images, PDFs, or EPUBs, Gemini itself will be fine.
On Wed, 2021-05-19, ew.gemini wrote:
> Integrity in the sense of "the file remained unchanged in transit"? TLS should take care of that.

Not necessarily. The connection can be dropped before the transfer completes, or bits in the file can be flipped for various reasons. I think there is a place for simple, automated integrity checks - not just in Gemini, but also on the Web (like SRI [1], but applicable to all links).

For Gemini, I once suggested using the URI fragment for this, as in "song.mp3#hash:sha256=...", which has the advantage of not requiring additional network requests. Adding hashes to all links on your capsule would not be worth the tedium, but you could use it to protect only a few larger files. It could work on links to third-party sites/capsules as well. On mismatch, clients would warn that the file either was changed or wasn't transferred correctly.

Another approach is to use a well-known file, such as .well-known/SHA256SUMS - like NetSigil [2], but without the signature, reducing it to just this bit of code:

```
# Run from the capsule root. Requires GNU find (-printf) and GNU xargs (-d).
mkdir -p .well-known
# Remove old SHA256SUMS file
rm -f .well-known/SHA256SUMS
# Generate new SHA256SUMS file; -d '\n' keeps file names with spaces intact
hashes=$(find . -type f -printf '%P\n' | sort | xargs -d '\n' sha256sum)
printf '%s\n' "$hashes" > .well-known/SHA256SUMS
```

Doing it this way would require clients to make an extra network request every once in a while - and most of these requests would fail. Unusual requests contribute to client fingerprinting, so this would not be good for implicit use. It should only be done on explicit request by the user.

Signatures are a more complete solution, but they're also more complex - harder for capsule admins to set up and for clients to support.

We could, as tidux suggests, use magnet (or IPFS) links for large files instead, as those protocols have integrity checking built in. However, this must be weighed against the added friction, both for server admins who must install and configure additional software and for visitors who don't have a BitTorrent or IPFS client installed.

[1] https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity
[2] https://tildegit.org/nervuri/NetSigil
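Editorial sketch: the "#hash:sha256=..." fragment idea from the message above could be checked on the client roughly like this once the file has been downloaded. The function name `check_fragment_hash` and its return codes are assumptions for illustration. The fragment syntax is only the one suggested in the message, not an adopted convention.

```shell
#!/bin/sh
# Hypothetical client-side check for links of the form
#   gemini://example.org/song.mp3#hash:sha256=<hex digest>
# Returns 0 on match, 1 on mismatch (with a warning), 2 if the link has no hash fragment.
check_fragment_hash() {
    link=$1   # the link as written in the page
    file=$2   # the already-downloaded copy of the target
    case $link in
        *'#hash:sha256='*) expected=${link##*"#hash:sha256="} ;;
        *) return 2 ;;   # nothing to verify
    esac
    actual=$(sha256sum < "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        return 0
    fi
    echo "warning: $file was changed or was not transferred correctly" >&2
    return 1
}
```

Because the expected digest travels inside the link itself, no extra network request is needed, which is the advantage the message points out.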
nervuri <nervuri@disroot.org> wrote:
> On Wed, 2021-05-19, ew.gemini wrote:
>> Integrity in the sense of "the file remained unchanged in transit"? TLS should take care of that.
>
> Not necessarily. The connection can be dropped before the transfer completes, or bits in the file can be flipped for various reasons. I think there is a place for simple, automated integrity checks - not just in Gemini, but also on the Web (like SRI [1], but applicable to all links).

Sorry, but that's just wrong. TLS already provides the mandatory close_notify signal (and there have been discussions about it before on this ML) for indicating that the complete text has been transferred. And every single authenticated encryption method provided with TLS ensures that the communicated data is the same at both ends - bit flips and the like are detected and such malformed packets are dropped appropriately. One of the mechanisms for this verification is Poly1305 - check it out if you're interested in how and why these work.

> We could, as tidux suggests, use magnet (or IPFS) links for large files instead, as those protocols have integrity checking built-in. However, this must be weighed against the added friction, both for server admins who must install and configure additional software and for visitors who don't have a BitTorrent or IPFS client installed.

This is the correct solution. Note that even protocols like HTTP(S) are fine (as they - in their effort to support everything - support large file transfers), so there would be little to no friction.

~aravk | ~nothien
On Thu, 2021-05-20, nothien@uber.space wrote:
> Sorry, but that's just wrong. TLS already provides the mandatory close_notify signal (and there have been discussions about it before on this ML) for indicating that the complete text has been transferred.

We can't rely on close_notify, unfortunately. According to Lupa [1], "33.3 % of URLs do NOT send a proper TLS shutdown (application close). Even 26.7 % of those who return status 20 are in that case."

[1] gemini://gemini.bortzmeyer.org/software/lupa/stats.gmi

> And every single authenticated encryption method provided with TLS ensures that the communicated data is the same at both ends - bit flips and the like are detected and such malformed packets are dropped appropriately. One of the mechanisms for this verification is Poly1305 - check it out if you're interested in how and why these work.

You're referring to the transfer, but data may be corrupted server-side, on disk or in RAM.
nervuri <nervuri@disroot.org> wrote:
> We can't rely on close_notify, unfortunately. According to Lupa [1], "33.3 % of URLs do NOT send a proper TLS shutdown (application close). Even 26.7 % of those who return status 20 are in that case."
>
> [1] gemini://gemini.bortzmeyer.org/software/lupa/stats.gmi

If servers have not yet been fixed to use close_notify, then there's no hope that they would implement any new companion specs / technologies for providing integrity. If a user of such a server wants integrity, then they should request it of the maintainer of the server code, or switch to a different server; there are many out there with the same features.

>> And every single authenticated encryption method provided with TLS ensures that the communicated data is the same at both ends - bit flips and the like are detected and such malformed packets are dropped appropriately. One of the mechanisms for this verification is Poly1305 - check it out if you're interested in how and why these work.
>
> You're referring to the transfer, but data may be corrupted server-side, on disk or in RAM.

Integrity on the server-side is out of the scope of Gemini, and is really an implementation detail. If a server operator decides that they need to worry about on-disk integrity, then there are already good solutions for that (e.g. RAID); and in-RAM corruption is so rare that I don't think adding a whole Gemini feature is worth it - the costs of adding it (in terms of computation and network transfer) outweigh the benefits of detecting it. In addition, in most cases of on-disk or in-RAM corruption, the end user will easily be able to tell that something went wrong, and if they find that it's a consistent issue, then they can mail the server operator and let them know.

~aravk | ~nothien
On Fri, 2021-05-21, nothien@uber.space wrote:
> If servers have not yet been fixed to use close_notify, then there's no hope that they would implement any new companion specs / technologies for providing integrity.

My suggestions don't entail changes in server software.

> If a user of such a server wants integrity, then they should request it of the maintainer of the server code, or switch to a different server; there are many out there with the same features.

The user of a pubnix or a flounder-style hosting service would likely not be in a position to determine what server is used. But they would be able to create well-known files or append hash fragments to a few links.

The idea of hash fragments for third-party links would be especially interesting to explore, I think.

> Integrity on the server-side is out of the scope of Gemini

As are many features of current Gemini clients.
nervuri <nervuri@disroot.org> wrote:
> On Fri, 2021-05-21, nothien@uber.space wrote:
>> If servers have not yet been fixed to use close_notify, then there's no hope that they would implement any new companion specs / technologies for providing integrity.
>
> My suggestions don't entail changes in server software.

But your suggestions are patch-up solutions over issues with server software. Instead of adding more complexity into the situation, push for better (more correct) server software.

>> If a user of such a server wants integrity, then they should request it of the maintainer of the server code, or switch to a different server; there are many out there with the same features.
>
> The user of a pubnix or a flounder-style hosting service would likely not be in a position to determine what server is used. But they would be able to create well-known files or append hash fragments to a few links.

But they can (and should!) request their providers to update their software. Users should never have to pay for the faults of their server providers anyway.

> The idea of hash fragments for third-party links would be especially interesting to explore, I think.

They would only be feasible for static files, such as tarballs etc. that never change, as otherwise any changes (e.g. fixing typos) would break hash fragments from third-party sites to the current one. The issue thus becomes that it may be misused (i.e. used in the wrong contexts); there is no way to stop this from happening, and it would only add more pain to the situation. Better to rely on explicit hash files in these situations, as is already the convention.

>> Integrity on the server-side is out of the scope of Gemini
>
> As are many features of current Gemini clients.

Such as? I'm considering specifically the communication of data over the network via the Gemini protocol. Gemini clients obviously have more things to do.

~aravk | ~nothien
On Sat, 2021-05-22, nothien@uber.space wrote:
>> The idea of hash fragments for third-party links would be especially interesting to explore, I think.
>
> They would only be feasible for static files, such as tarballs etc. that never change, as otherwise any changes (e.g. fixing typos) would break hash fragments from third-party sites to the current one. The issue thus becomes that it may be misused (i.e. used in the wrong contexts); there is no way to stop this from happening, and it would only add more pain to the situation.

Mismatches can be handled gracefully in the UI - a non-intrusive notification for whoever is interested. Hashes would be used rarely anyway.

> Better to rely on explicit hash files in these situations, as is already the convention.

As you say, the solution already exists. What I'm looking for are ways to make verifying hashes (and signatures) more convenient for end-users.

>>> Integrity on the server-side is out of the scope of Gemini
>>
>> As are many features of current Gemini clients.
>
> Such as? I'm considering specifically the communication of data over the network via the Gemini protocol. Gemini clients obviously have more things to do.

Subscribing to pages, for example [1]. This is done by storing the hash of a page, then fetching it periodically and notifying the user if it changes. Vaguely similar to what I'm suggesting.

[1] https://github.com/makeworld-the-better-one/amfora/wiki/Subscriptions#pages

Anyway, no point dragging on. The niche for this thing may be too small to justify the implementation effort, but if anyone likes the idea, they are free to code it. I might do it at some point.
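Editorial sketch: the page-subscription mechanism described above (store a page's hash, fetch it again later, notify on change) can be illustrated as follows. This is not Amfora's code. The function name `page_changed`, the state-directory layout, and the choice not to report a page on first sight are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical change detector: compares the SHA-256 of a freshly fetched
# copy (FILE) with the hash stored under STATE_DIR/KEY, then stores the new one.
# Returns 0 if the page changed since the last call, 1 otherwise.
page_changed() {
    state_dir=$1
    key=$2      # any stable name for the subscribed page
    file=$3     # the copy that was just fetched
    new=$(sha256sum < "$file" | cut -d' ' -f1)
    if [ -f "$state_dir/$key" ]; then
        old=$(cat "$state_dir/$key")
    else
        old=$new    # first sight: nothing to compare against, so report no change
    fi
    printf '%s\n' "$new" > "$state_dir/$key"
    [ "$new" != "$old" ]
}
```

A client would call this after each periodic fetch and show a notification only when it returns 0.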
On Tue, May 18, 2021 at 09:48:31PM +0200, Almaember <almaember@disroot.org> wrote a message of 26 lines which said:
> I know that there is no way in Gemini right now to check the integrity of pages. However, it would be nice for this to be possible.
>
> I think this feature is pretty much required for the protocol to be taken seriously,

The Web does not have such a feature. Does that mean it is not taken seriously?
On Fri, May 21, 2021 at 07:08:22PM +0100, nothien@uber.space <nothien@uber.space> wrote a message of 37 lines which said:
> and in-RAM corruption is so rare

Unfortunately, this does not seem to be true. Not everyone uses ECC memory.

=> http://dinaburg.org/bitsquatting.html
=> http://www.verisigninc.com/assets/VRSN_Bitsquatting_TR_20120320.pdf
=> https://www.slideshare.net/nicknikiforakis/bitsquatting-exploiting-bitflips-for-fun-or-profit
=> https://en.wikipedia.org/wiki/Soft_error
On 30/05/2021 10:39, Stephane Bortzmeyer wrote:
> On Tue, May 18, 2021 at 09:48:31PM +0200, Almaember <almaember@disroot.org> wrote a message of 26 lines which said:
>
>> I know that there is no way in Gemini right now to check the integrity of pages. However, it would be nice for this to be possible.
>>
>> I think this feature is pretty much required for the protocol to be taken seriously,
>
> The Web does not have such a feature. Does it mean it is not taken seriously?

The web, however, has a Content-Length header in HTTP. The point isn't to perform overly complicated checks, but to be able to tell, at least roughly, whether the whole file was transferred successfully.

HTML also seems to allow integrity checks sometimes, such as for CSS and script tags (IIRC).
On Sun, May 30, 2021 at 01:38:54PM +0200, Almaember <almaember@disroot.org> wrote a message of 21 lines which said:
> The web however has a Length header in HTTP. The point isn't to perform overly complicated checks, but to be able to somewhat tell if the whole file got transferred successfully.

Since Gemini requires TLS, the proper solution is to require an explicit TLS close <https://gitlab.com/gemini-specification/protocol/-/issues/2>.
Hi Stephane again,

the issue with the Unicode characters is fixed. - At least half way... - It works for diacritic characters, like the "ü" in my last name:

.
. | .
\ | /
'. \ ' / .'
'. .'```'. .'
<>...:::::::`.......`:::::::..<>
<>: Frank Jüdes :<>
<>:..........................:<>
<><><><><><><><><><><><><><><><>

But it will not work with other Unicode characters, for example emoticons:

.
. | .
\ | /
'. \ ' / .'
'. .'```'. .'
<>...:::::::`.......`:::::::..<>
<>: Frank Jüdes is a 🦆 :<>
<>:..........................:<>
<><><><><><><><><><><><><><><><>

The string length of multibyte characters is not calculated correctly by the /boxes/ program. It's a problem with the /boxes/ program that I cannot fix. 😟

Can you please check your string again and tell me if it works now? - Thank you very much in advance for your help.

Best regards from Charleston (WV),
Frank/2

On 2021-05-30 07:51, Stephane Bortzmeyer wrote:
> On Sun, May 30, 2021 at 01:38:54PM +0200, Almaember <almaember@disroot.org> wrote a message of 21 lines which said:
>
>> The web however has a Length header in HTTP. The point isn't to perform overly complicated checks, but to be able to somewhat tell if the whole file got transferred successfully.
>
> Since Gemini requires TLS, the proper solution is to require an explicit TLS close <https://gitlab.com/gemini-specification/protocol/-/issues/2>.

--
My Gemini capsule orbits at gemini://h2903872.stratoserver.net/
Ooops! - Wrong thread! Sorry about that! ☺️

--
Frank Jüdes | Senior Principal Consultant | +1.713.885.4421
Oracle America Inc. <http://www.oracle.com>
1200 Smith Street, Suite 1500
Houston, TX 77002