💾 Archived View for gemi.dev › gemini-mailing-list › 000840.gmi captured on 2024-08-19 at 02:16:29. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Hello,

I'm planning on creating my own Gemini client (with C99 and OpenSSL, if you want to know). However, I have a problem with how I should be implementing the downloading part.

So, how should I be buffering a Gemini connection? Since there is no size indicator, I can't be sure everything even fits into memory, or how much time it's going to take to download, or if it's even going to end at all.

Another problem is long-polling. Through my exploration of Geminispace, I found a few capsules that used a hack to push live updates to the client. Specifically, these servers didn't close the connection, but instead kept it open and just sent new data when something happened.

Since the spec says nothing (or I'm blind) about buffering, I want to ask others what I should be doing, while still being able to actually parse the output.

My ideas as of right now:

- Have a big buffer and store everything in it until the connection is closed. The connection would be terminated if:
  - the connection closes
  - the server didn't send anything for X seconds
- Buffer by line and don't close the connection unless the user terminates it (by pressing the stop key or by loading another capsule). Compatible with long polling.

Thanks in advance for any help!

~almaember

On 3/28/21 11:07 PM, almaember wrote:

> or how much time it's going to take to download, or if it's even going to end at all.

As far as I'm aware no client has solved this. If TLS close_notify gets standardized we might get a "connection failed" state, but other than that you won't be able to know how long the content is and whether you successfully downloaded it or not.

> My ideas as of right now:
>
> - Have a big buffer and store everything in it until the connection is closed. The connection would be terminated if:
>   - the connection closes
>   - the server didn't send anything for X seconds

If I know which client you use I can just send a huge file over and possibly overflow your buffer. And it needs to be BIG. I'm aware of at least one capsule serving ~70MB files. (gemini://konpeito.media)

> - Buffer by line and don't close the connection unless the user terminates it (by pressing the stop key or by loading another capsule). Compatible with long polling.

This is what I do on Moonlander. Not by line, as binary data can be sent over Gemini, but in 1029-byte chunks (exactly the maximum length of one Gemini response header, which also saves me the code for splitting the header off).

I send each chunk into a separate "renderer" for the given file type, which in text's case decodes it from the given encoding (default UTF-8) and sends it over to the text/gemini parser, which then does the line splitting and parsing. (In other cases it will just save the bytes directly to a file.)

It's a bit more work, but I have successfully downloaded a few of those 70MB files with this approach, so I'd say it's pretty robust.

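The fixed-size chunk loop described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not Moonlander's code: `read_fn`, `sink_fn`, `pump_chunks`, `mem_conn` and `count_sink` are hypothetical names. The reader callback is assumed to behave like OpenSSL's `SSL_read()` (a positive byte count on success, zero or negative on close or error), so a real client could wrap `SSL_read()` behind it. The in-memory reader is there only so the loop can be tried without a network.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Status (2) + space (1) + META (at most 1024) + CRLF (2) = 1029 bytes. */
enum { CHUNK_SIZE = 2 + 1 + 1024 + 2 };

/* Hypothetical reader: returns bytes read (> 0), or <= 0 on close or error.
 * A real client could wrap SSL_read() behind this signature. */
typedef int (*read_fn)(void *conn, void *buf, int len);

/* Hypothetical "renderer" sink: returns 0 to continue, non-zero to stop
 * (for example when the user presses stop or a size cap is hit). */
typedef int (*sink_fn)(void *ctx, const unsigned char *data, size_t len);

/* Forward chunks from the reader to the sink until the connection ends or
 * the sink asks to stop. Memory use stays constant no matter how large the
 * response is. Returns the total number of bytes read. */
size_t pump_chunks(read_fn rd, void *conn, sink_fn sink, void *ctx)
{
    unsigned char buf[CHUNK_SIZE];
    size_t total = 0;

    assert(rd != NULL && sink != NULL);
    for (;;) {
        int n = rd(conn, buf, (int)sizeof buf);
        if (n <= 0)
            break;                      /* closed or failed */
        total += (size_t)n;
        if (sink(ctx, buf, (size_t)n) != 0)
            break;                      /* consumer asked to stop */
    }
    return total;
}

/* In-memory stand-in for a connection, for experimenting offline. */
struct mem_conn { const unsigned char *data; size_t len; size_t pos; };

int mem_read(void *conn, void *buf, int len)
{
    struct mem_conn *m = conn;
    size_t left = m->len - m->pos;
    size_t n = left < (size_t)len ? left : (size_t)len;

    if (n == 0)
        return 0;                       /* behaves like a closed connection */
    memcpy(buf, m->data + m->pos, n);
    m->pos += n;
    return (int)n;
}

/* Example sink that only counts what it receives. */
struct count_ctx { size_t calls; size_t bytes; };

int count_sink(void *ctx, const unsigned char *data, size_t len)
{
    struct count_ctx *c = ctx;
    (void)data;
    c->calls++;
    c->bytes += len;
    return 0;
}
```

Because the sink can return non-zero, the same loop covers both cases discussed in this thread: a file download that should be aborted past some size, and a long-polling page that runs until the user stops it. An idle timeout would have to live inside the reader, which this sketch leaves out.
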
On Mon, 29 Mar 2021 07:17:31 +0300 Ecmel Berk Canlıer <me@ecmelberk.com> wrote:

> This is what I do on Moonlander. Not by line, as binary data can be sent over Gemini, but in 1029 byte chunks (exactly the max length of one Gemini header. Saves me code in trying to split that too)

That makes sense, it will even save me from having to mess with malloc and realloc.

> I send each chunk into a separate "renderer" for the given file type, which in text's case decodes it into the given encoding (default UTF-8) and sends it over to the text/gemini parser which then does the line splitting and parsing there. (And in other cases it will just save the bytes directly to a file)

One question, what do you do if a single line is split into multiple chunks? I will try checking your source for it, but I don't really understand or know Rust so I'm not sure I will understand it.

~almaember

On 3/29/21 9:22 AM, almaember wrote:

> One question, what do you do if a single line is split into multiple chunks?

https://git.ebc.li/_/moonlander/src/branch/main/gemtext/src/lib.rs

The chunked_parse function (Line 160) reads character by character, saving everything into an "incomplete line" buffer (Line 31). When it encounters a newline, it empties and processes the incomplete line buffer and returns the list of "line objects" corresponding to the markup inside the buffer.

When the connection is closed, the finalize function (Line 221) runs and parses the incomplete line buffer one last time to complete the document.

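For readers who don't know Rust, the same idea can be sketched in C. This is not a translation of Moonlander's code, only a minimal illustration of the "incomplete line" buffer technique. All names (`line_buf`, `lb_feed`, `lb_finish`, `count_line`) are hypothetical, and the callback receives raw bytes plus a length rather than parsed line objects.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Called once per complete line; `line` is NOT NUL-terminated. */
typedef void (*line_fn)(void *ctx, const char *line, size_t len);

/* Bytes of the current, not yet terminated line. Zero-initialize before use. */
struct line_buf { char *data; size_t len; size_t cap; };

/* Append one byte, doubling the storage when full. Returns -1 on OOM. */
static int lb_push(struct line_buf *lb, char c)
{
    if (lb->len == lb->cap) {
        size_t ncap = lb->cap ? lb->cap * 2 : 128;
        char *p = realloc(lb->data, ncap);
        if (p == NULL)
            return -1;
        lb->data = p;
        lb->cap = ncap;
    }
    lb->data[lb->len++] = c;
    return 0;
}

/* Hand the pending line to the callback (dropping a trailing '\r') and
 * reset the buffer for the next line. */
static void lb_emit(struct line_buf *lb, line_fn cb, void *ctx)
{
    size_t n = lb->len;
    if (n > 0 && lb->data[n - 1] == '\r')
        n--;
    cb(ctx, lb->data ? lb->data : "", n);
    lb->len = 0;
}

/* Feed one chunk. A line may start in one chunk and end in a later one;
 * whatever is left over simply stays in the buffer. */
int lb_feed(struct line_buf *lb, const char *chunk, size_t n,
            line_fn cb, void *ctx)
{
    assert(lb != NULL && cb != NULL);
    for (size_t i = 0; i < n; i++) {
        if (chunk[i] == '\n')
            lb_emit(lb, cb, ctx);
        else if (lb_push(lb, chunk[i]) != 0)
            return -1;
    }
    return 0;
}

/* Call when the connection closes: flush a final unterminated line, if any,
 * and release the storage. */
void lb_finish(struct line_buf *lb, line_fn cb, void *ctx)
{
    if (lb->len > 0)
        lb_emit(lb, cb, ctx);
    free(lb->data);
    lb->data = NULL;
    lb->len = lb->cap = 0;
}

/* Example callback: counts lines and remembers the last line's length. */
struct line_stats { size_t lines; size_t last_len; };

void count_line(void *ctx, const char *line, size_t len)
{
    struct line_stats *s = ctx;
    (void)line;
    s->lines++;
    s->last_len = len;
}
```

Only the pending line is held in memory, so the buffer grows with the longest line rather than with the whole document. A client worried about hostile servers might also cap `cap` at some limit, which this sketch does not do.
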
almaember <almaember@disroot.org> writes:

> Hello,
>
> I'm planning on creating my own Gemini client (with C99 and OpenSSL, if you want to know). However, I have a problem with how I should be implementing the downloading part.
>
> So, how should I be buffering a Gemini connection? Since there is no size indicator, I can't be sure everything even fits into memory, or how much time it's going to take to download, or if it's even going to end at all.
>
> Another problem is long-polling. Through my exploration of Geminispace, I found a few capsules that used a hack to push live updates to the client. Specifically, these servers didn't close the connection, but instead kept it open and just sent new data when something happened.
>
> Since the spec says nothing (or I'm blind) about buffering, so I want to ask others about what I should be doing, while still being able to actually parse the output.
>
> My ideas as of right now:
>
> - Have a big buffer and store everything in it until the connection is closed. The connection would be terminated if:
>   - the connection closes
>   - the server didn't send anything for X seconds
> - Buffer by line and don't close the connection unless the user terminates it (by pressing the stop key or by loading another capsule). Compatible with long polling.
>
> Thanks in advance for any help!
>
> ~almaember

I'm working on a client and doing exactly as Ecmel Berk Canlıer outlined in his mail. I'm fetching the page one chunk at a time (it allows me to use a fixed buffer and plays nice with asynchronous I/O); each chunk then gets sent to a proper renderer (at the moment only text/gemini and one generic text/* handler). There it gets parsed and split line by line.

The only part where I actually have to keep a dynamic buffer is when splitting into lines, because a line can be (and often is) split across chunks, and can also be bigger than one chunk.

I'm using LibreSSL (plus some other OpenBSD goodies), but if it saves you some time here is:

On Mon, Mar 29, 2021 at 07:17:31AM +0300, Ecmel Berk Canlıer <me@ecmelberk.com> wrote a message of 36 lines which said:

> As far as I'm aware no client has solved this. If TLS close_notify gets standardized we might get a "connection failed" state,

Note that the question is currently under discussion in the specification work <https://gitlab.com/gemini-specification/protocol/-/issues/2>.
---