💾 Archived View for ageinghacker.net captured on 2024-08-18 at 16:57:44. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Welcome to Luca Saiu's experimental Gemini capsule hosted on ageinghacker.net.
The capsule was opened on 2022-01-16.
I plan to actually publish data and use this.
Written by Luca Saiu.
Luca Saiu's Gopher hole (still without much content)
This is the result of machine-converting an HTML page to Gemini markup.
The source HTML is originally machine-generated, but not too complex:
The web page I am using as a test...
The conversion is unfortunately quite disappointing:
...the page above machine-converted to Gemini
My script handles Gopher as well, and the result is similar. See:
...the page above machine-converted to Gopher
Here is a Unicode test using everybody's favourite code point: 💩.
Let me see how a lot of text gets rendered. Here I am interested in a very long paragraph, which in the source file is just a single long line, containing no \n or \r character. How will this be rendered? Of course it depends on the client: now I am testing with lagrange which, I have to say, looks quite stunning.
Here comes a preformatted mode test:
(defun fact (n) (if (zerop n) 1 (* n (fact (1- n)))))
And here comes a list:
This is quoted text:
foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux.
I do not plan to actually use these features; here I am playing to see what clients support.
(This section used to contain some ANSI terminal escape sequences. The ones
setting colour appear to work well across the clients I tested.)
(2022-01-23 update: see Message-ID: <ssjhge$k74$1@dont-email.me>, which also includes my ugly conversion script:
From: Luca Saiu
Subject: Simple conversions from HTML to simple markups are disappointing
Newsgroups: comp.infosystems.gemini, comp.infosystems.gopher
Date: Sun, 23 Jan 2022 13:25:29 +0100 (CET)
news://ssjhge$k74$1@dont-email.me
)
I should be able to turn my HTML, which I kept simple (I at least had the good taste of never using JavaScript), into something usable by both Gemini and Gopher, starting from something like this:
lynx -nomargins -dont_wrap_pre -force_html -image_links -dump http://ageinghacker.net
After that I can massage the output to turn the reference section into a Gopher-style menu. Notice that -nomargins keeps Lynx from breaking paragraphs into lines with newlines, which is useful for Gemini but not for Gopher. For Gopher I can rewrap the text by piping it through fmt.
However -nomargins makes it difficult to distinguish headings (h1, h2 and so on) from ordinary text.
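The post-processing described above could be sketched along these lines. This is only a minimal illustration, not my actual conversion script: the function names refs_to_gemini and wrap_for_gopher are made up for the example, the layout of the Lynx reference section (a line reading "References" followed by entries such as "   1. URL") is an assumption about the dump format, and the 70-column width is arbitrary.

```shell
# Sketch: rewrite the numbered reference list at the end of a Lynx dump
# as Gemini link lines ("=> URL").
# Assumption: the list starts at a line reading "References" and each
# entry looks like "   1. URL"; anything else after that line is dropped.
refs_to_gemini() {
    awk '
        /^[ \t]*References[ \t]*$/ { in_refs = 1; next }
        in_refs && /^[ \t]*[0-9]+\.[ \t]+/ {
            sub(/^[ \t]*[0-9]+\.[ \t]+/, "")   # strip the "  1. " prefix
            print "=> " $0
            next
        }
        !in_refs { print }                      # body text passes through
    '
}

# Sketch: rewrap long paragraph lines for Gopher, where clients do not
# reflow text. The width of 70 columns is an arbitrary choice.
wrap_for_gopher() {
    fmt -w 70
}
```

Both functions read standard input, so either one can be placed at the end of the lynx pipeline shown earlier. A real Gopher menu would additionally need item-type lines rather than "=>" lines, which this sketch does not attempt.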
Interesting options:
Anticipated problems:
Notice that if the locale is C, Lynx seems to change the character encoding to ISO-8859, at least in my pages, which contain some French but are all UTF-8. This is easy to undo by defining LC_ALL to be empty, so that the rest of the locale environment applies again. Tested on the server as well.
Compare:
[luca@abelson ~]$ LC_ALL=C lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/scratch | head --lines=15 | tail --lines=6 | file -
/dev/stdin: ISO-8859 text
with
[luca@abelson ~]$ LC_ALL= lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/scratch | head --lines=15 | tail --lines=6 | file -
/dev/stdin: UTF-8 Unicode text
On the other hand, maybe that is a useful feature. In Gemini UTF-8 should be well-supported, but Gopher clients might work better with an old-style character encoding. Is there a way to declare the encoding? If I actually start writing in Russian I could use a traditional Russian encoding.
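If a legacy encoding did turn out to suit Gopher clients better, the conversion would not have to go through the locale at all: the UTF-8 dump could be re-encoded afterwards with iconv. The following is a minimal sketch under that assumption. The function names are hypothetical, KOI8-R is just one example of a traditional Russian encoding, and whether a given encoding name is accepted depends on the local iconv installation.

```shell
# Sketch: re-encode UTF-8 text into a legacy encoding, and back.
# $1 is an encoding name as understood by the local iconv, e.g. KOI8-R.
to_legacy() {
    iconv -f UTF-8 -t "$1"
}

from_legacy() {
    iconv -f "$1" -t UTF-8
}
```

For instance, appending "| to_legacy KOI8-R" to the UTF-8 pipeline would produce one byte per Cyrillic letter, and from_legacy KOI8-R reverses it. This does not answer the question of how to declare the encoding to the client.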
Let me play with Russian text.
the test page with Russian text I am using for this conversion attempt
[luca@abelson ~]$ LC_ALL=C lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8
[1]Luka Sayu -- Russkij
Privet. Menya zovut Luka Sayu, ili �Luca Saiu�. Priyatno poznakomit'sya. `Eta stranica test po-russki. �Test� slovo dejstvitel'noe ? Mo'zhet byt' ya dolzhen pisat' �proba�. Izvinite, ya eschio novichok po-russki.
___________________________________________________________________________
___________________________________________________________________________
[2][haker simbol] [3]Luka Sayu (glavnaya stranica, po-anglijski)
Poslednyaya modifikaciya: 2022-01-17
___________________________________________________________________________
[luca@abelson ~]$ LC_ALL=C lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8 | file -
/dev/stdin: ISO-8859 text
It transliterates! Which is awesome in a sense; but I guess iconv does the actual job. A pity that the guillemets “«” and “»” are not translated correctly, and in fact (according to what lagrange displays for Gemini, not in the source) those characters even garble the quoted text and more. I could make a separate pass to convert them if I used this transliteration to ISO-8859.
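The separate pass for the guillemets could be as small as one tr invocation. This sketch assumes that Lynx emits them as the ISO-8859-1 bytes 0xAB and 0xBB (octal 253 and 273), which I have not verified byte by byte, and it replaces both with a plain ASCII double quote; the function name is made up for the example.

```shell
# Sketch: replace Latin-1 guillemets with ASCII double quotes.
# Assumption: the input is ISO-8859-1, where 0xAB is the opening and
# 0xBB the closing guillemet. LC_ALL=C makes tr work on single bytes.
fix_guillemets() {
    LC_ALL=C tr '\253\273' '""'
}
```

It would go after the LC_ALL=C lynx command in the pipeline. Running it on UTF-8 text would be a mistake, since those byte values also occur inside multi-byte sequences.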
I cannot seem to get lynx to output any variant of KOI8 on abelson; I probably did not configure those locale data there. This probably does the right thing on my laptop, even if the file utility does not recognise the text:
[luca@moore ~/projects-by-others/lagrange/lagrange-git/_build]$ LC_ALL=ru_RU.KOI8-R lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8 | file -
/dev/stdin: JSON data
[luca@moore ~/projects-by-others/lagrange/lagrange-git/_build]$ LC_ALL=ru_RU.KOI8-R lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8 > possibly-koi-8
Of course there is no problem if I do not use a locale setting to artificially constrain what the conversion can do, and let the system use UTF-8:
[luca@abelson ~]$ LC_ALL= lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8
[1]Лука Саю — Русский
Привет. Меня зовут Лука Саю, или «Luca Saiu». Приятно познакомиться. Эта страница тест по-русски. «Тест» слово действительное ? Мо́жет быть я должен писать «проба». Извините, я ещё новичок по-русски.
___________________________________________________________________________
___________________________________________________________________________
[2][хакер симбол] [3]Лука Саю (главная страница, по-английски)
Последняя модификация: 2022-01-17
___________________________________________________________________________
[luca@abelson ~]$ LC_ALL= lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8 | file -
/dev/stdin: UTF-8 Unicode text
So at least for Gemini this solution is viable.
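Since the file utility only guesses (it reported "JSON data" for the KOI8 attempt), a stricter check that a dump really is UTF-8 might be useful before publishing it on Gemini. One possible sketch, with a made-up function name, relies on iconv failing when it meets a byte sequence that is not valid in the source encoding:

```shell
# Sketch: succeed only if standard input is valid UTF-8.
# iconv exits with a non-zero status on an invalid input sequence;
# the converted output itself is discarded.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}
```

Appending "| is_utf8 && echo fine" to the lynx pipeline would then report whether the conversion result can be served as it is.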