💾 Archived View for ageinghacker.net captured on 2023-04-19 at 22:09:59. Gemini links have been rewritten to link to archived content


Luca Saiu

Welcome to Luca Saiu's experimental Gemini capsule hosted on ageinghacker.net.

The capsule was opened on 2022-01-16.

I plan to actually publish data and use this.

About

Written by Luca Saiu.

Luca Saiu's web site

Luca Saiu's Gopher hole (still without much content)

An HTML-to-Gemini conversion test:

This is the result of machine-converting an HTML page to Gemini markup.

The source HTML is originally machine-generated, but not too complex:

The web page I am using as a test...

The conversion is unfortunately quite disappointing:

...the page above machine-converted to Gemini

My script handles Gopher as well, and the result is similar. See:

...the page above machine-converted to Gopher

Tests

Here is a Unicode test using everybody's favourite code point: 💩.

This is a picture

This is a text file

Go to another node

Let me see how a lot of text gets rendered. Here I am interested in a very long paragraph, which in the source file is just a single long line, containing no \n or \r character. How will this be rendered? Of course it depends on the client: now I am testing with lagrange which, I have to say, looks quite stunning.

Here comes a preformatted mode test:

(defun fact (n)
  (if (zerop n)
      1
    (* n (fact (1- n)))))

And here comes a list:

This is quoted text:

foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux foo bar quux.

Playing dirty

I do not plan to actually use these features; here I am playing to see what clients support.

(This section used to contain some ANSI terminal escape sequences. The ones setting colour appear to work well across the clients I tested.)

Generating Gemini and Gopher from HTML?

(2022-01-23 update: see Message-ID: <ssjhge$k74$1@dont-email.me>, which also includes my ugly conversion script:

From: Luca Saiu
Subject: Simple conversions from HTML to simple markups are disappointing
Newsgroups: comp.infosystems.gemini, comp.infosystems.gopher
Date: Sun, 23 Jan 2022 13:25:29 +0100(CET)

news://ssjhge$k74$1@dont-email.me

)

I should be able to turn my HTML, which I kept simple (I at least had the good taste of never using JavaScript), into something usable by both Gemini and Gopher, starting from something like this:

lynx -nomargins -dont_wrap_pre -force_html -image_links -dump http://ageinghacker.net

After that I can massage the output to turn the reference section into a Gopher-style menu. Notice that -nomargins keeps Lynx from breaking paragraphs across lines, which is useful for Gemini but not for Gopher; for Gopher I can re-wrap the text with a run of fmt.
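A minimal sketch of that massaging step, in Python rather than whatever the actual script uses. It assumes the dump ends with a line reading "References" followed by entries of the form "N. URL", which is what lynx -dump usually prints. The function names, the placeholder host gopher.example and the use of the common "URL:" selector convention for type "h" items are all assumptions made for illustration.

```python
import re

# One entry of the numbered list that "lynx -dump" appends to its output,
# e.g. "   1. http://ageinghacker.net/".  Leading blanks are optional.
REFERENCE_ENTRY = re.compile(r"^\s*(\d+)\.\s+(\S+)\s*$")


def split_dump(dump):
    """Split a lynx dump into (body_lines, {number: url}).

    Assumes the reference list starts at the last line consisting only of
    the word "References"; a dump without such a line is treated as all body.
    """
    lines = dump.splitlines()
    starts = [i for i, line in enumerate(lines) if line.strip() == "References"]
    if not starts:
        return lines, {}
    start = starts[-1]
    links = {}
    for line in lines[start + 1:]:
        match = REFERENCE_ENTRY.match(line)
        if match:
            links[int(match.group(1))] = match.group(2)
    return lines[:start], links


def gemini_link_lines(links):
    """Render the references as Gemini link lines ("=> URL label")."""
    return [f"=> {url} [{number}]" for number, url in sorted(links.items())]


def gopher_menu_lines(links, host="gopher.example", port=70):
    """Render the references as Gopher menu lines of type "h".

    The host and port are placeholders for the Gopher server's own address;
    the "URL:" selector prefix is a convention, not part of the original text.
    """
    return [
        f"h[{number}] {url}\tURL:{url}\t{host}\t{port}"
        for number, url in sorted(links.items())
    ]
```

The labels are just the reference numbers, so the "[1]" markers left in the body still point at the right link; recovering the original link text would need a further pass over the body.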

However -nomargins makes it difficult to distinguish headings (h1, h2 and so on) from ordinary text.

Lynx and my script

Interesting options:

Anticipated problems:

Character encoding, Lynx and Gopher

Notice that if the locale is C, Lynx seems to convert its output -- at least for my pages, which contain some French but are all UTF-8 -- to ISO-8859. This is easy to undo by setting LC_ALL to the empty string. Tested on the server as well.

Compare:

[luca@abelson ~]$ LC_ALL=C lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/scratch | head --lines=15 | tail --lines=6 | file -
/dev/stdin: ISO-8859 text

with

[luca@abelson ~]$ LC_ALL= lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/scratch | head --lines=15 | tail --lines=6 | file -
/dev/stdin: UTF-8 Unicode text

On the other hand, maybe that is a useful feature. In Gemini UTF-8 should be well-supported, but Gopher clients might work better with an old-style character encoding. Is there a way to declare the encoding? If I actually start writing in Russian I could use a traditional Russian encoding.
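The "file -" comparison above can also be made from inside a conversion script. This is only a rough sketch in Python: the function names are invented here, and the check can only tell ASCII, valid UTF-8 and "some other 8-bit encoding" apart; it cannot prove that 8-bit data really is ISO-8859.

```python
def classify(data):
    """Roughly classify a byte string the way "file -" did above."""
    try:
        data.decode("ascii")
        return "ASCII"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: some 8-bit encoding, possibly ISO-8859.
        return "8-bit"


def latin1_to_utf8(data):
    """Re-encode bytes as UTF-8, assuming they are ISO-8859-1."""
    return data.decode("iso-8859-1").encode("utf-8")
```

For instance, classify("français".encode("iso-8859-1")) returns "8-bit", while the same word encoded as UTF-8 is classified as "UTF-8".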

Let me play with Russian text.

the test page with Russian text I am using for this conversion attempt

[luca@abelson ~]$ LC_ALL=C lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8 
                            [1]Luka Sayu -- Russkij

Privet. Menya zovut Luka Sayu, ili �Luca Saiu�. Priyatno poznakomit'sya.

`Eta stranica test po-russki.

�Test� slovo dejstvitel'noe ? Mo'zhet byt' ya dolzhen pisat' �proba�. Izvinite, ya eschio novichok po-russki.
  ___________________________________________________________________________
  ___________________________________________________________________________

[2][haker simbol]
[3]Luka Sayu (glavnaya stranica, po-anglijski)
Poslednyaya modifikaciya: 2022-01-17
  ___________________________________________________________________________

[luca@abelson ~]$ LC_ALL=C lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8 | file -
/dev/stdin: ISO-8859 text

It transliterates! Which is awesome in a sense; but I guess iconv does the actual job. A pity that the guillemets “«” and “»” are not translated correctly, and in fact (according to what lagrange displays for Gemini, not in the source) those characters even garble the quoted text and more. I could make a separate pass to convert them if I used this transliteration to ISO-8859.
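Such a separate pass could look like the following Python sketch. It assumes that the guillemets arrive as the single bytes 0xAB and 0xBB, their positions in ISO-8859-1; the transcript above does not show the raw bytes, so this is a guess. The function name is made up for this example.

```python
# Byte-for-byte translation table: ISO-8859-1 guillemets (0xAB, 0xBB)
# become plain ASCII double quotes; every other byte is left alone.
GUILLEMETS_TO_QUOTES = bytes.maketrans(b"\xab\xbb", b'""')


def ascii_quotes(data):
    """Replace ISO-8859-1 guillemet bytes with ASCII double quotes."""
    return data.translate(GUILLEMETS_TO_QUOTES)
```

Since the rest of the transliterated text is already ASCII, the result of this pass should be pure ASCII, which every client can display.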

I cannot seem to get lynx to output any variant of KOI8 on abelson; I probably did not configure those locale data there. This probably does the right thing on my laptop, even if the file utility does not recognise the text:

[luca@moore ~/projects-by-others/lagrange/lagrange-git/_build]$ LC_ALL=ru_RU.KOI8-R lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8  | file -
/dev/stdin: JSON data
[luca@moore ~/projects-by-others/lagrange/lagrange-git/_build]$ LC_ALL=ru_RU.KOI8-R lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8 > possibly-koi-8

The possibly-koi-8 file
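Since file does not recognise the text, one way to check the result would be to decode it as KOI8-R and see whether Cyrillic letters come out. The Python sketch below is a heuristic only, not a real detector: the function name and the 0.5 threshold are arbitrary choices made for this example.

```python
def looks_like_koi8r(data, threshold=0.5):
    """Guess whether a byte string is KOI8-R encoded Russian text.

    Heuristic: the data must not be valid UTF-8 (which also rules out pure
    ASCII), and after decoding as KOI8-R at least `threshold` of the letters
    must fall in the basic Cyrillic block.
    """
    try:
        data.decode("utf-8")
        return False  # valid UTF-8 or plain ASCII, so not evidence of KOI8-R
    except UnicodeDecodeError:
        pass
    text = data.decode("koi8_r", errors="replace")
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return False
    cyrillic = sum(1 for c in letters if "\u0400" <= c <= "\u04ff")
    return cyrillic / len(letters) >= threshold
```

It could be applied to the saved file with something like looks_like_koi8r(open("possibly-koi-8", "rb").read()).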

Of course there is no problem if I do not use a locale setting to artificially constrain what the conversion can do, and let the system use UTF-8:

[luca@abelson ~]$ LC_ALL= lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8 
                     [1]Лука Саю — Русский

Привет. Меня зовут Лука Саю, или «Luca Saiu». Приятно познакомиться.

Эта страница тест по-русски.

«Тест» слово действительное ? Мо́жет быть я должен писать «проба». Извините, я ещё новичок по-русски.
  ___________________________________________________________________________
  ___________________________________________________________________________

[2][хакер симбол]
[3]Лука Саю (главная страница, по-английски)
Последняя модификация: 2022-01-17
  ___________________________________________________________________________

[luca@abelson ~]$ LC_ALL= lynx -image_links -nomargins -dont_wrap_pre -force_html -dump http://ageinghacker.net/test-russian/ | head --lines=-8 | file -
/dev/stdin: UTF-8 Unicode text

So at least for Gemini this solution is viable.