---
title: "Forgotten HTML"
date: 2024-02-29
draft: false
---
Hi~! I gotta be honest. I have an unhealthy fascination with old internet
technologies (yes - many such cases). There is something about how browsers
went from being essentially document viewers to being fully-featured
application platforms. Understanding this may help us understand the place we're
in today, with all the mistakes we made along the way.
<!--more-->
However, today we won't be contemplating why current internet standards are a
mess. I'll save that for a (few) other posts. Today I'm reviewing the old HTML 2.0
standard and I'll pick a few elements that were forgotten as the years passed.
HTML 2.0 was the first **formal** HTML specification. It was released by the IETF
HTML Working Group in November 1995 and was meant to replace all the various
implementations used before.
[HTML 2.0 Spec](https://datatracker.ietf.org/doc/html/rfc1866)
`<DIR>` and `<MENU>` are displayed as unordered lists today. `<DIR>` is
deprecated now. It was used to list the contents of a directory. I don't really
understand why it went away. Directory listings were and always will be an
important part of the internet.
`<MENU>` is still in the standard. While you can use it, it brings no real
benefit over using `<ul>`. It's a shame. We had semantics in HTML and we just
chose to ignore them.
The `<TT>` tag today will give you just a monospaced font. It is neither `<pre>`
nor `<code>`, as all the formatting will be redone the usual way: collapsing
double spaces, newlines, etc.
Today this feature is removed from the standard. W3Schools recommends you use:
<p style="font-family:'Lucida Console', monospace">This text is monospace text.</p>
What an awful solution... Especially since `tt` did not open a new paragraph.
The CLEAR attribute lets you create a line break longer than a normal line
break. CLEAR can take 3 values: left, right, both. If you add clear=left,
the agent will create a line break long enough to flow the next line to the left
side of the document. This can be needed, for example, to get past an image.
In a separate document there is an entire section dedicated to shaming
browsers for skipping unrecognized markup. The spec authors didn't like how,
when browsers encountered tags they did not know, they would just skip
them. The skipped parts could contain important information needed to
understand the whole document.
https://www.w3.org/MarkUp/WD-doctypes
This led to the creation of HTML strict mode. This mode enables more of HTML's
validation features. If a document uses an alternative SGML DTD (doctype), agents
should inform the user about missing features rather than ignoring them without
the user's knowledge.
Another feature that is reflected as different DTDs is HTML levels. The authors of
the specification (for some reason) did not expect all agents to support all
the new features of HTML 2.0. Forms in particular.
So HTML 2 is not one Document Type Definition. It is 4:
- Level 1
- Level 1 Strict
- Level 2
- Level 2 Strict
`<KBD>` is actually quite a feature. Today many programs present keyboard
shortcuts as these cute little buttons, so you know right away that this is
meant to show a button. HTML has that too!
![Button-like shortcut hint appearance](/images/shortcut-buttons.jpg)
Before you get all excited, I regret to inform you that all browsers implement
this feature by just showing the text in a monospaced font. But yes, you can do
this:
<p>Press <kbd>Ctrl</kbd> + <kbd>C</kbd> to copy text (Windows).</p>
By writing a bit of CSS you can get closer to a button-like appearance.
URN was meant to disclose the permanent address of the anchored
resource. It can be different from where HREF points.
\<A> has many more attributes. Some of them are used, like REL (which describes
the relation of the anchored resource to the current one). Some of them were only
advisory.
You can denote a title like this and even hint at the methods allowed:
<A HREF="..." TITLE="CROW manual" METHODS=GET>How to design streets</a>
This is a really weird mechanism that still works today. What I want to
demonstrate here is different from the image maps you probably know of.
Defining a map via the \<MAP> tag and areas is something that most of you know.
It was added in HTML 3.2.
HTML 2.0 already had a funny mechanism that would let you redirect the user to
different places based on the part of the image they clicked on. On Wikipedia it is
described as server-side image maps.
The trick is to enclose \<IMG> in an \<A> tag and add the ISMAP attribute. Like
this:
<A HREF="/cgi/campus-redir"> <IMG ISMAP SRC="/img/campus.jpg" ALT="map of campus"> </A>
Now when the user clicks on pixel 666,999 of the image they will be redirected to
`/cgi/campus-redir?666,999` instead of just `/cgi/campus-redir`. This has the
advantage of hiding all the complex polygonal areas from the markup file and
making it more readable. On the other hand, with the newer map approach we can
tell the user agent where given areas are and name them. That can be more
machine-parsable (e.g. for screen readers).
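To make the server side of this a bit more concrete, here is a minimal Python sketch of what a handler behind `/cgi/campus-redir` might do with the `666,999` query string. The region table, the target paths and the function names are all made up for illustration; a real handler would also need polygon support and error handling.

```python
from typing import Dict, Tuple

# Hypothetical click regions: target path -> (left, top, right, bottom) in pixels.
REGIONS: Dict[str, Tuple[int, int, int, int]] = {
    "/library.html": (0, 0, 400, 300),
    "/cafeteria.html": (600, 900, 800, 1100),
}


def parse_click(query: str) -> Tuple[int, int]:
    """Turn an ISMAP query string such as "666,999" into (x, y)."""
    x_text, y_text = query.split(",", 1)
    return int(x_text), int(y_text)


def resolve(query: str, default: str = "/campus.html") -> str:
    """Return the redirect target for a click, or the default page."""
    x, y = parse_click(query)
    for target, (left, top, right, bottom) in REGIONS.items():
        if left <= x <= right and top <= y <= bottom:
            return target
    return default
```

With this table, a click at `666,999` lands inside the second rectangle, so `resolve("666,999")` picks `/cafeteria.html`. None of the geometry ever shows up in the markup, which is exactly the trade-off described above.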
I really don't advise you to use either of those approaches. They are both
really bad if you want to ensure that your page works everywhere. This part of
HTML has not aged well!
This old alternative to \<CODE> is a little bit like \<PRE>. I actually don't
know its purpose, as all 3 tags were present at the time of HTML 2.0.
It may be the case that all 3 of them were widely used beforehand, and because
HTML 2.0 was meant to bring together all the standards that came before, it was
decided to keep it for compatibility reasons.
There is also \<LISTING>, which I've seen used somewhere. Now that I need to find
docs for it, they're all gone. And we also have \<PLAINTEXT>. So there are 5 tags
in total that serve a similar purpose. I'm lost.
From w3schools:
compact - Specifies that the list should render smaller than normal
Available for all list elements. Not supported anyway.
Oh, \<ADDRESS> is just another tag that confirms that semantic HTML was a thing
before all those new \<main> \<content> \<nav> additions. You know what? \<div>
wasn't even a thing at this time. It was introduced in HTML 3.
So yeah. If you were to write an address, you'd write it in this tag. Still
widely supported and not deprecated at all. It gives you nothing in return.
The browser will apply italics to it. I would consider using it best practice.
There are a few weird things about forms. I've already described that handling
them is an optional feature.
The other feature is that we can leverage a different encoding type than
`multipart/form-data` and `application/x-www-form-urlencoded`. We will have to
wait until HTML 5 comes out before we're given any other encoding type.
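If you have never looked at what `application/x-www-form-urlencoded` actually produces, here is a small Python sketch using the standard library's `urllib.parse.urlencode`. The helper name `encode_form` and the field names are made up for illustration and are not taken from the spec.

```python
from typing import List, Tuple
from urllib.parse import urlencode


def encode_form(fields: List[Tuple[str, str]]) -> str:
    """Encode (name, value) pairs as application/x-www-form-urlencoded."""
    # Spaces become "+", reserved characters are percent-escaped,
    # and pairs are joined with "&".
    return urlencode(fields)


body = encode_form([("name", "Jan Kowalski"), ("note", "a&b=c")])
print(body)
```

Spaces turn into `+`, and characters that would clash with the `&` and `=` separators are percent-escaped, so the whole form fits into a single line of text.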
What somewhat strikes me is a detail in an example given by the authors.
All fields default to empty strings but one: gender defaults to male. :woozy:
Yes - there were no \<DIV>s in HTML 2, but \<META> was already present. It is
defined as an optional tag to provide extra information to the user agent.
I'm not too keen on such solutions. After all these years many of those keys are
now standardized. We must now use key-value pairs that look like some kind of
non-standard addition but in fact are quite standard. If we did not have it,
we could not abuse it like that. Wouldn't it be great if instead of:
<meta name=viewport content="width=device-width, initial-scale=1" />
we were doing this:
<viewport width=device-width initial-scale=1 />
As for other funny HEAD things: title is mandatory. The spec recommends keeping
it under 64 characters, but there is no upper limit.
Indexes can be used to link multiple pages. A page is an index if it contains
the \<ISINDEX> tag in \<HEAD>. It is functionally connected to \<BASE>. It tells
user agents to prompt users for a single text input. This input will be
query-encoded and appended to the URL contained in \<BASE>. This feature does
not require HTML level 2, so this is a very primitive form input that should
work with all browsers.
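Here is a rough Python sketch of what a user agent might do with the text typed into an \<ISINDEX> prompt. It assumes the input is split on whitespace into keywords, each keyword is percent-escaped, and the keywords are joined with `+` after a `?`, which is how I read the spec. The function name and the example URL are hypothetical.

```python
from urllib.parse import quote


def isindex_query_url(base_url: str, user_input: str) -> str:
    """Build the query URL for an ISINDEX search (a sketch, not a full client)."""
    # Split the typed text into keywords on whitespace.
    keywords = user_input.split()
    # Escape each keyword on its own so a literal "+" or "&" survives.
    escaped = [quote(word, safe="") for word in keywords]
    return base_url + "?" + "+".join(escaped)


print(isindex_query_url("http://example.com/search", "forgotten html"))
```

So typing `forgotten html` against that base would request `http://example.com/search?forgotten+html`, and the server gets to decide what to do with the keywords.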
This tag was already deprecated when HTML 2.0 came out. From specification:
The <NEXTID> element is included for historical reasons only. HTML
documents should not contain <NEXTID> elements.
The <NEXTID> element gives a hint for the name to use for a new <A>
element when editing an HTML document. It should be distinct from all
NAME attribute values on <A> elements. For example:
<NEXTID N=Z27>
The DTD for HTML 2.0 does not require the \<HTML> tag. There are examples that
contain it. On the other hand, you may also find examples like this:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <HEAD> <TITLE>Introduction to HTML</TITLE> </HEAD> <BODY> Body </BODY>
Accessibility is something that you may hear about from time to time during
the development of modern web-based applications. Everyone handles it with kid
gloves and everyone will tell you that it is really important. Yet many
websites lack basic accessibility features.
During the design of HTML some care was taken so that you can transform it
to ICADD. What is ICADD?
ICADD is an application of SGML - very much like HTML. It defines some basic
markup tags that need to be implemented to work with Braille readers and
computer voice. It worked before screen readers were smart enough to read you a
complex web page.
[ICADD ISO](https://xml.coverpages.org/ICADDiso.html)
HTML could be transformed to ICADD. There were measures in place to make it
easy. Of course, because ICADD defines different tags (and generally fewer than
HTML), such a transformation would be lossy.
What I see as a loss compared to today's HTML 5 (which isn't even an SGML
application anymore) is that we cannot do this anymore! Screen readers instead
try to use a variety of techniques (some of them are part of the spec) to gain
context on given content.