---

title: "Forgotten HTML"

date: 2024-02-29

draft: false

---

Hi~! I gotta be honest. I have an unhealthy fascination with old internet

technologies (yes - many such cases). There is something about how browsers

went from being essentially document viewers to being fully-featured

application platforms. Understanding this may help us understand the place we're

in today, with all the mistakes we made along the way.

<!--more-->

However, today we won't be contemplating why current internet standards are a

mess. I'll save that for (a few) other posts. Today I'm reviewing the old HTML 2.0

standard and I'll pick a few elements that were forgotten as the years passed.

Background

HTML 2.0 was the first **formal** HTML specification. It was released by the IETF

HTML Working Group in November 1995 and was meant to replace the various

implementations used before.

[HTML 2.0 Spec](https://datatracker.ietf.org/doc/html/rfc1866)

\<MENU> and \<DIR> tags

Today these are displayed as unordered lists. `<DIR>` is deprecated now. It

was used to list the contents of a directory. I don't really understand why it went

away. Directory listings were and always will be an important part of the

internet.

`<MENU>` is still in the standard. While you can use it, it will bring no real

benefits over using `<ul>`. It's a shame. We had semantics in HTML and we just

chose to ignore them.

Typewriter text

Today the `<tt>` tag will just give you a monospaced font. It is neither `<pre>` nor

`<code>`, as all the whitespace will be collapsed the usual way: double

spaces, newlines etc. are removed.

Today this feature has been removed from the standard. W3Schools recommends you use:

<p style="font-family:'Lucida Console', monospace">This text is monospace text.</p>

What an awful solution... Especially since `tt` did not open a new paragraph.

'CLEAR' attribute to \<BR>

This feature lets you create a line break longer than what a normal line break

looks like. Clear can take 3 values: left, right, both. If you add clear=left

the agent will create a line break long enough for the next line to start at the left side

of the document. This can be needed, for example, to get past an image.

Unrecognized markup, strict mode, HTML levels

In a separate document there is an entire section dedicated to shaming

browsers for skipping unrecognized markup. The spec authors didn't like how,

when browsers encountered tags that were not known, the tags would just be

skipped. The skipped parts could contain important information needed to

understand whole document.

https://www.w3.org/MarkUp/WD-doctypes

This led to the creation of the HTML strict mode. This mode enables more validation

features of HTML. If a document uses an alternative SGML DTD (doctype), agents should

inform the user about missing features rather than ignoring them without the user's

knowledge.
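
The difference between silently skipping unknown markup and reporting it can be sketched in a few lines of Python. This is only an illustration, not how any browser of that era worked: `KNOWN_TAGS` is a small, deliberately incomplete set picked for this example, and `find_unknown_tags` is a made-up helper name.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately incomplete subset of HTML 2.0 element names.
KNOWN_TAGS = {"html", "head", "title", "body", "p", "a", "img", "br",
              "ul", "li", "menu", "dir", "tt", "kbd", "address"}


class UnknownTagReporter(HTMLParser):
    """Collects tags missing from KNOWN_TAGS instead of silently skipping them."""

    def __init__(self):
        super().__init__()
        self.unknown = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        if tag not in KNOWN_TAGS:
            self.unknown.append(tag)


def find_unknown_tags(markup):
    """Return the unrecognized start tags found in the markup, in order."""
    parser = UnknownTagReporter()
    parser.feed(markup)
    parser.close()
    return parser.unknown


print(find_unknown_tags("<P>Hello <BLINK>world</BLINK> <TT>!</TT>"))
```

An agent that wanted to warn the user could show this list instead of quietly dropping the tags.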

Another feature that is reflected in different DTDs is HTML levels. The authors of

the specification (for some reason) did not expect all agents to support all

the new features of HTML 2.0. Forms in particular.

So HTML 2 is not one Document Type Definition. It is 4:

- Level 1

- Level 1 Strict

- Level 2

- Level 2 Strict

Keyboard tag

This is actually quite a feature. Today many programs present keyboard

shortcuts as these cute little buttons, so you know right away that this is meant to

show a key. HTML has that too!

![Button-like shortcut hint appearance](/images/shortcut-buttons.jpg)

Before you get all excited, I regret to inform you that all browsers implement

this feature by just showing it in a monospaced font. But yes, you can do

this:

<p>Press <kbd>Ctrl</kbd> + <kbd>C</kbd> to copy text (Windows).</p>

By writing a bit of CSS you can get closer to a button-like appearance.

URN attribute to \<A>

URN was meant to disclose the permanent address of the anchored

resource. It can be different from where HREF points.

\<A> has many more attributes. Some of them are used, like REL (that describes

the relation of the anchored resource to the current one). Some of them were only

advisory.

You can give a title like this and even hint at the methods allowed:

<A HREF="..." TITLE="CROW manual" METHODS=GET>How to design streets</a>

Image maps

This is a really weird mechanism that still works today. What I want to

demonstrate here is different from the image maps you probably know of.

Defining a map via the \<MAP> tag and areas is something that most of you know.

It was added in HTML 3.2.

HTML 2.0 already had a funny mechanism that would let you redirect the user to

different places based on the part of the image they clicked on. On Wikipedia it is

described as server-side image maps.

The trick is to enclose \<IMG> in an \<A> tag and add the ISMAP attribute. Like

this:

<A HREF="/cgi/campus-redir">
    <IMG ISMAP SRC="/img/campus.jpg" ALT="map of campus">
</A>

Now when the user clicks on pixel 666,999 of the image they will be redirected to

`/cgi/campus-redir?666,999` instead of just `/cgi/campus-redir`. This has the

advantage of hiding all the complex polygonal areas from the markup file and

making it more readable. On the other hand, with the newer map approach we can

inform the user agent where given areas are and name them. That can be more

machine parsable (e.g. for screen readers).
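
To make the server side of this less mysterious, here is a minimal Python sketch of what a script behind `/cgi/campus-redir` might do with the `?666,999` part. Everything beyond that URL shape is an assumption made for illustration: the rectangular `REGIONS`, the target paths and the function names are all hypothetical, and a real script could just as well use polygons.

```python
from urllib.parse import urlsplit

# Hypothetical rectangular regions: (left, top, right, bottom) -> target URL.
REGIONS = [
    ((0, 0, 499, 499), "/campus/library"),
    ((500, 0, 999, 499), "/campus/cafeteria"),
    ((0, 500, 999, 1199), "/campus/dormitory"),
]
DEFAULT_TARGET = "/campus/"


def parse_click(url):
    """Extract the (x, y) pair from a URL such as /cgi/campus-redir?666,999."""
    query = urlsplit(url).query
    x_text, y_text = query.split(",")
    return int(x_text), int(y_text)


def resolve_target(url):
    """Return the redirect target for the clicked pixel, or a default."""
    try:
        x, y = parse_click(url)
    except ValueError:
        # Missing or malformed coordinates: fall back to the default page.
        return DEFAULT_TARGET
    for (left, top, right, bottom), target in REGIONS:
        if left <= x <= right and top <= y <= bottom:
            return target
    return DEFAULT_TARGET


print(resolve_target("/cgi/campus-redir?666,999"))
```

The point of the sketch is that all the geometry lives in the script, so the markup itself never sees it.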

I really don't advise you to use either of those approaches. They are both

really bad if you want to ensure that your page works everywhere. This part of

HTML has not aged well!

\<XMP>

This old alternative to \<CODE> is a little bit like \<PRE>. I actually don't

know its purpose, as all 3 tags were present at the time of HTML 2.0.

It may be the case that all 3 of them were widely used beforehand and because

HTML 2.0 was meant to bring together all the standards that came before, it was

decided to keep it for compatibility reasons.

There is also \<LISTING> that I've seen used somewhere. Now that I need to find

docs for it, they're all gone. And we also have \<PLAINTEXT>. So there are 5 tags

in total that serve a similar purpose. I'm lost.

COMPACT attribute

From w3schools:

compact - Specifies that the list should render smaller than normal

Available for all list elements. Not supported anyway.

\<ADDRESS> tag

Oh it's just another tag that confirms that semantic HTML was a thing before

all those new \<main> \<content> \<nav> additions. You know what? \<div>

wasn't even a thing at this time. It was introduced in HTML 3.

So yeah. If you were to write an address, you write it in this tag. Still

widely supported and not deprecated at all. It gives you nothing in return.

Browsers will apply italics to it. I would consider using it a best practice.

Form quirks

There are a few weird things about forms. I've already described that handling

them is an optional feature.

The other quirk is that we can use an encoding type other than

`multipart/form-data` and `application/x-www-form-urlencoded`. We would have to wait

until HTML 5 came out before we were given any other encoding type.

What somewhat strikes me is an example given by the authors.

All fields default to empty strings but one: gender defaults to male. :woozy:

HEAD

Yes - there were no \<DIV>s in HTML 2 but \<META> was already present. It is

defined as an optional tag to provide extra information to the user agent.

I'm not too keen on such solutions. After all these years many of those keys are

now standardized. We must now use key-value pairs that look like some kind of

non-standard addition but in fact are quite standard. If we did not have it

we could not abuse it like that. Wouldn't it be great if instead of:

<meta name=viewport content="width=device-width, initial-scale=1" />

we were doing this:

<viewport width=device-width initial-scale=1 />

As for other funny HEAD things: title is mandatory. It is recommended that it is

under 64 bytes (yes bytes not chars; encoding may be different) but there is no

upper limit.

Indexes

Indexes can be used to link multiple pages. A page is an index if it contains the

\<ISINDEX> tag in \<HEAD>. It is functionally connected to \<BASE>. It tells

user agents to prompt users for a single text input. This input will be

appended as a query to the URL contained in \<BASE>. This feature does not require

HTML level 2 so this is a very primitive form input that should work with all

browsers.
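
How the final URL gets built can be sketched in Python. This follows my reading of the spec (a `?` after the base URL, then the escaped keywords joined with `+`), so treat it as an illustration rather than a reference implementation; `isindex_url` is a made-up name and `example.org` is a placeholder.

```python
from urllib.parse import quote


def isindex_url(base_url, user_input):
    """Build the query URL for an index page: base + '?' + keywords joined by '+'."""
    keywords = user_input.split()
    # Escape each keyword on its own so that '+' only ever separates keywords.
    escaped = [quote(word, safe="") for word in keywords]
    return base_url + "?" + "+".join(escaped)


print(isindex_url("http://example.org/index", "street design"))
# http://example.org/index?street+design
```

The server then only has to split the query on `+` to get the keywords back.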

\<NEXTID>

This tag was already deprecated when HTML 2.0 came out. From the specification:

The <NEXTID> element is included for historical reasons only. HTML

documents should not contain <NEXTID> elements.

The <NEXTID> element gives a hint for the name to use for a new <A>

element when editing an HTML document. It should be distinct from all

NAME attribute values on <A> elements. For example:

<NEXTID N=Z27>

No \<HTML> tag

The DTD for HTML 2.0 does not require the \<HTML> tag. There are examples that

contain it. On the other hand you may also find examples like this:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HEAD>
    <TITLE>Introduction to HTML</TITLE>
</HEAD>
<BODY>
    Body
</BODY>

ICADD

Accessibility is something that you may hear about from time to time during

development of modern web-based applications. Everyone treats it like an egg

and everyone will tell you that it is really important. Yet many websites lack

basic accessibility features.

During the design of HTML some care was taken so that you could transform it

to ICADD. What is ICADD?

ICADD is an application of SGML - very much like HTML. It defines some basic

markup tags that need to be implemented to work with Braille readers and

computer voice. It worked before screen readers were smart enough to read you a

complex web page.

[ICADD ISO](https://xml.coverpages.org/ICADDiso.html)

HTML could be transformed to ICADD. There were measures in place to make it

easy. Of course, because ICADD defines different tags (and generally fewer than

HTML), such a transformation would be lossy.

What I see as a loss in today's HTML 5 (which isn't even an SGML application

anymore) is that we cannot do this anymore! Screen readers instead try to use

a variety of techniques (some of them are part of the spec) to gain context on

given content.