The SGML Implementation Guide: A Blueprint for SGML Migration
Created: 2020-10-04T10:19:46+00:00
- Structured documents are important as they allow computers to work with the data.
- SGML was created as a means to bridge markup between working groups.
- Professional typesetting systems once cost $50k-$100k USD.
- People made GUI tools to typeset SGML documents and edit DTDs.
- Railroad diagram: a way to visualize parsers.
- Format tagging: when text has style information defined explicitly in the tag.
- Structure tagging: when structural information is in the name of the tag, not implied by nesting. ex <h1>, <h2>, instead of <header><header></header></header>.
- Content tagging: when tags describe what kind of content is happening.
- A DTD is very important for defining how a particular document is to be read.
- When making a document system, interview users and make their feedback part of the design process. Don't design useless shit nobody uses.
- Parsers have "capabilities" and "features" to identify things like the maximum size of a tag or whether including other documents is allowed.
I gave up around page 252. It goes into depth on how tags work and are defined, but it's the kind of thing that probably deserves to be in a hypertext reference and not a Zettelkasten card.
DTDs
- ?: optional, zero or one
- +: required, one or more
- *: optional, zero or more
- ,: is followed by
- (): grouping
- |: alternatives
- &: must appear but order is unimportant
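Most of these indicators mean the same thing in regular expressions, so a content model can be checked against a list of child element names by translating it to a regex. This is only a rough sketch: the model string and the function names `model_to_regex` and `children_match` are made up for illustration, and the `&` connector and `#PCDATA` are deliberately left out.

```python
import re

# A name, or one of the connectors / occurrence indicators from the list above.
TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9.-]*|[(),|?+*]")

def model_to_regex(model):
    """Translate a simplified content model into a regex over 'name ' tokens."""
    if "&" in model or "#" in model:
        raise ValueError("the & connector and #PCDATA are not handled in this sketch")
    parts = []
    for token in TOKEN.findall(model):
        if token == ",":
            continue  # "is followed by" is plain concatenation in a regex
        if token == "(":
            parts.append("(?:")
        elif token in (")", "|", "?", "+", "*"):
            parts.append(token)  # same meaning in regex syntax
        else:
            # Each element name is matched together with a trailing space,
            # so "para" can never match the start of "paragraph".
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))

def children_match(model, children):
    """Return True if the sequence of child names satisfies the model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None

MODEL = "(title, para+, (note | warning)*)"  # hypothetical content model
print(children_match(MODEL, ["title", "para", "para", "note"]))  # True
print(children_match(MODEL, ["title"]))                          # False
```

The second call fails because para+ needs at least one para after the title.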
Entity
- Maps an entity name to some particular content.
- Apparently this can also import another file.
- Entities also exist so you can type characters that exist in the output typesetting system but not ASCII keyboards. This was also before Unicode.
<!ENTITY onehalf '1/2'>
<!ENTITY chap1 SYSTEM 'chap1.sgm'>
Now when you see &onehalf; it SHOULD be replaced with 1/2.
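The replacement step can be sketched as a lookup table plus a regex substitution. This is not how a real SGML parser works internally, just an illustration: the `ENTITIES` table and `expand_entities` are hypothetical names, SYSTEM entities (which would read a file) are ignored, and unknown names are left alone where a real parser would probably complain.

```python
import re

# Hypothetical entity table mirroring <!ENTITY onehalf '1/2'>.
ENTITIES = {"onehalf": "1/2"}

# An entity reference: ampersand, name, semicolon.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")

def expand_entities(text: str, entities: dict[str, str]) -> str:
    """Replace each &name; with its declared text; leave unknown names untouched."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return entities.get(name, match.group(0))
    return ENTITY_REF.sub(substitute, text)

print(expand_entities("Add &onehalf; cup of &unknown;.", ENTITIES))
# prints: Add 1/2 cup of &unknown;.
```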
Element
Defines elements and what is allowed to appear within them.
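<!ELEMENT something - O (#PCDATA)>
The two characters after the name are tag omission flags for the start and end tag: - means required, O (the letter, not zero) means it may be omitted.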
Attribute list
Defines what attributes are attached to tags and whether they are required or not.
<!ATTLIST tagname parameter PARAMETER-NAME #REQUIRED>
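What a processor does with these declarations can be sketched as a small check: reject undeclared attributes, fail on a missing #REQUIRED one, and fill in literal defaults. Everything here is an assumption for illustration, including the `ATTLISTS` table, the `chapter` element, its attributes, and the `apply_attlist` name. Declared value types (CDATA, ID, and so on) are not modelled.

```python
# Hypothetical declarations: element name -> {attribute name: default}.
# A default is "#REQUIRED", "#IMPLIED", or a literal default value.
ATTLISTS = {
    "chapter": {"id": "#REQUIRED", "status": "#IMPLIED", "lang": "en"},
}

def apply_attlist(element, attrs, attlists=ATTLISTS):
    """Check attrs against the element's declarations and fill in defaults."""
    declared = attlists.get(element, {})
    unknown = set(attrs) - set(declared)
    if unknown:
        raise ValueError(f"undeclared attributes on <{element}>: {sorted(unknown)}")
    result = dict(attrs)
    for name, default in declared.items():
        if name in result:
            continue
        if default == "#REQUIRED":
            raise ValueError(f"<{element}> is missing required attribute {name!r}")
        if default != "#IMPLIED":
            result[name] = default  # a literal default value is filled in
    return result

print(apply_attlist("chapter", {"id": "c1"}))
# prints: {'id': 'c1', 'lang': 'en'}
```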
Document type
Defines the layout of the document.
<!DOCTYPE something [...]>
SGML declaration
Holds "bootstrap" information.
<!SGML ISO 8879:1986...>
Body
- Start tags; ex <foo>
- End tags; ex </foo>
- Entity reference; ex &onehalf;
- Comment; ex <!-- foo -->
- Processing instruction; ex <?argh ohno>
- Character reference; ex &#45; (the character -)
- Marked sections; ex <![IGNORE[foo]]>
- Short reference use; ex <!USEMAP aaaah>
- Link set use; ex <!USELINK #ARGH fuck>
- Start tags may not be required if the grammar says they may be implied.
- End tags may not be required if the tag is marked as empty or the grammar says it is implied.
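The constructs in this list can be told apart by their opening delimiters alone. A minimal sketch, assuming the default delimiters used in the examples above (a custom SGML declaration can change them); `classify` and its labels are hypothetical names, and it does nothing about omitted tags:

```python
# Longer prefixes come first so that "<!--" is not mistaken for a plain "<".
PREFIXES = [
    ("<!--", "comment"),
    ("<![", "marked section"),
    ("<!USEMAP", "short reference use"),
    ("<!USELINK", "link set use"),
    ("<!", "other markup declaration"),
    ("<?", "processing instruction"),
    ("</", "end tag"),
    ("<", "start tag"),
    ("&#", "character reference"),
    ("&", "entity reference"),
]

def classify(construct: str) -> str:
    """Name the kind of markup a string starts with, or 'data' if none."""
    for prefix, kind in PREFIXES:
        if construct.startswith(prefix):
            return kind
    return "data"

for sample in ["<foo>", "</foo>", "<!-- foo -->", "<?argh ohno>", "<![IGNORE[foo]]>"]:
    print(sample, "->", classify(sample))
```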