Friday, 20. December 2024

A journey into the 8-Bit microcomputing past: Exploring the CP/M operating system - part 1

[This article has been bi-posted to Gemini and the Web]

Gary Kildall's story is a sad one. He was one of the pioneers of the microcomputing revolution and is unfortunately often overlooked today. While his various contributions were essential to eventually achieve what we have today, one thing stands out: CP/M (short for "Control Program / Monitor", later often interpreted as "Control Program for Microcomputers"). Kildall created version 1.0 of what became the dominant operating system of the 8-Bit era of microcomputers in 1974.

His company, Digital Research, was very successful for a while, proving that he had correctly assessed the microcomputer as a serious thing (while many disregarded them as toys). But his life ended under dubious circumstances after he had experienced betrayal (he initially couldn't believe that a man whom he called a friend would backstab him: Bill Gates!), infringement of his code (what later became MS-DOS started out as a copycat product that stole the API from CP/M), being tricked by big business (Kildall accepted IBM's proposal to offer customers both DOS and CP/M for their PC to choose from if he didn't take legal action - but then they priced CP/M six times higher...) and personal tragedy.

I do not often recommend videos, but in this case I will do so. If anything I wrote in the previous two paragraphs interests you, feel free to have a look at this well-done documentary on Gary Kildall, a great man who was not just a pioneer and a visionary but also a kind person (unlike certain other people who dominated the IT sector later and with much less justification):

Gary Kildall - The Man That Should Have Been Bill Gates - Part I

Gary Kildall - The Man That Should Have Been Bill Gates - Part II

Gary Kildall - The Man That Should Have Been Bill Gates - Part III

I'm a child of the '80s and thus was not around to witness the beginning of personal computing. I grew up with DOS and while I knew that it had a predecessor, I had never taken a closer look at it. CP/M is free to use today, though, and a few dedicated people have preserved some interesting material. I've dug into Unix history quite a bit because I like to gain a better understanding of how we actually ended up with what we use today as well as to form an appreciation for important achievements. But there have been other driving factors beyond Unix. So a couple of months ago I downloaded the available CP/M manuals that Gaby has collected and started reading them. Luckily there are also various CP/M emulators and related programs available (and even readily available in FreeBSD's ports tree!), so I could give it a try. I've been reading on and off a couple of times, taking some notes. This article is based on those.

The manuals that I used can be downloaded here:

Gaby Chaudry's CP/M pages (English version)

Introduction to CP/M Features and Facilities (CP/M 1.3 1976, 18 pages)

This is the oldest manual that I have. With under 20 pages it was a quick read but I gained a lot of insight on what CP/M is. The OS consists of four parts ("modules"):

At the time, operating systems were special-purpose programs made for one specific machine type. Kildall's innovation with CP/M was that he modularized the system so that only the BIOS part had to be modified to port it to another machine. This made a portable operating system feasible for the first time. By modifying the BIOS, hardware manufacturers could easily adapt the OS, for example to storage systems other than the default that Digital Research supported in the unaltered CP/M.

There's always one disk active, indicated by the CCP using a drive letter (the manual calls it the "currently logged disk"). After boot it displays the letter of the first drive and the greater-than character as the prompt symbol: A>. It's possible to change to the second disk by giving the B: command.

The CCP has only 5 programs built-in:

File references can be unambiguous, specifying an exact file name, or ambiguous using wildcards. An unambiguous file name consists of a "primary name" of 1 to 8 characters and a "secondary name" of three characters (which is optional and considered as three blanks if empty). File names must not use the following characters: . , ; : = ? * All other characters are allowed.
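
The naming rules above can be sketched as a small check. This is only an illustration in Python, not anything CP/M itself contains; the function name is made up, and it assumes that a secondary name shorter than three characters is acceptable (padded with blanks):

```python
# Characters the CP/M 1.3 introduction forbids in file names.
RESERVED = set(".,;:=?*")


def is_valid_ufn(name: str) -> bool:
    """Check an unambiguous file name of the form pppppppp.sss.

    Assumption: a secondary name of 0 to 3 characters is accepted
    and thought of as padded with blanks.
    """
    primary, _, secondary = name.partition(".")
    if not 1 <= len(primary) <= 8:
        return False
    if len(secondary) > 3:
        return False
    # The dot separating the two parts is already split off, so any
    # reserved character left in either part makes the name invalid.
    return not any(ch in RESERVED for ch in primary + secondary)
```

With this, 'FOO.BAR' and 'A' pass, while 'F?O.BAR' (wildcard), 'TOOLONGNAME.TXT' (primary name too long) and 'A.B.C' (second dot) are rejected.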

The dot is used to connect primary and secondary names: pppppppp.sss. The question mark and asterisk are used for pattern matching. A ? matches any single character, so F?O.BA? matches FOO.BAR, FOO.BAZ, FIO.BAK and so on. A * matches any character(s) for the part of file name: *.COM is the same as ????????.COM and *.* means ????????.???.
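
To make the pattern rules a bit more tangible, here is a minimal Python sketch of how such matching could work. It assumes names are padded with blanks to 8 + 3 characters and that a * simply fills the rest of its field with question marks, as the equivalences above suggest. The helper names are invented for this example:

```python
def pad_field(field: str, width: int) -> str:
    """Pad a name part to its fixed width; '*' fills the rest with '?'."""
    if "*" in field:
        field = field[: field.index("*")]
        return field[:width].ljust(width, "?")
    return field[:width].ljust(width)


def expand(name: str) -> str:
    """Turn 'pppppppp.sss' into an 11-character string without the dot."""
    primary, _, secondary = name.partition(".")
    return pad_field(primary, 8) + pad_field(secondary, 3)


def matches(pattern: str, filename: str) -> bool:
    """Compare position by position; '?' matches any single character."""
    pat = expand(pattern.upper())
    target = expand(filename.upper())
    return all(p == "?" or p == t for p, t in zip(pat, target))
```

Here expand('*.COM') yields the same string as expand('????????.COM'), and matches('F?O.BA?', 'FIO.BAK') is true while matches('F?O.BA?', 'FOO.COM') is not.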

There's very basic command line editing. Backspace (called "rubout") removes one character from the command line but it cannot blank it out or move the cursor back (many terminals at the time could not do this), so it echoes back the character it deletes. For example preparing the 'dir' command, deleting it and then typing it again would result in "dirriddir" printed at the prompt but actually run the dir command if return is pressed! CTRL-U marks the whole line as deleted and CTRL-Z means "end-of-input" (used by PIP and ED). Also CTRL-C is warm boot (system reset) which is required if you for example tried to access a non-existing drive (there's no other way to recover from the error). It's also required after changing diskettes. CTRL-E allows for a carriage return without executing the program (basically a line break without executing the command line, yet).
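
The rubout behavior is easy to model. The following Python sketch is my own illustration (not CP/M code) and only covers plain characters and rubout; it keeps track of what ends up on the terminal versus what the command line actually contains:

```python
RUBOUT = "\x7f"  # the DEL character, used here to stand for the rubout key


def type_keys(keys: str) -> tuple[str, str]:
    """Return (text echoed to the terminal, resulting command line)."""
    line: list[str] = []
    echo: list[str] = []
    for key in keys:
        if key == RUBOUT:
            if line:
                # The terminal cannot erase, so the removed character
                # is echoed once more instead.
                echo.append(line.pop())
        else:
            line.append(key)
            echo.append(key)
    return "".join(echo), "".join(line)


echoed, command = type_keys("dir" + RUBOUT * 3 + "dir")
print(echoed)   # dirriddir
print(command)  # dir
```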

CP/M 1.3 comes with just a few transient commands:

CP/M uses the form of target before source that was also used on DEC machines. For example to rename the file A.B to C.D the user would execute the command:

REN C.D=A.B

The same is true for combining files A, B and C to Z:

PIP Z=A,B,C

It's possible to copy files between disks, for example to copy A.BAK from drive B to the current one the command is:

PIP A.BAK=B:A.BAK

Printing out FILE.PRN is done with:

PIP LST:=FILE.PRN
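
Since the target-before-source order takes some getting used to, here is a tiny Python sketch that splits such an argument into its parts. It is a hypothetical helper for illustration only and ignores everything else PIP can do:

```python
def parse_pip(args: str) -> tuple[str, list[str]]:
    """Split 'destination=source[,source...]' into its components."""
    destination, sep, sources = args.partition("=")
    if not sep or not destination.strip() or not sources.strip():
        raise ValueError("expected destination=source[,source...]")
    return destination.strip(), [s.strip() for s in sources.split(",")]


print(parse_pip("Z=A,B,C"))        # ('Z', ['A', 'B', 'C'])
print(parse_pip("LST:=FILE.PRN"))  # ('LST:', ['FILE.PRN'])
```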

PIP supports copying to or from various other devices than a disk drive. These are referenced via special names; destination devices:

Source devices:

All things considered the OS is very much bare-bones, not just when compared to the monsters we got later. It can either be seen as shockingly primitive or as refreshingly simple. Of course you must not forget the technical context of the time. What could you do on a machine that might have only a few kilobytes (yes, that's right!) of RAM?

The very large (and at the time terribly expensive) 8-inch floppy drives that superseded punch cards were the storage medium of choice, and for the disk format CP/M stuck with SSSD (single-sided single density) IBM 3740 diskettes as the de-facto standard even when better technology became available later. These provided about 241k of storage space. The Z80 (which was one of the supported platforms) was an 8-bit processor with a 16-bit address bus, which means it could address up to 64k of RAM, but systems with as little as 16k were common. Digital Research actually fit a lot into this very limited and highly constrained environment.
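
For those who like to see where such numbers come from, here is a short back-of-the-envelope calculation in Python. Note that the disk geometry (77 tracks, 26 sectors of 128 bytes, 2 reserved system tracks, 1k allocation blocks, 64 directory entries of 32 bytes) is not taken from the manuals discussed here; these are the figures commonly given for the standard 8-inch CP/M layout and should be treated as assumptions:

```python
# Address space reachable with 16 address lines.
ADDRESS_LINES = 16
address_space = 2 ** ADDRESS_LINES          # 65536 bytes = 64k

# Assumed standard 8-inch SSSD layout (not from the article's manuals).
TRACKS = 77
SECTORS_PER_TRACK = 26
BYTES_PER_SECTOR = 128
SYSTEM_TRACKS = 2        # reserved for the operating system image
BLOCK_SIZE = 1024        # allocation unit
DIRECTORY_BYTES = 64 * 32

data_area = (TRACKS - SYSTEM_TRACKS) * SECTORS_PER_TRACK * BYTES_PER_SECTOR
blocks = data_area // BLOCK_SIZE            # whole blocks in the data area
directory_blocks = DIRECTORY_BYTES // BLOCK_SIZE
usable_kib = blocks - directory_blocks      # blocks are 1k each

print(address_space // 1024, "k addressable")
print(usable_kib, "k usable for files")
```

Under these assumptions the result is 64k of addressable memory and 241k of file space per diskette.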

Introduction to CP/M Features and Facilities (CP/M 1.4 1978, 44 pages)

In this introduction document the OS is described as: "CP/M provides a general environment for program construction, storage, and editing, along with assembly and program check-out facilities."

Standard CP/M now supports up to 4 disk drives: A, B, C and D. There are more reserved characters which must not be used in file names: < > . , ; : = ? * [ ]

The CCP is able to access files on drives other than the current one; if the active drive is A, the user can for example do 'DIR B:' and 'TYPE B:INFO.PRN' without having to change the active drive and then change back. It could do this before, but now it's documented in the introduction. The manual repeatedly mentions that file specifications can be prefixed with a drive address, which shows that by this time systems with multiple floppy drives had become more common.

Line editing has improved a bit. Command lines can be up to 255 characters long. The newly introduced CTRL-R, which re-prints the current line, can be useful if you used backspace and lost track of which characters you deleted and which ones you didn't. Very useful is CTRL-S, which freezes console output until another key is pressed, giving you a chance to read something that would otherwise just scroll by. Finally, CTRL-P sends all output to the list device (typically a printer), if one is selected, as well as to the console.

The ability of the STAT command to display device assignment is now documented: 'stat val:' and 'stat dev:'. Setting devices works like this: 'stat con:=crt:'. The special names have been re-organized and the manual mentions that while for example RDR can still stand for the paper tape reader device, it could also refer to a teletype reader or to cassette tape. New are:

Especially the PIP command is described in much more detail (about 7 pages where the previous manual had only 2.5). PIP can now take special parameters which are enclosed in square brackets. Among these are [B] (Block mode, buffered copy), [Dn] (Delete characters after column n, useful for narrow printers), [E] (Echo transfer operations to the console), [L] (Translate upper case letters to lower case), [N] (Add line numbers), and so on.
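
To give an idea of what three of these parameters do to the data passing through, here is a rough Python sketch. It only models [L], [N] and [Dn]; the function name, the order in which the options are applied and the exact format of the line numbers are my assumptions, not something taken from the manual:

```python
def apply_pip_params(lines, lower=False, number=False, delete_after=None):
    """Apply simplified versions of PIP's [L], [N] and [Dn] to text lines."""
    result = []
    for n, line in enumerate(lines, start=1):
        if delete_after is not None:
            line = line[:delete_after]   # [Dn]: drop characters after column n
        if lower:
            line = line.lower()          # [L]: upper case to lower case
        if number:
            line = f"{n}: {line}"        # [N]: prefix a line number (format assumed)
        result.append(line)
    return result


print(apply_pip_params(["HELLO WORLD", "SECOND LINE"], lower=True, number=True))
```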

I'm not sure what new features ED got since its section in the manual for the previous version was so short (mostly just pointing at the ED handbook which I don't have). The new handbook mentions that on a 16k system approximately 5,000 characters can be held in memory while in CP/M 1.3 it was 6,000. Since that's quite a difference and I was not sure if the OCR messed it up, I took a look at the program - and indeed the editor has grown from 5k (40 records) to 6k (48 records), which is quite a bit!
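
The record counts translate into sizes as follows, given that a CP/M record is 128 bytes (which is also what the 40 records = 5k figure implies). A quick check in Python:

```python
RECORD_SIZE = 128  # bytes per CP/M record


def records_to_bytes(records: int) -> int:
    """Size in bytes of a file occupying the given number of records."""
    return records * RECORD_SIZE


old_size = records_to_bytes(40)   # 5120 bytes = 5k
new_size = records_to_bytes(48)   # 6144 bytes = 6k
print(old_size, new_size, new_size - old_size)
```

So the editor grew by 1024 bytes, which fits the roughly 1,000 characters of buffer space lost on a 16k system.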

I learned that when editing a file FOO.TXT, a new one is created on disk (FOO.$$$), and when the file is saved the old one is renamed to FOO.BAK before FOO.$$$ is renamed to FOO.TXT. Which is nice. It also supports opening a source file and writing to a target file later which can be on another disk. ED even warns the user if a file of that name exists on the other drive so it will not be overwritten accidentally. For the actual editing the reader is still referred to the ED handbook which I don't have for this version, either.
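
The same write-to-a-temporary-file-then-rename pattern is still in common use today. Here is a minimal Python sketch of the sequence on a modern filesystem. It is an analogy, not ED's actual code; the function name is invented, the temporary file type is just a parameter, and removing an older backup first is my assumption:

```python
import tempfile
from pathlib import Path


def save_like_ed(path: Path, new_text: str, temp_type: str = ".$$$") -> None:
    """Write new contents to a temporary file, keep the old file as .BAK."""
    temp = path.with_suffix(temp_type)
    backup = path.with_suffix(".BAK")
    temp.write_text(new_text)      # 1. new version goes to the temporary file
    if backup.exists():
        backup.unlink()            # (assumption) drop an older backup first
    path.rename(backup)            # 2. original becomes the backup
    temp.rename(path)              # 3. temporary file takes the original name


def demo() -> tuple[str, str]:
    """Run the sequence in a scratch directory and return both contents."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "FOO.TXT"
        target.write_text("old contents")
        save_like_ed(target, "new contents")
        return target.read_text(), target.with_suffix(".BAK").read_text()


print(demo())
```

The nice property is that at no point is the only copy of the text in a half-written state.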

There's also a new transient program available: MOVCPM. It is used to write the operating system image into memory and adapt it for the amount of memory to expect. This can be done either to use together with SYSGEN or to use SAVE to write the image to disk where it is ready to be "patched" according to the "System Alteration Guide".

To me CP/M 1.4 seems to be a useful step up from 1.3. It still can't do much, but being an _operating system_ the whole point is to make the computer work and allow for running other programs. CP/M definitely does this and I'd in fact say it does it well. It even comes with an assembler so programming the system is possible with just CP/M and nothing else is required. My little exploration has been interesting enough to make me look forward to also exploring CP/M 2.x.

CP/M 2.0 User's guide for CP/M 1.4 owners (1979, 42 pages)

Version 2.0 of the CP/M operating system came with a wealth of new features and the "user's guide for 1.4 owners" conveniently points out the differences between the releases. Obviously a lot of work went into the filesystem portion of the OS. It is now much more flexible and the guide states "All of the fundamental file restrictions are removed, while maintaining upward compatibility from previous versions of release 1". The system can now handle up to 16 logical drives, each with a size of up to 8 MB (with "possibly up to 32 MB in the future"). Multiple logical drives can share the same physical drive and use different areas of it.

File specifications are now said to consist of a filename and an optional file type rather than a primary and a secondary name. Files can have _read-only_ and _system_ attributes. Very interesting is that CP/M 2.0 introduced the concept of distinct users which can have different files in their respective "user areas" - not even DOS, which would eventually supersede CP/M, ever had this!

The operating system also takes CRT devices into account some more. This enhances line editing a lot if not using a hard-copy device (line printer): With CTRL-H a real backspace is now possible and CTRL-X will delete the whole command line so the user can start over. A simple modification is possible to make the CTRL-H behavior the default for the DEL key as well.

USER is a new built-in command of the CCP which allows switching between the 16 possible _user areas_. By default the system uses user area 0 and this one is compatible with a CP/M 1.4 directory. Because of the support for multiple users, 'ERA *.*' is no longer guaranteed to erase all files of a disk - it only erases all of the files in the current user area.
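
One way to picture user areas is as a number attached to each directory entry, with commands only looking at entries that carry the current number. The Python sketch below is a simplified model built on that assumption (the class and function names are made up) and shows why 'ERA *.*' no longer empties the whole disk:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirEntry:
    user: int   # user area 0..15
    name: str


def era_all(directory: list[DirEntry], current_user: int) -> list[DirEntry]:
    """Model of 'ERA *.*': only entries of the current user area are removed."""
    return [entry for entry in directory if entry.user != current_user]


disk = [DirEntry(0, "PIP.COM"), DirEntry(0, "STAT.COM"), DirEntry(3, "NOTES.TXT")]
print(era_all(disk, current_user=3))
```

After erasing everything as user 3, the two files of user area 0 are still on the disk.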

DIR got a new multi-column output feature to make better use of the horizontal dimension of the output device. The STAT command has been extended significantly with parameters to deal with the new attributes. You can set the system attribute of a file (which means it won't be listed by DIR) like this:

STAT FILE.COM $SYS

The command can also provide information about the structure of a disk by using the DSK: special name. To view filesystem information about the disk in the second drive you can use 'STAT B:DSK:'. Information on users can be obtained with 'STAT USR:'.

PIP has three new parameters, two of which deal with attributes and one, [G], that deals with getting a file from a user area. Copying a file from user area 0 to user area 3 would mean switching to user 3 and then using PIP. The problem is that initially all user areas except for 0 are blank and do not have any files in them. This means they also don't have any transient commands available - including the copy utility PIP which could be used to copy over files from elsewhere!

This problem is solved by first loading the PIP command into memory and then using SAVE to write it out to a program file again. The debugger can be used to load the program into memory like this: 'DDT PIP.COM' and then quitting the debugger with 'G0'. The debugger will output something like this:

DDT VERS 2.2
NEXT PC
1E00 0100

The interesting thing here is the NEXT value, which is the next free address. Since the program is loaded at 0100, it occupies the pages up to and including 1D. The decimal value of that is 29, which means the program takes up 29 memory pages. After switching to user 3 and issuing the command 'SAVE 29 PIP.COM', the copy program is made available. It can then be used to copy over files like this:
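
The page count for SAVE can be derived from the two values DDT prints. Here is a small Python sketch of the arithmetic, assuming 256-byte pages and that the program starts at the PC value shown (0100); the function name is just for illustration:

```python
PAGE_SIZE = 256  # bytes per memory page


def save_pages(next_addr: int, load_addr: int = 0x0100) -> int:
    """Number of pages to pass to SAVE, given DDT's NEXT and PC values."""
    length = next_addr - load_addr
    # Round up so that a partially used last page is still saved.
    return (length + PAGE_SIZE - 1) // PAGE_SIZE


print(save_pages(0x1E00))  # 29
```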

PIP STAT.COM=STAT.COM[G0]

ED now also honors file attributes and relative line numbering became the default. More importantly, ED's insert mode got support for line editing when using CRT monitors.

This version also introduces the XSUB transient. When used in conjunction with SUBMIT, this tool allows for including line input for programs or the CCP instead of just being able to execute a fixed batch of commands.

Being primarily interested in what the general user experience of the OS was like, I decided to skip the remaining sections ("BDOS Interface Conventions", "CP/M 2.0 Memory Organization" and "BIOS Differences"). The new features are obviously great, but in retrospect reading about them only makes you realize even more painfully how simplistic the earlier versions that lacked them really were. With version 2.0, CP/M basically made the jump into an age where CRT screens were becoming the norm and hard-copy terminals the exception.

CP/M 2.2 Operating System Manual (3rd edition, 1983, 317 pages)

On to the first version that I have a comprehensive manual for! Version 2.2 of CP/M has eventually outgrown systems with only 16k of main memory and needs at least 20k of RAM to operate. The list of reserved characters which cannot be used for filenames has also grown further: < > . , ; : = ? * [ ] % | ( ) / \

There wasn't too much information that struck me as new, so this time I decided to finally take a look at the handbook chapter on ED. The program is introduced as a "context editor" and does not have too much in common with the visual editors that became available later. Basically ED is a program to copy contents of a source file into a buffer for displaying or alteration and possibly to a target file. While it has a couple of convenience features like automatically renaming the original file from name.typ to name.bak and then renaming the new file (name.$$$) to name.typ, it's a bit hard to wrap your head around these days.

Let's see if we can get something done in the simulator, shall we? Here's my first attempt:

A>stat
A: R/W, Space: 11k
B: R/W, Space: 168k

Alright, this simulated machine has two drives and while 11k on the system disk is still plenty of space (as in "I definitely don't plan to type that much" - but even considering the context making this statement in 2024 feels slightly weird!), I think I'll focus on the second one. What's on there?

A>dir b:
B: BOOT HEX : BYE ASM : CLS MAC : SURVEY MAC
B: R ASM : CLS COM : BOOT Z80 : W ASM
B: RESET ASM : BYE COM : SYSGEN SUB : BIOS HEX
B: CPM64 SYS : SPEED C : BIOS Z80 : SPEED COM
B: SURVEY COM : R COM : RESET COM : W COM

Nice, looks like a couple of programs and also some source code. "BYE", used to exit the simulator, might make a good example. Let's see:

A>type b:bye.asm
;
; Program to shutdown z80pack systems via hardware control port
;
; May 2018, Udo Munk
;
ORG 0100H
HWCTL EQU 0A0H ;hardware control port
IN HWCTL ;is the port locked?
ORA A
JZ UNLCK ;continue if not
MVI A,0AAH ;is locked, unlock with magic number
OUT HWCTL
UNLCK: MVI A,080H ;now stop system via hardware control
OUT HWCTL
DI ;if that didn't work halt system
HLT
RET ;if it gets out of halted state return to CP/M
END

Ok, a 24 line file. That's perfect. Let's make a copy before messing up completely, just to be sure:

A>pip b:example.txt=b:bye.asm
A>dir b:*.txt
B: EXAMPLE TXT

As we all know, with non-intuitive editors the first challenge is to exit them again without cheating (rebooting, killing the process, etc!). The 'E' command should do that:

A>ed b:example.txt
: *e
A>dir b:ex*.*
B: EXAMPLE BAK : EXAMPLE TXT

Aha, got it! As expected, upon quitting, ED copied the unchanged lines from the source file, renamed the original to a backup file and then renamed the temporary file. If we don't want to save the file, we must not "end" editing but _quit_ instead:

A>era b:example.bak
A>ed b:example.txt
: *q
Q-(Y/N)?y
A>dir b:*.bak
NO FILE

Ok, so far so good. After removing the backup file and quitting the editor, this time no new backup is produced. The program also prompts the user before leaving, most likely to avoid potentially losing unsaved changes. Let's see if we can get something more useful done:

A>ed b:example.txt
: *a
1: *t
1: ;

While this doesn't look terribly impressive, it's just what one would expect: It copies over ("appends") one line from the source file to the buffer (which is no longer empty and got a pointer to line 1!). The T command asks ED to type out the respective line - which just contains a semicolon. What happens if we repeat this?

1: *a
1: *t
1: ;
1: *2:t
2: ; Program to shutdown z80pack systems via hardware control port
2: *

Ok, it seems like the editor has copied over another line into the buffer but didn't increase the buffer pointer, so typing out still acts upon line 1. Specifically requesting it to type line 2 does just that (and sets the pointer to it). Good to know but certainly not amazing. Does this thing support ranges, too?

2: *1::2t
1: ;
2: ; Program to shutdown z80pack systems via hardware control port

You bet! So we're getting to the point where the editor is at least useful for viewing text. I've used two absolute line numbers here, but it's also possible to just use one which implies using the current line as the other one. If you're truly hardcore (or want to save some ink since this was designed with hard-copy terminals in mind!), you can turn off line numbering, reducing the editor prompt to just an asterisk!

Some commands like A can be prefixed with a number, so to copy 3 lines from the source file into the buffer, you don't have to issue it three times. Anyway, let's see if I can use ED to input text. There's the I command for this. And to make things a little more interesting, I'm going to try to add a line after the program description but before the copyright line:

: *3a
1: *3:
3: *i
3: Here's a new line 3!
4:
4: *20a

Here I copy 3 lines into the buffer, set the buffer pointer to the third line, input a new one, exit back to command mode (pressing CTRL-Z) and fetch the rest of the input file into the buffer. That should have done the trick:

4: *1::5t
1: ;
2: ; Program to shutdown z80pack systems via hardware control port
3: Here's a new line 3!
4: ;
5: ; May 2018, Udo Munk

And indeed that's the desired result! There's a lot more to ED like moving the character pointer within a line and using that to delete characters, search and replace and so on. In fact it even allows sourcing libraries and executing macros from them! The whole chapter is 21 pages long. I guess if you really mastered ED, it's quite powerful actually. But since I did not plan on writing an ED tutorial and really just wanted to give you an impression of what it is like, this should suffice.
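
To summarize the mental model I ended up with, here is a toy re-creation of the session above in Python. It is a heavily simplified sketch of the buffer and line pointer as I understood them from the experiments, not a description of how ED is actually implemented; class and method names are made up:

```python
class MiniEd:
    """Toy model of ED's source file, memory buffer and line pointer."""

    def __init__(self, source_lines):
        self.source = list(source_lines)  # lines not yet read from the file
        self.buffer = []                  # lines currently in memory
        self.pointer = 0                  # 1-based current line, 0 = empty buffer

    def append(self, count=1):
        """'nA': move up to count lines from the source into the buffer."""
        taken, self.source = self.source[:count], self.source[count:]
        self.buffer.extend(taken)
        if self.pointer == 0 and self.buffer:
            self.pointer = 1

    def goto(self, line):
        """'n:': set the pointer to an absolute line number."""
        self.pointer = line

    def insert(self, text):
        """'I': insert a line before the current one, pointer moves past it."""
        if self.pointer == 0:
            self.pointer = 1
        self.buffer.insert(self.pointer - 1, text)
        self.pointer += 1

    def type_range(self, first, last):
        """'a::bT': return the numbered lines from first to last."""
        return [f"{n}: {self.buffer[n - 1]}" for n in range(first, last + 1)]


source = [
    ";",
    "; Program to shutdown z80pack systems via hardware control port",
    ";",
    "; May 2018, Udo Munk",
    ";",
]
ed = MiniEd(source)
ed.append(3)                       # 3a
ed.goto(3)                         # 3:
ed.insert("Here's a new line 3!")  # i ... CTRL-Z
ed.append(20)                      # 20a
for line in ed.type_range(1, 5):   # 1::5t
    print(line)
```

Running this prints the same five lines as the real session, with the new text as line 3 and the pointer left at line 4.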

What follows in the manual are chapters on CP/M's assembler, dynamic debugger, system interface as well as system alteration. That clearly goes beyond simply using the OS, so I skipped it. Once again, the new version is certainly a step up from the previous one, but it's still just a very simplistic operating system for early microcomputers.

Conclusion

While I certainly haven't found an OS that I'd actually like to use today, I have enjoyed the journey of educating myself on this once groundbreaking operating system. I played around with various emulators but have eventually settled on z80pack which is really convenient to use and allowed me to explore the various versions of CP/M. The OS line did not end with version 2.2, but this article is long enough already and ending after covering the first two major versions seems like a good choice.

From today's perspective CP/M is surprisingly weird in many regards. Reading command lines from left to right, you're not copying or renaming files FROM a source but TO a target, which is named first (e.g. "REN ufn1=ufn2"). This is not CP/M's fault, though, but was inspired by DEC's earlier systems. What _is_ CP/M's fault is that command parameters are not standardized by any means: STAT uses the dollar sign notation, PIP uses parameters in square brackets, etc. Same thing with the prompts: PIP and ED use an asterisk while DDT uses a minus character.

The weirdest thing for me however is being able to set a drive to R/O mode. I mean, this is a nice idea in theory. However trying to write to it will cause the "BDOS ERR ON d: Read-Only" error and perform an automatic warm start - well, and both cold and warm starts make the drives R/W again! Fortunately this is not the case for file attributes, which survive restarts. However the ability to set drives R/O puzzles me under these conditions. Unless I'm missing something, it's not really all that useful...

Unless you are into retro computing, CP/M is an operating system with virtually no practical use today - with one exception. And that exception is: Education. Both for historical education (which was my interest) and most likely also for technical education. Taking a look at Z80 assembly might be another great project since that was a processor where an interested layman could still understand its operation. But that's beyond the scope for this and the next article.