Chapter 3: The origins of C and UNIX

Photograph with Ken Thompson, creator of the original Unix, on the left, and Dennis Ritchie, creator of the C programming language, on the right.

Prelude: MULTICS

As you may recall from the previous chapter, in the 1950s, computers required an operator: a person whose job was to receive input from programmers, load it into the computer, execute it, and return the output, all while fairly balancing computer time between multiple different programmers. As time went on, parts of the operator's job were automated away by subroutines, and in the 1960s, the concept of an "operating system" began to emerge: a computer program that does all the things that were an operator's job.

A few very large companies dominated the field of computing at the time. The main players in the game were IBM, which mostly made hardware but also made software, and Bell Labs, a subsidiary of AT&T that mostly made software but also made hardware. Bell Labs has a special place in computing history because a lot of the foundational building blocks of the modern computing landscape came from there. For example, in 1959, the metal-oxide-semiconductor field-effect transistor, the most consequential invention since stone spearheads, was invented at Bell Labs by scientists Mohamed Atalla and Dawon Kahng. Transistors could replace vacuum tubes in computer design to great benefit; they can be made much smaller than vacuum tubes, and they don't burn out. This led to an exponential increase in the speed and memory that computers had. That increase is often described by "Moore's Law": the observation that the number of transistors that fit on a computer chip tends to double about every two years. It is only now, as we are making transistors that are each just a few dozen silicon atoms across, that we are approaching the possible end of Moore's Law.
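To get a feel for what "doubling every two years" adds up to, here is a back-of-the-envelope sketch in Python. The function name and the `doubling_period` parameter are made up for illustration, and the two-year figure is the rule of thumb quoted above, not a measured constant.

```python
def moores_law_factor(years, doubling_period=2.0):
    """How many times more transistors fit on a chip after `years`,
    assuming one doubling every `doubling_period` years (a rule of thumb)."""
    return 2 ** (years / doubling_period)

# Ten doublings in twenty years.
print(moores_law_factor(20))
```

Twenty years of doubling is ten doublings, which is roughly a thousandfold increase; forty years is roughly a millionfold.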

The explosive growth of computer hardware power kicked off a similarly explosive growth in the diversity and complexity of software. We only talked about a few of the most influential languages in the last chapter, but thousands of other languages were developed throughout the 1960s, some more influential than others. At the same time, of course, operating systems were becoming more sophisticated, going from replacing the operator's job to new features that a human operator could not or did not do. People began to invent "filesystems": a novel idea that would allow a computer to organize the data on a magnetic tape into "files", which were in some cases placed under a hierarchy of "directories" (which are now often called "folders"). People began trying to network computers together, and some envisioned a worldwide network of computers that would allow people to send each other files and messages from anywhere on Earth. Such novelty!

Some people wanted to make it so that a computer could run multiple programs at the same time. Back then, the hardware could only execute one instruction at a time, and therefore, operating systems could only run one program at a time. If you wanted to run multiple programs, you would have to run them one after the other. Also, initially, there was not much of a notion of using a computer "interactively", i.e. sitting at a keyboard, typing something in, and getting something out right away. People would drop off their programs in the evening and pick up the output in the morning! As computers got faster and operating systems more sophisticated, interactive computer use started to be a thing. Even so, only one person could interactively use a computer at a time, and nothing else could run while someone was using it.

So in 1964, programmers and computer scientists from MIT, Bell Labs, and General Electric began work on a new operating system called MULTICS, which stands for "Multiplexed Information and Computing Service". Multics was planned to be among the earliest systems to support "time-sharing": the idea that the operating system could run multiple programs at the same time by providing an environment in which all the programs could run independently, without having to know about the other programs running at the same time. Basically, the OS would run a few instructions from one program, then save the contents of the registers and memory, then load the registers and memory for another program, execute a few instructions from that program, and repeat indefinitely for all running programs. The programs could access the same files while running, and in this way, the computer was essentially split between however many users wanted to use it. You could even connect multiple different electric typewriters to a single computer to allow multiple people to use it interactively at once. This form of time-sharing is now most frequently known as "multitasking", and until then there were very few OSes that supported it.
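To make the take-turns idea concrete, here is a toy sketch in Python. It is not how Multics (or any real OS) was implemented: the names `program` and `run_time_shared` and the `quantum` parameter are made up for illustration, and a Python generator stands in for the saved registers and memory of a paused program.

```python
from collections import deque

def program(name, steps):
    """A toy 'program': each yield stands for one executed instruction."""
    for i in range(steps):
        yield f"{name}:{i}"

def run_time_shared(programs, quantum=2):
    """Round-robin: run up to `quantum` instructions per turn, then switch."""
    ready = deque(programs)   # programs waiting for their turn
    trace = []                # the order in which instructions actually ran
    while ready:
        current = ready.popleft()
        finished = False
        for _ in range(quantum):
            try:
                trace.append(next(current))
            except StopIteration:
                finished = True
                break
        if not finished:
            # The paused state lives inside the generator object,
            # standing in for the saved registers and memory.
            ready.append(current)
    return trace

print(run_time_shared([program("A", 3), program("B", 2)]))
```

With a three-step program A and a two-step program B, the trace interleaves the two: A runs two steps, B runs two steps, then A finishes. Neither toy program needs to know the other exists, which is the whole point.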

Another ambitious aspect of Multics was that it would not be written in Assembly. Prior to this, people took it for granted that you just couldn't write an OS in anything other than Assembly: an OS needs direct access to the hardware with a very low level of abstraction, and programming languages with a higher level of abstraction just can't do that. But the Multics project sought to challenge that idea by having the whole OS be written in a language called PL/I, pronounced "P L one", developed by IBM. This way, you could write Multics once, and compile it for however many computers had a PL/I compiler available for them; it would also make it easier for future programmers to read and modify the source code.

Or so it was thought. Many programmers involved in Multics found that they actually didn't like PL/I, and it also turned out that writing Multics was *really* hard just in general. They tried and failed to get the project off the ground as there were delays, budget overruns, and threats of cancellation. So Multics languished in development hell for years, and in 1969, Bell Labs pulled out of the project to work on other things.

Space Travel

One of the people at Bell Labs who worked on Multics was Ken Thompson. When his employer began to rethink their stake in the project, Thompson suddenly found himself with access to unused computer hardware and a whole lot of spare time on his hands. He used them to write a video game for Multics, called "Space Travel". This was a 2D game using vector graphics, in which the player flies a little spaceship around a scale model of the solar system, trying to land on various planets and moons with simulated gravity. When Bell Labs finally withdrew from the Multics project altogether, Thompson still wanted to be able to work on his game, so he rewrote it in Fortran to run on GECOS, an operating system made by General Electric for the GE 635 and GE 645, machines that were available because Bell Labs had bought them for Multics development.

But then they did the numbers and found out that each game of Space Travel cost about $75 worth of computer time, and that's in 1969 money. And AT&T obviously wasn't interested in funding game development. Still not wanting to give up on his game, Thompson found out that there was another department that had a computer called the PDP-7. The PDP-7 wasn't a super powerful computer, even by the standards of the time, but that particular one had a relatively decent graphics terminal. So Thompson used the compiler on GECOS to recompile his game to PDP-7 machine code, and later on he rewrote the game in PDP-7 Assembly.

All this porting and rewriting was really tedious. Gamedev is hard now - imagine what agony it must have been in 1969! In particular, he wanted to be able to move files from GECOS onto the PDP-7, because what he *had* been doing was using the GECOS computer to punch the binary machine code for Space Travel onto *paper tape* and then loading that into the PDP-7. This was necessary because the PDP-7 was a "minicomputer", meaning that it was the size of a large piece of furniture rather than the size of a whole room. That meant that it had limited power even for its time; it came with nine kilobytes of RAM, which could be expanded to as much as 144KB. Compare that to your computer, which likely has roughly a million times that much. So writing Assembly on the PDP-7 itself wasn't a possibility; it didn't come with support for text editing, or an assembler, or files.

Thompson wanted to change that, so he started writing a new operating system for the PDP-7. Similar to Multics, the new OS would have a "hierarchical" file system: it would have files, and it would have directories (what many now call "folders"), and every file and directory in the system would be organized into a tree of directories. It would also have a text editor, a command-line interpreter (so you could type commands in via electric typewriter and see their output printed right away), little programs for copying, moving, and deleting files, and finally, an assembler. Once all those parts plus the assembler were in place, the OS had enough features that Space Travel could be fully developed right on the PDP-7 itself. The new OS was named "Unics", as a pun on "Multics", but later this was changed to "Unix" because it looks cooler.

Thompson's Unics OS had to be as minimalist as possible. When Multics was in development, it ran over budget, failed to meet deadlines, and ended up too complicated both for the programmers and the hardware. Unics had no budget, had no deadlines except the constraints of Thompson's free time, and couldn't be more complicated than Thompson could understand by himself. With these constraints, it was far more important to get the system working at all than to get it working extremely well; every part of the system was about as minimal as it could possibly be. To run in the very limited memory that was available on the PDP-7, Thompson went as far as to abbreviate names used in the code, even ones that were already pretty short; "user" became "usr", "binary" became "bin", "remove" became "rm". So if you wanted to remove a file named "example.txt" from the user binary directory, the command would be "rm /usr/bin/example.txt". RAM is cheap as dirt now, but modern operating systems based on Unix still use those cryptically abbreviated names.
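A path like "/usr/bin/example.txt" can be read as directions for walking down the directory tree, one name at a time. Here is a toy sketch of that walk in Python, with directories as dictionaries and files as strings. The `root` contents and the `lookup` and `remove` helpers are made up for illustration; a real Unix file system stores things very differently on disk.

```python
# Directories are dicts, files are strings. All contents here are made up.
root = {
    "usr": {
        "bin": {"example.txt": "hello"},
    },
    "tmp": {},
}

def lookup(tree, path):
    """Walk from the root directory down to the entry named by `path`."""
    node = tree
    for part in path.strip("/").split("/"):
        if part:
            node = node[part]   # raises KeyError if the name does not exist
    return node

def remove(tree, path):
    """Delete one entry from its parent directory, roughly like `rm`."""
    parent_path, _, name = path.rstrip("/").rpartition("/")
    parent = lookup(tree, parent_path)
    del parent[name]

print(lookup(root, "/usr/bin/example.txt"))
```

Calling `remove(root, "/usr/bin/example.txt")` afterwards leaves the "bin" directory empty, which is roughly what the `rm` command above does to the real thing.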

Ken's colleagues get interested

Other colleagues of Thompson's had already taken interest in Space Travel, having playtested it for him when it was on Multics and GECOS. Some of his colleagues, including Dennis Ritchie, helped him develop the hierarchical file system for Unics. His colleague Brian Kernighan claims credit for coming up with the name "Unics", and says that no one is really sure whose idea it was to change that to "Unix".

The newfangled Unix OS was great: it was like Multics, except it actually worked, and it could run on relatively cheap hardware. But it still only had support for two languages: Assembly, of course, and "shell script", the language that interactive commands could be written in (so named because the interactive command line is called a "shell"). Shell script is fine as a command language, but not so great as a programming language; Assembly is Assembly, and gets the job done, but is a pain to read and write. So Thompson and Ritchie started working on a compiler for a new high-level programming language. There was a language called "BCPL", short for "Basic Combined Programming Language", that Thompson liked. But BCPL was too complicated for them to be able to write a compiler for it on a minicomputer like the PDP-7. So they designed a language inspired by BCPL but with many of the features heavily stripped down. This new language was called B.

Here is a "hello world" program in B:

main() {
  putchar ('Hell'); putchar ('o Wo'); putchar ('rld'); putchar ('*n');
}
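
If you're wondering why the text is chopped into pieces like 'Hell' and 'o Wo': B had only one data type, the machine word, and a character constant could hold only as many characters as fit in one word. (The '*n' at the end is how B writes a newline.) The sketch below, in Python rather than B, shows the packing idea. It assumes four 9-bit characters per 36-bit word, as on the GE machines mentioned earlier; the exact bit layout is an assumption, and `pack` and `unpack` are hypothetical helper names, not part of B.

```python
BITS_PER_CHAR = 9    # assumption: 9-bit characters
CHARS_PER_WORD = 4   # assumption: 36-bit words, so four characters each

def pack(chars):
    """Pack up to four characters into one integer 'word'."""
    if len(chars) > CHARS_PER_WORD:
        raise ValueError("too many characters for one word")
    word = 0
    for ch in chars:
        word = (word << BITS_PER_CHAR) | ord(ch)
    return word

def unpack(word):
    """Recover the characters from a word, skipping zero padding."""
    out = []
    for position in reversed(range(CHARS_PER_WORD)):
        code = (word >> (position * BITS_PER_CHAR)) & 0x1FF
        if code:
            out.append(chr(code))
    return "".join(out)

# The four pieces from the B program, one word each.
words = [pack(piece) for piece in ["Hell", "o Wo", "rld", "\n"]]
print("".join(unpack(w) for w in words), end="")
```

Each call to putchar in the B program hands over one such word, so "Hello World" plus a newline takes four calls instead of one.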