SCCS Weave Format

Created: 2023-05-09T23:11:37-05:00

BitKeeper SCCSWEAVE notes

GNU CSSC description of delta tables

The "weave" stores all versions of the file together at once.

Control codes surround blocks of lines (blocks can nest inside one another in confusing ways) and indicate whether the block is inserted or deleted as of a given serial number.

McVoy's advice: read the control code, look at its serial, and then either do what the code says or do the opposite, depending on whether the serial is in the active set.

A full revision of a file is defined by the set of serials (deltas) that are active at that revision.

Example of weaving a creation and deletion

In version 3 add these lines:
I like to pet blob cats.
In version 4 ignore these lines:
I like to pet sharks.
End conditional for version 4
I like to pet blob foxes.
End conditional for version 3

Deletions can only affect lines from the same or lower serials. This matters because the weave is built by surrounding existing blocks with control codes, so an older (lower-serial) deletion can end up physically surrounding a newer addition without deleting it.
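
A minimal sketch of reading the example above, written in Python. The names `is_visible` and `checkout` are made up for illustration, and the visibility rule is an assumption: among the blocks open around a line, the one with the highest serial that has an opinion decides. That rule is at least consistent with the note that deletions only affect the same or lower serials, but it has not been checked against any real SCCS implementation here.

```python
CTRL = "\x01"  # SCCS control lines start with ASCII SOH, usually shown as ^A

def is_visible(open_blocks, active):
    """Assumed rule: the highest-serial open block with an opinion decides."""
    for serial in sorted(open_blocks, reverse=True):
        if open_blocks[serial] == "I":
            return serial in active   # insert: keep only if its serial is active
        if serial in active:
            return False              # active delete: drop the line
        # inactive delete: no opinion, fall through to the next lower serial
    return False

def checkout(weave_lines, active):
    """Return the data lines of the version described by the `active` serials."""
    open_blocks = {}  # serial -> "I" or "D" for every block currently open
    out = []
    for line in weave_lines:
        if line.startswith(CTRL):
            kind, serial = line[1], int(line[2:])
            if kind == "E":
                open_blocks.pop(serial, None)   # end of whichever block n opened
            else:
                open_blocks[serial] = kind
        elif is_visible(open_blocks, active):
            out.append(line)
    return out

# The blob cat example, written with real control lines.
weave = [
    "\x01I 3",
    "I like to pet blob cats.",
    "\x01D 4",
    "I like to pet sharks.",
    "\x01E 4",
    "I like to pet blob foxes.",
    "\x01E 3",
]
version_3 = checkout(weave, {3})
version_4 = checkout(weave, {3, 4})
```

With only serial 3 active all three sentences come out. With serials 3 and 4 active the shark line is dropped, because the delete block for serial 4 is now in the active set.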

Creating the weave file basically involves diffing the current head and the incoming version. This gives you a set of added and removed ranges. Then you walk the weave to the current head (to find which lines to annotate), insert the insertion or deletion boundary markers, and save the file again.

So if the diff says to add at line 5, and delete 10-15, you walk the weave to find what is "effectively" line 5, and add the insertion.
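
The walk described above can be sketched roughly as follows. Everything named here (`walk`, `checkout`, `add_delta`, `insert_after`, `delete_ranges`) is hypothetical. The sketch assumes the diff has already been computed, that "add at line N" means "insert after effective line N" (0 for the top of the file), that the new serial is higher than every serial already in the weave, and that the open block with the highest serial decides visibility. It also wraps each deleted line in its own delete block instead of merging adjacent lines into one block, which keeps the code short.

```python
CTRL = "\x01"  # SCCS control lines start with ASCII SOH (^A)

def walk(weave_lines, active):
    """Yield (line, visible); visible is None for control lines."""
    open_blocks = {}  # serial -> "I" or "D"
    for line in weave_lines:
        if line.startswith(CTRL):
            kind, serial = line[1], int(line[2:])
            if kind == "E":
                open_blocks.pop(serial, None)
            else:
                open_blocks[serial] = kind
            yield line, None
            continue
        visible = False
        for serial in sorted(open_blocks, reverse=True):
            if open_blocks[serial] == "I":
                visible = serial in active
                break
            if serial in active:   # an active delete hides the line
                break
        yield line, visible

def checkout(weave_lines, active):
    return [line for line, visible in walk(weave_lines, active) if visible]

def add_delta(weave_lines, head, serial, insert_after, delete_ranges):
    """Return a new weave with delta `serial` applied on top of the `head` set.

    insert_after:  {effective_line_number: [new lines]}, 0 means top of file.
    delete_ranges: [(first, last)], inclusive effective line numbers.
    """
    def insertion(n):
        lines = insert_after.get(n)
        return [f"{CTRL}I {serial}", *lines, f"{CTRL}E {serial}"] if lines else []

    out = insertion(0)
    n = 0  # effective line number within the head version
    for line, visible in walk(weave_lines, head):
        if not visible:            # control line, or data not in the head version
            out.append(line)
            continue
        n += 1
        if any(first <= n <= last for first, last in delete_ranges):
            out += [f"{CTRL}D {serial}", line, f"{CTRL}E {serial}"]
        else:
            out.append(line)
        out += insertion(n)
    return out

weave_1 = ["\x01I 1", "a", "b", "c", "\x01E 1"]
# Delta 2: insert "x" after effective line 1 and delete effective line 2.
weave_2 = add_delta(weave_1, {1}, 2, {1: ["x"]}, [(2, 2)])
old_version = checkout(weave_2, {1})
new_version = checkout(weave_2, {1, 2})
```

Here `old_version` is still a, b, c and `new_version` is a, x, c: the old version stays readable from the same file because the new control lines are ignored whenever serial 2 is not in the active set.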

BitKeeper developer commentary on SCCS format

SCCS is a "weave". The time to get the tip is the same as the time to get the first version or any version. The file format looks like this:
^AI 1
this is the first line in the first version.
^AE 1
That's "insert in version 1", data, "end of insert for version 1".
Now let's say you added another line in version 2:
^AI 1
this is the first line in the first version.
^AE 1
^AI 2
this is the line that was added in the second version
^AE 2
So how do you get a particular version? You build up the set of versions that are in that version. In version 1, that's just "1"; in version 2, that's "1, 2". So if you wanted to get version 1, you sweep through the file and print anything that's in your set. You print the first line, get to the ^AI 2, and look to see if that's in your set. It isn't, so you skip until you get to the ^AE 2.
So any version takes the same time. And that time is fast: the largest file in our source base is slib.c, 18K lines, and it checks out in 20 milliseconds.
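
Building "the set of versions that are in that version" could look something like the sketch below. It assumes each delta records the serial of its predecessor, and it ignores any include or exclude lists a delta might carry. The `predecessor` mapping and the `active_set` name are invented for illustration.

```python
def active_set(serial, predecessor):
    """Collect `serial` and all of its ancestors by following predecessor links."""
    active = set()
    while serial:                 # serial 0 stands for "no predecessor"
        active.add(serial)
        serial = predecessor.get(serial, 0)
    return active

# Hypothetical history: 1 <- 2 <- 3, with 4 branching off 2.
predecessor = {1: 0, 2: 1, 3: 2, 4: 2}
set_for_1 = active_set(1, predecessor)
set_for_2 = active_set(2, predecessor)
set_for_4 = active_set(4, predecessor)
```

For the two-version example in the commentary this gives "1" for version 1 and "1, 2" for version 2. The branch at serial 4 picks up 1 and 2 but not 3.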

Checksum

"16 bit ignore the carry bit checksum over the whole file" used to verify that filesystem and RAM corruption did not happen while reaching for the file.

It's really surprising how well the SCCS checksum has worked. When we went to a binary format we did CRC on each block and XOR so we could put stuff back together. I still have a lot of respect for that little checksum. It served us well.
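
A "16 bit, ignore the carry" sum can be sketched in a few lines. The function name `checksum16` is made up, bytes are treated as unsigned here, and exactly which bytes of the file get fed in is left to the caller. All of that is assumption and not taken from any SCCS source.

```python
def checksum16(data: bytes) -> int:
    """Add up the byte values, discarding any carry out of the low 16 bits."""
    total = 0
    for byte in data:
        total = (total + byte) & 0xFFFF
    return total

small = checksum16(b"A")                    # a single byte sums to its own value
wrapped = checksum16(bytes([255]) * 258)    # 65790 wraps around past 65535
```

A sum like this is cheap and catches most accidental damage, but unlike the per-block CRC mentioned above it cannot tell you where the damage is, and reordered bytes go unnoticed.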

McVoy e-mail about the signature