UTZOO Tapes Introduction

Well, the thank-yous have been rather ebullient all day long today and I feel
somewhat embarrassed by the attention. Especially given how long it took us to
get the archive on line and visible! It has to be close to 10 years now. Sigh.
The story is more a story of fits and starts than of resolve. And our
contribution accounts for some (most?) of the first 10 years of the Google archive.

If I recall correctly, the issue of Henry Spencer's (actually, the University
of Toronto, Department of Zoology's) NetNews archive was raised at a Usenix
conference in the early 90's. The question: can we get at them? Bruce Jones
was especially interested in this. Henry's answer was that it really wasn't
going to be easy because he had neither the disk space nor the tape drive to
pull them all down to make them available.

I, it turned out, did. So one bright winter day I drove from London
(Ontario, Canada) to Toronto (Ontario, Canada) -- a two-hour drive in my
shiny new pickup truck -- and picked up 141 magtapes from the Zoology
department at UofT and brought them back to the Department of Computer
Science at the University of Western Ontario. (A not unimpressive
bandwidth, by the way, of some 18Mb/sec :-) Never underestimate the
bandwidth of a pickup truck on the highway!)

Then, with the help of several people (some of whom have not yet been credited),
we started to pull the data off of the tapes and onto disks in both the
Computer Science department and the Robarts Research Institute. Lance Bailey,
then with the Robarts Research Institute, did the pulling there and I, with
assistance from Bob Webber, did it at Computer Science. Bruce Jones from
UCSD took some vacation time and came up here to help pull data down for a
week or so as well.

But we quickly ran out of space and time: Lance left Robarts for UBC, Bruce's
vacation ended, and Bob and I got busy doing other things (like our jobs).
As a result, the archive project made very little progress over the next few years.

Then Brewster Kahle started pushing on us (thanks Brewster!) to get it done.
He even bought us a large disk to hold the archive when we truly ran out of
space. With the help of Sue Thielen, who was out of work and bored, we got
all of the rest of the tapes read down onto that disk. Unfortunately, that
disk was not "close enough" to either a tape drive or the ftp server to
make the data available to anyone. And it wasn't organized in any useful way.

Brewster pushed very gently for a very long time but the new archive project
was far from the top of the list of projects I was supposed to be working
on and I just never got it going again.

Late this summer Michael Schmitt from Google started pushing as well.
And as luck would have it, I was able to hire a student to do the final
sorting of the archive. And, that luck still holding, I managed
to "steal" enough space on the ftp server for the entire archive! But
it still took months to get that figured out and the archive transferred
to a machine from which they could pull it. It was the middle of October
before we were able to make the collection available to Google. And it is
actually available, although totally unsorted, to anyone who wants it and can
deal with pulling some 160 files ranging in size from 1.4Mb to 65Mb. Just drop
me a line to say please and we'll arrange to make it visible to you.

I'd still like to impose a bit more order on the raw archives than we have
but the time just hasn't allowed for that...

David Wiseman, Dec 11, 2001

David Wiseman's Introduction to the UTZOO tapes (www)