Date: Thu 24 Jul 86 20:18:29-PDT
From: RISKS FORUM (Peter G. Neumann -- Coordinator)
Subject: RISKS-3.25
Sender: NEUMANN@CSL.SRI.COM
To: RISKS-LIST@CSL.SRI.COM

RISKS-LIST: RISKS-FORUM Digest,  Thursday, 24 July 1986  Volume 3 : Issue 25

        FORUM ON RISKS TO THE PUBLIC IN COMPUTER SYSTEMS
  ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Petroski on the Comet failures (Alan Wexelblat)
  Re: Comet and Electra (Douglas Adams)
  On the dangers of human error (Brian Randell via Lindsay Marshall)
  Software Paranoia (Ken Laws)
  Royal Wedding Risks (Lindsay Marshall)
  How to Think Creatively (John Mackin)
  Dangers of improperly protected equipment (Kevin Belles)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in
good taste, objective, coherent, concise, nonrepetitious.  Diversity is
welcome.  (Contributions to RISKS@CSL.SRI.COM, Requests to
RISKS-Request@CSL.SRI.COM)
(Back issues Vol i Issue j available in CSL.SRI.COM:RISKS-i.j.  Summary
Contents in MAXj for each i; Vol 1: RISKS-1.46; Vol 2: RISKS-2.57.)

----------------------------------------------------------------------

Date: Thu, 24 Jul 86 12:02:41 CDT
From: Alan Wexelblat
To: risks@csl.sri.com
Subject: Petroski on the Comet failures

Henry Petroski's book _To Engineer is Human_ has a segment discussing the
Comet crashes and the detective work done to figure out why they occurred
(pages 176-184).  The story he tells makes no mention of curved or rounded
window corners.  The highlights:

- On May 2, 1953, a de Havilland Comet was destroyed on takeoff from
  Dum-Dum Airport in Calcutta, India.  The Indian Government Board of
  Inquiry concluded officially that the accident was caused by some sort of
  structural failure, due either to a tropical storm or to pilot
  overreaction to storm conditions.

- The Comet was flown "off the drawing board"; no prototypes were ever
  built or tested.

- On January 10, 1954, a Comet exploded after takeoff from Rome under mild
  weather conditions.  The plane was at 27,000 feet, so the debris fell
  over a large area of the Mediterranean.  Not enough was recovered to
  allow any conclusion on why the crash had occurred.

- On April 8, 1954, another flight leaving Rome exploded.  The pieces from
  this one fell into water too deep to allow recovery, so more pieces from
  the previous crash were sought and found.

- Investigators eventually found the tail section, which provided
  conclusive evidence that the forward section had exploded backward.  The
  print from a newspaper page was seared into the tail so strongly that it
  was still legible after months in the Mediterranean.

- The question now was WHY the cabin had exploded.  The answer was found
  only by taking an actual Comet, submerging it in a tank of water, and
  simulating flight conditions (by pressurizing and depressurizing the
  cabin and by hydraulically simulating flight stresses on the wings).

- After about 3000 simulated flights, a crack appeared at a corner of one
  cabin window; it spread rapidly when the cabin was pressurized, and the
  cabin blew apart.

- Analysis finally showed that rivet holes near the window openings in the
  fuselage caused excessive stress.  The whole length of the window panel
  was replaced in the later Comet 4 with a new panel that contained special
  reinforcement around the window openings.  (See the note following this
  list.)
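A rough, textbook-level aside (not from Petroski's account) on why such
rivet holes are fatigue-critical: each flight pressurizes the cabin and puts
the fuselage skin in tension, and a small hole raises the local stress
roughly threefold, so repeated cycles start cracks at the hole edges.  For a
thin-walled cylinder of radius r, skin thickness t, and cabin-to-ambient
pressure difference p, the standard formulas are

    \sigma_{\mathrm{hoop}} = \frac{p\,r}{t}, \qquad
    \sigma_{\max} \approx K_t \, \sigma_{\mathrm{hoop}}, \quad
    K_t \approx 3 \ \text{(small circular hole in a wide sheet, Kirsch)}

Thousands of applications of that amplified stress, one per pressurization
cycle, is precisely the fatigue mechanism the water-tank test reproduced.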
Although Petroski doesn't give his sources directly, much of his material
appears to be drawn from the autobiography of Sir Geoffrey de Havilland
(called _Sky Fever: The Autobiography_, published in London in 1961) and
from a book called _The Tale of the Comet_, written by Derek Dempster in
1958.

In general, I recommend Petroski's book; it's quite readable and has lots of
material that would be interesting to us RISKS readers.  Of particular
interest is the chapter called "From Slide Rule to Computer: Forgetting How
it Used to be Done."  It's an interesting (if superficial) treatment of some
of the risks of CAD.

Alan Wexelblat
ARPA: WEX@MCC.ARPA
UUCP: {ihnp4, seismo, harvard, gatech, pyramid}!ut-sally!im4u!milano!wex
Currently recruiting for the `sod squad.'

------------------------------

Date: Thu, 24 Jul 86 07:43:49 PDT
From: crash!pnet01!adamsd@nosc.ARPA (Adams Douglas)
To: crash!noscvax!risks@sri-csl
Subject: Re: Comet and Electra

It was my understanding that the problem with the early Electras was
whirl-mode flexing of the outboard half of the wing.  I had heard that
Lockheed reassigned its few then-existing computers to full-time research
on the problem.  But it was also my understanding that the original design
cycle for the Electra did not involve computer assistance at all -- they
weren't being used for aircraft "simulation" that early (1948?).

------------------------------

From: "Lindsay F. Marshall"
Date: Thu, 24 Jul 86 11:28:28 bst
To: risks@csl.sri.com
Subject: On the dangers of human error

   [contributed on behalf of Brian Randell]
   [From brian Fri Jul 18 17:30 GMT 1986]

The following article appeared in the Guardian newspaper (published in
London and Manchester) for Wed. July 16.  The author, Mary Midgley, is,
incidentally, a former lecturer in Philosophy at the University of Newcastle
upon Tyne.  Brian R. was so pleased to see such a sensible discussion of the
dangers of human error in a daily newspaper that he thought it worth passing
on to the RISKS readership, so here it is.....

                              IDIOT PROOF

Little did I know, when I wrote my last article about human error, that the
matter was about to receive so much expensive and high-powered attention.
Since Chernobyl, it has been hard to turn on television without receiving
more official reassurance that accidents do not happen here.  Leading the
chorus, the chairman of the Central Electricity Generating Board came on
the air to explain that, in British nuclear reactors, human error has been
programmed out entirely.  Other equally impressive testimonies followed.

Even on these soothing occasions, however, disturbing noises were sometimes
heard.  During one soporific film, an expert on such accidents observed
that human error is indeed rather hard to anticipate, and told the
following story.  A surprising series of faults occurred at a newly-built
nuclear power station, and were finally traced to failure in the cables.
On investigation, some of these proved to have corroded at an extraordinary
rate, and the corroding substance turned out to be a rather unusual one,
namely human urine.  Evidently the workmen putting up the power station had
needed relief, and had found the convenient channels in the concrete walls
they were building irresistibly inviting.  Telling the tale, the chap
reasonably remarked that you cannot hope to anticipate this kind of thing --
infinitely variable human idiocy is a fact of life, and you can only do your
best to provide against the forms of it that happen already to have
occurred to you.
This honest position, which excluded all possible talk of programming it
out, is the one commonly held by working engineers.  They know by hard
experience that if a thing can go wrong it will, and that there are always
more of these things in store than anybody can possibly have thought of.
(Typically, two or three small things go wrong at once, which is all that
is needed.)

But the important thing which does not seem to have been widely realised is
that hi-tech makes this situation worse, not better.  Hi-tech concentrates
power.  This means that a single fault, if it does occur, can be much more
disastrous.  This gloomy truth goes for human faults as well as mechanical
ones.  Dropping a hammer at home does not much matter; dropping it into the
core of a reactor does.  People have not been eliminated.  They still
figure everywhere -- perhaps most obviously as the maintenance crews who
seem to have done the job at Chernobyl, but also as designers, sellers and
buyers, repairers, operators of whatever processes are still human-handled,
suppliers of materials, and administrators responsible for ordering and
supervising the grand machines.

What follows?  Not, of course, that we have to stop using machines, but
that we have to stop deceiving ourselves about them.  This self-deception
is always grossest over relatively new technology.  The romanticism typical
of our century is altogether at its most uncontrolled over novelties.  We
are as besotted with new things as some civilisations are with old ones.
This is specially unfortunate about machines, because with them the gap
between theory and practice is particularly stark.  Only long and painful
experience of actual disasters -- such as we have, for instance, in the
case of the railways -- can ever begin to bridge it.  Until that day, all
estimates of the probability of particular failures are arbitrary guesses.

What this means is that those who put forward new technology always
underestimate its costs, because they leave out this unpredictable extra
load.  Over nuclear power, this is bad enough, first, because its single
disasters can be so vast -- far vaster than Chernobyl -- and second,
because human carelessness has launched it before solving the problem of
nuclear waste.  Nuclear weapons, however, differ from power in being things
with no actual use at all.  They exist, we are assured, merely as gestures.
But if they went off, they would go off for real.  And there have been
plenty of accidents involving them.

Since Chernobyl and Libya, people seem to be noticing these things.
Collecting votes lately for my local poll on the Nuclear Freeze project, I
was surprised how many householders said at once: "My God, yes, let's get
rid of the things."  This seems like sense.  Could it happen here?
Couldn't it?  People are only people.

Ooops - sorry...

------------------------------

Date: Thu 24 Jul 86 17:40:04-PDT
From: Ken Laws
Subject: Software Paranoia
To: Risks@CSL.SRI.COM

    From: Bard Bloom

    The VAX's software generated an error about this.  The IBM did not; and
    the programmers hadn't realized that it might be a problem (I guess).
    They had been using that program, gleefully taking sines of random
    numbers and using them to build planes, for a decade or two.

Let's not jump to conclusions.  Taking the sine of 10^20 is obviously
bogus, but numbers of that magnitude usually come from (or produce) other
bogus conditions.  The program may well have included a test for an
associated condition >>after<< taking the sine, instead of recognizing the
situation >>before<< taking the sine.
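A minimal sketch in C of what such a pre-check might look like
(hypothetical: the function name, the chosen limit, and the error
convention are illustrative, not taken from the program Bard Bloom
describes).  The idea is to reject the argument before calling sin(), once
the floating-point grid can no longer locate it within a period:

    /* Refuse to take the sine of an argument so large that adjacent
     * doubles near it span a sizable fraction of a period, since then
     * sin(x) is numerically meaningless.                               */

    #include <errno.h>
    #include <float.h>
    #include <math.h>
    #include <stdio.h>

    static int checked_sin(double x, double *result)
    {
        /* Adjacent doubles near x are about x*DBL_EPSILON apart.  That
         * spacing reaches the full period 2*pi near |x| ~ 2.8e16; we
         * reject two orders of magnitude below that, around 2.8e14.    */
        const double limit = 0.01 * 6.283185307179586 / DBL_EPSILON;

        if (!isfinite(x) || fabs(x) > limit) {
            errno = EDOM;      /* domain-style error; caller must test */
            return -1;
        }
        *result = sin(x);
        return 0;
    }

    int main(void)
    {
        double s;
        if (checked_sin(1.0e20, &s) != 0)
            fprintf(stderr, "sine argument out of sensible range\n");
        else
            printf("sin = %g\n", s);
        return 0;
    }

Whether the right response is to refuse the value, clamp it, or raise an
error condition is exactly the kind of decision that, as argued below, a
language ought to force the programmer to make explicitly.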
Testing after the fact rather than before is poor programming practice, but
not serious.

A major failing of current programming languages is that they do not force
the programmer to test the validity of all input data (including returned
function values) and the success of all subroutine calls.  Debugging would
be much easier if errors were always caught as soon as they occur.  The
overhead of such error checking has been unacceptable, but the new hardware
is so much faster that we should consider building validity tests into the
silicon.  The required conditions on a return value (or the error-handling
subroutine) would be specified as a parameter of every function call.

I tend to write object-oriented subroutines (in C) that return complex
structures derived from user interaction or other "knowledge-based"
transactions.  Nearly every subroutine call must be followed by a test to
make sure that the structure was indeed returned.  (Testing for valid
substructure is impractical, so I use NULL returns whenever a subroutine
cannot construct an object that is at least minimally valid.)  All these
tests are a pain, and I sometimes wish I had PL/I ON conditions to hide
them.  Unfortunately, that's a bad solution: an intelligent program must
handle error returns intelligently, and that means the programmer should be
forced to consider every possible return condition and specify what to do
with it.  Errors that arise within the error handlers are similarly
important, but beyond my ability even to contemplate in the context of
current languages.

Expert systems (e.g., production systems) often aid rapid prototyping by
ignoring unexpected situations -- the rules trigger only on conditions that
the programmer anticipated and knew how to handle.  New rules are added
whenever significant misbehavior is noticed, but there may be no attempt to
handle even the full range of legal conditions intelligently -- let alone
all the illegal conditions that can arise from user, database, algorithm,
or hardware errors.  I like expert systems, but from a Risks standpoint I
have to consider them at least an order of magnitude more dangerous than
Ada software.

                                        -- Ken Laws

------------------------------

From: "Lindsay F. Marshall"
Date: Thu, 24 Jul 86 13:46:31 gmt
To: risks@csl.sri.com
Subject: Royal Wedding Risks

Yesterday (23rd) we lost all power to our machine room when a circuit
breaker blew.  The cause of this was a glitch which hit us at about 13:50.
This was approximately the time that the main Royal Wedding television
coverage stopped............

------------------------------

From: munnari!basser.oz!john@seismo.CSS.GOV
Date: Thu, 24 Jul 86 18:21:08 EST
To: RISKS@CSL.SRI.COM
Subject: How to Think Creatively

Recent comments in Risks about ``computer literacy'' led Herb Lin to
comment that:

> The problem is ultimately related to clear thinking, and how to teach
> people to do THAT.

This reminded me of some mail I received last year, from a staff member
here who was teaching a first-year course on data structures.  His mail,
which was sent to a number of us here, was a plea for assistance as to the
right way to respond to some mail he had received from one of his students.
The student's mail said:

> Dear Jason,... You have really done a great job on IDS.  It really helped
> to clear a lot of lingering doubts Lent term left behind.  Thanks a lot
> again.  Could you advise on how to think creatively.  I can't "see" a
> program naturally and think deep enough to make the required
> alterations...

None of us really knew how to answer that.
John Mackin, Basser Department of Computer Science,
University of Sydney, Sydney, Australia

john%basser.oz@SEISMO.CSS.GOV
{seismo,hplabs,mcvax,ukc,nttlab}!munnari!basser.oz!john

------------------------------

Date: Thu, 24 Jul 86 01:08:50 PDT
From: crash!pnet01!kevinb@nosc.ARPA (Kevin Belles)
To: crash!noscvax!risks@sri-csl
Subject: Dangers of improperly protected equipment

Is there any device that protects not only the active lines but also the
ground lines from surge, spike, and EMI-type disturbances?

My system appears to have been victimized, thanks to our local electric
utility: the ground for my apartment complex was raised, which damaged all
the grounded equipment on my home computer system, save some cards
apparently protected by my boat-anchor power supply and by the fact that
each card in my cage is independently regulated.  In my case, the surge
entered through the ground and apparently corrupted my main floppy drive
supply to the point where it propagated along the 8" and 5 1/4" cables,
destroying the logic boards on all drives and the dynamic memory, which was
being accessed at the time.  It also managed to get my printer, on another
leg entirely, while miraculously missing my terminal and modem.  The surge
completely bypassed the fuses, and only a trace opening on the controller
board saved the rest of my system from damage.

Result: 3 dead DSDD 8" drives, 1 dead SSDD 5 1/4" drive, 3 drive power
supplies, 1 dot-matrix printer, 1 64K DRAM board, and a floppy controller
board.  Dollar cost: estimated at over $2000.00 if the equipment is
replaced with new, with no cost figured in for loss of access.

Let this be a warning: Protect your equipment!  Any investment in
anti-surge equipment, anti-spike equipment, and UPSs is an investment in
your computing future.

Kevin J. Belles - UUCP {sdcsvax,noscvax,ihnp4,akgua}!crash!pnet01!kevinb
(Disclaimer: Anything I may say is my opinion, and does not reflect the
company I keep.  KjB)

------------------------------

End of RISKS-FORUM Digest
************************