1-Sep-86 16:35:40-PDT,15947;000000000000
Mail-From: NEUMANN created at 1-Sep-86 16:33:36
Date: Mon 1 Sep 86 16:33:36-PDT
From: RISKS FORUM (Peter G. Neumann -- Coordinator)
Subject: RISKS-3.47
Sender: NEUMANN@CSL.SRI.COM
To: RISKS-LIST@CSL.SRI.COM

RISKS-LIST: RISKS-FORUM Digest,  Monday, 1 September 1986  Volume 3 : Issue 47

        FORUM ON RISKS TO THE PUBLIC IN COMPUTER SYSTEMS
  ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Flight Simulators Have Faults (Dave Benson)
  Re: QA on nuclear power plants, the shuttle, and beer (Henry Spencer)
  Acts of God vs. Acts of Man (Nancy Leveson -- two messages)
  Computer Literacy (Mike McLaughlin)
  Another supermarket crash (Ted Lee)
  A supermarket does not grind to a halt (Brint Cooper)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in
good taste, objective, coherent, concise, and nonrepetitious.  Diversity is
welcome.  (Contributions to RISKS@CSL.SRI.COM, Requests to
RISKS-Request@CSL.SRI.COM.)
(Back issues Vol i Issue j are available in CSL.SRI.COM:RISKS-i.j.
Summary Contents in MAXj for each i; Vol 1: RISKS-1.46; Vol 2: RISKS-2.57.)

----------------------------------------------------------------------

Date: Sat, 30 Aug 86 23:08:47 pdt
From: Dave Benson
To: risks%csl.sri.com@CSNET-RELAY.ARPA
Subject: Flight Simulators Have Faults

I mentioned the F-16 RISKS contributions to my Software Engineering class
yesterday.  After class, one of the students told me the following story
about the B-1 Flight Simulator.  The student had been employed over the
summer to work on that project, and thus had first-hand knowledge of the
incident.

It seems that when a pilot attempts to loop the B-1 Flight Simulator, the
(simulated) sky disappears.  Why?  Well, the simulated aircraft pitch angle
was translated by the software into a visual image by taking the
trigonometric tangent somewhere in the code.  With the simulated aircraft
on its nose, the angle is 90 degrees, and the tangent routine just couldn't
manage the infinities involved.  As I understand the story, the monitors
projecting the window view went blank.  Ah, me.

Is the B-1 the first aircraft with the capability to loop?  Nope, it's been
done for about 70 years now...  Is the B-1 Flight Simulator the first
flight simulator to allow loops?  Nope; seems to me I've played with a
commercially available Apple IIe program in which a capable player could
loop the simulated Cessna 180.  $$ to donuts that military flight
simulators with all functionality in software have been allowing simulated
loops for many years now.

Dick Hamming said something to the effect that while physicists stand on
one another's shoulders, computer scientists stand on one another's toes.
At least standing on toes is better than this failure to do as well as a
game program...  Maybe software engineers dig one another's graves?  And
this company wants to research Star Wars software...  Jus' beam me up,
Scotty, there's no intelligent life here.
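   [To make the failure mode concrete: below is a minimal sketch in C.  It
   is emphatically not the simulator's code, which is not public; the
   projection model and every name in it are invented for illustration.
   The point is only that an unguarded tan() near 90 degrees of pitch
   hands absurd coordinates to whatever draws the sky, while a one-line
   clamp lets the picture degrade gracefully instead of blanking:

    #include <math.h>
    #include <stdio.h>

    #define PI 3.14159265358979323846

    /* Unguarded: vertical screen offset of the horizon as a function
       of pitch.  Near 90 degrees tan() is singular; in floating point
       the result is astronomically large, and the drawing code
       downstream receives coordinates nowhere near the screen. */
    double horizon_offset(double pitch_deg, double scale)
    {
        return scale * tan(pitch_deg * PI / 180.0);
    }

    /* Guarded: clamp pitch just short of the singularity. */
    double horizon_offset_safe(double pitch_deg, double scale)
    {
        const double LIMIT = 89.9;          /* stay clear of tan(90) */
        if (pitch_deg >  LIMIT) pitch_deg =  LIMIT;
        if (pitch_deg < -LIMIT) pitch_deg = -LIMIT;
        return scale * tan(pitch_deg * PI / 180.0);
    }

    int main(void)
    {
        printf("unguarded at 90 deg: %g\n", horizon_offset(90.0, 100.0));
        printf("guarded   at 90 deg: %g\n",
               horizon_offset_safe(90.0, 100.0));
        return 0;
    }

   A sturdier fix avoids the tangent altogether, working with the sine and
   cosine of the pitch, both of which are bounded; but even the clamp
   would have kept the sky on the screens.]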
------------------------------

Date: Sun, 31 Aug 86 01:35:29 edt
From: decwrl!decvax!LOCAL!utzoo!henry@ucbvax.Berkeley.EDU
To: LOCAL!CSL.SRI.COM!RISKS
Subject: Re: QA on nuclear power plants, the shuttle, and beer

Equipment failures and human errors are common enough in any human
endeavor; the question is not whether they happen, but whether they present
actual or potential risks of serious consequences.  In this context the
lack of publicity is not at all surprising: one form of serious consequence
is public hysteria over insignificant trivia.  When an attempt to reach a
vacationing brewery programmer gets blown up into stories of a total
production shutdown and an impending beer shortage -- this, mind you, in an
industry which is *not* the focus of hostile propaganda campaigns and
widespread irrational fears -- the people involved with nuclear plants have
every reason to be very quiet about even routine, unexciting, non-hazardous
problems.

Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,decvax,pyramid}!utzoo!henry

------------------------------

Date: 30 Aug 86 17:26:20 PDT (Sat)
From: Nancy Leveson
To: risks@csl.sri.com
Subject: Acts of God vs. Acts of Man

>> From: Nancy Leveson
>> ... My real quibble is with the term "computer errors"...

> From: Herb Lin
> While I agree with the above sentiment for the most part, it strikes
> me that there is a sense in which it is not true...

I would describe it as a mistake on the part of the designers, not the
computer.  The computer probably did exactly what the designers told it to
do.  You certainly could not expect it somehow magically to come up with a
fix for the software bug that the designers put in.  Only under those
assumptions could the computer have made a mistake by not coming up with
such a fix.

> We can describe this as a design (and thus human) error.  But given
> the circumstances, it is not implausible to argue that an "Act of God"
> occurred.

But the circumstances you describe cover virtually all situations in which
I make mistakes -- unless I am working very fast, am distracted, or have
taken drugs.  By this argument, as long as I work slowly, pay attention,
and remain sober, I will never make a mistake again -- I can attribute any
stupid thing I do to God and never have to accept responsibility.  I will
also never have to do anything to improve my performance, since it is God's
fault (or the computer's fault, or the fault of the complexity of the
problem I am working on), and therefore there is nothing I can do about it.

There is a difference between responsibility and liability.  I can be
responsible without being negligent (although I argue below that the
designers could be considered negligent under the above hypothesized
circumstances).  By Herb's definition, no accident can ever be ascribed to
human error, since:

> It was a rare happening;

All accidents are rare happenings.  And unless the human is extremely
incompetent, most human mistakes are rare happenings.

> we didn't understand it very well;

I rarely make mistakes when I understand things well (unless I am working
fast, not paying attention, or stoned).

> we could not predict that it would happen.

If I could predict that I would make a mistake, then I either would not do
the thing or would fix the mistake I made.  I have yet to write a program
that I did not think was correct (although a few actually had mistakes in
them).

In fact, it is precisely the stochastic, hardware-type failures that we can
predict and for which we can provide probabilities.  It is human mistakes
that are difficult to predict.  BUT it IS possible to predict that
human-made flaws will occur in complex designs (although the particular
mistakes may not be predictable), and it IS possible to provide mechanisms
to protect against them.  This is precisely what engineers do when they put
interlocks into potentially dangerous systems.  Software and computers are
not unique in this respect.
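   [To make the interlock idea concrete: a software interlock is nothing
   more than a predicate, kept separate from the control logic, through
   which every hazardous command must pass.  A minimal sketch in C
   follows; the plant, the valve, and all of the names are hypothetical,
   invented purely for illustration:

    #include <stdio.h>

    /* Hypothetical plant state; names invented for illustration. */
    struct plant_state {
        double pressure_psi;
        int    coolant_flowing;        /* nonzero: flow is confirmed */
    };

    #define MAX_SAFE_PRESSURE 150.0

    /* The interlock: one predicate, kept apart from the application
       logic, that every hazardous command must satisfy. */
    int interlock_permits_vent(const struct plant_state *s)
    {
        return s->coolant_flowing && s->pressure_psi < MAX_SAFE_PRESSURE;
    }

    /* The hazardous operation is reachable only via the interlock. */
    int open_vent_valve(const struct plant_state *s)
    {
        if (!interlock_permits_vent(s)) {
            fprintf(stderr, "vent refused: interlock not satisfied\n");
            return -1;                 /* fail safe: do nothing */
        }
        /* ... command the actuator here ... */
        printf("vent valve opened\n");
        return 0;
    }

    int main(void)
    {
        struct plant_state s = { 180.0, 1 };  /* over the limit */
        open_vent_valve(&s);                  /* refused */
        s.pressure_psi = 120.0;
        open_vent_valve(&s);                  /* permitted */
        return 0;
    }

   Note that the interlock requires no prediction of which particular
   design mistake will occur upstream of it; it bounds the damage from
   whole classes of them, which is exactly the point.]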
One can argue that the humans were responsible for the original design flaw
hypothesized above (although not negligent), and that they were responsible
and liable for the mistake they made in not doing something to avert
disaster in case their assumptions were wrong.

Most readers will agree that the Russian explanation for the accident at
Chernobyl implies human error and not an act of God.  But certainly the
events were rare, the operators did not understand what they were doing,
and they could not predict the consequences (or they certainly would not
have done it).  Again, computer software is not needed for these conditions
to hold.

My real fear about this type of thinking is that it will be used to justify
not doing anything about software safety.  If all software design flaws are
"acts of God," then we need not spend a lot of money on safety, because
nothing can be done about programmer mistakes that are not the result of
negligence or maliciousness.  This is, however, untrue.  System safety
engineers have developed scientific and engineering principles to identify
and control hazards (caused by human design flaws or hardware failures) in
complex systems through analysis, design, and management procedures.  We
need to do the same for software before catastrophes result.

Nancy Leveson
Information and Computer Science Dept.
University of California, Irvine

   [There is a lengthy response to this message from Herb, but since he
   mostly agrees with Nancy or reemphasizes earlier points, it is not
   included in this issue -- except for a paragraph that prompted a further
   message from Nancy.  I am somewhat amused that this entire debate began
   when Nancy responded to my original statement -- which enumerated a few
   causative factors, but which in no way intended to imply that those were
   the ONLY factors in those cases.  I have maintained consistently
   throughout RISKS that a holistic approach is absolutely essential; any
   factor individually, or various factors in combination, could be
   devastating.  Herb concludes that he and Nancy really seem to agree, so
   I presume this exchange will now end.  PGN]

------------------------------

Date: 30 Aug 86 23:30:36 PDT (Sat)
From: Nancy Leveson
To: lin@xx.lcs.mit.edu, risks@csl.sri.com
Subject: Acts of God vs. Acts of Man, Round n+1 (eastbound)

> From Herb Lin:
> But the number of assumptions that designers must make is enormously
> large, and it is essentially impossible to even articulate ALL of
> one's assumptions.

Agreed.  But there are ways to determine which are the critical assumptions
with regard to particular hazards.  This is exactly what some of my
techniques, e.g., software fault tree analysis, attempt to do.  In the
Firewheel example that I published, we determined a critical assumption
that could have resulted in the satellite being destroyed: if two sun
pulses were detected within 64 milliseconds of each other, the
microprocessor interrupt system became hung, which could possibly have
resulted in destruction of the sensor booms (and thus the usefulness of the
satellite).  We found this assumption by working backward through the
software from the hazardous condition.  The solution, once the critical
assumption had been determined, was simply to block the second sun-pulse
interrupt.

I don't know for what size systems these backward analysis approaches are
practical.  It took Peter Harvey (my student) two days to analyze the
Firewheel software (which is about 1600 lines long) by hand.
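   [The shape of such a fix is easy to show.  The sketch below is not the
   Firewheel flight code; the fake clock and all of the names are invented
   for illustration.  The idea is simply that the sun-pulse handler
   discards any pulse arriving within 64 milliseconds of the last accepted
   one, so a second pulse can never put the interrupt system into the hung
   state:

    #include <stdio.h>

    #define MIN_PULSE_GAP_MS 64UL

    /* A fake clock stands in for whatever free-running timer the
       real system had. */
    static unsigned long fake_clock_ms;
    static unsigned long millis(void) { return fake_clock_ms; }

    static void process_sun_pulse(void)
    {
        printf("pulse accepted at t=%lu ms\n", fake_clock_ms);
    }

    static unsigned long last_pulse_ms;
    static int have_seen_pulse;

    /* Sun-sensor interrupt handler: a pulse arriving within 64 ms of
       the last accepted one is simply discarded ("blocked"). */
    void sun_pulse_isr(void)
    {
        unsigned long now = millis();
        if (have_seen_pulse && now - last_pulse_ms < MIN_PULSE_GAP_MS)
            return;                             /* blocked */
        last_pulse_ms = now;
        have_seen_pulse = 1;
        process_sun_pulse();
    }

    int main(void)
    {
        fake_clock_ms = 1000; sun_pulse_isr();  /* accepted */
        fake_clock_ms = 1040; sun_pulse_isr();  /* blocked: 40 ms gap */
        fake_clock_ms = 1100; sun_pulse_isr();  /* accepted: 100 ms gap */
        return 0;
    }

   Observe that the fix records the critical assumption rather than
   removing it: the 64-millisecond limit now appears explicitly, in one
   place, where an analyst can find it.]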
Obviously it would be possible to analyze larger software, but we do not
yet know how well this will scale up in practice.  We are working on a
software tool to automate as much of the analysis as possible.  These
techniques are, of course, no more perfect than other, more traditional
software engineering techniques, and better ones may be found.  I am just
not ready to say it is impossible without first trying.  Backward analysis,
verification of safety, software interlocks, software fault tolerance,
fail-safe design, ... -- there are possible solutions that we should be
examining.

Nancy Leveson

------------------------------

Date: Mon, 1 Sep 86 11:49:10 edt
From: mikemcl@nrl-csr (Mike McLaughlin)
To: risks@csl
Subject: Computer Literacy

From THE WASHINGTON POST, Monday, 1 Sept 86, page A14, Letters to the
Editor ["..." indicates omissions].  While I do not entirely agree with
Mr. Jordan, much of what he says is directly applicable to RISKS.

   [Before responding to this, please recall that this topic has already
   been discussed at some length in RISKS-2.36 and 37, and in RISKS-3.17,
   19, 20, and 21.  PGN]

+++++++++++++++++++++++++++++++++++++++++++++++++++++++

                         COMPUTER LITERACY

Although I earn my living as a consultant in computerized data bases, I
strongly oppose the view... that computer literacy should be mandatory in
the secondary school curriculum.

A computer is a device for performing some task that either is already
performed by other means or first must be understood in other terms,
usually a mathematical equation.  Learning how to operate a computer, or
program one, is not going to improve a student's knowledge of languages,
mathematics, history or political science.

Alfred North Whitehead observed that civilization increases the number of
things that we can do without thinking, i.e., that we can take for granted.
This is evident in the development of computers, which increasingly are
becoming like automobiles; anyone can drive them.  Learning the technology
of computers has as much relevance to everyday life as learning the
technology of auto engines.

Unquestionably there are tasks for which computers are indispensable, but
individuals will learn those functions as they become involved in the task
itself, whether it be medical diagnosis, controlling the flow of electric
power over a grid, or determining the authorship of a 16th-century poem.

What students need to know is how to think, especially about the human
condition.  As more and more college students flock to "practical" majors,
the secondary schools should be concentrating on the liberal arts.  In this
perspective, "computer literacy" may be just another form of a larger
illiteracy.
                                   - John S. Jordan, Washington, D.C.

------------------------------

Date: Sat, 30 Aug 86 23:26 EDT
From: TMPLee@DOCKMASTER.ARPA
Subject: Another supermarket crash
To: Risks@CSL.SRI.COM

The same thing happened here (Minneapolis-St. Paul) a couple of years ago.
I was in a major discount store (Target) during a normal busy time --
Saturday morning, I think -- only to find that all the cash registers were
down because the central computer was down.  I don't know how long it
lasted, but it was at least long enough that by the time I got there the
cashiers were using paper and pencil.

Really stupid (so he says, as a so-called computer expert): given that
those registers probably had Z80s or 6502s or such in them, a printed
record, etc., they could just as well have worked off-line (except for not
knowing the prices of items on instant sale, which I assume would have been
in the central machine).
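   [Ted's off-line argument is an instance of a general fail-soft design:
   keep enough data in the register to degrade gracefully rather than
   stop.  A minimal sketch in C; no real point-of-sale system is being
   described, and all of the names are hypothetical:

    #include <stdio.h>

    struct item { long code; double local_price; };

    /* Pretend query to the central machine.  Returns 0 on success,
       -1 when the central computer is down (simulated here). */
    static int central_price(long code, double *price)
    {
        (void)code; (void)price;
        return -1;                     /* simulate the outage */
    }

    /* Fail-soft lookup: prefer the central (possibly sale) price, but
       fall back to the locally stored shelf price and journal the
       transaction for later reconciliation, rather than halting. */
    static double lookup_price(const struct item *it, FILE *journal)
    {
        double p;
        if (central_price(it->code, &p) == 0)
            return p;
        fprintf(journal, "OFFLINE code=%ld price=%.2f\n",
                it->code, it->local_price);
        return it->local_price;
    }

    int main(void)
    {
        struct item milk = { 4011L, 1.89 };
        printf("charged %.2f\n", lookup_price(&milk, stderr));
        return 0;
    }

   The journal is the essential piece: it lets the store reconcile missed
   sale prices once the central machine returns, instead of choosing
   between wrong prices and no sales at all.]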
------------------------------

Date: Mon, 1 Sep 86 13:29:45 EDT
From: Brint Cooper
To: risks@csl.sri.com
cc: mnetor!lsuc!dave@seismo.css.gov
Subject: A supermarket does not grind to a halt

Last month, as I awaited checkout in a "Giant" supermarket in Bel Air, MD,
an area-wide power outage lasting several minutes occurred.  The sequence
of events was as follows:

  1.  All lights, outside and inside, instantly went out.
  1A. The display on each cash-register-cum-terminal RETAINED its contents!
  2.  Some of the same lights came back almost immediately (seemingly on
      back-up power).
  3.  Several minutes passed; additional lights began to come on.
  4.  One by one, the terminals "beeped" and became functional again.

No market employee with whom I spoke seemed to understand what actually
happened.  But the computer system obviously was protected from such random
power outages (which occur FREQUENTLY here).

Brint Cooper
UUCP: ...{seismo,unc,decvax,cbosgd}!brl-smoke!abc

------------------------------

End of RISKS-FORUM Digest
************************
-------