Mail-From: NEUMANN created at 1-Oct-86 20:40:31
Date: Wed 1 Oct 86 20:40:31-PDT
From: RISKS FORUM (Peter G. Neumann -- Coordinator)
Subject: RISKS-3.72 DIGEST
Sender: NEUMANN@CSL.SRI.COM
To: RISKS-LIST@CSL.SRI.COM

RISKS-LIST: RISKS-FORUM Digest, Wednesday, 1 October 1986  Volume 3 : Issue 72

FORUM ON RISKS TO THE PUBLIC IN COMPUTER SYSTEMS
ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Viking Lander (Nancy Leveson)
  Deliberate override (George Adams)
  Overriding overrides (Peter Ladkin)
  A propos landing gear (Peter Ladkin)
  Paths in Testing (Mark S. Day)
  Confidence in software via fault expectations (RISKS-3.69) (Darrel VanBuer)

The RISKS Forum is moderated.  Contributions should be relevant, sound,
in good taste, objective, coherent, concise, nonrepetitious.  Diversity
is welcome.  (Contributions to RISKS@CSL.SRI.COM, Requests to
RISKS-Request@CSL.SRI.COM)
(Back issues Vol i Issue j available in CSL.SRI.COM:RISKS-i.j.  Summary
Contents in MAXj for each i; Vol 1: RISKS-1.46; Vol 2: RISKS-2.57.)

----------------------------------------------------------------------

Date: Wed, 1 Oct 86 11:44:21 edt
From: leveson@sei.cmu.edu
To: risks@sri-csl.arpa
Subject: Viking Lander

I have spoken to a man who was involved in building the Viking Lander
software.  He told me that the on-board software consisted of three
basic functions: terminal descent, some transmission of data, and a
sequence of operations while the lander was on the ground.  The
Honeywell computer had a memory of about 18,000 24-bit words.  It was
programmed in assembly language, with an interrupt-driven executive
program.

What is interesting is his claim that there were several bugs in the
software.  There just were not any bugs that caused an abort of the
mission or other serious consequences.  The programs were, in fact,
overloaded on the way to Mars because of the discovery of some
problems.  This man claims that there was a great unwillingness to
admit that there were bugs in the software, and so some remain
undocumented or difficult to find in the documentation.  But the
software was NOT bug-free.

There was also ground-system software responsible for auxiliary
computations such as pointing the antenna.

One of the reasons I have come into contact with this software is that
I have been asked to participate in a fault-tolerance experiment that
will use the Viking Lander as an example.  In examining the software
in the light of using it for such an experiment, Janet Dunham at RTI
has found it not to be adequate in its present form because it is too
small and straightforward.  In fact, she is presently writing a
specification of the terminal descent portion that adds complexity to
the problem to make it more interesting for the experiment (she is
afraid that there just will not be enough errors made given the
original problem).  She says that the navigation portion of the
terminal descent software (which is the most complex part) is only
about 100 lines of Ada code.

The point to remember is that not all software is alike.  Small,
straightforward problems with very little complexity in the logic
(e.g., just a series of mathematical equations) may not say much about
the reliability of large, complex systems.  We know that scaling up is
a very serious problem in software engineering.  In fact, it has been
suggested that small and large software projects have very few
similarities.

It should also be noted that the avionics part of this relatively
small and relatively simple software system cost $18,000,000 to build.
Although the 18,000 words of memory were overloaded a few times, the
amount of money spent per line of code was extremely high: on the
order of $1,000 for each of those 18,000 words.

I worry when anecdotal evidence about one software project is used as
"proof" of what will happen with software projects in general.  There
are just too many independent variables and factors to do this with
confidence.  And, in fact, we do not even know for sure what the
important variables are.

Nancy Leveson
Info. & Computer Science Dept., University of California, Irvine
(arpanet address, for angry replies, is nancy@ics.uci.edu, despite
evidence above to the contrary)

------------------------------

Date: Wed, 1 Oct 86 10:18:43 pdt
From: George Adams
To: risks@csl.sri.com
Subject: Deliberate override

Even automobiles might appropriately have overrides of automated
controls, and even of automated safety systems.  I have only read
about the following automobile item and wish I had the opportunity to
verify it, but it seems reasonable.

Consider the anti-lock braking systems now becoming more widely
available in automobiles.  The driver can apply a constant input to
the brake pedal, but modulated braking forces are applied at the
wheels so that the wheels do not lock.  Many have probably seen the
television ad in which the car with anti-lock brakes successfully
negotiates a turn on wet pavement while coming to a rapid stop without
skidding out of control.  Yet perhaps such vehicles should have a
switch to disable the anti-lock system and allow conventional braking.
Imagine trying to stop quickly with anti-lock brakes on a gravel road.
Even if an incompetent driver forgot to re-enable the system on hard
pavement, performance would be no worse than is now common.  Without
the switch, a competent driver might hit that cow instead of stopping
in time.

Regarding aircraft, a report on the midair collision involving the
Aero Mexico flight said that the flight crew applied thrust reversers
after the collision.  This seems like a creative response, and one
that might easily be disallowed in a more automated aircraft in which
a check for weight on extended landing gear was a prerequisite for
thrust reversal.  While thrust reversal had no benefit for the Aero
Mexico flight itself, perhaps it reduced the impact speed and
consequently the extent of damage on the ground.

A vehicle and its operator are a system.  By automating vehicle
systems we can adapt operator workload to better match the
capabilities of human beings and make it possible for an operator to
do a better job.  We can also automate so as to limit operator options
for coping with non-routine situations and to impede rapid operator
override, thereby making a more expert system and also a less
generally capable one.

------------------------------

Date: Wed, 1 Oct 86 17:08:22 pdt
From: ladkin@kestrel.ARPA (Peter Ladkin)
To: risks@sri-csl
Subject: Overriding overrides

An example of a deliberate override that led to disaster: an Eastern
Airlines 727 crashed in Pennsylvania, with considerable loss of life,
when the pilots, completing an approach in instrument conditions
(ground fog), were 1000 feet lower than they should have been at that
stage.  They overrode the altitude alert system when it gave warning.

Peter Ladkin

------------------------------

Date: Wed, 1 Oct 86 16:55:14 pdt
From: ladkin@kestrel.ARPA (Peter Ladkin)
To: risks@sri-csl
Subject: A propos landing gear

Alan Marcum's comment on gear overrides in the Arrow reminded me of a
recent incident in my flying club (and his, too).  The Beech Duchess,
a light training twin, has an override that maintains the landing gear
*down* while there is weight on the wheels, ostensibly to prevent the
pilot from retracting the gear while on the ground.  (This is a
problem Beech has in some of its airplanes, since it chose a
non-standard location for the gear and flap switches, encouraging a
pilot to mistake one for the other.)

Pilots can get into the habit of *retracting* the gear before takeoff,
secure in the knowledge that it will remain down until weight is
lifted off the wheels, whereupon it will commence retracting.  This
has two major advantages: it is one less thing to do during takeoff,
allowing more concentration on flying, and the gear is retracted at
the earliest possible moment, allowing maximum climb performance,
which is important in case an engine fails at this critical stage.
Can anyone guess the disadvantage of this procedure yet?

Our club pilot, on his ATP check ride, with an FAA inspector aboard,
suffered a nosewheel collapse on takeoff, dinging the nose and both
props and necessitating an expensive repair and engine rebuild.
Thankfully, all walked away unharmed.  It was a windy day.  It is
popularly supposed that the premature-retraction technique was used,
and that a gust of wind near rotation speed lifted the weight off the
nosewheel.  When the plane settled, the retraction had activated and
the lock had disengaged, allowing the weight to collapse the
nosewheel.  Both pilots insist that the gear switch was in the down
position, contrary to the popular supposition, and all gear systems in
the aircraft were functioning normally when tested after the accident.

The relevance to RISKS?  The system is simple, and understood in its
entirety by all competent users.  The technique of premature
retraction has advantages.  It is not clear that a gedankenexperiment
could predict the disadvantage.

Peter Ladkin
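
To make the popularly supposed failure mode concrete, here is a small
sketch in Python.  It is hypothetical (the names are invented, and it
is certainly not Beech's actual gear logic): an interlock that merely
inhibits retraction while there is weight on the wheels, rather than
cancelling a premature UP selection, needs only one momentary
weight-off-wheels signal to begin retracting.

    # Hypothetical squat-switch interlock; a sketch, not Beech's logic.
    class GearSystem:
        def __init__(self):
            self.handle = "DOWN"        # pilot's gear selector
            self.extended = True        # gear is down and locked

        def update(self, weight_on_wheels):
            # The interlock only inhibits retraction while weight is on
            # the wheels; it does not latch out an early UP selection.
            if self.handle == "UP" and not weight_on_wheels:
                self.extended = False   # retraction begins at once

    gear = GearSystem()
    gear.handle = "UP"                   # premature retraction selected
    gear.update(weight_on_wheels=True)   # rolling down the runway
    assert gear.extended                 # interlock holds the gear down
    gear.update(weight_on_wheels=False)  # a gust unloads the nosewheel
    gear.update(weight_on_wheels=True)   # the airplane settles back on
    assert not gear.extended             # retraction has already begun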

------------------------------

Date: Wed 1 Oct 86 11:57:32-EDT
From: Mark S. Day
Subject: Paths in Testing
To: RISKS@CSL.SRI.COM
In-Reply-To: <8610010515.AA25307@ATHENA>
Message-ID: <12243347691.50.MDAY@XX.LCS.MIT.EDU>

"Paths" as used in the discussion of "path coverage" are probably
intended to be what are called "basis paths."  A piece of code with
loops can indeed have an infinite number of paths, but every path is a
linear combination of a much smaller set of paths.  Testing that
covers every basis path, and that also tests each loop using
"engineer's induction" ("zero, one, two, three... good enough for
me"), is significantly better than random testing to "see what
breaks," and much more feasible than trying to test all the
combinations of basis paths.

The McCabe or cyclomatic complexity metric gives the number of basis
paths through a piece of code; see T.J. McCabe, "A Complexity
Measure," IEEE Transactions on Software Engineering SE-2, 4 (Dec
1976), pp. 308-320.

A quick approximation of McCabe complexity: straight-line code has a
complexity of 1 (obviously, I guess), and most control statements
(if-then, if-then-else, while, repeat, for, ...) each add 1 to the
complexity.  An n-way case statement adds n-1 paths to the "straight"
path, so it adds n-1 to the complexity.  This approximation applies
only to code with no gotos.
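
As a rough illustration of this rule of thumb, here is a short sketch
(added for illustration; the function name is invented) that applies
it to Python source, where only if/while/for arise and there is no
case statement or goto:

    # Approximate McCabe complexity of goto-free Python source:
    # start at 1 for straight-line code, add 1 per control statement.
    import ast

    def approx_mccabe(source):
        complexity = 1
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.If, ast.While, ast.For)):
                complexity += 1
        return complexity

    code = "x = 0\nfor i in range(3):\n    if i > 1:\n        x += i\n"
    print(approx_mccabe(code))   # 3: one loop, one if, plus 1

An elif chain parses as nested if statements, so each elif adds 1,
consistent with the rule above.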

The IEEE Computer Society puts out a tutorial volume called
"Structured Testing" that includes the paper cited above and a number
of other related articles, including a heuristic for using the McCabe
complexity to select test paths.

--Mark

------------------------------

Date: Tue, 30 Sep 86 18:37:27 pdt
From: hplabs!sdcrdcf!darrelj@ucbvax.Berkeley.EDU (Darrel VanBuer)
To: hplabs!CSL.SRI.COM!RISKS@ucbvax.Berkeley.EDU
Subject: Re: Confidence in software via fault expectations (RISKS-3.69)

>From: Dave Benson
>... Software ordinarily has no manufacturing defects and the
>usual way ordinary backups are done insures that most software does not
>wear out before it becomes obsolete.

The thing is, software DOES wear out, in the sense that it loses its
ability to function because the world continues to change around it
(perhaps partly because the pattern of bits does NOT wear out).  For
example: operating systems that have gone psychotic because the number
of bits used to represent a date overflowed, compatible hardware
having continued to run far longer than the designers of the software
and hardware anticipated (how many IBM-360 programs will correctly
handle the fact that the year 2100 is NOT a leap year, yet still be
running inside some emulation or automatic retranslation?); or
financial software unable to deal with 1000-fold inflation because all
the numbers overflow.

Darrel J. Van Buer, PhD, System Development Corp.
2525 Colorado Ave, Santa Monica, CA 90406   (213) 820-4111 x5449
...{allegra,burdvax,cbosgd,hplabs,ihnp4,orstcs,sdcsvax,ucla-cs,akgua}!sdcrdcf!darrelj
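
The 2100 example is easy to demonstrate: century years are leap years
only when divisible by 400, and the naive divisible-by-4 shortcut
found in much old code gets 2100 wrong.  A minimal sketch in Python:

    def naive_is_leap(year):
        return year % 4 == 0    # the shortcut many old programs take

    def is_leap(year):
        # Gregorian rule: every 4th year, except century years that
        # are not divisible by 400.
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    print(naive_is_leap(2100), is_leap(2100))  # True False -- naive rule is wrong
    print(naive_is_leap(2000), is_leap(2000))  # True True  -- 2000 is a leap year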

        [This one is getting to be like the "YES, VIRGINIA, THERE IS A
        SANTA CLAUS" letter that used to appear each year in the Herald
        Tribune.  But I keep reprinting the recurrences because some of
        you still don't believe it.  There is no sanity clause.  PGN]

------------------------------

End of RISKS-FORUM Digest
************************