4-Oct-86 23:57:10-PDT,16201;000000000000
Mail-From: NEUMANN created at 4-Oct-86 23:55:04
Date: Sat 4 Oct 86 23:55:04-PDT
From: RISKS FORUM (Peter G. Neumann -- Coordinator)
Subject: RISKS-3.75 DIGEST
Sender: NEUMANN@CSL.SRI.COM
To: RISKS-LIST@CSL.SRI.COM

RISKS-LIST: RISKS-FORUM Digest, Saturday, 4 October 1986  Volume 3 : Issue 75

        FORUM ON RISKS TO THE PUBLIC IN COMPUTER SYSTEMS
  ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  re: Estell on Viking (RISKS-3.73) (David Parnas, Dave Benson)
  Software becomes obsolete, but does not wear out (Dave Benson)
  The fallacy of independence (Dave Benson)
  Re: Paths in Testing (RISKS-3:72) (Chuck Youman, Mark Day)
  Mathematical checking of programs (quoting Tony Hoare) (Henry Spencer)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in
good taste, objective, coherent, concise, and nonrepetitious.  Diversity is
welcome.  (Contributions to RISKS@CSL.SRI.COM, requests to
RISKS-Request@CSL.SRI.COM.)
(Back issues Vol i Issue j available in CSL.SRI.COM:RISKS-i.j.  Summary
Contents in MAXj for each i; Vol 1: RISKS-1.46; Vol 2: RISKS-2.57.)

----------------------------------------------------------------------

Date: Fri, 3 Oct 86 15:33:03 EDT
From: parnas%qucis.BITNET@WISCVM.WISC.EDU
To: neumann@SRI-CSL.ARPA
Subject: re: Estell on Viking (RISKS-3.73)
ReSent-To: RISKS@CSL.SRI.COM

Robert Estell's contribution perpetuates two serious myths about the
discussion of Viking and other software.

(1) That any of the discussants is expecting perfection.  Perfectionists do
not use the net.  In fact, the only computer scientist I know who could be
called a perfectionist does not use computers.  Most of us know that
computer systems, like other human artifacts, will never be perfect.  Our
concern is with establishing confidence that the system is free of
unacceptable or catastrophic errors.  This we can do with many other
engineering products.  Only software products regularly carry disclaimers
instead of limited warranties.  That is not because they are the only
products that are imperfect; it is because we have so little confidence
that they are free of catastrophic flaws.

(2) That size is a good measure of the difficulty of a problem.  There are
big programs solving dull but easy problems.  Small programs occasionally
solve very hard problems.  The size and irregularity of the problem state
space, and how well we know that state space, determine in large part the
complexity of the problem.  The size of the program is often determined by
the simplicity of the programmer.

In spite of Nancy's help, we don't know much from this forum about what the
Viking software actually did.  It seems clear that most of the software
could have been, and was, used before the flight.  Whether the descent
software could have been used depends on what it did.  At 100 lines one
would expect that it did not do much.

We all know that programs can work acceptably well.  We use them and accept
what they do.  We also know that their failures are not catastrophic and
that these programs failed many times before they became reliable enough
to be useful.  If we had been in a situation in which those failures were
unacceptable, we would have found another way to solve the real problem.

------------------------------

Date: Fri, 3 Oct 86 17:57:52 pdt
From: Dave Benson
To: risks%csl.sri.com@RELAY.CS.NET
Subject: Viking Lander, once again.

I repeat some quotations from Bonnie A.
Claussen's paper:

    The unprecedented success of the Viking mission was due in part to
    the ability of the flight software to operate in an AUTONOMOUS and
    ERROR FREE manner.  ...  Upon separation from the Orbiter the Viking
    Lander, under AUTONOMOUS software control, deorbits, enters the
    Martian atmosphere, and performs a soft landing on the surface.
        [CAPS added for emphasis.]

Since the up-link was only capable of 4 bits/sec and the light-speed signal
requires about 14 minutes for a round trip to Mars, manifestly the software
carried out these control functions without human assistance.

  > I worry when anecdotal evidence about one software project is used as
  > "proof" about what will happen with general software projects.
  >                                                Nancy Leveson

I concur.  But the Viking Lander experience does give a compelling example
that autonomous software can be made to work under certain circumstances.
Thus a claim that all autonomous software fails in its first operational
experience is in contradiction to the facts.

For amusement, assume that this experience scales linearly with project
size.  (I assure everyone on RISKS that the data suggest a diseconomy of
scale--thus larger projects require a more than linear increase of effort
to obtain the same reliability.)  Now, the Viking Lander required 135
engineer-years for about 18000 words of software.  Suppose each line of
Ada represents about 5 words of software.  Thus, if you'll agree with
these assumptions, each 40 lines of Ada require 1.5 engineer-years to
produce "equally well tested" code.  So a 4 million line Ada program
requires 150,000 engineer-years of effort.  Assuming a 1500-engineer level
of effort, that's one century to write and test the code.  Of course, to
be "equally well tested" would require far more effort than that, because
of the diseconomies of scale.  This is the proper lesson to draw from the
Viking Lander experience.
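
  [A back-of-the-envelope check of the arithmetic above, sketched in
  Python.  Every figure is one of Benson's stated assumptions (135
  engineer-years, 18000 words, 5 words per Ada line, a 4-million-line
  target, 1500 engineers), not measured data:

    viking_effort_ey   = 135.0      # engineer-years for the Lander software
    viking_words       = 18000.0    # words of flight software
    words_per_ada_line = 5.0        # assumed density of Ada source
    target_lines       = 4_000_000  # hypothetical large Ada program
    staff              = 1500       # assumed engineers working in parallel

    viking_lines    = viking_words / words_per_ada_line   # about 3600 lines
    effort_per_line = viking_effort_ey / viking_lines     # 0.0375 engineer-years per line
    print(40 * effort_per_line)                    # 1.5 engineer-years per 40 lines
    total_effort = effort_per_line * target_lines  # 150000 engineer-years
    print(total_effort, total_effort / staff)      # 150000.0  100.0 (calendar years)

  As Benson notes, the linear assumption is the generous case; with a
  diseconomy of scale the total would be larger still.]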
------------------------------

Date: Fri, 3 Oct 86 18:33:12 pdt
From: Dave Benson
To: risks%csl.sri.com@RELAY.CS.NET
Subject: Software becomes obsolete, but does not wear out

  ob'so.lete.  Abbr. obs.  Of a type or fashion no longer current; out of
    date; as, an obsolete machine.
  ob'so.les'cent.  Going out of use; becoming obsolete.
  wear, v.t.  ...  6. To use up by wearing (sense 1); as, to wear out a
    dress; hence, to consume or cause to deteriorate by use, esp. personal
    use; as, the luggage is worn.  7. To impair, waste, or diminish, by
    continual attrition, scraping, or the like; as, the rocks are worn by
    water; hence, to exhaust or lessen the strength of; fatigue; weary; use
    up; as, to be worn with disease.  8. To cause or make by friction or
    wasting; as, to wear a channel or hole.
  wear, v.i.  ...  4. To be wasted, consumed, or diminished, by use; to
    suffer injury, loss, or extinction, by use or time; -- often with on,
    away, out, etc.; as, the day has worn on.

Software, like any artifact, becomes obsolete over time.  The changing
informational environment around the software drives it to obsolescence.
It becomes unmaintainable, not from wear, but because the expertise
required has become dissipated.  Recall that nobody knows how to make a
Stradivarius violin anymore, either.

I agree with the causes of software obsolescence, but strongly recommend
that we use the customary meanings of words in the dictionary, so that we
understand one another and so that non-software-types can somewhat
understand us as well.  Thus: software may become obsolete from many
causes, some of which are understood.  But software ordinarily does not
wear out, and it never, never rots.  [...]

There is a reason for precise technical terms.  In other disciplines, words
are coined just to avoid such overloading and the misunderstanding that can
result.  I recommend that we attempt this, but suggest looking in the
dictionary first.

------------------------------

Date: Sat, 4 Oct 86 22:21:39 pdt
From: Dave Benson
To: risks%csl.sri.com@RELAY.CS.NET
Subject: The fallacy of independence

A RISKS contribution suggests that since we can engineer good
100,000-statement software, the means to make good 1,000,000-statement
software is to produce 10 smaller packages and hook these 10 together.
Such a claim assumes that the informational environment of the total
software is such that the total software system can be decomposed into 10
nearly independent parts, which communicate with one another along
well-understood interfaces.  The key is the claim that the interfaces are
well understood.

Software is an example of an extremely complex artifact, a class of
artifacts which we understand poorly--for otherwise we wouldn't call them
complex.  In smaller programs we repeatedly see that the interfaces are
not well understood until the program is available for experimentation.
Even then, our everyday experiences with software demonstrate again and
again that what we had assumed about the program's behavior does not match
the reality of actual experience.  Thus we discover that the interfaces
are not well understood.

Example: Virtual storage managers in operating systems provide a
superficially simple interface to the hardware and the rest of the
operating system.  The interface to the user program is the essence of
simplicity--complete transparency.  The earliest virtual storage managers
were themselves the essence of simplicity.  So nothing could go wrong,
right?  Wrong.  The interaction of user virtual storage requests, the
operating system scheduler, and the virtual storage manager led to
thrashing--slowing performance to a crawl, at best.  Upon OBSERVING this
phenomenon, theories were developed and better, more complex algorithms
were installed.  But the phenomenon was not predicted a priori.

The essential point is that even the cleanest design may fail in actual
engineering practice until it is tried in the fully operational
environment for which it was intended.  In software engineering we only
have confidence in a design if it is similar to a previous, successful
design.  But that is just like any other engineering practice.  The
intuition and insight of a Roebling (Brooklyn Bridge, 1883) is rare in any
engineering field.  Most of us are good copiers, making local improvements
to a design already shown to be successful.  The corollary is that it is
wrong to assume the near-independence of components until this
near-independence has been abundantly shown in practice and theory.

Example: The division of the front end of a compiler into lexical and
syntactic parsing components which interact in well-understood ways has an
excellent underlying theory and works well in practice.  Thus it is common
to teach this practice and theory, since, post facto, it is a workable
engineering design of nearly independent components.

By all means color me realist.  Also color me existentialist.  What works
is that which works, not what we might hope or dream or imagine works.
The near-independence of software components is a property that is only
established when it has been proved in practice.
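
  [To make the compiler-front-end example concrete: below is a minimal,
  hypothetical sketch of the decomposition Benson describes, with a token
  stream as the well-understood interface.  The tiny grammar (sums of
  integers) and all names here are illustrative only; they are not drawn
  from any particular compiler or from Benson's message.

    import re

    # Lexer: the parser never sees characters, only (kind, text) pairs.
    # The token stream is the "well-understood interface" between the parts.
    TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\+))")

    def tokenize(source):
        source = source.rstrip()
        tokens, pos = [], 0
        while pos < len(source):
            m = TOKEN_RE.match(source, pos)
            if not m:
                raise SyntaxError("bad character at position %d" % pos)
            tokens.append(("NUM", m.group(1)) if m.group(1)
                          else ("PLUS", m.group(2)))
            pos = m.end()
        return tokens

    # Parser: knows nothing about characters or whitespace, only tokens.
    # Grammar:  expr ::= NUM ( PLUS NUM )*
    def parse_sum(tokens):
        def expect(i, kind):
            if i >= len(tokens) or tokens[i][0] != kind:
                raise SyntaxError("expected %s at token %d" % (kind, i))
            return tokens[i][1], i + 1
        text, i = expect(0, "NUM")
        total = int(text)
        while i < len(tokens):
            _, i = expect(i, "PLUS")
            text, i = expect(i, "NUM")
            total += int(text)
        return total

    print(parse_sum(tokenize("1 + 2 + 39")))   # prints 42

  Either half can be rewritten without looking inside the other, precisely
  because the interface is small, stable, and backed by theory--which is
  the property Benson argues cannot simply be assumed for an arbitrary
  ten-way decomposition.]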
As there is no "software decomposition theorem" which provides a general
framework for that elusive quality, near-independence, we cannot assume
that 10 good parts will actually form a cohesive, practical, reliable
whole.  In each separate design, then, the value of the whole system can
only be demonstrated by the use of the whole system.

Thus I claim it is a fallacy to assert independence, or even
near-independence, for any division of the work within a system until this
has been conclusively demonstrated.  I further claim, with ample historical
precedent, that the reliability of a system is only poorly correlated with
the reliability of its parts.  Without a specific design one can say
nothing in general.

------------------------------

To: mday@xx.lcs.mit.edu
Cc: risks@csl.sri.com
Subject: Re: Paths in Testing (RISKS-3:72)
Date: Fri, 03 Oct 86 16:46:13 -0500
From: Chuck Youman

A comment on basis paths: a paper, "Evaluating Software Testing
Strategies", presented by Richard Selby at the 9th Annual NASA Goddard
Software Engineering Workshop, compared code reading, functional testing,
and structural testing on three aspects of software testing.  One of the
conclusions I recall is that structural testing was not as effective as
the other two methods at detecting omission faults and control faults.
The conference proceedings are report SEL-84-004 and can be obtained from
Frank E. McGarry, Code 552, NASA/GSFC, Greenbelt, MD 20771.

Charles Youman (youman@mitre.arpa)

------------------------------

Date: Sat 4 Oct 86 14:03:24-EDT
From: Mark S. Day
Subject: Re: Paths in Testing (RISKS-3:72)
To: m14817@MITRE.ARPA
cc: risks@CSL.SRI.COM

It's reasonably well known that structural (path-based) testing is poor at
detecting faults of omission.  Correspondingly, functional testing is poor
at detecting faults on "extra" paths that are present in the implementation
(for optimization of common cases, for example) but are not "visible" in a
functional spec of the module.  The conclusion to draw is that proper
testing requires a combination of "external" testing (treating the module
as a black box and examining its input/output behavior) and "internal"
testing (examining the contents of the module).

--Mark
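
  [A small, entirely hypothetical illustration of Day's point; neither
  message describes this code.  Suppose the spec says only "average(xs)
  returns the arithmetic mean of xs; an empty list is an error."  The
  implementation below has an extra path the spec never mentions and omits
  the empty-list check:

    def average(xs):
        if len(xs) == 2:                  # "extra" path: hand-optimized common case
            return (xs[0] + xs[1]) // 2   # latent fault: integer division
        return sum(xs) / len(xs)          # omission fault: xs == [] is not rejected

    # Functional (black-box) tests are drawn from the spec, which gives no
    # reason to single out two-element lists, so the // fault can survive:
    print(average([2, 4, 6]))   # 4.0
    print(average([10]))        # 10.0

    # Structural (path-based) tests must cover the extra path and so can
    # expose its fault ...
    print(average([1, 2]))      # 1, not the 1.5 the spec requires
    # ... but no path-coverage criterion will ever demand average([]),
    # because the missing check creates no path to cover:
    # average([])               # ZeroDivisionError

  Hence Day's conclusion: black-box tests check the code against the spec,
  white-box tests check what the code actually contains, and each is blind
  to what the other sees.]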
------------------------------

From: decvax!utzoo!henry@ucbvax.Berkeley.EDU
Date: Sat, 4 Oct 86 21:12:13 edt
To: ucbvax!CSL.SRI.COM!RISKS@ucbvax
Subject: Mathematical checking of programs (quoting Tony Hoare)

I agree with much of the quoted discussion from Hoare, including the
obvious desirability of rather heavier use of mathematical analysis of
safety-critical programs.  I do have one quibble with some of his
comments, though:

> ... never even heard of the possibility that you can establish
> the total correctness of computer programs by the normal mathematical
> techniques of modelling, calculation and proof. ...
> A mathematical proof is, technically, a completely reliable method of
> ensuring the correctness of programs, but this method could never be
> effective in practice unless it is accompanied by the appropriate
> attitudes and managerial techniques. ...

I think talk of "total correctness" and "complete reliability" shows
excess enthusiasm rather than a realistic appreciation of the situation.
Considering the number of errors that have been found in the small
programs used as published examples of "proven correctness", wariness is
indicated.  Another cautionary tale is the current debate about the
validity of the Rourke/Rego proof of the Poincare conjecture.  As I
understand it -- it's not an area I know much about -- the proof is long,
complex, and sketchy, and nobody is sure whether or not to believe it.
And this is a case where the specs for the problem are very simple and
obviously "right".  Mathematical proof has its own feet of clay.

If one defines "effective in practice" to imply complete confidence in the
results, then I would not fly on an airliner whose flight-control software
was written by a team making such claims.  Complete confidence in provably
fallible techniques worsens risks rather than reducing them.  (The
apocryphal comment of the aeronautical structures engineer looking at his
competitor's aircraft: "Fly in it?  I wouldn't even walk under it!")

On the other hand, if one defines "effective in practice" to mean "useful
in finding errors, and valuable in increasing one's confidence of their
absence", I wholeheartedly agree.  One should not throw out the baby with
the bathwater.  If one sets aside the arrogant propaganda of the
proof-of-correctness faction, there is much of value there.  To borrow
from the theme of a PhD thesis here some years ago, proving programs
INcorrect is much easier than proving them correct, and is very useful
even if it isn't the Nirvana of "total correctness".  The mental
discipline imposed on program creation (defining loop invariants, etc.)
is also important.

Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,decvax,pyramid}!utzoo!henry

------------------------------

End of RISKS-FORUM Digest
************************
-------