RISKS-LIST: RISKS-FORUM Digest  Monday 5 February 1990  Volume 9 : Issue 66

        FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Another SAGE memoir (Jon Jacky)
  DoD plans another attack on the "software crisis" (Jon Jacky)
  The Cultural Dimensions of Educational Computing (Phil Agre)
  Vincennes' Aegis System: Why did RISKS ignore specifications? (R. Horn)
  Computer Virus Book of Records (Simson L. Garfinkel)
  Re: AT&T (Gene Spafford, David Keppel, Stanley Chow)
  Sendmail (Brian Kantor, Rayan Zachariassen, Geoffrey H. Cooper, Kyle Jones,
    Craig Everhart)
  Re: Risks of Voicemail systems (Randall Davis)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in
good taste, objective, coherent, concise, and nonrepetitious.  Diversity is
welcome.  CONTRIBUTIONS to RISKS@CSL.SRI.COM, with relevant, substantive
"Subject:" line (otherwise they may be ignored).  REQUESTS to
RISKS-Request@CSL.SRI.COM.
TO FTP VOL i ISSUE j: ftp CRVAX.sri.com, login anonymous, AnyNonNullPW,
cd sys$user2:[risks], get risks-i.j.  Vol summaries now in risks-i.0 (j=0).

----------------------------------------------------------------------

Date: Sun, 4 Feb 1990 14:46:36 PST
From: JON@GAFFER.RAD.WASHINGTON.EDU (Jon Jacky)
Subject: Another SAGE memoir

Les Earnest's posting on SAGE reminded me of an anecdote James Fallows tells
in his book, NATIONAL DEFENSE (Vintage Books, 1982, p. 59):

"As a child in California, I grew up five miles from SAGE headquarters at
Norton Air Force Base.  Each year our classes would take school field trips
to Norton.  The dramatic conclusion came when we were ushered into the SAGE
control room.  The commanding general would appear at this point and attempt
a demonstration of how quickly and reliably his system worked.  In every
instance I can remember, there was a technical screw-up of some kind, and the
general would lead us out, assuring us that, heh heh, this sort of thing did
not happen very often."

Much of Fallows' book is a critique of technically complex weapons systems,
which many readers of this digest would find interesting.  Fallows summarizes
the SAGE story this way (also on p. 59):

"Wouldn't it be wonderful if, instead of leaving aerial combat to a group of
pilots trying to figure out for themselves which enemy planes to destroy, the
whole enterprise could be automatically controlled from the ground?  If you
had a huge radar and computer complex, it might be able to identify all the
"friendly" and "enemy" planes in the sky and rationally distribute
assignments for shooting them.  Then it could transmit commands to each
fighter plane, guiding it precisely to its target.  Visions of this sort lay
behind a $20 billion radar complex of the sixties known as SAGE --- which,
after countless revisions, finally foundered due to the technical complexity
of devising a computer program that could keep the friendly and enemy planes
straight.  Nonetheless, the Air Force and Navy have invested further billions
in radar planes known as the AWACS and E-2, which face the far greater
technical challenge of doing the same thing from a single plane in the air."

In a footnote Fallows says more about the technical difficulties of
distinguishing friend from foe:

"The real problem was that since planes in a dogfight fly in unpredictable
patterns, when two "blips" from two planes crossed on the radar screens the
computer could not be sure which plane was which when they separated
again. ..."
(Fallows doesn't give a source for this explanation of the foundering of
SAGE; the more usual explanation is that SAGE became obsolete after the
Soviet Union concentrated on aiming ICBM's, rather than manned bombers, at
the continental U.S.A.  I remember a local news story around 1982 saying they
were finally shutting down the SAGE installation at McChord Air Force Base
near Seattle, Washington).

- Jon Jacky, University of Washington

------------------------------

Date: Sun, 4 Feb 1990 15:23:21 PST
From: JON@GAFFER.RAD.WASHINGTON.EDU (Jon Jacky)
Subject: DoD plans another attack on the "software crisis"

Here are excerpts from ELECTRONICS ENGINEERING TIMES, Jan 29 1990, p. 16:

DOD PLAN ADDRESSES SOFTWARE PROBLEMS
by Brian Robinson

Washington - The Defense Department is expected to go public next month with
an ambitious plan aimed at solving its growing software problems.  The
product of an agency-wide collaboration, the plan represents the first time
the Pentagon has managed to get a broad consensus on the issue.  The master
plan, to be implemented over five years, will tackle the rapidly expanding
size and development costs of defense software, a problem made worse by the
tendency of different groups within the Pentagon to go their separate ways
when it comes to software requirements.

Some 20 groups within the defense community reportedly took part, including
the Army, Navy, Air Force, Defense Communications Agency, National Security
Agency, and the Defense Advanced Research Projects Agency.  The plan covers
six major topics: software acquisition and life cycle management, government
software policies, organizational coordination and cooperation, personnel,
the software technology base and software technology transition.  ...

Pentagon analysts have predicted growing problems as military systems expand
in size and complexity and as projects are developed that require programs
with many millions of lines of code, the Strategic Defense Initiative being
the prime example.  ...  A House investigations subcommittee report late last
year accused the Defense Department and other federal agencies of putting
lives at risk and wasting billions of dollars with substandard software ...
The National Research Council also condemned the current state of software
and development practices, contending that researchers in government and
industry had not kept up with the development of complex software systems.

The DoD will collect public comments on the plan at a forum April 3 -- 5 in
Falls Church, VA.

(The article does not mention who, or which agency, was the source of this
story.  The article also does not mention any of the DoD agencies and
projects already charged with this problem, including STARS, SEI, AJPO, or
the 1987 Defense Science Board study).

- Jon Jacky, University of Washington

------------------------------

Date: Sun, 4 Feb 90 18:05:30 199
From: "Phil Agre"
Subject: The Cultural Dimensions of Educational Computing

Anyone who is interested in technology as a cultural phenomenon will probably
want to read the following book:

  C. A. Bowers, The Cultural Dimensions of Educational Computing:
  Understanding the Non-Neutrality of Technology, Teacher's College Press,
  1988.

It is often said that computers are neutral in that, like pencils and
hammers, they can be used for either good or evil.  This might be true on
some possible interpretations, but Bowers argues that it is false on a long
list of others.
Specifically, he argues that particular computer systems for education often
incorporate unarticulated assumptions about computers, about thinking, about
society, and about the relations among these things, and that the use of
these systems can inculcate or reinforce the uncritical, even unwitting
acceptance of those assumptions by students and teachers alike.  He gives
many examples, and his arguments seem to me to apply equally well to a wide
variety of other applications of computers.

In developing his arguments, he touches on a wide variety of critiques of
technology and of computation as social phenomena.  Although he has done a
valuable public service in presenting these ideas in accessible ways, the
principal weak spot of the book is a sometimes excessive credulity toward the
critiques themselves, which are not of uniform quality.  Reading him thus
calls for a critical and selective attitude as well as an open mind, but then
it is exactly his point that we should bring such an attitude to everything
that concerns technology.  Highly recommended.

------------------------------

Date: Mon, 5 Feb 90 09:15 EST
From: HORN%HYDRA@sdi.polaroid.com
Subject: Vincennes' Aegis System: Why did RISKS ignore specifications?

Recent continuation of the Vincennes controversy in the Naval press spurs an
observation: When faced with a catastrophic failure, the non-computer naval
community is analyzing the system specifications and comparing them with the
actions taken (albeit non-mathematically).  When faced with the same
catastrophe, the RISKS community utterly ignored the specifications and
leapt into discussion of potential software flaws and changes.  This from a
group where proof of conformance to specification has a strong following.

By specifications I refer not to the engineering documents used in building
the shipboard equipment.  I mean the laws and treaties governing the
behaviour of combatant and non-combatant in areas of conflict.  They did and
do have direct relevance to the computer systems.  There have been at least
five relevant treaties covering such behaviour in the last century.  There is
a tremendous literature exploring issues and alternatives.  Situations like
the Vincennes incident are explicitly explored and analyzed.  These are the
recognized specifications for how all parties should behave.

I am not interested in renewing controversy over the events.  Parties
interested in the relevant treaties and laws might start with the overview in
_International Law for Seagoing Officers_, Roberts, and branch out from there
into details and current discussions.  I am interested in some introspective
analysis of why the computer community of RISKS totally ignored the
specifications.  Understanding this behaviour can lead to understanding a
common failure mode of computer systems.

Was it ignorance?  If so, why no requests for information?  Why such
willingness to proceed in ignorance?

Was it fear that the discussion of treaty and law might degenerate into
political arguments?  If so, how can systems involving political sensitivity
be subject to specification and analysis?

Was it other group dynamics?  If so, how can these be controlled
productively?  (In my own case I would place this as dominant.  I became
interested in the discussions and overlooked the growing irrelevancy.
Furthermore, the group dynamic discouraging radical departures from the
current topic of discussion made me hesitate to change topics.)

What other factors were involved?  I can think of only one invalid excuse.
Unlike most systems, these specifications are readily available to the public
world wide.

R Horn    horn%hydra@polaroid.com

   [I don't think the RISKS community completely ignored the
   `specifications'.  See my summary of Matt Jaffe's discussion, RISKS-8.74,
   26 May 1989.  PGN]

------------------------------

Date: 4 Feb 90 14:46:06 EST (Sun)
From: simsong@prose.CAMBRIDGE.MA.US (Simson L. Garfinkel)
Subject: Computer Virus Book of Records

(This is chart 74 from the National Center for Computer Crime Data's 1989
report, Commitment to Security.)  Please forgive any typos.

$97,000,000  John McAfee's estimate of the cost of the "Internet Worm."
             [John McAfee is president of the Computer Virus Industry
             Association.]
$10,000,000  Cliff Stoll's estimate of cost of "Internet Worm." (See $100,000.)
250,000      Richard Brandow's estimate of the number of computers his
             "World Peace Virus" infected.
$250,000     Cost of reacting to "Internet Worm" at Los Alamos National
             Laboratory.
$200,000     Gene Spafford's estimate of cost of "Internet Worm."
168,000      Records destroyed by one computer trojan horse planted in Texas.
$100,000     Cliff Stoll's low bound estimate of cost of "Internet Worm"
             (see $97,000,000)
$72,500      NASA Ames' estimate of its loss from "Internet Worm"
8,000        Gene Spafford's estimate of personnel hours lost battling
             "Internet Worm"
6,000        NCCCD's estimate of number of these hours which were not
             compensated.
6,000        Most common estimate of the number of computers affected by
             "Internet Worm."
3,000        Copies of "world peace" virus found in Aldus software
2,000        Cliff Stoll's estimate of the number of computers infected by
             the "Internet Worm."
800          Computer virus incidents reported to Computer Virus Industry
             Association in first 8 months of 1988 (see 96)
130          Countries in which computers were infected by "Christmas Tree"
             virus
96           Percentage of reports received by Computer Virus Industry
             Association which incorrectly identified viruses.
53           Percentage of National Center survey respondents who expected
             to be using anti-virus software in 1991
28           Articles about viruses listed in Reader's Guide to Periodical
             Literature in 1988 (see 1)
22           Percentage of National Center security survey respondents who
             were using anti-virus products in 1988 (see 1.5)
21           Editorials arguing that the "Internet Worm" demonstrates the
             need for greater commitment to security (see 10).
14           Computer virus cartoons collected at NCCCD
10           States currently considering new computer crime laws to fight
             viruses
10           Letters to the editor saying we should applaud rather than
             punish those who set loose computer viruses (see 21)
8            Letters to the editor and columns calling for punishment of
             those who set loose computer viruses (see 10)
7            Editorials calling for tough law enforcement against computer
             virus vandals.
6            Books in English on viruses
5            Years since term "virus" was coined.
5            Publicized calls for computer ethics in light of "Internet Worm"
4            State computer crime law virus prosecutions
2            Civil suits over viruses
1.5          Percentage of National Center survey respondents who were using
             anti-virus products in 1985.  (See 22, 53)
1            Articles about viruses listed in Reader's Guide to Periodical
             Literature in 1987 (see 28)
1            Federal computer crime law virus prosecutions.
------------------------------

From: spaf@cs.purdue.edu (Gene Spafford)
Subject: Re: AT&T (RISKS-9.62)
Date: 27 Jan 90 17:52:42 GMT

In article risks@csl.sri.com writes:
>From Telephony, Jan 22, 1990 p11:
>
>    The problem began the afternoon of Jan 15 when a piece of trunk
>    interface equipment developed internal problems for reasons that
>    have yet to be determined.

An interesting twist to this: several members of the media have gotten phone
calls from a rogue hacker claiming that he and a few friends had broken into
the NYC switch and were "looking around" at the time of the incident.  This
raises two interesting (at least to me) possibilities:

1) They had, indeed, broken in, and were responsible for the crash.  (Don't
blindly accept published statements from AT&T that it was all a simple
glitch.  Stories told off-the-record by law enforcement personnel and telco
security indicate this kind of break-in is common.)

If this is true, what to do from here?  Obviously, this raises some major
security questions about how best to protect our phone systems.  It also
raises some interesting social/legal questions.  The nationwide losses here
are probably greater than the Internet Worm, but the Federal Computer Crime
and abuse act doesn't cover it (only one system tampered with).  Other laws
may cover it, but is there any hope of proving it and prosecuting?

2) These guys were not on the machine but are trying to get the press to
publish their names as the ones responsible.  This would greatly enhance
their image in the cracker/phreaker community.  It's akin to having the
Islamic Jihad call up and claim that a suicide caller had crashed the system
(to protest dial-a-prayer and dial-a-porn, perhaps; remember that the Great
Satan is a local call from NYC :-).  It raises interesting questions about
how the press should handle such claims, and how we should react to them.

A third possibility exists, of course, that those guys had hacked into the
switch, but they had nothing to do with the failure.  That raises both sets
of questions.

I worry that it won't be long before this kind of thing happens and the phone
calls ARE from some terrorist group claiming responsibility: "We are holding
your dial tone hostage until you get your troops out of Panama, make abortion
illegal, stop killing animals for fur, and prevent Peter Neumann from making
more puns."  Or, perhaps AT&T security gets a call like: "We've planted a
logic bomb in the switching code.  Put $1 million in small unmarked bills in
the following locker at the bus station, or in 4 hours every call made in
Boston will get routed to dial-a-porn numbers in NYC.  We'll tell you how to
fix it as soon as we get the money."

Any bets that something like this will happen this year?  Last year's WANK
worm and politically-motivated viruses seem to suggest the time is ripe.

Gene Spafford, NSF/Purdue/U of Florida Software Engineering Research Center,
Dept. of Computer Sciences, Purdue University, W. Lafayette IN 47907-2004
uucp ...!{decwrl,gatech,ucbvax}!purdue!spaf

   [By the way, AT&T is certain it was an open&shut (a no-pun&shut?) case of
   a hardware-triggered software flaw, reproducible in the testbed ...  PGN]

------------------------------

Date: 1 Feb 90 02:26:40 GMT
From: pardo@cs.washington.edu (David Keppel)
Subject: Re: AT&T (RISKS-9.63)

In RISKS 9:63, William Murray (WHMurray.Catwalk@DOCKMASTER.NCSC.MIL) writes:
>AUTOMATE ONLY THOSE THINGS THAT HAPPEN WITH SUFFICIENT FREQUENCY THAT
>AUTOMATION IS JUSTIFIED.  AVOID GRATUITOUS AUTOMATION.
``Sufficient frequency'' should be qualified.  When I was a system manager I
was frequently glad that our Unix would reboot itself after minor panics.

That reminds me of a chap named Ferguson who built cars in the 60's.  They
had 4-wheel drive and anti-skid braking.  He said ``They [the safety
features] only need to save your life *once* to have them pay for
themselves.''

        ;-D on  ( Rhino boot )  Pardo

------------------------------

Date: Wed, 31 Jan 90 00:30:14 EST
From: Stanley Chow
Subject: Re: AT&T (RISKS-9.62)

I would like to make a comment regarding the AT&T incident.  I want to state
clearly that I work for Bell-Northern Research.  We are the R&D arm of
Northern Telecom, which happens to be in hot competition with AT&T.  In
particular, I work on the DMS switches, for which 4ESS is the prime
competition.  In no way do I represent the official views of BNR or NT.  What
follows is strictly the observations of a computer professional.

The article does not answer the key question: How can such a simple and
REPRODUCIBLE bug be released to the field?  Especially in such a critical
arena?  Note that a number of things had to happen:

1) The failure of a single piece of equipment turns into the (perceived)
failure of all equipment at the same site, thus "multiplying" the problem.
I.e., recovery of a single trunk ends up sending messages out through ALL
trunks.

2) Recovery action of "healthy" sites ends up "spreading" the problem.  The
recovery of a trunk should not cause the collapse of the whole site.

3) Testing does not catch the problem.  The problem is reproducible and
should have been caught in a pre-release simulated real-life testing of the
error recovery system.

All of these are likely flaws in any system.  Error recovery is rarely needed
- on most systems, you just reboot the machine or just logoff/logon to your
account again.  As a result, most people don't think about error recovery,
much less test it.  Unfortunately, error recovery is very difficult to design
and even harder to test.

Stanley Chow, BNR   (613) 763-2831   BitNet: schow@BNR.CA
UUCP: ..!psuvax1!BNR.CA.bitnet!schow
      ..!utgpu!bnr-vpa!bnr-rsc!schow%bcarh185

------------------------------

Date: Fri, 2 Feb 90 22:12:52 -0800
From: brian@ucsd.edu (Brian Kantor)
Subject: Re: sendmail flaw

It is not an SMTP requirement to complete delivery between receipt of the "."
signifying the end of the message and returning the "250 OK" message.  It is
perfectly valid to simply store the message and return the OK; you do NOT
have to deliver in real time while the sender waits.  That sendmail often
does this is perhaps a common flaw, but don't confuse it with any RFC
requirement!  It's valid to accept the message and mail back a delivery
failure later.  Probably sendmail should do that.

The most common cause of long waits is expanding mailing lists; sometimes
this takes so long that the sender times out and resends the message on the
assumption that it's failed.  However, the recipient sendmail believes it to
have succeeded, so lots of people get lots of copies of the message until
such time as the network environment lets things happen within the timeout
limits.  We at UCSD have most of our mailing lists explode in deferred time
so that an incoming message for one of them is just stored and immediately
acknowledged.  It is then delivered later.

Another common reason for long waits between "." and "250 OK" is the time
taken to process headers; if that invokes calls to the system nameserver to
look something up in the DNS, there might well be a delay.
That's not good planning but lots of people do it.  Sending sites probably
ought to have their timeout set to around 15 minutes to a couple of hours to
avoid these problems, at least until sendmail is fixed.

Finally, sendmail has a tendency to do invalid longjumps on timeouts of
various kinds: occasionally the stack is buggered and the longjump winds up
causing it to die horribly, leaving the list of delivered addresses
un-updated.  Then the next queue run happens and the people early on the list
get the message again and again and again....

I've heard rumors of a patch to sendmail that makes it checkpoint the
delivery list (the qf file) after every successful delivery.  That solves
that problem, but it's really a bandaid on the bad longjump problem.  I don't
have a copy of that patch.

I know people swear at sendmail; it's a difficult program to understand and
it's been worked on by a lot of people, so some degree of bit-rot has indeed
set in.  But I'm pleased that it works as well as it does.  It just happens
to be one of those programs that causes maximum user annoyance when it goes
wrong.

brian@ucsd.edu   ucsd!brian
Brian Kantor, UCSD Network Operations, UCSD C-024, La Jolla, CA 92093-0124

------------------------------

Date: Sat, 3 Feb 90 16:37:14 EST
From: rayan@cs.toronto.edu (Rayan Zachariassen)
Subject: Re: Sendmail Flaw

I would rephrase the SMTP problem slightly, and draw different conclusions:
After the message terminator ("." CRLF) has been sent, there are three
possible states:

1. The server SMTP crashes before accepting responsibility for delivery
   (defined by receipt of an OK code at the client SMTP).

2. The server SMTP crashes after accepting responsibility for delivery but
   before it can deliver the OK code to the client SMTP.

3. The server SMTP doesn't crash.

What makes this bad is that during synchronous delivery, the final acceptance
OK code isn't returned until the server SMTP has delivered the message to its
recipients.  If the recipient is really an address exploder, some addresses
may be processed to completion before the server SMTP crashes.  This is a
state 1 condition because the server SMTP has implicitly accepted
responsibility for delivery to *some* of the recipients of the message, but
not yet all.

There is also a vulnerable window in state 2 above.  You would think that the
window is very small, but there is ample opportunity for a swapout or some
other act-of-God delay in the execution of the acknowledgement delivery,
during which time the server can crash.  Both of these seem to happen more
frequently than people thought.

# (BTW, it was decided in SMTP's design that it was better to have multiple
# messages than to have messages get lost, so it is not considered acceptable
# for the SMTP server to queue the message but not deliver it [synchronously]).

On the contrary, the server SMTP may do anything it wants as long as it takes
responsibility for delivery of the message.  In particular this means using
asynchronous delivery, after simply queueing the message to decrease the
vulnerable window (of state 1).  Some people like the 'real-time' feedback of
synchronous delivery, but it is a dangerous thing to like given the cost.

There are economic arguments for doing synchronous address verification in
the SMTP protocol (if you are on a volume-charged network, you don't want to
transfer the message data until you know the server SMTP knows what to do
with the message), but doing so also leads to instability on client SMTP
computers as queues build up waiting for a slow server.
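   [Illustrative aside, not part of the original message: the sketch below
   shows one way a receiving SMTP server could make "accepting
   responsibility" concrete at the end-of-data point.  It is not taken from
   sendmail or any other real MTA; the function name, descriptors, and reply
   wording are invented for the example.  The essential ordering is that the
   message reaches stable storage before the 250 reply goes out, so a crash
   can at worst produce a duplicate, never a silent loss.]

    /* Sketch only: queue-before-acknowledge at the SMTP end-of-data point.
     * "spool_fd" is a freshly written spool file and "net" is the SMTP
     * connection; both are assumed to have been set up elsewhere.
     */
    #include <stdio.h>
    #include <unistd.h>

    int end_of_data(int spool_fd, FILE *net)
    {
        /* 1. Force the queued message onto stable storage.  A crash before
         *    this point is state 1 above: the client still owns the message
         *    and will retry, so nothing is lost.
         */
        if (fsync(spool_fd) < 0 || close(spool_fd) < 0) {
            fprintf(net, "451 local error, try again later\r\n");
            fflush(net);
            return -1;
        }

        /* 2. Only now accept responsibility.  A crash between the fsync and
         *    the client's receipt of this reply is state 2: the message is
         *    safe but will be sent again -- a duplicate, not a loss.
         */
        fprintf(net, "250 OK\r\n");
        fflush(net);

        /* 3. Actual delivery (list expansion, per-recipient transport) is
         *    left to a later queue run, so the client never waits on it.
         */
        return 0;
    }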
Barring economic/bandwidth issues, in message transfer the HOT ROCK model is
very appropriate: you try to get rid of a queued message as quickly as
possible, by almost any means.  This requires asynchronous checking and
delivery in server SMTPs.

See also RFC1047 by Craig Partridge on "Duplicate messages and SMTP".

rayan

------------------------------

Date: Sat, 3 Feb 90 18:43:03 PST
From: geof@aurora.com (Geoffrey H. Cooper)
Subject: Re: Re: sendmail flaw

Thanks for your message.

> It is not an SMTP requirement to complete delivery between receipt of
> the "." signifying the end of the message and returning the "250 OK"
> message.

I stand corrected.  The problem I bring up is in the design of the protocol
and the consequent generalization to software design.  It would be different
if the protocol spec SPECIFIED that no processing was to be done during this
time.  That would certainly diminish the problem, and one could make the
valid argument that this fixes the problem "enough."  After all, by the same
top-level reliability argument, SMTP itself can never guarantee truly
reliable mail (only the sender and recipient of the mail can do that).

> Another common reason for long waits between "." and "250 OK" is the
> time taken to process headers; if that invokes calls to the system
> NAMESERVER to look something up in the DNS, there might well be a
> delay.  That's not good planning but lots of people do it.
            (my emphasis added)

That one is interesting, since SMTP (and, I admit, my significant exposure to
it) somewhat predates domain naming.  An example where a lingering bug in a
design is made worse by changing system requirements.

> I know people swear at sendmail.

I'm not one of them.  Although I hate debugging sendmail scripts as much as
the next system type, I'd much rather do that than deal with binary
distribution software that is non-configurable.  After all, my systems
requirements change from time to time, too.

- Geof

------------------------------

Date: Sat, 3 Feb 90 21:09:53 EST
From: kyle@cs.odu.edu (Kyle Jones)
Subject: re: Sendmail Flaw

In RISKS 9.65, Geoffrey H. Cooper writes:
 > The sendmail problem to which our moderator frequently refers is
 > actually a design problem in the SMTP protocol [...]
 >
 > (BTW, it was decided in SMTP's design that it was better to have
 > multiple messages than to have messages get lost, so it is not
 > considered acceptable for the SMTP server to queue the message but
 > not deliver it during the pause in [5]).

I never knew of such a design decision.  It's certainly not applicable to the
mail on the Internet today, considering that the domain system allows mail to
be sent to hosts on networks not directly connected to the Internet.
Queueing is inevitable since there is no way for the SMTP-server to wait for
final delivery on a network that does not support notification of that event.

RFC 821, page 2:

   When the recipients have been negotiated the SMTP-sender sends the
   mail data, terminating with a special sequence.  If the SMTP-receiver
   successfully processes the mail data it responds with an OK reply.

Note the word used is "processes", not "delivers".  The RFC also specifies
that if the server finds that it can deliver to some of the recipients but
not others, then it should respond with an OK reply, but also compose and
send an "undeliverable mail" notification message back to the original
sender of the message.
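   [Illustrative aside, not part of the original message: the sketch below
   shows what "queue the message, acknowledge, and dispose of it at leisure"
   can look like on the later, asynchronous pass over the queued recipients.
   It is not sendmail source; the one-recipient-per-line queue file and the
   deliver_to()/send_bounce() helpers are invented for the example.  The
   point is the checkpoint after each recipient is resolved, so an
   interrupted run does not start over from the top -- the same idea as the
   qf-file checkpoint patch Brian Kantor mentions above and the fix Craig
   Everhart describes below.]

    /* Sketch only: an asynchronous queue run that records progress after
     * every delivery attempt is resolved.  queue_file holds one recipient
     * address per line; deliver_to() and send_bounce() stand in for the
     * real transport and for composing an "undeliverable mail" notice to
     * the original sender.
     */
    #include <stdio.h>
    #include <string.h>

    #define MAXRCPT 1000

    extern int  deliver_to(const char *rcpt);  /* 0 ok, >0 permanent failure,
                                                  <0 temporary failure */
    extern void send_bounce(const char *rcpt);

    void run_queue(const char *queue_file)
    {
        static char rcpt[MAXRCPT][256];
        int n = 0, i, j, status;
        FILE *fp = fopen(queue_file, "r");

        if (fp == NULL)
            return;
        while (n < MAXRCPT && fgets(rcpt[n], sizeof rcpt[n], fp) != NULL) {
            rcpt[n][strcspn(rcpt[n], "\n")] = '\0';
            if (rcpt[n][0] != '\0')
                n++;
        }
        fclose(fp);

        for (i = 0; i < n; i++) {
            status = deliver_to(rcpt[i]);
            if (status < 0)
                continue;                 /* temporary: leave for next run */
            if (status > 0)
                send_bounce(rcpt[i]);     /* permanent: notify the sender  */
            rcpt[i][0] = '\0';            /* either way, don't retry it    */

            /* Checkpoint: rewrite the queue file minus the recipients that
             * are finished, so an interrupted run cannot hand them the
             * message a second time.  (A real implementation would write a
             * temporary file and rename it, to keep the rewrite crash-safe.)
             */
            fp = fopen(queue_file, "w");
            if (fp == NULL)
                continue;
            for (j = 0; j < n; j++)
                if (rcpt[j][0] != '\0')
                    fprintf(fp, "%s\n", rcpt[j]);
            fclose(fp);
        }
    }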
If I were writing an SMTP-server I would take the RFC's wording as an
invitation to queue the message after doing a cursory check of the recipient
addresses, send an OK reply, and dispose of the message at leisure, sending
error notifications as necessary.

kyle jones   ...!uunet!talos!kjones

------------------------------

Date: Mon, 5 Feb 90 10:11:48 -0500 (EST)
From: Craig_Everhart@transarc.com
Subject: Re: Sendmail Flaw

Certainly Geof Cooper's problem is inherent in SMTP, but I assumed that PGN's
distribution problems were more sendmail-specific.  To wit: sendmail
maintains each outgoing mail message as a pair of files, a qfXXX file listing
headers and recipients and a dfXXX file listing the mail body (where the two
values of XXX match).  Sendmail processes an outgoing mail request by locking
the qf/df pair (well, the XXX value) and attempting delivery to each of the
recipients listed in the qfXXX file.  When it's made an attempt on each
recipient, it writes a new qfXXX file recording the recipients to which the
mail has yet to be delivered.

In our environment, sendmail executions got interrupted all the time: we
rebooted our mail-handling servers daily, and our sendmail processes would
get stuck on an SMTP connection for all kinds of reasons.  Thus, when our
sendmail would start processing a message with many recipients, its run would
often be interrupted before it had made a complete pass through all the
recipients; in such cases, it would never record the fact that delivery was
successful to any of the recipients.  The next time sendmail started
processing that long list of recipients, it would try 'em all again: bingo,
duplicates.

My solution was to have sendmail update the qfXXX file (containing the list
of recipients) after every successful delivery.  This required a little
source-code hacking, but it was very helpful for us.  Not only did we stop
generating lots of duplicate mail, but we also reduced our mail-processing
load so that processing of those many-recipient lists would terminate!

Craig Everhart

------------------------------

Date: Wed, 24 Jan 90 20:11:14 est
From: davis@ai.mit.edu (Randall Davis)
Subject: Re: Risks of Voicemail systems (RISKS-9.61)

  Date: Thu, 18 Jan 90 08:24:18 EST
  From: r.aminzade@lynx.northeastern.edu
  Subject: Risks of Voicemail systems that expect a human at the other end

  Last night my car had a dead battery (I left the lights on -- something
  that a very simple piece of digital circuitry could have prevented...

Yes, indeed.  And the first time that piece of circuitry failed in any
interesting, amusing, or dangerous way, 40 people would send articles to
RISKS deploring the inexorable trend toward technological overkill in today's
society, suggesting how dumb the engineers were to have replaced the good,
old fashioned manual switches, and pointing out how that sort of failure
NEVER happened with manual switches.  They would of course be right (manual
switches fail differently) and they would have forgotten all the dead
batteries that didn't happen.

Three morals:

Accidents that don't happen rarely make it into the papers, the public
consciousness, or get factored into the ire over failures.

As Don Norman put it rather nicely a while back, the baseline on any
technology isn't zero defects.  Nothing is perfect now, and for any change
the relevant question is how it works, how it fails, and whether on balance
it's better than what we had; not whether it's perfect.

There is no free lunch: if you want the convenience, you have to accept the
attendant, inevitable risks.
Applying this to the phone system failure, the only perfectly reliable
communication medium is none at all; you have to be in the same room with
someone.  If you want to be someplace else and talk to them, you have to
accept the risk of malfunction.  And you want direct dial international
calls, call waiting, one-touch memory dialing, conference calls, and call
forwarding, too?  Then accept the risks inherent in increased complexity that
inevitably come along.  It won't be perfect, but you might be better off than
you were.

------------------------------

End of RISKS-FORUM Digest 9.66
************************