RISKS-LIST: RISKS-FORUM Digest Friday 10 June 1988 Volume 7 : Issue 7 FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator Contents: Accidental breach of software security (Martin Minow) "Sewage flows into river; Computer Failure Blamed" (Randal L. Schwartz) Canadian Public Service warned against SINing (John Coughlin) Betting network crash in Australia (George Michaelson) John Pershing on ATMs (David Thomasson) A typo in "UK Poly; another root typo" (Matt Bishop) Re: The Challenger and visionary software architects (Eugene Miya) COMPASS '88 CONTACT (Frank Houston) The RISKS Forum is moderated. Contributions should be relevant, sound, in good taste, objective, coherent, concise, and nonrepetitious. Diversity is welcome. Contributions to RISKS@CSL.SRI.COM, Requests to RISKS-Request@CSL.SRI.COM. PLEASE use a relevant "Subject:" line, not just "RISKS DIGEST i.j...". THANKS. For Vol i issue j / ftp kl.sri.com / get stripe:risks-i.j ... . Volume summaries in (i, max j) = (1,46),(2,57),(3,92),(4,97),(5,85),(6,95). ---------------------------------------------------------------------- Date: 9 Jun 88 16:08 From: minow%thundr.DEC@decwrl.dec.com (Martin Minow THUNDR::MINOW ML3-5/U26 223-9922) Subject: Accidental breach of software security From Electronic Engineering Times, June 6, 1988, condensed and abstracted. Unauthorized access to software cited Shuttle security lapse by Richard Doherty An accidental breach of software security at Rockwell International Corp. gave unauthorized engineers and programmers access to raw code being prepared for the space shuttle Discovery's return to space, now slated for Aug. 22. The six-month lapse in security resulted from "an unintentional keyboard error," Last November, the most recent embodiment of "Operational Increment" shuttle software (revision OI8-A) was somehow stripped of its normal RCAF [Resource Access Control Facility] protection... [IBM] Engineers discovered on Nov. 22, 1987, that they were able to modify the code while in the supposedly restrictive "browse" mode. An IBM supervisor then immediately notified NASA, and by afternoon's end, a weekend-long bit-by-bit comparison was initiated. NASA says that in all, 16 changes were made. The agency still does not know by whom or when. Because multiple copies of the OI8-A software existed at that time, NASA said there is no chance the security breach could endanger the mission. NASA conducted a six-month long inquiry into how the event happened. As a result, the agency said it now knows the normal protection was somehow stripped off during a change in editing modes (between integer and block) and who the operator was at the time. But NASA said it still doesn't know how many accesses were made and by whom over a six-month period. Nor has it discovered why it took six months to notice the code was open to editing. Shuttlegate? The acknowledgement by NASA and Rockwell engineers came in the wake of a $5.2 million lawsuit field by shuttle engineers who were dismissed after expressing concern about overall OI8-A software vulnerability, among many other problems, to Rockwell management last fall. The engineers, Sylvia Robins and Ria Solomon, first took their concerns to the company's ombudsman, then to the NASA Inspector General and, eventually, to the White House. Finally, seeing no action taken on their concerns regarding accelerated software verification audits and a steady lack of secure data access, Robins and Solomon took action by filing a lawsuit here [Houston] against Rockwell and its prime software contractor, Unisys, last fall. That action triggered widespread denial by Rockwell of the lawsuit's many safety concerns, including the charge of lax shuttle programming security that was finally acknowledged last month. [There's more, but this is getting long. Any idea what "between integer and block" modes mean?] Martin ------------------------------ Date: Tue, 07 Jun 88 10:34:13 PDT Subject: "Sewage flows into river; Computer Failure Blamed" From: Randal L. Schwartz Page one (below the fold) of the 6/7/88 morning edition of The Oregonian (Portland, Oregon) reports: [BEGINNING OF ARTICLE] "Sewage flows into river; computer failure blamed" * The five-hour spill from the Sullivan Pump Station poured about 5.4 million gallons into the Willamette River downtown A computer failure caused about 5.4 million gallons of raw sewage to spill into the Willamette River in downtown Portland early Monday, prompting state officials to warn against recreational use of the river through Tuesday morning. "That's a major spill," said Shirley Kengla, spokeswoman for the Oregon Department of Environmental Quality. [What a quote! We Oregonians now know what a major spill is... sheesh.] The spill began about 3 a.m. when a computer failure at the Sullivan Pump station caused sewage to flow directly into the river [...]. The spill was stopped by 8 a.m. [...]. Peterson [operations director] said the computer failure was caused by a loss of electrical power, shutting down the pumps at the pump station. He said the computer system was designed so that the operators could override computer commands at the station, but they discovered Monday they could only do that when the computer had electrical power [!!!]. Monday's loss of power prevented that. [... more stuff about volume and repair efforts ...] Kengla said the DEQ is investigating the spill. "Because it was a computer failure, it was an accident [!!!] and for that reason, we're just looking toward making sure it doesn't happen again," she said. [... more stuff about location of spill ...] Contact with the water could cause sickness with severe flu-like symptoms. In June 1985, another computer failure caused the dumping of more than 3 million gallons of raw sewage into the Willamette from the same pump station. Kengla said that state and city officials had been working to prevent a recurrence of the problem. "We thought we had done enough, but obviously we hadn't," she said. [Brilliant quotes!] [... stuff about another tiny spill last year, not computer-related...] [END OF ARTICLE] (1) Why did they have a computer-controlled system that depended on electrical power to operate with no backups? (2) Given that they already had a failure before, how could they have failed to design a failsafe system? (3) The statements made by the spokeswoman, like "because it was a computer failure, it was an accident..." make me wonder. Why should an accident be so obviously required to happen after a computer failure? Why should we not *plan* for the system to break? These guys obviously didn't. Sheesh. And during Rose Festival too! -- Randal L. Schwartz, Stonehenge Consulting Services (503)626-6907 on contract to Intel Technical Publications: merlyn@omepd.intel.com (for now) [Moderator's note: This article was also reported by Andrew Klossner and by Richared {?} England, Mentor Graphics Corp., Customer Services, Technical Support, Beaverton, Oregon, pdx.MENTOR!rengland@uiucdcs. The abridged version from Randal was chosen for brevity, but the following comments from R. England are also worth including: [This case] points out, once again, the need for total design consideration. A truly fail-safe system would include manual overrides instead of requiring all control signals to originate from the automatic control system. Conversation with the writer lead me to believe that there was a wide-spread power loss which may have precluded operation of the pumps as well, in which case nothing would have helped. Note, however, that the computer gets the blame. [R. England] ------------------------------ Date: 10 Jun 88 10:25:00 EDT From: John Coughlin Subject: Canadian Public Service warned against SINing The following article appeared in the Ottawa Citizen, Friday June 10, and is reproduced without permission. =========================================================================== PS warned against SINing By Iain Hunter Citizen staff writer Treasury Board President Pat Carney has ordered bureaucrats who run hundreds of federal programs to SIN no more. And Justice Minister Ray Hnatyshyn has warned that the government may bring in laws to regulate the collection and use of the social insurance number if other levels of government and the private sector don't follow the federal example. Carney announced Thursday that the use of the individual identity number will be phased out over five years in several institutions to reduce the risks of invasion of [...] privacy. Carney said the use of the SIN will be restricted mainly to the administration of tax, pension and social benefit programs. Any new collection or use of the SIN beyond those covered under the new policy will have to be approved by Parliament. A Treasury Board spokesman said "hundreds" of programs in which the number is now used will be affected by the new restrictions. The cost of phasing out the use of SIN is estimated at $16 million. Privacy Commissioner John Grace called Carney's announcement heartwarming and said it puts the federal government "in a much stronger moral position to preach controls on the use of SIN to other governments and the private sector." ------------------------------ Date: Thu, 09 Jun 88 09:40:23 +1000 From: George Michaelson Subject: Betting network crash in Australia Earlier this week (Monday) the T.A.B suffered a major loss of a computerised betting control system which caused much anguish to punters across the state of Victora. I phoned their computer manager & got a loose description of the problem, he was understandably reticent since the network manages many millions of dollars of bets daily, and there are public relations issues aside from the security ones. Brief Description of TAB & Gambling: ++++++++++++++++++++++++++++++++++++ Gambling, the 4th most ancient vice after programming sex, religion and politics is endemic in Australia. It is also very tightly controlled by the state, concentrating in several HUGE casinos with banks of "pokeys" or one-armed bandits, and of course the ponies. [remember phar lap anyone?] Apart from pre-computed instant win cards and lotterys the only [legal] form of betting off the racetrack in Australia is run by the T.A.B. or Totalizer Agency Board. They run a "totalizer" scheme, where the pool of cash placed in any given race is split into the dividend, so that the winnings are more "socially" adjusted, and also not simply calculated from starting price ("SP" bookmakers thrive in the better drinking establishments and police stations, according to popular myth & corruption tribunals) however there is an element of feedback, in that offered "odds" are obviously adjusted according to the poolsize as well as traditional form, much as in normal betting. the "divi" depends on the size of the pool and is post-close-of-betting calculated, although a running total could be available (I don't know for sure, I'm not a gambling man you understand...) Each state has a transaction processing network which underpins this exercise, so that the pool/dividend holds across the entire state for each race. The Problem: ++++++++++++ The Victorian T.A.B. uses a tightly coupled "cluster" of 5 nodes handling the incoming transactions, and a central node that acts as a controller and calculates the divi. Some form of shared memory is used to do all this. At some stage in the day the central node noticed an inconsistency in 2 separate pools, across two of the nodes. They were disconnected, possibly automatically. Afterwards, the entire network was taken down and the betting calculated manually to prevent any data inconsistency being propagated out into the real world. -From the "feedback"-y nature of the tote algorithm it's plausible that had this not been done, many many thousands of punters would have been rooked of their winnings.. actually perhaps they would have made too much money as well! -The problem was verbally ascribed to software, but no details are available. According to radio reports over $1M was lost due to the much less efficient manual methods. Punters were furious. I wanted to end with some pun about shutting the sTABle door after the horse has bolted, but I can't think of a good punchline. George Michaelson, CSIRO Division of Information Technology ACSnet: G.Michaelson@ditmela.oz Phone: +61 3 347 8644 Postal: CSIRO, 55 Barry St, Carlton, Vic 3053 Oz Fax: +61 3 347 8987 ------------------------------ Date: Thu, 09 Jun 88 09:38:27 EDT From: David Thomasson Subject: John Pershing on ATMs A tip of the hat to John Pershing (pershing@ibm.com) for his reply concerning risks of ATM's (RISKS 7.6). This piece is truly a standout in that it is (a) not anecdotal and (b) not hysterical in tone. I commend it to others as a model for crafting their arguments and replies. ------------------------------ Date: Thu, 9 Jun 88 09:01:57 EDT From: bishop@bear.Dartmouth.EDU (Matt Bishop) Subject: A typo in "UK Poly; another root typo" (Re: RISKS-7.6) Error in my last letter: he typed "halt", not "exit"; "halt" halts the CPU and "exit" takes you out of superuser mode. Sorry for the boo-boo. Matt ------------------------------ Date: Thu, 09 Jun 88 16:16:29 PDT From: Eugene Miya Subject: Re: The Challenger and visionary software architects >From: stork@humu.nosc.mil (Kent Stork) >The May issue of Defense Science validates something that many computer >scientists have probably suspected: ultimately, the failure of the Challenger >and the death of the astronauts was due to a control loop software design >oversight - just another bug. I have grabbed the May issue of Defense Science (an oxymoron): %A Yale Jay Lubkin %T The Challenger Disaster %J Defense Science %V 7 %N 5 %D May 1988 %P 13-18 (actually only 3 pages) This article neither validates nor mentions software, bugs, or computers. There are two sets of problems 1) Kent's comment, and 2) the article itself. The article is basically a summary of different causal hypotheses: the cold joint and the AW&ST puncture (Abu-Taha) hypotheses. There is a good tone of conspiracy in the paper, but I am not in a position to side in defense of Washington bureaucrats or Lubkin. The closest comment to "control loop software design" is: "It was not the leak that killed the astronauts. It was the attempt to correct the sidethrust, which sent the Challenger into violent oscillations. There are probably some difficulties in asserting "cause." Further: "If the Challenger had been permited to go off course, without attempting the major correction, the side booster would not have broken out, the booster would have burnt out with the Callenger still intact, and the crew could have ejected, off course but aslive." This latter is probably false, but we will not really know. I don't usually trust words like "ultimately." Researching this article did have a positive side benefit while in the library, I found a good 5 page obit to Richard Feynman in Engineering and Science [Caltech's Alumni magazine]. There is a very "hardware must fly" orientation in NASA, and you should understand software is not well understood in the Agency. This was documented by internal reports [Sagan et al. 1980]. Any disaster will have underlying PHYSICAL causes sought. >To what extent must software architects be visionaries? Certainly the >requirements are proportional to the power of the system the software controls. >Do such visionaries exist to design our Challengers and our other more >aggresive weapons? I take some offense at you calling the Challenger a weapon, and I know several thousand others who would, too. I would not call any of the software people working on the Shuttle "visionaries," there is a degree of salesmanship, however. Your codes have to withstand many walk throughs (or walk overs depending on your prespective 8-). Don't think the control problem scales linearly, it doesn't, typically O(n^2) and greater. --eugene miya NASA Ames (not speaking directly for the Agency) ------------------------------ Date: Fri, 10 Jun 88 16:51:28 edt From: houston@nrl-csr.arpa (Frank Houston) Subject: COMPASS '88 CONTACT Regarding the COMPASS '88 advance program and registration info. Some people have had trouble contacting me by email and I, in turn, have had some difficulty getting the nrl-csr mailer to recognize return addresses. I recommend including a surface mail address with requests for registration info. For those who could not reach me by net, my telephone number is (301)443-5020. Registration is still open. Frank Houston (I really exist @nrl-csr.arpa) [Note, not on the foregoing, but on the following: I have had a request to add the issue number on the standard trailer, as you see I have done. That seems like a fine idea for people who tend to string RISKS issues together. However, this may defeat the UNDIGESTIFIER programs if they look for the precise trailer. Please let me know if this causes any troubles. PGN] ------------------------------ End of RISKS-FORUM Digest 7.7 ************************