RISKS-LIST: RISKS-FORUM Digest  Saturday 20 January 1990   Volume 9 : Issue 61

        FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS 
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Shortage of RISKS but no shortage of risks -- the week in review (PGN)
  AT&T Failure (Bill Murray, Jim Horning)
  Risks of Voicemail systems that expect a human at the other end (R. Aminzade)
  Risks of vote counting (Alayne McGregor)
  Risks of supermarket checkout scanners (David Marks)
  European R&D in Road Transportation (Brian Randell)
  Old habits die hard (Dave Horsfall)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in good
taste, objective, coherent, concise, and nonrepetitious.  Diversity is welcome.
CONTRIBUTIONS to RISKS@CSL.SRI.COM, with relevant, substantive "Subject:" line
(otherwise they may be ignored).  REQUESTS to RISKS-Request@CSL.SRI.COM.
TO FTP VOL i ISSUE j:  ftp CRVAX.sri.com<CR>login anonymous<CR>AnyNonNullPW<CR>
  cd sys$user2:[risks]<CR>get risks-i.j .  Vol summaries now in risks-i.0 (j=0)

----------------------------------------------------------------------

Date: Fri, 19 Jan 1990 15:14:08 PST
From: "Peter G. Neumann" <neumann@csl.sri.com>
Subject: Shortage of RISKS but no shortage of risks

Well, it was a very strange week for me to be off-line, not to be able to
report timely commentaries on the AT&T long-distance breakdown, several
glitches aboard the shuttle Columbia, the Internet Worm trial of Robert T.
Morris, and the San Jose CA indictment of three young computer people.  On the
other hand, most everything was well covered in the media, which had a
field-week (= 5 field-days?).

The exact cause of the 15 Jan 90 AT&T slowdown is apparently still not known,
although the result involved the propagation of bogus status information
describing the partial outage of one network node, due to a software flaw.
Apparently this information propagated to other nodes, and was amplified in
turn by each of them (because of the same program flaw?), creating the
`illusion' of a major outage.  It should be noted that AT&T's service record up
until that time had been exceptionally good, reflecting a fundamental concern
for very high system availability.  This outage immediately brought to mind (1)
the article by Eric Rosen (Vulnerabilities of Network Control Protocols, ACM
Software Engineering Notes, vol 6, no 1, January 1981) on the October 1980
four-hour collapse of the Arpanet, as a result of accidentally propagated bogus
status messages that could not be garbage collected, and (2) the possibility of
intentional insertion of a malicious effect.  The latter possibility has been
discounted by AT&T, but I observe somewhat tangentially that if an effect
(e.g., a fault mode or vulnerability) can be triggered accidentally, in many
cases it could alternatively have been caused intentionally.  This was indeed
the case in the Arpanet collapse, which was completely accidental.

The shuttle glitches included the spurious sounding of various alarms as well
as the sudden rolling over of the spacecraft, among others.

The Morris trial awaits the final summations, the jury's decision, and the
verdict, probably Monday or Tuesday.

The San Jose indictment involves three people accused varyingly of certain
malicious computer-related and/or telephone-related security violations.  This
case will presumably drag on for a long time.

In each of these cases, RISKS will look forward to some definitive,
nonspeculative reports, i.e., first-hand analyses by people involved, rather
than just press clippings.  The article by Eric Rosen on the Arpanet outage,
noted above, and the article by Jack R. Garman, on the first shuttle
synchronization problem, The Bug Heard 'Round the World, Software Engineering
Notes, vol 6 no 5, October 1981, are superb examples of what I have in mind.
Those of you who haven't seen those articles really should.  They are
absolutely required reading for all RISKS folks.

Our sendmail time-out multiple-copy problem may or may not still exist (I
believe all known fixes and some other hopeful improvements have been
installed).  However, several of you suffered multiple copies of RISKS-9.59,
apparently because of a gateway glitch caused by someone's overflowing mailbox.
(I continue to further subpartitioned the mailing list, in an attempt to
isolate or minimize the problem, but even with small sublists it seems to
continue sporadically.)  I apologize for the frustration that results from
multiple copies.  My own frustration is most considerable when I cannot do
anything about the problem.  

Above all, the difficulties of getting concurrent processing programs flaw-free
is illustrated by almost all of the above-mentioned cases, the AT&T slowdown,
the Internet Worm, the first shuttle synchronization problem, and the Arpanet
collapse.  PGN

------------------------------

Date:  Wed, 17 Jan 90 09:37 EST
From: WHMurray.Catwalk@DOCKMASTER.NCSC.MIL [WHMurray@DOCKMASTER.ARPA]
Subject:  AT&T Failure

The assertion by AT&T, "in an effort to allay customer fears about the networks
reliability," that the outage was "traced to a single computer program," not
only fails to reassure me, but alarms me greatly.  It suggests a serious
failure on their part to understand the nature of the problem.

While the proximate cause of this problem was an error in the design or
implementation of a single program, the actual cause is a system that is unable
to isolate failing components, and indeed that specifically designed to
propagate the failures in such a way as to cause the failure of the entire
system.

This is the second event in as many years to demonstrate the failure of the new
telephone system to cope with the challenges that confront it.  If, on the day
before the Hinsdale fire, you had asked me if the failure of a single central
office could cause the loss of all long distance service to 350,000
subscribers, I would have said "No, we do not build telephone systems that way.
You might lose access to one carrier, but you would retain access to the
others."  Never would I have imagined that Illinois Bell, only partially in
response to the equal access requirements, would centralize all access to all
carriers in a single unattended central office.

Likewise, if on Sunday, you had asked me if a change, error, or manipulation of
a single program in a single switch could bring down the entire AT&T network, I
would have been happy to reassure you that such could not happen.  AT&T has
been the leader in teaching the rest of the world how to avoid such failures.
While an authorized programmer, or even a hacker, might be able to affect a
single switch, the system is specifically designed to prevent the effect from
propagating.  Little did know; little would I have believed.

If AT&T management actually believes its press releases, if they are not simply
propaganda designed to comfort the sheep, then it truly bodes ill for our world
economy.  Of course, part of the difficulty with such propaganda is that you
might yourself forget that that was what it was.  In a large organization like
AT&T, it is likely that at least some of your employees will be taken in.

I recognize that there are limits to our ability to identify and isolate
failing components, that at some point further attempts to do so become
self-defeating.  If AT&T were claiming that this event was, similar to the
second great Northeastern blackout, caused by, or even in spite of, such
measures, I would be less alarmed.  What concerns me is the pretense that such
a failure is so anomalous that special precautions or design considerations are
not indicated.

William Hugh Murray, Ernst & Young

------------------------------

Date: 16 Jan 1990 1108-PST (Tuesday)
From: horning@decpa.pa.dec.com (Jim Horning)
Subject: Telephone malpractice?

Did you have trouble completing long-distance calls yesterday?  Maybe you 
should sue AT&T for malpractice.  Consider the following Congressional 
testimony (on the topic of SDI software) from Solomon J. Buchsbaum, Executive 
Vice President, Customer Systems, AT&T Bell Laboratories, December 3, 1985:

  Some critics have specifically questioned if it is possible to generate 
  great quantities of {\it error-free} software for the system, and to ensure 
  that it is, indeed, {\it error-free software.}

  This is the wrong question. ... Software is always part of a larger system 
  that includes hardware, communications, data, procedures, and people.

  The right question, as well as the key issue, is the broader one of whether 
  the total BM/C3 system can be designed to be robust and resilient in a 
  changing and error-prone environment.  The key, then, is not whether the 
  software contains errors, but how the whole system compensates for such 
  errors as well as for possible subsystem failures. ...

  Can such a large, robust, and resilient system be designed--and not only 
  designed, but built, tested, deployed, operated and further evolved and 
  improved?  I believe the answer is yes.  I seem confident of this answer 
  because most if not all of the essential attributes of the BM/C3 system 
  have, I believe, been demonstrated in comparable terrestrial systems.

  The system most applicable to the issues at hand is the U.S. Public 
  Telecommunications Network. ...

  There are three keys to achieving high reliability, availability, 
  maintainability, and adaptability.

  The first is the use of distributed architectures both for the entire 
  network and for major systems within the network.  The approach 
  compartmentalizes crucial functions in modules throughout the country ...
  
  The second key is the use of redundancy, again both in the entire network 
  and in the component systems.  And the third key is the coupling together, 
  the integration, of all the component systems by means of well-specified, 
  well-controlled interfaces.

  The network as a whole is much more reliable than its individual 
  components.  That's because the network is designed to be fault tolerant.  
  It continuously and automatically checks its own condition.  When a problem 
  is detected, it isolates the faulty component, so that the network can 
  continue to function using a substitute or redundant component.

  For high availability, the public telecommunications network is designed 
  to work at its specified level of performance even when some of its 
  component elements are unavailable. ...

  ... This approach not only reduces software complexity.  It also permits 
  the fullest use of software as a strength, enhancing network flexibility 
  and resiliency.

Perhaps Dr. Buchsbaum envisioned an SDI that might for a significant fraction 
of a day tell over half the incoming ICBMs "We can't handle you right now, 
please keep trying"?

Perhaps Dr. Teller thinks that the problem would go away if we just gave 
everyone a Brilliant Telephone?

Jim H.

------------------------------

Date: Thu, 18 Jan 90 08:24:18 EST
From: r.aminzade@lynx.northeastern.edu
Subject: Risks of Voicemail systems that expect a human at the other end

Last night my car had a dead battery (I left the lights on -- something that a
very simple piece of digital circuitry could have prevented, but I digress), so
I called AAA road service.  I noted that they had installed a new digital
routing system for phone calls. "If you are cancelling a service call press
1,if this is an inquiry about an existing service call, Press 2, if this is a
new service call, Press 3."  All well and good, except that when I finally
reached a real operator, she informed me that the towtruck would arrive "within
90 minutes."  In less than the proposed hour and a half I managed to beg jumper
cables off of an innocent passerby and get the car strarted, so I decided to
call AAA and cancel the service call.

I dialed, pressed 1 as instructed, and waited.  The reader should realize that
my car was illegally parked (this is Boston), running (I wasn't going to get
stuck with a dead battery again!), and had the keys in the ignition.  I was not
patient.  I waited about four minutes, then tried again.  Same result.  I was
now out of dimes, but I noticed that the AAA machine began its message with "we
will accept your collect call..." so I decided to call collect.

Surprise!  I discovered that New England Telephone had just installed _its_
digital system for collect calls.  It is quite sophisticated, using some kind
of voice recognition circuit.  The caller dials the usual 0-(phone number), and
then is asked "If you wish to make a collect call, press 1...If you wish to..."
Then the recording asks "please say your name."  The intended recipient of the
collect call then gets a call that begins "Will you accept a collect call from
<recording of caller stating his name>"

I knew what was coming, but I didn't want to miss this experience.  I gave my
name as something like "Russell, Goddammit!," and NETs machine began asking
AAAs machine if it would accept a collect call (which it had already, plain to
the human ear, said it _would_ accept) from "Russell Goddammitt!".  Ms. NET
(why are these always female voices?) kept telling Ms. AAA "I'm sorry, I don't
understand you, please answer yes or no," but Ms. AAA went blithely on with her
shpiel, instructing Ms. NET which buttons to push.

I stood at the phone (car still running...machines nattering away at each
other) wondering who could do this episode justice.  Kafka?  Orwell?  Groucho?
I was sure that one machine or the other would eventually give up and turned
things over to a human being, but, I finally decided to dial a human operator,
and subject the poor woman to a stream of abuse.  She connected me to AAA,
where I punched 3 (rather than the appropriate but obviously malfunctioning 1),
and subjected yet another underpaid clerk to my wrath.

------------------------------

Date: 	Fri, 19 Jan 90 14:40:17 EST
From: alayne@gandalf.UUCP (Alayne McGregor)
Subject: risks of vote counting

High-tech vote counting unreliable, city decides
<Toronto Globe and Mail (Dec. 16, 1989)>

In the wake of a chaotic election last year, the first Canadian city to use
sophisticated electronic-voting machines is going back to counting votes the
old-fashioned way.  By a 13-3 vote, Toronto politicians decided yesterday to
sell more than $1.6-million worth of optical scanning machines bought for the
November, 1988, municipal election. That race was marred by a recount brought
about by the discovery that 1,408 ballots had been improperly handled by the
machines.

More than a year later, the election is still not over. On Monday, a
court-ordered recount is to be held in the Wards 3 and 4 separate school
trustee race. [In Ontario, separate == Roman Catholic.]  The council's decision
flew in the face of a task force proposal, put forward before Council
yesterday, to double the number of electronic voting machines, to 500 from 250,
in time for the next election. It would have cost an additional $1.6-million.

Not only are the machines more efficient, they are more accurate and provide
quicker results at the end of the day, said Martin Silva, who headed an
elections task force set up last year.  "There's nothing wrong with the
machines," Mr. Silva said in an interview.  He attributed the election problems
to the previous council's decision to buy only half the number of machines
needed.

Toronto was the first Canadian city to use the optical scanning machines, in
a by-election before the 1988 election, said an official in the city clerk's
department. It is also the first to get rid of them, he said.
Other cities using the same machines include North York and Vancouver. A
similar system is also used in Scarborough and Etobicoke, said Robert Clark,
director of legislative services in the Toronto clerk's department.

   = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

A few comments:

>From my experiences in working for candidates in many elections (municipal,
provincial, and federal), I would say that it's not just optical scanning
machines that get confused by ballots. Manual counting depends on the quality
of the people doing the counting. Given that poll clerks and Deputy Returning
Officers (the people who actually count the votes on election day) are patronage
appointments (in Canada), I'm not too hopeful.

And there are the confused rules: Here in Ottawa, we had endless recounts in
one city ward election (first one candidate was declared the winner,
then the other). It finally had to be settled by a new election (won
convincingly by one candidate, thank heavens), after the last recount declared
a tie. And what was one of the big issues in the recounts: whether a
"happy face" beside a candidate was a vote for that candidate!

------------------------------

Date: 19 Jan 90 16:33:23 GMT
From: djm408@tijc02.uucp (David Marks)
Subject: Risks of supermarket checkout scanners

The following appeared in the BUSINESS BRIEFING section of the 22 January 1990
issue of INSIGHT magazine:

          "Supermarket shoppers would be wise to keep a watchful
     eye as their purchases are passed over the checkout scanning 
     systems used in most stores, says the New York State Department
     of Agriculture. The agency reported last month that inspections
     of 33 stores with electronic checkout showed that all but one 
     overcharged their customers.

          'It is in the shoppers' best interest to check that the 
     prices they are being charged are those that are being advertised,'
     says department spokesman Gerald Moore. The problem, he adds, is
     that 'scanners are only as accurate as the people who program them.'
     The department contends that stores are not reprogramming the 
     computerized price scanners frequently enough to reflect sales and
     other price changes. Inspectors posing as customers and purchasing
     some 150 items in each store were overcharged for 3 percent on 
     average. The department kept no tally of whether inspectors were
     overcharged.

          Retailers contend that just as many pricing errors are made in 
     favor of the customer and that overall accuracy oof scanning systems
     is superior to conventional cashier checkout. `Consumers would not 
     accept the technology if they felt it was inaccurate,' says Peter 
     Larkin, spokesman for the Food Marketing Institute, a grocery trade 
     group. To ease consumer anxiety about the technology, many 
     supermarkets offer a guarantee that backs up the accuracy with product 
     giveaways for items consumers find overpriced by the scanner.

          Whether customers come out even in the end is not the issue
     according to Moore. `It shouldn't be a crapshoot. People should be
     charged waht is advertised.'"

     I seems to me that consumers accept the technology not because they feel
it is accurate, but because it makes checkout faster. Also, where I live
Kroger gives your money back on items that are not charged the price marked
on the shelf. However, this still puts the burden on the customer. In the
"old days," you could check the price marked on the product against the one 
rung up. Now you have to write down the price on the shelf and compare it at 
the checkout. When there are lots of people impatiently waiting in line behind 
you, are you really going to do that?
     
David J. Marks M/S 3520 Texas Instruments, Johnson City, TN. 37605

------------------------------

Date: Thu, 18 Jan 90 17:58:04 BST
From: Brian Randell <Brian.Randell@newcastle.ac.uk>
Subject: European R&D in Road Transportation

Today's Guardian Newspaper contains an article by Rex Malik, a well-known UK
commentator on the computer scene, entitled "Every Move You Make ...",
discussing possible implications of two current European R&D initiatives
concerning the use of computers in road transportation, DRIVE and PROMETHEUS.
As I understand it, DRIVE is an EC (European Community) initiative that has
been largely motivated by concerns about social and environmental impacts of
road traffic, whereas PROMETHEUS is a EUREKA project (which means that it is
collaboratively sponsored by various European national governments directly)
that is backed mainly by the automative manufacturers.
                                                            
Brian Randell, Computing Laboratory, University of Newcastle upon Tyne, UK

PS: Some years ago Malik authored a book entitled "And Tomorrow .. The World?
Inside IBM" (Millington, London, 1975), which - how shall I put it - redressed
the somewhat uncritical view of IBM to be found in Thomas J.  Watson Jr's "A
Business And Its Beliefs: The Ideas That helped build IBM" (McGraw-Hill, New
York 1963). I recommend both - but only if read together! (This
"recommendation" is not intended to have any specific explicit relevance to
RISKS :-)

     =====
 
 Contracts for the EC's Drive (Dedicated Road Infrastructure for
 Vehicle Safety in Europe) programme preliminary phase were awarded
 earlier this year....
 ...
 Drive is a major EC pre-competitive research and development
 programme.  It is almost a cousin of the EC's Esprit, and it is likely
 to have radical consequences.
 ...
 Drive is an attempt to work out how to use information technology to
 create the infra-structure needed to help reduce the EC's road
 accident-related death and injury figures, (currently around 55,000 of
 the former and 350,000 of the latter), reduce environmental pollution
 and improve road traffic efficiency by enmeshing the road system in a
 web of IT.
 
 That's the radical bit.
 
 The parallel Prometheus programme aims to add electronics - automatic
 guidance and navigation systems - to vehicles, creating "smart cars"
 permanently linked to the Drive traffic environment management control
 systems.
 
 The long-term aim is to transform driving and driving conditions.  But
 when put in the context of other trends, these technical developments
 may have different connotations.
 
 The initial Drive contracts are for the specification phase.  Proposal
 T416, for example, is for a "Black Box", a Vehicle Journey Data
 Recorder - an improved digital version of the tachographs which
 monitor the driving times of professional drivers.
 
 But T416 goes much further.  It aims to produce a record for
 non-professional drivers giving the exact trajectory of a vehicle
 during the last 1,000 meters preceding an accident.
 
 This system is likely to operate within a road management environment
 in which vehicles will be linked at all times via a radio
 telecommunications network which will provide information about routes
 and traffic conditions - which, in the longer term, may over-ride
 systems of driver control.
 
 It could effectively turn vehicles into part of instantly assembled or
 disassembled what-you-might-call electronically-coupled trains.
 
 It could also provide a running record that we lawfully behave by
 recording in real time where and who we are when we leave our own
 private world and enter society's.
 
 This traffic management environment will operate across national
 frontiers.  You thought you could get away?  You thought wrong.
 Orwell may well have been right in intent even if he wasn't with the
 date, technology or politics.
 
 We are, with the best of motives - what after all could be a better
 motive than the saving of lives in massive numbers? - and for the
 greatest good of the greatest number (where have I heard that before?)
 creating the conditions which underlie the Orwellian nightmare.
 
 Whether we operate them in the Orwellian way is, of course, a
 different issue.
 ...

------------------------------

Date: Thu, 18 Jan 90 09:15:42 est
From: Dave Horsfall <dave@stcns3.stc.oz.au>
Subject: Old habits die hard

Taken from the "Sydney Morning Herald" 15 Jan 90:

``A [Sydney] reader recalls his time in Zimbabwe, when computer setting
  was installed at the country's main commercial printers.  A supervisor
  from the hot-metal printing days had always used a mallet to jog the
  linotype machines back into action, and found that old habits die
  hard.  The result?  A technician flown in from Johannesburg to repair
  a badly bruised computer.''

[ Not so much a risk to the public from computers as a risk to the
  computer from the public, I guess ]

Dave Horsfall (VK2KFU),  Alcatel STC Australia,  dave@stcns3.stc.oz.AU

------------------------------

End of RISKS-FORUM Digest 9.61
************************