Subject: RISKS DIGEST 14.16
REPLY-TO: risks@csl.sri.com

RISKS-LIST: RISKS-FORUM Digest  Tuesday 8 December 1992  Volume 14 : Issue 16

        FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS 
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

  Contents:
Name confusion and its implications -- PART ONE (Don Norman, Guest Moderator, 
  with contributions from Will Taber, George Buckner, Eric Johnson, Brian 
  Hawthorne, Russell Aminzade, Bob Frankston, Chris Hibbert)
  [PART TWO IS IN RISKS-14.17.]

 The RISKS Forum is moderated.  Contributions should be relevant, sound, in 
 good taste, objective, coherent, concise, and nonrepetitious.  Diversity is
 welcome.  CONTRIBUTIONS to RISKS@CSL.SRI.COM, with relevant, substantive 
 "Subject:" line.  Others may be ignored!  Contributions will not be ACKed.  
 The load is too great.  **PLEASE** INCLUDE YOUR NAME & INTERNET FROM: ADDRESS,
 especially .UUCP folks.  REQUESTS please to RISKS-Request@CSL.SRI.COM.     

 Vol i issue j, type "FTP CRVAX.SRI.COM<CR>login anonymous<CR>AnyNonNullPW<CR>
 CD RISKS:<CR>GET RISKS-i.j<CR>" (where i=1 to 14, j always TWO digits).  Vol i
 summaries in j=00; "dir risks-*.*<CR>" gives directory; "bye<CR>" logs out.
 The COLON in "CD RISKS:" is essential.  "CRVAX.SRI.COM" = "128.18.10.1".
 <CR>=CarriageReturn; FTPs may differ; UNIX prompts for username, password.

 For information regarding delivery of RISKS by FAX, phone 310-455-9300
 (or send FAX to RISKS at 310-455-2364, or EMail to risks-fax@cv.vortex.com).

 ALL CONTRIBUTIONS CONSIDERED AS PERSONAL COMMENTS; USUAL DISCLAIMERS APPLY.
 Relevant contributions may appear in the RISKS section of regular issues
 of ACM SIGSOFT's SOFTWARE ENGINEERING NOTES, unless you state otherwise.

----------------------------------------------------------------------

Date: Tue, 8 Dec 1992 11:22:19 -0800
From: Don Norman <norman@cogsci.ucsd.edu>
Subject: Name confusion and its implications.

In RISKS-14.12 (30 November 1992), Jerry Leichter and I independently
discussed the problems of name confusion -- where two different people might
have identical names and (in at least one case) identical birthdates.  Our
contributions produced a large number of responses -- it took 76 single-spaced
pages to print them all out!  Peter Neumann, moderator of RISKS, asked me to
provide a summary. This is it.

Before I review the individual comments, let me summarize my own views, which
have become much enriched by this interaction. Because the topic is so
complex, the issues can only be dealt with fairly in a rather lengthy review.
I apologize for the length of this contribution to RISKS, but a shorter
treatment would be unfair, and even this may be too short to do justice
to all the issues.

First: an executive summary, in bullet format:

* No single, simple solution is possible. The issues are too complex. They
involve legal, moral, religious, and cultural factors that vary radically
across the United States, North America, and the world. The choice of names
creates intensely emotional responses: names define a person's self image and
culture.

* Names serve two functions: 1. Cultural and self-image; 2. Societal
identification. If we separate these functions, then the discussion is much
simplified.

* Societal identification leads to issues of privacy. Privacy issues are
complex. Privacy is a culturally-based notion. What one person considers
intensely private, another might consider public business. Some cultures
simply cannot understand another's concern for privacy in some matters and
lack of concern in others. 

* Privacy also divides into several distinct areas: 1. Reliability and
accuracy; 2. Misuse; 3. Privacy. Discussion of these issues is simplified
if the different concerns are separated.

* Finally, once again, the issues are so complex that no single, simple
solution is possible.

  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

This debate started with an analysis of the non-unique character of names
and the statement that it didn't seem too onerous to require individuals to
select unique names. Note that the suggestion still allowed people the same
freedom in their choice of names as they now have, adding only the
requirement that they be unique. I am now convinced that this belief is
wrong -- names are critically important for a person or the family's self
image and cultural values. It would indeed be onerous to impose
artificial conventions -- no matter how well intentioned and gently
enforced -- on name selection.

People's names derive from a wide variety of sources and serve a wide variety
of purposes. Today, they are an essential component of self identity and
self-image. It is dangerous to tamper with them. The real issue is that names
serve so many functions that coupling the problem of unique identification
with that of a person's name simply adds confusion.  If we separate the
self-image, self-identity function of the name from that of societal
identification, then the problem simplifies. Let people be free to assign
their children or themselves any reasonable name consistent with their
culture, religion, and self values (and that is not deemed immoral or
improper by society). It would not matter if there were multiple people with
the same name. But then we must devise some other means of identification for
society. Moreover, there is no need to have a single scheme: we might have
different identifications for different purposes, thus helping thwart possible
misuse. (Credit: The suggestion to separate the identification aspect of a
name from its self-identity comes from several correspondents: the texts of
their suggestions are appended to the latter part of this message.)

This now raises the question of how we can invent a unique identification
scheme that addresses the problems of accuracy, misuse, and privacy. I also
believe strongly that the identifier should be a humane one -- easy to learn,
easy to
use. And it had better be one that is difficult to counterfeit. I suspect that
this means that the identifier will have to have several versions: A simple
one for non-critical usage (e.g., checking out library books or charging small
amounts of money), but more complex and with more encryption and other
personal identifiers when it comes to critical items.  Here I can imagine the
identification supplemented by various cryptographic schemes (the FBI and NSA
permitting), including the use of random voice segments, or
retinal/fingerprint/DNA scans -- the technical issues should be discussed
elsewhere. Many would argue that databases for different functions be
separated, not allowed to be interconnected (maybe with different, encrypted
identification schemes so as to avoid possible misuses).

UNIVERSAL IDENTIFICATION

The discussion about names soon became a discussion about universal
identification and the many issues associated with that. So now, a digression
into those issues. Concern about access to personal records can be divided
into at least three areas: accuracy, misuse, and privacy. These three
different topics often get confused in discussion, but I think we will make
more progress if we separate them:

ACCURACY OF RECORDS: If we rely on databases for credit ratings, police
checks, medical records, and other aspects of modern life, then those
records must be accurate. All too often they are erroneous, either by
having incomplete, inaccurate, or fallacious information or by combining
records of different individuals.

MISUSE: When people ask me what the problem is when others have access to
personal files, I do not have a good answer. The problem, I think, is that in
the United States, we do not practice what we preach. We claim that we have
religious, political, sexual, and racial freedom, but we do not. If we really
had that freedom, then maybe we wouldn't need so much privacy. If people wish
to be adulterers, or gay, or communist, or purple-skinned, or to subscribe to
(legal) pornographic magazines or films -- whatever -- they should not be
ashamed to let others know. But that is not our society, regardless of what
the laws might state: Their lives would become intolerable if people knew (or
thought they knew) that kind of information.  Similarly, people shouldn't care
if their employer knew their medical history. Unfortunately, they have to
care, because they might get fired because their employer had erroneous
beliefs about the implications of the history.

One common complaint is that it is possible to learn a lot about someone and
then pretend to be them, gaining financial or other benefits (at their
expense). But this is not a problem with open access to records. Rather the
problem is that of insufficient verification of an individual's identity.  To
solve the identity problem we need other means -- a complex issue, but
nonetheless, one that in principle can be separated from the general issue of
privacy.

PRIVACY: Part of the privacy issue comes from the potential for misuse. But
some of it is because some people wish to keep their activities known only
to themselves or their close associates. Presumably this should be
permitted within the bounds of public safety and public good and as long as
their activities do not violate the laws of the country. The problem is
that many will argue (correctly, in my opinion) that the definitions of
"public safety" and "public good" are vague and question who has the right
to make those decisions; others will question a person's rights when the
laws of the country are thought to be immoral or improper. The reverse of
privacy is secret collection of information -- when the person about whom
the information is collected is not permitted access to the material, or in
some cases, is even unaware of the fact that such a collection exists.

UNIQUE IDENTIFIERS

But even given all these concerns about privacy, geopolitical groups and
countries will need unique identifiers for their citizens, if only for
legitimate societal concerns -- e.g., licensing for some activities (for
driving, flying, voting, being a medical doctor, ...), or keeping track of
people (voting, social security benefits, income tax). The convenience of
credit cards requires some unique identifiers. All this seems to require some
sort of central clearing house to ensure uniqueness. However, unique
identifiers have both virtues and difficulties, perhaps best summarized by
Peter Neumann (not as RISKS moderator, but as contributor, in a private note):

  There is no easy answer.  User Identification systems (UIDs) can
  disambiguate and could have staved off many of the ugly false arrest cases
  and other mistaken identities (e.g., see my CACM Inside Risks column of Jan 
  1992), if they were used properly.  UIDs can also create many disasters,
  particularly when they are abused or not used properly.

One interesting fact I discovered was that the United States Social Security
Number (SSN) is NOT a good choice of personal identifier for technical
reasons. Forget all the civil libertarian concerns -- it is simply a crappy
piece of technology, poorly implemented at that. Think about it: only nine
digits to register 250 million people -- that's only a factor of four leeway,
much of which is used up by non-used digits, etc. No check-digits, and such a
dense packing of the encoding that any random guess or simple typing or memory
error is apt to lead to someone else's account.
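
The arithmetic behind that complaint can be sketched in a few lines of
Python. This is only an illustration: the population figure is the one used
above, the uniform-assignment assumption is a simplification, and the Luhn
check digit is an assumed example of what a check digit could do (it is not
part of any real SSN scheme).

```python
# Illustrative sketch only. The Luhn scheme is an assumed example of a
# check digit; real SSNs carry no check digit at all.

POPULATION = 250_000_000   # figure used in the text
ID_SPACE = 10 ** 9         # nine decimal digits


def hit_probability(population: int, id_space: int) -> float:
    """Chance that a random guess lands on some assigned number,
    assuming numbers are spread uniformly over the whole space."""
    return population / id_space


def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk from the rightmost payload digit, doubling every other digit.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def is_valid(number: str) -> bool:
    """True if the last digit is the Luhn check digit of the rest."""
    if not number.isdigit() or len(number) < 2:
        return False
    return luhn_check_digit(number[:-1]) == int(number[-1])
```

With no check digit, roughly one mistyped number in four still names a real
person under these assumptions. With even one check digit, any single-digit
typing error yields a number that fails validation instead of silently
matching someone else's account.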

Chris Hibbert supplies an excellent discussion of social security numbers
in his FAQ (Frequently Asked Questions). SSNs are intended to be unique, but
duplicates are issued now and then (it's happened in fewer than a hundred
documented cases). When the Social Security Administration discovers this,
it issues a new number to one of the people.

Reference: Hibbert, C. (Oct. 27, 1992). What to do when they ask for your
Social Security Number. "Social Security Number FAQ (Frequently Asked
Questions)."  uunet news groups: alt.privacy, misc.legal, news.answers,
alt.society.civil-liberty, comp.society.privacy. (hibbert@xanadu.com) (Copy
provided me by Esther Lumsdon.)

 - - - - - - - - - - - - - - - -

SUMMARY OF SUBMISSIONS
Note that I have not included all submissions, just the ones that made
unique points. I have deleted considerable material from each response
(else this document would run 70-90 pages), but aside from deletions, a
spelling-check (I did correct spelling errors), and a few minor
typographical edits, I have made no alterations.  [PGN did a little
cosmetic work as well, and caught a few more mispelings.]

Any line preceded by ">" comes from my original contribution to RISKS (except
for the notorious Internet mail scheme that will precede the word "From"
with ">" if it is the first word on a line -- another hack reminding us of
our UNIX legacy.) All comments by me are preceded by the phrase: 

COMMENT BY DN: 
At times, it may be difficult to distinguish the boundaries of the
contributions and my comments from the summaries. In an ideal world I would
use indentation and different type fonts to make the distinction clear. But
Internet is restricted to plain ASCII, so different fonts are out. And my
mail system (Eudora), for all its virtues, is not WYSIWYG, so I can't use
indentations -- not reliably anyway. (Sometimes I dream I am still using
Emacs, but then I pinch myself and wake up.)

 - - - - - - - - - - - - - - - -

NAMES AS SELF-IDENTITY

From: Will Taber <Will_Taber@dgc.com>

The most important function of any name is to identify us to ourselves. A
person's name is a significant factor in how they come to see themselves.
At some level, parents tend to pick names that reflect the person that they
hope that their child will become. My wife and I had a hard enough time
deciding on names that we liked. I am sorry, but any kind of global
registry is just out of the question. Some things just are not worth
changing for the convenience of database designers.

 - - - - - - - - - - - - - - - - - - - -

From: "George Buckner" <GRB@NCCIBM1.BITNET>

Excuse me, but are we now proposing to have all names verified for uniqueness
through a central database? What about the freedom of expression that would be
restricted by such a scheme? I would insist that it is the arbitrary
assumptions underlying the program logic (or lack thereof), not the person's
name, which need to be changed.

This is really another example of "risk of invalid assumptions". We see
examples posted here all the time (e.g., programs which assume that no
transaction will require more than N digits to accommodate). In this case an
assumption was made that the full name, together with the birthdate and city
of residence, would guarantee uniqueness.

... the problem is in assuming that you can guarantee uniqueness via a
combination of name/birthdate/hair color/zodiac sign/whatever. It is these
assumptions which impose arbitrary restrictions on data values. As Jerry
Leichter pointed out, the safest way to handle this is to assign a unique
number to each instance of an object in a file. This is nothing new --
businesses have been assigning unique customer ids/account numbers to people
for years.  Thus it doesn't matter how many John Q. Smiths, born on the same
day, living in the same city, are contained in the database.
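
A minimal sketch of that idea in Python, assuming a toy in-memory file; the
class and field names (Registry, Person, person_id) are hypothetical and
chosen only for illustration. The descriptive fields are free to collide,
while the assigned number alone distinguishes the records.

```python
from dataclasses import dataclass
from datetime import date
from itertools import count


@dataclass(frozen=True)
class Person:
    person_id: int  # hypothetical surrogate key, assigned by the registry
    name: str
    born: date
    city: str


class Registry:
    """Toy in-memory file of people keyed by an assigned number."""

    def __init__(self) -> None:
        self._ids = count(1)   # yields 1, 2, 3, ...
        self._people = {}      # person_id -> Person

    def add(self, name: str, born: date, city: str) -> Person:
        """Store a new person; name, birthdate, and city need not be unique."""
        person = Person(next(self._ids), name, born, city)
        self._people[person.person_id] = person
        return person

    def get(self, person_id: int) -> Person:
        """Look up exactly one person by the assigned number."""
        return self._people[person_id]

    def matching(self, name: str, born: date, city: str) -> list:
        """Everyone who shares the descriptive fields; may be more than one."""
        return [p for p in self._people.values()
                if (p.name, p.born, p.city) == (name, born, city)]
```

A lookup by name, birthdate, and city returns a list, which makes the
possibility of several matches explicit rather than assuming it away.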

> Would a world-wide registry for names work? 
Perhaps, but it is quite unnecessary, and quite undesirable:

"I'm sorry sir, but the computer has rejected the name you have chosen
 for your child. It isn't unique."

> This is a serious request. how can we invent unique identifiers for people
> that
> 1. Make it easy to select a name
 Select whatever name you choose. To hell with arbitrary
 programming assumptions. Incorporate a separate field to hold
 the unique ID.
> 2. Work for an entire country and potentially scale to the entire world
 Moot question, if globally unique identifiers, SEPARATE FROM THE
 NAME, are used. If it's unique globally then by definition it will
 be unique within a country.
> 3. Do not violate civil liberties
 This is a matter of coordinating policy with implementation.
 Use of unique numbers (SSN) doesn't NECESSARILY infringe on civil
 liberties, though it may make such violations easier. Likewise,
 absence of these identifiers doesn't guarantee that our liberties
 are safe.
> 4. Do not make it possible for others to misuse the system
 This is related to point #3 above, and is thus another matter of
 coordinating policy with implementation.

> In other words, how do we get the benefits and avoid the risks?
 Yes, we always want to have it both ways, and to some degree, perhaps
 we can. But remember that we are trying to accommodate two quite
 divergent (if not mutually exclusive -- at least on the surface)
 imperatives.

 And the price of liberty is STILL eternal vigilance.

 - - - - - - - - - - - - - - - -

From: Eric Johnson <johnsone@camax.com>

* There are also cultural factors. In some communities (I think Australian
Aborigines, but I may be mistaken), people take on different names in the
course of their lifetimes. For example, if a close relative dies, the
survivors may change their names.  After a mourning period, the survivors may
change their names again and may not go back to the original names.

* Icelanders still use the father's or mother's name as their last name, with
the "daughter" ("dottir") or "son" ("sson") suffix.  Thus, the last name
changes for every generation (by sex). I think Russians may still have part of
their names based on this. Also, Spanish names typically use the mother's and
father's last name, I believe.  These naming schemes may be tied into a
person's cultural identity.  In such a case, these people may not want to
change.

* Many immigrants to the USA had their names changed at the border by
insensitive/confused immigration officials. Many people hold a strong cultural
identity in their names. It seems that your proposal would hinder this.

* What about people who wish to name their children, especially sons,
after their parents? (Just read 100 Years of Solitude to see a lot of 
common names :-) Some examples:
 William William Williams, Sr. (Bill Bill Bill :-)
 William William Williams, Jr.
 William William Williams III

* Marriage: This is still a touchy issue in the USA. Many people desire to
change their last name when they marry (many also do not).  In such cases,
people won't care much about your national/international name registry. Even
if the names were unique in the beginning, the changed name may not be. Do you
think the religious right will go for this?

* Religion: Some people change their name when they a) convert religions or b)
go through some religious experience. To use an example from a current movie,
Malcolm Little changed his name to Malcolm X when he converted to Islam (or at
least to the Nation of Islam's Islam).  The "X" also had political
significance (the theory--at least from X's autobiography--was that the TRUE
last name was wiped out by whites in the past, so that the X acts as a
placeholder until--presumably--God will provide the real name). After Malcolm
X went on the Hajj to Mecca, he changed his name again (to something like
El-Hajj Malik Shabazz; his wife still calls herself Betty Shabazz). Many
people also want to name their children after religious names (Rebecca,
Matthew, Joseph, etc.). Eduard Shevardnadze recently was baptized in Georgia
(the country, not the state). His new religious name is Georgi.

 - - - - - - - - - - - - - - - -

From: Brian.Hawthorne@east.sun.com (Brian Holt Hawthorne - SunSelect
Engineering - Norwood)
.....

An identifier should be static data; names are dynamic. Within a computer
database, a personal name should be considered as stable as such things as
weight, height, or hair color. Nobody would think of making these the primary
identifiers of a person in anything other than short-term databases.
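
One way to picture this distinction is a record whose key never changes
while its names accumulate over time. The following Python sketch is an
assumption-laden illustration: the class names, the "kind" labels, and the
sample data in the usage note are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class NameEntry:
    name: str
    kind: str    # hypothetical labels: "legal", "nickname", "religious", ...
    since: date


@dataclass
class Record:
    record_id: int                              # static; never reassigned
    names: list = field(default_factory=list)   # dynamic; grows over time

    def add_name(self, name: str, kind: str, since: date) -> None:
        """Record a new name without touching the identifier."""
        self.names.append(NameEntry(name, kind, since))

    def current_name(self, kind: str):
        """Most recent name of the given kind, or None if there is none."""
        entries = [n for n in self.names if n.kind == kind]
        return max(entries, key=lambda n: n.since).name if entries else None
```

For instance, a record could hold a legal name from birth, a later legal
name after marriage, and a nickname in parallel; record_id stays the same
throughout, and no name has to be abandoned to suit the database.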

Just in mainstream U.S. culture, there are an unending number of events that
may lead to a change in name: birth, Catholic confirmation, marriage,
Amerindian rites of passage, divorce, adoption, etc. Taken in combination, it
is nearly impossible to predict how many names an individual may have, and
what these names may mean to them and to their peers.

It is not up to me as an individual to ensure that my name is convenient to
somebody's database, even, or perhaps especially, if that database is
maintained by the police. If they have a need to enter me in their data, it is
their responsibility that that entry be distinguishable from others.  All of
the other data you suggested (birth date, birth place, etc.) are much better
identifiers than my personal name, as it is unlikely that I will be able to
change my birth date once it has occurred!

In order to provide any serious answers to Don's serious request for a way to
invent unique identifiers for people, we need to decide what these identifiers
are going to be used for.

For casual conversation, there is no problem, we already know how to do this.
"John Smith. No, not that one, John Smith from Poughkeepsie. With the blue
eyes."

In written form, we have also solved the problem.
"Don Norman from the Cognitive Science Department of the University of
California, San Diego at La Jolla."

This leaves open the need for identifiers in databases (whether computerized
or not). The requirements of these identifiers differ for different databases.
Do they need to be unique for all individuals in the world? In that case, the
identifier probably needs some location information. Do they need to be
inherently bound to an individual? In that case, there probably needs to be
some birth, fingerprint or other static information in the identifier itself.

None of these requirements imply that personal names would make a good
identifier for such purposes.
....

P.S. Along the lines of databases asserting control over personal
information, if you were receiving this message directly, instead of in a
digest, you would see that my email address had been changed from
rowan@sea.east.sun.com to brian.hawthorne@East.Sun.COM and that my name had
been changed from Rowan Hawthorne to Brian Hawthorne. This is because the
corporate machine (at least the human parts) fails to understand that some
of us have an additional name, known as a "nickname". My friends call me
Rowan, people on the net call me rowan, but since the IRS considers me to
be Brian the company follows suit. I suppose I could change my name legally
to Rowan, but I enjoy having the two different names for different
purposes. Forcing me to either abandon my nickname, or adopt it in all
situations seems a bit draconian.

 - - - - - - - - - - - - - - - -

From: "Russell Aminzade: Trinity College of VT" <AMINZADE@UVMVAX.BITNET>
....
I would NEVER grant any authority the right to select a name for me, but I'm
quite comfortable having my ADDRESS assigned by someone. 
...
Norman states that "Names.. are a technological invention to make it easy to
identify people uniquely." Jerry Leichter suggests something similar: "Any
society has to have a way to identify individuals.  At one time, when the scale
of society was small, a single name, plus perhaps a city of origin, or a
parent's name, or a job name, was enough identification..." But names do much,
much more. Names define who we are emotionally, socially and spiritually. I
know a family that named their daughters Hope, Faith, and Charity. My family
named the three boys Robert, Ronald, and Russell for quirkish reasons. And
what about names like "He Who Conquers" or "Dances With Wolves"? Something
more than a unique identifier here.

Addresses, however, fit the description that Norman and Leichter use. They are
bureaucratic, assigned by others. We can accept an address that has something
like 05667 or @UCSD.EDU. I'm quite comfortable with AMINZADE@UVMVAX.UVM.EDU,
but I doubt my parents would have been willing to give me that name. Makes it
lots easier to develop a unique identifier.

A personal "address" begins to make sense when telephone, mail, and other
information services are smart enough to route our communications to the
person rather than to the device. We're at the threshold of being able to do
this now, so perhaps now is the time to talk about personal addresses.

(Also see the comment by Roy G. Saltman in RISKS-14.15)

 - - - - - - - - - - - - - - - -

COMMENT BY DN: I start with a question:

From: Bob_Frankston@frankston.com
....
 In countries that do allow the use of national identifiers in databases, are
they universally used to avoid name confusions?

COMMENT BY DN: I forwarded the question to Chris Hibbert. He responded:

From: hibbert@xanadu.com (Chris Hibbert)

This is a very broad question, but there's a little bit that's worth saying.
Countries that have good universal unique identifiers fall into two camps:
mostly free, and mostly not. In the ones that are mostly free, (Canada, e.g.)
the SIN is used in government databases, but cross-matching isn't routinely
allowed, and private companies don't seem to use the same number as the
government. (I don't actually have solid references to back this up, but I've
read the annual reports of the Canadian Privacy Commissioner, and am
extrapolating from the kind of complaints they do and don't get compared to
the US.)

In more intrusive societies, the same identifiers are used throughout the
government, and the government pushes for the efficiency that results from
consolidated databases, so data matching isn't even a concern. The government
of Thailand (I think) has a single database which all the government
departments share access to. This makes it possible for someone in one
department to make a decision based on your interactions with another
department. (This generalization is based on even less information.)

Anyway, my bottom line is that I believe there's a lot of benefit in not using
the same identifier in our dealings with different parties.  That's behind a
lot of my attention to SSNs. They wouldn't be a problem if they were only used
by the IRS or the SSA, but the fact that my employer can find out about my
medical history, or (in some states) someone who sees my driver's license can
find out my credit rating or pretend to be me and ruin it is a real weakness
of this system.
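
As a rough sketch of what "a different identifier for each party" could look
like in practice, the Python below derives per-party identifiers from one
secret using HMAC-SHA256. This is an assumed technique chosen for
illustration, not something proposed in the message above; who would hold
the secret (the individual or some issuing body) is left open, and the
function name and 16-character truncation are arbitrary choices.

```python
import hashlib
import hmac


def party_identifier(master_secret: bytes, party: str) -> str:
    """Derive a stable identifier for use with a single party.

    The same secret and party always give the same identifier, but
    identifiers for different parties cannot be linked to one another
    without knowing the secret.
    """
    digest = hmac.new(master_secret, party.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]   # truncated for readability (assumption)
```

Under this sketch the employer, the insurer, and the motor vehicle
department would each see a different number for the same person, so a
number copied from one context is of no use for looking up records in
another.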

 - - - - - - - - - - - - - - - -

From: xanadu!hibbert@uunet.UU.NET (Chris Hibbert)
...
Jerry Leichter lays the problem at the feet of the individuals being
identified. 
    If [Steven Reid] stands by his insistence that "Steven Reid,
    born on xx/yy/zzzz" is all the identifying information he
    will give, he cannot expect to be distinguished from the
    other Mr. Reid who just as adamantly insists on his right to
    identify himself in the same way.

In my experience, (e.g., the case of Terry Dean Rogan) people in this
situation are eager to be distinguished, and go to enormous trouble to
convince the authorities to support them in this.  The problem becomes
intractable when the authorities involved insist that they have to treat the
identifiers stored by their computers as if they provide secure unique
identification.  If the computer systems were made more flexible they could
store more information and allow people to work around the problems when they
arise.  In other situations, the relevant authorities ignore the extra
distinguishing info that is there.  Robert Ellis Smith's compendium of Privacy
War Stories lists numerous cases where the authorities arrested someone with a
name similar to one on a warrant, even though the physical descriptions were
very different.  (As much as a foot in height, 100 pounds in weight, differing
hair color, etc.)  What's a person to do if using any variant of your own name
is close enough to match against someone the police are looking for?

The Montreal police also need to realize that in a metropolitan area like
that, they have to be aware that names aren't close to unique.  I don't know
if they have a large Korean or other Asian community there, but common names
there are much more common than in groups descended from western european
societies.  And in these cases, physical descriptions aren't going to help
much.  (I didn't say "they all look alike", I said our western system of
classifying people doesn't distinguish orientals at all.  Hair color, eye
color, height don't vary in the way we're used to, and most caucasian police
are much worse at guessing ages of orientals who turn gray, go bald, and
wrinkle in different ways than whites.)

    [THIS SPECIAL DOUBLE ISSUE IS CONTINUED IN RISKS-14.17.]

------------------------------

End of RISKS-FORUM Digest 14.16
************************