Subject: RISKS DIGEST 14.16
REPLY-TO: risks@csl.sri.com

RISKS-LIST: RISKS-FORUM Digest  Tuesday 8 December 1992  Volume 14 : Issue 16

FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS
ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
Name confusion and its implications -- PART ONE (Don Norman, Guest Moderator, with contributions from Will Taber, George Buckner, Eric Johnson, Brian Hawthorne, Russell Aminzade, Bob Frankston, Chris Hibbert)
[PART TWO IS IN RISKS-14.17.]

The RISKS Forum is moderated. Contributions should be relevant, sound, in good taste, objective, coherent, concise, and nonrepetitious. Diversity is welcome. CONTRIBUTIONS to RISKS@CSL.SRI.COM, with relevant, substantive "Subject:" line. Others may be ignored! Contributions will not be ACKed. The load is too great. **PLEASE** INCLUDE YOUR NAME & INTERNET FROM: ADDRESS, especially .UUCP folks. REQUESTS please to RISKS-Request@CSL.SRI.COM.

Vol i issue j, type "FTP CRVAX.SRI.COM<CR>login anonymous<CR>AnyNonNullPW<CR>CD RISKS:<CR>GET RISKS-i.j<CR>" (where i=1 to 14, j always TWO digits). Vol i summaries in j=00; "dir risks-*.*" gives directory; "bye" logs out. The COLON in "CD RISKS:" is essential. "CRVAX.SRI.COM" = "128.18.10.1". <CR>=CarriageReturn; FTPs may differ; UNIX prompts for username, password.

For information regarding delivery of RISKS by FAX, phone 310-455-9300 (or send FAX to RISKS at 310-455-2364, or EMail to risks-fax@cv.vortex.com).

ALL CONTRIBUTIONS CONSIDERED AS PERSONAL COMMENTS; USUAL DISCLAIMERS APPLY. Relevant contributions may appear in the RISKS section of regular issues of ACM SIGSOFT's SOFTWARE ENGINEERING NOTES, unless you state otherwise.

----------------------------------------------------------------------

Date: Tue, 8 Dec 1992 11:22:19 -0800
From: Don Norman
Subject: Name confusion and its implications.
In RISKS-14.12 (30 November 1992), Jerry Leichter and I independently discussed the problems of name confusion -- where two different people might have identical names and (in at least one case) identical birthdates. Our contributions produced a large number of responses -- it took 76 single-spaced pages to print them all out! Peter Neumann, moderator of RISKS, asked me to provide a summary. This is it.

Before I review the individual comments, let me summarize my own views, which have become much enriched by this interaction. Because the topic is so complex, the issues can only be dealt with fairly by a rather lengthy, complex review. I apologize for the length of this contribution to RISKS, but a shorter treatment would be unfair; even this may be too short to do justice to all the issues.

First: an executive summary, in bullet format:

* No single, simple solution is possible. The issues are too complex. They involve legal, moral, religious, and cultural factors that vary radically across the United States, North America, and the world. The choice of names creates intensely emotional responses: names define a person's self-image and culture.

* Names serve two functions: 1. Cultural and self-image; 2. Societal identification. If we separate these functions, then the discussion is much simplified.

* Societal identification leads to issues of privacy. Privacy issues are complex. Privacy is a culturally-based notion. What one person considers intensely private, another might consider public business. Some cultures simply cannot understand another's concern for privacy in some matters and lack of concern in others.

* Concern about access to personal records also divides into several distinct areas: 1. Reliability and accuracy; 2. Misuse; 3. Privacy. Discussion of these issues is simplified if the different concerns are separated.

* Finally, once again, the issues are so complex that no single, simple solution is possible.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

This debate started with an analysis of the non-unique character of names and the statement that it didn't seem too onerous to require individuals to select unique names. Note that the suggestion still allowed people the same freedom in their choice of names as they now have, adding only the requirement that they be unique.

I am now convinced that this belief is wrong -- names are critically important for a person's or a family's self-image and cultural values. It would indeed be onerous to impose artificial conventions -- no matter how well intentioned and gently enforced -- on name selection. People's names derive from a wide variety of sources and serve a wide variety of purposes. Today, they are an essential component of self-identity and self-image. It is dangerous to tamper with them.

The real issue is that names serve so many functions that coupling the problem of unique identification with that of a person's name simply adds confusion. If we separate the self-image, self-identity function of the name from that of societal identification, then the problem simplifies. Let people be free to assign their children or themselves any reasonable name consistent with their culture, religion, and self values (and that is not deemed immoral or improper by society). It would not matter if there were multiple people with the same name. But then we must devise some other means of identification for society. Moreover, there is no need to have a single scheme: we might have different identifications for different purposes, thus helping thwart possible misuse.

(Credit: The suggestion to separate the identification aspect of a name from its self-identity comes from several correspondents: the texts of their suggestions are appended to the latter part of this message.)
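
[Illustration: the separation described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not a scheme proposed by any correspondent. The `Person` and `Registry` classes, the "LIB" and "DRV" purposes, and the birthdate are all invented for the example. Sequential numbers are used for brevity; they are easy to guess and do nothing about counterfeiting.]

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # The name is free-form and carries no duty to be unique.
    name: str
    birthdate: str


class Registry:
    """Hypothetical registry serving ONE purpose (a library, a licensing office)."""

    def __init__(self, purpose):
        self.purpose = purpose
        self._counter = itertools.count(1)
        self._records = {}

    def enroll(self, person):
        # The identifier is issued by the registry, never derived from the name.
        ident = f"{self.purpose}-{next(self._counter):06d}"
        self._records[ident] = person
        return ident

    def lookup(self, ident):
        return self._records[ident]

    def find_by_name(self, name):
        # A search by name may legitimately return several different people.
        return sorted(i for i, p in self._records.items() if p.name == name)


library = Registry("LIB")      # invented purpose labels
licensing = Registry("DRV")

# Two different people with the same name and the same (invented) birthdate.
first = Person("Steven Reid", "1960-01-01")
second = Person("Steven Reid", "1960-01-01")

id_first = library.enroll(first)
id_second = library.enroll(second)
drv_first = licensing.enroll(first)   # a separate identifier for a separate purpose
```

The two library records stay distinct even though every descriptive field matches, and the licensing identifier shares nothing with the library one, so the two databases cannot be joined on it.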
This now raises the question of how we can invent a unique identification scheme that addresses the problems of accuracy, misuse, and privacy. I also believe strongly that the identifier should be a humane one -- easy to learn, easy to use. And it had better be one that is difficult to counterfeit. I suspect that this means that the identifier will have to have several versions: a simple one for non-critical usage (e.g., checking out library books or charging small amounts of money), but more complex and with more encryption and other personal identifiers when it comes to critical items. Here I can imagine the identification supplemented by various cryptographic schemes (the FBI and NSA permitting), including the use of random voice segments, or retinal/fingerprint/DNA scans -- the technical issues should be discussed elsewhere. Many would argue that databases for different functions should be separated, not allowed to be interconnected (maybe with different, encrypted identification schemes so as to avoid possible misuses).

UNIVERSAL IDENTIFICATION

The discussion about names soon became a discussion about universal identification and the many issues associated with that. So now, a digression into those issues.

Concern about access to personal records can be divided into at least three areas: accuracy, misuse, and privacy. These three different topics often get confused in discussion, but I think we will make more progress if we separate them:

ACCURACY OF RECORDS: If we rely on databases for credit ratings, police checks, medical records, and other aspects of modern life, then those records must be accurate. All too often they are erroneous, either by having incomplete, inaccurate, or fallacious information or by combining records of different individuals.

MISUSE: When people ask me what the problem is when others have access to personal files, I do not have a good answer. The problem, I think, is that in the United States, we do not practice what we preach.
We claim that we have religious, political, sexual, and racial freedom, but we do not. If we really had that freedom, then maybe we wouldn't need so much privacy. If people wish to be adulterers, or gay, or communist, or purple-skinned, or to subscribe to (legal) pornographic magazines or films -- whatever -- they should not be ashamed to let others know. But that is not our society, regardless of what the laws might state: their lives would become intolerable if people knew (or thought they knew) that kind of information. Similarly, people shouldn't care if their employer knew their medical history. Unfortunately, they have to care, because they might get fired because their employer had erroneous beliefs about the implications of the history.

One common complaint is that it is possible to learn a lot about someone and then pretend to be them, gaining financial or other benefits (at their expense). But this is not a problem with open access to records. Rather the problem is that of insufficient verification of an individual's identity. To solve the identity problem we need other means -- a complex issue, but nonetheless, one that in principle can be separated from the general issue of privacy.

PRIVACY: Part of the privacy issue comes from the potential for misuse. But some of it is because some people wish to keep their activities known only to themselves or their close associates. Presumably this should be permitted within the bounds of public safety and public good and as long as their activities do not violate the laws of the country. The problem is that many will argue (correctly, in my opinion) that the definitions of "public safety" and "public good" are vague and question who has the right to make those decisions; others will question a person's rights when the laws of the country are thought to be immoral or improper.
The reverse of privacy is secret collection of information -- when the person about whom the information is collected is not permitted access to the material, or in some cases, is even unaware of the fact that such a collection exists.

UNIQUE IDENTIFIERS

But even given all these concerns about privacy, geopolitical groups and countries will need unique identifiers for their citizens, if only for legitimate societal concerns -- e.g., licensing for some activities (for driving, flying, voting, being a medical doctor, ...), or keeping track of people (voting, social security benefits, income tax). The convenience of credit cards requires some unique identifiers. All this seems to require some sort of central clearing house to ensure uniqueness.

However, unique identifiers have both virtues and difficulties, perhaps best summarized by Peter Neumann (not as RISKS moderator, but as contributor, in a private note):

"There is no easy answer. User Identification systems (UIDs) can disambiguate and could have staved off many of the ugly false arrest cases and other mistaken identities (e.g., see my CACM Inside Risks column of Jan 1992), if they were used properly. UIDs can also create many disasters, particularly when they are abused or not used properly."

One interesting fact I discovered was that the United States Social Security Number (SSN) is NOT a good choice of personal identifier for technical reasons. Forget all the civil libertarian concerns -- it is simply a crappy piece of technology, poorly implemented at that. Think about it: only nine digits to register 250 million people -- that's only a factor of four leeway, much of which is used up by non-used digits, etc. No check-digits, and such a dense packing of the encoding that any random guess or simple typing or memory error is apt to lead to someone else's account.

Chris Hibbert supplies an excellent discussion of social security numbers in his FAQ (Frequently Asked Questions).
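
[Illustration: the density argument about nine-digit numbers can be checked with a little arithmetic, and the value of a check digit shown with a short sketch. The population figure is the one quoted above. The Luhn scheme is a well-known check-digit algorithm swapped in purely as an example: the SSN has no check digit, and nothing here is a proposal from the text. The digit string used is an arbitrary test value, not an SSN.]

```python
POPULATION = 250_000_000        # figure quoted in the text
DIGITS = 9

space = 10 ** DIGITS                    # 1,000,000,000 possible numbers
leeway = space / POPULATION             # the "factor of four"
# Assumption: every person holds one number and every 9-digit value is issuable.
# Under that assumption a random 9-digit guess hits a real number this often:
hit_probability = POPULATION / space


def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left, doubling every second digit starting with the rightmost.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def is_valid(number: str) -> bool:
    """True if the final digit is the Luhn check digit of the digits before it."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


payload = "7992739871"                          # arbitrary test digits
protected = payload + str(luhn_check_digit(payload))
mistyped = protected[:9] + "0" + protected[10:]  # one digit keyed wrongly
```

With a check digit appended, the mistyped number is rejected outright instead of silently selecting someone else's record; the Luhn scheme catches every single-digit error, though not every transposition.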
SSNs are intended to be unique but they goof now and then (it's happened in fewer than a hundred documented cases). When the Social Security Administration discovers this, they issue a new number to one of the people.

Reference: Hibbert, C. (Oct. 27, 1992). What to do when they ask for your Social Security Number. "Social Security Number FAQ (Frequently Asked Questions)." uunet news groups: alt.privacy, misc.legal, news.answers, alt.society.civil-liberty, comp.society.privacy. (hibbert@xanadu.com) (Copy provided me by Esther Lumsdon.)

- - - - - - - - - - - - - - - -

SUMMARY OF SUBMISSIONS

Note that I have not included all submissions, just the ones that made unique points. I have deleted considerable material from each response (else this document would run 70-90 pages), but aside from deletions, a spelling-check (I did correct spelling errors), and a few minor typographical edits, I have made no alterations. [PGN did a little cosmetic work as well, and caught a few more mispelings.]

Any line preceded by ">" comes from my original contribution to RISKS (except for the notorious Internet mail scheme that will precede the word "From" with ">" if it is the first word on a line -- another hack reminding us of our UNIX legacy.)

All comments by me are preceded by the phrase: COMMENT BY DN:

At times, it may be difficult to distinguish the boundaries of the contributions and my comments from the summaries. In an ideal world I would use indentation and different type fonts to make the distinction clear. But Internet is restricted to plain ASCII, so different fonts are out. And my mail system (Eudora), for all its virtues, is not WYSIWYG, so I can't use indentations -- not reliably anyway. (Sometimes I dream I am still using Emacs, but then I pinch myself and wake up.)

- - - - - - - - - - - - - - - -

NAMES AS SELF-IDENTITY

From: Will Taber

The most important function of any name is to identify us to ourselves.
A person's name is a significant factor in how they come to see themselves. At some level, parents tend to pick names that reflect the person that they hope that their child will become. My wife and I had a hard enough time deciding on names that we liked. I am sorry, but any kind of global registry is just out of the question. Some things just are not worth changing for the convenience of database designers.

- - - - - - - - - - - - - - - - - - - -

From: "George Buckner"

Excuse me, but are we now proposing to have all names verified for uniqueness through a central database? What about the freedom of expression that would be restricted by such a scheme? I would insist that it is the arbitrary assumptions underlying the program logic (or lack thereof), not the person's name, which need to be changed.

This is really another example of "risk of invalid assumptions". We see examples posted here all the time (e.g., programs which assume that no transaction will require more than N digits to accommodate). In this case an assumption was made that the full name, together with the birthdate and city of residence, would guarantee uniqueness.

... the problem is in assuming that you can guarantee uniqueness via a combination of name/birthdate/hair color/zodiac sign/whatever. It is these assumptions which impose arbitrary restrictions on data values. As Jerry Leichter pointed out, the safest way to handle this is to assign a unique number to each instance of an object in a file. This is nothing new -- businesses have been assigning unique customer ids/account numbers to people for years. Thus it doesn't matter how many John Q. Smiths, born on the same day, living in the same city, are contained in the database.

> Would a world-wide registry for names work?

Perhaps, but it is quite unnecessary, and quite undesirable: "I'm sorry sir, but the computer has rejected the name you have chosen for your child. It isn't unique."

> This is a serious request.
> How can we invent unique identifiers for people that
> 1. Make it easy to select a name

Select whatever name you choose. To hell with arbitrary programming assumptions. Incorporate a separate field to hold the unique ID.

> 2. Work for an entire country and potentially scale to the entire world

Moot question, if globally unique identifiers, SEPARATE FROM THE NAME, are used. If it's unique globally then by definition it will be unique within a country.

> 3. Do not violate civil liberties

This is a matter of coordinating policy with implementation. Use of unique numbers (SSN) doesn't NECESSARILY infringe on civil liberties, though it may make such violations easier. Likewise, absence of these identifiers doesn't guarantee that our liberties are safe.

> 4. Do not make it possible for others to misuse the system

This is related to point #3 above, and is thus another matter of coordinating policy with implementation.

> In other words, how do we get the benefits and avoid the risks?

Yes, we always want to have it both ways, and to some degree, perhaps we can. But remember that we are trying to accommodate two quite divergent (if not mutually exclusive -- at least on the surface) imperatives. And the price of liberty is STILL eternal vigilance.

- - - - - - - - - - - - - - - -

From: Eric Johnson

* There are also cultural factors. In some communities (I think Australian Aborigines, but I may be mistaken), people take on different names in the course of their lifetimes. For example, if a close relative dies, the survivors may change their names. After a mourning period, the survivors may change their names again and may not go back to the original names.

* Icelanders still use the father's or mother's name as their last name, with the "daughter" ("dottir") or "son" ("sson") suffix. Thus, the last name changes for every generation (by sex). I think Russians may still have part of their names based on this.
Also, Spanish names typically use the mother's and father's last name, I believe. These naming schemes may be tied into a person's cultural identity. In such a case, these people may not want to change.

* Many immigrants to the USA had their names changed at the border by insensitive/confused immigration officials. Many people hold a strong cultural identity in their names. It seems that your proposal would hinder this.

* What about people who wish to name their children, especially sons, after their parents? (Just read 100 Years of Solitude to see a lot of common names :-) Some examples:

William William Williams, Sr. (Bill Bill Bill :-)
William William Williams, Jr.
William William Williams III

* Marriage: This is still a touchy issue in the USA. Many people desire to change their last name when they marry (many also do not). In such cases, people won't care much about your national/international name registry. Even if the names were unique in the beginning, the changed name may not be. Do you think the religious right will go for this?

* Religion: Some people change their name when they a) convert religions or b) go through some religious experience. To use an example from a current movie, Malcolm Little changed his name to Malcolm X when he converted to Islam (or at least to the Nation of Islam's Islam). The "X" also had political significance (the theory -- at least from X's autobiography -- was that the TRUE last name was wiped out by whites in the past, so that the X acts as a placeholder until -- presumably -- God will provide the real name). After Malcolm X went on the Hajj to Mecca, he changed his name again (to something like El-Hajj Malik Shabazz; his wife still calls herself Betty Shabazz). Many people also want to name their children after religious names (Rebecca, Matthew, Joseph, etc.). Eduard Shevardnadze recently was baptized in Georgia (the country, not the state). His new religious name is Georgi.
- - - - - - - - - - - - - - - -

From: Brian.Hawthorne@east.sun.com (Brian Holt Hawthorne - SunSelect Engineering - Norwood)

.....

An identifier should be static data; names are dynamic. Within a computer database, a personal name should be considered as stable as such things as weight, height, or hair color. Nobody would think of making these the primary identifiers of a person in anything other than short-term databases.

Just in mainstream U.S. culture, there are an unending number of events that may lead to a change in name: birth, Catholic confirmation, marriage, Amerindian rites of passage, divorce, adoption, etc. Taken in combination, it is nearly impossible to predict how many names an individual may have, and what these names may mean to them and to their peers.

It is not up to me as an individual to ensure that my name is convenient to somebody's database, even, or perhaps especially, if that database is maintained by the police. If they have a need to enter me in their data, it is their responsibility that that entry be distinguishable from others. All of the other data you suggested (birth date, birth place, etc.) are much better identifiers than my personal name, as it is unlikely that I will be able to change my birth date once it has occurred!

In order to provide any serious answers to Don's serious request for a way to invent unique identifiers for people, we need to decide what these identifiers are going to be used for. For casual conversation, there is no problem, we already know how to do this. "John Smith. No, not that one, John Smith from Poughkeepsie. With the blue eyes." In written form, we have also solved the problem. "Don Norman from the Cognitive Science Department of the University of California, San Diego at La Jolla."

This leaves open the need for identifiers in databases (whether computerized or not). The requirements of these identifiers differ for different databases. Do they need to be unique for all individuals in the world?
In that case, the identifier probably needs some location information. Do they need to be inherently bound to an individual? In that case, there probably needs to be some birth, fingerprint or other static information in the identifier itself. None of these requirements imply that personal names would make a good identifier for such purposes.

....

P.S. Along the lines of databases asserting control over personal information, if you were receiving this message directly, instead of in a digest, you would see that my email address had been changed from rowan@sea.east.sun.com to brian.hawthorne@East.Sun.COM and that my name had been changed from Rowan Hawthorne to Brian Hawthorne. This is because the corporate machine (at least the human parts) fails to understand that some of us have an additional name, known as a "nickname". My friends call me Rowan, people on the net call me rowan, but since the IRS considers me to be Brian the company follows suit. I suppose I could change my name legally to Rowan, but I enjoy having the two different names for different purposes. Forcing me to either abandon my nickname, or adopt it in all situations seems a bit draconian.

- - - - - - - - - - - - - - - -

From: "Russell Aminzade: Trinity College of VT"

....

I would NEVER grant any authority the right to select a name for me, but I'm quite comfortable having my ADDRESS assigned by someone.

...

Norman states that "Names ... are a technological invention to make it easy to identify people uniquely." Jerry Leichter suggests something similar: "Any society has to have a way to identify individuals. At one time, when the scale of society was small, a single name, plus perhaps a city of origin, or a parent's name, or a job name, was enough identification..."

But names do much, much more. Names define who we are emotionally, socially and spiritually. I know a family that named their daughters Hope, Faith, and Charity.
My family named the three boys Robert, Ronald, and Russell for quirkish reasons. And what about names like "He Who Conquers" or "Dances With Wolves." Something more than a unique identifier here.

Addresses, however, fit the description that Norman and Leichter use. They are bureaucratic, assigned by others. We can accept an address that has something like 05667 or @UCSD.EDU. I'm quite comfortable with AMINZADE@UVMVAX.UVM.EDU, but I doubt my parents would have been willing to give me that name. Makes it lots easier to develop a unique identifier.

A personal "address" begins to make sense when telephone, mail, and other information services are smart enough to route our communications to the person rather than to the device. We're at the threshold of being able to do this now, so perhaps now is the time to talk about personal addresses. (Also see the comment by Roy G. Saltman in RISKS-14.15)

- - - - - - - - - - - - - - - -

COMMENT BY DN: I start with a question:

From: Bob_Frankston@frankston.com

....

In countries that do allow the use of national identifiers in databases, are they universally used to avoid name confusions?

COMMENT BY DN: I forwarded the question to Chris Hibbert. He responded:

From: hibbert@xanadu.com (Chris Hibbert)

This is a very broad question, but there's a little bit that's worth saying. Countries that have good universal unique identifiers fall into two camps, mostly free, and mostly not. In the ones that are mostly free (Canada, e.g.), the SIN is used in government databases, but cross-matching isn't routinely allowed, and private companies don't seem to use the same number as the government. (I don't actually have solid references to back this up, but I've read the annual reports of the Canadian Privacy Commissioner, and am extrapolating from the kind of complaints they do and don't get compared to the US.)
In more intrusive societies, the same identifiers are used throughout the government, and the government pushes for the efficiency that results from consolidated databases, so data matching isn't even a concern. The government of Thailand (I think) has a single database which all the government departments share access to. This makes it possible for someone in one department to make a decision based on your interactions with another department. (This generalization is based on even less information.)

Anyway, my bottom line is that I believe there's a lot of benefit in not using the same identifier in our dealings with different parties. That's behind a lot of my attention to SSNs. They wouldn't be a problem if they were only used by the IRS or the SSA, but the fact that my employer can find out about my medical history, or (in some states) someone who sees my driver's license can find out my credit rating or pretend to be me and ruin it is a real weakness of this system.

- - - - - - - - - - - - - - - -

From: xanadu!hibbert@uunet.UU.NET (Chris Hibbert)

...

Jerry Leichter lays the problem at the feet of the individuals being identified.

  If [Steven Reid] stands by his insistence that "Steven Reid, born on xx/yy/zzzz" is all the identifying information he will give, he cannot expect to be distinguished from the other Mr. Reid who just as adamantly insists on his right to identify himself in the same way.

In my experience (e.g., the case of Terry Dean Rogan), people in this situation are eager to be distinguished, and go to enormous trouble to convince the authorities to support them in this. The problem becomes intractable when the authorities involved insist that they have to treat the identifier stored by their computers as if it provides secure unique identification. If the computer systems were made more flexible they could store more information and allow people to work around the problems when they arise.
In other situations, the relevant authorities ignore the extra distinguishing info that is there. Robert Ellis Smith's compendium of Privacy War Stories lists numerous cases where the authorities arrested someone with a name similar to one on a warrant, even though the physical descriptions were very different. (As much as a foot in height, 100 pounds in weight, differing hair color, etc.) What's a person to do if using any variant of your own name is close enough to match against someone the police are looking for?

The Montreal police also need to realize that in a metropolitan area like that, they have to be aware that names aren't close to unique. I don't know if they have a large Korean or other Asian community there, but common names there are much more common than in groups descended from Western European societies. And in these cases, physical descriptions aren't going to help much. (I didn't say "they all look alike", I said our western system of classifying people doesn't distinguish orientals at all. Hair color, eye color, height don't vary in the way we're used to, and most caucasian police are much worse at guessing ages of orientals who turn gray, go bald, and wrinkle in different ways than whites.)

[THIS SPECIAL DOUBLE ISSUE IS CONTINUED IN RISKS-14.17.]

------------------------------

End of RISKS-FORUM Digest 14.16
************************