Original Message Date: 06 Jun 91 13:48:41 From: Uucp on 1:125/777 To: Tom Jennings on 1:125/111 Subj: Ralph Merkle's paper ^AINTL 1:125/111 1:125/777 From: toad.com!gnu (John Gilmore) To: pozar, tom Date: Thu, 6 Jun 91 11:51:59 PDT From postnews Thu Jul 13 03:10:10 1989 Subject: Merkle's "A Software Encryption Function" now published and  available Newsgroups: sci.crypt Ralph Merkle called me today to let me know that Xerox was not going to let him submit his paper on a nice new set of encryption and hash functions to a journal for publication. The story is that a division of Xerox sells a lot of stuff to NSA and they threatened to pull their business if Xerox publishes it. There is no law that says NSA can stop Xerox from publishing it -- it's just a "business decision" on Xerox's part. Happily, however, I do not sell anything to the NSA. And I have a copy of the paper, which was distributed for several months by Xerox, without any conditions, before NSA even heard of it. The work was not government-sponsored or classified; there is no law that lets the government suppress it. As a courtesy to Xerox Corporation I could avoid publishing this paper. However, I prefer to extend the courtesy to the person who did the work, Ralph Merkle, who would like to see it released and used. I do thank Xerox for supporting his excellent work and hope that they will continue to do so. Mr. Merkle did not ask or suggest that I publish the paper, and should bear none of the blame (if any). I have published and distributed a number of copies of this paper, and I hereby offer to sell a copy of this paper to anyone who sends me $10 (cash preferred, checks accepted) and a return address. Send your  requests to: Merkle Paper Publishing PO Box 170608 San Francisco, CA, USA 94117-0608 Since the paper is "published and is generally available to the public through subscriptions which are available without restriction to any individual who desires to obtain or purchase the published information", it is exempt from State Department export control under 22 CFR 120.18 and 22 CFR 125.1 (a), and is exportable to all destintions under Commerce Department General License GTDA under 15 CFR 379.3(a). It can therefore be sent to foreign as well as US domestic individuals. I believe that the availability of fast, secure cryptography to the worldwide public will enable us to build much more secure computer systems and networks, increasing individual privacy as well as making viruses and worms much harder to write. For example, the Snefru one-way hash function described in the paper would be a good choice for validating copies of programs downloaded from BBS systems or the net, to detect virus contamination. If UUCP and TCP/IP links could be encrypted with Mr. Merkle's Khafre or Khufu ciphers, simple monitoring of phone wires or Ethernets would not yield login passwords, private mail, and other serious security violations. The technology exists; all that stands in our way is a bureacracy that has no *legal* power to restrict us, if we follow the published rules. John Gilmore From postnews Thu Jul 13 03:31:56 1989 Subject: Online copy of Merkle's "A Software Encryption Function" Newsgroups: sci.crypt References: <7981@hoptoad.uucp> [See the parent article <7981@hoptoad.uucp> for the legalese. This netnews article is exportable under 22 CFR 120.18, 22 CFR 125.1 (a), and 15 CFR 379.3(a). It can be sent to all foreign as well as US domestic destinations. -- John] A Software Encryption Function by Ralph C. Merkle Xerox PARC 3333 Coyote Hill Road Palo Alto, CA 94304 ABSTRACT Encryption hardware is not available on most computer systems in use today. Despite this fact, there is no well accepted encryption function designed for software implementation -- instead, hardware designs are emulated in software and the resulting performance loss is tolerated. The obvious solution is to design an encryption function for implementation in software. Such an encryption function is presented here -- on a SUN 4/260 it can encrypt at 4 to 8 megabits per second. The combination of modern processor speeds and a faster algorithm make software encryption feasible in applications which previously would have required hardware. This will effectively reduce the cost and increase the availability of cryptographic protection. Introduction The computer community has long recognized the need for security and the essential role that encryption must play. Widely adopted, standard encryption functions will make a great contribution to security in the distributed heavily networked environment which is already upon us. IBM recognized the coming need in the 1970's and proposed the Data Encryption Standard. or DES [19]. Although controversy about its key size has persisted, DES has successfully resisted all public attack and been widely accepted. After some debate its use as a U.S. Federal Standard has recently been reaffirmed until 1992 [14]. However, given the inherent limitations of the 56 bit key size used in DES [16] it seems clear that the standard will at least have to be revised at some point. A recent review of DES by the Office of Technology Assessment [15] quotes Dennis Branstad as saying the 'useful lifetime' of DES would be until the late 199O's. Despite the widespread acceptance of DES the most popular software commercial encryption packages (for, e.g., the IBM PC or the Apple Macintosh) typically offer both DES encryption and their own home-grown encryption function. The reason is simple -- DES is often 50 to 100 times slower than the home-grown alternative. While some of this performance differential is due to a sub-optimal DES implementation or a faster but less secure home-grown encryption function, it seems that DES is inherently 5 to 10 times slower than an equivalent, equally secure encryption function designed for software implementation. This is not to fault DES -- one of the design objectives in DES was quite explicitly a fast hardware implementation: when hardware is available, DES is an excellent choice. However, a number of design decisions were made in DES reflecting the hardware orientation which result in slow software performance -- making the current extensive use of DES in software both unplanned-for and rather anomalous. Having offered a rationale for an encryption function designed for software implementation, we now turn to the design principles, followed by the actual design. Basic Principles The basic design principles in DES seem sound. The fact that DES has not been publicly broken speaks in their favor. However, upon examining specific design decisions in DES, we find that several should be revised -- either in light of the software orientation of the new encryption function, or because of the decreased cost of hardware since the early '70's. Examining the basic design decisions one by one, and in no particular order, we can decide what reasonably should be changed. The selection of a 56 bit key size is too small, a problem which can be easily remedied. This subject has already been debated extensively, and while 56 bits seems to offer just sufficient protection for many commercial applications, the negligible cost of increasing the key size virtually dictates that it be done. The extensive use of permutations is expensive in software, and should be eliminated -- provided that a satisfactory alternative can be found. While permutations are cheap in hardware and provide an effective way to spread information (also called `diffusion' [21]) they are not the best choice for software. In the faster implementations of DES, the permutations are implemented by table look-ups on several bits at once. That is, the 48-to-32 bit permutation P is implemented by looking up several bits at once in a table -- where each individual table entry is 32 bits wide and the table has been pre-computed to contain the permutation of the bits looked up. Using a table-lookup in a software encryption function seems a good idea and can effectively provide the desired `diffusion' -- however there seems no reason to limit such a table to being a permutation, Having once paid the cost of looking up an entry in a table, it seems preferable that the entry should contain as much information as possible rather than being arbitrarily restricted to a small range of possibilities. Each individual S-box in DES provides only 64 entries of 4 bits each, or 32 bytes per S-box. Memory sizes have greatly increased since the mid 1970's when DES was designed, and larger S-boxes seem appropriate. More subtly, DES uses 8 S-boxes and looks up 8 different values in them simultaneously. While this is appropriate for hardware (where the 8 lookups can occur in parallel) it seems an unreasonable restriction for software. In software, each table lookup must follow the preceding lookups anyway -- for that is the nature of sequential program execution. It seems more valuable cryptographically to make each lookup depend upon the preceding lookup, because in software such dependency is nearly free. This means that the cascade of unpredictable changes that are so central to DES-type encryption functions can achieve greater depth with fewer lookups. Looked at another way, DES has a maximum circuit depth of 16 S-boxes, even though it has a total of 128 S-box lookups. If those same 128 S-box operations were done sequentially, with the output of each lookup operation altering the input to the next lookup, then the maximum circuit depth would be 128 S-boxes -- eight times as many and almost certainly providing greater cryptographic strength. This change would have very little impact on the running time of a software implementation on a typical sequential processor. A larger S-box size and sequential (rather than parallel) S-box usage should be adopted. The initial and final permutations in DES are widely viewed as cryptographically pointless -- or at least not very important. They are therefore discarded. The key schedule in DES has received mild criticism for not being sufficiently complicated[9]. In practice, all of the faster DES software implementations pre-compute the key schedule. This pre-computation seems a good idea when large volumes of data are being encrypted -- the pre-computation allows a more leisurely and careful arrangement of the encryption tables and means the actual encryption function can more rapidly scramble the data with less effort. A more complex key pre-computation therefore seems desirable. Finally, the design criteria used for the DES S-boxes were kept secret. Even though there is no particular reason to believe that they conceal a trap door, it would seem better if the criteria for S-box selection were made explicit, and some sort of assurances provided that the S-boxes were actually chosen randomly in accordance with the published criteria. This would both quiet the concerns about trap doors, and also allow a fuller and more explicit consideration of the S-box selection criteria. With this overview of design principles we can now proceed to the design. Khufu, Khafre and Snefru There are actually two encryption functions named Khufu and Khafre, and a one-way hash function named Snefru (all three names were taken from the Pharoahs of ancient Egypt following a suggestion by Dan Greene. To quote the Encyclopedia Britannica, "The ideal pyramid was eventually built by Snefru's successor, Khufu, and the first --- the Great Pyramid at Giza --- was the finest and must successful." Khafre was Khufu's son). The basic hardware model around which they are optimized is a 32-bit register oriented microprocessor. The basic operations are 32-bit load, store, shift, rotate, `xor' and `and'. The two encryption functions are optimized for somewhat different tasks, but use very similar design principles. Khufu is designed for fast bulk encryption of large amounts of data. To achieve the fastest possible speed, the tables used in encryption are pre-computed. This pre-computation is moderately expensive, and makes Khufu unsuited for the encryption of small amounts of data. The other encryption function -- Khafre -- does not require any pre-computation. This means Khafre can efficiently encrypt small amounts of data. On the other hand, Khafre is somewhat slower than Khufu for the encryption of large volumes of data because it takes more time to encrypt each block. The one-way hash function -- Snefru -- is designed to rapidly reduce large blocks of data to a small residue (perhaps 128 or 256 bits). Snefru requires no pre-computation and therefore can be efficiently applied to small arguments. Snefru will be used exclusively for authentication purposes and does not provide secrecy. We first discuss the design of Khufu. Khufu Khufu is a block cipher operating on 64-bit blocks. Although increasing block size was a very tempting design alternative, the 64-bit block size of DES has not been greatly criticized. More important, many systems built around DES assume that the block size is 64 bits. The pain of using a different encryption function is better minimized if the new encryption function can be easily 'plugged in' in place of the old -- which can be done if the block size is the same and the key size is larger. The new encryption function essentially looks exactly like the old encryption function -- but with some new keys added to the key space. Increasing the block size might have forced changes in more than just a single subroutine -- it might (for example) have forced changes in data formats in a communications systems. Khufu, like DES, is a multi-round encryption function in which two 32-bit halves (called L and R for Left and Right) are used alternately in the computations. Each half is used as input to a function F, whose output is XORed with the other half -- the two halves are exchanged and the computation repeated until the result appears to be random (no statistically detectable patterns). Khufu uses a different F-function than DES -- and uses different F-functions during the course of encryption. One round of DES uses an F-function defined by 8 table lookups and associated permutations. By contrast, one round of Khufu uses a single table lookup in a larger S-box. In addition, in the first step of encryption (prior to the main loop) the plaintext is XORed with 64 bits of key material, and again in the final step of encryption (following the main loop) the 64-bit block is XORed with another 64 bits of key material to produce the ciphertext. In somewhat more detail, the 64-bit plaintext is first divided into two 32-bit pieces designated L and R. L and R are then XORed with two 32-bit pieces of auxiliary key material. Then the main loop is started, in which the bottom byte from L is used as the input to a 256-entry S-box. Each S-box entry is 32-bits wide, The selected 32-bit entry is XORed with R. L is then rotated to bring a new byte into position, after which L and R are swapped. The S-box itself is changed to a new S-box after every 8 rounds (which we shall sometimes call an `octet'). This means that the number of S-boxes required depends on the number of rounds of encryption being used. Finally, after the main loop has been completed, we again XOR L and R with two new 32-bit auxiliary key values to produce the ciphertext. For efficiency reasons, we restrict the number of rounds to be a multiple of 8, i.e., an integral number of octets. If the main encryption loop is always executed a multiple of 8 times, then it can be unrolled 8 times -- which is substantially more efficient than the definitionally correct but inefficient versions given in this paper. For