Original Message Date: 06 Jun 91  13:48:41
From: Uucp on 1:125/777
To: Tom Jennings on 1:125/111
Subj: Ralph Merkle's paper 
^AINTL 1:125/111 1:125/777
From: toad.com!gnu (John Gilmore)
To:   pozar, tom
Date: Thu, 6 Jun 91 11:51:59 PDT

From postnews Thu Jul 13 03:10:10 1989
Subject: Merkle's "A Software Encryption Function" now published and 
available
Newsgroups: sci.crypt

Ralph Merkle called me today to let me know that Xerox was not going to
let him submit his paper on a nice new set of encryption and hash functions
to a journal for publication.  The story is that a division of Xerox
sells a lot of stuff to NSA and they threatened to pull their business
if Xerox publishes it.  There is no law that says NSA can stop Xerox
from publishing it -- it's just a "business decision" on Xerox's part.

Happily, however, I do not sell anything to the NSA.  And I have
a copy of the paper, which was distributed for several months
by Xerox, without any conditions, before NSA even heard of it.
The work was not government-sponsored or classified; there is no
law that lets the government suppress it.

As a courtesy to Xerox Corporation I could avoid publishing this
paper.  However, I prefer to extend the courtesy to the person who did
the work, Ralph Merkle, who would like to see it released and used.  I
do thank Xerox for supporting his excellent work and hope that they
will continue to do so.  Mr. Merkle did not ask or suggest that I
publish the paper, and should bear none of the blame (if any).

I have published and distributed a number of copies of this paper, and
I hereby offer to sell a copy of this paper to anyone who sends me $10
(cash preferred, checks accepted) and a return address.  Send your 
requests to:

        Merkle Paper Publishing
        PO Box 170608
        San Francisco, CA, USA  94117-0608

Since the paper is "published and is generally available to the public
through subscriptions which are available without restriction to any
individual who desires to obtain or purchase the published
information", it is exempt from State Department export control under
22 CFR 120.18 and 22 CFR 125.1 (a), and is exportable to all
destintions under Commerce Department General License GTDA under 15 CFR
379.3(a).  It can therefore be sent to foreign as well as US domestic
individuals.

I believe that the availability of fast, secure cryptography to the
worldwide public will enable us to build much more secure computer
systems and networks, increasing individual privacy as well as making
viruses and worms much harder to write.  For example, the Snefru
one-way hash function described in the paper would be a good choice for
validating copies of programs downloaded from BBS systems or the net,
to detect virus contamination.  If UUCP and TCP/IP links could be
encrypted with Mr. Merkle's Khafre or Khufu ciphers, simple monitoring
of phone wires or Ethernets would not yield login passwords, private
mail, and other serious security violations.  The technology exists;
all that stands in our way is a bureacracy that has no *legal* power
to restrict us, if we follow the published rules.

        John Gilmore

From postnews Thu Jul 13 03:31:56 1989
Subject: Online copy of Merkle's "A Software Encryption Function"
Newsgroups: sci.crypt
References: <7981@hoptoad.uucp>

[See the parent article <7981@hoptoad.uucp> for the legalese.  This
netnews article is exportable under 22 CFR 120.18, 22 CFR 125.1 (a),
and 15 CFR 379.3(a).  It can be sent to all foreign as well as US
domestic destinations.                                  -- John]

                             A Software Encryption Function
                                           by
                                    Ralph C. Merkle
                                       Xerox PARC
                                  3333 Coyote Hill Road
                                  Palo Alto, CA 94304


ABSTRACT

Encryption hardware is not available on most computer systems in use
today.  Despite this fact, there is no well accepted encryption
function designed for software implementation -- instead, hardware
designs are emulated in software and the resulting performance loss is
tolerated.  The obvious solution is to design an encryption function
for implementation in software.  Such an encryption function is
presented here -- on a SUN 4/260 it can encrypt at 4 to 8 megabits per
second.  The combination of modern processor speeds and a faster
algorithm make software encryption feasible in applications which
previously would have required hardware.  This will effectively reduce
the cost and increase the availability of cryptographic protection.


Introduction

The computer community has long recognized the need for security and
the essential role that encryption must play.  Widely adopted,
standard encryption functions will make a great contribution to
security in the distributed heavily networked environment which is
already upon us.  IBM recognized the coming need in the 1970's and
proposed the Data Encryption Standard.  or DES [19].  Although
controversy about its key size has persisted, DES has successfully
resisted all public attack and been widely accepted.  After some
debate its use as a U.S. Federal Standard has recently been reaffirmed
until 1992 [14].  However, given the inherent limitations of the 56
bit key size used in DES [16] it seems clear that the standard will at
least have to be revised at some point.  A recent review of DES by the
Office of Technology Assessment [15] quotes Dennis Branstad as saying
the 'useful lifetime' of DES would be until the late 199O's.

Despite the widespread acceptance of DES the most popular software
commercial encryption packages (for, e.g., the IBM PC or the Apple
Macintosh) typically offer both DES encryption and their own
home-grown encryption function.  The reason is simple -- DES is often
50 to 100 times slower than the home-grown alternative.  While some of
this performance differential is due to a sub-optimal DES
implementation or a faster but less secure home-grown encryption
function, it seems that DES is inherently 5 to 10 times slower than an
equivalent, equally secure encryption function designed for software
implementation.  This is not to fault DES -- one of the design
objectives in DES was quite explicitly a fast hardware implementation:
when hardware is available, DES is an excellent choice.  However, a
number of design decisions were made in DES reflecting the hardware
orientation which result in slow software performance -- making the
current extensive use of DES in software both unplanned-for and rather
anomalous.

Having offered a rationale for an encryption function designed for
software implementation, we now turn to the design principles,
followed by the actual design.


Basic Principles

The basic design principles in DES seem sound.  The fact that DES has
not been publicly broken speaks in their favor.  However, upon
examining specific design decisions in DES, we find that several
should be revised -- either in light of the software orientation of
the new encryption function, or because of the decreased cost of
hardware since the early '70's.  Examining the basic design decisions
one by one, and in no particular order, we can decide what reasonably
should be changed.

The selection of a 56 bit key size is too small, a problem which can
be easily remedied.  This subject has already been debated
extensively, and while 56 bits seems to offer just sufficient
protection for many commercial applications, the negligible cost of
increasing the key size virtually dictates that it be done.

The extensive use of permutations is expensive in software, and should
be eliminated -- provided that a satisfactory alternative can be
found.  While permutations are cheap in hardware and provide an
effective way to spread information (also called `diffusion' [21])
they are not the best choice for software.  In the faster
implementations of DES, the permutations are implemented by table
look-ups on several bits at once.  That is, the 48-to-32 bit
permutation P is implemented by looking up several bits at once in a
table -- where each individual table entry is 32 bits wide and the
table has been pre-computed to contain the permutation of the bits
looked up.  Using a table-lookup in a software encryption function
seems a good idea and can effectively provide the desired `diffusion'
-- however there seems no reason to limit such a table to being a
permutation, Having once paid the cost of looking up an entry in a
table, it seems preferable that the entry should contain as much
information as possible rather than being arbitrarily restricted to a
small range of possibilities.

Each individual S-box in DES provides only 64 entries of 4 bits each,
or 32 bytes per S-box.  Memory sizes have greatly increased since the
mid 1970's when DES was designed, and larger S-boxes seem appropriate.
More subtly, DES uses 8 S-boxes and looks up 8 different values in
them simultaneously.  While this is appropriate for hardware (where
the 8 lookups can occur in parallel) it seems an unreasonable
restriction for software.  In software, each table lookup must follow
the preceding lookups anyway -- for that is the nature of sequential
program execution.  It seems more valuable cryptographically to make
each lookup depend upon the preceding lookup, because in software such
dependency is nearly free.  This means that the cascade of
unpredictable changes that are so central to DES-type encryption
functions can achieve greater depth with fewer lookups.  Looked at
another way, DES has a maximum circuit depth of 16 S-boxes, even
though it has a total of 128 S-box lookups.  If those same 128 S-box
operations were done sequentially, with the output of each lookup
operation altering the input to the next lookup, then the maximum
circuit depth would be 128 S-boxes -- eight times as many and almost
certainly providing greater cryptographic strength.  This change would
have very little impact on the running time of a software
implementation on a typical sequential processor.  A larger S-box size
and sequential (rather than parallel) S-box usage should be adopted.

The initial and final permutations in DES are widely viewed as
cryptographically pointless -- or at least not very important.  They
are therefore discarded.

The key schedule in DES has received mild criticism for not being
sufficiently complicated[9].  In practice, all of the faster DES
software implementations pre-compute the key schedule.  This
pre-computation seems a good idea when large volumes of data are being
encrypted -- the pre-computation allows a more leisurely and careful
arrangement of the encryption tables and means the actual encryption
function can more rapidly scramble the data with less effort.  A more
complex key pre-computation therefore seems desirable.

Finally, the design criteria used for the DES S-boxes were kept
secret.  Even though there is no particular reason to believe that
they conceal a trap door, it would seem better if the criteria for
S-box selection were made explicit, and some sort of assurances
provided that the S-boxes were actually chosen randomly in accordance
with the published criteria.  This would both quiet the concerns about
trap doors, and also allow a fuller and more explicit consideration of
the S-box selection criteria.

With this overview of design principles we can now proceed to the design.


Khufu, Khafre and Snefru

There are actually two encryption functions named Khufu and Khafre,
and a one-way hash function named Snefru (all three names were taken
from the Pharoahs of ancient Egypt following a suggestion by Dan
Greene.  To quote the Encyclopedia Britannica, "The ideal pyramid was
eventually built by Snefru's successor, Khufu, and the first --- the
Great Pyramid at Giza --- was the finest and must successful."  Khafre
was Khufu's son).

The basic hardware model around which they are optimized is a 32-bit
register oriented microprocessor.  The basic operations are 32-bit
load, store, shift, rotate, `xor' and `and'.

The two encryption functions are optimized for somewhat different
tasks, but use very similar design principles.  Khufu is designed for
fast bulk encryption of large amounts of data.  To achieve the fastest
possible speed, the tables used in encryption are pre-computed.  This
pre-computation is moderately expensive, and makes Khufu unsuited for
the encryption of small amounts of data.  The other encryption
function -- Khafre -- does not require any pre-computation.  This
means Khafre can efficiently encrypt small amounts of data.  On the
other hand, Khafre is somewhat slower than Khufu for the encryption of
large volumes of data because it takes more time to encrypt each
block.

The one-way hash function -- Snefru -- is designed to rapidly reduce
large blocks of data to a small residue (perhaps 128 or 256 bits).
Snefru requires no pre-computation and therefore can be efficiently
applied to small arguments.  Snefru will be used exclusively for
authentication purposes and does not provide secrecy.

We first discuss the design of Khufu.


Khufu

Khufu is a block cipher operating on 64-bit blocks.  Although
increasing block size was a very tempting design alternative, the
64-bit block size of DES has not been greatly criticized.  More
important, many systems built around DES assume that the block size is
64 bits.  The pain of using a different encryption function is better
minimized if the new encryption function can be easily 'plugged in' in
place of the old -- which can be done if the block size is the same
and the key size is larger.  The new encryption function essentially
looks exactly like the old encryption function -- but with some new
keys added to the key space.  Increasing the block size might have
forced changes in more than just a single subroutine -- it might (for
example) have forced changes in data formats in a communications
systems.

Khufu, like DES, is a multi-round encryption function in which two
32-bit halves (called L and R for Left and Right) are used alternately
in the computations.  Each half is used as input to a function F,
whose output is XORed with the other half -- the two halves are
exchanged and the computation repeated until the result appears to be
random (no statistically detectable patterns).  Khufu uses a different
F-function than DES -- and uses different F-functions during the
course of encryption.  One round of DES uses an F-function defined by
8 table lookups and associated permutations.  By contrast, one round
of Khufu uses a single table lookup in a larger S-box.  In addition,
in the first step of encryption (prior to the main loop) the plaintext
is XORed with 64 bits of key material, and again in the final step of
encryption (following the main loop) the 64-bit block is XORed with
another 64 bits of key material to produce the ciphertext.

In somewhat more detail, the 64-bit plaintext is first divided into
two 32-bit pieces designated L and R.  L and R are then XORed with two
32-bit pieces of auxiliary key material.  Then the main loop is
started, in which the bottom byte from L is used as the input to a
256-entry S-box.  Each S-box entry is 32-bits wide, The selected
32-bit entry is XORed with R.  L is then rotated to bring a new byte
into position, after which L and R are swapped.  The S-box itself is
changed to a new S-box after every 8 rounds (which we shall sometimes
call an `octet').  This means that the number of S-boxes required
depends on the number of rounds of encryption being used.  Finally,
after the main loop has been completed, we again XOR L and R with two
new 32-bit auxiliary key values to produce the ciphertext.

For efficiency reasons, we restrict the number of rounds to be a
multiple of 8, i.e., an integral number of octets.  If the main
encryption loop is always executed a multiple of 8 times, then it can
be unrolled 8 times -- which is substantially more efficient than the
definitionally correct but inefficient versions given in this paper.
For