RFC # 724
NIC #37435
12 May 1977

Proposed Official Standard for the Format of ARPA Network Messages

by

Ken Pogran, MIT-LCS/CSR (Pogran at MIT-Multics)
John Vittal, BBN (Vittal at BBN-TENEXA)
Dave Crocker, RAND-ISD (DCrocker at Rand-Unix)
Austin Henderson, BBN (Henderson at BBN-TENEXD)

Proposed Standard for Message Format / ii

PREFACE

ARPA's Committee on Computer-Aided Human Communication (CAHCOM) wishes to promulgate an official standard for the format of ARPA Network mail headers which will adequately meet the needs of the various message service subsystems on the Network today. The authors of this RFC constitute the CAHCOM subcommittee charged with the task of developing this new standard; this document presents our current thoughts on the matter and a specific proposal.

This document is organized as follows: First, we present a history of the development of what has become known as the ARPA Network "mail" or "message" service, and the issues which we feel are most pressing -- problems for which solutions are lacking today, inhibiting the further development of message subsystems. We then present the specification for the new ARPA Network Message Header standard. This is followed by a References section.

Essentially, we propose a revision to Request for Comments (RFC) 561, "Standardizing Network Mail Headers", and RFC 680, "Message Transmission Protocol". This revision removes and compacts portions of the previous syntax and adds several features to network address specification. In particular, we focus on people and not mailboxes as recipients and allow reference to stored address lists. We expect this syntax to provide sufficient capabilities to meet most users' immediate needs and, therefore, give developers enough breathing room to produce a new mail transmission protocol "properly". We believe that there is enough of a consensus in the Network community in favor of such a standard syntax to make possible its adoption at this time.

We would like to make clear the status of this proposed standard: The CAHCOM Steering Committee has replaced the Message Service Committee as the ARPANET standards-setting organization in the area of message services. It is expected that the proposal of this CAHCOM subcommittee, when in its final form, will be adopted as an ARPANET standard by CAHCOM. In the interests of making this standard the best possible one, we are distributing this proposal as an RFC. Please send any comments and criticisms to any of the authors of this RFC by 15 June 1977. It is planned that the standard will be officially adopted by 1 September 1977, with hosts expected to accept its syntax by 1 January 1978.

CONTENTS

   I. PROBLEMS WITH ARPANET MESSAGE STANDARDS
      A. Background and History
      B. Issues and Conclusions
      C. Message Parts
      D. Adoption of the Standard

  II. STANDARD FOR THE FORMAT OF ARPA NETWORK MESSAGES
      A. Framework
      B. Syntax
      C. Semantics
      D. Examples

 III. REFERENCES

APPENDIX
      A. Alphabetical Listing of Syntax Rules

I. PROBLEMS WITH ARPANET MESSAGE STANDARDS

A. BACKGROUND AND HISTORY

Today's ARPA Network "mail" or "message" service uses, for its delivery mechanism, two special commands of the File Transfer Protocol. Viewed from within the structure of FTP, the entire message, both header and text, is data for the FTP MAIL and MLFL commands. This facility was added to the File Transfer Protocol as an afterthought; it was an interim solution to be used only until a separate mail transmission protocol was specified. Several versions of such a protocol have been proposed, but none has yet received general acceptance. Meanwhile, attempts have been made to improve upon the original interim facility.

As message service subsystems on various host systems (especially TENEX) developed to the point where rudimentary parsing of incoming messages was being done, it became clear that it would be desirable to standardize the format and content of the headers of messages transmitted between hosts using these FTP commands. To this end, an ad hoc committee wrote RFC 561, which suggested a standard message header format. The committee was unofficial, so it could not legislate a standard; it could only recommend. However, the standard it suggested adequately met an urgent need, and was generally adopted. Several salient points should be noted:

   1. RFC 561 defined the concept of a message header, and specified the syntax which delimited it from the actual text of a message;

   2. It proposed a standard format for the most obvious and most urgently-needed header items: "From:", "Date:", and "Subject:";

   3. It proposed that a general standard syntax be used for all other header items;

   4. RFC 561 is still, today, an unofficial standard, adhered to by most because of its utility;

   5. Its syntax was designed to allow humans to read the text easily, without the aid of special message processing systems.

As message services grew in sophistication, the need for specific header items in RFC 561's "miscellaneous" category grew: "To:" and "cc:", especially, were generated and recognized by several different message services. However, there was no specific standard for the syntax of the contents of these items. The message service subsystems on TENEX developed a particular format for these items; since more messages originated from the TENEX hosts on the Network than from any other type of host system, the TENEX format for these fields soon became a de facto standard. Message service subsystems on TENEX began to parse these fields, expecting them to be in the TENEX-generated format.
Message service subsystems on other hosts -- Multics, for example -- began to dabble with other formats for these fields, since there was no standard for them, only to receive complaints from users of TENEX message service subsystems that their "non-standard" message headers could not be parsed according to the (de facto) "standard" syntax.

Recognizing that the time had come to make an attempt to standardize the additional header fields that had come into use since RFC 561 was published, ARPA's Message Service Committee chartered a small group in 1975 to develop a revised version of RFC 561 which would define the syntax of these additional message header fields.

Several things should be noted about this small group of people: first, they were TENEX-oriented; when the functionality of the message header items they desired was matched by the functionality of an already-existing message header item of the TENEX message subsystems, they adopted the syntax used by the TENEX message subsystems. Second, they based additional header items not already found on TENEX message subsystems on the deliberations of the Message Service Committee. Third, they were not familiar with the procedure for publication of a document as a Network RFC.

The document which this group produced, labelled RFC 680, "Message Transmission Protocol", received only limited distribution. Matters were further confused because its title was misleading, since it was not a protocol for the transmission of messages between ARPA Network hosts, but rather a standard for the format of messages transmitted via the standard File Transfer Protocol. Some, including the Message Service Committee, believed that RFC 680 became a Network Standard. This was not strictly true, because it never received proper distribution, and it had never been "officially blessed" by anyone, to turn it from a request for comments into an accepted official ARPA Network standard document.
Reflecting this confusion over the status of the document are the facts that the document DOES currently reside in the "official" ARPANET Protocol Handbook, and most users and message system implementors remain unaware that this is so.

For all its shortcomings, RFC 680 has performed a needed service, just as did RFC 561 before it. It defined additional message header items at a time when this needed to be done. Unfortunately, since the group had not sought ideas and input from others, the specification did not adequately respond to a sufficient set of community needs. In addition, the manner in which the document was promulgated -- or not promulgated -- left a great deal to be desired. Implementors of message-processing subsystems who had not received RFC 680 proceeded to go their own ways, feeling justified in doing so, while those who accepted RFC 680 as a standard felt justified in complaining to -- and about -- those whom they considered to be maverick implementors of idiosyncratic message service subsystems.

Perhaps because of the ad-hoc nature of the interim mail facility, users have not, until recently, attempted to push the system to the limits of their imagination. Presently, however, several different sites are using the "interim" mail facility for more than it was designed to do, and in ways which are incompatible both with each other and with the original intent of the facility. Mail subsystem implementors are increasingly being asked to provide for the handling of mail from idiosyncratic hosts. Also, it has become clear that there are a few very specific features, too useful to ignore, which cannot reasonably be specified within the syntax of RFC 680.

B. ISSUES AND CONCLUSIONS

At first glance, it would seem that a resolution of today's somewhat chaotic situation could best be obtained by immediately junking the existing "interim" mail facility, and adopting a true mail transmission protocol. We strongly believe that this would be ill-advised at this time, for we feel that there is no general understanding within the Network community today of how to specify and implement a full and adequate mail transmission protocol. However, we are convinced that there is, finally, a strong commitment within the Network community to attack this problem (which there was not at the time the "interim" mail transmission facility was specified and developed).

The frontal attacks on the mail protocol problem have, so far, resulted in at least two suggestions for a mail transmission protocol. Why should not one of these protocols be adopted immediately? We feel that, in general, there has been a tendency for experimental Network software to be prematurely treated as though it were adequately designed and fully operational. Typically, the system or protocol proposed is so much better than what was previously available that its experimental nature is disregarded, and it is pressed into service before it has had a chance to properly develop and mature. We are very concerned that this phenomenon not afflict the Network mail system any more than it already has.

While it is true that there are several sites in the ARPA Community which have mail systems that understand the syntax specified in RFCs 561 and 680, in addition to some of the "non-standard" syntax provided by the mail generating programs at several other sites, most mail systems do not parse much of the contents of received messages. A consideration of the syntax specified here is that messages which are sent to people should be easily read by people.
Parsers which can turn an ugly, syntactically expedient form into something which is easy to read are the exception, rather than the rule, in today's message systems. Also, the modifications to the existing "non-standard" syntax should be kept to a minimum, enhancing the probability that the requirement of small perturbations to existing software will be accepted.

With this syntax, we introduce mechanisms so that:

   1. Users of mail systems can have multiple mailboxes, either on one machine or multiple machines, all of which are treated identically; the default mailbox for a user is not necessarily associated (directly) with his login name.

   2. Mail for a person can be sent to other than a single, default mailbox.

   3. Named groups may consist of both individuals and (possibly) other named groups (i.e., nesting within groups is permitted).

   4. Address lists may contain references to other, stored, lists. The complete path with which one can retrieve the stored list may be specified in order to allow either manual or automatic retrieval of the stored list.

   5. Address lists may contain references to addresses which are not accessible through the standard ARPANET message system. For example, U.S. Postal system addresses can be specified. Such addresses are, of course, expected to be ignored by the ARPANET system, although individual sites may provide services for using the information (e.g., automatically sending a copy of the message to a line printer, in preparation for transmission through the Postal system).

   6. Parenthetical remarks, or comments, can be included and syntactically recognized as such within some header items.

   7. Received messages are capable of being read by humans without a program having to parse the message (or parts of it) before presenting the message to the user; however there is sufficient formal syntax t