Network Working Group                                         K. Holtman
Request for Comments: 2295                                           TUE
Category: Experimental                                           A. Mutz
                                                         Hewlett-Packard
                                                              March 1998

                Transparent Content Negotiation in HTTP

Status of this Memo

This memo defines an Experimental Protocol for the Internet community. It does not specify an Internet standard of any kind. Discussion and suggestions for improvement are requested. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1998). All Rights Reserved.

ABSTRACT

HTTP allows web site authors to put multiple versions of the same information under a single URL. Transparent content negotiation is an extensible negotiation mechanism, layered on top of HTTP, for automatically selecting the best version when the URL is accessed. This enables the smooth deployment of new web data formats and markup tags.

TABLE OF CONTENTS

1 Introduction
  1.1 Background
2 Terminology
  2.1 Terms from HTTP/1.1
  2.2 New terms
3 Notation
4 Overview
  4.1 Content negotiation
  4.2 HTTP/1.0 style negotiation scheme
  4.3 Transparent content negotiation scheme
  4.4 Optimizing the negotiation process
  4.5 Downwards compatibility with non-negotiating user agents
  4.6 Retrieving a variant by hand
  4.7 Dimensions of negotiation
  4.8 Feature negotiation
  4.9 Length of variant lists
  4.10 Relation with other negotiation schemes
5 Variant descriptions
  5.1 Syntax
  5.2 URI
  5.3 Source-quality
  5.4 Type, charset, language, and length
  5.5 Features
  5.6 Description
  5.7 Extension-attribute
6 Feature negotiation
  6.1 Feature tags
    6.1.1 Feature tag values
  6.2 Feature sets
  6.3 Feature predicates
  6.4 Features attribute
7 Remote variant selection algorithms
  7.1 Version numbers
8 Content negotiation status codes and headers
  8.1 506 Variant Also Negotiates
  8.2 Accept-Features
  8.3 Alternates
  8.4 Negotiate
  8.5 TCN
  8.6 Variant-Vary
9 Cache validators
  9.1 Variant list validators
  9.2 Structured entity tags
  9.3 Assigning entity tags to variants
10 Content negotiation responses
  10.1 List response
  10.2 Choice response
  10.3 Adhoc response
  10.4 Reusing the Alternates header
  10.5 Extracting a normal response from a choice response
  10.6 Elaborate Vary headers
    10.6.1 Construction of an elaborate Vary header
    10.6.2 Caching of an elaborate Vary header
  10.7 Adding an Expires header for HTTP/1.0 compatibility
  10.8 Negotiation on content encoding
11 User agent support for transparent negotiation
  11.1 Handling of responses
  11.2 Presentation of a transparently negotiated resource
12 Origin server support for transparent negotiation
  12.1 Requirements
  12.2 Negotiation on transactions other than GET and HEAD
13 Proxy support for transparent negotiation
14 Security and privacy considerations
  14.1 Accept- headers revealing personal information
  14.2 Spoofing of responses from variant resources
  14.3 Security holes revealed by negotiation
15 Internationalization considerations
16 Acknowledgments
17 References
18 Authors' Addresses
19 Appendix: Example of a local variant selection algorithm
  19.1 Computing overall quality values
  19.2 Determining the result
  19.3 Ranking dimensions
20 Appendix: feature negotiation examples
  20.1 Use of feature tags
  20.2 Use of numeric feature tags
  20.3 Feature tag design
21 Appendix: origin server implementation considerations
  21.1 Implementation with a CGI script
  21.2 Direct support by HTTP servers
  21.3 Web publishing tools
22 Appendix: Example of choice response construction
23 Full Copyright Statement

1 Introduction

HTTP allows web site authors to put multiple versions of the same information under a single URI. Each of these versions is called a `variant'. Transparent content negotiation is an extensible negotiation mechanism for automatically and efficiently retrieving the best variant when a GET or HEAD request is made. This enables the smooth deployment of new web data formats and markup tags.

This specification defines transparent content negotiation as an extension on top of the HTTP/1.1 protocol [1]. However, use of this extension does not require use of HTTP/1.1: transparent content negotiation can also be done if some or all of the parties are HTTP/1.0 [2] systems.

Transparent content negotiation is called `transparent' because it makes all variants which exist inside the origin server visible to outside parties.

Note: Some members of the IETF are currently undertaking a number of activities which are loosely related to this experimental protocol. First, there is an effort to define a protocol-independent registry for feature tags. The intention is that this experimental protocol will be one of the clients of the registry. Second, some research is being done on content negotiation systems for other transport protocols (like internet mail and internet fax) and on generalized negotiation systems for multiple transport protocols.
At the time of writing, it is unclear if or when this research will lead to results in the form of complete negotiation system specifications. It is also unclear to what extent possible future specifications can or will re-use elements of this experimental protocol.

1.1 Background

The addition of content negotiation to the web infrastructure has been considered important since the early days of the web. Among the expected benefits of a sufficiently powerful system for content negotiation are:

* smooth deployment of new data formats and markup tags, which will allow graceful evolution of the web

* eliminating the need to choose between a `state of the art multimedia homepage' and one which can be viewed by all web users

* enabling good service to a wider range of browsing platforms (from low-end PDAs to high-end VR setups)

* eliminating error-prone and cache-unfriendly User-Agent based negotiation

* enabling construction of sites without `click here for the X version' links

* internationalization, and the ability to offer multi-lingual content without a bias towards one language.

2 Terminology

The words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" in this document are to be interpreted as described in RFC 2119 [4].

This specification uses the term `header' as an abbreviation for `header field in a request or response message'.

2.1 Terms from HTTP/1.1

This specification mostly uses the terminology of the HTTP/1.1 specification [1]. For the convenience of the reader, this section reproduces some key terminology definitions from [1].

request
   An HTTP request message.

response
   An HTTP response message.

resource
   A network data object or service that can be identified by a URI. Resources may be available in multiple representations (e.g. multiple languages, data formats, size, resolutions) or vary in other ways.

content negotiation
   The mechanism for selecting the appropriate representation when servicing a request.

client
   A program that establishes connections for the purpose of sending requests.

user agent
   The client which initiates a request. These are often browsers, editors, spiders (web-traversing robots), or other end user tools.

server
   An application program that accepts connections in order to service requests by sending back responses. Any given program may be capable of being both a client and a server; our use of these terms refers only to the role being performed by the program for a particular connection, rather than to the program's capabilities in general. Likewise, any server may act as an origin server, proxy, gateway, or tunnel, switching behavior based on the nature of each request.

origin server
   The server on which a given resource resides or is to be created.

proxy
   An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally or by passing them on, with possible translation, to other servers. A proxy must implement both the client and server requirements of this specification.

age
   The age of a response is the time since it was sent by, or successfully validated with, the origin server.

fresh
   A response is fresh if its age has not yet exceeded its freshness lifetime.

2.2 New terms

transparently negotiable resource
   A resource, identified by a single URI, which has multiple representations (variants) associated with it. When servicing a request on its URI, it allows selection of the best representation using the transparent content negotiation mechanism. A transparently negotiable resource always has a variant list bound to it, which can be represented as an Alternates header (defined in section 8.3).

variant list
   A list containing variant descriptions, which can be bound to a transparently negotiable resource.

variant description
   A machine-readable description of a variant resource, usually found in a variant list. A variant description contains the variant resource URI and various attributes which describe properties of the variant. Variant descriptions are defined in section 5.

variant resource
   A resource from which a variant of a negotiable resource can be retrieved with a normal HTTP/1.x GET request, i.e. a GET request which does not use transparent content negotiation.

neighboring variant
   A variant resource is called a neighboring variant resource of some transparently negotiable HTTP resource if the variant resource has an HTTP URL, and if the absolute URL of the variant resource up to its last slash equals the absolute URL of the negotiable resource up to its last slash, where equality is determined with the URI comparison rules in section 3.2.3 of [1].

   The property of being a neighboring variant is important because of security considerations (section 14.2). Not all variants of a negotiable resource need to be neighboring variants. However, access to neighboring variants can be more highly optimized by the use of remote variant selection algorithms (section 7) and choice responses (section 10.2).

remote variant selection algorithm
   A standardized algorithm by which a server can sometimes choose a best variant on behalf of a negotiating user agent. The algorithm typically computes whether the Accept- headers in the request contain sufficient information to allow a choice, and if so, which variant is the best variant. The use of a remote algorithm can speed up the negotiation process.

list response
   A list response returns the variant list of the negotiable resource, but no variant data.
   It can be generated when the server does not want to, or is not allowed to, return a particular best variant for the request. List responses are defined in section 10.1.

choice response
   A choice response returns a representation of the best variant for the request, and may also return the variant list of the negotiable resource. It can be generated when the server has sufficient information to be able to choose the best variant on behalf of the user agent, but may only be generated if this best variant is a neighboring variant. Choice responses are defined in section 10.2.

adhoc response
   An adhoc response can be sent by an origin server as an extreme measure, to achieve compatibility with a non-negotiating or buggy client if this compatibility cannot be achieved by sending a list or choice resp