Network Working Group                                          G. Huston
Request for Comments: 2990                                       Telstra
Category: Informational                                    November 2000

Next Steps for the IP QoS Architecture

Status of this Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

While there has been significant progress in the definition of Quality of Service (QoS) architectures for internet networks, there are a number of aspects of QoS that appear to need further elaboration as they relate to translating a set of tools into a coherent platform for end-to-end service delivery. This document highlights the outstanding architectural issues relating to the deployment and use of QoS mechanisms within internet networks, noting those areas where further standards work may assist with the deployment of QoS internets.

This document is the outcome of a collaborative exercise on the part of the Internet Architecture Board.

Table of Contents

1. Introduction
2. State and Stateless QoS
3. Next Steps for QoS Architectures
   3.1 QoS-Enabled Applications
   3.2 The Service Environment
   3.3 QoS Discovery
   3.4 QoS Routing and Resource Management
   3.5 TCP and QoS
   3.6 Per-Flow States and Per-Packet classifiers
   3.7 The Service Set
   3.8 Measuring Service Delivery
   3.9 QoS Accounting
   3.10 QoS Deployment Diversity
   3.11 QoS Inter-Domain signaling
   3.12 QoS Deployment Logistics
4. The objective of the QoS architecture
5. Towards an end-to-end QoS architecture
6. Conclusions
7. Security Considerations
8. References
9. Acknowledgments
10. Author's Address
11. Full Copyright Statement

1. Introduction

The default service offering associated with the Internet is characterized as a best-effort variable service response. Within this service profile the network makes no attempt to actively differentiate its service response between the traffic streams generated by concurrent users of the network. As the load generated by the active traffic flows within the network varies, the network's best effort service response will also vary.

The objective of various Internet Quality of Service (QoS) efforts is to augment this base service with a number of selectable service responses. These service responses may be distinguished from the best-effort service by some form of superior service level, or they may be distinguished by providing a predictable service response which is unaffected by external conditions such as the number of concurrent traffic flows, or their generated traffic load.

Any network service response is an outcome of the resources available to service a load, and the level of the load itself.
To offer such distinguished services there is not only a requirement to provide a differentiated service response within the network, there is also a requirement to control the service-qualified load admitted into the network, so that the resources allocated by the network to support a particular service response are capable of providing that response for the imposed load. This combination of admission control agents and service management elements can be summarized as "rules plus behaviors". To use the terminology of the Differentiated Services architecture [4], this admission control function is undertaken by a traffic conditioner (an entity which performs traffic conditioning functions and which may contain meters, markers, droppers, and shapers), where the actions of the conditioner are governed by explicit or implicit admission control agents.

As a general observation of QoS architectures, the service load control aspect of QoS is perhaps the most troubling component of the architecture. While there is a wide array of well understood service response mechanisms that are available to IP networks, matching a set of such mechanisms within a controlled environment to respond to a set of service loads to achieve a completely consistent service response remains an area of weakness within existing IP QoS architectures. The control elements span a number of generic requirements, including end-to-end application signaling, end-to-network service signaling and resource management signaling to allow policy-based control of network resources. This control may also span a particular scope, and use 'edge to edge' signaling, intended to support particular service responses within a defined network scope.
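As an illustration of the "rules plus behaviors" idea, the traffic conditioner described above can be sketched in Python. This is a minimal sketch under stated assumptions, not a mechanism specified by this memo: the token bucket is one common way to build a meter, and the names TokenBucketMeter and condition, the byte-based units, and the action labels are all hypothetical.

```python
class TokenBucketMeter:
    """Hypothetical meter: decides whether a packet is within its profile."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s    # sustained rate of the traffic profile
        self.burst = burst_bytes        # bucket depth (largest permitted burst)
        self.tokens = burst_bytes       # start with a full bucket
        self.last_time = 0.0            # time of the previous measurement

    def conforms(self, size_bytes: int, now: float) -> bool:
        # Refill the bucket for the elapsed time, capped at the bucket depth.
        elapsed = now - self.last_time
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_time = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes   # in-profile packet consumes tokens
            return True
        return False                    # out-of-profile packet consumes nothing


def condition(meter: TokenBucketMeter, size_bytes: int, now: float,
              drop_excess: bool = False) -> str:
    """Combine the meter with a marker or dropper (assumed policy)."""
    if meter.conforms(size_bytes, now):
        return "mark-in"
    return "drop" if drop_excess else "mark-out"


# Example: a 1000 byte/s profile with a 1500 byte burst allowance.
meter = TokenBucketMeter(rate_bytes_per_s=1000.0, burst_bytes=1500.0)
actions = [
    condition(meter, 1500, now=0.0),                    # uses the full burst
    condition(meter, 1500, now=0.5),                    # only 500 tokens back
    condition(meter, 1500, now=0.5, drop_excess=True),  # dropper policy
    condition(meter, 1500, now=1.5),                    # bucket refilled
]
```

Here the meter supplies the "rule" (the admitted load profile), while the marking or dropping action selects the "behavior" applied to each packet.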
One way of implementing this control of imposed load to match the level of available resources is through an application-driven process of service level negotiation (also known as application signaled QoS). Here, the application first signals its service requirements to the network, and the network responds to this request. The application will proceed if the network has indicated that it is able to carry the additional load at the requested service level. If the network indicates that it cannot accommodate the service requirements the application may proceed in any case, on the basis that the network will service the application's data on a best effort basis. This negotiation between the application and the network can take the form of explicit negotiation and commitment, where there is a single negotiation phase, followed by a commitment to the service level on the part of the network. This application-signaled approach can be used within the Integrated Services architecture, where the application frames its service request within the resource reservation protocol (RSVP), and then passes this request into the network. The network can either respond positively in terms of its agreement to commit to this service profile, or it can reject the request. If the network commits to the request with a resource reservation, the application can then pass traffic into the network with the expectation that as long as the traffic remains within the traffic load profile that was originally associated with the request, the network will meet the requested service levels. There is no requirement for the application to periodically reconfirm the service reservation itself, as the interaction between RSVP and the network constantly refreshes the reservation while it remains active. 
The reservation remains in force until the application explicitly requests termination of the reservation, or the network signals to the application that it is unable to continue with a service commitment to the reservation [3]. There are variations to this model, including an aggregation model where a proxy agent can fold a number of application-signaled reservations into a common aggregate reservation along a common sub-path, and a matching deaggregator can reestablish the collection of individual resource reservations upon leaving the aggregate region [5]. The essential feature of this Integrated Services model is the "all or nothing" nature of the model. Either the network commits to the reservation, in which case the requestor does not have to subsequently monitor the network's level of response to the service, or the network indicates that it cannot meet the resource reservation.

An alternative approach to load control is to decouple the network load control function from the application. This is the basis of the Differentiated Services architecture. Here, a network implements a load control function as part of the function of admission of traffic into the network, admitting no more traffic within each service category than there are assumed to be resources in the network to deliver the intended service response. Necessarily there is some element of imprecision in this function given that traffic may take an arbitrary path through the network. In terms of the interaction between the network and the application, this takes the form of a service request without prior negotiation, where the application requests a particular service response by simply marking each packet with a code to indicate the desired service.
Architecturally, this approach decouples the end systems and the network, allowing a network to implement an active admission function in order to moderate the workload that is placed upon the network's resources without specific reference to individual resource requests from end systems. While this decoupling of control allows a network's operator greater ability to manage its resources and a greater ability to ensure the integrity of its services, there is a greater potential level of imprecision in attempting to match applications' service requirements to the network's service capabilities.

2. State and Stateless QoS

These two approaches to load control can be characterized as state-based and stateless approaches respectively.

The architecture of the Integrated Services model equates the cumulative sum of honored service requests to the current reserved resource levels of the network. In order for a resource reservation to be honored by the network, the network must maintain some form of remembered state to describe the resources that have been reserved, and the network path over which the reserved service will operate. This is to ensure integrity of the reservation. In addition, each active network element within the network path must maintain a local state that allows incoming IP packets to be correctly classified into a reservation class. This classification allows the packet to be placed into a packet flow context that is associated with an appropriate service response consistent with the original end-to-end service reservation. This local state also extends to the function of metering packets for conformance on a flow-by-flow basis, and the additional overheads associated with maintenance of the state of each of these meters.
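The state-based model described above can be sketched as a small Python data structure. This is an illustrative sketch only, not RSVP itself: the names FlowKey and ReservationTable, the five-field flow identifier, and the use of kilobits per second are assumptions made for the example.

```python
from typing import Dict, NamedTuple, Optional


class FlowKey(NamedTuple):
    """Hypothetical flow identifier used to classify incoming packets."""
    src: str
    dst: str
    proto: int
    sport: int
    dport: int


class ReservationTable:
    """Hypothetical per-element state for a reservation-based service."""

    def __init__(self, capacity_kbps: int):
        self.capacity_kbps = capacity_kbps
        self.reserved: Dict[FlowKey, int] = {}   # one entry per honored request

    def reserved_total(self) -> int:
        # The cumulative sum of honored requests is the reserved resource level.
        return sum(self.reserved.values())

    def request(self, flow: FlowKey, rate_kbps: int) -> bool:
        # "All or nothing": commit to the whole request or reject it.
        already = self.reserved.get(flow, 0)
        if self.reserved_total() - already + rate_kbps > self.capacity_kbps:
            return False
        self.reserved[flow] = rate_kbps
        return True

    def release(self, flow: FlowKey) -> None:
        self.reserved.pop(flow, None)

    def classify(self, flow: FlowKey) -> Optional[int]:
        # Returns the reserved rate, or None for best effort treatment.
        return self.reserved.get(flow)


table = ReservationTable(capacity_kbps=1000)
flow_a = FlowKey("192.0.2.1", "198.51.100.1", 17, 5004, 5004)
flow_b = FlowKey("192.0.2.2", "198.51.100.2", 17, 5006, 5006)
accepted_a = table.request(flow_a, 600)   # fits within capacity
accepted_b = table.request(flow_b, 600)   # would exceed capacity, rejected
```

Note that the table holds one entry per admitted flow, so the memory and lookup cost on each network element grows with the number of reservations it carries.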
In the second approach, that of a Differentiated Services model, the packet is marked with a code to trigger the appropriate service response from the network elements that handle the packet, so that there is no strict requirement to install a per-reservation state on these network elements. Also, the end application or the service requestor is not required to provide the network with advance notice relating to the destination of the traffic, nor any indication of the intended traffic profile or the associated service profile. In the absence of such information any form of per-application or per-path resource reservation is not feasible. In this model there is no maintained per-flow state within the network.

The state-based Integrated Services architectural model admits the potential to support a greater level of accuracy, and a finer level of granularity on the part of the network to respond to service requests. Each individual application's service request can be used to generate a reservation state within the network that is intended to prevent the resources associated with the reservation from being reassigned or otherwise preempted to service other reservations or to service best effort traffic loads. The state-based model is intended to be exclusionary, where other traffic is displaced in order to meet the reservation's service targets.

As noted in RFC2208 [2], there are several areas of concern about the deployment of this form of service architecture. With regard to concerns of per-flow service scalability, the resource requirements (computational processing and memory consumption) for running per-flow resource reservations on routers increase in direct proportion to the number of separate reservations that need to be accommodated. By the same token, router forwarding performance may be impacted adversely by the packet-classification and scheduling mechanisms intended to provide differentiated services for these resource-reserved flows.
This service architecture also poses some challenges to the queuing mechanisms, where there is the requirement to allocate absolute levels of egress bandwidth to individual flows, while still supporting an unmanaged low priority best effort traffic class.

The stateless approach to service management is more approximate in the nature of its outcomes. Here there is no explicit negotiation between the application's signaling of the service request and the network's capability to deliver a particular service response. If the network is incapable of meeting the service request, then the request simply will not be honored. In such a situation there is no requirement for the network to inform the application that the request cannot be honored, and it is left to the application to determine if the service has not been delivered.

The major attribute of this approach is that it can possess excellent scaling properties from the perspective of the network. If the network is capable of supporting a limited number of discrete service responses, and the routers use per-packet marking to trigger the service response, then the processor and memory requirements in each router do not increase in proportion to the level of traffic passed through the router. Of course this approach does introduce some degree of compromise in that the service response is more approximate as seen by the end client, and scaling the number of clients and applications in such an environment may not necessarily result in a highly accurate service response to every client's application.

This document does not describe these service architectures in further detail. The reader is referred to RFC1633 [3] for an overview of the Integrated Services Architecture (IntServ) and RFC2475 [4] for an overview of the Differentiated Services architecture (DiffServ).
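The scaling property of the stateless approach can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than a description of any particular router: the table BEHAVIORS and its queue names are hypothetical, and the code points 46 (Expedited Forwarding) and 10 (AF11) are used only as familiar examples, since this memo does not name specific code points.

```python
# Hypothetical fixed mapping from DS code point to a local service response.
# Its size depends on the number of service responses, not on the number of
# flows or applications passing through the router.
BEHAVIORS = {
    46: "low-latency",   # Expedited Forwarding code point (example)
    10: "assured",       # AF11 code point (example)
    0: "best-effort",    # default code point
}


def dscp_of(ds_octet: int) -> int:
    """Extract the six-bit code point from the upper bits of the DS octet."""
    return (ds_octet >> 2) & 0x3F


def select_behavior(ds_octet: int) -> str:
    """Pick the service response from the packet marking alone."""
    # Unknown code points fall back to best effort; no per-flow state is
    # consulted and the application is not told about the fallback.
    return BEHAVIORS.get(dscp_of(ds_octet), "best-effort")


responses = [select_behavior(octet) for octet in (0xB8, 0x28, 0x00, 0xFC)]
```

Every packet is handled by a single table lookup, which is why processor and memory requirements stay flat as traffic grows, and also why an unsupported request degrades silently to best effort.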
These two approaches are the endpoints of what can be seen as a continuum of control models, where the fine-grained precision of the per application invocation reservation model can be aggregated into larger, more general and potentially more approximate aggregate rese