Network Working Group E. Crawley Request for Comments: 2386 Argon Networks Category: Informational R. Nair Arrowpoint B. Rajagopalan NEC USA H. Sandick Bay Networks August 1998 A Framework for QoS-based Routing in the Internet Status of this Memo This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited. Copyright Notice Copyright (C) The Internet Society (1998). All Rights Reserved. ABSTRACT QoS-based routing has been recognized as a missing piece in the evolution of QoS-based service offerings in the Internet. This document describes some of the QoS-based routing issues and requirements, and proposes a framework for QoS-based routing in the Internet. This framework is based on extending the current Internet routing model of intra and interdomain routing to support QoS. 1. SCOPE OF DOCUMENT & PHILOSOPHY This document proposes a framework for QoS-based routing, with the objective of fostering the development of an Internet-wide solution while encouraging innovations in solving the many problems that arise. QoS-based routing has many complex facets and it is recommended that the following two-pronged approach be employed towards its development: 1. Encourage the growth and evolution of novel intradomain QoS-based routing architectures. This is to allow the development of independent, innovative solutions that address the many QoS-based routing issues. Such solutions may be deployed in autonomous systems (ASs), large and small, based on their specific needs. Crawley, et. al. Informational [Page 1] RFC 2386 A Framework for QoS-based Routing August 1998 2. Encourage simple, consistent and stable interactions between ASs implementing routing solutions developed as above. This approach follows the traditional separation between intra and interdomain routing. It allows solutions like QOSPF [GKOP98, ZSSC97], Integrated PNNI [IPNNI] or other schemes to be deployed for intradomain routing without any restriction, other than their ability to interact with a common, and perhaps simple, interdomain routing protocol. The need to develop a single, all encompassing solution to the complex problem of QoS-based routing is therefore obviated. As a practical matter, there are many different views on how QoS-based routing should be done. Much overall progress can be made if an opportunity exists for various ideas to be developed and deployed concurrently, while some consensus on the interdomain routing architecture is being developed. Finally, this routing model is perhaps the most practical from an evolution point of view. It is superfluous to say that the eventual success of a QoS-based Internet routing architecture would depend on the ease of evolution. The aim of this document is to describe the QoS-based routing issues, identify basic requirements on intra and interdomain routing, and describe an extension of the current interdomain routing model to support QoS. It is not an objective of this document to specify the details of intradomain QoS-based routing architectures. This is left up to the various intradomain routing efforts that might follow. Nor is it an objective to specify the details of the interface between reservation protocols such as RSVP and QoS-based routing. The specific interface functionality needed, however, would be clear from the intra and interdomain routing solutions devised. In the intradomain area, the goal is to develop the basic routing requirements while allowing maximum freedom for the development of solutions. In the interdomain area, the objectives are to identify the QoS-based routing functions, and facilitate the development or enhancement of a routing protocol that allows relatively simple interaction between domains. In the next section, a glossary of relevant terminology is given. In Section 3, the objectives of QoS-based routing are described and the issues that must be dealt with by QoS-based Internet routing efforts are outlined. In Section 4, some requirements on intradomain routing are defined. These requirements are purposely broad, putting few constraints on solution approaches. The interdomain routing model and issues are described in Section 5 and QoS-based multicast routing is discussed in Section 6. The interaction between QoS-based routing and resource reservation protocols is briefly considered in Section 7. Security considerations are listed in Section 8 and related work is described in Section 9. Finally, summary and conclusions are presented in Section 10. Crawley, et. al. Informational [Page 2] RFC 2386 A Framework for QoS-based Routing August 1998 2. GLOSSARY The following glossary lists the terminology used in this document and an explanation of what is meant. Some of these terms may have different connotations, but when used in this document, their meaning is as given. Alternate Path Routing : A routing technique where multiple paths, rather than just the shortest path, between a source and a destination are utilized to route traffic. One of the objectives of alternate path routing is to distribute load among multiple paths in the network. Autonomous System (AS): A routing domain which has a common administrative authority and consistent internal routing policy. An AS may employ multiple intradomain routing protocols internally and interfaces to other ASs via a common interdomain routing protocol. Source: A host or router that can be identified by a unique unicast IP address. Unicast destination: A host or router that can be identified by a unique unicast IP address. Multicast destination: A multicast IP address indicating all hosts and routers that are members of the corresponding group. IP flow (or simply "flow"): An IP packet stream from a source to a destination (unicast or multicast) with an associated Quality of Service (QoS) (see below) and higher level demultiplexing information. The associated QoS could be "best-effort". Quality-of-Service (QoS): A set of service requirements to be met by the network while transporting a flow. Service class: The definitions of the semantics and parameters of a specific type of QoS. Integrated services: The Integrated Services model for the Internet defined in RFC 1633 allows for integration of QoS services with the best effort services of the Internet. The Integrated Services (IntServ) working group in the IETF has defined two service classes, Controlled Load Service [W97] and Guaranteed Service [SPG97]. RSVP: The ReSerVation Protocol [BZBH97]. A QoS signaling protocol for the Internet. Path: A unicast or multicast path. Crawley, et. al. Informational [Page 3] RFC 2386 A Framework for QoS-based Routing August 1998 Unicast path: A sequence of links from an IP source to a unicast IP destination, determined by the routing scheme for forwarding packets. Multicast path (or Multicast Tree): A subtree of the network topology in which all the leaves and zero or more interior nodes are members of the same multicast group. A multicast path may be per-source, in which case the subtree is rooted at the source. Flow set-up: The act of establishing state in routers along a path to satisfy the QoS requirement of a flow. Crankback: A technique where a flow setup is recursively backtracked along the partial flow path up to the first node that can determine an alternative path to the destination. QoS-based routing: A routing mechanism under which paths for flows are determined based on some knowledge of resource availability in the network as well as the QoS requirement of flows. Route pinning: A mechanism to keep a flow path fixed for a duration of time. Flow Admission Control (FAC): A process by which it is determined whether a link or a node has sufficient resources to satisfy the QoS required for a flow. FAC is typically applied by each node in the path of a flow during flow set-up to check local resource availability. Higher-level admission control: A process by which it is determined whether or not a flow set-up should proceed, based on estimates and policy requirements of the overall resource usage by the flow. Higher-level admission control may result in the failure of a flow set-up even when FAC at each node along the flow path indicates resource availability. 3. QOS-BASED ROUTING: BACKGROUND AND ISSUES 3.1 Best-Effort and QoS-Based Routing Routing deployed in today's Internet is focused on connectivity and typically supports only one type of datagram service called "best effort" [WC96]. Current Internet routing protocols, e.g. OSPF, RIP, use "shortest path routing", i.e. routing that is optimized for a single arbitrary metric, administrative weight or hop count. These routing protocols are also "opportunistic," using the current shortest path or route to a destination. Alternate paths with acceptable but non-optimal cost can not be used to route traffic (shortest path routing protocols do allow a router to alternate among Crawley, et. al. Informational [Page 4] RFC 2386 A Framework for QoS-based Routing August 1998 several equal cost paths to a destination). QoS-based routing must extend the current routing paradigm in three basic ways. First, to support traffic using integrated-services class of services, multiple paths between node pairs will have to be calculated. Some of these new classes of service will require the distribution of additional routing metrics, e.g. delay, and available bandwidth. If any of these metrics change frequently, routing updates can become more frequent thereby consuming network bandwidth and router CPU cycles. Second, today's opportunistic routing will shift traffic from one path to another as soon as a "better" path is found. The traffic will be shifted even if the existing path can meet the service requirements of the existing traffic. If routing calculation is tied to frequently changing consumable resources (e.g. available bandwidth) this change will happen more often and can introduce routing oscillations as traffic shifts back and forth between alternate paths. Furthermore, frequently changing routes can increase the variation in the delay and jitter experienced by the end users. Third, as mentioned earlier, today's optimal path routing algorithms do not support alternate routing. If the best existing path cannot admit a new flow, the associated traffic cannot be forwarded even if an adequate alternate path exists. 3.2 QoS-Based Routing and Resource Reservation It is important to understand the difference between QoS-based routing and resource reservation. While resource reservation protocols such as RSVP [BZBH97] provide a method for requesting and reserving network resources, they do not provide a mechanism for determining a network path that has adequate resources to accommodate the requested QoS. Conversely, QoS-based routing allows the determination of a path that has a good chance of accommodating the requested QoS, but it does not include a mechanism to reserve the required resources. Consequently, QoS-based routing is usually used in conjunction with some form of resource reservation or resource allocation mechanism. Simple forms of QoS-based routing have been used in the past for Type of Service (TOS) routing [M98]. In the case of OSPF, a different shortest-path tree can be computed for each of the 8 TOS values in the IP header [ISI81]. Such mechanisms can be used to select specially provisioned paths but do not completely assure that resources are not overbooked along the path. As long as strict resource management and control are not needed, mechanisms such as TOS-based routing are useful for separating whole classes of traffic Crawley, et. al. Informational [Page 5] RFC 2386 A Framework for QoS-based Routing August 1998 over multiple routes. Such mechanisms might work well with the emerging Differential Services efforts [BBCD98]. Combining a resource reservation protocol with QoS-based routing allows fine control over the route and resources at the cost of additional state and setup time. For example, a protocol such as RSVP may be used to trigger QoS-based routing calculations to meet the needs of a specific flow. 3.3 QoS-Based Routing: Objectives Under QoS-based routing, paths for flows would be determined based on some knowledge of resource availability in the network, as well as the QoS requirement of flows. The main objectives of QoS-based routing are: 1. Dynamic determination of feasible paths: QoS-based routing can determine a path, from among possibly many choices, that has a good chance of accommodating the QoS of the given flow. Feasible path selection may be subject to policy constraints, such as path cost, provider selection, etc. 2. Optimization of resource usage: A network state-dependent QoS- based routing scheme can aid in the efficient utilization of network resources by improving the total network throughput. Such a routing scheme can be the basis for efficient network engineering. 3. Graceful performance degradation: State-dependent routing can compensate for transient inadequacies in network engineering (e.g., during focused overload conditions), giving better throughput and a more graceful performance degradation as compared to a state-insensitive routing scheme [A84]. QoS-based routing in the Internet, however, raises many issues: - How do routers determine the QoS capability of each outgoing link and reserve link resources? Note that some of these links may be virtual, over ATM networks and others may be broadcast multi- access links. - What is the granularity of routing decision (i.e., destination- based, source and destination-based, or flow-based)? - What routing metrics are used and how are QoS-accommodating paths computed for unicast flows? Crawley, et. al. Informational [Page 6] RFC 2386 A Framework for QoS-based Routing August 1998 - How are QoS-accommodating paths computed for multicast flows with different reservation styles and receiver heterogeneity? - What are the performance objectives while computing QoS-based paths? - What are the administrative control issues? - What factors affect the routing overheads?, and - How is scalability achieved? Some of these issues are discussed briefly next. Interdomain routing is discussed in Section 5. 3.4 QoS Determination and Resource Reservation To determine whether the QoS requirements of a flow can be accommodated on a link, a router must be able to determine the QoS available on the link. It is still an open issue as to how the QoS availability is determined for broadcast multiple access links (e.g., Ethernet). A related problem is the reservation of resources over such links. Solutions to these problems are just emerging [GPSS98]. Similar problems arise when a router is connected to a large non- broadcast multiple access network, such as ATM. In this case, if the destination of a flow is outside the ATM network, the router may have multiple egress choices. Furthermore, the QoS availability on the ATM paths to each egress point may be different. The issues then are, o how does a router determine all the egress choices across the ATM network? o how does it determine what QoS is available over the path to each egress point?, and o what QoS value does the router advertise for the ATM link. Typically, IP routing over ATM (e.g., NHRP) allows the selection of a single egress point in the ATM network, and the procedure does not incorporate any knowledge of the QoS required over the path. An approach like I-PNNI [IPNNI] would be helpful here, although it introduces some complexity. An additional problem with resource reservation is how to determine what resources have already been allocated to a multicast flow. The availability of this information during path computation improves the chances of finding a path to add a new receiver to a multicast flow. QOSPF [ZSSC97] handles this problem by letting routers broadcast reserved resource information to other routers in their area. Crawley, et. al. Informational [Page 7] RFC 2386 A Framework for QoS-based Routing August 1998 Alternate path routing [ZES97] deals with this issue by using probe messages to find a path with sufficient resources. Path QoS Computation (PQC) method, proposed in [GOA97], propagates bandwidth allocation information in RSVP PATH messages. A router receiving the PATH message gets an indication of the resource allocation only on those links in the path to itself from the source. Allocation for the same flow on other remote branches of the multicast tree is not available. Thus, the PQC method may not be sufficient to find feasible QoS-accommodating paths to all receivers. 3.5 Granularity of Routing Decision Routing in the Internet is currently based only on the destination address of a packet. Many multicast routing protocols require routing based on the source AND destination of a packet. The Integrated Services architecture and RSVP allow QoS determination for an individual flow between a source and a destination. This set of routing granularities presents a problem for QoS routing solutions. If routing based only on destination address is considered, then an intermediate router will route all flows between different sources and a given destination along the same path. This is acceptable if the path has adequate capacity but a problem arises if there are multiple flows to a destination that exceed the capacity of the link. One version of QOSPF [ZSSC97] determines QoS routes based on source and destination address. This implies that all traffic between a given source and destination, regardless of the flow, will travel down the same route. Again, the route must have capacity for all the QoS traffic for the source/destination pair. The amount of routing state also increases since the routing tables must include source/destination pairs instead of just the destination. The best granularity is found when routing is based on individual flows but this incurs a tremendous cost in terms of the routing state. Each QoS flow can be routed separately between any source and destination. PQC [GOA97] and alternate path routing [ZES97], are examples of solutions which operate at the flow level. Both source/destination and flow-based routing may be susceptible to packet looping under hop-by-hop forwarding. Suppose a node along a flow or source/destination-based path loses the state information for the flow. Also suppose that the flow-based route is different from the regular destination-based route. The potential then exists for a routing loop to form when the node forwards a packet belonging to the flow using its destination-based routing table to a node that occurs Crawley, et. al. Informational [Page 8] RFC 2386 A Framework for QoS-based Routing August 1998 earlier on the flow-based path. This is because the latter node may use its flow-based routing table to forward the packet again to the former and this can go on indefinitely. 3.6 Metrics and Path Computation 3.6.1 Metric Selection and Representation There are some considerations in defining suitable link and node metrics [WC96]. First, the metrics must represent the basic network properties of interest. Such metrics include residual bandwidth, delay and jitter. Since the flow QoS requirements have to be mapped onto path metrics, the metrics define the types of QoS guarantees the network can support. Alternatively, QoS-based routing cannot support QoS requirements that cannot be meaningfully mapped onto a reasonable combination of path metrics. Second, path computation based on a metric or a combination of metrics must not be too complex as to render them impractical. In this regard, it is worthwhile to note that path computation based on certain combinations of metrics (e.g., delay and jitter) is theoretically hard. Thus, the allowable combinations of metrics must be determined while taking into account the complexity of computing paths based on these metrics and the QoS needs of flows. A common strategy to allow flexible combinations of metrics while at the same time reduce the path computation complexity is to utilize "sequential filtering". Under this approach, a combination of metrics is ordered in some fashion, reflecting the importance of different metrics (e.g., cost followed by delay, etc.). Paths based on the primary metric are computed first (using a simple algorithm, e.g., shortest path) and a subset of them are eliminated based on the secondary metric and so forth until a single path is found. This is an approximation technique and it trades off global optimality for path computation simplicity (The filtering technique may be simpler, depending on the set of metrics used. For example, with bandwidth and cost as metrics, it is possible to first eliminate the set of links that do not have the requested bandwidth and then compute the least cost path using the remaining links.) Now, once suitable link and node metrics are defined, a uniform representation of them is required across independent domains - employing possibly different routing schemes - in order to derive path metrics consistently (path metrics are obtained by the composition of link and node metrics). Encoding of the maximum, minimum, range, and granularity of the metrics are needed. Also, the definitions of comparison and accumulation operators are required. In addition, suitable triggers must be defined for indicating a significant change from a minor change. The former will cause a routing update to be generated. The stability of the QoS routes would Crawley, et. al. Informational [Page 9] RFC 2386 A Framework for QoS-based Routing August 1998 depend on the ability to control the generation of updates. With interdomain routing, it is essential to obtain a fairly stable view of the interconnection among the ASs. 3.6.2 Metric Hierarchy A hierarchy can be defined among various classes of service based on the degree to which traffic from one class can potentially degrade service of traffic from lower classes that traverse the same link. In this hierarchy, guaranteed constant bit rate traffic is at the top and "best-effort" datagram traffic at the bottom. Classes providing service higher in the hierarchy impact classes providing service in lower levels. The same situation is not true in the other direction. For example, a datagram flow cannot affect a real-time service. Thus, it may be necessary to distribute and update different metrics for each type of service in the worst case. But, several advantages result by identifying a single default metric. For example, one could derive a single metric combining the availability of datagram and real-time service over a common substrate. 3.6.3 Datagram Flows A delay-sensitive metric is probably the most obvious type of metric suitable for datagram flows. However, it requires careful analysis to avoid instabilities and to reduce storage and bandwidth requirements. For example, a recursive filtering technique based on a simple and efficient weighted averaging algorithm [NC94] could be used. This filter is used to stabilize the metric. While it is adequate for smoothing most loading patterns, it will not distinguish between patterns consisting of regular bursts of traffic and random loading. Among other stabilizing tools, is a minimum time between updates that can help filter out high-frequency oscillations. 3.6.4 Real-time Flows In real-time quality-of-service, delay variation is generally more critical than delay as long as the delay is not too high. Clearly, voice-based applications cannot tolerate more than a certain level of delay. The condition of varying delays may be expected to a greater degree in a shared medium environment with datagrams, than in a network implemented over a switched substrate. Routing a real-time flow therefore reduces to an exercise in allocating the required network resources while minimizing fragmentation of bandwidth. The resulting situation is a bandwidth-limited minimum hop path from a source to the destination. In other words, the router performs an ordered search through paths of increasing hop count until it finds one that meets all the bandwidth needs of the flow. To reduce contention and the probability of false probes (due to inaccuracy in Crawley, et. al. Informational [Page 10] RFC 2386 A Framework for QoS-based Routing August 1998 route tables), the router could select a path randomly from a "window" of paths which meet the needs of the flow and satisfy one of three additional criteria: best-fit, first-fit or worst-fit. Note that there is a similarity between the allocation of bandwidth and the allocation of memory in a multiprocessing system. First-fit seems to be appropriate for a system with a high real-time flow arrival rates; and worst-fit is ideal for real-time flows with high holding times. This rather nonintuitive result was shown in [NC94]. 3.6.5 Path Properties Path computation by itself is merely a search technique, e.g., Shortest Path First (SPF) is a search technique based on dynamic programming. The usefulness of the paths computed depends to a large extent on the metrics used in evaluating the cost of a path with respect to a flow. Each link considered by the path computation engine must be evaluated against the requirements of the flow, i.e., the cost of providing the services required by the flow must be estimated with respect to the capabilities of the link. This requires a uniform method of combining features such as delay, bandwidth, priority and other service features. Furthermore, the costs must reflect the lost opportunity of using each link after routing the flow. 3.6.6 Performance Objectives One common objective during path computation is to improve the total network throughput. In this regard, merely routing a flow on any path that accommodates its QoS requirement is not a good strategy. In fact, this corresponds to uncontrolled alternate routing [SD95] and may adversely impact performance at higher traffic loads. It is therefore necessary to consider the total resource allocation for a flow along a path, in relation to available resources, to determine whether or not the flow should be routed on the path. Such a mechanism is referred to in this document as "higher level admission control". The goal of this is to ensure that the "cost" incurred by the network in routing a flow with a given QoS is never more than the revenue gained. The routing cost in this regard may be the lost revenue in potentially blocking other flows that contend for the same resources. The formulation of the higher level admission control strategy, with suitable administrative hooks and with fairness to all flows desiring entry to the network, is an issue. The fairness problem arises because flows with smaller reservations tend to be more successfully routed than flows with large reservations, for a given engineered capacity. To guarantee a certain level of Crawley, et. al. Informational [Page 11] RFC 2386 A Framework for QoS-based Routing August 1998 acceptance rate for "larger" flows, without over-engineering the network, requires a fair higher level admission control mechanism. The application of higher level admission control to multicast routing is discussed later. 3.7 Administrative Control There are several administrative control issues. First, within an AS employing state-dependent routing, administrative control of routing behavior may be necessary. One example discussed earlier was higher level admission control. Some others are described in this section. Second, the control of interdomain routing based on policy is an issue. The discussion of interdomain routing is defered to Section 5. Two areas that need administrative control, in addition to appropriate routing mechanisms, are handling flow priority with preemption, and resource allocation for multiple service classes. 3.7.1 Flow Priorities and Preemption If there are critical flows that must be accorded higher priority than other types of flows, a mechanism must be implemented in the network to recognize flow priorities. There are two aspects to prioritizing flows. First, there must be a policy to decide how different users are allowed to set priorities for flows they originate. The network must be able to verify that a given flow is allowed to claim a priority level signaled for it. Second, the routing scheme must ensure that a path with the requested QoS will be found for a flow with a probability that increases with the priority of the flow. In other words, for a given network load, a high priority flow should be more likely to get a certain QoS from the network than a lower priority flow requesting the same QoS. Routing procedures for flow prioritization can be complex. Identification and evaluation of different procedures are areas that require investigation. 3.7.2 Resource Control If there are multiple service classes, it is necessary to engineer a network to carry the forecasted traffic demands of each class. To do this, router and link resources may be logically partitioned among various service classes. It is desirable to have dynamic partitioning whereby unused resources in various partitions are dynamically shifted to other partitions on demand [ACFH92]. Dynamic sharing, however, must be done in a controlled fashion in order to prevent traffic under some service class from taking up more resources than Crawley, et. al. Informational [Page 12] RFC 2386 A Framework for QoS-based Routing August 1998 what was engineered for it for prolonged periods of time. The design of such a resource sharing scheme, and its incorporation into the QoS-based routing scheme are significant issues. 3.8 QoS-Based Routing for Multicast Flows QoS-based multicast routing is an important problem, especially if the notion of higher level admission control is included. The dynamism in the receiver set allowed by IP multicast, and receiver heterogeneity add to the problem. With straightforward implementation of distributed heuristic algorithms for multicast path computation [W88, C91], the difficulty is essentially one of scalability. To accommodate QoS, multicast path computation at a router must have knowledge of not only the id of subnets where group members are present, but also the identity of branches in the existing tree. In other words, routers must keep flow-specific state information. Also, computing optimal shared trees based on the shared reservation style [BZBH97], may require new algorithms. Multicast routing is discussed in some detail in Section 6. 3.9 Routing Overheads The overheads incurred by a routing scheme depend on the type of the routing scheme, as well as the implementation. There are three types of overheads to be considered: computation, storage and communication. It is necessary to understand the implications of choosing a routing mechanism in terms of these overheads. For example, considering link state routing, the choice of the update propagation mechanism is important since network state is dynamic and changes relatively frequently. Specifically, a flooding mechanism would result in many unnecessary message transmissions and processing. Alternative techniques, such as tree-based forwarding [R96], have to be considered. A related issue is the quantization of state information to prevent frequent updating of dynamic state. While coarse quantization reduces updating overheads, it may affect the performance of the routing scheme. The tradeoff has to be carefully evaluated. QoS-based routing incurs certain overheads during flow establishment, for example, computing a source route. Whether this overhead is disproportionate compared to the length of the sessions is an issue. In general, techniques for the minimization of routing-related overheads during flow establishment must be investigated. Approaches that are useful include pre-computation of routes, caching recently used routes, and TOS routing based on hints in packets (e.g., the TOS field). Crawley, et. al. Informational [Page 13] RFC 2386 A Framework for QoS-based Routing August 1998 3.10 Scaling by Hierarchical Aggregation QoS-based routing should be scalable, and hierarchical aggregation is a common technique for scaling (e.g., [PNNI96]). But this introduces problems with regard to the accuracy of the aggregated state information [L95]. Also, the aggregation of paths under multiple constraints is difficult. One of the difficulties is the risk of accepting a flow based on inaccurate information, but not being able to support the QoS requirements of flow because the capabilities of the actual paths that are aggregated are not known during route computation. Performance impacts of aggregating path metric information must therefore be understood. A way to compensate for inaccuracies is to use crankback, i.e., dynamic search for alternate paths as a flow is being routed. But crankback increases the time to set up a flow, and may adversely affect the performance of the routing scheme under some circumstances. Thus, crankback must be used judiciously, if at all, along with a higher level admission control mechanism. 4. INTRADOMAIN ROUTING REQUIREMENTS At the intradomain level, the objective is to allow as much latitude as possible in addressing the QoS-based routing issues. Indeed, there are many ideas about how QoS-based routing services can be provisioned within ASs. These range from on-demand path computation based on current state information, to statically provisioned paths supporting a few service classes. Another aspect that might invite differing solutions is performance optimization. Based on the technique used for this, intradomain routing could be very sophisticated or rather simple. Finally, the service classes supported, as well as the specific QoS engineered for a service class, could differ from AS to AS. For instance, some ASs may not support guaranteed service, while others may. Also, some ASs supporting the service may be engineered for a better delay bound than others. Thus, it requires considerable thought to determine the high level requirements for intradomain routing that both supports the overall view of QoS-based routing in the Internet and allows maximum autonomy in developing solutions. Our view is that certain minimum requirements must be satisfied by intradomain routing in order to be qualified as "QoS-based" routing. These are: - The routing scheme must route a flow along a path that can accommodate its QoS requirements, or indicate that the flow cannot be admitted with the QoS currently being requested. Crawley, et. al. Informational [Page 14] RFC 2386 A Framework for QoS-based Routing August 1998 - The routing scheme must indicate disruptions to the current route of a flow due to topological changes. - The routing scheme must accommodate best-effort flows without any resource reservation requirements. That is, present best effort applications and protocol stacks need not have to change to run in a domain employing QoS-based routing. - The routing scheme may optionally support QoS-based multicasting with receiver heterogeneity and shared reservation styles. In addition, the following capabilities are also recommended: - Capabilities to optimize resource usage. - Implementation of higher level admission control procedures to limit the overall resource utilization by individual flows. Further requirements along these lines may be specified. The requirements should capture the consensus view of QoS-based routing, but should not preclude particular approaches (e.g., TOS-based routing) from being implemented. Thus, the intradomain requirements are expected to be rather broad. 5. INTERDOMAIN ROUTING The fundamental requirement on interdomain QoS-based routing is scalability. This implies that interdomain routing cannot be based on highly dynamic network state information. Rather, such routing must be aided by sound network engineering and relatively sparse information exchange between independent routing domains. This approach has the advantage that it can be realized by straightforward extensions of the present Internet interdomain routing model. A number of issues, however, need to be addressed to achieve this, as discussed below. Crawley, et. al. Informational [Page 15] RFC 2386 A Framework for QoS-based Routing August 1998 5.1 Interdomain QoS-Based Routing Model The interdomain QoS-based routing model is depicted below: AS1 AS2 AS3 ___________ _____________ ____________ | | | | | | | B------B B----B | | | | | | | -----B----- B------------- --B--------- \ / / \ / / ____B_____B____ _________B______ | | | | | B-------B | | | | | | B-------B | --------------- ---------------- AS4 AS5 Here, ASs exchange standardized routing information via border nodes B. Under this model, each AS can itself consist of a set of interconnected ASs, with standardized routing interaction. Thus, the interdomain routing model is hierarchical. Also, each lowest level AS employs an intradomain QoS-based routing scheme (proprietary or standardized by intradomain routing efforts such as QOSPF). Given this structure, some questions that arise are: - What information is exchanged between ASs? - What routing capabilities does the information exchange lead to? (E.g., source routing, on-demand path computation, etc.) - How is the external routing information represented within an AS? - How are interdomain paths computed? - What sort of policy controls may be exerted on interdomain path computation and flow routing?, and - How is interdomain QoS-based multicast routing accomplished? At a high level, the answers to these questions depend on the routing paradigm. Specifically, considering link state routing, the information exchanged between domains would consist of an abstract representation of the domains in the form of logical nodes and links, along with metrics that quantify their properties and resource availability. The hierarchical structure of the ASs may be handled Crawley, et. al. Informational [Page 16] RFC 2386 A Framework for QoS-based Routing August 1998 by a hierarchical link state representation, with appropriate metric aggregation. Link state routing may not necessarily be advantageous for interdomain routing for the following reasons: - One advantage of intradomain link state routing is that it would allow fairly detailed link state information be used to compute paths on demand for flows requiring QoS. The state and metric aggregation used in interdomain routing, on the other hand, erodes this property to a great degree. - The usefulness of keeping track of the abstract topology and metrics of a remote domain, or the interconnection between remote domains is not obvious. This is especially the case when the remote topology and metric encoding are lossy. - ASs may not want to advertise any details of their internal topology or resource availability. - Scalability in interdomain routing can be achieved only if information exchange between domains is relatively infrequent. Thus, it seems practical to limit information flow between domains as much as possible. Compact information flow allows the implementation QoS-enhanced versions of existing interdomain protocols such as BGP-4. We look at the interdomain routing issues in this context. 5.2 Interdomain Information Flow The information flow between routing domains must enable certain basic functions: 1. Determination of reachability to various destinations 2. Loop-free flow routes 3. Address aggregation whenever possible 4. Determination of the QoS that will be supported on the path to a destination. The QoS information should be relatively static, determined from the engineered topology and capacity of an AS rather than ephemeral fluctuations in traffic load through the AS. Ideally, the QoS supported in a transit AS should be allowed to vary significantly only under exceptional circumstances, such as failures or focused overload. Crawley, et. al. Informational [Page 17] RFC 2386 A Framework for QoS-based Routing August 1998 5. Determination, optionally, of multiple paths for a given destination, based on service classes. 6. Expression of routing policies, including monetary cost, as a function of flow parameters, usage and administrative factors. Items 1-3 are already part of existing interdomain routing. Item 5 is also a straightfoward extension of the current model. The main problem areas are therefore items 4 and 6. The QoS of an end-to-end path is obtained by composing the QoS available in each transit AS. Thus, border routers must first determine what the locally available QoS is in order to advertise routes to both internal and external destinations. The determination of local "AS metrics" (corresponding to link metrics in the intradomain case) should not be subject to too much dynamism. Thus, the issue is how to define such metrics and what triggers an occasional change that results in re-advertisements of routes. The approach suggested in this document is not to compute paths based on residual or instantaneous values of AS metics (which can be dynamic), but utilize only the QoS capabilities engineered for aggregate transit flows. Such engineering may be based on the knowledge of traffic to be expected from each neighboring ASs and the corresponding QOS needs. This information may be obtained based on contracts agreed upon prior to the provisioning of services. The AS metric then corresponds to the QoS capabilities of the "virtual path" engineered through the AS (for transit traffic) and a different metric may be used for different neighbors. This is illustrated in the following figure. AS1 AS2 AS3 ___________ _____________ ____________ | | | | | | | B------B1 B2----B | | | | | | | -----B----- B3------------ --B--------- \ / \ / ____B_____B____ | | | | | | | | --------------- AS4 Crawley, et. al. Informational [Page 18] RFC 2386 A Framework for QoS-based Routing August 1998 Here, B1 may utilize an AS metric specific for AS1 when computing path metrics to be advertised to AS1. This metric is based on the resources engineered in AS2 for transit traffic from AS1. Similarly, B3 may utilize a different metric when computing path metrics to be advertised to AS4. Now, it is assumed that as long as traffic flow into AS2 from AS1 or AS4 does not exceed the engineered values, these path metrics would hold. Excess traffic due to transient fluctuations, however, may be handled as best effort or marked with a discard bit. Thus, this model is different from the intradomain model, where end nodes pick a path dynamically based on the QoS needs of the flow to be routed. Here, paths within ASs are engineered based on presumed, measured or declared traffic and QoS requirements. Under this model, an AS can contract for routes via multiple transit ASs with different QoS requirements. For instance, AS4 above can use both AS1 and AS2 as transits for same or different destinations. Also, a QoS contract between one AS and another may generate another contract between the second and a third AS and so forth. An issue is what triggers the recomputation of path metrics within an AS. Failures or other events that prevent engineered resource allocation should certainly trigger recomputation. Recomputation should not be triggered in response to arrival of flows within the engineered limit. 5.3 Path Computation Path computation for an external destination at a border node is based on reachability, path metrics and local policies of selection. If there are multiple selection criteria (e.g., delay, bandwidth, cost, etc.), mutiple alternaives may have to be maintained as well as propagated by border nodes. Selection of a path from among many alternatives would depend on the QoS requests of flows, as well as policies. Path computation may also utilze any heuristics for optimizing resource usage. 5.4 Flow Aggregation An important issue in interdomain routing is the amount of flow state to be processed by transit ASs. Reducing the flow state by aggregation techniques must therefore be seriously considered. Flow aggregation means that transit traffic through an AS is classified into a few aggregated streams rather than being routed at the individual flow level. For example, an entry border router may classify