Congestion and Pre-Congestion Notification                    B. Briscoe
Internet-Draft                                                        BT
Intended status: Standards Track                            T. Moncaster
Expires: November 22, 2011                 Moncaster Internet Consulting
                                                                M. Menth
                                                 University of Tuebingen
                                                            May 21, 2011
Encoding 3 PCN-States in the IP header using a single DSCP
draft-ietf-pcn-3-in-1-encoding-05
The objective of Pre-Congestion Notification (PCN) is to protect the quality of service (QoS) of inelastic flows within a Diffserv domain. On every link in the PCN domain, the overall rate of the PCN-traffic is metered, and PCN-packets are appropriately marked when certain configured rates are exceeded. Egress nodes provide decision points with information about the PCN-marks of PCN-packets which allows them to take decisions about whether to admit or block a new flow request, and to terminate some already admitted flows during serious pre-congestion.
This document specifies how PCN-marks are to be encoded into the IP header by re-using the Explicit Congestion Notification (ECN) codepoints within a PCN-domain. This encoding builds on the baseline encoding of RFC5696 and provides for three different PCN marking states using a single DSCP: not-marked (NM), threshold-marked (ThM) and excess-traffic-marked (ETM). Hence, it is called the 3-in-1 PCN encoding.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 22, 2011.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The objective of Pre-Congestion Notification (PCN) [RFC5559] is to protect the quality of service (QoS) of inelastic flows within a Diffserv domain, in a simple, scalable, and robust fashion. Two mechanisms are used: admission control, to decide whether to admit or block a new flow request, and flow termination to terminate some existing flows during serious pre-congestion. To achieve this, the overall rate of PCN-traffic is metered on every link in the domain, and PCN-packets are appropriately marked when certain configured rates are exceeded. These configured rates are below the rate of the link thus providing notification to boundary nodes about overloads before any real congestion occurs (hence "pre-congestion notification").
[RFC5670] provides for two metering and marking functions that are configured with reference rates. Threshold-marking marks all PCN packets once their traffic rate on a link exceeds the configured reference rate (PCN-threshold-rate). Excess-traffic-marking marks only those PCN packets that exceed the configured reference rate (PCN-excess-rate). The PCN-excess-rate is typically larger than the PCN-threshold-rate [RFC5559]. Egress nodes monitor the PCN-marks of received PCN-packets and provide information about the PCN-marks to decision points which take decisions about flow admission and termination on this basis [I-D.ietf-pcn-cl-edge-behaviour], [I-D.ietf-pcn-sm-edge-behaviour].
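As a rough illustration of the difference between the two marking functions, the following Python sketch applies both to the packets seen on one link during a single measurement interval. It is a deliberately simplified per-interval model and not the token-bucket algorithms specified in [RFC5670]; the function names, the fixed interval, and the byte-based rates are assumptions made for this example only.

```python
def threshold_mark(packet_sizes, interval_s, threshold_rate):
    """Simplified threshold-marking: if the measured rate over the interval
    exceeds the PCN-threshold-rate, ALL packets are marked."""
    measured_rate = sum(packet_sizes) / interval_s  # bytes per second
    exceeded = measured_rate > threshold_rate
    return [exceeded for _ in packet_sizes]


def excess_traffic_mark(packet_sizes, interval_s, excess_rate):
    """Simplified excess-traffic-marking: only the packets that do not fit
    within the byte budget of the PCN-excess-rate are marked."""
    budget = excess_rate * interval_s  # bytes allowed in this interval
    used = 0
    marks = []
    for size in packet_sizes:
        if used + size <= budget:
            used += size
            marks.append(False)  # within the configured rate: not marked
        else:
            marks.append(True)   # excess traffic: marked
    return marks


# Ten 500-byte packets in one second, i.e. 5000 bytes/s of PCN-traffic.
packets = [500] * 10
thm = threshold_mark(packets, interval_s=1.0, threshold_rate=4000)
etm = excess_traffic_mark(packets, interval_s=1.0, excess_rate=3000)
```

In this example every packet is threshold-marked because the measured rate is above the hypothetical threshold rate, whereas only the packets beyond the hypothetical 3000-byte budget are excess-traffic-marked.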
The baseline encoding defined in [RFC5696] describes how two PCN marking states (Not-marked and PCN-Marked) can be encoded using a single Diffserv codepoint. It also provides an experimental codepoint (EXP), along with guidelines for use of that codepoint. To support the application of two different marking algorithms in a PCN-domain, for example as required in [I-D.ietf-pcn-cl-edge-behaviour], three PCN marking states are needed. This document describes an extension to the baseline encoding that uses the EXP codepoint to provide a third PCN marking state in the IP header, still using a single Diffserv codepoint. This encoding scheme is called "3-in-1 PCN encoding".
This document only concerns the PCN wire protocol encoding for all IP headers, whether IPv4 or IPv6. It makes no changes or recommendations concerning algorithms for congestion marking or congestion response. Other documents define the PCN wire protocol for other header types. For example, the MPLS encoding is defined in [RFC5129] and Appendix A of that document provides an informative example for a mapping between the encodings in IP and in MPLS.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
General PCN-related terminology is defined in the PCN architecture [RFC5559], and terminology specific to packet encoding is defined in the PCN baseline encoding [RFC5696]. Additional terminology is defined below.
In accordance with the PCN architecture [RFC5559], PCN-ingress-nodes control packets entering a PCN-domain. Packets belonging to PCN-controlled flows are subject to PCN-metering and -marking, and PCN-ingress-nodes mark them as Not-marked (PCN-colouring). Any node in the PCN-domain may perform PCN-metering and -marking and mark PCN-packets if needed. There are two different metering and marking schemes: threshold-marking and excess-traffic-marking [RFC5670]. Some edge behaviors require only a single marking scheme [I-D.ietf-pcn-sm-edge-behaviour], others require both [I-D.ietf-pcn-cl-edge-behaviour]. In the latter case, three PCN marking states are needed: not-marked (NM) to indicate not-marked packets, threshold-marked (ThM) to indicate packets marked by the threshold-marker, and excess-traffic-marked (ETM) to indicate packets marked by the excess-traffic-marker [RFC5670]. Threshold-marking and excess-traffic-marking are configured to start marking packets at different load conditions, so one marking scheme indicates more severe pre-congestion than the other. Therefore, a fourth PCN marking state indicating that a packet is marked by both markers is not needed. However, a fourth codepoint is required to indicate packets that are not PCN-capable (the not-PCN codepoint).
In all current PCN edge behaviors that use two marking schemes [RFC5559], [I-D.ietf-pcn-cl-edge-behaviour], excess-traffic-marking is configured with a larger reference rate than threshold-marking. We take this as a rule and define excess-traffic-marked as a more severe PCN-mark than threshold-marked.
The baseline encoding scheme [RFC5696] was defined so that it could be extended to accommodate an additional marking state. It provides rules to embed the encoding of two PCN states in the IP header. Figure 1 shows the structure of the former type-of-service field. It contains the 6-bit Differentiated Services (DS) field that holds the DS codepoint (DSCP) [RFC2474] and the 2-bit ECN field [RFC3168].
   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|              DS FIELD             | ECN FIELD |
+-----+-----+-----+-----+-----+-----+-----+-----+

Figure 1: Structure of the former type-of-service field
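The layout of Figure 1 can be sketched in Python as follows. This is a minimal illustration that assumes the octet is available as an integer in the range 0 to 255; the function names are hypothetical and are not defined by this document.

```python
def split_tos(octet):
    """Split the former type-of-service octet into (DSCP, ECN field).

    Bits 0-5 of Figure 1 (the six most significant bits) hold the DSCP;
    bits 6-7 (the two least significant bits) hold the ECN field.
    """
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    return octet >> 2, octet & 0b11


def join_tos(dscp, ecn):
    """Combine a 6-bit DSCP and a 2-bit ECN field into one octet."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in 6 bits")
    if not 0 <= ecn <= 0b11:
        raise ValueError("ECN field must fit in 2 bits")
    return (dscp << 2) | ecn


dscp, ecn = split_tos(0xB9)  # binary 101110 01
```

Here the example octet 0xB9 splits into a DSCP of 46 (binary 101110) and an ECN field of 01.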
Baseline encoding defines that the DSCP must be set to a PCN-compatible DSCP n and the ECN-field [RFC3168] indicates the specific PCN-mark. Baseline encoding offers four possible encoding states within a single DSCP with the following restrictions.
[RFC6040] defines rules for the encapsulation and decapsulation of ECN markings within IP-in-IP tunnels. It removes some of the constraints that existed when [RFC5696] was written. Happily, the rules for use of the EXP codepoint are fully compatible with [RFC6040]. In particular, the relative severity of each marking is the same: CE (PM) is more severe than ECT(1) (EXP), which in turn is more severe than ECT(0) (NM). This is discussed in more detail in both the baseline encoding document [RFC5696] and in [I-D.ietf-pcn-encoding-comparison].
The 3-in-1 encoding is applicable in situations where two marking schemes are being used in the PCN-domain. In some circumstances it can also be used in PCN-domains with only a single marking scheme in use. Further guidance on choosing an encoding scheme can be found in Section 6.2. All nodes within the PCN-domain MUST be fully compliant with the ECN encapsulation rules set out in [RFC6040]. As such the encoding is not applicable in situations where legacy tunnels might exist.
The 3-in-1 PCN encoding scheme is an extension of the baseline encoding scheme defined in [RFC5696]. The PCN requirements and the extension rules for baseline encoding presented in the previous section determine how PCN encoding states are carried in the IP headers. This is shown in Figure 2.
+--------+----------------------------------------------------+
|        |        Codepoint in ECN field of IP header         |
|  DSCP  |              <RFC3168 codepoint name>              |
|        +--------------+-------------+-------------+---------+
|        | 00 <Not-ECT> | 10 <ECT(0)> | 01 <ECT(1)> | 11 <CE> |
+--------+--------------+-------------+-------------+---------+
| DSCP n |   Not-PCN    |     NM      |     ThM     |   ETM   |
+--------+--------------+-------------+-------------+---------+

Figure 2: 3-in-1 PCN encoding
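The mapping in Figure 2 can be expressed as a small lookup, sketched below in Python. The names PcnState, decode_3in1 and encode_3in1 are hypothetical, and the DSCP value used in the example is only a placeholder for whichever PCN-compatible DSCP n a domain configures. Returning None for packets that carry a different DSCP is an assumption of this sketch, since Figure 2 only covers DSCP n.

```python
from enum import Enum


class PcnState(Enum):
    """ECN-field values for packets carrying the PCN-compatible DSCP n."""
    NOT_PCN = 0b00  # Not-ECT
    NM = 0b10       # ECT(0): not-marked
    THM = 0b01      # ECT(1): threshold-marked
    ETM = 0b11      # CE: excess-traffic-marked


def decode_3in1(tos_octet, pcn_dscp):
    """Return the PcnState of a packet, or None if its DSCP is not DSCP n."""
    dscp = tos_octet >> 2
    if dscp != pcn_dscp:
        return None  # outside Figure 2: the packet does not use DSCP n
    return PcnState(tos_octet & 0b11)


def encode_3in1(pcn_dscp, state):
    """Build the octet for DSCP n with the given PCN state."""
    return (pcn_dscp << 2) | state.value


EXAMPLE_DSCP = 46  # placeholder value, chosen only for this example
octet = encode_3in1(EXAMPLE_DSCP, PcnState.THM)
state = decode_3in1(octet, EXAMPLE_DSCP)
```

A PCN-boundary-node would typically apply such a lookup to every packet it receives from the PCN-domain before counting marks.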
Like baseline encoding, 3-in-1 PCN encoding also uses a PCN compatible DSCP n and the ECN field for the encoding of PCN-marks. The PCN-marks have the following meaning.
To be compliant with the 3-in-1 PCN Encoding, a PCN interior node behaves as follows:
In other words, a PCN interior node MUST NOT mark PCN-packets into non-PCN packets and vice versa, and it may increase the severity of the PCN-mark of a PCN-packet, but it MUST NOT decrease it.
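The constraint on interior nodes summarized above can be sketched as a re-marking function in Python. The function name and the string labels are hypothetical. Silently ignoring a request that would violate the rules is an assumption of this sketch; a real node would simply never attempt such a change.

```python
# Severity order of the three PCN marking states: NM < ThM < ETM.
SEVERITY = {"NM": 0, "ThM": 1, "ETM": 2}


def remark(current, proposed):
    """Return the mark a PCN interior node may leave on a packet.

    current  -- mark carried by the arriving packet
    proposed -- mark the local meter would like to apply
    """
    if current == "Not-PCN":
        # A non-PCN packet must never be turned into a PCN-packet.
        return "Not-PCN"
    if proposed == "Not-PCN":
        # A PCN-packet must never be turned into a non-PCN packet.
        return current
    # The severity may increase but must not decrease.
    return max(current, proposed, key=SEVERITY.__getitem__)


result = remark("ThM", "ETM")
```

With this rule, a packet that has already been excess-traffic-marked keeps that mark even if a later node would only threshold-mark it.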
Discussion of backward compatibility between PCN encoding schemes and previous uses of the ECN field is given in Section 6 of [RFC5696].
This encoding complies with the rules for extending the baseline PCN encoding schemes in Section 5 of [RFC5696].
The term "compatibility" is meant in the following sense. It is possible to operate nodes with baseline encoding [RFC5696] and 3-in-1 encoding in the same PCN domain. The nodes with baseline encoding MUST perform excess-traffic-marking because the 11 codepoint of 3-in-1 encoding also means excess-traffic-marked. PCN-boundary-nodes of such domains are required to interpret the full 3-in-1 encoding and not just baseline encoding, otherwise they cannot interpret the 01 codepoint.
Using nodes that perform only excess-traffic-marking may make sense in networks using the CL edge behavior [I-D.ietf-pcn-cl-edge-behaviour]. Such nodes are able to notify the egress only about severe pre-congestion when traffic needs to be terminated. This seems reasonable for locations that are not expected to see any pre-congestion, but excess-traffic-marking gives them a means to terminate traffic if unexpected overload occurs.
NOTE: This sub-section is informative not normative.
When deciding which PCN encoding is suitable, an operator needs to take account of how many PCN states need to be encoded. The following table gives guidelines on which encoding to use with either threshold-marking, excess-traffic-marking or both.

+------------------------+--------------------------------+
| Marking schemes in use | Recommended encoding scheme    |
+------------------------+--------------------------------+
| Only threshold-marking | Baseline encoding [RFC5696]    |
+------------------------+--------------------------------+
| Only excess-traffic-   | Baseline encoding [RFC5696]    |
| marking                | or 3-in-1 PCN encoding         |
+------------------------+--------------------------------+
| Threshold-marking and  | 3-in-1 PCN encoding            |
| excess-traffic-marking |                                |
+------------------------+--------------------------------+
If both excess-traffic-marking and threshold-marking are enabled in a PCN-domain, 3-in-1 encoding should be used as described in this document.
If only excess-traffic-marking is enabled in a PCN-domain, baseline encoding or 3-in-1 encoding may be used. They lead to the same encoding because PCN-boundary nodes will interpret baseline "PCN-marked (PM)" as "excess-traffic-marked (ETM)".
No scheme is currently proposed that solely uses threshold-marking. If such a scheme is proposed, the choice of encoding scheme will depend on whether nodes are compliant with [RFC6040] or not. Where it is certain that all nodes in the PCN-domain are compliant, either 3-in-1 encoding or baseline encoding is suitable. If legacy tunnel decapsulators exist within the PCN-domain, then baseline encoding SHOULD be used.
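The guidance in this sub-section can be summarized as a small decision function, sketched below in Python. The function name, parameter names and string labels are hypothetical. The sketch encodes only the informative guidance given here; the separate applicability requirement that all nodes comply with [RFC6040] wherever 3-in-1 encoding is deployed still applies and is not checked by this code.

```python
def suitable_encodings(threshold_marking, excess_traffic_marking,
                       all_rfc6040_compliant=True):
    """Return the encoding schemes suggested by the guidance above."""
    if threshold_marking and excess_traffic_marking:
        # Three marking states are needed.
        return ["3-in-1"]
    if excess_traffic_marking:
        # Baseline "PM" is interpreted as "ETM", so both encodings agree.
        return ["baseline", "3-in-1"]
    if threshold_marking:
        # Hypothetical threshold-only scheme: depends on tunnel behaviour.
        if all_rfc6040_compliant:
            return ["baseline", "3-in-1"]
        return ["baseline"]
    # No PCN marking scheme in use: nothing to recommend.
    return []


choice = suitable_encodings(threshold_marking=True,
                            excess_traffic_marking=True)
```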
This memo includes no request to IANA.
Note to RFC Editor: this section may be removed on publication as an RFC.
The security concerns relating to this extended PCN encoding are the same as those in [RFC5696]. In summary, PCN-boundary nodes are responsible for ensuring inappropriate PCN markings do not leak into or out of a PCN domain, and the current phase of the PCN architecture assumes that all the nodes of a PCN-domain are entirely under the control of a single operator, or a set of operators who trust each other.
Given the only difference between the baseline encoding and the present 3-in-1 encoding is the use of the 01 codepoint, no new security issues are raised, as this codepoint was already available for experimental use in the baseline encoding.
The 3-in-1 PCN encoding uses a PCN-compatible DSCP and the ECN field to encode PCN-marks. One codepoint allows non-PCN traffic to be carried with the same PCN-compatible DSCP and three other codepoints support three PCN marking states with different levels of severity. The use of this PCN encoding scheme presupposes that any tunnels in the PCN region have been updated to comply with [RFC6040].
Thanks to Phil Eardley, Teco Boot, Kwok Ho Chan and Georgios Karagiannis for reviewing this document.
To be removed by RFC Editor: Comments and questions are encouraged and very welcome. They can be addressed to the IETF Congestion and Pre-Congestion working group mailing list <pcn@ietf.org>, and/or to the authors.
[I-D.ietf-pcn-cl-edge-behaviour]  Charny, A., Huang, F., Karagiannis, G., Menth, M., and T. Taylor, "PCN Boundary Node Behaviour for the Controlled Load (CL) Mode of Operation", Internet-Draft draft-ietf-pcn-cl-edge-behaviour-10, October 2011.
[I-D.ietf-pcn-sm-edge-behaviour]  Charny, A., Karagiannis, G., Menth, M., and T. Taylor, "PCN Boundary Node Behaviour for the Single Marking (SM) Mode of Operation", Internet-Draft draft-ietf-pcn-sm-edge-behaviour-06, June 2011.
[I-D.ietf-pcn-encoding-comparison]  Karagiannis, G., Chan, K., Moncaster, T., Menth, M., Eardley, P., and B. Briscoe, "Overview of Pre-Congestion Notification Encoding", Internet-Draft draft-ietf-pcn-encoding-comparison-06, June 2011.
The PCN encoding described in this document re-uses the bits of the ECN field in the IP header. Consequently, this disables ECN within the PCN domain. Appendix B of [RFC5696] included advice on handling ECN traffic within a PCN-domain. This appendix clarifies that advice.
For the purposes of this appendix we define two forms of traffic that might arrive at a PCN-ingress node. These are Admission-controlled traffic and Non-admission-controlled traffic.
Admission-controlled traffic will be remarked to the PCN-compatible DSCP by the PCN-ingress node. Two mechanisms can be used to identify such traffic:
All other traffic can be thought of as Non-admission-controlled. However such traffic may still need to share the same DSCP as the Admission-controlled traffic. This may be due to policy (for instance if it is high priority voice traffic), or may be because there is a shortage of local DSCPs.
ECN [RFC3168] is an end-to-end congestion notification mechanism. As such, it is possible that some traffic entering the PCN-domain may also be ECN-capable. The following lists the four cases for how end-to-end ECN traffic might be treated while crossing a PCN-domain:
The first option is recommended unless the operator is short of local DSCPs.
The second option is not recommended unless tunnelling is not possible for some reason.
In the list above, any form of IP-in-IP tunnel can be used unless specified otherwise. Note that we assume a logical separation of tunnelling and PCN actions in both PCN-ingress and PCN-egress nodes. That is, any tunnelling action happens wholly outside the PCN-domain, as illustrated in the following figure:

                  , . . . . . . .  PCN-domain   . . . . . . .
                  .   ,---------.              ,--------.   .
                  .  _|  PCN-   |______________|  PCN-  |_  .
                  . / | ingress |              | egress | \ .
                  .|  '---------'              '--------'  |.
                   | . . . . . . . . . . . . . . . . . . . |
          ,---------.                                     ,--------.
     _____| Tunnel  |                                     | Tunnel |_____
          | Ingress | - - ECN preserved inside tunnel - - | Egress |
          '---------'                                     '--------'