Internet-Draft | Performance Measurement for SR-MPLS | October 2024 |
Gandhi, et al. | Expires 9 April 2025 | [Page] |
Segment Routing (SR) leverages the source routing paradigm. SR applies to the Multiprotocol Label Switching data plane (SR-MPLS) as specified in RFC 8402. RFC 6374 and RFC 7876 specify protocol mechanisms to enable efficient and accurate measurement of packet loss, one-way and two-way delay, as well as related metrics such as delay variation in MPLS networks. RFC 9341 defines the Alternate-Marking Method using Block Number as a data correlation mechanism for packet loss measurement. This document utilizes mechanisms from RFC 6374, RFC 7876, and RFC 9341 for performance delay and loss measurements in SR-MPLS networks, covering both links and end-to-end SR-MPLS paths, including SR Policies.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 9 April 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Segment Routing (SR) leverages the source routing paradigm. SR applies to both Multiprotocol Label Switching (SR-MPLS) and IPv6 (SRv6) data planes as specified in [RFC8402]. SR takes advantage of the Equal-Cost Multipaths (ECMPs) between source and transit nodes, between transit nodes and between transit and destination nodes. SR Policies as defined in [RFC9256] are used to steer traffic through specific, user-defined paths using a list of Segments. A comprehensive SR Performance Measurement toolset is one of the essential requirements for measuring network performance to provide Service Level Agreements (SLAs).¶
[RFC6374] specifies protocol mechanisms to enable efficient and accurate measurement of packet loss, one-way and two-way delay, as well as related metrics such as delay variation in MPLS networks.¶
[RFC7876] specifies mechanisms for sending and processing out-of-band responses over a UDP return path when receiving query messages defined in [RFC6374]. These mechanisms are also well-suited to SR-MPLS networks.¶
[RFC9341] defines the Alternate-Marking Method using Block Number as a data correlation mechanism for packet loss measurement.¶
This document utilizes the mechanisms from [RFC6374], [RFC7876], and [RFC9341] for performance delay and loss measurements in SR-MPLS networks, covering both links and end-to-end SR-MPLS paths, including SR Policies.¶
This document defines Return Path and Block Number TLV extensions for [RFC6374] for performance delay and loss measurement in SR-MPLS networks. These TLV extensions also apply to the MPLS Label Switched Paths (LSPs) [RFC3031]. However, the procedure for performance delay and loss measurement of MPLS LSPs is outside the scope of this document.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
ACH: Associated Channel Header.¶
DM: Delay Measurement.¶
ECMP: Equal Cost Multi-Path.¶
G-ACh: Generic Associated Channel.¶
GAL: Generic Associated Channel (G-ACh) Label.¶
LM: Loss Measurement.¶
LSE: Label Stack Entry.¶
MPLS: Multiprotocol Label Switching.¶
PSID: Path Segment Identifier.¶
SID: Segment Identifier.¶
SL: Segment List.¶
SR: Segment Routing.¶
SR-MPLS: Segment Routing with MPLS data plane.¶
TC: Traffic Class.¶
TE: Traffic Engineering.¶
TTL: Time-To-Live.¶
URO: UDP Return Object.¶
In the Reference Topology shown in Figure 1, the querier node Q1 initiates a query message, and the responder node R1 transmits a response message for the query message received. The response message may be sent back to the querier node Q1 on the same path (same set of links and nodes) or a different path in the reverse direction from the path taken towards the responder R1.¶
T1 is a transmit timestamp and T4 is a receive timestamp, both added by node Q1. T2 is a receive timestamp and T3 is a transmit timestamp, both added by node R1.¶
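As an illustration only, the four timestamps can be combined into delay values using the usual arithmetic for the measurement modes of Section 2.4 of [RFC6374]. The sketch below assumes timestamps expressed as integer nanoseconds; the class and function names are hypothetical and not part of any specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timestamps:
    """Timestamps in nanoseconds; field names follow T1..T4 in the text."""
    t1: int  # query transmit, added by Q1
    t2: int  # query receive, added by R1
    t3: int  # response transmit, added by R1
    t4: int  # response receive, added by Q1


def one_way_delay(ts: Timestamps) -> int:
    # Forward delay; only meaningful when the Q1 and R1 clocks are synchronized.
    return ts.t2 - ts.t1


def two_way_delay(ts: Timestamps) -> int:
    # Forward plus reverse delay; the responder dwell time (T3 - T2) is removed.
    return (ts.t4 - ts.t1) - (ts.t3 - ts.t2)


def round_trip_delay(ts: Timestamps) -> int:
    # Uses only Q1's clock, so no clock synchronization is needed.
    return ts.t4 - ts.t1


sample = Timestamps(t1=1_000, t2=1_400, t3=1_450, t4=1_900)
```

With the sample values, the responder holds the message for 50 ns, which the two-way computation subtracts from the 900 ns measured at Q1.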
SR is enabled with the MPLS data plane on nodes Q1 and R1. The nodes Q1 and R1 may be directly connected via a link enabled with MPLS (Section 2.9.1 of [RFC6374]) or a Point-to-Point (P2P) SR-MPLS path [RFC8402]. The link may be a physical interface, a virtual link, a Link Aggregation Group (LAG) [IEEE802.1AX], or a LAG member link. The SR-MPLS path may be an SR-MPLS Policy [RFC9256] on node Q1 (called head-end) with destination to node R1 (called tail-end).¶
For delay and loss measurement in SR-MPLS networks, the procedures defined in [RFC6374], [RFC7876], and [RFC9341] are used in this document. Note that the one-way, two-way, and round-trip delay measurements are defined in Section 2.4 of [RFC6374] and are further described in this document for SR-MPLS networks. Similarly, the packet loss measurement is defined in Section 2.2 of [RFC6374] and is further described in this document for SR-MPLS networks.¶
The packet loss measurement using Alternate-Marking Method defined in [RFC9341] may use Block Number for data correlation. This is achieved by using the Block Number TLV extension defined in this document.¶
In SR-MPLS networks, the query and response messages defined in [RFC6374] are sent as follows:¶
If it is desired in SR-MPLS networks that the same path (same set of links and nodes) between the querier and responder be used in both directions of the measurement, it is achieved by using the Return Path TLV extension defined in this document.¶
The performance measurement procedure for links can be used to compute extended Traffic Engineering (TE) metrics for delay and loss as described in this document. The metrics are advertised in the network using the routing protocol extensions defined in [RFC7471], [RFC8570], and [RFC8571].¶
The query message as defined in [RFC6374] is sent over the links for both delay and loss measurement. In each Label Stack Entry (LSE) [RFC3032] in the MPLS label stack, the TTL value MUST be set to 255.¶
An SR-MPLS Policy Candidate-Path may contain a number of Segment Lists (SLs) (i.e., stack of MPLS labels) [RFC9256]. For delay and/or loss measurement for an end-to-end SR-MPLS Policy, the query messages MUST be transmitted for every SL of the SR-MPLS Policy Candidate-Path. Each query message contains an SR-MPLS label stack of the Candidate-Path, with the G-ACh Label (GAL) at the bottom of the stack (with S=1) as shown in Figure 2. In each LSE in the MPLS label stack, the TTL value MUST be set to 255.¶
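A minimal sketch of how such a label stack could be assembled is given below. It assumes the LSE bit layout of [RFC3032] (20-bit label, 3-bit TC, 1-bit S, 8-bit TTL) and the GAL value 13 from [RFC5586]; the function names and the SID label values 16001 and 16002 are hypothetical.

```python
import struct

GAL_LABEL = 13  # G-ACh Label, a reserved label value defined in RFC 5586


def encode_lse(label: int, tc: int = 0, s: int = 0, ttl: int = 255) -> bytes:
    """Pack one 32-bit LSE as label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or s not in (0, 1) or not 0 <= ttl <= 255:
        raise ValueError("TC, S, or TTL out of range")
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)


def query_label_stack(segment_list: list[int]) -> bytes:
    """Segment List labels (S=0) followed by the GAL (S=1), each with TTL 255."""
    stack = b"".join(encode_lse(label) for label in segment_list)
    return stack + encode_lse(GAL_LABEL, s=1)


# Hypothetical two-segment Segment List.
stack = query_label_stack([16001, 16002])
```

Passing an empty Segment List yields a stack that contains only the GAL, which corresponds to the one-hop case with an Implicit NULL label.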
The fields "0001", Version, Reserved, and Channel Type shown in Figure 2 are specified in [RFC5586].¶
The SR-MPLS label stack can be empty in the case of one hop SR-MPLS Policy with an Implicit NULL label.¶
For an SR-MPLS Policy, to ensure that the query message is processed by the intended responder, Destination Address TLV (Type 129) [RFC6374] containing the address of the responder can be sent in the query messages. The responder that supports this TLV MUST return Success in "Control Code" [RFC6374] if it is the intended destination for the query. Otherwise, it MUST return 0x15: Error - Invalid Destination Node Identifier [RFC6374].¶
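The responder-side check described above can be sketched as follows. This is an illustration under stated assumptions: the responder is assumed to know its own addresses as strings, the Success value 0x1 is taken from [RFC6374], and the function and constant names are hypothetical.

```python
import ipaddress

CONTROL_CODE_SUCCESS = 0x01              # "Success" in RFC 6374
CONTROL_CODE_INVALID_DESTINATION = 0x15  # "Error - Invalid Destination Node Identifier"


def destination_control_code(local_addresses: list[str], requested: str) -> int:
    """Return the Control Code for a query carrying a Destination Address TLV."""
    local = {ipaddress.ip_address(addr) for addr in local_addresses}
    if ipaddress.ip_address(requested) in local:
        return CONTROL_CODE_SUCCESS
    return CONTROL_CODE_INVALID_DESTINATION


# Documentation-range addresses used purely as placeholders.
responder_addresses = ["192.0.2.1", "2001:db8::1"]
```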
In one-way measurement mode defined in Section 2.4 of [RFC6374], the querier can receive "out-of-band" response messages with IP/UDP header by properly setting the UDP Return Object (URO) TLV in the query message. The URO TLV (Type=131) is defined in [RFC7876] and includes the UDP-Destination-Port and IP Address. When the querier sets an IP address and a UDP port in the URO TLV, the response message MUST be sent to that IP address as the destination address and UDP port as the destination port. In addition, the "Control Code" in the query message MUST be set to "out-of-band response requested" [RFC6374].¶
In two-way measurement mode defined in Section 2.4 of [RFC6374], the response messages SHOULD be sent back in-band on the same link or the same end-to-end SR-MPLS path (same set of links and nodes) in the reverse direction to the querier.¶
For links, the response message as defined in [RFC6374] is sent back on the same incoming link where the query message is received. In this case, the "Control Code" in the query message MUST be set to "in-band response requested" [RFC6374].¶
For end-to-end SR-MPLS paths, the responder transmits the response message (example as shown in Figure 2) on a specific return SR-MPLS path. The querier can request in the query message to the responder to send the response message back on a given return path using the MPLS Label Stack sub-TLV in the Return Path TLV defined in this document.¶
The loopback measurement mode defined in Section 2.8 of [RFC6374] is used to measure round-trip delay for a bidirectional circular SR-MPLS path. In this mode for SR-MPLS, the received query messages are not punted out of the fast path in forwarding (i.e., to the slow path or control-plane) at the responder. In other words, the responder neither processes the payload nor generates response messages. The loopback function simply returns the received query message to the querier without responder modifications [RFC6374].¶
The loopback mode is achieved by generating "queries" with the Response flag set to 1 and adding the Loopback Request object (Type 3) [RFC6374]. In this case, the label stack in the query messages carries both the forward and reverse paths in the MPLS header. The GAL is still carried at the bottom of the label stack (with S=1), as in the example shown in Figure 2.¶
As defined in [RFC6374], MPLS Delay Measurement (DM) query and response messages use an Associated Channel Header (ACH) with Channel Type value 0x000C for delay measurement, which identifies the message type. The message payload defined in Section 3.2 of [RFC6374] follows the ACH. For delay measurement, the same ACH value is used for both links and end-to-end SR-MPLS Policies.¶
The Loss Measurement (LM) protocol can perform two distinct kinds of loss measurement as described in Section 2.9.8 of [RFC6374].¶
As defined in [RFC6374], MPLS LM query and response messages use an Associated Channel Header (ACH) with Channel Type value 0x000A for direct loss measurement or 0x000B for inferred loss measurement, which identifies the message type. The message payload defined in Section 3.1 of [RFC6374] follows the ACH. For loss measurement, the same ACH value is used for both links and end-to-end SR-MPLS Policies.¶
As defined in [RFC6374], Combined DM+LM query and response messages use an Associated Channel Header (ACH) with Channel Type value 0x000D for direct loss and delay measurement or 0x000E for inferred loss and delay measurement, which identifies the message type. The message payload defined in Section 3.3 of [RFC6374] follows the ACH. For combined loss and delay measurement, the same ACH value is used for both links and end-to-end SR-MPLS Policies.¶
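The Channel Type values listed for the DM, LM, and combined messages can be collected in a small lookup, sketched below. The ACH layout (a "0001" nibble, a 4-bit Version, 8 Reserved bits, and a 16-bit Channel Type) follows [RFC5586] with Version set to 0; the enum and function names are hypothetical.

```python
import struct
from enum import Enum


class Measurement(Enum):
    DIRECT_LM = "direct loss"
    INFERRED_LM = "inferred loss"
    DM = "delay"
    DIRECT_LM_DM = "direct loss and delay"
    INFERRED_LM_DM = "inferred loss and delay"


# Channel Type values as listed in the text for RFC 6374 messages.
ACH_CHANNEL_TYPE = {
    Measurement.DIRECT_LM: 0x000A,
    Measurement.INFERRED_LM: 0x000B,
    Measurement.DM: 0x000C,
    Measurement.DIRECT_LM_DM: 0x000D,
    Measurement.INFERRED_LM_DM: 0x000E,
}


def encode_ach(kind: Measurement) -> bytes:
    """Build the 4-byte ACH: 0001 | Version=0 | Reserved=0 | Channel Type."""
    return struct.pack("!BBH", 0x10, 0x00, ACH_CHANNEL_TYPE[kind])
```

The same header is produced whether the session runs over a link or over an end-to-end SR-MPLS Policy, since the Channel Type does not depend on the path type.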
The Path Segment Identifier (PSID) [RFC9545] MUST be carried in the received data packet for the traffic flow under measurement for accounting received traffic on the egress node of the SR-MPLS Policy. In direct mode, the PSID in the received query message as shown in Figure 3 can be used to associate the receive traffic counter on the responder to detect the transmit packet loss for the end-to-end SR-MPLS Policy.¶
In inferred mode, the PSID in the received query messages, as shown in Figure 3, can be used to count the received query messages on the responder to detect the transmit packet loss for an end-to-end SR-MPLS Policy.¶
The fields "0001", Version, Reserved, and Channel Type shown in Figure 3 are specified in [RFC5586].¶
Different values of PSID can be used per Candidate-Path for accounting received traffic to measure packet loss at Candidate-Path level. Similarly, different values of PSID can be used per Segment List of the Candidate-Path for accounting received traffic to measure packet loss at Segment List level. The same value of PSID can be used for all Segment Lists of the SR-MPLS Policy to measure packet loss at SR-MPLS Policy level.¶
The packet loss measurement using the Alternate-Marking Method defined in [RFC9341] may use Block Number for data correlation for the traffic flow under measurement. As defined in Section 3.1 of [RFC9341], the block number is used to divide the traffic flow into consecutive blocks and to count the number of packets transmitted and received in each block for loss measurement.¶
As described in Section 4.3 of [RFC9341], a protocol-based distributed solution can be used to exchange the values of counters on the nodes for loss measurement. That solution is further described in this document using the LM messages defined in [RFC6374].¶
The querier node assigns a block number to the block of data packets of the traffic flow under measurement. The querier counts the number of packets transmitted in each block. The mechanism for assignment of block number is a local decision on the querier and is outside the scope of this document.¶
As an example, the querier can use the procedure defined in [I-D.ietf-mpls-inband-pm-encapsulation] for alternate marking the data packets of the traffic flow under measurement. The responder counts the number of received packets in each block based on the marking in the received data packets. The querier and responder maintain separate sets of transmit and receive counters for each marking. The marking can be used as a block number or a separate block number can be incremented when the marking changes. Other methods can be defined for alternate marking the data packets of the traffic flow under measurement to assign block number for the counters.¶
The LM query and response messages defined in [RFC6374] are used to measure packet loss for the block of data packets transmitted with the previous marking while data packets carry alternate marking. Specifically, LM query and response messages carry the transmit and receive counters (which are currently not incrementing) along with their block number to correlate for loss measurement.¶
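The correlation step can be illustrated with a small model of per-block counters. This is a sketch, not the procedure of any implementation: the `BlockCounters` class, the packet counts, and the block numbers 4 and 5 are all hypothetical.

```python
from collections import defaultdict


class BlockCounters:
    """Per-block packet counters keyed by block number."""

    def __init__(self) -> None:
        self._packets: dict[int, int] = defaultdict(int)

    def count(self, block_number: int, packets: int = 1) -> None:
        self._packets[block_number] += packets

    def read(self, block_number: int) -> int:
        return self._packets[block_number]


def loss_for_block(tx: BlockCounters, rx: BlockCounters, block_number: int) -> int:
    """Loss for a closed block: packets transmitted minus packets received."""
    return tx.read(block_number) - rx.read(block_number)


querier_tx = BlockCounters()
responder_rx = BlockCounters()

# Block 4 has ended: its counters are no longer incrementing.
querier_tx.count(4, packets=100)
responder_rx.count(4, packets=97)

# Block 5 carries the alternate marking and is still in progress.
querier_tx.count(5, packets=10)
responder_rx.count(5, packets=10)
```

Reading block 4 while traffic is marked for block 5 compares two stable counters, which is why the block number lets the two ends correlate their values.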
Section 4.3 of [RFC9341] states that "The assumption of the block number mechanism is that the measurement nodes are time synchronized". That assumption is not necessary here, because the block number on the responder can be synchronized based on the received LM query messages.¶
In two-way measurement mode, the responder may transmit the response message on a specific return path, for example, in an ECMP environment. The querier can request in the query message to the responder to send a response message back on a given return path (e.g., co-routed bidirectional path). This allows the responder to avoid creating and maintaining additional states (containing return paths) for the sessions.¶
The querier may not be directly reachable from the responder in a network. The querier in this case MUST send its reachability path information to the responder using the Return Path TLV.¶
[RFC6374] defines query and response messages that can include one or more optional TLVs. A new TLV Type (TBA1) is defined in this document for the Return Path TLV to carry return path information in query messages. The format of the Return Path TLV is shown in Figure 4:¶
The Length is a one-byte field and is equal to the length of the Return Path Sub-TLV and the Reserved field in bytes. Length MUST NOT be 0.¶
The Return Path TLV is defined in the Mandatory TLV Type registry space [RFC6374]. The querier MUST insert only one Return Path TLV in the query message. A responder that supports this TLV MUST process only the first Return Path TLV and ignore any other Return Path TLVs present. A responder that supports this TLV MUST also send the response message back on the return path specified in the Return Path TLV. The responder MUST NOT add a Return Path TLV to the response message. The Reserved field MUST be set to 0 and MUST be ignored on the receive side.¶
The Return Path TLV contains a Sub-TLV to carry the return path. The format of the MPLS Label Stack Sub-TLV is shown in Figure 5. The Label entries in the Sub-TLV MUST be in network order. The MPLS Label Stack Sub-TLV in the Return Path TLV is of the following Type:¶
The MPLS Label Stack contains a list of 32-bit LSEs, each of which includes a 20-bit label value, an 8-bit TTL value, a 3-bit TC value, and a 1-bit EOS (S) field. An MPLS Label Stack Sub-TLV may carry a stack of labels or a Binding SID label [RFC8402] of the Return SR-MPLS Policy.¶
The Length is a one-byte field and is equal to the length of the label stack field and the Reserved field in bytes. Length MUST NOT be 0.¶
The Return Path TLV MUST carry only one Return Path Sub-TLV. The MPLS Label Stack in the Return Path Sub-TLV MUST contain at least one MPLS Label. A responder that supports this Sub-TLV MUST process only the first Return Path Sub-TLV and ignore any other Return Path Sub-TLVs present. A responder that supports this Sub-TLV MUST send the response message back on the return path specified in the Return Path Sub-TLV. The Reserved field MUST be set to 0 and MUST be ignored on the receive side.¶
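The Length rules for the TLV and its Sub-TLV can be sketched as an encoder. Figures 4 and 5 are not reproduced in this text, so the layout used here is an assumption: a one-byte Type, a one-byte Length, and a two-byte Reserved field for both the TLV and the Sub-TLV. The TLV Type is passed in as a parameter because TBA1 is not yet assigned; the value 0x7F and the label 24001 in the usage lines are placeholders only, as are the S and TTL values in the encoded label entries.

```python
import struct

SUB_TLV_MPLS_LABEL_STACK = 1  # Sub-TLV Type 1: MPLS Label Stack of the Return Path
RESERVED_LEN = 2              # ASSUMPTION: Reserved field is two bytes wide


def encode_lse(label: int, tc: int = 0, s: int = 0, ttl: int = 255) -> bytes:
    """Pack one 32-bit LSE as label(20) | TC(3) | S(1) | TTL(8)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)


def encode_label_stack_sub_tlv(labels: list[int]) -> bytes:
    if not labels:
        raise ValueError("the label stack must contain at least one MPLS label")
    stack = b"".join(encode_lse(label) for label in labels)
    length = RESERVED_LEN + len(stack)  # Reserved field plus label stack
    if length > 255:
        raise ValueError("label stack does not fit in a one-byte Length")
    return struct.pack("!BBH", SUB_TLV_MPLS_LABEL_STACK, length, 0) + stack


def encode_return_path_tlv(tlv_type: int, labels: list[int]) -> bytes:
    sub_tlv = encode_label_stack_sub_tlv(labels)
    length = RESERVED_LEN + len(sub_tlv)  # Reserved field plus the whole Sub-TLV
    if length > 255:
        raise ValueError("Sub-TLV does not fit in a one-byte Length")
    return struct.pack("!BBH", tlv_type, length, 0) + sub_tlv


# Placeholder TLV Type and a hypothetical Binding SID label.
tlv = encode_return_path_tlv(0x7F, [24001])
```

Carrying a single Binding SID keeps the TLV at 12 bytes under this layout, regardless of how many segments the return SR-MPLS Policy expands to.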
[RFC6374] defines query and response messages that can include one or more optional TLVs. A new TLV Type (value TBA2) is defined in this document to carry the Block Number (8-bit) of the traffic counters in the LM query and response messages. The format of the Block Number TLV is shown in Figure 6:¶
The Length is a one-byte field, and its value is 2 (bytes).¶
The Block Number TLV is defined in the Mandatory TLV Type registry space [RFC6374]. The querier MUST insert only one Block Number TLV in the query message to identify the Block Number for the traffic counters in the forward direction. A responder that supports this TLV MUST insert only one Block Number TLV in the response message to identify the Block Number for the traffic counters in the reverse direction. The responder MUST also return the first Block Number TLV from the query message and ignore any other Block Number TLVs present. The R flag indicates the query or response message direction associated with the Block Number. The R flag MUST be clear in the query message for the Block Number associated with Counter 1 and Counter 2, and MUST be set in the response message for the Block Number associated with Counter 3 and Counter 4. The Reserved field MUST be set to 0 and MUST be ignored on the receive side.¶
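A possible encoding of the Block Number TLV is sketched below. Figure 6 is not reproduced in this text, so the two-byte value layout is an assumption: seven Reserved bits followed by the R flag in one byte, then the 8-bit Block Number. The TLV Type is a parameter because TBA2 is not yet assigned, and 0x7E in the usage lines is a placeholder.

```python
import struct

R_FLAG = 0x01  # ASSUMPTION: R flag is the least significant bit of the flags byte


def encode_block_number_tlv(tlv_type: int, block_number: int, is_response: bool) -> bytes:
    """Type(1) | Length(1)=2 | Reserved+R(1) | Block Number(1)."""
    if not 0 <= block_number <= 255:
        raise ValueError("Block Number is an 8-bit value")
    flags = R_FLAG if is_response else 0  # R clear in queries, set in responses
    return struct.pack("!BBBB", tlv_type, 2, flags, block_number)


def decode_block_number_tlv(data: bytes) -> tuple[int, bool]:
    """Return (block_number, r_flag); Reserved bits are ignored on receive."""
    _tlv_type, length, flags, block_number = struct.unpack("!BBBB", data[:4])
    if length != 2:
        raise ValueError("Block Number TLV Length must be 2")
    return block_number, bool(flags & R_FLAG)


query_tlv = encode_block_number_tlv(0x7E, block_number=4, is_response=False)
response_tlv = encode_block_number_tlv(0x7E, block_number=9, is_response=True)
```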
An SR-MPLS Policy can have ECMPs between the source and transit nodes, between transit nodes, and between transit and destination nodes. Usage of a node SID [RFC8402] by an SR-MPLS Policy can result in ECMP paths. In addition, usage of an Anycast SID [RFC8402] by an SR-MPLS Policy can result in ECMP paths via the transit nodes that are part of that Anycast group. The query and response messages SHOULD be sent to traverse different ECMP paths to measure the delay of each ECMP path of a Segment List of an SR-MPLS Policy Candidate-Path.¶
The forwarding plane has various hashing functions available to forward packets on specific ECMP paths. For end-to-end SR-MPLS Policy delay measurement, different entropy label [RFC6790] values can be used in query and response messages to take advantage of the hashing function in forwarding plane to influence the ECMP path taken by them.¶
The considerations for loss measurement for different ECMP paths of an SR-MPLS Policy are outside the scope of this document.¶
The extended TE metrics for link delay and loss can be computed using the performance measurement procedures described in this document and advertised in the routing domain as follows:¶
The procedures defined in this document are backwards compatible with the procedures defined in [RFC6374] at both querier and responder. If the responder does not support the new Mandatory TLV Types defined in this document, it MUST return Error 0x17: Unsupported Mandatory TLV Object as per [RFC6374].¶
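The behavior of a responder that predates these extensions can be sketched as a check over the received TLV Types. The split of the Type space into Mandatory (0-127) and Optional (128-255) ranges is taken from [RFC6374]; the function names and the Type value 0x7F standing in for a not-yet-assigned TLV are hypothetical.

```python
from typing import Iterable, Optional

ERROR_UNSUPPORTED_MANDATORY_TLV = 0x17  # "Unsupported Mandatory TLV Object"


def is_mandatory(tlv_type: int) -> bool:
    # RFC 6374 TLV Type space: 0-127 Mandatory, 128-255 Optional.
    return 0 <= tlv_type <= 127


def check_tlvs(received: Iterable[int], supported: set[int]) -> Optional[int]:
    """Return the error Control Code for an unsupported Mandatory TLV, else None."""
    for tlv_type in received:
        if is_mandatory(tlv_type) and tlv_type not in supported:
            return ERROR_UNSUPPORTED_MANDATORY_TLV
    return None


# A responder that supports only the Loopback Request object (Type 3).
legacy_supported = {3}
```

Unsupported Optional TLVs, such as the Destination Address TLV (Type 129), do not trigger the error in this sketch, which matches the reason the new TLVs are placed in the Mandatory range.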
The manageability considerations described in Section 7 of [RFC6374] and Section 6 of [RFC7876] are applicable to this specification.¶
The security considerations specified in [RFC6374], [RFC7471], [RFC8570], [RFC8571], [RFC7876], and [RFC9341] also apply to the procedures described in this document.¶
The procedure defined in this document is intended for deployment in a single operator administrative domain. As such, the querier node, the responder node, and the forward and return paths are provisioned by the operator for the probe session. It is assumed that the operator has verified the integrity of the forward and return paths of the probe packets.¶
The "Return Path" TLV extensions defined in this document may be used for potential address spoofing. For example, a query message may carry a return path that has a destination that is not local at the querier. To prevent such possible attacks, the responder MAY drop the query messages when it cannot determine whether the return path has a destination local at the querier. The querier may send a proper source address in the "Source Address" TLV that the responder can use to make that determination, for example, by checking the access control list provisioned by the operator.¶
IANA is requested to allocate values for the following Mandatory TLV Types for [RFC6374] from the "MPLS Loss/Delay Measurement TLV Object" registry contained within the "Generic Associated Channel (G-ACh) Parameters" registry set:¶
Value | Description | Reference |
---|---|---|
TBA1 | Return Path TLV | This document |
TBA2 | Block Number TLV | This document |
The Block Number TLV is carried in the query and response messages and Return Path TLV is carried in the query messages.¶
IANA is requested to create a registry for "Return Path Sub-TLV Type". All code points in the range 0 through 175 in this registry shall be allocated according to the "IETF Review" procedure as specified in [RFC8126]. Code points in the range 176 through 239 in this registry shall be allocated according to the "First Come, First Served" procedure as specified in [RFC8126]. Remaining code points are allocated according to Table 2:¶
Value | Description | Reference |
---|---|---|
0 - 175 | IETF Review | This document |
176 - 239 | First Come First Served | This document |
240 - 251 | Experimental Use | This document |
252 - 255 | Private Use | This document |
This document defines the following values in the Return Path Sub-TLV Type registry:¶
Value | Description | Reference |
---|---|---|
0 | Reserved | This document |
1 | MPLS Label Stack of the Return Path | This document |
255 | Reserved | This document |
The authors would like to thank Thierry Couture and Ianik Semco for the discussions on the use cases for performance measurement in segment routing networks. The authors would like to thank Patrick Khordoc, Ruby Lin, and Haowei Shi for implementing the mechanisms defined in this document. The authors would like to thank Greg Mirsky and Xiao Min for providing many useful comments and suggestions. The authors would also like to thank Stewart Bryant, Sam Aldrin, Tarek Saad, and Rajiv Asati for their review comments. Thanks to Huaimo Chen, Yimin Shen, and Xufeng Liu for MPLS-RT expert review, Zhaohui Zhang for RTGDIR early review, Ned Smith for SECDIR review, Roni Even for Gen-ART review, Marcus Ihlar for TSV-ART review, and Dhruv Dhody for OPSDIR review.¶
Sagar Soni, Cisco Systems, Inc. Email: sagsoni@cisco.com
Zafar Ali, Cisco Systems, Inc. Email: zali@cisco.com
Pier Luigi Ventre, CNIT, Italy. Email: pierluigi.ventre@cnit.it¶