Internet-Draft | cs-srte | March 2022 |
Schmutzer, et al. | Expires 8 September 2022 | [Page] |
This document describes how Segment Routing (SR) policies can be used to satisfy the requirements for strict bandwidth guarantees, end-to-end recovery and persistent paths within a segment routing network. SR policies satisfying these requirements are called "circuit-style" SR policies (CS-SR policies).¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 8 September 2022.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Segment routing allows a single network to carry both typical IP (connection-less) services and connection-oriented transport services. IP services require ECMP and TI-LFA, while transport services that are normally delivered via dedicated circuit-switched SONET/SDH or OTN networks require:¶
Such a "transport centric" behaviour is referred to as "circuit-style" in this document.¶
This document describes how SR policies [I-D.ietf-spring-segment-routing-policy] and adjacency-SIDs defined in the SR architecture [RFC8402] together with a stateful Path Computation Element (PCE) [RFC8231] can be used to satisfy those requirements. It includes how end-to-end recovery and path integrity monitoring can be implemented.¶
SR policies that satisfy those requirements are called "circuit-style" SR policies (CS-SR policies).¶
The reference model for CS-SR policies follows the segment routing architecture [RFC8402] and the SR policy architecture [I-D.ietf-spring-segment-routing-policy] and is depicted in Figure 1.¶
By nature of CS-SR policies, paths are computed and maintained by a stateful PCE as defined in [RFC8231]. When using an MPLS data plane [RFC8660], the PCEP extensions defined in [RFC8664] are used. When using an SRv6 data plane [RFC8754], the PCEP extensions defined in [I-D.ietf-pce-segment-routing-ipv6] are used.¶
In order to satisfy the requirements of CS-SR policies, each link in the topology MUST have:¶
When using an MPLS data plane [RFC8660], the existing IGP extensions defined in [RFC8667] and [RFC8665] and BGP-LS defined in [RFC9085] can be used to distribute the topology information, including those persistent and unprotected Adj-SIDs.¶
When using an SRv6 data plane [RFC8754], the IGP extensions defined in [I-D.ietf-lsr-isis-srv6-extensions] and [I-D.ietf-lsr-ospfv3-srv6-extensions] and the BGP-LS extensions in [I-D.ietf-idr-bgpls-srv6-ext] apply.¶
A CS-SR policy has the following characteristics:¶
Multiple candidate paths in case of protection/restoration:¶
A CS-SR policy between A and Z is configured both on A (with Z as endpoint) and Z (with A as endpoint) as shown in Figure 1.¶
Both nodes A and Z act as PCC and delegate path computation to the PCE using the extensions defined in [RFC8664]. The PCRpt message sent from the headends to the PCE contains the following parameters:¶
LSPA object (section 7.11 of [RFC5440]): to indicate that there are no local protection requirements¶
If the SR policies are configured with more than one candidate path, a PCEP request is sent per candidate path. Each PCEP request includes the "SR Policy Association" object (type 6) as defined in [I-D.ietf-pce-segment-routing-policy-cp] to make the PCE aware of the candidate paths belonging to the same policy.¶
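The per-candidate-path requests described above can be pictured with a small sketch. It does not model real PCEP encoding: the `CandidatePathReport` class, its fields, the `build_reports` helper and the preference and association ID values are all hypothetical, chosen only to show that every candidate path of one policy carries the same association (type 6).

```python
from dataclasses import dataclass

SR_POLICY_ASSOCIATION_TYPE = 6  # association type named in the text


@dataclass(frozen=True)
class CandidatePathReport:
    """Hypothetical, simplified stand-in for one per-candidate-path request."""
    headend: str
    endpoint: str
    preference: int
    association_type: int
    association_id: int


def build_reports(headend, endpoint, preferences, association_id):
    """Return one report per candidate path, all sharing one association."""
    return [
        CandidatePathReport(headend, endpoint, pref,
                            SR_POLICY_ASSOCIATION_TYPE, association_id)
        for pref in sorted(preferences, reverse=True)
    ]


# Headend A, endpoint Z, two candidate paths (preference values are made up).
reports = build_reports("A", "Z", [100, 200], association_id=1)
for report in reports:
    print(report)
```

Because both reports share the same association type and ID, a receiver can group them as candidate paths of a single policy.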
The signaling extensions described in [I-D.sidor-pce-circuit-style-pcep-extensions] are used to ensure that¶
Bandwidth adjustment can be requested after initial creation by signaling both the requested and the operational bandwidth in the BANDWIDTH object, but the PCE is not allowed to respond with a changed path.¶
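One way to picture the "no changed path" rule is an admission check that only ever looks at the links of the existing path. The following is a minimal sketch under stated assumptions: the `adjust_bandwidth` function, the link names and the per-link `available` bookkeeping are hypothetical and are not taken from any PCE implementation or from the referenced drafts.

```python
def adjust_bandwidth(path_links, available, current_bw, requested_bw):
    """Return the bandwidth in effect after the request; the path never changes.

    path_links: link names along the existing path (hypothetical naming)
    available:  dict of link name -> unreserved bandwidth, same unit as the others
    """
    delta = requested_bw - current_bw
    if delta > 0 and any(available[link] < delta for link in path_links):
        return current_bw            # rejected: keep current reservation and path
    for link in path_links:
        available[link] -= delta     # a decrease (negative delta) frees bandwidth
    return requested_bw


path = ["A-B", "B-Z"]
available = {"A-B": 5, "B-Z": 2}

first = adjust_bandwidth(path, available, current_bw=10, requested_bw=12)
second = adjust_bandwidth(path, available, current_bw=first, requested_bw=15)
print(first, second, available)
```

In this sketch the second request is refused because link B-Z has no headroom left, and the policy stays on the same path with its previous bandwidth rather than being rerouted.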
The proper operation of each segment list is validated by both headends using STAMP in loopback measurement mode as described in section 4.2.3 of [I-D.ietf-spring-stamp-srpm].¶
Because the STAMP test packets include the segment lists of both the forward and the reverse path, standard segment routing data plane operations switch those packets along the forward path to the tailend and along the reverse path back to the headend.¶
The headend forms the bidirectional SR Policy association using the procedure described in [I-D.ietf-pce-sr-bidir-path] and receives the information about the reverse segment list from the PCE as described in section 4.5 of [I-D.ietf-pce-multipath].¶
The same STAMP session used for liveness monitoring can be used to measure delay. Because loopback mode is used, only round-trip delay is measured, and one-way delay has to be derived by dividing the round-trip delay by two.¶
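The loopback behaviour can be illustrated with a toy forwarding walk. This is only a sketch: the SID values, the `ADJ_SIDS` table and the A-B-Z topology are invented for illustration, and each adjacency SID is modelled simply as "owned by one node, forwards to one neighbor".

```python
# Hypothetical adjacency-SID table: SID -> (node owning the SID, neighbor reached)
ADJ_SIDS = {
    24001: ("A", "B"),
    24002: ("B", "Z"),
    24003: ("Z", "B"),
    24004: ("B", "A"),
}


def loopback_segment_list(forward, reverse):
    """Concatenate forward and reverse segment lists for a loopback test packet."""
    return list(forward) + list(reverse)


def walk(start, segment_list):
    """Follow the segment list hop by hop and return the nodes visited."""
    node, visited = start, [start]
    for sid in segment_list:
        owner, neighbor = ADJ_SIDS[sid]
        if owner != node:
            raise ValueError(f"SID {sid} is not local to node {node}")
        node = neighbor
        visited.append(node)
    return visited


forward = [24001, 24002]   # A -> B -> Z
reverse = [24003, 24004]   # Z -> B -> A
visited = walk("A", loopback_segment_list(forward, reverse))
print(visited)  # ['A', 'B', 'Z', 'B', 'A']
```

The packet ends at the node that sent it without the tailend needing any session state, which is the property the liveness check relies on.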
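The halving step is simple arithmetic and can be sketched as follows. The function name and the sample timestamps (integer microseconds) are made up for illustration; the sketch assumes the headend records when each test packet was sent and when it was received.

```python
def one_way_delay_estimate(tx_timestamp_us, rx_timestamp_us):
    """Estimate one-way delay as half of the measured round-trip delay."""
    round_trip = rx_timestamp_us - tx_timestamp_us
    if round_trip < 0:
        raise ValueError("receive timestamp precedes transmit timestamp")
    return round_trip / 2


# (transmit, receive) timestamps in microseconds; values are invented
samples = [(0, 4000), (1_000_000, 1_006_000)]
estimates = [one_way_delay_estimate(tx, rx) for tx, rx in samples]
print(estimates)  # [2000.0, 3000.0]
```

The result equals the true one-way delay only if the forward and reverse directions have the same delay; with asymmetric delay it is the average of the two.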
The same STAMP session can also be used to estimate round-trip loss as described in section 5 of [I-D.ietf-spring-stamp-srpm].¶
Various protection and restoration schemes can be implemented. The terms "protection" and "restoration" are used with the same subtle distinctions outlined in section 1 of [RFC4872], [RFC4427] and [RFC3386] respectively.¶
In the most basic scenario, neither protection nor restoration is required. The CS-SR policy has only one candidate path configured. This candidate path is established, activated (O field in LSP object is set to 2) and is carrying traffic.¶
In case of a failure the CS-SR policy will go down and traffic will not be recovered.¶
Typically, two CS-SR policies are deployed, either within the same network with disjoint paths or in two completely separate networks, and the overlay service is responsible for traffic recovery.¶
To avoid pre-allocating protection bandwidth in steady state (Section 7.3) while still being able to react to network failures and recover traffic flow in a deterministic way (maintaining the required bandwidth commitment), the CS-SR policy is configured with two candidate paths.¶
The candidate path with higher preference is established, activated (O field in LSP object is set to 2) and is carrying traffic.¶
The second candidate path with lower preference is only established and activated (O field in LSP object is set to 2) upon a failure impacting the first candidate path in order to send traffic over an alternate path through the network around the failure with potentially relaxed constraints but still satisfying the bandwidth commitment.¶
The second candidate path is generally only requested from the PCE and activated after a failure, but may also be requested and pre-established during CS-SR policy creation, with the downside of bandwidth being set aside ahead of time.¶
As soon as the failure that brought the first candidate path down is cleared, the second candidate path is deactivated (O field in LSP object is set to 1) or torn down. The first candidate path is activated (O field in LSP object is set to 2) and traffic is sent across it.¶
Restoration and reversion behavior is bidirectional. As described in Section 6.1, both headends use liveness in loopback mode and therefore even in case of unidirectional failures both headends will detect the failure or clearance of the failure and switch traffic away from the failed or to the recovered candidate path.¶
For fast recovery against failures the CS-SR policy is configured with two candidate paths. Both paths are established but only the candidate with higher preference is activated (O field in LSP object is set to 2) and is carrying traffic. The candidate path with lower preference has its O field in LSP object set to 1.¶
Appropriate routing of the protect path diverse from the working path can be requested from the PCE by using the "Disjointness Association" object (type 2) defined in [RFC8800] in the PCRpt messages. The disjointness requirements are communicated in the "DISJOINTNESS-CONFIGURATION TLV".¶
The P bit may be set for the first candidate path to allow for finding the best working path that satisfies all constraints without considering diversity to the protect path.¶
The "Objective Function (OF) TLV" as defined in section 5.3 of [RFC8800] may also be added to minimize the common shared resources.¶
Upon a failure impacting the candidate path with higher preference carrying traffic, the candidate path with lower preference is activated immediately and traffic is now sent across it.¶
Protection switching is bidirectional. As described in Section 6.1, both headends will generate and receive their own loopback mode test packets, hence even a unidirectional failure will always be detected by both headends without protection switch coordination required.¶
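Why a unidirectional failure is seen by both headends can be shown with a tiny model. The sketch assumes only two headends, A and Z, and treats each direction of the path as a single element that either works or is failed; the function name and the `failed_directions` set are hypothetical.

```python
def loopback_succeeds(headend, failed_directions):
    """Return True if the headend's own loopback test packet comes back.

    The packet crosses the forward direction and then the reverse direction,
    so a failure in either direction drops it.
    """
    tailend = "Z" if headend == "A" else "A"
    traversed = [(headend, tailend), (tailend, headend)]
    return not any(direction in failed_directions for direction in traversed)


failed = {("A", "Z")}  # only the A-to-Z direction is broken
detected = {h: not loopback_succeeds(h, failed) for h in ("A", "Z")}
print(detected)  # {'A': True, 'Z': True}
```

Each headend reaches its decision from its own test packets alone, so both switch to the other candidate path without exchanging any coordination messages.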
Two cases are to be considered when the failure impacting the candidate path with higher preference is cleared:¶
For further resiliency in case of multiple concurrent failures that could affect both candidate paths in a Section 7.3 scenario, the CS-SR policy is configured with three candidate paths with decreasing preference.¶
The third candidate path enables restoration and will generally only be established, activated (O field in LSP object is set to 2) and carry traffic after failure(s) have impacted both the candidate path with the highest preference and the one with the second highest preference.¶
The third candidate path may also be requested and pre-computed as soon as either the first or the second candidate path goes down due to a failure, with the downside of bandwidth being set aside ahead of time.¶
As soon as the failure(s) that brought either the first or the second candidate path down are cleared, the third candidate path is deactivated (O field in LSP object is set to 1), the candidate path that recovered is activated (O field in LSP object is set to 2) and traffic is sent across it.¶
Protection switching, restoration and reversion behavior is bidirectional. As described in Section 6.1, both headends use liveness in loopback mode and therefore even in case of unidirectional failures both headends will detect the failure or clearance of the failure and switch traffic away from the failed or to the recovered candidate path.¶
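The activation and reversion rules for three candidate paths can be sketched as a selection function. This is an illustrative model, not the headend algorithm: the `CandidatePath` class, the path names and the preference values are hypothetical. The O field values 0 (DOWN), 1 (UP) and 2 (ACTIVE) are those defined in [RFC8231]; reporting a path that is not yet established as DOWN is a simplification made for this sketch.

```python
from dataclasses import dataclass

# LSP operational (O field) values as defined in RFC 8231
O_DOWN, O_UP, O_ACTIVE = 0, 1, 2


@dataclass
class CandidatePath:
    name: str
    preference: int
    established: bool   # a segment list has been computed and installed
    live: bool          # the loopback liveness check currently passes


def operational_states(paths):
    """Map each path name to an O field value; at most one path is ACTIVE."""
    usable = [p for p in paths if p.established and p.live]
    active = max(usable, key=lambda p: p.preference, default=None)
    states = {}
    for p in paths:
        if p is active:
            states[p.name] = O_ACTIVE
        elif p.established and p.live:
            states[p.name] = O_UP
        else:
            states[p.name] = O_DOWN
    return states


paths = [
    CandidatePath("working", 300, True, True),
    CandidatePath("protect", 200, True, True),
    CandidatePath("restore", 100, False, False),  # requested only on demand
]
steady = operational_states(paths)

# Both higher-preference paths fail; the third path is then established.
paths[0].live = paths[1].live = False
paths[2].established = paths[2].live = True
double_failure = operational_states(paths)

# The failure on the working path clears: traffic reverts to it.
paths[0].live = True
reverted = operational_states(paths)
print(steady, double_failure, reverted, sep="\n")
```

Selecting the highest-preference path that is both established and live covers protection switching, restoration and reversion with one rule, and both headends would reach the same result because both observe the same liveness state.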
It is very common to allow operators to trigger a switch between candidate paths even when no failure is present, for example to proactively drain a resource for maintenance purposes. Operator-triggered switching between candidate paths is unidirectional and has to be requested on both headends.¶
TO BE ADDED¶
This document has no IANA actions.¶
The authors want to thank Samuel Sidor, Mike Koldychev and Rakesh Gandhi for providing their review comments.¶
Contributors' Addresses¶
Brent Foster Cisco Systems, Inc. Email: brfoster@cisco.com Bertrand Duvivier Cisco Systems, Inc. Email: bduvivie@cisco.com Stephane Litkowski Cisco Systems, Inc. Email: slitkows@cisco.com¶