Internet-Draft | TPE-aided SPE-Protection | January 2021 |
Wang | Expires 29 July 2021 | [Page] |
MPLS EVPN SPEs cannot make use of an anycast MPLS tunnel (whose egress LSRs are two of these SPEs) because the two SPEs will re-assign different EVPN labels for the same EVPN prefix, and statically configuring an EVPN label for each EVPN prefix would be complicated. At the same time, the TPEs need to advertise specific signalling to perform egress node (TPE) protection. This document specifies egress node protection signalling from and among TPE nodes, and describes how a TPE (whether it is egress-protected or not) can help the SPEs perform egress protection on the basis of that signalling.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 29 July 2021.¶
Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
In Section 2.5 and Section 4.4 of [I-D.wang-bess-evpn-egress-protection], an MPLS egress protection signalling is defined. Section 5.4 of [I-D.wang-bess-evpn-context-label] uses the same signalling to provide egress protection for SPEs. This draft puts the two scenarios together and describes the unified signalling for MPLS SPEs and TPEs.¶
Note that the "egress" in "egress protection" means the egress LSR of the underlay LSP, not the egress LSR of the overlay LSP. The SPEs are not the egress LSRs of the overlay LSP, but they are the egress LSRs of the underlay LSP. So the anycast tunnel for SPEs is also an egress protection tunnel for SPEs.¶
This document uses the following acronyms and terms:¶
The above figure is a combination of Figure 1 of [I-D.wang-bess-evpn-egress-protection] and Figure 6 of [I-D.wang-bess-evpn-context-label]. TPE1, SPE1, SPE2 and TPE2 above correspond to the nodes of the same names in Figure 6 of [I-D.wang-bess-evpn-context-label]. In addition, TPE2 is the PE1 of Figure 1 of [I-D.wang-bess-evpn-egress-protection], TPE3 is the PE2, and SPE1 is the PE3.¶
When TPE2 advertises an EVPN route (say R9), the same R9 is advertised to both SPEs and to TPE3. When TPE3 receives R9, it performs EVPN egress protection. When SPE1 or SPE2 receives the same R9, it advertises R9 to TPE1 with the same nexthop (the anycast tunnel address of SPE1 and SPE2) following Section 3.3.¶
The requirement is therefore that TPE2 uses the same route attributes when advertising R9 to both the SPEs and the TPEs.¶
In addition, note that the BUM tunnel (T1) from PE1 (TPE2) to PE2 (TPE3) may travel through PLR1, and PLR1 reroutes the packets destined to PE2 back to PE1 when PE2 fails. In that case, PE1 should drop these packets, because their EVI labels are mirrored EVI labels (in a context-specific label space) and their ESI labels are present.¶
Note that the Leaf labels (along with mirrored EVI labels) should be distinguished from the ESI labels (along with mirrored EVI labels), because packets carrying the former should not be dropped while packets carrying the latter can be dropped. They can be distinguished by installing the mirrored Leaf labels; the mirrored ESI labels need not be installed.¶
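The distinction described above can be sketched as a lookup in the mirrored label space: a label that was installed (a mirrored Leaf label) is processed, and a label that was never installed (such as a mirrored ESI label) misses the lookup and the packet is dropped. The class name, method name and label values below are hypothetical and only illustrate the idea under that assumption.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class MirroredLabelSpace:
    """Context-specific label space kept by PE1 on behalf of PE2 (hypothetical model)."""
    mirrored_evi_label: int
    # Only mirrored Leaf labels are installed; mirrored ESI labels are left out on purpose.
    installed_leaf_labels: Set[int] = field(default_factory=set)

    def action_for(self, evi_label: int, inner_label: int) -> str:
        """Decide what to do with a rerouted packet carrying (evi_label, inner_label)."""
        if evi_label != self.mirrored_evi_label:
            return "out-of-scope"      # not a mirrored EVI label; not covered by this sketch
        if inner_label in self.installed_leaf_labels:
            return "process"           # mirrored Leaf label: must not be dropped
        return "drop"                  # lookup miss, e.g. an ESI label that was never installed


# Hypothetical label values.
space = MirroredLabelSpace(mirrored_evi_label=1000, installed_leaf_labels={2001})
leaf_action = space.action_for(1000, 2001)
esi_action = space.action_for(1000, 3001)
```

With this layout no explicit "drop" entry is needed for ESI labels, which matches the observation that the mirrored ESI labels need not be installed.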
When TPE3 receives an EVPN route R0 whose nexthop matches the prefix LOC1, TPE3 may discard R0 because its nexthop is considered to be TPE3's own address. Even if TPE3 does not discard R0, TPE3 cannot use that nexthop to send an EVPN data packet to TPE2.¶
This is because a destination IP address within prefix LOC1 (in the form of LOC1_P) is treated as addressed to TPE3 itself. Therefore IP_N1 and IP_N2, rather than LOC1 and LOC2, should be used to establish the bypass path between TPE2 and TPE3.¶
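The nexthop problem above can be illustrated with a small prefix-match check. This is only a sketch: the function name is hypothetical, and the IPv6 documentation prefix 2001:db8::/32 is used to stand in for LOC1, for an address of the form LOC1_P, and for a node address such as IP_N1 or IP_N2, whose real values are not given in this document.

```python
import ipaddress
from typing import Iterable


def nexthop_looks_local(nexthop: str, local_prefixes: Iterable[str]) -> bool:
    """Return True if the nexthop falls inside a prefix the node treats as its own."""
    addr = ipaddress.ip_address(nexthop)
    return any(addr in ipaddress.ip_network(prefix) for prefix in local_prefixes)


# Assumed example values (not taken from the draft).
LOC1 = "2001:db8:1::/48"          # prefix that TPE3 considers local
LOC1_P = "2001:db8:1::100"        # an address inside LOC1
NODE_ADDRESS = "2001:db8:ffff::2" # an address outside LOC1, standing in for IP_N1/IP_N2

loc_nexthop_is_local = nexthop_looks_local(LOC1_P, [LOC1])
node_nexthop_is_local = nexthop_looks_local(NODE_ADDRESS, [LOC1])
```

A nexthop inside LOC1 is classified as local, so it cannot be used to reach TPE2, whereas an address outside LOC1 is not, which is why the bypass path is built on the node addresses.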
The downstream-CLS ID Extended Community is a new Transitive Opaque EC with the following structure (Sub-Type value to be assigned by IANA):¶
M bit: Multi-homing Flag. If the EVPN route is advertised by a TPE that belongs to a redundancy group, and the nexthop of that route is the TPE's anycast address, the multi-homing flag should be set to 1.¶
If the EVPN route is advertised by an SPE that belongs to no redundancy group, and the nexthop of that route is not an anycast address, the multi-homing flag should be kept unchanged.¶
If the EVPN route is advertised by an SPE that belongs to a redundancy group, and the nexthop of that route is the redundancy group's anycast address, the multi-homing flag should be rewritten to 1.¶
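The three M bit rules above can be condensed into one small decision function. This is a sketch rather than a normative procedure: the function and parameter names are hypothetical, and the behaviour for combinations that the rules do not mention (for example a node in a redundancy group that advertises a non-anycast nexthop) is an assumption marked in the comments.

```python
def advertised_m_bit(role: str, in_redundancy_group: bool,
                     nexthop_is_anycast: bool, received_m_bit: int = 0) -> int:
    """Return the M bit a TPE or SPE places in the route it advertises."""
    if in_redundancy_group and nexthop_is_anycast:
        # Covers both the TPE rule ("set to 1") and the SPE rule ("rewritten to 1").
        return 1
    if role == "SPE":
        # SPE outside a redundancy group with a non-anycast nexthop: keep the flag.
        # Assumption: other unspecified SPE combinations also keep the flag.
        return received_m_bit
    # Assumption: a TPE that is not advertising a redundancy-group anycast nexthop sends 0.
    return 0


tpe_in_group = advertised_m_bit("TPE", True, True)
spe_in_group = advertised_m_bit("SPE", True, True, received_m_bit=0)
spe_standalone = advertised_m_bit("SPE", False, False, received_m_bit=1)
tpe_standalone = advertised_m_bit("TPE", False, False)
```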
Note that although the downstream-CLS ID EC is highly similar to the Context Label Space ID Extended Community (see Section 3.1 of [I-D.ietf-bess-mvpn-evpn-aggregation-label]) in its encoding, the two have entirely different behaviors in the data plane. The CLS-ID EC is treated as an incoming label in the data plane, whereas the downstream-CLS ID EC is treated as an outgoing label. They therefore cannot share the same code-point in the signalling procedures.¶
First of all, we reserve a portion of the label space for assignment by a central authority. We refer to this reserved portion as the "Domain-wide Common Block" (DCB) of labels. This is analogous to the DCB that is described in Section 3.1. The DCB is taken from the same label space that is used for downstream-assigned labels, but each PE knows not to allocate local labels from that space. A PE knows by provisioning which label from the DCB corresponds to itself, and that each of the other labels from the DCB corresponds to another PE of the domain.¶
Note that a PE does not have to know exactly which label corresponds to which other PE. It only needs to know which label is its own and that the other DCB labels are not.¶
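A minimal sketch of the DCB behaviour described above follows. The DCB range, the function names and the example label values are assumptions made for illustration; only the facts that a PE knows its own DCB label by provisioning and never allocates local labels from the DCB come from the text. The bounds 16 and 1048575 are the first non-reserved and the largest 20-bit MPLS label values.

```python
from typing import Set

# Hypothetical reserved block; the real DCB range is a matter of provisioning.
DCB = range(1000, 1100)


def classify_label(label: int, own_dcb_label: int, dcb: range = DCB) -> str:
    """Classify a label as this PE's own DCB label, another PE's, or a local label."""
    if label not in dcb:
        return "local"
    return "self" if label == own_dcb_label else "other-pe"


def allocate_local_label(in_use: Set[int], dcb: range = DCB,
                         low: int = 16, high: int = 1_048_575) -> int:
    """Pick the lowest free label that is outside the DCB."""
    for label in range(low, high + 1):
        if label in dcb or label in in_use:
            continue
        return label
    raise RuntimeError("label space exhausted")


own = classify_label(1005, own_dcb_label=1005)
other = classify_label(1006, own_dcb_label=1005)
first_free = allocate_local_label(in_use=set(range(16, 1000)))
```

Here the allocator skips the whole DCB once labels 16 to 999 are in use, so the PE never hands out a DCB label as a local label, and the classifier never needs a per-PE mapping for labels other than its own.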
The MPLS-specific procedures are defined in the following list:¶
Taking the above use case as an example, the two SPEs are the egress nodes of an anycast SR-MPLS tunnel. The anycast SR-MPLS tunnel is used to transport flows from TPE1 to either SPE1 or SPE2 according to load balancing procedures. So SPE1 and SPE2 have to advertise the same EVPN label independently for a given EVPN route.¶
When TPE2 sends a MAC/IP advertisement route (say R8) to SPE1 and SPE2, a "Downstream Context-specific Label Space (CLS) ID Extended Community" can be included in R8 along with an EVPN label (say EVL4).¶
When SPE1 and SPE2 receive R8 from TPE2, they should advertise R8 to TPE1 independently, and the next-hop of R8 should be changed to the common anycast node address (say IP_12) of SPE1 and SPE2 before the advertisement. But SPE1 and SPE2 can simply keep R8's EVPN label (the EVL4 from TPE2) unchanged.¶
The context-VC label (say VCL4) in the "downstream-CLS ID EC" is also kept unchanged.¶
Note that although EVL4 and VCL4 are unchanged, a CLS-specific ILM whose label operation is "label swapping" should still be installed, because the outgoing PSN tunnel information has to be resolved.¶
Note that the two outgoing labels of the label swapping have the same values (EVL4 and VCL4) as the two incoming labels.¶
Note that if there is no TPE3, TPE2 is in no redundancy group, and the SPEs will receive R8 with M bit = 0. In that case, the SPEs will not push VCL4 onto the label stack for TPE2.¶
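The SPE behaviour for R8 described above can be sketched as follows. The data classes, field names and label values are hypothetical, and the order of the two labels (VCL4 outside EVL4) is an assumption. The sketch changes only the nexthop and the M bit when re-advertising, keeps EVL4 and VCL4, and derives a swap entry whose outgoing labels depend on the M bit received from TPE2.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class DownstreamClsIdEc:
    id_value: int   # context-VC label, e.g. VCL4
    m_bit: int


@dataclass(frozen=True)
class EvpnRoute:
    name: str
    nexthop: str
    evpn_label: int
    ds_cls_id: Optional[DownstreamClsIdEc]


@dataclass(frozen=True)
class SwapEntry:
    in_labels: Tuple[int, ...]
    out_labels: Tuple[int, ...]
    out_tunnel_to: str


def spe_readvertise(route: EvpnRoute, anycast_nexthop: str) -> Tuple[EvpnRoute, SwapEntry]:
    """Re-advertise a route towards TPE1 and build the CLS-specific swap entry."""
    ec = route.ds_cls_id
    if ec is None:
        raise ValueError("this sketch covers only routes that carry the EC")
    # Nexthop becomes the anycast address; EVPN label and VCL stay unchanged;
    # the M bit is rewritten to 1 because the nexthop is now an anycast address.
    readvertised = replace(route, nexthop=anycast_nexthop,
                           ds_cls_id=DownstreamClsIdEc(ec.id_value, 1))
    in_labels = (ec.id_value, route.evpn_label)
    # With M bit = 0 from the TPE, the VCL is not pushed towards that TPE.
    out_labels = in_labels if ec.m_bit == 1 else (route.evpn_label,)
    return readvertised, SwapEntry(in_labels, out_labels, out_tunnel_to=route.nexthop)


# Hypothetical values: EVL4 = 400, VCL4 = 40.
r8 = EvpnRoute("R8", nexthop="TPE2", evpn_label=400, ds_cls_id=DownstreamClsIdEc(40, 1))
readvertised_r8, swap_entry = spe_readvertise(r8, "IP_12")
```

Because the function depends only on the received route and the shared anycast address, SPE1 and SPE2 arrive at the same re-advertised route without coordinating with each other.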
When TPE2 does not advertise the Downstream-CLS ID EC to SPE1 and SPE2, they have to generate that EC by themselves.¶
In such a case, TPE2 should advertise the OPE TLV for R8. A context-VC infrastructure should also be established beforehand. This infrastructure should ensure that the context-VCs from TPE2 to all other TPEs/SPEs have the same VCL value.¶
SPE1 can then set the ID-Value of the Downstream-CLS ID EC to the VCL of the context-VC from TPE2 to itself, and set the ID-Type to 0. In this way the same Downstream-CLS ID EC can be generated by the SPEs independently.¶
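The local generation of the EC can be sketched as below. The table of context-VC labels, the function name and the label value are hypothetical; the sketch only shows that, when the context-VC infrastructure gives every context-VC from TPE2 the same VCL, each SPE derives an identical EC on its own. The wire encoding of the EC is deliberately not modelled.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class DownstreamClsIdEc:
    id_type: int
    id_value: int


def generate_ds_cls_id(context_vc_labels: Dict[Tuple[str, str], int],
                       originating_tpe: str, spe: str) -> DownstreamClsIdEc:
    """Build the EC from the VCL of the context-VC running from the TPE to this SPE."""
    vcl = context_vc_labels[(originating_tpe, spe)]
    return DownstreamClsIdEc(id_type=0, id_value=vcl)


# Hypothetical infrastructure state: same VCL (40) from TPE2 to every peer.
context_vc_labels = {("TPE2", "SPE1"): 40, ("TPE2", "SPE2"): 40}
ec_from_spe1 = generate_ds_cls_id(context_vc_labels, "TPE2", "SPE1")
ec_from_spe2 = generate_ds_cls_id(context_vc_labels, "TPE2", "SPE2")
```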
It is feasible for such a context-VC infrastructure to be implemented on the basis of Kompella VPLS signalling or BGP SR signalling. However, it is preferable for the admin-EVI (as the context-VC infrastructure) and EVPN VPLS to use the same signalling framework.¶
Please see section 5 of [I-D.wang-bess-evpn-egress-protection].¶
The label stack on the anycast SR-MPLS tunnel is constructed by TPE1 as follows:¶
Note that the SR Tunnel Label (TL) in the label stack is the anycast SR-LSP label from TPE1 to SPE1 or SPE2, and that the VCL4 in the label stack is mandatory (from the viewpoint of TPE1).¶
Note that the context-VC is constructed (on SPE1 and SPE2) in the per-platform label space, and the VC labels from TPE2 to SPE1 and SPE2 have the same value (VCL4), so the label stacks (from the viewpoint of TPE1) are the same for SPE1 and SPE2. That is why TPE1 can use the anycast tunnel to SPE1 and SPE2 for R8.¶
When SPE1/SPE2 receives that data packet, it performs a CLS-specific ILM lookup for the EVPN label in the "TPE2-specific label space", which is identified by the context-VC label VCL4. The label operation is "swapping", and the new outgoing EVPN label has the same value (EVL4).¶
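The data-plane behaviour described above can be sketched with a two-level lookup. The class, the method names and the label values are hypothetical. The stack order (TL outermost, then VCL4, then EVL4) is an assumption inferred from the lookup description, and the sketch also assumes that TL is still present when the packet reaches the SPE and that TPE2 is in a redundancy group, so both labels are kept on the outgoing side.

```python
from typing import Dict, List, Tuple

# Hypothetical label values standing in for TL, VCL4 and EVL4.
TL, VCL4, EVL4 = 16001, 40, 400


def tpe1_label_stack(tl: int, vcl: int, evl: int) -> List[int]:
    """Label stack pushed by TPE1, outermost label first (assumed order)."""
    return [tl, vcl, evl]


class Spe:
    def __init__(self, name: str) -> None:
        self.name = name
        # context-VC label -> {incoming EVPN label: (outgoing labels, PSN tunnel)}
        self.context_spaces: Dict[int, Dict[int, Tuple[Tuple[int, ...], str]]] = {}

    def install_swap(self, vcl: int, evl: int,
                     out_labels: Tuple[int, ...], tunnel_to: str) -> None:
        self.context_spaces.setdefault(vcl, {})[evl] = (tuple(out_labels), tunnel_to)

    def forward(self, stack: List[int]) -> Tuple[str, List[int]]:
        """Pop TL, select the label space by VCL, then swap the EVPN label."""
        _tunnel_label, vcl, evl, *remaining = stack
        out_labels, tunnel_to = self.context_spaces[vcl][evl]
        return tunnel_to, list(out_labels) + remaining


spe1, spe2 = Spe("SPE1"), Spe("SPE2")
for spe in (spe1, spe2):
    spe.install_swap(VCL4, EVL4, out_labels=(VCL4, EVL4), tunnel_to="TPE2")

stack = tpe1_label_stack(TL, VCL4, EVL4)
result_spe1 = spe1.forward(stack)
result_spe2 = spe2.forward(stack)
```

Since both SPEs hold the same entry under the same context-VC label, the packet is handled identically whichever SPE the anycast tunnel delivers it to.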
This document introduces a new Transitive Opaque Extended Community "Downstream CLS ID Extended Community". An IANA request will be submitted later for the code-point in the BGP Transitive Opaque Extended Community Sub-Types registry.¶
This section will be added in future versions.¶
TBD.¶