Internet-Draft | EVPN Multicast Source Redundancy Update | October 2022 |
Nagaraj, et al. | Expires 27 April 2023 | [Page] |
draft-ietf-bess-evpn-redundant-mcast-source specifies Warm Standby (WS) and Hot Standby (HS) procedures for handling redundant multicast traffic into an EVPN tenant domain. With the Hot Standby procedure, multiple ingress PEs may inject traffic and an egress PE will decide from which ingress PE traffic will be accepted and forwarded. The decision is based on certain signaling messages and/or BFD status of provider tunnels from the ingress PEs, and the traffic is associated with ingress PEs based on Ethernet Segment Identifier (ESI) labels. As a result, the procedures in that document apply only to the MPLS data plane. This document extends the Hot Standby procedures to non-MPLS data planes and EVPN Data Center Interconnect scenarios.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 27 April 2023.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
[I-D.ietf-bess-evpn-redundant-mcast-source] specifies Warm Standby and Hot Standby procedures for handling redundant multicast traffic into an EVPN tenant domain. With the Hot Standby procedure, multiple ingress PEs will inject traffic and an egress PE will decide from which ingress PE traffic will be accepted and forwarded.¶
The PEs that inject redundant traffic advertise Selective Provider Multicast Service Interface (S-PMSI) A-D routes. The routes carry an EVPN Multicast Flags Extended Community with a bit to indicate that matching traffic is from redundant sources. With the MPLS data plane, the routes also carry an Ethernet Segment Identifier (ESI) label, indicating the Ethernet Segment on which the traffic is received.¶
When an egress PE receives S-PMSI A-D routes, it decides from which ingress PE it should accept the traffic. The decision could be based on the following factors:¶
All the above options are local behaviors on individual egress PEs.¶
With the MPLS data plane, the Hot Standby redundant flows from different PEs are distinguished via ESI labels. With non-MPLS IP encapsulations like VXLAN/NVGRE, this document specifies that the redundant flows are distinguished by the source IP address (Source VTEP IP) in the outer IP header.¶
This document also makes explicit that non-MPLS IP tunnels that carry an identifier of the source Ethernet Segment reuse all the procedures of [I-D.ietf-bess-evpn-redundant-mcast-source] for Hot Standby redundancy. Examples of these tunnels used by EVPN are GENEVE [I-D.ietf-bess-evpn-geneve] and SRv6 [RFC9252].¶
When an EVPN network is used as a Data Center Interconnect (DCI) for DCs (e.g., VXLAN or EVPN), multiple gateways (GWs) are placed between a DC and the DCI, as described in [RFC9014]. A virtual Ethernet Segment (ES) is defined for each EVPN (the DC and/or DCI) and multi-homed to the GWs. A Designated Forwarder (DF) is elected for each virtual ES. Each GW can receive the same BUM traffic from a DC/DCI EVPN, but only the DF will forward traffic to the next DCI/DC (corresponding to the virtual ES).¶
This section discusses how source redundancy works with DCI, and how DCI GWs can optionally introduce redundant flows even when there is no source redundancy at the source DC.¶
[I-D.ietf-bess-evpn-redundant-mcast-source] is MPLS-based. It is "true" source redundancy in that multiple sources of the same flow are attached to different Ethernet Segments. S-PMSI A-D routes announce the redundant flows and carry ESI Label Extended Communities (ECs) for the ESes so that an egress PE can choose from which source ES the packets will be accepted.¶
With DCI, the source ESes are hidden outside the source DC, and different DC/DCI may use different data planes. Additionally, currently only the GW that is the DF for the Interconnect Ethernet Segment (I-ES) will forward BUM traffic to the downstream DC/DCI, so the benefit of HS is lost once the first DC boundary is crossed.¶
The above issues are solved as follows:¶
When the S-PMSI A-D routes do not carry ESI Label ECs, an egress PE chooses the PE/GW (rather than the ES) from which to accept traffic.¶
In the "true source redundancy" case, S-PMSI A-D routes announce the redundancy and the DCI GWs always forward accepted flows regardless of the DF status.¶
The GWs may also forward all BUM traffic regardless of DF status, not just the redundant flows announced by S-PMSI A-D routes. This creates a scenario similar to source redundancy, though it is introduced by the GWs. A downstream GW/PE can choose which redundant flows need to be accepted or discarded based on the A-D per ES routes for the I-ES instead of S-PMSI A-D routes.¶
This requires that all downstream PEs/GWs behave consistently. That is ensured either based on provisioning or based on signaling (details to be added in a future revision).¶
In the "true source redundancy" case, all flows covered by the (*, g) or (s-prefix, g) in the S-PMSI A-D routes are treated as redundant flows. In the GW-introduced redundancy, (*, g) flows are treated as distinct flows that have redundant copies. They may be from different PEs in the local DC and all must be accepted, or they may be from different DCs in which case only traffic from one GW for each upstream DC can be accepted, as explained below.¶
Consider that a DCI interconnects three DCs. GW1a/GW1b connect DC1 and the DCI, GW2a/GW2b connect DC2 and the DCI, and GW3a/GW3b connect DC3 and the DCI.¶
An egress PE1 in DC1 may need to accept and forward (*, G) traffic from all local PEs in DC1 and from GW1a, but not from GW1b. To do so, it installs a (*, G) forwarding state in a BD (Broadcast Domain) with an indication that traffic from GW1b must be discarded. Similarly, GW3a/GW3b may need to accept and forward (*, G) traffic from GW1a/GW2a but not from GW1b/GW2b. To do so, they install a (*, G) forwarding state with an indication that traffic from GW1b/GW2b must be discarded.¶
The reverse logic (of specifying PEs/GWs from which traffic should not be accepted) is only needed for (*, G) entries in the DCI case. For the (S, G) case, the reverse logic is not needed because an egress PE should be able to decide from which PE/GW the traffic should be accepted.¶
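As an illustration only, the reverse logic for (*, G) entries can be sketched in Python as a forwarding state that lists the upstream PEs/GWs to discard rather than those to accept. The class name `StarGState`, its fields, and the group address 239.1.1.1 are assumptions made for this sketch and are not defined by this document.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class StarGState:
    """Hypothetical (*, G) forwarding entry in a Broadcast Domain."""
    group: str
    # Upstream PEs/GWs whose copies must be dropped (the "reverse logic").
    discard_from: Set[str] = field(default_factory=set)

    def accepts(self, upstream: str) -> bool:
        # Anything not explicitly listed is accepted, so traffic from all
        # local PEs is forwarded without having to enumerate them.
        return upstream not in self.discard_from


# Egress PE1 in DC1: accept local PEs and GW1a, discard GW1b.
pe1_state = StarGState(group="239.1.1.1", discard_from={"GW1b"})

# GW3a/GW3b: accept GW1a/GW2a, discard GW1b/GW2b.
gw3_state = StarGState(group="239.1.1.1", discard_from={"GW1b", "GW2b"})
```

Keeping a discard set instead of an accept set reflects the point made above: for (*, G) the set of legitimate local senders is open-ended, while the set of non-selected GWs is known.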
Both flavors of redundancy can co-exist. For redundant flows announced by S-PMSI A-D routes, the method described in Section 1.2.1 is used. For GW-introduced redundancy, the method described in Section 1.2.2 is used. The difference between the two on downstream PEs/GWs is that one uses S-PMSI A-D routes while the other uses I-ES A-D per ES routes to choose which flow to accept, and for (*,g) flows in the latter case, reverse logic is needed.¶
In case the EVPN network uses non-MPLS IP tunnels without source Ethernet Segment identification, e.g., VXLAN/NVGRE, the procedures in [I-D.ietf-bess-evpn-redundant-mcast-source] for Hot Standby redundancy are modified as follows:¶
Upon receiving the S-PMSI A-D routes, the downstream PEs select a primary upstream PE out of the list of (S-PMSI A-D route) next hops and add an RPF check to the (*,G)/(S,G) state in the BD or SBD (Supplementary Broadcast Domain). This RPF check discards all ingress packets to (*,G)/(S,G) that are not received from the selected primary Source VTEP. The selection of the primary upstream PE is a matter of local policy. For instance, an egress PE could keep track of traffic statistics of redundant flows and dynamically decide which flow is accepted based on traffic threshold information.¶
The selection of the upstream PE for non-MPLS IP tunnels, instead of the primary Source Ethernet Segment, provides a solution for redundant sources connected to different upstream PEs. However, it MUST NOT be used when the redundant sources are connected to the same upstream PE, or multi-homed to the same set of upstream PEs.¶
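The RPF check on the Source VTEP can be sketched as follows. This is a minimal Python illustration, not a normative implementation: the names `McastState`, `rpf_accept`, and `select_primary` are hypothetical, the addresses are documentation examples, and the "lowest next-hop address" rule is only a placeholder for the local policy that this document leaves open.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class McastState:
    """Hypothetical (*,G)/(S,G) entry in a BD or SBD."""
    source: Optional[str]  # None stands for the wildcard in (*,G)
    group: str
    primary_vtep: Optional[str] = None  # unset until a primary PE is selected

    def rpf_accept(self, outer_src_ip: str) -> bool:
        # Discard every packet whose outer source IP is not the primary VTEP.
        return self.primary_vtep is not None and outer_src_ip == self.primary_vtep


def select_primary(next_hops: List[str]) -> str:
    """Placeholder local policy (an assumption): lowest next-hop address.

    Assumes all next hops belong to the same address family.
    """
    if not next_hops:
        raise ValueError("no S-PMSI A-D route next hops to choose from")
    return min(next_hops, key=ipaddress.ip_address)


# Two upstream PEs advertise S-PMSI A-D routes for the same (*,G).
state = McastState(source=None, group="239.1.1.1")
state.primary_vtep = select_primary(["192.0.2.20", "192.0.2.3"])
```

A real egress PE could replace `select_primary` with any local policy, such as the traffic-statistics approach mentioned above, without changing the RPF check itself.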
In case the EVPN network uses non-MPLS IP tunnels that can carry a source Ethernet Segment identification, e.g., GENEVE or SRv6, all the procedures in [I-D.ietf-bess-evpn-redundant-mcast-source] for Hot Standby redundancy are followed. The following considerations apply:¶
To be added.¶
No additional security considerations are needed beyond those in [I-D.ietf-bess-evpn-redundant-mcast-source].¶