Internet-Draft | EVPN and IPVPN Interworking | July 2022 |
Rabadan, et al. | Expires 7 January 2023 | [Page] |
EVPN is used as a unified control plane for tenant network intra and inter-subnet forwarding. When a tenant network spans not only EVPN domains but also domains where BGP VPN-IP or IP families provide inter-subnet forwarding, there is a need to specify the interworking aspects between BGP domains of type EVPN, VPN-IP and IP, so that the end to end tenant connectivity can be accomplished. This document specifies how EVPN interworks with VPN-IPv4/VPN-IPv6 and IPv4/IPv6 BGP families for inter-subnet forwarding. The document also addresses the interconnect of EVPN domains for Inter-Subnet Forwarding routes. In addition, this specification defines a new BGP Path Attribute called D-PATH (Domain PATH) that protects gateways against control plane loops. D-PATH modifies the BGP best path selection for multiprotocol BGP routes of SAFI 1, 128 and EVPN IP Prefix routes, and therefore this document updates [RFC4271].¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 7 January 2023.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
EVPN is used as a unified control plane for tenant network intra and inter-subnet forwarding. When a tenant network spans not only EVPN domains but also domains where BGP VPN-IP or IP families provide inter-subnet forwarding, there is a need to specify the interworking aspects between the different families, so that the end to end tenant connectivity can be accomplished. This document specifies how EVPN should interwork with VPN-IPv4/VPN-IPv6 and IPv4/IPv6 BGP families for inter-subnet forwarding. The document also addresses the interconnect of an EVPN domain to another EVPN domain for Inter-Subnet Forwarding routes. In addition, this specification defines a new BGP Path Attribute called D-PATH (Domain PATH) that protects gateways against control plane loops. Loops are created when two (or more) redundant gateway PEs interconnect two domains and exchange inter-subnet forwarding routes. For instance, if PE1 and PE2 are redundant gateway PEs interconnecting an IPVPN and an EVPN domain, gateway PE1 receives a VPN-IP route to prefix P and propagates the route into an EVPN IP Prefix to P. If gateway PE2 receives the EVPN IP Prefix route, it cannot propagate the route back to the IPVPN domain, or it would create a loop for prefix P.¶
D-PATH modifies the BGP best path selection for multiprotocol BGP routes of SAFI 1, 128 and EVPN IP Prefix routes, and therefore this document updates [RFC4271].¶
EVPN supports the advertisement of IPv4 or IPv6 prefixes in two different route types:¶
When interworking with other BGP address families (AFIs/SAFIs) for inter-subnet forwarding, the IP prefixes in those two EVPN route types must be propagated to other domains using different SAFIs. Some aspects of that propagation must be clarified. Examples of these aspects or procedures across BGP families are: route selection, loop prevention or BGP Path attribute propagation. The Interworking PE concepts are defined in Section 3, and the rest of the document describes the interaction between Interworking PEs and other PEs for end-to-end inter-subnet forwarding.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This section summarizes the terminology related to the "Interworking PE" concept that will be used throughout the rest of the document.¶
BT: a Bridge Table, as defined in [RFC7432]. A BT is the instantiation of a Broadcast Domain in a PE. When there is a single Broadcast Domain in a given EVI, the MAC-VRF in each PE will contain a single BT. When there are multiple BTs within the same MAC-VRF, each BT is associated with a different Ethernet Tag. The EVPN routes specific to a BT indicate which Ethernet Tag the route corresponds to.¶
Example: In Figure 1, MAC-VRF1 has two BTs: BT1 and BT2. Ethernet Tag x is defined in BT1 and Ethernet Tag y in BT2.¶
AC: Attachment Circuit or logical interface associated to a given BT or IP-VRF. To determine the AC on which a packet arrived, the PE will examine the combination of a physical port and VLAN tags (where the VLAN tags can be individual c-tags, s-tags or ranges of both).¶
Example: In Figure 1, AC1 is associated to BT1, AC2 to BT2 and AC3 to IP-VRF1.¶
MPLS/NVO tnl: It refers to a tunnel that can be MPLS or NVO-based (Network Virtualization Overlays) and it is used by MAC-VRFs and IP-VRFs. Irrespective of the type, the tunnel may carry an Ethernet or an IP payload. MAC-VRFs can only use tunnels with Ethernet payloads (set up by EVPN), whereas IP-VRFs can use tunnels with Ethernet payloads (set up by EVPN) or IP payloads (set up by EVPN or IPVPN). IPVPN-only PEs have IP-VRFs but they cannot send or receive traffic on tunnels with Ethernet payloads.¶
Example: Figure 1 shows an MPLS/NVO tunnel that is used to transport Ethernet frames to/from MAC-VRF1. The PE determines the MAC-VRF and BT the packets belong to based on the EVPN label (MPLS or VNI). Figure 1 also shows two MPLS/NVO tunnels being used by IP-VRF1, one carrying Ethernet frames and the other one carrying IP packets.¶
Domain: Two PEs are in the same domain if they are attached to the same tenant and the packets between them do not require a data path IP lookup (in the tenant space) in any intermediate router. A gateway PE is always configured with multiple DOMAIN-IDs.¶
Example 1: Figure 2 depicts an example where Tenant Systems TS1 and TS2 belong to the same tenant, and they are located in different Data Centers that are connected by gateway PEs (see the gateway PE definition later). These gateway PEs use IPVPN in the WAN. When TS1 sends traffic to TS2, the intermediate routers between PE1 and PE2 require a tenant IP lookup in their IP-VRFs so that the packets can be forwarded. In this example there are three different domains. The gateway PEs connect the EVPN domains to the IPVPN domain.¶
Interworking PE: a PE that may advertise a given prefix with an EVPN ISF route (RT-2 or RT-5) and/or an IPVPN ISF route and/or a BGP IP ISF route. An interworking PE has one IP-VRF per tenant, and zero, one or multiple MAC-VRFs per tenant. Each MAC-VRF may contain one or more BTs, where each BT may be attached to that IP-VRF via IRB. There are two types of Interworking PEs: composite PEs and gateway PEs. Both PE functions can be independently implemented per tenant and they may both be implemented for the same tenant.¶
Example: Figure 1 shows an interworking PE of type gateway, where ISF SAFIs 1, 128 and 70 are enabled. IP-VRF1 and MAC-VRF1 are instantiated on the PE, and together provide inter-subnet forwarding for the tenant.¶
Composite PE: an interworking PE that is attached to a composite domain and advertises a given prefix to an IPVPN peer with an IPVPN ISF route, to an EVPN peer with an EVPN ISF route, and to a route reflector with both an IPVPN and EVPN ISF route. A composite PE performs the procedures of Section 7.¶
Example: Figure 4 shows an example where PE1 is a composite PE, since PE1 has EVPN and another ISF SAFI enabled to the same route-reflector, and PE1 advertises a given IP prefix IPn/x twice, once using EVPN and once using ISF SAFI 128. PE2 and PE3 are not composite PEs.¶
Gateway PE: an interworking PE that is attached to two (or more) domains, each either regular or composite, and which, based on configuration, does one of the following:¶
A gateway PE is always configured with multiple DOMAIN-IDs. The DOMAIN-ID is encoded in the Domain Path Attribute (D-PATH), and advertised along with ISF SAFI routes. Section 4 describes the D-PATH attribute.¶
Example: Figure 5 illustrates an example where PE1 is a gateway PE since the EVPN and IPVPN SAFIs are enabled on different BGP peers, and a given local IP prefix IPn/x is sent to both BGP peers for the same tenant. PE2 and PE1 are in one domain and PE3 and PE1 are in another domain.¶
Composite/Gateway PE: an interworking PE that is both a composite PE and a gateway PE that is attached to two domains, one regular and one composite, and which does the following:¶
This is particularly useful when a tenant network uses multiple ISF SAFIs (BGP IP, IPVPN and EVPN domains) and any-to-any connectivity is required. In this case end-to-end control plane consistency, when possible, is desired.¶
The BGP Domain Path (D-PATH) attribute is an optional and transitive BGP path attribute.¶
Similar to AS_PATH, D-PATH is composed of a sequence of Domain segments. Each Domain segment is comprised of <domain segment length, domain segment value>, where the domain segment value is a sequence of one or more Domains, as illustrated in Figure 6. Each domain is represented by <DOMAIN-ID:ISF_SAFI_TYPE>.¶
Value | Type |
---|---|
0 | Gateway PE local ISF route |
1 | SAFI 1 |
70 | EVPN |
128 | SAFI 128 |
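As an illustration only, the D-PATH structure described above can be modeled in memory as a sequence of domain segments, each holding one or more <DOMAIN-ID:ISF_SAFI_TYPE> entries. The sketch below is not a wire encoding: the class and function names are hypothetical, the DOMAIN-ID is kept in the textual form used in this document's examples (e.g., 6500:1), and SAFI 128 is labeled IPVPN to match those examples.¶

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Tuple


class IsfSafiType(IntEnum):
    """ISF_SAFI_TYPE values taken from the table above."""
    LOCAL = 0     # Gateway PE local ISF route
    SAFI_1 = 1    # SAFI 1
    EVPN = 70     # EVPN
    IPVPN = 128   # SAFI 128 (written as IPVPN in this document's examples)


@dataclass(frozen=True)
class Domain:
    domain_id: str       # textual DOMAIN-ID, e.g. "6500:1" (assumed representation)
    isf_safi_type: int   # plain int, so values unknown to this sketch can be carried

    def __str__(self) -> str:
        try:
            label = IsfSafiType(self.isf_safi_type).name
        except ValueError:
            label = str(self.isf_safi_type)  # unknown type: print the number
        return f"{self.domain_id}:{label}"


# A domain segment is a sequence of one or more domains;
# a D-PATH is a sequence of domain segments.
DomainSegment = Tuple[Domain, ...]
DPath = Tuple[DomainSegment, ...]


def dpath_domains(dpath: DPath) -> List[Domain]:
    """Flatten all segments into the ordered list of domains."""
    return [domain for segment in dpath for domain in segment]


def render_dpath(dpath: DPath) -> str:
    """Render a D-PATH in the {<...>,<...>} notation used in this document."""
    return "{" + ",".join(f"<{d}>" for d in dpath_domains(dpath)) + "}"


if __name__ == "__main__":
    example = ((Domain("6500:2", 128), Domain("6500:1", 70)),)
    print(render_dpath(example))
```

Keeping ISF_SAFI_TYPE as a plain integer, rather than restricting it to the enumerated values, lets the model carry a type value it does not recognize without rejecting the entry.¶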
About the BGP D-PATH attribute:¶
Identifies the sequence of domains, each identified by a <DOMAIN-ID:ISF_SAFI_TYPE> through which a given ISF route has passed.¶
It is added/modified by a gateway PE when propagating an update to a different domain (which runs the same or different ISF SAFI):¶
For a local ISF route, i.e., a configured route or a route learned from a local attachment circuit, a gateway PE has three choices:¶
An ISF route received by a gateway PE with a D-PATH attribute that contains one or more of its locally associated DOMAIN-IDs for the IP-VRF is considered to be a looped ISF route. The ISF route in this case MUST be flagged as "looped" and be installed in the IP-VRF only in case there is no better route after the best path selection (Section 6). The ISF_SAFI_TYPE is irrelevant for the purpose of loop detection of an ISF route. In other words, an ISF route is considered as a looped route if it contains a D-PATH attribute with at least one DOMAIN-ID matching a local DOMAIN-ID, irrespective of the ISF_SAFI_TYPE of the DOMAIN-ID.¶
For instance, in the example of Figure 2, gateway GW1 receives TS1 prefix in two different ISF routes:¶
Gateway GW1 flags the SAFI 128 route as "looped" (since 6500:1 is a local DOMAIN-ID in GW1) and it will not install it in the tenant IP-VRF, since the route selection process selects the EVPN RT-5 due to a shorter D-PATH attribute (Section 6). Gateway GW1 identifies the route as "looped" even if the ISF_SAFI_TYPE value is unknown to GW1 (i.e., any value different from the ones specified in this document).¶
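The loop check described above can be sketched as follows. This is a minimal illustration, not a normative procedure: the function name, the flattened list-of-pairs representation of the D-PATH, and the sample D-PATH values are assumptions made for the example. Only the rule itself (a match on any local DOMAIN-ID, regardless of ISF_SAFI_TYPE) comes from the text.¶

```python
from typing import Optional, Sequence, Set, Tuple

# One D-PATH entry: (DOMAIN-ID, ISF_SAFI_TYPE). The DOMAIN-ID is kept in the
# textual form used in this document's examples (assumed representation).
DPathEntry = Tuple[str, int]


def is_looped(dpath: Optional[Sequence[DPathEntry]],
              local_domain_ids: Set[str]) -> bool:
    """Return True if any DOMAIN-ID in the D-PATH is locally configured.

    The ISF_SAFI_TYPE is deliberately ignored, so a route is still flagged
    when the type value is unknown to the receiving gateway PE.
    """
    if not dpath:
        return False  # no D-PATH attribute: nothing to match against
    return any(domain_id in local_domain_ids for domain_id, _safi_type in dpath)


if __name__ == "__main__":
    gw1_local = {"6500:1"}  # GW1 has 6500:1 as a local DOMAIN-ID

    # Hypothetical D-PATH values, chosen only to exercise the rule.
    print(is_looped([("6500:2", 128), ("6500:1", 70)], gw1_local))  # match on 6500:1
    print(is_looped([("6500:1", 99)], gw1_local))   # unknown ISF_SAFI_TYPE 99
    print(is_looped([("6500:3", 128)], gw1_local))  # no local DOMAIN-ID present
    print(is_looped(None, gw1_local))               # route without D-PATH
```

A route flagged this way is not discarded by the check itself; as stated above, it remains eligible for installation only when best path selection finds no better route.¶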
A DOMAIN-ID value on a gateway PE MAY be assigned for a peering domain or MAY be scoped for an individual tenant IP-VRF.¶
The following error-handling rules apply to the D-PATH attribute:¶
A Domain Segment is considered malformed in any of the following cases:¶
Based on its configuration, a gateway PE is required to propagate an ISF route between two domains that use the same or different ISF SAFI. This requires a definition of what a gateway PE has to do with BGP Path Attributes attached to the ISF route that the gateway PE is propagating. This section specifies the BGP Path Attribute propagation modes that a gateway PE may follow when it receives an ISF route with ISF SAFI-x, installs the route in the IP-VRF and exports the ISF route into ISF SAFI-y. ISF SAFI-x and SAFI-y values MAY be the same values.¶
This is the default mode of operation for gateway PEs that re-export ISF routes from a domain into another domain. In this mode, the gateway PE will simply re-initialize the BGP Path Attributes when propagating an ISF route, as it would for direct or local IP prefixes. This model may be enough in those use-cases where, e.g., the EVPN domain is considered an "abstracted" CE and remote IPVPN/IP PEs don't need to consider the original EVPN Attributes for path calculations.¶
Since this mode of operation does not propagate the D-PATH attribute either, redundant gateway PEs are exposed to routing loops. Those loops may be resolved by policies and the use of other attributes, such as the Route Origin extended community [RFC4360]; however, not every loop situation can be identified this way.¶
In this mode, the gateway PE simply keeps accumulating or mapping certain key commonly used BGP Path Attributes when propagating an ISF route. This mode is typically used in networks where EVPN and IPVPN SAFIs are used seamlessly to distribute IP prefixes.¶
The following rules MUST be observed by the gateway PE when propagating BGP Path Attributes:¶
The gateway PE imports an ISF route in the IP-VRF and stores the original Path Attributes. The following set of Path Attributes SHOULD be propagated by the gateway PE to other ISF SAFIs (other BGP Path Attributes SHOULD NOT be propagated):¶
As discussed in point 1, Communities, Extended Communities and Large Communities SHOULD be preserved from the originating ISF route by the gateway PE. Exceptions of Extended Communities that SHOULD NOT be propagated are:¶
The gateway PE SHOULD NOT copy the above extended communities from the originating ISF route to the re-advertised ISF route.¶
Instead of propagating a high number of (host) ISF routes between domains, a gateway PE that receives multiple ISF routes from a domain MAY choose to propagate a single ISF aggregate route into a different domain. In this document, aggregation is used to combine the characteristics of multiple ISF routes in such a way that a single aggregate ISF route can be propagated to the destination domain. Aggregation of multiple ISF routes of one ISF SAFI into an aggregate ISF route is only done by a gateway PE.¶
Aggregation on gateway PEs may use either the No-Propagation-Mode or the Uniform-Propagation-Mode explained in Section 5.1 and Section 5.2, respectively.¶
When using Uniform-Propagation-Mode, Path Attributes of the same type code MAY be aggregated according to the following rules:¶
Assuming the aggregation can be performed (the above rules are applied), the operator should consider aggregation to deal with scaled tenant networks where a significant number of host routes exist, for example, in large Data Centers.¶
A PE may receive an IP prefix in ISF routes with different ISF SAFIs, from the same or different BGP peer. It may also receive the same IP prefix (host route) in an EVPN RT-2 and RT-5. A route selection algorithm across all ISF SAFIs is needed so that:¶
For a given prefix advertised in one or more non-EVPN ISF routes, the BGP best path selection procedure will produce a set of "non-EVPN best paths". For a given prefix advertised in one or more EVPN ISF routes, the BGP best path selection procedure will produce a set of "EVPN best paths". To support EVPN/non-EVPN ISF interworking in the context of the same IP-VRF receiving non-EVPN and EVPN ISF routes for the same prefix, it is then necessary to run a tie-breaking selection algorithm on the union of these two sets. This tie-breaking algorithm begins by considering all EVPN and other ISF SAFI routes as equally preferable routes to the same destination, and then selects routes to be removed from consideration. The process terminates as soon as only one route remains in consideration.¶
The route selection algorithm must remove from consideration the routes following the rules and the order defined in [RFC4271], with the following exceptions and in the following order:¶
The above process modifies the [RFC4271] selection criteria for multiprotocol BGP routes with SAFIs 1, 128 and EVPN IP Prefix routes to include the shortest D-PATH so that operators minimize the number of Gateways and domains through which packets need to be routed.¶
Example 1 - PE1 receives the following routes for IP1/32, which are candidates to be imported into IP-VRF-1:¶
{SAFI=EVPN, RT-2, Local-Pref=100, AS-Path=(100,200)}
{SAFI=EVPN, RT-5, Local-Pref=100, AS-Path=(100,200)}
{SAFI=128, Local-Pref=100, AS-Path=(100,200)}¶
Selected route: {SAFI=EVPN, RT-2, Local-Pref=100, AS-Path=(100,200)} (due to step 3, and no ECMP).¶
Example 2 - PE1 receives the following routes for IP2/24, which are candidates to be imported into IP-VRF-1:¶
{SAFI=EVPN, RT-5, D-PATH=(6500:3:IPVPN), AS-Path=(100,200), MED=10}
{SAFI=128, D-PATH=(6500:1:EVPN,6500:2:IPVPN), AS-Path=(200), MED=200}¶
Selected route: {SAFI=EVPN, RT-5, D-PATH=(6500:3:IPVPN), AS-Path=(100,200), MED=10} (due to step 1).¶
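The elimination process used in the two examples above can be sketched as a chain of filters. This is a simplified illustration under stated assumptions, not the normative algorithm: the step order used here (Local Preference, then D-PATH length, then AS_PATH length and MED, then a preference for EVPN RT-2) is an assumption chosen to be consistent with the two examples, most [RFC4271] steps are omitted, a route without D-PATH is assumed to count as having a zero-length D-PATH, and a missing Local-Pref is assumed to be 100. All names are hypothetical.¶

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class IsfRoute:
    safi: str                               # "EVPN", "128" or "1"
    evpn_route_type: Optional[str] = None   # "RT-2" or "RT-5" for EVPN routes
    local_pref: int = 100                   # assumed default when not stated
    d_path: Tuple[str, ...] = ()            # empty: no D-PATH (assumed length 0)
    as_path: Tuple[int, ...] = ()
    med: int = 0


def _keep_best(routes: List[IsfRoute],
               key: Callable[[IsfRoute], int]) -> List[IsfRoute]:
    """Remove from consideration every route that is not best for this key."""
    best = min(key(r) for r in routes)
    return [r for r in routes if key(r) == best]


def select(routes: List[IsfRoute]) -> List[IsfRoute]:
    """Apply the assumed steps in order; stop once a single route remains."""
    steps: List[Callable[[IsfRoute], int]] = [
        lambda r: -r.local_pref,                             # highest Local-Pref
        lambda r: len(r.d_path),                             # shortest D-PATH
        lambda r: len(r.as_path),                            # shortest AS_PATH
        lambda r: r.med,                                     # lowest MED (simplified)
        lambda r: 0 if r.evpn_route_type == "RT-2" else 1,   # prefer EVPN RT-2
    ]
    remaining = list(routes)
    for key in steps:
        if len(remaining) <= 1:
            break
        remaining = _keep_best(remaining, key)
    return remaining  # more than one entry would mean ECMP candidates


if __name__ == "__main__":
    example1 = [
        IsfRoute("EVPN", "RT-2", 100, (), (100, 200)),
        IsfRoute("EVPN", "RT-5", 100, (), (100, 200)),
        IsfRoute("128", None, 100, (), (100, 200)),
    ]
    example2 = [
        IsfRoute("EVPN", "RT-5", d_path=("6500:3:IPVPN",),
                 as_path=(100, 200), med=10),
        IsfRoute("128", d_path=("6500:1:EVPN", "6500:2:IPVPN"),
                 as_path=(200,), med=200),
    ]
    print(select(example1))
    print(select(example2))
```

In Example 2 the SAFI 128 route has the shorter AS_PATH, so the EVPN RT-5 route is selected only because the D-PATH length is compared first in this sketch.¶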
As described in Section 3, composite PEs are typically used in tenant networks where EVPN and IPVPN are both used to provide inter-subnet forwarding within the same composite domain.¶
Figure 7 depicts an example of a composite domain, where PE1/PE2/PE4 are composite PEs (they support EVPN and IPVPN ISF SAFIs on their peering to the Route Reflector), and PE3 is a regular IPVPN PE.¶
In a composite domain with composite and regular PEs:¶
Composite PEs MUST process routes for the same prefix coming from different ISF SAFI routes, and perform route selection.¶
Section 3 defines a gateway PE as an Interworking PE that is attached to two (or more) domains and propagates ISF routes between those domains. Examples of gateway PEs are Data Center gateways connecting domains that make use of EVPN and other ISF SAFIs for a given tenant. The gateway PE procedures in this document provide an interconnect solution for ISF routes and complement the gateway definition of [RFC9014], which focuses on the interconnect solution for Layer 2. This section applies to the interconnect of two domains that use different ISF SAFIs (e.g., EVPN to IPVPN), as well as the interconnect of two domains of the same ISF SAFI (e.g., EVPN to EVPN). Figure 8 illustrates a gateway PE use-case, in which PE1 and PE2 (and PE3/PE4) are gateway PEs interconnecting domains for the same tenant.¶
The procedures for a gateway PE enabled for ISF SAFI-x and ISF SAFI-y on the same IP-VRF follow:¶
A gateway PE that imports an ISF SAFI-x route to prefix P in an IP-VRF, MUST export P in ISF SAFI-y if:¶
In the example of Figure 8, gateway PE1 and PE2 receive an EVPN RT-5 with IP1/24, install the prefix in the IP-VRF and re-advertise it using SAFI 128.¶
A gateway PE that receives an ISF SAFI-x route to prefix P in an IP-VRF MUST NOT export P in ISF SAFI-y if:¶
Once the gateway PE determines that P must be exported, P will be advertised using ISF SAFI-y as follows:¶
The D-PATH attribute MUST be included, so that loops can be detected in remote gateway PEs. When a gateway PE propagates an ISF route between domains, it MUST prepend a <DOMAIN-ID:ISF_SAFI_TYPE> to the received D-PATH attribute. The DOMAIN-ID and ISF_SAFI_TYPE fields refer to the domain over which the gateway PE received the IP prefix and the ISF SAFI of the route, respectively. If the received IP prefix route did not include any D-PATH attribute, the gateway PE MUST add the D-PATH when readvertising. The D-PATH in this case will have only one segment on the list, the <DOMAIN-ID:ISF_SAFI_TYPE> of the received route.¶
In the example of Figure 8, gateway PE1/PE2 receive the EVPN RT-5 with no D-PATH attribute since the route is originated at PE5. Therefore PE1 and PE2 will add the D-PATH attribute including <DOMAIN-ID:ISF_SAFI_TYPE> = <6500:1:EVPN>. Gateways PE3/PE4 will propagate the route again, now prepending their <DOMAIN-ID:ISF_SAFI_TYPE> = <6500:2:IPVPN>. PE6 receives the EVPN RT-5 routes with D-PATH = {<6500:2:IPVPN>,<6500:1:EVPN>} and can use that information to make BGP path decisions.¶
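The prepend behavior in the Figure 8 walk-through can be sketched as follows. This is an illustration only: the function names are hypothetical, the D-PATH is flattened into a list of (DOMAIN-ID, ISF_SAFI_TYPE) pairs rather than modeled as segments, and the ISF_SAFI_TYPE is written as the label used in the example (EVPN, IPVPN) instead of its numeric value.¶

```python
from typing import List, Optional, Tuple

# One D-PATH entry: (DOMAIN-ID, ISF_SAFI_TYPE label), leftmost = most recent.
DPathEntry = Tuple[str, str]


def propagate(received_dpath: Optional[List[DPathEntry]],
              rx_domain_id: str,
              rx_isf_safi_type: str) -> List[DPathEntry]:
    """Build the D-PATH a gateway PE advertises into the next domain.

    rx_domain_id and rx_isf_safi_type describe the domain over which the
    route was received and the ISF SAFI it was received in. If the route
    carried no D-PATH, the result contains only that single entry.
    """
    return [(rx_domain_id, rx_isf_safi_type)] + list(received_dpath or [])


def render_dpath(dpath: List[DPathEntry]) -> str:
    """Render in the {<...>,<...>} notation used in this document."""
    return "{" + ",".join(f"<{d}:{t}>" for d, t in dpath) + "}"


if __name__ == "__main__":
    # PE5 originates the EVPN RT-5 without D-PATH.
    from_pe5 = None
    # PE1/PE2 receive it in EVPN domain 6500:1 and re-advertise it.
    from_pe1 = propagate(from_pe5, "6500:1", "EVPN")
    # PE3/PE4 receive it in IPVPN domain 6500:2 and re-advertise it.
    from_pe3 = propagate(from_pe1, "6500:2", "IPVPN")
    print(render_dpath(from_pe3))
```

The final value corresponds to the D-PATH that PE6 receives in the example above.¶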
While Interworking PE networks may well be similar to the examples described in Section 7 and Section 8, in some cases a combination of both functions may be required. Figure 9 illustrates an example where the gateway PEs are also composite PEs, since they not only need to propagate ISF routes between domains (from EVPN SAFI to IPVPN and/or EVPN SAFIs), but also need to interwork with IPVPN-only PEs in a domain with a mix of composite and IPVPN-only PEs.¶
In the example above, PE1 and PE2 MUST follow the procedures described in Section 7 and Section 8. Compared to the example in Section 8, PE1 and PE2 now need to also propagate ISF routes from EVPN to EVPN, in addition to propagating prefixes from EVPN to IPVPN.¶
It is worth noting that PE1 and PE2 will receive TS4's IP prefix via IPVPN and EVPN RT-5 routes. When readvertising to NVE1 and NVE2, PE1 and PE2 will consider the D-PATH rules and attributes of the selected route for TS4 (Section 6 describes the Route Selection Process).¶
An Interworking PE (acting as gateway PE or composite PE) observes the following error-handling procedures for ISF routes:¶
If a gateway PE is set to propagate BGP Path Attributes for ISF routes across domains, the procedures in Section 5.2 guarantee that a BGP speaker does not receive UPDATES with well-formed but unexpected BGP Path Attributes. If a gateway PE fails to follow the propagation rules in Section 5.2 and propagates some BGP Path Attributes erroneously, the receiving PEs follow the specifications for the specific ISF route type and BGP Path Attribute. Some (but not all) examples follow:¶
This document describes the procedures required in PEs that process and advertise ISF routes for a given tenant. In particular, this document defines:¶
The above procedures provide an operator with the required tools to build large tenant networks that may span multiple domains, use different ISF SAFIs to handle IP prefixes, in a deterministic way and with routing loop protection.¶
In general, the security considerations described in [RFC9136] and [RFC4364] apply to this document.¶
Section 4 introduces the use of the D-PATH attribute, which provides a security tool against control plane loops that may be introduced by the use of gateway PEs that propagate ISF routes between domains. A correct use of the D-PATH will prevent control plane and data plane loops in the network; however, an incorrect configuration of the DOMAIN-IDs on the gateway PEs may lead to the detection of false route loops and the blackholing of the traffic. An attacker may take advantage of this transitive attribute to propagate the wrong domain information across multiple domains.¶
In addition, Section 5.2 introduces the propagation of BGP Path Attributes between domains on gateway PEs. Without this mode of propagation, BGP Path Attributes are re-initialized when re-exporting ISF routes into a different domain, and the operator does not have the end-to-end visibility of a given ISF route path. However, the Uniform Propagation mode introduces the capability of propagating BGP Path Attributes beyond the ISF SAFI scope. While this is a useful tool to provide end-to-end visibility across multiple domains, it can also be used by an attacker to propagate wrong (although correctly formed) BGP Path Attributes that can influence the BGP path selection in remote domains. An implementation can also choose Section 5.1 (No-propagation mode) to minimize the risks derived from propagating incorrect attributes, but again, this mode of operation will prevent the receiver PE from seeing the attributes that the originator of the route intended to convey in the first place.¶
This document defines a new BGP path attribute known as the BGP Domain Path (D-PATH) attribute.¶
IANA has assigned a new attribute code type from the "BGP Path Attributes" subregistry under the "Border Gateway Protocol (BGP) Parameters" registry:¶
Path Attribute Value | Code | Reference |
---|---|---|
36 | BGP Domain Path (D-PATH) | [This document] |
The authors want to thank Russell Kelly, Dhananjaya Rao, Suresh Basavarajappa, Mallika Gautam, Senthil Sathappan, Arul Mohan Jovel, Naveen Tubugere, Mathanraj Petchimuthu, Eduard Vasilenko, Amit Kumar, Mohit Kumar and Lukas Krattiger for their review and suggestions.¶