Internet-Draft | RIFT Auto-EVPN | July 2021 |
Head, et al. | Expires 10 January 2022 |
This document specifies procedures that allow an EVPN overlay to be fully and automatically provisioned when using RIFT as underlay by leveraging RIFT's no-touch ZTP architecture.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 10 January 2022.¶
Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
RIFT is a protocol that focuses heavily on operational simplicity. [RIFT] natively supports Zero Touch Provisioning (ZTP) functionality that allows each node in an underlay network to automatically derive its place in the topology and configure itself accordingly when properly cabled. RIFT can also disseminate Key-Value information contained in Key-Value Topology Information Elements (KV-TIEs) [RIFT-KV]. These KV-TIEs can contain any information and therefore be used for any purpose. Leveraging RIFT to provision EVPN overlays without any need for configuration and leveraging KV capabilities to easily validate correct operation of such overlay without a single point of failure would provide significant benefit to operators in terms of simplicity and robustness of such a solution.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶
EVPN supports various service models. This document defines a method for the VLAN-Aware service model defined in [RFC7432]. Other service models may be considered in future revisions of this document.
Each model has its own set of requirements for deployment. For example, a functional BGP overlay is necessary to exchange EVPN NLRI regardless of the service model. Furthermore, the requirements are made up of individual variables, such as each node's loopback address and AS number for the BGP session. Some of these variables may be coordinated across each node in a network, but are ultimately locally significant (e.g. route distinguishers). Similarly, calculation of some variables will be local only to each device. RIFT currently contains enough topology information in each node to calculate all of the necessary variables automatically.
Once the EVPN overlay is configured and becomes operational, RIFT Key-Value TIEs can be used to distribute state information to allow for validation of basic operational correctness without the need for further tooling.¶
The 64-bit RIFT System ID that uniquely identifies a node as defined in RIFT [RIFT].¶
RIFT operates on variants of Clos substrate which are commonly called an IP Fabric. Since EVPN VLANs can be either contained within one fabric or span them, Auto-EVPN introduces the concept of a Fabric ID into RIFT.¶
This section describes an optional extension to the LIE packet schema in the form of a 16-bit Fabric ID that identifies a node's membership within a particular fabric. Auto-EVPN capable nodes MUST support this extension but MAY choose not to advertise it when not participating in Auto-EVPN. A non-present Fabric ID and a Fabric ID value of 0 are both reserved as ANY_FABRIC and MUST NOT be used for any other purpose.
Fabric ID MUST be considered in the existing adjacency FSM rules so that nodes that support Auto-EVPN can interoperate with nodes that do not. LIE validation is extended with the following clause; if it is not met, miscabling should be declared:
(if fabric_id is not advertised by either node OR if fabric_id is identical on both nodes) AND (if auto_evpn_version is not advertised by either node OR if auto_evpn_version is identical on both nodes)¶
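The clause above can be sketched as a small predicate. This is a minimal illustration, not part of the normative schema: the function and parameter names are hypothetical, "not advertised by either node" is read here as "at least one side omits the field", and a Fabric ID of 0 is treated the same as an absent one because both are reserved as ANY_FABRIC.

```python
from typing import Optional

ANY_FABRIC = 0  # reserved value named in the text


def _wildcard_fabric(fabric_id: Optional[int]) -> bool:
    """True when the Fabric ID is absent or carries the reserved value 0."""
    return fabric_id is None or fabric_id == ANY_FABRIC


def auto_evpn_lie_clause_holds(
    local_fabric_id: Optional[int],
    remote_fabric_id: Optional[int],
    local_version: Optional[int],
    remote_version: Optional[int],
) -> bool:
    """Evaluate the additional LIE validation clause (hypothetical helper).

    Assumption: a field counts as "not advertised by either node" when
    at least one of the two nodes does not advertise it.
    """
    fabric_ok = (
        _wildcard_fabric(local_fabric_id)
        or _wildcard_fabric(remote_fabric_id)
        or local_fabric_id == remote_fabric_id
    )
    version_ok = (
        local_version is None
        or remote_version is None
        or local_version == remote_version
    )
    return fabric_ok and version_ok
```

If the predicate returns False, the adjacency would be treated as miscabled under the rule above.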
The appendix details LIE (Appendix A.1.2) and Node-TIE (Appendix A.2.2) schema changes.¶
Auto-EVPN requires that each node understand its given role within the scope of the EVPN implementation so each node derives the necessary variables and provides the necessary overlay configuration. For example, a leaf node performing VXLAN gateway functions does not need to derive its own Cluster ID or learn one from the route reflector that it peers with.¶
Not all nodes have to participate in Auto-EVPN; however, if a node does assume an Auto-EVPN role, it MUST derive the following variables:
This section defines an Auto-EVPN role whereby some Top-of-Fabric nodes act as EVPN route reflectors. It is expected that route reflectors would establish IBGP sessions with leaf nodes in the same fabric. The typical route reflector requirements do not change; however, determining which specific values to use requires further consideration. ToF nodes performing route reflector functionality MUST derive the following variables:
Leaf nodes derive their role by recognizing that they are at the bottom of the fabric, i.e. that they have no southbound adjacencies. Alternatively, a node can assume the leaf role if its only southbound adjacencies are to nodes with an explicit LEAF_LEVEL, which allows for scenarios where RIFT leaves do NOT participate in Auto-EVPN.
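The leaf role decision described above can be sketched as follows. The data model (SouthboundAdjacency and its fields) is hypothetical and only exists for illustration; an actual implementation would take this information from the RIFT adjacency state.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SouthboundAdjacency:
    """Hypothetical view of one southbound neighbor."""
    neighbor_system_id: int
    neighbor_has_explicit_leaf_level: bool


def assumes_leaf_role(southbound: Iterable[SouthboundAdjacency]) -> bool:
    """Return True when the node may assume the Auto-EVPN leaf role.

    A node with no southbound adjacencies is a leaf. A node whose
    southbound neighbors all carry an explicit LEAF_LEVEL may also
    assume the role. all() over an empty iterable is True, so both
    cases are covered by the same expression.
    """
    return all(adj.neighbor_has_explicit_leaf_level for adj in southbound)
```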
Leaf nodes MUST derive the following variables:¶
If a leaf node is required to perform layer-2 VXLAN gateway functions, it MUST be capable of deriving the following types of variables:¶
For each VLAN derived in an EVI the following variables MUST be derived:¶
If a leaf node is required to perform layer-3 VXLAN gateway functions, it MUST additionally be capable of deriving the following types of variables:¶
Type-5 EVPN IP Prefix with ToFs performing gateway functionality can also be derived and will be described in a future version of this document.¶
As previously mentioned, not all nodes are required to derive all variables in a given network (e.g. a transit spine node may not need to derive any or participate in Auto-EVPN). Additionally, all derived variables are derived from RIFT's FSM or ZTP mechanism, so no additional flooding besides RIFT flooding is necessary for the functionality.
It is also important to mention that all variable derivation is in some way based on combinations of System ID, MAC-VRF ID, Fabric ID, EVI and VLAN and MUST comply precisely with calculation methods specified in the Auto-EVPN Variable Derivation section to allow interoperability between different implementations. All foundational code elements such as imports, constants, etc. are also mentioned there.¶
This section describes extensions to both the RIFT LIE packet and Node-TIE schemas in the form of a 16-bit value that identifies the Auto-EVPN Version. Auto-EVPN capable nodes MUST support this extension, but MAY choose not to advertise it in LIEs and Node-TIEs when Auto-EVPN is not being utilized. The appendix describes LIE (Appendix A.1.1) and Node-TIE (Appendix A.2.1) schema changes in detail.¶
This section describes a variable MAC-VRF ID that uniquely identifies an EVPN instance (EVI) and is used in variable derivation procedures. Each EVPN EVI MUST be associated with a unique MAC-VRF ID; this document does not specify a method for making that association or ensuring that the IDs are coordinated properly across fabric(s).
First and foremost, RIFT does not advertise anything more specific than the fabric default route in the southbound direction by default. However, Auto-EVPN nodes MUST advertise specific loopback addresses southbound to all other Auto-EVPN nodes so as to establish MP-BGP reachability correctly in all scenarios.
Auto-EVPN nodes MUST derive a ULA-scoped IPv6 loopback address to be used as both the IBGP source address and the VTEP source when VXLAN gateways are required. Calculation is done using the 6 bytes of reserved ULA space, the 2-byte Fabric ID, and the node's 8-byte System ID. Derivation of the System ID varies slightly depending upon the node's location/role in the fabric and will be described in subsequent sections.
Calculation is done using the 6 bytes of reserved ULA space, the 2-byte Fabric ID, and the node's 8-byte System ID.
In order for leaf nodes to derive IPv6 loopback addresses, algorithms shown in both auto_evpn_fidsidv6loopback (Figure 24) and auto_evpn_v6prefixfidsid2loopback (Figure 9) are required.¶
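The byte layout described above (6 bytes of ULA space, 2 bytes of Fabric ID, 8 bytes of System ID) can be illustrated with a short sketch. This is not the normative algorithm from the referenced figures: the ULA prefix value below is a placeholder chosen for the example, and the function name is hypothetical.

```python
import ipaddress

# Placeholder only: the actual 6 bytes of reserved ULA space are defined
# by the normative algorithms, not by this sketch.
ULA_PREFIX_6_BYTES = bytes.fromhex("fd0000000000")


def sketch_v6_loopback(fabric_id: int, system_id: int) -> ipaddress.IPv6Address:
    """Concatenate 6 + 2 + 8 bytes into a 16-byte IPv6 address."""
    packed = (
        ULA_PREFIX_6_BYTES
        + fabric_id.to_bytes(2, "big")    # 16-bit Fabric ID
        + system_id.to_bytes(8, "big")    # 64-bit RIFT System ID
    )
    return ipaddress.IPv6Address(packed)


loopback = sketch_v6_loopback(fabric_id=1, system_id=0x002C6AF5A281C000)
print(loopback)
```

Because the three fields add up to exactly 16 bytes, every (Fabric ID, System ID) pair maps to a distinct address under a fixed prefix.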
IPv4 addresses MAY be supported, but it should be noted that they have a higher likelihood of collision. The appendix contains the required auto_evpn_fidsid2v4loopback (Figure 23) algorithm to support IPv4 loopback derivation.¶
ToF nodes acting as route reflectors MUST derive their loopback address according to the specific section describing the algorithm. Calculation is done using the 6 bytes of reserved ULA space, the 2-byte Fabric ID, and the 8-byte System ID of each elected route reflector.
In order for the ToF nodes to derive IPv6 loopbacks, the algorithms shown in both auto_evpn_fidsidv6loopback (Figure 24) and auto_evpn_fidrrpref2rrloopback (Figure 10) are required.¶
In order for the ToF to derive the necessary prefix range to facilitate peering requests from any leaf, the algorithm shown in "auto_evpn_fid2fabric_prefixes" (Figure 8) is required.
Four Top-of-Fabric nodes MUST be elected as IBGP route reflectors. Each ToF performs the election independently, based on the System IDs of the other ToFs in the fabric obtained via southbound reflection. The route reflector election procedures are defined as follows:
This ordering is necessary to prevent a single node with either the highest or lowest System ID from triggering changes to route reflector loopback addresses as it would result in all BGP sessions dropping.¶
For example, given two nodes, ToF01 and ToF02, with System IDs 002c6af5a281c000 and 002c6bf5788fc000 respectively, ToF02 would be elected because it has the highest System ID of the ToFs (002c6bf5788fc000). If a ToF determines that it is elected as a route reflector, it uses the knowledge of its position in the list to derive its route reflector IPv6 loopback address.
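The comparison in this example can be checked by treating the System IDs as 64-bit unsigned integers. The helper below is a hypothetical illustration that only picks the node with the highest System ID; it does not implement the full election ordering, which is given by the referenced algorithm.

```python
from typing import Dict


def highest_system_id(tofs: Dict[str, int]) -> str:
    """Return the name of the ToF with the numerically highest System ID."""
    return max(tofs, key=lambda name: tofs[name])


tofs = {
    "ToF01": 0x002C6AF5A281C000,
    "ToF02": 0x002C6BF5788FC000,
}
print(highest_system_id(tofs))  # prints ToF02
```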
The algorithm shown in "auto_evpn_sids2rrs" (Figure 6) is required to accomplish this.¶
Considerations for multiplane route reflector elections will be included in future revisions.¶
Nodes in each fabric MUST derive a private autonomous system number based on their Fabric ID so that it is unique across the fabric.
The algorithm shown in auto_evpn_fid2private_AS (Figure 25) is required to derive the private ASN.¶
Nodes MUST derive a Router ID that is based on both their System ID and Fabric ID so that it is unique to both.
The algorithm shown in auto_evpn_sidfid2bgpid (Figure 11) is required to derive the BGP Router ID.¶
Route reflector nodes in each fabric MUST derive a cluster ID that is based on their Fabric ID so that it is unique across the fabric.
The algorithm shown in auto_evpn_fid2clusterid (Figure 26) is required to derive the BGP Cluster ID.¶
Nodes hosting EVPN EVIs MUST derive a route target extended community based on the MAC-VRF ID for each EVI so that it is unique across the network. Route targets MUST be of type 0 as per RFC4360.¶
For example, if given a MAC-VRF ID of 1, the derived route target would be "target:1"¶
The algorithm shown in auto_evpn_evi2rt (Figure 12) is required to derive the Route Target community.¶
Nodes hosting EVPN EVIs MUST derive a type-0 route distinguisher based on their System ID and Fabric ID so that it is unique per MAC-VRF and per node.
The algorithm shown in auto_evpn_sidfid2rd (Figure 18) is required to derive the Route Distinguisher.¶
Applications utilizing Auto-EVPN overlay services may require a variety of layer-2 and/or layer-3 traffic considerations. Variables supporting these services are also derived based on some combination of MAC-VRF ID, Fabric ID, and other constant values. Integrated Routing and Bridging (IRB) gateway address derivation also leverages a set of constant RANDOMSEEDS (Figure 5) values that MUST be used to provide additional entropy.
In order to ensure that VLAN IDs don't collide, a single deployment SHOULD NOT exceed 3 fabrics with 3 EVIs where each EVI terminates 15 VLANs. The algorithms shown in auto_evpn_fidevivlansvlans2desc (Figure 16) and auto_evpn_vlan_description_table (Figure 15) are required to derive VLANs accordingly. An implementation MAY exceed this, but MUST indicate methods to ensure collision-free derivation and describe which VLANs are stretched across fabrics.
This section defines methods to derive unique VLAN, VNI, MAC, and gateway address values for deployments where untagged traffic is stretched across multiple fabrics.¶
Untagged traffic stretched across multiple fabrics MUST derive VLAN tags based on MAC-VRF ID in conjunction with a constant value of 1 (i.e. MAC-VRF ID + 1).¶
Untagged traffic stretched across multiple fabrics MUST derive VNIs based on MAC-VRF ID and Fabric ID in conjunction with a constant value. These VNIs MUST correspond to EVPN Type-2 routes.¶
The algorithm shown in auto_evpn_fidevivid2vni (Figure 14) is required to derive VNIs for Type-2 EVPN routes.¶
The MAC address MUST be a unicast address and also MUST be identical for any IRB gateways that belong to an individual bridge-domain across fabrics. The last 5 bytes MUST be a hash of the MAC-VRF ID and a constant value of 1 that is calculated using the previously mentioned random seed values.
The algorithm shown in auto_evpn_fidevividsid2mac (Figure 22) is required to derive MAC addresses.
The derived IPv6 gateway address MUST be from a ULA-scoped range, which accounts for the first 6 bytes. The next 5 bytes MUST be the last 5 bytes of the derived MAC address. Finally, the remaining 5 bytes MUST be ::0001.
The algorithm shown in auto_evpn_fidevividsid2v6subnet (Figure 21) is required to derive the IPv6 gateway address.¶
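The composition of the IPv6 gateway address can be sketched as below. This is an illustration only: the ULA prefix is a placeholder, the function name is hypothetical, and the MAC address is taken as an input rather than derived, since the hash and seed values are defined by the referenced algorithms. An IPv6 address has 16 bytes, so after the 6-byte prefix and the 5 MAC bytes the trailing ::0001 occupies the last 5 bytes in this sketch.

```python
import ipaddress

# Placeholder only: the real ULA-scoped range comes from the normative
# algorithms, not from this sketch.
ULA_PREFIX_6_BYTES = bytes.fromhex("fd0000000000")


def sketch_v6_gateway(mac: bytes) -> ipaddress.IPv6Address:
    """Build prefix (6 bytes) + last 5 MAC bytes + trailing ...0001 (5 bytes)."""
    if len(mac) != 6:
        raise ValueError("a MAC address is exactly 6 bytes")
    tail = (1).to_bytes(5, "big")  # 00 00 00 00 01
    return ipaddress.IPv6Address(ULA_PREFIX_6_BYTES + mac[1:] + tail)


# Example with an arbitrary unicast MAC address (least significant bit
# of the first byte is 0).
gateway = sketch_v6_gateway(bytes.fromhex("02aabbccddee"))
print(gateway)
```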
The derived IPv4 gateway address MUST be from an RFC1918 range, which accounts for the first octet. The next octet MUST be a hash of the MAC-VRF ID and a constant value of 1 that is calculated using the previously mentioned random seed values. Finally, the remaining 2 octets MUST be 0 and 1 respectively.
The algorithm shown in auto_evpn_v4prefixfidevividsid2v4subnet (Figure 19) is required to derive the IPv4 gateway address. It should be noted that there is a higher likelihood of address collisions when deriving IPv4 addresses.¶
This section defines methods to derive unique VLAN, VNI, MAC, and gateway address values for deployments where tagged traffic is stretched across multiple fabrics.¶
Tagged traffic stretched across multiple fabrics MUST derive VLAN tags based on MAC-VRF ID in conjunction with a constant value of 16 (i.e. MAC-VRF ID + 16).¶
Tagged traffic stretched across multiple fabrics MUST derive VNIs based on MAC-VRF ID and Fabric ID in conjunction with a constant value. These VNIs MUST correspond to EVPN Type-2 routes.¶
The algorithm shown in auto_evpn_fidevivid2vni (Figure 14) is required to derive VNIs for Type-2 EVPN routes.¶
The MAC address MUST be a unicast address and also MUST be identical for any IRB gateways that belong to an individual bridge-domain across fabrics. The last 5 bytes MUST be a hash of the MAC-VRF ID and a constant value of 1 that is calculated using the previously mentioned random seed values.
The algorithm shown in auto_evpn_fidevividsid2mac (Figure 22) is required to derive MAC addresses.
The derived IPv6 gateway address MUST be from a ULA-scoped range, which accounts for the first 6 bytes. The next 5 bytes MUST be the last 5 bytes of the derived MAC address. Finally, the remaining 5 bytes MUST be ::0001.
The algorithm shown in auto_evpn_fidevividsid2v6subnet (Figure 21) is required to derive the IPv6 gateway address.¶
The derived IPv4 gateway address MUST be from an RFC1918 range, which accounts for the first octet. The next octet MUST be a hash of the MAC-VRF ID and a constant value of 16 that is calculated using the previously mentioned random seed values. Finally, the remaining 2 octets MUST be 0 and 1 respectively.
The algorithm shown in auto_evpn_v4prefixfidevividsid2v4subnet (Figure 19) is required to derive the IPv4 gateway address. It should be noted that there is a higher likelihood of address collisions when deriving IPv4 addresses.¶
This section defines a method to derive unique VLAN, VNI, MAC, and gateway address values for deployments where tagged traffic is contained within a single fabric.
Tagged traffic contained within a single fabric MUST derive VLAN tags based on MAC-VRF ID and Fabric ID in conjunction with a constant value of 17 (i.e. MAC-VRF ID + Fabric ID + 17).
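The three VLAN tag rules stated in this document (MAC-VRF ID + 1 for untagged stretched traffic, MAC-VRF ID + 16 for tagged stretched traffic, and MAC-VRF ID + Fabric ID + 17 for tagged single-fabric traffic) can be summarized in one sketch. The enum, the function name, and the range check against the usable 802.1Q VLAN ID range (1 to 4094) are assumptions added for illustration; the normative derivation is given by the referenced algorithms.

```python
from enum import Enum


class TrafficKind(Enum):
    """Hypothetical labels for the three cases described in the text."""
    UNTAGGED_STRETCHED = "untagged, multiple fabrics"
    TAGGED_STRETCHED = "tagged, multiple fabrics"
    TAGGED_SINGLE_FABRIC = "tagged, single fabric"


def sketch_vlan_tag(kind: TrafficKind, mac_vrf_id: int, fabric_id: int = 0) -> int:
    """Apply the additive rule for the given traffic kind."""
    if kind is TrafficKind.UNTAGGED_STRETCHED:
        tag = mac_vrf_id + 1
    elif kind is TrafficKind.TAGGED_STRETCHED:
        tag = mac_vrf_id + 16
    else:  # TrafficKind.TAGGED_SINGLE_FABRIC
        tag = mac_vrf_id + fabric_id + 17
    # Assumption: reject values outside the usable 802.1Q VLAN ID range.
    if not 1 <= tag <= 4094:
        raise ValueError(f"derived VLAN tag {tag} is out of range")
    return tag
```

For a MAC-VRF ID of 1 in fabric 1, the three rules yield tags 2, 17, and 19 respectively.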
Tagged traffic contained within a single fabric MUST derive VNIs based on MAC-VRF ID and Fabric ID in conjunction with a constant value. These VNIs MUST correspond to EVPN Type-2 routes.
The algorithm shown in auto_evpn_fidevivid2vni (Figure 14) is required to derive VNIs for Type-2 EVPN routes.¶
The MAC address MUST be a unicast address and also MUST be identical for any IRB gateways that belong to an individual bridge-domain across fabrics. The last 5 bytes MUST be a hash of the MAC-VRF ID and a constant value of 1 that is calculated using the previously mentioned random seed values.
The algorithm shown in auto_evpn_fidevividsid2mac (Figure 22) is required to derive MAC addresses.
The derived IPv6 gateway address MUST be from a ULA-scoped range, which accounts for the first 6 bytes. The next 5 bytes MUST be the last 5 bytes of the derived MAC address. Finally, the remaining 5 bytes MUST be ::0001.
The algorithm shown in auto_evpn_fidevividsid2v6subnet (Figure 21) is required to derive the IPv6 gateway address.¶
The derived IPv4 gateway address MUST be from an RFC1918 range, which accounts for the first octet. The next octet MUST be a hash of the MAC-VRF ID and a constant value of 17 that is calculated using the previously mentioned random seed values. Finally, the remaining 2 octets MUST be 0 and 1 respectively.
The algorithm shown in auto_evpn_v4prefixfidevividsid2v4subnet (Figure 19) is required to derive the IPv4 gateway address. It should be noted that there is a higher likelihood of address collisions when deriving IPv4 addresses.¶
Nodes hosting IP Prefix routes MUST derive a type-0 route distinguisher based on their System ID and Fabric ID so that it is unique per IP-VRF and per node.
The algorithm shown in auto_evpn_sidfid2rd (Figure 18) is required to derive the Route Distinguisher.
Nodes hosting IP prefix routes MUST derive a route target extended community based on the MAC-VRF ID for each IP-VRF so that it is unique across the network. Route targets MUST be of type 0.¶
The algorithm shown in auto_evpn_evi2rt (Figure 12) is required to derive the Route Target community.¶
Leaf nodes MAY advertise analytics information about the Auto-EVPN fabric to ToF nodes using RIFT Key-Value TIEs. This can be advantageous because overlay validation and troubleshooting activities can then be performed on the ToF nodes.
This section requests suggested values from the RIFT Well-Known Key-Type Registry and describes their use for Auto-EVPN.¶
Name | Value | Description |
---|---|---|
Auto-EVPN Analytics MAC-VRF | 3 | Analytics describing a MAC-VRF on a particular node within a fabric. |
Auto-EVPN Analytics Global | 4 | Analytics describing an Auto-EVPN node within a fabric. |
The normative Thrift schema can be found in the appendix (Appendix A.4).¶
This Key Type describes node level information within the context of the Auto-EVPN fabric. The System ID of the advertising leaf node MUST be used to differentiate the node among other nodes in the fabric.¶
The Auto-EVPN Global Key Type MUST be advertised with the RIFT Fabric ID encoded into the 3rd and 4th bytes of the Key Identifier.¶
where:¶
The value indicating the node's Auto-EVPN role within the fabric.¶
This Key-Value structure contains information about a specific MAC-VRF within the Auto-EVPN fabric.¶
The Auto-EVPN MAC-VRF Key Type MUST be advertised with the Auto-EVPN MAC-VRF ID encoded into the 3rd and 4th bytes of the Key Identifier.¶
All values advertised in a MAC-VRF Key-Value TIE MUST represent only state of the local node.¶
where:¶
The authors would like to thank Olivier Vandezande, Matthew Jones, and Michal Styszynski for their contributions.¶
This document introduces no new security concerns to RIFT or other specifications referenced in this document.¶
This section contains the normative Thrift models required to support Auto-EVPN. Per the main RIFT [RIFT] specification, all signed values MUST be interpreted as unsigned values.¶
    struct LIEPacket {
        ...
        /** It provides optional version of EVPN ZTP as 256 * MAJOR + MINOR */
        26: optional i16 auto_evpn_version;
        ...
    struct NodeTIEElement {
        ...
        /** It provides optional version of EVPN ZTP as 256 * MAJOR + MINOR */
        13: optional i16 auto_evpn_version;
        ...
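Since auto_evpn_version is carried as a signed Thrift i16 but is to be interpreted as unsigned, and is documented as 256 * MAJOR + MINOR, decoding and encoding can be sketched as follows. The function names are hypothetical and not part of the schema.

```python
from typing import Tuple


def decode_auto_evpn_version(raw_i16: int) -> Tuple[int, int]:
    """Return (major, minor) from the signed i16 carried on the wire."""
    unsigned = raw_i16 & 0xFFFF      # reinterpret the signed value as unsigned
    return divmod(unsigned, 256)     # unsigned == 256 * major + minor


def encode_auto_evpn_version(major: int, minor: int) -> int:
    """Return the signed i16 representation of 256 * major + minor."""
    if not (0 <= major <= 255 and 0 <= minor <= 255):
        raise ValueError("major and minor must each fit in one byte")
    unsigned = 256 * major + minor
    # Values at or above 0x8000 map to negative numbers in a signed i16.
    return unsigned - 0x10000 if unsigned >= 0x8000 else unsigned
```

For example, a wire value of 256 decodes to version 1.0, while a wire value of -1 decodes to 255.255 once reinterpreted as unsigned.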
This section contains the normative Auto-EVPN Thrift schema.¶
    /**
        Thrift file for common AUTO EVPN definitions for RIFT

        Copyright (c) Juniper Networks, Inc., 2016-
        All rights reserved.
    */

    namespace py common_evpn
    namespace rs models

    include "common.thrift"
    include "encoding.thrift"
    include "statistics.thrift"

    const common.FabricIDType default_fabric_id = 1
    const i8 default_evis = 3
    const i8 default_vlans_per_evi = 7

    typedef i32 RouterIDType
    typedef i32 ASType
    typedef i32 ClusterIDType

    struct EVPNAnyRole {
        1: required common.IPv6Address v6_loopback,
        2: required common.IPv6Address type5_v6_loopback,
        3: required common.IPv4Address type5_v4_loopback,
        4: required RouterIDType bgp_router_id,
        5: required ASType autonomous_system,
        6: required ClusterIDType cluster_id,
        /** prefixes to be redistributed north */
        7: required set<common.IPPrefixType> redistribute_north,
        /** prefixes to be redistributed south */
        8: required set<common.IPPrefixType> redistribute_south,
        /** group name for evpn auto overlay */
        9: required string bgp_group_name,
        /** fabric prefixes to be advertised in rift instead of default */
        10: required set<common.IPPrefixType> fabric_prefixes,
    }

    struct PartialEVPNEVI {
        // route target per RFC4360
        1: required CommunityType rt_target,
        2: required RTDistinguisherType rt_distinguisher,
        3: required RTDistinguisherType rt_type5_distinguisher,
        5: required string mac_vrf_name,
        6: required VNIType type5_vni,
    }

    struct EVPNRRRole {
        2: required common.IPv6Address v6_rr_addr_loopback,
        3: required common.IPv6PrefixType v6_peers_allowed_range,
        4: required map<MACVRFNumberType, PartialEVPNEVI> evis,
    }

    typedef i64 RTDistinguisherType
    typedef i64 RTTargetType
    typedef i16 MACVRFNumberType
    typedef i16 VLANIDType
    typedef binary MACType
    typedef i16 UnitType

    struct IRBType {
        1: required string name,
        2: required UnitType unit,
        /// constant
        3: required MACType mac,
        /// contains address of the gateway as well
        4: optional common.IPv6PrefixType v6_subnet,
        /// contains address of the gateway as well
        5: optional common.IPv4PrefixType v4_prefix,
    }

    typedef i32 VNIType

    struct VLANType {
        1: optional VLANIDType id,
        2: required string name,
        3: optional IRBType irb,
        5: optional bool stretched = false,
        6: optional bool is_native = false,
    }

    struct CEInterfaceType {
        2: optional common.IEEE802_1ASTimeStampType moved_to_ce,
        // we may not be able to obtain it in case of internal errors
        3: optional string platform_interface_name,
    }

    typedef i64 CommunityType

    struct EVPNEVI {
        // route target per RFC4360
        1: required CommunityType rt_target,
        2: required RTDistinguisherType rt_distinguisher,
        3: required RTDistinguisherType rt_type5_distinguisher,
        4: required string mac_vrf_name,
        // fabric unique 24 bits VNI on non-stretch, otherwise unique across fabrics
        5: required map<VNIType, VLANType> vlans,
        6: required VNIType type5_vni,
    }

    struct EVPNLeafRole {
        1: required set<common.IPv6Address> rrs,
        2: required map<MACVRFNumberType, EVPNEVI> evis,
        3: optional map<common.LinkIDType, CEInterfaceType> ce_interfaces,
        5: optional binary leaf_unique_lacp_system_id,
        6: optional binary fabric_unique_lacp_system_id,
    }

    /// structure to indicate EVPN roles assumed and their variables for
    /// external platform to configure itself accordingly. Presence of
    /// according structure indicates that the role is assumed.
    struct EVPNRoles {
        1: required EVPNAnyRole generic,
        2: optional EVPNRRRole route_reflector,
        3: optional EVPNLeafRole leaf,
    }

    const common.TimeIntervalInSecType default_leaf_delay = 120
    const common.TimeIntervalInSecType default_interface_ce_delay = 180
    /// default delay before EVPNZTP FSM starts to compute anything
    const common.TimeIntervalInSecType default_evpnztp_startup_delay = 60
This section contains the normative Auto-EVPN Analytics Thrift schema.¶