Internet-Draft ET-ID Usage Update August 2021
Wang Expires 24 February 2022
Workgroup:
BESS WG
Published:
Intended Status:
Standards Track
Expires:
Author:
Y. Wang
ZTE Corporation

Ethernet Tag ID Usage Update for Ethernet A-D per EVI Route

Abstract

This draft discusses the issues with several service interfaces of L3 EVIs. It then proposes an extension to [RFC7432] and [I-D.sajassi-bess-evpn-ip-aliasing] to perform the ARP synchronization and IP aliasing for Layer 3 routes that L3 EVIs need in order to build a complete IP ECMP. It also introduces two new EVPN Service Interfaces for EVPN VPLS services and an extension of the AC-ID extended community to improve ARP/ND Probing upon remote PE failures.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 24 February 2022.

1. Introduction

In Section 5.1 of [I-D.wang-bess-evpn-arp-nd-synch-without-irb], the ESIs of IP-VRF ACs are advertised as the overlay index for IP forwarding. In the IP-VRF context, however, the Ethernet A-D per EVI routes of the same ESI can easily conflict with each other when they are imported into the same IP-VRF. This document describes three such scenarios and then describes their solutions. These solutions all need an extension to the Ethernet Tag ID of the Ethernet A-D per EVI routes. This draft then proposes two new Service Interfaces for EVPN VPLS services: "Multiple VLAN-based ACs of the Same BD" and "Dedicating all VLANs of a Separated Risk VLAN-bundle AC to a single BD". When different VLANs on the same ES of the same Broadcast Domain can fail independently, these two service interfaces work better than the VLAN-bundle and VLAN-aware bundle service interfaces.

1.1. Service Interfaces of L3 EVIs

The physical links of this network are described in detail in Figure 11 and Appendix A. The network's EVCs (Ethernet Virtual Connections, which are typically established on a per-<Port, VLAN> basis) are illustrated in the following sections, one Service Interface at a time.

Ethernet segment ES1 is the Ethernet segment of P1 and P2 of Figure 11, and its ESI is ESI21.

1.1.1. IRB Service Interface

The L3 EVI service interface per [I-D.sajassi-bess-evpn-ip-aliasing] is called the IRB Service Interface in this draft.

1.1.2. Mono VLAN-based Service Interface

                                +-------------------+
   PNEC1                PE1     |                   |
+---------+          +----------+--------+          |
|         |          |  __(P1.1)__(VPNx) |          |
|      "  |   P1     | /                 |          |
|      #==============<                  |          | PE3
|      "  |  ESI21   | \__      __       |     +----+----+
| N1+--+  |    +     |    (P1.2)  (VPNy) |     |         |
|      "  |    |     +-----------+-------+     |  (VPNx)---+N3
|      "  |    |                 |             |         |
|      "  |    |                 |             |         |
|      "  |    |        PE2      |             |         |
|      "  |    |     +-----------+-------+     |  (VPNy)---+N5
| N2+--"  |    +     |  __(P2.2)__(VPNy) |     |         |
|      "  |  ESI21   | /                 |     +----+----+
|      #==============<                  |          |
|      "  |   P2     | \__      __       |          |
|         |          |    (P2.1)  (VPNx) |          |
+---------+          +----------+--------+          |
                                |                   |
                                +-------------------+
Figure 1: Mono VLAN-Based S-I

In this service interface, each ESI can have no more than one of its VLANs attached to a specified EVPN Instance.

Taking the above figure as an example, P1.1 and P1.2 are two subinterfaces of the same ESI, and <ESI21, VPNx> is a Mono VLAN-based service interface, so P1.1 and P1.2 cannot be attached to the same EVPN Instance. In fact, P1.1 is attached to VPNx, while P1.2 is attached to VPNy.

Note that there is no MAC-VRF or IRB interface on PE1/PE2/PE3 in this case, so the IP-VRFs themselves are called EVPN instances. Such an EVPN instance can be called an EVPN-signalled L3VPN, or L3EVI for short.

1.1.3. Multiple VLAN-based Service Interface

                            +-----------------------+
   PNEC1              PE1   |                       |
+---------+         +-------+------+                |
|         |         |  __(P1.1)    |                |
|      "  |         | /        \   |                |
|      #=============<      (VPN1) |                | PE3
|      "  |  ESI21  | \__      /   |           +----+----+
| N1+--"  |    +    |    (P1.2)    |           |         |
|      "  |    |    +--------+-----+           |         |
|      "  |    |             |                 |         |
|      "  |    |             |                 | (VPN1)----+N3
|      "  |    |      PE2    |                 |         |
|      "  |    |    +--------+-----+           |         |
| N2+--"  |    +    |  __(P2.2)    |           |         |
|      "  |  ESI21  | /        \   |           +----+----+
|      #=============<      (VPN1) |                |
|      "  |         | \__      /   |                |
|         |         |    (P2.1)    |                |
+---------+         +-------+------+                |
                            |                       |
                            +-----------------------+
Figure 2: Multiple VLAN-based S-I

This network is similar to Figure 1, with the following notable exception: the L3EVIs VPNx and VPNy there are the same L3EVI (VPN1) here. So two of the ACs of VPN1 on PE1 are subinterfaces (P1.1 and P1.2) of the same ESI (ESI21).

Note that P1.1 is the gateway of N1, while P1.2 is the gateway of N2. N1 and N2 are not in the same subnet.

1.1.4. VLAN-bundle Service Interface

When two VLANs of the same ES share the same gateway IP address in the same EVPN, these two VLANs can be configured into the same subinterface of that ES. This is the VLAN-bundle service interface.

                            +-----------------------+
   PNEC1              PE1   |                       |
+---------+         +-------+-------------+         |
|         |         |                     |         |
|      "  |         | (P1.1)   AC1        |         |
|      #=============<      >======(VPN1) |         | PE3
|      "  |  ESI21  | (P1.2)              |    +----+----+
| N1+--"  |    +    |                     |    |         |
|      "  |    |    +--------+------------+    |         |
|      "  |    |             |                 |         |
|      "  |    |             |                 | (VPN1)----+N3
|      "  |    |      PE2    |                 |         |
|      "  |    |    +--------+------------+    |         |
| N2+--"  |    +    |                     |    |         |
|      "  |  ESI21  | (P2.1)   AC2        |    +----+----+
|      #=============<      >======(VPN1) |         |
|      "  |         | (P2.2)              |         |
|         |         |                     |         |
+---------+         +-------+-------------+         |
                            |                       |
                            +-----------------------+
Figure 3: Separated Risk VLAN-Bundle S-I

This network is similar to Figure 2 with a few notable exceptions as below. P1.1 and P1.2 are aggregated into the same subinterface (that is AC1). It is AC1 that is attached to VPN1.

Note that although P1.1 and P1.2 are aggregated into the same subinterface AC1, when <P1, VLAN 1> (P1.1) fails, <P1, VLAN 2> (P1.2) may not fail. Thus we say that AC1 is configured with a Separated Risk VLAN-Bundle.

Note that the VLAN-bundle Service Interface described here is actually a Separated Risk VLAN-Bundle Service Interface.

1.1.5. Shared Risk VLAN-bundle Service Interface

Another network may be similar to that of Section 1.1.4, except that the VLAN-bundle of AC1 is a Shared Risk VLAN-bundle rather than a Separated Risk VLAN-bundle.

When we say that subinterface AC1 is a Shared Risk VLAN-bundle, we mean that any event that results in the failure of P1.1 also results in the failure of P1.2.

When AC1 is a Shared Risk VLAN-bundle, we say that <ESI21, VPN1> is a Shared Risk VLAN-bundle service interface.
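The difference between a Shared Risk and a Separated Risk VLAN-bundle can be sketched as follows. This is only an illustration, not part of the proposed procedures: the function name `is_shared_risk` and the failure-event names are hypothetical, and the sketch assumes that each EVC of the bundle can be annotated with the set of events that would bring it down.

```python
from typing import Dict, FrozenSet


def is_shared_risk(risks: Dict[str, FrozenSet[str]]) -> bool:
    """Return True if every EVC of the bundle fails on exactly the same events.

    `risks` maps an EVC name (e.g. "P1.1") to the set of failure events that
    bring it down. The event names used below are illustrative assumptions.
    """
    return len(set(risks.values())) <= 1


# Hypothetical case: both EVCs depend only on the same port, so any event
# that fails P1.1 also fails P1.2.
shared = {"P1.1": frozenset({"port-P1"}), "P1.2": frozenset({"port-P1"})}

# Hypothetical case: P1.1 also depends on a further link, so it can fail
# while P1.2 keeps working.
separated = {
    "P1.1": frozenset({"port-P1", "link-A"}),
    "P1.2": frozenset({"port-P1"}),
}

print(is_shared_risk(shared))     # True
print(is_shared_risk(separated))  # False
```

Comparing whole event sets is a deliberately strict reading of the definition above; an operator might use a looser criterion in practice.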

1.1.6. Integrated Routing and Cross-connecting Service Interface

The service interface in [I-D.wz-bess-evpn-vpws-as-vrf-ac] can be called the IRC (Integrated Routing and Cross-connecting) Service Interface.

1.2. New Service Interfaces of L2 EVIs

The physical links of this network are described in detail in Figure 11 and Appendix A. The network's EVCs (Ethernet Virtual Connections, which are typically established on a per-<Port, VLAN> basis) are illustrated in the following sections, one Service Interface at a time.

Ethernet segment ES1 is the Ethernet segment of P1 and P2 of Figure 11, and its ESI is ESI21.

1.2.1. Multiple VLAN-based ACs of the Same BD

                            +-----------------------+
   PNEC1              PE1   |                       |
+---------+         +-------+------+                |
|         |         |  __(P1.1)    |                |
|      "  |         | /        \   |                |
|      #=============<     (BD100) |                | PE3
|      "  |  ESI21  | \__      /   |           +----+----+
| N1+--"  |    +    |    (P1.2)    |           |         |
|      "  |    |    +--------+-----+           |         |
|      "  |    |             |                 |         |
|      "  |    |             |                 | (BD100)---+N3
|      "  |    |      PE2    |                 |         |
|      "  |    |    +--------+-----+           |         |
| N2+--"  |    +    |  __(P2.2)    |           |         |
|      "  |  ESI21  | /        \   |           +----+----+
|      #=============<     (BD100) |                |
|      "  |         | \__      /   |                |
|         |         |    (P2.1)    |                |
+---------+         +-------+------+                |
                            |                       |
                            +-----------------------+
Figure 4: Multiple VLAN-based S-I

This network is similar to Figure 2, with the following notable exception: the EVI there (VPN1) is an L3 EVI, but the EVI here is an L2 EVI (one of whose BDs is BD100).

Note that P1.1 and P1.2 are two ACs of BD100. Thus they are two subinterfaces of the same ESI (ESI21) and of the same BD.

1.2.1.1. Attaching these ACs to an ETI-Specific BD

When an EVPN Instance (EVI) can have multiple broadcast domains (BDs), every BD of that EVI is called an ETI-Specific BD in this draft. The data packets of an ETI-Specific BD are expected to be received along with a normalized Ethernet Tag ID (ETI).

An ETI-Specific BD is identified by an ET-ID in the context of its EVI. That ET-ID is called the BD-ID of the ETI-Specific BD in this draft. That is to say, all MAC entries of that ETI-Specific BD are in a MAC space identified by that BD-ID, so the ET-ID fields of the RT-2 routes of these MAC entries are all set to the value of that BD-ID. That is why the BD-ID is also called the normalized ET-ID of that BD.

Although the VLANs of P1.1 and P1.2 are different, when P1.1 and P1.2 are configured with the same normalized ET-ID, they belong to the same ETI-Specific BD, whose BD-ID is that normalized ET-ID.

If a BD is an ETI-Specific BD, saying that an AC is attached to that BD means that the AC is attached to the EVI of that BD and that the normalized ET-ID of that AC is configured with the BD-ID of that BD.

So we assign the same normalized ET-ID (say ET-ID 100) to P1.1, P1.2, P2.1 and P2.2. As a result, and according to [RFC7432], the Ethernet A-D per EVI routes for P1.1 and P1.2 are the same route (say R1_100_P1b), and the Ethernet A-D per EVI routes for P2.1 and P2.2 are the same route (say R1_100_P2b). R1_100_P1b and R1_100_P2b both carry ET-ID 100.
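The way two ACs collapse into a single Ethernet A-D per EVI route can be sketched as below. This is a minimal illustration under stated assumptions: the names `EadPerEviKey` and `ead_key`, the RD value and the local VLAN numbers are hypothetical, and the route key is modelled as <RD, ESI, ET-ID> with the MPLS label left out of the key.

```python
from typing import NamedTuple


class EadPerEviKey(NamedTuple):
    """Route key of an Ethernet A-D per EVI route (label not part of the key)."""
    rd: str
    esi: str
    et_id: int


def ead_key(rd: str, esi: str, normalized_et_id: int) -> EadPerEviKey:
    # The AC's own VLAN ID does not appear in the key; only the normalized
    # ET-ID configured on the AC does.
    return EadPerEviKey(rd, esi, normalized_et_id)


# Hypothetical configuration of PE1: (AC name, local VLAN, normalized ET-ID).
pe1_acs = [("P1.1", 1, 100), ("P1.2", 2, 100)]
PE1_RD = "192.0.2.1:100"  # illustrative RD, not taken from this draft

keys = {ead_key(PE1_RD, "ESI21", et_id) for _name, _vlan, et_id in pe1_acs}
print(len(keys))  # 1: both ACs map to the same route key
```

Because the set holds one key, a receiving PE cannot tell from this route alone which of the two ACs is still up.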

1.2.1.2. Attaching these ACs to an ETI-Agnostic BD

When an EVPN Instance (EVI) can have only one broadcast domain (BD), the only BD of that EVI is called an ETI-Agnostic BD in this draft. A broadcast domain of an L2 EVI with the VLAN-based service interface is a good example of an ETI-Agnostic BD.

If a BD is an ETI-Agnostic BD, saying that an AC is attached to that BD means that the AC is attached to the EVI of that BD. The ET-ID fields of the RT-2 routes of all MAC entries of an ETI-Agnostic BD are always zero.

When P1.1, P1.2, P2.1 and P2.2 are attached to an ETI-Agnostic BD, according to [RFC7432], the Ethernet A-D per EVI routes for P1.1 and P1.2 are the same route (say R1_100_P1c), and the Ethernet A-D per EVI routes for P2.1 and P2.2 are the same route (say R1_100_P2c). The ET-IDs of R1_100_P1c and R1_100_P2c are also the same value (zero).
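The ET-ID rule for the RT-2 routes of the two kinds of BD can be summarized in a short sketch. The class `BroadcastDomain` and the function `rt2_et_id` are hypothetical names used only for illustration; the sketch assumes that an ETI-Specific BD is recognized by having a configured BD-ID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BroadcastDomain:
    name: str
    bd_id: Optional[int] = None  # configured only for an ETI-Specific BD

    @property
    def is_eti_specific(self) -> bool:
        return self.bd_id is not None


def rt2_et_id(bd: BroadcastDomain) -> int:
    """ET-ID carried in the RT-2 routes of the BD's MAC entries."""
    # ETI-Specific BD: the BD-ID (normalized ET-ID); ETI-Agnostic BD: zero.
    return bd.bd_id if bd.bd_id is not None else 0


print(rt2_et_id(BroadcastDomain("BD100", bd_id=100)))  # 100
print(rt2_et_id(BroadcastDomain("BD100")))             # 0
```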

1.2.2. Separated Risk VLAN-bundle AC

Although P1.1 and P1.2 can be aggregated into a single subinterface, this does not change the fact that they do not share the same risks. When the physical interface P3 (see Figure 11) fails, one of them fails while the other continues to work.

                            +-----------------------+
   PNEC1              PE1   |                       |
+---------+         +-------+-------------+         |
|         |         |                     |         |
|      "  |         | (P1.1)   AC1        |         |
|      #=============<      >=====(BD100) |         | PE3
|      "  |  ESI21  | (P1.2)              |    +----+----+
| N1+--"  |    +    |                     |    |         |
|      "  |    |    +--------+------------+    |         |
|      "  |    |             |                 |         |
|      "  |    |             |                 | (BD100)---+N3
|      "  |    |      PE2    |                 |         |
|      "  |    |    +--------+------------+    |         |
| N2+--"  |    +    |                     |    |         |
|      "  |  ESI21  | (P2.1)   AC2        |    +----+----+
|      #=============<      >=====(BD100) |         |
|      "  |         | (P2.2)              |         |
|         |         |                     |         |
+---------+         +-------+-------------+         |
                            |                       |
                            +-----------------------+
Figure 5: Separated Risk VLAN-Bundle S-I

This network is similar to Figure 4 with a few notable exceptions as below. P1.1 and P1.2 are aggregated into the same subinterface (that is, AC1). Now it is AC1 that is attached to BD100, not P1.1 or P1.2.

Note that although P1.1 and P1.2 are aggregated into the same subinterface AC1, when <P1, VLAN 1> (P1.1) fails, <P1, VLAN 2> (P1.2) may not fail. Thus we say that AC1 is configured as a Separated Risk VLAN-Bundle Service Interface.

Note that the VLAN-bundle Service Interface of [RFC7432] is actually a Shared Risk VLAN-Bundle Service Interface.

1.2.2.1. Dedicating Such AC to a Single ETI-Specific BD

When we say that AC1 (which is a VLAN-bundle subinterface) is dedicated to BD100, we mean that AC1 is attached to the EVI of BD100 and that all VLANs of AC1 are configured with BD100's normalized ET-ID (that is, 100).

According to [RFC7432], the Ethernet A-D per EVI routes for P1.1 and P1.2 are the same route (say R1_100_P1d), and the Ethernet A-D per EVI routes for P2.1 and P2.2 are the same route (say R1_100_P2d). Both R1_100_P1d and R1_100_P2d have an ET-ID of 100.

Note that if each VLAN has its own normalized ET-ID, this is just the normal VLAN-aware bundle service interface as per [RFC7432]. In that case, we say that the VLAN-mapping relationship between the AC and BD100 is a 1:1 mapping. When the AC is dedicated to a single ETI-Specific BD, the VLAN-mapping relationship between the AC and BD100 is an N:1 mapping.
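The 1:1 and N:1 mapping relationships can be told apart mechanically, as in the following sketch. The function name `vlan_mapping_type` is hypothetical, and the "mixed" result is an assumption of this sketch for configurations that are neither of the two cases named above.

```python
from typing import Dict


def vlan_mapping_type(vlan_to_et_id: Dict[int, int]) -> str:
    """Classify how the VLANs of one VLAN-bundle AC map to normalized ET-IDs."""
    if not vlan_to_et_id:
        raise ValueError("the AC has no VLANs configured")
    et_ids = list(vlan_to_et_id.values())
    distinct = len(set(et_ids))
    if distinct == len(et_ids):
        return "1:1"    # each VLAN has its own normalized ET-ID
    if distinct == 1:
        return "N:1"    # all VLANs share one normalized ET-ID
    return "mixed"      # not one of the two cases named in the text


print(vlan_mapping_type({1: 1, 2: 2}))      # 1:1
print(vlan_mapping_type({1: 100, 2: 100}))  # N:1
```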

Note that when we say a VLAN-bundle AC is attached to an ETI-Specific BD, that AC may be dedicated to that BD in some use cases, while in other use cases only one VLAN of that AC may be attached to that BD.

1.2.2.2. Attaching Such AC to an ETI-Agnostic BD

When a BD is not ETI-Specific, we can say that it is ETI-Agnostic. When we say an AC is attached to an ETI-Agnostic BD, it means that all VLANs of that AC are attached to that ETI-Agnostic BD. In other words, the AC is dedicated to that BD.

When the BD is ETI-Agnostic, according to [RFC7432], the Ethernet A-D per EVI routes for P1.1 and P1.2 are the same route (say R1_100_P1e), and the Ethernet A-D per EVI routes for P2.1 and P2.2 are the same route (say R1_100_P2e). Both R1_100_P1e and R1_100_P2e have a zero ET-ID.

1.3. Terminology and Acronyms

Most of the acronyms and terms used in this document come from [RFC7432], [I-D.wang-bess-evpn-arp-nd-synch-without-irb] and [I-D.sajassi-bess-evpn-ip-aliasing], except for the following:

* VRF AC -

An Attachment Circuit (AC) that attaches a CE to an IP-VRF but is not an IRB interface.

* VRF Interface -

An IRB interface or a VRF-AC or an IRC interface. Note that a VRF interface will be bound to the routing space of an IP-VRF.

* L3 EVI -

An EVPN instance spanning the Provider Edge (PE) devices participating in that EVPN, which contains VRF ACs and may also contain IRB interfaces or IRC interfaces.

* IP-AD/EVI -

Ethernet Auto-Discovery route per EVI, where the EVI is an IP-VRF. Note that the Ethernet Tag ID of an IP-AD/EVI route may be non-zero.

* IP-AD/ES -

Ethernet Auto-Discovery route per ES, and the EVI for one of its route targets is an IP-VRF.

* RMAC -

Router's MAC, which is signaled in the Router's MAC extended community.

* ESI Overlay Index -

ESI as overlay index.

* ET-ID -

Ethernet Tag ID; it is also called ETI for short in this document.

* RT-2R -

When a MAC/IP Advertisement Route whose ESI is not zero is used for IP-VRF forwarding, it is called an RT-2R in this draft. When it is used for MAC-VRF forwarding, it is not called an RT-2R in this draft.

* RT-5E -

An EVPN Prefix Advertisement Route with a non-reserved ESI as its overlay index (the ESI-as-Overlay-Index-style RT-5).

* IRC -

Integrated Routing and Cross-connecting. An IRC interface is the virtual interface connecting an IP-VRF and an EVPN VPWS.

* CE-BGP -

The BGP session between a PE and a CE. Note that a CE-BGP route does not have an RD or a Route Target.

* CE-Prefix -

An IP prefix behind a CE is called that CE's CE-Prefix.

* EVC -

Ethernet Virtual Connection, which is typically constructed per <Port, VLAN> basis.

* ETI-Agnostic BD -

A Broadcast Domain (BD) whose data packets can be received along with any Ethernet Tag ID (ETI). Note that a broadcast domain of an L2 EVI with the VLAN-bundle or VLAN-based service interface is a good example of an ETI-Agnostic BD.

* ETI-Specific BD -

A Broadcast Domain (BD) whose data packets are expected to be received along with a normalized Ethernet Tag ID (ETI). Note that a broadcast domain of an L2 EVI with the VLAN-aware bundle service interface is a good example of an ETI-Specific BD.

* BDI-Specific EADR -

When the <ESI, BD> uses the BDI-Specific Ethernet Auto-discovery mode, the only Ethernet A-D per EVI route of that <ESI, BD> is called a BDI-Specific EADR in this draft.

* ACI-Specific EADR -

When the <ESI, BD> uses the ACI-Specific Ethernet Auto-discovery mode, the Ethernet A-D per EVI routes of that <ESI, BD> are called ACI-Specific EADRs in this draft.

* U-Tag -

User Tag, a data packet's U-tag is a tag which is not used to find out the AC of that data packet. The U-Tag typically is not configured on the AC.

2. Problem Statement

2.1. Problem with Multiple VLAN-based L3SI

                                 +--------------------------+
 PNEC1                      PE1  |                          |
+-------------+          +-------+------+                   | PE3
|             |          | X__(20.9)    | ----X---->   +----+----+
|          "  |   P1     | /        \   | Withdraw     |         |
|          #==============<      (VPN1) | IP-AD/EVI    |  (VPN1)---+N6
| R1_______"  |  ESI21   | \__      /   | ET-ID=0      |         |
|    10.2  "  |    +     |    (10.9)    |              +----+----+
|          "  |    |     +--------+-----+                   |
|          "  |    |              |                         |
|          "  |    |              |                         | DGW1
|          "  |    |        PE2   |                    +----+----+
| R2_______"  |    |     +--------+-----+              |         |
|    20.2  "  |    +     |  __(20.9)    |              |(3.3.3.3)|
|          "  |  ESI21   | /        \   |              |    |    |
|          #==============<      (VPN1) | Withdraw     |  (VPN1)---+N3
|          "  |   P2     | \__      /   | IP-AD/EVI    |         |
|             |          | X  (10.9)    | ----X---->   +----+----+
+-------------+          +-------+------+ ET-ID=0           |
                                 |                          |
                                 +--------------------------+
Figure 6: RT-1 Conflict of L3EVIs

The IP addresses of P1.1, P1.2, P2.1, P2.2, R1 and R2 (see Figure 2) are shown in the above figure.

P1 and P2 are configured with the same ESI (ESI21). Thus an Ethernet A-D per EVI route ETI_10_2 is advertised for P1.1, an Ethernet A-D per EVI route ETI_10_3 is advertised for P2.1, an Ethernet A-D per EVI route ETI_20_2 is advertised for P1.2, and an Ethernet A-D per EVI route ETI_20_3 is advertised for P2.2.

When PE3 receives ETI_10_2 and ETI_20_2, it picks only one of them to install in the data plane, because they have the same <RD, ESI, ET-ID> and next hop. We assume that ETI_20_2 is picked. When PE3 receives ETI_10_3 and ETI_20_3, it also picks only one of them to install in the data plane, because they too have the same <RD, ESI, ET-ID> and next hop. We assume that ETI_20_3 is picked.

Although PE1 advertises an RT-5 route R5_SN8_1 (whose ESI is ESI23) to PE3, when H3 sends a data packet DP_3_8 to a host in SN8 after P1.1 fails, PE3 may still send DP_3_8 to PE1, because PE3 load-balances traffic simply by following ETI_20_2 and ETI_20_3. This problem causes packet drops or traffic bypassing.

When physical port P3 (see Figure 11, which illustrates the physical links of Figure 6) fails, the CFM session of P2.1 (10.9 of PE2) goes down (illustrated by the 'X' inside PE2), while the CFM session of P2.2 (20.9 of PE2) stays up. Thus only the IP-AD/EVI route of P2.1 (whose ET-ID=1) should be withdrawn by PE2. The IP-AD/EVI route of P2.2 (whose ET-ID=2) and the IP-AD/ES route should not be withdrawn by PE2.

Note that if the ET-IDs of these two IP-AD/EVI routes are the same, then when P2.1 fails, DGW1 continues to load-balance traffic whose DA=20.2 to PE2, because there is still another IP-AD/EVI route (of VPN1) with the same ESI and ET-ID. That is why ACI-Specific Auto-discovery (Section 3.1.1) should be followed.
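The effect of a shared ET-ID versus AC-specific ET-IDs on the ECMP set seen by a remote PE can be sketched as follows. This is an illustrative model only: the function names are hypothetical, route keys are reduced to (ESI, ET-ID) within one IP-VRF, and the sketch assumes that with a shared ET-ID a PE keeps its single route advertised as long as any of its ACs is up.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

RouteKey = Tuple[str, int]  # (ESI, ET-ID) within one IP-VRF


def advertised_keys(acs_up: Dict[int, bool], aci_specific: bool) -> Set[RouteKey]:
    """IP-AD/EVI route keys one PE keeps advertised for ESI21.

    `acs_up` maps the AC's ET-ID (1 for Px.1, 2 for Px.2) to its CFM state.
    """
    if aci_specific:
        # One route per AC; the route of a failed AC is withdrawn.
        return {("ESI21", et_id) for et_id, up in acs_up.items() if up}
    # Shared ET-ID 0: one route stands for all ACs (assumption of this sketch).
    return {("ESI21", 0)} if any(acs_up.values()) else set()


def ecmp_sets(state: Dict[str, Dict[int, bool]], aci_specific: bool) -> Dict[RouteKey, Set[str]]:
    """Group the advertising PEs by route key, as a remote PE would."""
    paths: Dict[RouteKey, Set[str]] = defaultdict(set)
    for pe, acs_up in state.items():
        for key in advertised_keys(acs_up, aci_specific):
            paths[key].add(pe)
    return dict(paths)


# P2.1 (ET-ID 1 on PE2) has failed; all other ACs are up.
state = {"PE1": {1: True, 2: True}, "PE2": {1: False, 2: True}}

shared = ecmp_sets(state, aci_specific=False)
specific = ecmp_sets(state, aci_specific=True)
print(shared)
print(specific)
```

In the shared case the only key still lists both PE1 and PE2, while in the AC-specific case the key for ET-ID 1 lists PE1 alone, which is the behaviour the text asks for.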

The solution for this problem is described in Section 3.

2.2. Problem with IRB Service Interface

The physical links of this network are described in detail in Figure 11 and Appendix A. The network's EVCs (Ethernet Virtual Connections, which are typically established on a per-<Port, VLAN> basis) are illustrated in the following sections, one Service Interface at a time.

  PNEC1                         PE1
+------------+           +----------------+
|            |           |  __(BD-20)     |
| H4      "  |        P1 | /      \ IRB21 |
| |       #================   (IP-VRF)    +-----------------+
| N1______"  |   ESI21   | \__    / IRB11 |                 |
|    10.2 "  |     +     |    (BD-10)     |                 |  PE3
|         "  |     |     +----------------+             +---+----+
|         "  |     |                                    |        |
|         "  |     |                                    |(IP-VRF)+-+H3
|         "  |     |            PE2                     |        |
| N2______"  |     |     +----------------+             +---+----+
|    20.2 "  |     +     |  __(BD-10)     |                 |
|         "  |   ESI21   | /      \ IRB12 |                 |
|         #================   (IP-VRF)    +-----------------+
|         "  |        P2 | \__    / IRB22 |
|            |           |    (BD-20)     |
+------------+           +----------------+
Figure 7: RT-1 Conflict of EVPN IRB

The BD-10 here is the VPNx of Figure 11, and the BD-20 is the VPNy of Figure 11.

BD-10 and BD-20 are both BDs (broadcast domains), not IP-VRFs. The anycast IP address of IRB11 and IRB12 is 10.9, and the anycast IP address of IRB21 and IRB22 is 20.9. BD-10 and BD-20 are integrated into the same IP-VRF by IRB11, IRB12, IRB21 and IRB22. As a result of that, N1, IRB11 and IRB12 are of subnet SN1, and N2, IRB21 and IRB22 are of subnet SN2.

Note that IRB11 and IRB12 are IRB interfaces of BD-10 where BD-10 is a Broadcast Domain of VLAN-based Service Interface. IRB21 and IRB22 are IRB interfaces of BD-20 where BD-20 is also a Broadcast Domain of VLAN-based Service Interface.

According to [I-D.sajassi-bess-evpn-ip-aliasing], the IP A-D per EVI routes R1_110, R1_120, R1_210, R1_220 for P1.1, P1.2, P2.1 and P2.2 will all have zero Ethernet Tag IDs.

When PE3 receives R1_110 and R1_120, it picks only one of them to install in the data plane. We assume that R1_120 is picked. When PE3 receives R1_210 and R1_220, it also picks only one of them to install in the data plane. We assume that R1_220 is picked.

Although PE1 advertises an RT-2 route R2_N1 (whose ESI is ESI21 and whose IP is 10.2) to PE3, when H3 sends a data packet DP_H3_N1 to N1 after P1.1 fails, PE3 may still send DP_H3_N1 to PE1, because PE3 load-balances traffic simply by following R1_120 and R1_220. This problem causes packet drops or traffic bypassing.

The solution for this problem is described in Section 3.6.1.

2.3. Problem with Multiple VLANs of a Single BD

2.3.1. Problem with Multiple VLAN-based L2SI

We want the IP addresses of N1, N2 and N3 to be in the same subnet (say SN100), so we select the network of Figure 4 to accomplish that. Now assume that BD100 of that network is an ETI-Agnostic broadcast domain; that is to say, the EVI of BD100 uses the VLAN-based service interface.

According to [RFC7432], the Ethernet A-D per EVI routes for P1.1 and P1.2 will be the same (say R1_100_P1), the Ethernet A-D per EVI routes for P2.1 and P2.2 will be the same (say R1_100_P2). Both R1_100_P1 and R1_100_P2 will have a zero Ethernet Tag ID.

So when the CFM session of subinterface P1.1 fails, if R1_100_P1 is withdrawn, the forwarding of N2's packets (data packets destined to N2) goes wrong; but if R1_100_P1 is not withdrawn, the forwarding of N1's packets is affected. That is the problem.

The solution for this problem is described in Section 3.6.2.

2.3.2. Problem with Separated Risk VLAN-bundle of a BD

Now P1.1 and P1.2 of Section 2.3.1 are aggregated into a single subinterface (see AC1 of Figure 5), and BD100 is an ETI-Specific BD. Although that EVI uses the VLAN-aware bundle Service Interface and the VLANs of P1.1 and P1.2 are different, P1.1 and P1.2 cannot be attached to different broadcast domains of that EVI, because we still want the IP addresses of N1, N2 and N3 to be in the same subnet (say SN100), as in Section 2.3.1.

So we assign the same normalized ET-ID (say ET-ID 100) to P1.1, P1.2, P2.1 and P2.2. As a result, the Ethernet A-D per EVI routes for P1.1 and P1.2 are the same route (say R1_100_P1b), and the Ethernet A-D per EVI routes for P2.1 and P2.2 are the same route (say R1_100_P2b). Both R1_100_P1b and R1_100_P2b have an ET-ID of 100.

So when the CFM session of subinterface P1.1 fails, if R1_100_P1b is withdrawn, the forwarding of N2's packets (packets destined to N2) goes wrong; but if R1_100_P1b is not withdrawn, the forwarding of N1's packets is affected. That is the problem.

Note that this problem is not resolved even if R1_100_P1b carries two AC-ID Extended Communities (one per AC).

The solution for this problem is described in Section 3.6.3.

2.4. Problem with Bump-in-the-wire Use-Case

          TS2                          NVE2
    +--------------+           +---------------+
    |              |           |               |
SN7-----(N2-M4)__  |           |  __(BD-20)    |
    |            \ |       IF2 | /             |
    |             ===============              +-------+
    |          __/ |   ESI23   | \__           |       |
 +----- (N1-M2)    |     +     |    (BD-10)    |       |  DGW1
 |  |              |     |     |               |   +---+-----+
 |  +--------------+     |     +---------------+   | (BD-10) |
 |                       |                         |   \IRB1 |
SN1                      |                         |(IP-VRF) +-+H3
 |        TS3            |             NVE3        |   /IBR3 |
 |  +--------------+     |     +---------------+   | (BD-20) |
 |  |              |     |     |               |   +---+-----+
 +------(N1-M3)__  |     +     |  __(BD-10)    |       |
    |            \ |   ESI23   | /             |       |
    |             ===============              +-------+
    |          __/ |       IF3 | \__           |
SN7-----(N2-M5)    |           |    (BD-20)    |
    |              |           |               |
    +--------------+           +---------------+

Figure 8: RT-1 Conflict of Bump-in-the-wire

This network is similar to Figure 7 (section 4.3) of [I-D.ietf-bess-evpn-prefix-advertisement] with a few notable exceptions as below.

The NVE2, NVE3, DGW1, IRB1, BD-10, ESI23, TS2, TS3 and SN1 here are the same as those there. The N1 here is the Virtual Appliance there (whose VA-MAC is M2/M3 on TS2/TS3).

Here, however, we have another Virtual Appliance N2, which is attached to another Broadcast Domain, BD-20. Both BD-10 and BD-20 are integrated into the same IP-VRF by DGW1, but subnet SN1 can only be reached through BD-10, while subnet SN7 can only be reached through BD-20.

As a result, the RT-1 routes of <ESI23, BD-10> and <ESI23, BD-20> conflict in the IP-VRF's context.

Note that both BD-10 and BD-20 are EVIs of VLAN-based Service Interfaces.

The solution for this problem is described in Section 3.6.4.

2.5. Problem with ARP/ND Probing upon Remote PE Failure

In order to avoid blackholing, when PE2 detects loss of reachability to PE1, it can trigger ARP/ND requests for all synced IP prefixes received from PE1 across all affected BDs. This will force host H21a (a host of subnet SN21) to reply to the solicited ARP/ND messages from PE2 and refresh both MAC and IP for the corresponding host in its tables.

These procedures are called ARP/ND Probing in this draft. The problem with ARP/ND Probing is described as follows:

                                               +----------------+
   PNEC1                                 PE1   |                |
+--------------------+                 +-------+------+         |
|                    | (S-VLAN21)      |  __(P1.1)    |         |
| SN21(C-VLAN100) "  | QinQ-uplink     | /        \   |         |
| |               #=====================<     (BD100) |         |
| +   QinQ-Access "  |     ESI21       | \__      /   |      +--+--+
| N1+-------------"  |       +         |    (P1.2)    |      |     |
| +               "  |       |         +--------+-----+      |     |
| |               "  |       |                  |            |     |
| SN22(C-VLAN200) "  |       |                  |            | PE3 |
|                 "  |       |           PE2    |            |     |
|                 "  |       |         +--------+-----+      |     |
| N2+-------------"  |       +         |  __(P2.2)    |      |     |
|     QinQ-Access "  |     ESI21       | /        \   |      +--+--+
|                 #=====================<     (BD100) |         |
|                 "  | QinQ-uplink     | \__      /   |         |
|                    | (S-VLAN21)      |    (P2.1)    |         |
+--------------------+                 +-------+------+         |
                                               |                |
                                               +----------------+
Figure 9: RT-1 Confliction of Bump-in-the-wire

But when QinQ is enabled in PNEC1 while the ACs (P1.1 and P2.1) are still dot1q subinterfaces, the ARP/ND Probing procedure will fail, because PE2 does not know the required C-VLAN of H21a. Although the AC-ID extended community of [I-D.sajassi-bess-evpn-ac-aware-bundling] can be used, it is not used to advertise U-Tags. So PE2 cannot trigger an ARP/ND request carrying the required C-VLAN (that is, C-VLAN 100) of H21a.

The solution for this problem is described in Section 3.6.5.

3. Solutions

Note that the PEs follow [I-D.wang-bess-evpn-arp-nd-synch-without-irb] to achieve ESI load balancing, except as explicitly described below.

3.1. Ethernet Auto-Discovery modes

When the AC-type is N:1 mapping, we propose a new Ethernet Auto-Discovery mode. It is called ACI-Specific Ethernet A-D mode in this draft, while the Ethernet A-D mode from [RFC7432] is called BDI-Specific Ethernet A-D mode in this draft.

3.1.1. BDI-Specific EAD vs ACI-Specific EAD

* BDI-Specific Ethernet A-D mode -

When the AC-type is N:1 mapping, and only a single Ethernet A-D per EVI route is advertised for that <ESI, BD>, we say that the <ESI, BD> uses BDI-Specific Ethernet Auto-Discovery mode, and that Ethernet A-D per EVI route is called a BDI-Specific EADR (Ethernet A-D per EVI Route) in this draft.

* ACI-Specific Ethernet A-D mode -

When the AC-type is N:1 mapping, and individual Ethernet A-D per EVI routes are advertised for each VLAN of that <ESI, BD>, we say that the <ESI, BD> uses ACI-Specific Ethernet Auto-Discovery mode, and each such Ethernet A-D per EVI route is called an ACI-Specific EADR (Ethernet A-D per EVI Route) in this draft.

Note that when a Shared Risk VLAN-bundle AC is dedicated to a single BD, either BDI-Specific EAD-mode or ACI-Specific EAD-mode can be used. But when a Separated Risk VLAN-bundle AC is dedicated to a single BD, only ACI-Specific EAD-mode should be used.

Note that when BDI-Specific Ethernet Auto-Discovery is used, the IP-AD per EVI route's ET-ID should be set to the BD-Identifier of its BD. When ETI-Specific Ethernet Auto-Discovery is used, the ET-IDs of the IP-AD per EVI routes of the <ESI, BD> should be set per VLAN of the bundle, but the ET-ID of the RT-2R should still be set to the BD-Identifier of that BD. As a result, an AC-ID Overlay Index Extended Community should be carried along with that RT-2R in ETI-Specific Ethernet A-D mode.
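The ET-ID assignment in the two modes can be illustrated with a minimal Python sketch. The names `EadMode`, `ead_et_ids` and `rt2r_needs_aoi` are hypothetical and do not appear in this draft, and the sketch assumes that in the per-VLAN mode each VLAN ID is used directly as the ET-ID of its own route.

```python
from enum import Enum
from typing import List


class EadMode(Enum):
    BDI_SPECIFIC = "bdi"   # one A-D per EVI route per <ESI, BD>
    ACI_SPECIFIC = "aci"   # one A-D per EVI route per VLAN of the bundle


def ead_et_ids(mode: EadMode, bd_id: int, vlans: List[int]) -> List[int]:
    """ET-IDs of the A-D per EVI routes advertised for one <ESI, BD>.

    Hypothetical helper: bd_id is the BD-Identifier and vlans are the
    VLANs of the N:1 VLAN-bundle AC.
    """
    if mode is EadMode.BDI_SPECIFIC:
        # A single route, whose ET-ID is the BD-Identifier.
        return [bd_id]
    # One route per VLAN (assumption: the ET-ID equals the VLAN ID).
    return sorted(set(vlans))


def rt2r_et_id(bd_id: int) -> int:
    """The RT-2R keeps the BD-Identifier as its ET-ID in both modes."""
    return bd_id


def rt2r_needs_aoi(mode: EadMode) -> bool:
    """In the per-VLAN mode the RT-2R's ET-ID no longer identifies the
    A-D per EVI route, so an overlay index community has to be attached."""
    return mode is EadMode.ACI_SPECIFIC
```

For example, a bundle of VLANs 10 and 20 attached to BD 100 yields one route with ET-ID 100 in BDI-Specific mode, and two routes with ET-IDs 10 and 20 in the per-VLAN mode.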

3.1.2. Use Cases and their ET-IDs and AC-IDs

The advertisement of ET-IDs and AC-IDs can be combined in many ways, which are illustrated in Table 1:

3.1.2.1. ETI-ACI Combinations for L2EVI Use Cases

The table is explained in detail as follows:

o The "BD-Type","AC-Type" Columns:

Which use cases can each combination be used for?

* BD-Type:
Which type of broadcast domain is selected in that use case?
* AC-Type:
Which type of AC is selected for that BD in that use case?
* ETI-Specific:
The broadcast domain is an ETI-specific BD.
* ETI-Agnostic:
The broadcast domain is an ETI-agnostic BD.
* Mono VLAN:
only one VLAN of the ES is attached to the ETI-Agnostic BD.
* N:1 Separated:
a VLAN-bundle AC whose risk factors are separate for each of its VLANs.
* N:1 Shared:
a VLAN-bundle AC whose VLANs will share the same risks.
* N:1 mapping:
a VLAN-bundle AC whose VLANs are all attached to the same BD.
* 1:1 mapping:
only one VLAN of the ES is attached to the ETI-specific BD.
o The "AC-ETI", "IP-ETI", "ACI-T" Columns:

Which value can each field be assigned to?

* AC-ETI:
The ET-ID field of RT-1 per EVI route of that BD.
* BD-ID:
The ET-ID field of an RT-2 route. It is the identifier (in the context of an EVI) of that RT-2 route's broadcast domain.
* ACI-T:
Which Extended Community should the ACI be carried in? Attachment Circuit ID Extended Community or AOI Extended Community?
* ACI:

The AC-ID of an AC for a specified broadcast domain.

Note that when the AC-Type is N:1 Separated, different VLANs of that AC have different ACIs. The ACI will typically have the same value as the corresponding VLAN.

* AOI:

The ACI is encapsulated as an AOI Extended Community.

Note that in this case, the ACI is the overlay index of that RT-2R or RT-5E. When the AC-Type is N:1 Separated, each RT-1 per EVI route will select an individual VLAN of that AC as its own AOI.

* AC-ID:

The ACI is encapsulated as Attachment Circuit ID Extended Community.

Note that in such case the AOI is still the ET-ID field of that RT-2R or RT-5E.

* BDI:
That field is set to the BD identifier of an ETI-Specific BD.
* 0:
That field is set to zero.
* /:
That Attachment Circuit ID Extended Community is not carried along with that RT-2R or RT-5E route.
o The "No." Column:

The index number of each use case.

o Notes:

*
When the AC-Type is N:1 Shared or N:1 Separated, we can say the AC-Type is N:1 mapping.
**
This follows [I-D.sajassi-bess-evpn-ac-aware-bundling].

3.1.2.2. ETI-ACI Combinations for EVPN IRB Use Cases

The table is explained in detail as follows:

o The "BD-Type", "AC-Type" Columns:

Which use cases can each combination be used for?

* L2AC-Type:
Which type of AC is selected for that BD in that use case?
* Any:
Regardless of which type the L2 AC is.
* BumpWire0:
The broadcast domain is an ETI-Agnostic BD, and the use case is a Bump-in-the-wire use case. This is the original Bump-in-the-wire use case of [I-D.ietf-bess-evpn-prefix-advertisement] section 4.3.
* BumpWireX:
The broadcast domain is an ETI-Specific BD, and the use case is a Bump-in-the-wire use case.
* BumpWire0s:
Multiple BumpWire0-BDs on the same ES are integrated into the same L3EVI.
* BumpWireXs:
Multiple BumpWireX-BDs on the same ES are integrated into the same L3EVI.
* ETI-0 BDs:
Multiple ETI-Agnostic BDs on the same ES are integrated into the same L3EVI.
* ETI-X BDs:
Multiple ETI-Specific BDs on the same ES are integrated into the same L3EVI.
o The "EADR-ETI", "IP-ETI", "IP-ACI" Columns:

Which value can each field be assigned to?

* EADR-ETI:
The Ethernet Tag ID field of RT-1 per EVI route of that BD.
* IP-ETI:
The Ethernet Tag ID field of RT-2R or RT-5E route.
* IP-ACI:

Which Extended Community should the ACI be carried in? Attachment Circuit ID Extended Community or AOI Extended Community?

o Notes:

*
When the BD-10 of the above Bump-in-the-wire use case is replaced with an ETI-specific BD, that use case is called the ETI-specific Bump-in-the-wire use case. The ETI-specific Bump-in-the-wire use case is implied in [I-D.ietf-bess-evpn-prefix-advertisement], as discussed in Section 3.5, Paragraph 4.
**
When the BD-10 of the above Bump-in-the-wire use case is replaced with an ETI-specific BD and ACI-Specific EAD mode ACs, that use case is called the N:1 ETI-specific Bump-in-the-wire use case.

3.1.2.3. ETI-ACI Combinations for L3EVI Use Cases
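Table 3: Combinations for L3EVIs

+-----+---------------------+--------+--------+--------+
| No. | Use Cases           | AC-ETI | IP-ETI | IP-ACI |
+-----+---------------------+--------+--------+--------+
| 33  | Multiple VLAN-based | ACI    | 0      | AOI    |
| 34  | Separated Risk      | ACI    | ACI    | / *    |
| 35  | Mono VLAN-based     | 0      | 0      | /      |
| 36  | Shared Risk         | ACI    | 0      | AOI ** |
+-----+---------------------+--------+--------+--------+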

The table is explained in detail as follows:

o The "Use Cases" Column:

Which use cases can each combination be used for?

* L3 EVPN Service Interfaces:
Multiple VLAN-based service interface, Separated Risk VLAN-bundle service interface, Mono VLAN-based service interface, Shared Risk service interface.
o The "AC-ETI", "IP-ETI", "IP-ACI" Columns:

Which value can each field be assigned to?

* AC-ETI:
The ET-ID field of IP-AD per EVI route of that L3EVI.
o Notes:

*
Both Multiple VLAN-based service interface and Separated Risk VLAN-bundle service interface can use combinations 33-34. The difference is that the ACI of combination 34 will be encapsulated into data packets, but the ACI of combination 33 will not.
**
Both Mono VLAN-based service interface and Shared Risk VLAN-bundle service interface can use combinations 35-36. If it is unclear whether the risks are shared or separated, combination 36 is the safer choice.

3.2. Determining the Aliasing Paths for RT-5E/RT-2R

When PE3 forwards a data packet DP_2021 according to an IP Prefix Advertisement route R5_2021 whose overlay index is an ESI, and the ET-ID of R5_2021 is a non-reserved ET-ID, DP_2021 should not be forwarded according to an Ethernet A-D per EVI route R1_2021 unless the ET-ID and ESI of R1_2021 are both the same as those of R5_2021.

Note that in [I-D.sajassi-bess-evpn-ip-aliasing] the IP-AD per EVI route carries a "Router's MAC" extended community in case the RMAC is not the same among different PEs. In these cases, the inner destination MAC of the corresponding data packets from PE3 to PE1/PE2 must use the RMAC in the IP-AD/EVI route instead, even if there is an RMAC in the RT-2R route.

Note that this is a data-plane update of [I-D.ietf-bess-evpn-prefix-advertisement] for both EVPN signalled L3VPN and [I-D.sajassi-bess-evpn-ip-aliasing]. According to [I-D.ietf-bess-evpn-prefix-advertisement] section 4.3 or [I-D.ietf-bess-evpn-inter-subnet-forwarding] section 5.4, the inner destination MAC will follow the RMAC of RT-5E Route or RT-2R Route.

When selecting the corresponding IP-AD/EVI routes for an RT-5E route, the AOI Extended Community (if it exists) of the RT-5E route is preferred over the ET-ID of the RT-5E route.

* Using ET-ID to select BDI-Specific EADRs -

There may be multiple IP-AD/EVI routes that all match the RT-5E's ESI. In such a case, the IP-AD/EVI routes with the same ET-ID as the RT-5E should be selected.

Note that when the RT-5E's ET-ID is X (X!=0), the ET-IDs of the selected IP-AD/EVI routes (of that RT-5E) should all be X.

Note that the RT-5E's ET-ID is used not only to select IP-AD/EVI routes, but is also encapsulated into data packets in order to remain compatible with the ETI-specific Bump-in-the-wire use case.

* Using AOI to select ETI-Specific EADRs -

There may be multiple IP-AD/EVI routes that all match the RT-5E's ESI. In such a case, the IP-AD/EVI routes whose ET-IDs are the same as the RT-5E's AOI should be selected.

Note that when the RT-5E's AOI is Y (Y!=0), the ET-IDs of the selected IP-AD/EVI routes (of that RT-5E) should all be Y.

Note that when the RT-5E's ET-ID is not 0, and an AOI is advertised along with the RT-5E, the IP-AD/EVI routes of that RT-5E should be selected according to the AOI.

Note that when a data packet is load-balanced according to <ESI, AOI>, it is the RT-5E's ET-ID which should be encapsulated into the data packet, not the AOI.

Note that [I-D.sajassi-bess-evpn-ac-aware-bundling] requires that the presence of the Attachment Circuit ID Extended Community MUST be ignored by non-multihoming PEs, and that the remote PE (a non-multihoming PE, e.g. PE3) MUST process the MAC route as defined in [RFC7432]. But the AOI in this case should be used to select ETI-Specific EADRs. This is incompatible with the Attachment Circuit ID Extended Community, thus the new ACI-Specific Overlay Index Extended Community is defined.

Note that the usage of RT-2R's ET-ID in the context of an IP-VRF should be the same as the usage of RT-5E's ET-ID, and the usage of RT-2R's AOI in the context of an IP-VRF should be the same as the usage of RT-5E's AOI.
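The selection rules of this section can be summarized in a short Python sketch. The class and function names are hypothetical and are not defined by this draft. The sketch assumes that the AOI has already been reduced to the 32-bit value that is compared against ET-IDs, and that an RT-5E with ET-ID 0 and no AOI simply matches routes whose ET-ID is 0.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IpAdPerEvi:
    esi: str
    et_id: int
    next_hop: str


@dataclass(frozen=True)
class Rt5e:
    esi: str
    et_id: int
    aoi: Optional[int] = None  # present only when the AOI community is attached


def select_aliasing_paths(route: Rt5e,
                          candidates: List[IpAdPerEvi]) -> List[IpAdPerEvi]:
    """Pick the IP-AD/EVI routes that an RT-5E (or RT-2R) resolves to."""
    # The AOI, when present, is preferred over the route's own ET-ID.
    key = route.aoi if route.aoi is not None else route.et_id
    # Both the ESI and the selected key must match.
    return [c for c in candidates if c.esi == route.esi and c.et_id == key]


def encapsulated_et_id(route: Rt5e) -> int:
    """The ET-ID placed in the data packet is the route's ET-ID, not the AOI."""
    return route.et_id
```

With IP-AD/EVI routes for <ESI1, 10> from PE1 and PE2 and for <ESI1, 20> from PE1 only, an RT-5E with ET-ID 100 and AOI 10 is load-balanced over PE1 and PE2, while the packets still carry ET-ID 100.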

3.3. ACI-specific Overlay Index Extended Community

A new EVPN BGP Extended Community called Supplementary Overlay Index is introduced. This new extended community is a transitive extended community with the Type field of 0x06 (EVPN) and the Sub-Type of TBD. It is advertised along with the EVPN MAC/IP Advertisement Route (Route Type 2) per [RFC7432] in ACI-Specific Ethernet Auto-Discovery mode. It may also be advertised along with the EVPN Prefix Advertisement Route (Route Type 5) as per [I-D.ietf-bess-evpn-prefix-advertisement]. Generally speaking, the new extended community must be attached to any route that is learnt over an <ESI, EVI> using ACI-Specific Ethernet Auto-Discovery.

The Supplementary Overlay Index Extended Community is encoded as an 8-octet value as follows:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Type=0x06     | Sub-Type=TBD  | Type  |O|Z|F=1| Flags | VLAN3 |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | VLAN3(Cont.)  |         VLAN2         |         VLAN1         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 10: Supplementary Overlay Index Extended Community
o F:

Format Indicator, its value is always 1 in this draft. Other values are reserved.

o Type:

The format of the AC-ID carried in this extended community. The following values are defined:

* 0-1:

VLAN based AC-ID. Are there U-Tags in the AC-ID?

= 1:

VLAN2 is a U-Tag.

Note that when the ACI-Specific Overlay Index is used to select RT-1 per EVI routes, the U-Tags of the AOI should not be used, because the corresponding RT-1 per EVI route's ET-ID is always advertised in the Type-0 AOI format. Thus a Type-1 AOI should be translated into a Type-0 AOI before it is used to select RT-1 per EVI routes. Setting the U-Tags of a Type-1 AOI to zero yields the corresponding Type-0 AOI.

= 0:

Neither VLAN1 nor VLAN2 is a U-Tag.

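Table 4: VLAN-based AOIs

+-----+-----------+--------+-------+-------+--------+
| No. | Use Cases | Type   | VLAN2 | VLAN1 | VLAN3  |
+-----+-----------+--------+-------+-------+--------+
| 1   | untag     | type 0 | 0     | 0     | 0      |
| 2   | default   | type 0 | 0     | FFF   | 0      |
| 3   | dot1q     | type 0 | 0     | E     | 0      |
| 4   | QinQ      | type 0 | I     | E     | 0      |
| 5   | dot1q     | type 1 | U     | E     | 0      |
| 6   | default   | type 1 | U     | FFF   | 0      |
| 7   | default   | type 1 | U     | FFF   | U-Tag2 |
+-----+-----------+--------+-------+-------+--------+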

Notes:

E :

That field is the External VLAN of the AC.

I :

That field is the Internal VLAN of the AC.

U :

That field is a U-Tag of a packet received by the AC.

0 :

The tag corresponding to that field is absent.

FFF :

The AC is the default subinterface (Section 3.3) of the corresponding ES.

untag :

An untagged subinterface should be matched by that format.

default :

A default subinterface should be matched by that format. When the AC is a default subinterface, it will match all the remaining VLAN-tags (which are left over by other subinterfaces) on its main-interface.

dot1q :

A dot1q subinterface should be matched by that format.

QinQ :

A QinQ subinterface should be matched by that format.

U-Tag2 :

The inner VLAN-Tag of the U-Tag that corresponds to the VLAN2 field. It is only used for the Type-1 AOI of a main interface's default subinterface.

* 2-7:

Reserved.

* 8-15:

Reserved until all of 2-7 are used, or the first bit should be used as a special flag.

o O Flag:

Overlay Index Flag. When set, this extended community is used as an overlay index.

When the Type field is 0-1: in ACI-Specific Ethernet Auto-Discovery mode, when it is carried along with an RT-2 route, the O Flag should be set to 1. In BDI-Specific Ethernet Auto-Discovery mode, when it is carried along with an RT-2 route, the O Flag should be set to 0.

When the O Flag is set to 1, this AC-ID is also called an AOI (ACI-Specific Overlay Index), and the <ESI, AOI> of that RT-2R or RT-5E should be used to determine ECMP paths. At the same time, the AOI should also be used in the same way as the Attachment Circuit ID Extended Community.

Note that only the lowest 8 bits of VLAN3 field should be used to select RT-1 per EVI routes. <lowest 8 bits of VLAN3, VLAN2, VLAN1> of a type-0 AOI forms an Ethernet Tag ID of an ACI-Specific EADR.

o Z Flag:

Must be zero. Reserved for future use. For now, the receiver should ignore this extended community if the Z flag is not zero.

o Flags:

Reserved for future use. It is set to 0 on advertising and ignored on receiving.

Note that although this extended community is actually an extension of the AC-ID extended community (as per [I-D.sajassi-bess-evpn-ac-aware-bundling]), we can assume that they may be of different Sub-Types because they behave differently.
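The encoding of Figure 10 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the Sub-Type is still TBD, so it is passed in by the caller; the names `Aoi`, `pack`, `unpack`, `to_type0` and `eadr_et_id` are hypothetical; and treating a community with F != 1 as ignored is an assumption, since this draft only states that other values are reserved.

```python
import struct
from dataclasses import dataclass, replace
from typing import Optional

EVPN_TYPE = 0x06  # Type field of EVPN extended communities


@dataclass(frozen=True)
class Aoi:
    aoi_type: int   # 4-bit Type field; only 0 and 1 are defined
    o_flag: int     # 1 when the community is used as an overlay index
    vlan1: int      # 12 bits
    vlan2: int = 0  # 12 bits; a U-Tag when aoi_type is 1
    vlan3: int = 0  # 12 bits; U-Tag2 when aoi_type is 1
    flags: int = 0  # 4 reserved bits, sent as zero


def pack(aoi: Aoi, sub_type: int) -> bytes:
    """Encode the 8-octet community with Z=0 and F=1."""
    word = (EVPN_TYPE << 56) | ((sub_type & 0xFF) << 48)
    word |= ((aoi.aoi_type & 0xF) << 44) | ((aoi.o_flag & 0x1) << 43)
    word |= (1 << 40) | ((aoi.flags & 0xF) << 36)  # Z bit (42) stays 0, F=1
    word |= ((aoi.vlan3 & 0xFFF) << 24) | ((aoi.vlan2 & 0xFFF) << 12)
    word |= aoi.vlan1 & 0xFFF
    return struct.pack("!Q", word)


def unpack(data: bytes) -> Optional[Aoi]:
    """Decode the community; return None when it should be ignored."""
    (word,) = struct.unpack("!Q", data)
    z_flag = (word >> 42) & 0x1
    f_field = (word >> 40) & 0x3
    # Ignoring on F != 1 is an assumption of this sketch.
    if (word >> 56) != EVPN_TYPE or z_flag != 0 or f_field != 1:
        return None
    return Aoi(
        aoi_type=(word >> 44) & 0xF,
        o_flag=(word >> 43) & 0x1,
        vlan1=word & 0xFFF,
        vlan2=(word >> 12) & 0xFFF,
        vlan3=(word >> 24) & 0xFFF,
        flags=0,  # the reserved Flags bits are ignored on receipt
    )


def to_type0(aoi: Aoi) -> Aoi:
    """Zero the U-Tags of a Type-1 AOI to get the matching Type-0 AOI."""
    if aoi.aoi_type == 1:
        return replace(aoi, aoi_type=0, vlan2=0, vlan3=0)
    return aoi


def eadr_et_id(aoi: Aoi) -> int:
    """ET-ID of the ACI-Specific EADR:
    <lowest 8 bits of VLAN3, VLAN2, VLAN1> of the Type-0 AOI."""
    t0 = to_type0(aoi)
    return ((t0.vlan3 & 0xFF) << 24) | (t0.vlan2 << 12) | t0.vlan1
```

Keeping the translation to Type 0 inside `eadr_et_id` means a receiver can look up RT-1 per EVI routes with the same call regardless of whether the sender included U-Tags.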

3.4. ARP/ND Synching and IP Aliasing

3.4.1. Constructing MAC/IP Advertisement Route

This draft introduces a new usage/construction of MAC/IP Advertisement route to enable Aliasing for IP addresses in L3EVI use-cases. The usage/construction of this route remains similar to that described in [I-D.sajassi-bess-evpn-ip-aliasing] with a few notable exceptions as below.

  • The Route-Distinguisher should be set to the corresponding L3EVI context.
  • The Ethernet Tag ID should be set to a value according to Table 2's IP-ETI column.
  • The ACI should be set to a value according to Table 2's IP-ACI column.

    Note that the ACI may be encapsulated as Attachment Circuit ID Extended Community or ACI Extended Community. If it is encapsulated as ACI Extended Community, the <ESI, ACI> will be used to select IP-AD/EVI routes by PE3, and the selected IP-AD/EVI routes are used to determine the aliasing paths of this RT-2 route. But if it is encapsulated as Attachment Circuit ID Extended Community, PE3 will ignore it, and the aliasing paths of this RT-2 route will be determined by <ESI, ET-ID> as per [RFC7432].

  • In EVPN IRB, the ESI SHOULD be set to the ESI of the L2 AC from which the ARP entry is snooped as per [I-D.sajassi-bess-evpn-ip-aliasing].

    In EVPN signalled L3VPN, the ESI SHOULD be set to the ESI of the VRF interface from which the ARP entry is learnt.

    Note that the <ESI,ACI> is used to install the synched ARP entries to the corresponding VRF interfaces on PE1/PE2. But on PE3, the <ESI,ACI> is used to load-balance traffic.

  • The MAC/IP Advertisement SHOULD carry one or more IP VRF Route-Target (RT) attributes.
  • In EVPN Signalled L3VPN, the MPLS Label1 should be set to the same pre-configured value for all local ARP entries. It is just used to be compatible with existing RRs.

    In EVPN IRB, the MPLS Label1 should follow [I-D.ietf-bess-evpn-inter-subnet-forwarding].

  • The MPLS Label2 should be set to the local label of the IP-VRF in MPLS or VXLAN EVPN. But it should be set to implicit-null in SRv6 EVPN. This is the same as [I-D.sajassi-bess-evpn-ip-aliasing].
  • The RMAC Extended Community attribute SHOULD be carried in VXLAN EVPN. This follows [I-D.ietf-bess-evpn-inter-subnet-forwarding].

3.4.2. Constructing IP-AD/EVI Route

The usage/construction of this route is similar to the IP-AD per EVI route described in [I-D.sajassi-bess-evpn-ip-aliasing] with a few notable exceptions as below.

  • The Ethernet Tag ID (ET-ID) should be set to a value according to Table 2's EADR-ETI column.

3.5. Constructing IP Prefix Advertisement Route

When an IP Prefix Advertisement route is advertised, it is recommended that the Ethernet Tag ID be carried along with it if it is not clear whether there will be conflicts among IP A-D per EVI routes in the future.

Note that the Ethernet Tag ID here is not used to isolate IP address spaces. It is just used to resolve its ESI overlay index to a proper IP A-D per EVI route.

The AC-ID extended community cannot be considered a substitute for the ET-ID, because the AC-ID is not a key of IP A-D per EVI routes, but the ET-ID is.

Arguably, a non-reserved Ethernet Tag ID in the RT-5 route can be assumed to be already present in [I-D.ietf-bess-evpn-prefix-advertisement]: when the BD-10 of the Bump-in-the-wire use case belongs to an EVI with a VLAN-aware bundle service interface, a non-reserved Ethernet Tag ID will be carried in the Ethernet A-D per EVI routes, hence a non-reserved Ethernet Tag ID should be carried in the IP Prefix Advertisement routes too. Otherwise those Ethernet A-D per EVI routes cannot be referred to by these IP Prefix Advertisement routes.

3.6. Scenario-Specific Procedures

3.6.1. EVPN-IRB Specific Procedures

PE1 may advertise two IP A-D per EVI routes for subinterface P1.1: one (say R1_110b) is for BD-10, the other (say R1_110) is for the IP-VRF. The Ethernet Tag ID of R1_110b is zero per [RFC7432], but the Ethernet Tag ID of R1_110 is set to the VLANs of P1.1 according to this draft.

When PE1 advertises an RT-5 route for a prefix behind BD-10, the Ethernet Tag ID of that RT-5 route is determined by the out-interface (P1.1) of the MAC of that prefix's overlay nexthop (10.0.0.2).

Note that R1_110b will not be imported into the IP-VRF.

PE1 may advertise two RT-2 routes for N1's MAC/IP: one (say R2_N1b) is for BD-10, the other (say R2_N1) is for the IP-VRF. The Ethernet Tag ID of R2_N1b is zero per [RFC7432], but the Ethernet Tag ID of R2_N1 is set to the VLANs of P1.1 according to this draft.

The MAC-VRFs and IP-VRFs in this solution will each have their own copy of the EVPN routes. This can be improved using the mechanisms of Section 3.6.2 if interoperation between the VLAN-based service interface and the VLAN-aware service interface per [I-D.ietf-bess-evpn-modes-interop] is provisioned in this network.

3.6.2. On Separated Risk VLAN-bundle of the same BD

PE1 will advertise different Ethernet A-D per EVI routes for P1.1 and P1.2. Their Ethernet Tag IDs will be the VLANs of the corresponding AC (P1.1 or P1.2).

Note that the MAC/IPs on P1.1 and P1.2 will be advertised along with such ET-IDs too.

When PE3 receives such Ethernet A-D per EVI routes and RT-2 routes, it SHOULD process them following Section 3.1.2 of [I-D.ietf-bess-evpn-modes-interop]. As a result, although PE3 works in VLAN-based or VLAN-bundle Service Interface, such MAC/IPs will be installed in BD-100 and they will be resolved to those Ethernet A-D per EVI routes with the help of such ET-IDs.

3.6.3. On Multiple VLAN-based ACs of the same BD

PE1 will advertise different Ethernet A-D per EVI routes for P1.1 and P1.2. Their Ethernet Tag IDs will be the VLANs (10 or 20) of the corresponding AC (P1.1 or P1.2), not the normalized ET-ID (100).

Note that the MAC/IPs on P1.1 and P1.2 will not be advertised along with such ET-IDs. They will be advertised along with the normalized ET-ID and the corresponding AC-ID extended community per [I-D.sajassi-bess-evpn-ac-aware-bundling].

When PE3 receives these two Ethernet A-D per EVI routes, it installs them separately. When PE3 receives the RT-2 route for N1's address, that route carries the AC-ID 1 of P1.1, and PE3 will resolve it to the Ethernet A-D per EVI routes (in all-active mode) whose ET-ID is 1 (not the normalized ET-ID 100). When PE3 receives the RT-2 route for N2's address, that route carries the AC-ID 2 of P1.2, and PE3 will resolve it to the Ethernet A-D per EVI routes (in all-active mode) whose ET-ID is 2 (not the normalized ET-ID 100).

The ET-ID fields of Ethernet A-D per EVI routes follow Table 1's AC-ETI column. The ET-ID fields of RT-2 routes follow Table 1's BD-ID column. The AOI/AC-ID extended community follows Table 1's ACI-T column.

In this case, the ACI-specific EADRs can be used for AC-influenced DF election. Each VLAN of that ES may have an individual DF-election result.

3.6.4. Bump-in-the-wire Specific Procedures

PE1's RT-5E routes (for the CE-prefixes behind each BD) should be advertised in their BD's context, and they should not be imported into the IP-VRF directly by Route-Targets, otherwise it will be difficult to find the exact IRB interface for them. When an RT-5E route is imported into BD-10 (or BD-20) by DGW1, it will then be imported into that IP-VRF through the IRB1 (or IRB2) interface. The MAC of the IRB1 (or IRB2) interface will be the source MAC of the data packets that DGW1 sends to PE1/PE2 following that RT-5E, and the RT-1 per EVI routes for that RT-5E route will be resolved in the context of BD-10 (or BD-20), not in the IP-VRF context.

The advertisement of Ethernet A-D per EVI routes and RT-2 routes is similar to that in Section 3.6.2.

The RT-5E routes (for the CE-prefixes behind each BD) should be advertised following Table 2. Note that the IP-ETI, EADR-ETI and IP-ACI should be determined by the outgoing AC per each CE-prefix's VA MAC. The IP-ETI is set to the BD-ID of that outgoing AC's BD. When that <AC, BD> uses ACI-Specific EAD mode, the IP-ACI is the AOI extended community for that <AC, BD>, and the EADR-ETI has the same value as the IP-ACI. When that <AC, BD> uses BDI-Specific EAD mode, the IP-ACI is absent, and the EADR-ETI has the same value as the IP-ETI.

Note that in Bump-in-the-wire use cases, the EVPN label that is encapsulated by DGW1 for NVE2 or NVE3 will be a label that identifies an L2 EVI. So when the BD is an ETI-Specific BD, the IP-ETI MUST be encapsulated into the Ethernet header of the data packets. Otherwise such data packets will not be received by that BD.

Note that in Bump-in-the-wire use cases, even if the BD is an MPLS EVPN BD, PE3 should send data packets to NVE2/NVE3 along with the overlay Ethernet header, because the Bump-in-the-wire use case is actually a special EVPN IRB use case. Otherwise NVE2/NVE3 cannot decapsulate the data packets properly.
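The way this section derives the IP-ETI, EADR-ETI and IP-ACI of an RT-5E route from its outgoing <AC, BD> can be sketched as below. The function name and argument names are hypothetical, and the AOI is represented by a plain integer for brevity.

```python
from typing import Optional, Tuple


def rt5e_fields(bd_id: int, aci_specific: bool,
                aoi: Optional[int] = None
                ) -> Tuple[int, int, Optional[int]]:
    """Return (IP-ETI, EADR-ETI, IP-ACI) for one CE-prefix.

    bd_id is the BD-ID of the outgoing AC's BD; aoi is the AOI value of
    the outgoing <AC, BD> and is needed only in ACI-Specific EAD mode.
    """
    ip_eti = bd_id  # the IP-ETI is always the BD-ID
    if aci_specific:
        if aoi is None:
            raise ValueError("ACI-Specific EAD mode requires an AOI")
        # The EADR-ETI takes the same value as the IP-ACI.
        return ip_eti, aoi, aoi
    # BDI-Specific EAD mode: no IP-ACI, and the EADR-ETI equals the IP-ETI.
    return ip_eti, ip_eti, None
```

For example, a prefix behind BD-ID 10 yields `(10, 10, None)` in BDI-Specific EAD mode and `(10, 21, 21)` in ACI-Specific EAD mode when the AOI of the outgoing AC is 21.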

3.6.5. ARP/ND Probing Specific Procedures

When PE1 synchs the ARP/ND entry of H21a to PE2, the AOI extended community is carried along with the RT-2 route for H21a. The AOI extended community is constructed in the following format:

The Type-1 AOI is used. The VLAN1 field is the VLAN ID of subinterface P1.1 (see Figure 9), and the VLAN2 field is the C-VLAN (that is, C-VLAN 100) against which that ARP/ND entry is learnt on P1.1.

When PE2 receives the RT-2 route of H21a, it does not use the VLAN2 field to install the MAC/ARP/ND entry, because VLAN2 is a U-Tag, which would not have been configured on subinterface P2.1. PE2 uses the VLAN1 field to install the MAC/ARP/ND entry on P2.1, but VLAN2 is also recorded in the ARP/ND entry as a U-Tag.

When PE2 decides to trigger ARP/ND Probing for H21a, the ARP/ND request should be sent over P2.1 with VLAN2 as its inner VLAN-tag (the outer VLAN-tag will be VLAN1).

Note that it is not necessary for a U-Tag to be recorded in any MAC entries.
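The receiver-side behaviour on PE2 can be illustrated with a small Python sketch. The names `ArpEntry`, `install_synced_entry` and `probe_vlan_stack` are hypothetical, as are the example addresses; the value 21 for the subinterface VLAN is taken from the S-VLAN shown in Figure 9 as an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArpEntry:
    ip: str
    mac: str
    subif_vlan: int              # VLAN1: selects the local dot1q subinterface
    u_tag: Optional[int] = None  # VLAN2 of the Type-1 AOI, kept for probing


def install_synced_entry(ip: str, mac: str, vlan1: int, vlan2: int) -> ArpEntry:
    """Install a synched ARP/ND entry from the fields of a Type-1 AOI.

    Only VLAN1 is used to find the local subinterface; VLAN2 is a U-Tag
    and is merely recorded in the entry.
    """
    return ArpEntry(ip=ip, mac=mac, subif_vlan=vlan1,
                    u_tag=vlan2 if vlan2 else None)


def probe_vlan_stack(entry: ArpEntry) -> List[int]:
    """VLAN tags of the ARP/ND probe, outermost tag first."""
    stack = [entry.subif_vlan]
    if entry.u_tag is not None:
        # The recorded U-Tag becomes the inner VLAN-tag of the request.
        stack.append(entry.u_tag)
    return stack


entry = install_synced_entry("192.0.2.21", "00:00:5e:00:53:21", 21, 100)
probe_tags = probe_vlan_stack(entry)
```

Here `probe_tags` is `[21, 100]`: the probe leaves P2.1 with outer tag 21 and inner tag 100, which is the C-VLAN that H21a expects.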

4. IANA Considerations

A new transitive extended community Type of 0x06 and Sub-Type of TBD for EVPN Supplementary Overlay Index Extended Community needs to be allocated by IANA.

5. Security Considerations

TBD.

6. References

6.1. Normative References

[I-D.ietf-bess-evpn-modes-interop]
Krattiger, L., Sajassi, A., Thoria, S., Rabadan, J., and J. Drake, "EVPN Interoperability Modes", Work in Progress, Internet-Draft, draft-ietf-bess-evpn-modes-interop-00, <https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-modes-interop-00>.
[I-D.ietf-bess-srv6-services]
Dawra, G., Filsfils, C., Talaulikar, K., Raszuk, R., Decraene, B., Zhuang, S., and J. Rabadan, "SRv6 BGP based Overlay Services", Work in Progress, Internet-Draft, draft-ietf-bess-srv6-services-07, <https://datatracker.ietf.org/doc/html/draft-ietf-bess-srv6-services-07>.
[I-D.sajassi-bess-evpn-ip-aliasing]
Sajassi, A., Badoni, G., Warade, P., Pasupula, S., Drake, J., and J. Rabadan, "EVPN Support for L3 Fast Convergence and Aliasing/Backup Path", Work in Progress, Internet-Draft, draft-sajassi-bess-evpn-ip-aliasing-02, <https://datatracker.ietf.org/doc/html/draft-sajassi-bess-evpn-ip-aliasing-02>.
[I-D.sajassi-bess-evpn-ac-aware-bundling]
Sajassi, A., Brissette, P., Mishra, M. P., Thoria, S., Rabadan, J., and J. Drake, "AC-Aware Bundling Service Interface in EVPN", Work in Progress, Internet-Draft, draft-sajassi-bess-evpn-ac-aware-bundling-04, <https://datatracker.ietf.org/doc/html/draft-sajassi-bess-evpn-ac-aware-bundling-04>.
[I-D.ietf-bess-evpn-prefix-advertisement]
Rabadan, J., Henderickx, W., Drake, J., Lin, W., and A. Sajassi, "IP Prefix Advertisement in EVPN", Work in Progress, Internet-Draft, draft-ietf-bess-evpn-prefix-advertisement-11, <https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-prefix-advertisement-11>.
[I-D.ietf-bess-evpn-inter-subnet-forwarding]
Sajassi, A., Salam, S., Thoria, S., Drake, J., and J. Rabadan, "Integrated Routing and Bridging in EVPN", Work in Progress, Internet-Draft, draft-ietf-bess-evpn-inter-subnet-forwarding-15, <https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-inter-subnet-forwarding-15>.
[RFC7432]
Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, <https://www.rfc-editor.org/info/rfc7432>.
[RFC8365]
Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R., Uttaro, J., and W. Henderickx, "A Network Virtualization Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365, DOI 10.17487/RFC8365, <https://www.rfc-editor.org/info/rfc8365>.

6.2. Informative References

[I-D.wang-bess-evpn-arp-nd-synch-without-irb]
Wang, Y. and Z. Zhang, "ARP/ND Synching And IP Aliasing without IRB", Work in Progress, Internet-Draft, draft-wang-bess-evpn-arp-nd-synch-without-irb-07, <https://datatracker.ietf.org/doc/html/draft-wang-bess-evpn-arp-nd-synch-without-irb-07>.
[I-D.wz-bess-evpn-vpws-as-vrf-ac]
Wang, Y. and Z. Zhang, "EVPN VPWS as VRF Attachment Circuit", Work in Progress, Internet-Draft, draft-wz-bess-evpn-vpws-as-vrf-ac-01, <https://datatracker.ietf.org/doc/html/draft-wz-bess-evpn-vpws-as-vrf-ac-01>.

Appendix A. Explanation for Physical Links of the Use-cases

There are three PEs, two L2NEs (Layer 2 Network Elements) and five L3NEs (Layer 3 Network Elements) in the above network. The PEs are PE1, PE2 and PE3. The L2NEs are L2NE1 and L2NE2. The L3NEs are N1/N2/N3/N4/N5. They are all illustrated in Figure 11.

There are 9 physical links among these 10 physical devices as illustrated in Figure 11. These physical links are called PLi (i=1,2...8). The two physical ports of the same physical link PLi are both called Pi (i=1,2...8).

As illustrated in Figure 11, some of these physical ports may have subinterfaces. When a subinterface's VLAN ID is j and it is a subinterface of physical port Pi, that subinterface is called Pi.j. For example, P1.2 is a subinterface of physical port P1 and its VLAN ID is 2.

There are three NIs (Network Instances) among PE1, PE2 and PE3. They are VPNx, VPNy and NIz. Two subinterfaces are attached to VPNx: P1.1 and P2.1. Two other subinterfaces are attached to VPNy: P1.2 and P2.2. N3 is also attached to VPNx, while N5 is also attached to VPNy.

There are two EVCs (Ethernet Virtual Connections) between L2NE1 and L2NE2: EVC1 and EVC2. L2NE1's EVC1 instance (illustrated as the "O" on L2NE1) has three member interfaces: P4, P1.1 and P3.1, where P3.1 and P1.1 are in the same protection-group. L2NE2's EVC1 instance has two member interfaces: P3.1 and P2.1. L2NE2's EVC2 instance (illustrated as the "O" on L2NE2) has three member interfaces: P5, P2.2 and P3.2, where P3.2 and P2.2 are in the same protection-group. L2NE1's EVC2 instance has two member interfaces: P3.2 and P1.2. L2NE2's EVC1 instance and L2NE1's EVC2 instance are both CCC (Circuit Cross Connection) local connections.

VPNx and VPNy are associated to NIz on each PE.

A.1. Failure Detections for P1.2 (or P2.1)

There is a CFM session CFM1 between P1.2 of PE1 and L2NE2's P3.2. When physical port P3 fails, CFM1 will go down. There is a CFM session CFM2 between P2.1 of PE2 and L2NE1's P3.1. When physical port P3 fails, CFM2 will go down.

A.2. Protection Approaches for N1 (or N2)

A.2.1. CCC-Approaches

The L2NE1's EVC2 instance and L2NE2's EVC1 instance are both CCC local connections too. In L2NE1's EVC1 instance, P1.1 and P3.1 are of the same protection-group PG1. In L2NE2's EVC2 instance, P2.2 and P3.2 are of the same protection-group PG2. In PG1, both P1.1 and P3.1 will receive data packets. In PG2, both P2.2 and P3.2 will receive data packets.

A.2.1.1. CCC Active-Active Protection

L2NE1 (or L2NE2) will load-balance N1's (N2's) data packets between P1.1 and P3.1 (or P2.2 and P3.2).

A.2.1.2. CCC Active-Standby Protection

In PG1, P1.1 is the active path, P3.1 is the backup path. In PG2, P2.2 is the active path, P3.2 is the backup path.

That is to say, L2NE1 (or L2NE2) will not send N1's (or N2's) data packets over P3.1 (or P3.2) unless P1.1 (or P2.2) or P1 (or P2) has already failed.

A.2.2. VSI-Approaches

L2NE1's EVC2 instance and L2NE2's EVC1 instance are both VSI instances in this case. P1.1, P3.1, P2.2 and P3.2 are all individual ACs in these VSIs.

Note that L2NE2's EVC1 instance and L2NE1's EVC2 instance are still both CCC local connections in this case, and that there are no PG1, PG2 or PWs in this case.

Author's Address

Yubao Wang
ZTE Corporation
No.68 of Zijinghua Road, Yuhuatai District
Nanjing
China