Individual Submission | A. Lim, Ed. |
Internet-Draft | S. Blake, Ed. |
Intended status: Informational | Extreme Networks |
Expires: January 07, 2012 | S. Shah |
Ericsson | |
July 06, 2011 |
Extreme Networks' Ethernet Automatic Protection Switching (EAPS), Version 1.3
draft-shah-extreme-rfc3619bis-02
This document describes version 1.3 of the Ethernet Automatic Protection Switching (EAPS) (TM) technology invented by Extreme Networks to increase the availability and robustness of Ethernet rings. An Ethernet ring built using EAPS can have resilience comparable to that provided by SONET rings, at lower cost and without some of the constraints (e.g. ring size) of SONET.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 07, 2012.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This document may not be modified, and derivative works of it may not be created, except to format it for publication as an RFC or to translate it into languages other than English.
Many Metropolitan Area Networks (MANs) and some Local Area Networks (LANs) have a ring topology, as the fiber runs. The Ethernet Automatic Protection Switching technology described here works well in ring topologies for MANs or LANs.
Also, most MAN operators want to minimize the recovery time in the event a fiber cut occurs. The Spanning Tree Protocol STP [IEEE802.1D-1998] can take as long as 40 seconds to converge in the event of a topology change. The newer Rapid Spanning Tree Protocol RSTP [IEEE802.1D-2004] is considerably faster. However, its convergence time is still dependent upon the number of switching nodes in the ring. Both STP and RSTP limit the number of switches in the ring. The Ethernet Automatic Protection Switching (EAPS) technology described here converges in less than one second, often in less than 100 milliseconds. EAPS technology does not limit the number of switches in the ring, and the convergence time is independent of the number of switches.
EAPS version 1 is specified in [RFC3619].
An EAPS Domain exists on a single Ethernet ring. Any Ethernet Virtual Local Area Network (VLAN) that is to be protected is configured on all ports in the ring for the given EAPS Domain. Each EAPS Domain has a single designated "Master node". Each other switch on that ring is referred to as a "Transit node".
Of course, each switch on the ring will have 2 ports connected to the ring. One port of the Master node is designated to be the "primary port" to the ring for the Master node. The other port is designated as the "secondary port".
In normal operation, the Master node blocks the secondary port for all non-control Ethernet frames belonging to the given EAPS Domain, thereby avoiding a loop in the ring. Existing Ethernet switching and learning mechanisms operate per existing standards on this ring. This is possible because the Master node makes the ring appear not to have a loop, from the perspective of the Ethernet standard algorithms used for switching and learning. If the Master node detects a ring fault, it unblocks its secondary port and allows Ethernet data frames to pass through that port. Each EAPS Domain uses a special Control VLAN which only carries EAPS control PDUs. This Control VLAN must not be blocked, so EAPS control PDUs must be consumed by the Master node and NOT forwarded.
EAPS uses both a polling mechanism, described in detail below, and an alert mechanism, also described below, to verify the connectivity of the ring and to quickly detect any faults.
EAPS frames are encoded using the Extreme Networks' Encapsulation Protocol. The EAPS frame format is defined in Section 5. All EAPS frames use a source MAC address of 00-E0-2B-00-00-01 (assigned out of an Extreme Networks OUI). All EAPS frames use a destination MAC address of 00-E0-2B-00-00-04 (with the exception of FLUSH-FDB-PDU, described in Section 4).
When any Transit node detects a link-down on any of its ports in the EAPS Domain, that Transit node immediately sends a "link down" control frame (LINK-DOWN-PDU) on the Control VLAN to the Master node.
When the Master node receives this "link down" control frame, the Master node moves from the "normal" state (COMPLETE) to the ring-fault state (FAILED) and unblocks its secondary port. The Master node also flushes its bridging table. The Master node also sends a control frame (RING-DOWN-FLUSH-FDB) to all other ring switches instructing them to flush their bridging tables. Immediately after flushing its bridging table, each switch starts learning the new topology.
The Master node sends a health-check frame (HEALTH-CHECK-PDU) out of its primary port on the Control VLAN at a user-configurable interval. If the ring is complete, this will be received on its secondary port. Upon receipt of the HEALTH-CHECK-PDU, the Master node resets its Fail-period timer and continues normal operation.
If the Master node does not receive the HEALTH-CHECK-PDU before the Fail-period timer expires, the Master node will perform one of two operations based on the Fail-timer-expiry action configured. If the Fail-timer-expiry action is set to "Send-Alert", the Master node will send a QUERY-LINK-STATUS-PDU out of both ring ports to determine whether any Transit nodes have link failures. Transit nodes with a link failure will reply back to the Master node with a LINK-DOWN-PDU, which will cause the Master node to move from normal state to the "ring-fault" state (FAILED) and unblock its secondary port. If the Fail-timer-expiry action is set to "Open Secondary Port" the Master will NOT send a QUERY-LINK-STATUS-PDU, but will move directly from normal state to the "ring-fault" state (FAILED) and unblock its secondary port. When entering FAILED state, the Master node will flush its bridging table. The Master node will also send a control frame (RING-DOWN-FLUSH-FDB) to all other switches, instructing them to also flush their bridging tables. Immediately after flushing its bridge table, each switch will start learning the new topology. This ring polling mechanism provides a backup in the event the Link Down Alert frame (LINK-DOWN-PDU) should get lost for some unforeseen reason.
The Master node continues sending periodic HEALTH-CHECK-PDUs out its primary port even when operating in the ring-fault (FAILED) state. Once the ring is restored, the very next HEALTH-CHECK-PDU sent will be received on the Master node's secondary port. This will cause the Master node to transition back to the normal (COMPLETE) state, logically block non-control frames on the secondary port, flush its own bridge table, and send a control frame (RING-UP-FLUSH-FDB-PDU) to the Transit nodes instructing them to flush their bridging tables and re-learn the topology.
During the time between the Transit node detecting that its link is restored and the Master node detecting that the ring is restored, the secondary port of the Master node is still open -- creating the possibility of a temporary loop in the topology. To prevent any temporary loop, the Transit node will put all the protected VLANs transiting the newly restored port into a temporary blocked state, remember which port has been temporarily blocked, and transition into the "PREFORWARDING" state. When the Transit node in the PREFORWARDING state receives a RING-UP-FLUSH-FDB-PDU instructing it to flush its bridging table, it will flush the bridging table, unblock the previously blocked protected VLANs on the newly restored port, and transition to the "normal" (LINKS-UP) state.
One of the biggest drawbacks of using the ring-polling mechanism in detecting failures is when the EAPS HEALTH-CHECK-PDUs do not return to the Master node, even though the ring itself is complete. This could happen due to a number of reasons such as the Control VLAN not being configured correctly on all switches in the ring; or bad hardware dropping control PDUs; or too much traffic on the ring causing control PDUs to get dropped or delayed; or the Master node's CPU being too busy, and not getting a chance to process a HEALTH-CHECK-PDU thereby causing its Fail-period timer to expire.
When the EAPS Master node enters into FAILED state due to its Fail-period timer expiring, and unblocks its secondary port, it may inadvertently cause a loop in the network if the ring is actually complete.
The EAPS Master node can be configured to take one of two actions when its Fail-period timer expires:
To handle the situation where a LINK-DOWN-PDU may been missed or dropped, a new PDU type has been introduced - QUERY-LINK-STATUS-PDU. When the Master node's Fail-period timer expires while being configured for send-alert, it's Failed flag gets set. The Master node also sends the QUERY-LINK-STATUS-PDU out both its ring-ports. If any Transit node in the ring has one of its links down, it will respond with its regular LINK-DOWN-PDU. This way, if there is a legitimate failure in the ring, the Master node will get a chance to learn about it and transition to the regular FAILED state and unblock its secondary-port.
A couple of enhancements have been added to the EAPS protocol since [RFC3619] to help in trouble-shooting an EAPS network.
On a Transit node, when one ring-port is already up, and the other one transitions from down to up, this Transit node's state will change from LINK-DOWN to PREFORWARDING. This is a state to prevent a temporary loop in the network.
Without this state, a Transit node which already had one port up, and had the other one coming up would have transitioned from LINK-DOWN state to LINKS-UP state where both ports are forwarding. At that moment the Master node is still in FAILED state where both of its ports are forwarding. We would have a temporary loop in the network until the Master node detected the ring is complete and blocked its secondary port.
When a port comes up on the Transit node, it checks the status of its other ring-port. If the other port is also up, it enters into PREFORWARDING state, and keeps the port that just came up in blocking state, while starting its Preforwarding timer. It now waits to receive the RING-UP-FLUSH-FDB-PDU from the Master node. The Master node sends this PDU when it enters into COMPLETE state and has blocked its secondary-port.
When the Transit node sees the RING-UP-FLUSH-FDB-PDU message, it knows that the ring has been blocked by the Master node, so it transitions from PREFORWARDING state into the LINKS-UP state, enables forwarding on its previously blocked port, and flushes its FDB. The Transit node also stops the Preforwarding timer.
The role of the Preforwarding timer is to deal with a lost RING-UP-FLUSH-FDB-PDU. It is also used in the case of another break in the ring, in which case the Transit node would not receive a RING-UP-FLUSH-FDB-PDU from the Master node. Without a Preforwarding timer, the Transit node would remain in PREFORWARDING state with its port in blocked state indefinitely, thereby causing a disconnected network.
The value of the Preforwarding timer is derived from the HEALTH-CHECK-PDU sent by the Master node. The Transit node looks up the hello-interval field in the PDU, then multiplies this value by 3, and adds 3 to it.
An EAPS-enabled switch can be part of more than one ring. Hence, an EAPS-enabled switch can belong to more than one EAPS Domain at the same time. Each EAPS Domain on a switch requires a separate instance of the EAPS protocol on that same switch, one instance per EAPS-protected ring.
One can also have more than one EAPS Domain running on the same ring at the same time. Each EAPS Domain has its own different Master node and each EAPS Domain has its own set of protected VLANs. This facilitates spatial reuse of the ring's bandwidth.
EAPS Shared-Ports is the technology used with EAPS to break a loop when multiple EAPS Domains have a common-link between them.
Without EAPS Shared-Ports there would be a loop in the network if the common-link between them went down, and there were VLANs spanning two or more domains.
EAPS Shared-Ports is a proprietary protocol run in addition to EAPS on those switches that have one end of the common-link between the domains. One switch is configured to be the CONTROLLER, which does the blocking when the common-link goes down. Its peer is running on the other end of the common-link, and is configured to be the PARTNER. Both switches work together to prevent a loop in the network when the common-link between them goes down.
Describing the operation of EAPS Shared-Ports is beyond the scope of this document. Here, we will describe the additional changes that have to be made to an EAPS switch which is participating in such a network, but not running the EAPS Shared-Ports protocol itself. In fact, it does not even have to know that there is EAPS Shared-Ports configured on other switches in its network.
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Destination MAC address (6 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | Source MAC address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | (6 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | EtherType (2 bytes) | Pri | VLAN Id | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Frame length [0x005C] | DSAP [0xAA] | SSAP [0xAA] | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Control [0x03]| OUI [0x00] | OUI [0xE0] | OUI [0x2B] | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type [0x00BB] |EEP Ver [0x01] |EEP Resv[0x00] | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | EEP Len [0x0054] | EEP Checksum (2 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | EEP Sequence Num (2 bytes) | Device Id | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | (8 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | Marker [0x99]| EEP Type[0x0B] | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | EAPS length [0x0040] |EAPS Ver [0x01]| EAPS PDU Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |EAPS Control VLAN Id (2 bytes) | EAPS Reserved [0x0000] | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | [0x0000] | System MAC Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | (6 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | EAPS Hello Timer (2 bytes) | EAPS Fail Timer (2 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |EAPS State | [0x00] | EAPS Sequence Num (2 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | EAPS Reserved (38 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | Marker [0x99] | EEP Type[0x00]| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | EEP NULL TLV Len [0x04] | Ethernet Frame... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ...Checksum (4 bytes) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Where: Dest MAC (6 bytes) = 0x00 E0 2B 00 00 04 Source MAC (6 bytes) = 0x00 E0 2B 00 00 01 EtherType (2 bytes) = 0x81 00 (for IEEE 802.1Q tagged packets) Pri (4 bits) = 3 bits Priority + 1 bit reserved VLAN Id (12 bits) = VLAN Id for Control VLAN in use Frame Len (2 bytes) = 0x005C (Ethernet frame data length) DSAP (1 byte) = 0xAA SSAP (1 byte) = 0xAA Control (1 byte) = 0x03 OUI (3 bytes) = 0x00 E0 2B (Extreme Networks OUI) Type (2 bytes) = 0x00 BB EEP Ver (1 byte) = 0x01 (Extreme's Encapsulation Protocol version) EEP Resv (1 byte) = 0x00 EEP Len (2 bytes) = 0x0054 (Length of EEP data + EEP header) EEP Checksum (2 bytes) = Calculated checksum. (Described below) EEP Seq Num (2 bytes) = The first EEP packet has a value of 1. This value is incremented by 1 for each subsequent EEP packet sent out. (This field is only used for debugging purposes.) Device Id (8 bytes) = System MAC. The 2 MSBs of Device Id set to 0. Marker (1 byte) = 0x99 (EEP's Start of a new TLV marker. This is the beginning of the EAPS TLV) EEP Type (1 byte) = 0x0B (EAPS PDU TLV) EAPS Length (2 bytes) = 0x0040 (Length of EAPS TLV including header) EAPS Ver (1 byte) = 0x01 (EAPS Ver 1) EAPS PDU Type (1 byte) = Identifies the type of EAPS PDU (Values given below) EAPS VLAN Id (2 bytes) = VLAN Id for Control VLAN being used to send and receive EAPS PDUs EAPS Reserved (4 bytes) = 0x00 00 00 00 System MAC (6 bytes) = System MAC of node issuing the EAPS packet EAPS Hello (2 bytes) = 0x04. Even though this is meant to convey the EAPS Hello Interval, it is hard-coded to 4. This is so that the Transits can derive their preforwarding interval to be 15 seconds. EAPS Fail (2 bytes) = EAPS Fail Timer interval set by Master EAPS State (1 byte) = EAPS node's state (Values given below) EAPS Reserved (1 byte) = 0x00 EAPS Seq Num (2 bytes) = For debug, sequence number of Health-PDUs in ascending order EAPS Reserved (38 bytes) = All bytes are 0 for now Marker (1 byte) = 0x99 (EEP's Start of a new TLV marker. This is the beginning of the NULL TLV) EEP Type (1 byte) = 0x00 (EEP NULL TLV) NULL TLV Len (2 bytes) = 0x04 (including header) Checksum (4 bytes) = Ethernet Frame's checksum
EAPS PDU Type values: HEALTH-CHECK-PDU = 0x05 RING-UP-FLUSH-FDB-PDU = 0x06 RING-DOWN-FLUSH-FDB-PDU = 0x07 LINK-DOWN-PDU = 0x08 FLUSH-FDB-PDU = 0x0D QUERY-LINK-STATUS-PDU = 0x0F LINK-UP-PDU = 0x10 All other values are reserved
EAPS State values: IDLE = 0x00 (EAPS Domain (Master/Transit) still not running) COMPLETE = 0x01 (Master in Complete State) FAILED = 0x02 (Master in Failed State) LINKS-UP = 0x03 (Transit in Links-Up State. Both ring-ports are up) LINK-DOWN = 0x04 (Transit in Link-Down State. One or both ring-ports are down) PREFORWARDING = 0x05 (Transit in Preforwarding State) INIT = 0x06 (Master in Init State) All other values are reserved
EEP Checksum: This is 16 bits wide and is calculated as follows: o First set "EEP Checksum" field to 0. o Calculate the checksum using the algorithm below starting from "EEP Ver" field through "EEP NULL TLV Len" field. o This size is the same value that is set in "EEP Len" field. // Algorithm for EEP checksum calculation int checksum (uint16_t *addr, int len) { int sum = 0; o Using a 32 bit accumulator 'sum' o In a while-loop, go on adding sequential 16 bit words from 'addr' to accumulator 'sum' o If there is an odd byte left at the end, add it to accumulator 'sum' o Add high 16 bits of 'sum' to low 16 bits of 'sum' o If there is a carry bit, add it back to 'sum' o Truncate to 16 bits, and return this as a 16 bit word. }
This memo includes no request to IANA.
Anyone with physical access to the physical layer connections could forge any sort of Ethernet frame they wished, including but not limited to Bridge frames or EAPS frames. Such forgeries could be used to disrupt an Ethernet network in various ways, including methods that are specific to EAPS or other unrelated methods such as forged Ethernet bridge frames.
As such, it is recommended that users not deploy Ethernet without some form of encryption in environments where such active attacks are considered a significant operational risk. IEEE standards already exist for link-layer encryption [IEEE802.1AE-2006]. Those IEEE standards could be used to protect an Ethernet's links. Alternately, upper-layer security mechanisms could be used if more appropriate to the local threat model.
The IETF has been notified of intellectual property rights claimed in regard to some or all of the specification contained in this document. For more information, consult the online list of claimed rights.
[RFC3619] was edited and put into RFC format by R.J. Atkinson from internal documents created by the authors of that document. This version of the EAPS specification is derived from [RFC3619]. Arnel Lim and Steven Blake edited this document based on a draft prepared by Sunil Shah.
This document was produced using the xml2rfc tool [RFC2629].
[IEEE802.1D-1998] | IEEE LAN/MAN Standards Committee, "IEEE 802.1D Standard for Local and Metropolitan Area Networks: Media Access Control (MAC) Bridges", 1998. |
[IEEE802.1D-2004] | IEEE LAN/MAN Standards Committee, "IEEE 802.1D Standard for Local and Metropolitan Area Networks: Media Access Control (MAC) Bridges", 2004. |
[RFC3619] | Shah, S. and M. Yip, "Extreme Networks' Ethernet Automatic Protection Switching (EAPS) Version 1", RFC 3619, October 2003. |
[IEEE802.1AE-2006] | IEEE LAN/MAN Standards Committee, "IEEE 802.1AE Standard for Local and Metropolitan Area Networks: Media Access Control (MAC) Security", 2006. |
[RFC2629] | Rose, M.T., "Writing I-Ds and RFCs using XML", RFC 2629, June 1999. |