Network Working Group | E. Osborne |
Internet-Draft | Cisco |
Intended status: Standards Track | F. Zhang |
Expires: January 01, 2012 | ZTE |
Y. Weingarten | |
Nokia Siemens Networks | |
June 30, 2011 |
MPLS-TP 1toN Protection
draft-ezy-mpls-1ton-protection-00.txt
As part of the Transport Profile for Multiprotocol Label Switching (MPLS-TP) there is a requirement to support 1:n linear protection for transport paths. This requirement is elaborated on in the MPLS-TP Survivability Framework document [SurvivFwk]. The basic protocol for linear protection was specified in the MPLS-TP Linear Protection document [LinProt] but is limited to 1+1 and 1:1 protection. This document extends the protocol defined there to add the functionality needed when a single preconfigured protection path protects multiple transport paths between the same two endpoints.
This document is a product of a joint Internet Engineering Task Force (IETF) / International Telecommunications Union Telecommunications Standardization Sector (ITU-T) effort to include an MPLS Transport Profile within the IETF MPLS and PWE3 architectures to support the capabilities and functionalities of a packet transport network as defined by the ITU-T.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 01, 2012.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.
The MPLS Transport Profile (MPLS-TP) Requirements document [TPReq] includes requirements for the survivability tools needed in MPLS-based transport networks. Network survivability is the ability of a network to recover traffic delivery following the failure or degradation of network resources. Requirement 67 lists various types of 1:n protection architectures that are required for MPLS-TP. The MPLS-TP Survivability Framework [SurvivFwk] is a framework for survivability in MPLS-TP networks, and describes recovery elements, types, methods, and topological considerations, focusing on mechanisms for recovering MPLS-TP Label Switched Paths (LSPs).
Linear protection in mesh networks – networks with arbitrary interconnectivity between nodes – is described in Section 4.7 of [SurvivFwk]. Linear protection provides rapid and simple protection switching. In a mesh network, linear protection provides a very suitable protection mechanism because it can operate between any pair of points within the network. It can protect against a defect in an intermediate node, a span, a transport path segment, or an end-to-end transport path.
[LinProt] defines a Protection State Coordination (PSC) protocol that supports the different 1+1 and 1:1 architectures described in [SurvivFwk]. The PSC protocol is a single-phased protocol that allows the two endpoints of the protection domain to coordinate the protection switching operation when a switching condition is detected on the transport paths of the protection domain.
This document extends the PSC protocol to allow it to support a protection domain that includes multiple working transport paths that are protected by a single protection transport path. The protection transport path is pre-allocated with resources to transport the traffic normally carried by any one of the working transport paths. This is the architecture described in [SurvivFwk] as 1:n protection, and is the generalization of the 1:1 protection architecture already supported by PSC.
Linear protection switching is a fully allocated survivability mechanism. It is fully allocated in the sense that the route and bandwidth of the protection path is reserved for a set of working paths. For 1:n protection the protection path is allocated to protect any one of n working paths between the two endpoints of the protection domain.
   +-----+                             +-----+
   |     |=============================|     |
   |LER-A|     Working Path #1         |LER-Z|
   |     |                             |     |
   |     |=============================|     |
   |     |     Working Path #2         |     |
   |     |                             |     |
   |     |=============================|     |
   |     |     Working Path #3         |     |
   |     |                             |     |
   |     |            ooo              |     |
   |     |                             |     |
   |     |=============================|     |
   |     |     Working Path #N         |     |
   |     |                             |     |
   |     |     Protection Path         |     |
   |     |*****************************|     |
   |     |                             |     |
   +-----+                             +-----+

       |--------Protection Domain--------|
Figure 1 shows a protection domain with N working transport paths and a single protection path. In 1:n protection, it is assumed (as mentioned above) that the protection path may transport the traffic of only a single working path at any particular time. The identity of the working path that is being protected must be communicated between the two endpoints.
The different working paths may be disjoint at the intermediary points on the path between LER-A and LER-Z and may also have different resource requirements. In addition, each of the working paths may be assigned a priority that could be used to decide which working path would be protected in cases of conflict (see more on this topic in Section 1.3). It is usually advised to arrange these protection groups in a way that would minimize any potential conflict situation.
As the 1:n architecture requires the ability for one working path to preempt the traffic of another in the event of multiple failures (see Section 1.3), there must be an indication of priority between the different working paths so that an implementation can decide whether a new failure should be allowed to preempt a protection switch already in place. This priority is purely a local decision, i.e., determined by configuration at both endpoints of the protection domain. It is also possible to assign the same priority to multiple working paths, thus creating a "first come first served" preemption policy. This document provides no means to signal the priority of a given working path, nor a means to detect priority mismatches or misconfigurations. Any mismatch or misconfiguration will likely result in unexpected protection behavior.
Preemption occurs when the protection path is being used to transport traffic and is then required to transport traffic for a service with higher priority. At this point, the current traffic that is being transported on the protection path needs to be interrupted to allow the transport of the protected traffic.
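The local priority rule described above can be sketched as follows. This is a minimal illustration, not part of the protocol: the function name, the priority table, and the convention that a lower number means a higher priority are all assumptions made for the example. Equal priorities yield the "first come first served" behavior mentioned above.

```python
def may_preempt(priorities, protected, requester):
    """Decide locally whether `requester` may take over the protection path.

    priorities: dict mapping working-path index -> locally configured priority
                (assumption for this sketch: lower number = higher priority).
    protected:  index of the working path currently using the protection
                path, or None if the protection path is idle.
    requester:  index of the working path that has just reported a trigger.
    """
    if protected is None:
        # Nothing to preempt: the protection path is free.
        return True
    # Strictly higher priority is required; a tie keeps the existing switch
    # in place, which gives first-come-first-served behavior.
    return priorities[requester] < priorities[protected]


# Hypothetical configuration: path 1 is most important, paths 2 and 3 tie.
configured = {1: 0, 2: 1, 3: 1}
print(may_preempt(configured, protected=2, requester=1))  # path 1 outranks path 2
print(may_preempt(configured, protected=2, requester=3))  # tie: path 2 keeps protection
```

Because the priorities are not signaled, both endpoints would need the same table for this decision to come out the same way at each end.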
There are two basic scenarios for preemption of traffic –
Nurit Sprecher (NSN)
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
This draft uses the following acronyms:
Ack | Acknowledge |
DNR | Do not revert |
FS | Forced Switch |
LER | Label Edge Router |
LO | Lockout of protection |
MPLS-TP | Transport Profile for MPLS |
MS | Manual Switch |
NR | No Request |
P2P | Point-to-point |
P2MP | Point-to-multipoint |
PSC | Protection State Coordination Protocol |
SD | Signal Degrade |
SF | Signal Fail |
Wfa | Wait for Acknowledge |
WTR | Wait-to-Restore |
The terminology used in this document is based on the terminology defined in [RFC4427] and further adapted for MPLS-TP in [SurvivFwk]. In addition, we use the term LER to refer to an MPLS-TP Network Element, whether it is an LSR, LER, T-PE, or S-PE.
The Protection State Coordination protocol (PSC) is defined in [LinProt]. This includes both the format of the G-ACh based message as well as a description of the operations and the state transition logic of the protocol. The extension to cover 1:n protection includes changes to both aspects of PSC.
The changes to the message structure include both the addition of new information and an extension of the semantics of some of the existing fields of the message. These changes are described in Section 3.2.
The changes relative to the behavior of the base PSC protocol will be described in Section 3.3.
Base PSC (as defined in [LinProt]) is a single-phased protocol, i.e., the endpoints perform protection switching without waiting for acknowledgement from the far-end LER. The protocol messages are transmitted using the G-ACh, and the format is described in Figure 2.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 1|Version|   Reserved    |            PSC-CT             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|Request|PT |R|  Reserved1  |     FPath     |     Path      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          TLV Length           |           Reserved2           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                         Optional TLVs                         ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
With regard to the G-ACh header, no changes are suggested in the extensions for 1:n protection, i.e., the channel type field will continue to use the PSC-CT value defined in [LinProt]. The fields of the PSC payload that are affected by this document are the Ver field, the Reserved1 field, and the FPath and Path fields.
In order to support 1:n protection there is a need to make changes to the format of the PSC payload (see Figure 3). In particular, a new field must be added to the payload to indicate acknowledgement of a protection switching operation. In addition, the semantics of the FPath and Path fields are adjusted to indicate an index into the set of working paths. The details of these changes are supplied in the following subsections.
Due to the significance of these changes, the value of the Ver field (in the PSC payload) for a 1:n protection domain MUST be set to 2.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|Request|PT |R|K| Reserved1 |     FPath     |     Path      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          TLV Length           |           Reserved2           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                         Optional TLVs                         ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The Acknowledge flag is used by an endpoint to acknowledge the request to preempt any current traffic on the protection path and instead transmit the traffic from the requested working path. See details in section x.y.
The FPath field indicates which path is identified to be in a fault condition or affected by an administrative command. The following are the possible values:
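As an illustration of the first 32-bit word of the payload in Figure 3, the following sketch packs and unpacks its fields. It assumes that the K bit in the figure is the Acknowledge flag, that the field widths are 2, 4, 2, 1, 1, 6, 8, and 8 bits in network byte order, and that Reserved1 is sent as zero. The class and function names are hypothetical, and the Request and PT values in the usage lines are arbitrary placeholders rather than code points from [LinProt].

```python
import struct
from dataclasses import dataclass


@dataclass
class PscWord:
    ver: int      # 2 bits; 2 for a 1:n protection domain per this document
    request: int  # 4 bits
    pt: int       # 2 bits
    r: int        # 1 bit, R flag as defined in [LinProt]
    k: int        # 1 bit, assumed here to be the Acknowledge flag
    fpath: int    # 8 bits
    path: int     # 8 bits


def pack_word(w: PscWord) -> bytes:
    """Pack the first payload word; Reserved1 (6 bits) is left as zero."""
    word = ((w.ver & 0x3) << 30
            | (w.request & 0xF) << 26
            | (w.pt & 0x3) << 24
            | (w.r & 0x1) << 23
            | (w.k & 0x1) << 22
            | (w.fpath & 0xFF) << 8
            | (w.path & 0xFF))
    return struct.pack("!I", word)


def unpack_word(data: bytes) -> PscWord:
    """Unpack the first four bytes of a PSC payload into its fields."""
    (word,) = struct.unpack("!I", data[:4])
    return PscWord(
        ver=(word >> 30) & 0x3,
        request=(word >> 26) & 0xF,
        pt=(word >> 24) & 0x3,
        r=(word >> 23) & 0x1,
        k=(word >> 22) & 0x1,
        fpath=(word >> 8) & 0xFF,
        path=word & 0xFF,
    )


# Placeholder values: acknowledge a switch for working path index 3.
sample = PscWord(ver=2, request=5, pt=1, r=0, k=1, fpath=3, path=3)
encoded = pack_word(sample)
decoded = unpack_word(encoded)
```

The TLV Length and Reserved2 fields and any optional TLVs are deliberately left out of the sketch.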
The Path field indicates which data is being transmitted on the protection path. Under normal conditions, the protection path does not need to carry any user data traffic, but may carry extra traffic. If there is a failure/degrade condition on one of the working paths, then that working path's data traffic will be transmitted over the protection path. The following are the possible values:
In all of the following subsections, assume a protection domain between LER-A and LER-Z, using working paths 1-N and the protection path as shown in Figure 1.
A basic premise of this protection architecture is that both endpoints of the protection domain are configured to associate the indices of the working paths with the proper LSP identifiers. If this condition is not met then the protection scheme will cause inconsistencies in traffic transmission.
Protection of the N working paths is based on the operational principles outlined in [LinProt] and will employ the same basic Protection State Coordination Protocol (PSC) outlined in that document. However, as can be expected, due to certain basic differences in the architecture of the protection domain, a small set of differences in operation are necessary. The following sub-sections will highlight these differences and explain their effects on the PSC state machine.
PSC, as presented in [LinProt] is a single-phased protocol. This means that when an endpoint receives a trigger to perform a protection switch, the LER switches traffic and then notifies the far end of the switch, without waiting for acknowledgement. When addressing the situation in a 1:n protection domain, the endpoint that receives the trigger must first verify that the protection path is available to transmit the protected traffic. This may involve interrupting the traffic that is currently being transmitted on the protection path by both endpoints.
In general, after the LER has detected a trigger for protection switching, e.g., an FS operator command or an SF indication for one of the working paths, the LER SHALL transmit the appropriate PSC message as described in [LinProt] with the following changes:
As stated above, the endpoint that detected a switching trigger MUST wait for an Acknowledge message before performing the protection switch. Two types of message are considered to be an Acknowledge message:
The protection system should include a timer called the Wait for Acknowledge (Wfa) timer that SHALL be started when the LER enters Wfa state and reset when the Acknowledge message is received. The length of the Wfa timer SHOULD be configured to allow protection switching within the normal time constraints. The Wfa timer will expire only if no Acknowledge message was received by the LER in Wfa state. The Wfa Expires local input should have a priority just below that of the WTRExpires signal.
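The timer behavior described above can be sketched with a simple deadline. The class name, the method names, and the 50 ms duration in the usage lines are assumptions for illustration only; this document does not specify a timer value. Time is passed in explicitly so the sketch does not depend on a particular clock.

```python
class WfaTimer:
    """Deadline-based sketch of the Wait for Acknowledge (Wfa) timer."""

    def __init__(self, duration):
        self.duration = duration  # configured length, in seconds
        self.deadline = None      # None means the timer is not running

    def start(self, now):
        # Called when the LER enters Wfa state.
        self.deadline = now + self.duration

    def reset(self):
        # Called when the Acknowledge message is received.
        self.deadline = None

    def expired(self, now):
        # True only if the timer is running and no Acknowledge arrived in time.
        return self.deadline is not None and now >= self.deadline


timer = WfaTimer(duration=0.05)   # hypothetical 50 ms
timer.start(now=10.00)
still_waiting = not timer.expired(now=10.02)
timed_out = timer.expired(now=10.06)
```

A reset before the deadline means `expired` never reports True, which matches the statement that the timer expires only if no Acknowledge message was received.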
As described above, the endpoint that is reporting a trigger for protection switching needs to delay the actual switchover until an acknowledgement is received from the far-end LER. In order to facilitate this wait period it is necessary to define a new PSC state, the Wait for Acknowledge (Wfa) state. This state will be entered by the LER upon receiving a trigger for protection switching, and will be exited either upon receiving an Acknowledge message or upon receiving a remote message indicating that the protection path is currently occupied by a higher priority request.
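The entry and exit conditions of the Wfa state can be sketched as a small transition function. The state and event names below are illustrative and are not taken from [LinProt]. In particular, YIELDED is only a placeholder for whatever state the higher priority remote request would dictate, and Wfa timer expiry is not modelled.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()
    WFA = auto()          # waiting for the far end to acknowledge
    PROTECTING = auto()   # switch performed after acknowledgement
    YIELDED = auto()      # placeholder: a higher priority remote request won


class Event(Enum):
    LOCAL_TRIGGER = auto()           # e.g. local FS command or SF indication
    ACK_RECEIVED = auto()            # Acknowledge message from the far end
    REMOTE_HIGHER_PRIORITY = auto()  # protection path occupied by a higher priority request


def next_state(state, event):
    """Return the next state; unlisted combinations leave the state unchanged."""
    if state is State.NORMAL and event is Event.LOCAL_TRIGGER:
        return State.WFA
    if state is State.WFA and event is Event.ACK_RECEIVED:
        return State.PROTECTING
    if state is State.WFA and event is Event.REMOTE_HIGHER_PRIORITY:
        return State.YIELDED
    return state


state = State.NORMAL
for event in (Event.LOCAL_TRIGGER, Event.ACK_RECEIVED):
    state = next_state(state, event)
```

The point of the sketch is the ordering: the switch to a protecting state happens only after the acknowledgement, unlike the single-phased behavior of base PSC.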
The following sub-section will describe the actions to be taken when an LER is in the Wfa state.
An LER will enter the Wait for Acknowledge state before transitioning into a protection state, i.e., either the Protecting administrative or the Protecting failure state. The LER SHALL remain in this state until either an Acknowledge message is received or the Wfa timer expires. Normally, the Acknowledge message will be a remote PSC input. The following describes how the LER, in Wfa state, should react to a new local input:
The following details the reactions of the LER in Wfa state to remote messages:
This document does not include any required IANA considerations.
The generic security considerations for the data-plane of MPLS-TP are described in the security framework document [SecureFwk] together with the required mechanisms needed to address them. The security considerations for the generic associated control channel are described in [RFC5586]. The security considerations for protection and recovery aspects of MPLS-TP are addressed in [SurvivFwk].
The extensions described in this document extend the protocol defined in [LinProt] and do not introduce any new security risks.
The authors would like to thank all members of the teams (the Joint Working Team, the MPLS Interoperability Design Team in IETF and the T-MPLS Ad Hoc Group in ITU-T) involved in the definition and specification of MPLS Transport Profile.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. |
[TPReq] | Niven-Jenkins, B., Brungard, D., Betts, M., Sprecher, N. and S. Ueno, "Requirements of an MPLS Transport Profile", RFC 5654, September 2009. |
[LinProt] | Bryant, S., Sprecher, N., Osborne, E., Fulignoli, A. and Y. Weingarten, "Multi-protocol Label Switching Transport Profile Linear Protection", ID draft-ietf-mpls-tp-linear-protection-07.txt, Apr 2011. |
[RFC5586] | Vigoureux, M., Bocci, M., Swallow, G., Aggarwal, R. and D. Ward, "MPLS Generic Associated Channel", RFC 5586, May 2009. |
[RFC4427] | Mannie, E. and D. Papadimitriou, "Recovery Terminology for Generalized Multi-Protocol Label Switching", RFC 4427, Mar 2006. |
[SurvivFwk] | Sprecher, N., Farrel, A. and H. Shah, "Multi-protocol Label Switching Transport Profile Survivability Framework", ID draft-ietf-mpls-tp-survive-fwk-02.txt, Feb 2009. |
[SecureFwk] | Fang, L., Niven-Jenkins, B., Mansfield, S., Zhang, R., Bitar, N., Daikoku, M. and L. Wang, "MPLS-TP Security Framework", ID draft-ietf-mpls-tp-security-framework-00.txt, Feb 2011. |
The full PSC state machine is described in [LinProt], both in textual and tabular form. This appendix highlights the changes to the basic PSC state machine. In the event of a mismatch between these tables and the text either in [LinProt] or in this document, the text is authoritative. Note that this appendix is intended to be a functional description, not an implementation specification.
The tables here use the same format and state descriptions used in the Linear Protection document, with the addition of the Wfa state, the Wfa Expires input, and the changes in behavior that are noted.
Each state corresponds to the transmission of a particular set of Request, FPath and Path bits. The table below lists the message that is generally sent in each particular state. If the message to be sent in a particular state deviates from the table below, it is noted in the footnotes to the state-machine table.
State | REQ(FP,P) |
---|---|
N | NR(0,0) |
UA:LO:L | LO(0,0) |
UA:P:L | SF(0,0) |
UA:LO:R | NR(0,0) |
UA:P:R | NR(0,0) |
PF:W:L | SF(1,1) |
PF:W:R | NR(0,1) |
PA:F:L | FS(1,1) |
PA:M:L | MS(1,1) |
PA:F:R | NR(0,1) |
PA:M:R | NR(0,1) |
WTR | WTR(0,1) |
DNR | DNR(0,1) |
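The table above can be held as a simple lookup, as in the sketch below. The dictionary is transcribed from the table. The `index` parameter is an assumption made for this example: it reflects a reading of the adjusted FPath and Path semantics in which a nonzero value carries the index of the working path concerned, so the value 1 in the table is replaced by that index. The names are hypothetical.

```python
# State -> (Request, FPath, Path), transcribed from the table above.
STATE_MESSAGE = {
    "N":       ("NR", 0, 0),
    "UA:LO:L": ("LO", 0, 0),
    "UA:P:L":  ("SF", 0, 0),
    "UA:LO:R": ("NR", 0, 0),
    "UA:P:R":  ("NR", 0, 0),
    "PF:W:L":  ("SF", 1, 1),
    "PF:W:R":  ("NR", 0, 1),
    "PA:F:L":  ("FS", 1, 1),
    "PA:M:L":  ("MS", 1, 1),
    "PA:F:R":  ("NR", 0, 1),
    "PA:M:R":  ("NR", 0, 1),
    "WTR":     ("WTR", 0, 1),
    "DNR":     ("DNR", 0, 1),
}


def default_message(state, index=1):
    """Return the message generally sent in `state`, as REQ(FP,P).

    index: working-path index substituted for the nonzero FPath/Path
           values (assumption for the 1:n case; 1 reproduces the table).
    """
    request, fpath, path = STATE_MESSAGE[state]
    return f"{request}({fpath * index},{path * index})"


print(default_message("PF:W:L"))      # SF(1,1)
print(default_message("PF:W:R", 3))   # NR(0,3)
```

Deviations noted in the footnotes to the state-machine tables are not covered by this lookup.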
The top row in each table is the list of possible inputs. The local inputs are:
NR | No Request |
OC | Operator Clear |
LO | Lockout of protection |
SF-P | Signal Fail on protection path |
SF-W | Signal Fail on working path |
FS | Forced Switch |
SFc | Clear Signal Fail |
MS | Manual Switch |
WTRExp | WTR Expired |
and the remote inputs are:
LO | remote LO message |
SF-P | remote SF message indicating protection path |
SF-W | remote SF message indicating working path |
FS | remote FS message |
MS | remote MS message |
WTR | remote WTR message |
DNR | remote DNR message |
NR | remote NR message |
Section 4.3.3 refers to some states as 'remote' and some as 'local'. By definition, all states listed in the table of local sources are local states, and all states listed in the table of remote sources are remote states. For example, section 4.3.3.1 says "A local Lockout of protection input SHALL cause the LER to go into local Unavailable State". As the trigger for this state change is a local one, 'local Unavailable State' is by definition displayed in the table of local sources. Similarly, "A remote Lockout of protection message SHALL cause the LER to go into remote Unavailable state" means that the state represented in the Unavailable rows in the table of remote sources is by definition a remote Unavailable state.
Each cell in the table below contains either a state, a footnote, or the letter 'i'. 'i' stands for Ignore, and is an indication to continue with the current behavior. See section 4.3.3. The footnotes are listed below the table.
Part 1: Local input state machine
        | OC  |  LO   | SF-P |  FS  | SF-W | SFc  |  MS  | WTRExp
--------+-----+-------+------+------+------+------+------+-------
N       |  i  |UA:LO:L|UA:P:L|PA:F:L|PF:W:L|  i   |PA:M:L|   i
UA:LO:L |  N  |   i   |  i   |  i   |  i   |  i   |  i   |   i
UA:P:L  |  i  |UA:LO:L|  i   |  i   |  i   | [5]  |  i   |   i
UA:LO:R |  i  |UA:LO:L| [1]  |  i   | [2]  | [6]  |  i   |   i
UA:P:R  |  i  |UA:LO:L|UA:P:L|  i   | [3]  | [6]  |  i   |   i
PF:W:L  |  i  |UA:LO:L|UA:P:L|PA:F:L|  i   | [7]  |  i   |   i
PF:W:R  |  i  |UA:LO:L|UA:P:L|PA:F:L|PF:W:L|  i   |  i   |   i
PA:F:L  |  N  |UA:LO:L|UA:P:L|  i   |  i   |  i   |  i   |   i
PA:M:L  |  N  |UA:LO:L|UA:P:L|PA:F:L|PF:W:L|  i   |  i   |   i
PA:F:R  |  i  |UA:LO:L|UA:P:L|PA:F:L| [4]  | [8]  |  i   |   i
PA:M:R  |  i  |UA:LO:L|UA:P:L|PA:F:L|PF:W:L|  i   |PA:M:L|   i
WTR     |  i  |UA:LO:L|UA:P:L|PA:F:L|PF:W:L|  i   |PA:M:L|  [9]
DNR     |  i  |UA:LO:L|UA:P:L|PA:F:L|PF:W:L|  i   |PA:M:L|   i
Part 2: Remote messages state machine
        |  LO   | SF-P |  FS  | SF-W |  MS  | WTR  | DNR  |  NR
--------+-------+------+------+------+------+------+------+------
N       |UA:LO:R|UA:P:R|PA:F:R|PF:W:R|PA:M:R|  i   |  i   |  i
UA:LO:L |   i   |  i   |  i   |  i   |  i   |  i   |  i   |  i
UA:P:L  | [10]  |  i   |  i   |  i   |  i   |  i   |  i   |  i
UA:LO:R |   i   |  i   |  i   |  i   |  i   |  i   |  i   | [16]
UA:P:R  |UA:LO:R|  i   |  i   |  i   |  i   |  i   |  i   | [16]
PF:W:L  | [11]  | [12] |PA:F:R|  i   |  i   |  i   |  i   |  i
PF:W:R  |UA:LO:R|UA:P:R|PA:F:R|  i   |  i   | [14] | [15] |  N
PA:F:L  |UA:LO:R|UA:P:R|  i   |  i   |  i   |  i   |  i   |  i
PA:M:L  |UA:LO:R|UA:P:R|PA:F:R| [13] |  i   |  i   |  i   |  i
PA:F:R  |UA:LO:R|UA:P:R|  i   |  i   |  i   |  i   |  i   | [17]
PA:M:R  |UA:LO:R|UA:P:R|PA:F:R| [13] |  i   |  i   |  i   |  N
WTR     |UA:LO:R|UA:P:R|PA:F:R|PF:W:R|PA:M:R|  i   |  i   | [18]
DNR     |UA:LO:R|UA:P:R|PA:F:R|PF:W:R|PA:M:R|  i   |  i   |  i
The following are the footnotes for the table:
[1] Remain in the current state (UA:LO:R) and transmit SF(0,0)
[2] Remain in the current state (UA:LO:R) and transmit SF(1,0)
[3] Remain in the current state (UA:P:R) and transmit SF(1,0)
[4] Remain in the current state (PA:F:R) and transmit SF(1,1)
[5] If the SF being cleared is SF-P, transition to N. If it is SF-W, ignore the clear.
[6] Remain in the current state (UA:x:R). If the SFc corresponds to a previous SF, begin transmitting NR(0,0).
[7] If the domain is configured for revertive behavior, transition to WTR; otherwise, transition to DNR.
[8] Remain in PA:F:R and transmit NR(0,1)
[9] Remain in WTR, send NR(0,1)
[10] Transition to UA:LO:R and continue sending SF(0,0)
[11] Transition to UA:LO:R and send SF(1,0)
[12] Transition to UA and send SF(1,0)
[13] Transition to PF:W:R and send NR(0,1)
[14] Transition to WTR state and continue to send the current message.
[15] Transition to DNR state and continue to send the current message.
[16] If the local input is SF-P then transition to UA:P:L. If the local input is SF-W then transition to PF:W:L. Else - transition to N state and continue to send the current message.
[17] If the local input is SF-W then transition to PF:W:L. Else - transition to N state and continue to send the current message.
[18] If the receiving LER's WTR timer is running, maintain current state and message. If the WTR timer is stopped, transition to N.
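Footnotes [16], [17], and [18] all describe how a remote NR message is resolved against local conditions. A minimal sketch of that resolution is shown below; the function name and argument names are hypothetical, and only the resulting state is returned (the message to transmit is not modelled).

```python
def resolve_remote_nr(footnote, local_input=None, wtr_running=False):
    """Return the next state for a remote NR message under footnotes 16-18.

    footnote:    16, 17, or 18, as referenced in the remote-message table.
    local_input: current local input, e.g. "SF-P" or "SF-W", or None.
    wtr_running: whether the receiving LER's WTR timer is running
                 (only relevant to footnote 18).
    """
    if footnote == 16:
        if local_input == "SF-P":
            return "UA:P:L"
        if local_input == "SF-W":
            return "PF:W:L"
        return "N"
    if footnote == 17:
        return "PF:W:L" if local_input == "SF-W" else "N"
    if footnote == 18:
        # Footnote 18 applies in WTR state: stay there while the timer runs.
        return "WTR" if wtr_running else "N"
    raise ValueError(f"footnote {footnote} is not handled by this sketch")


print(resolve_remote_nr(16, local_input="SF-W"))   # PF:W:L
print(resolve_remote_nr(18, wtr_running=True))     # WTR
```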