Network Working Group                                           T. Morin
Internet-Draft                              France Telecom - Orange Labs
Intended status: Experimental                                 Y. Rekhter
Expires: January 9, 2010                                     R. Aggarwal
                                                        Juniper Networks
                                                           W. Henderickx
                                                                P. Muley
                                                          Alcatel-Lucent
                                                           July 08, 2009


Multicast VPN fast upstream failover
draft-morin-l3vpn-mvpn-fast-failover-02

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on January 9, 2010.

Copyright Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Abstract

This document defines multicast VPN extensions and procedures that allow fast failover for upstream failures, by allowing downstream PEs to take into account the status of Provider-Tunnels (P-tunnels) when selecting the upstream PE for a VPN multicast flow, and extending BGP mVPN routing so that a C-multicast route can be advertised toward a standby upstream PE.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].



Table of Contents

1.  Introduction
2.  Terminology
3.  UMH Selection based on tunnel status
    3.1.  Determining the status of a tunnel
        3.1.1.  mVPN tunnel root tracking
        3.1.2.  PE-P Upstream link status
        3.1.3.  P2MP RSVP-TE tunnels
        3.1.4.  Leaf-initiated P-tunnels
        3.1.5.  P2MP LSP OAM
        3.1.6.  (S,G) counter information
4.  Standby C-multicast route
    4.1.  Downstream PE behavior
    4.2.  Upstream PE behavior
    4.3.  Reachability determination
5.  Hot leaf standby
6.  Duplicate packets
7.  IANA Considerations
8.  Security Considerations
9.  Acknowledgements
10.  References
    10.1.  Normative References
    10.2.  Informative References
§  Authors' Addresses





1.  Introduction

In the context of multicast in BGP/MPLS VPNs, it is desirable to provide mechanisms that allow fast recovery of connectivity after different types of failures. This document addresses failures of elements in the provider network that are upstream of PEs connected to VPN sites with receivers.

Sections 3 (UMH Selection based on tunnel status) and 4 (Standby C-multicast route) describe two independent mechanisms that allow different levels of resiliency and provide different failure coverage.

Moreover, section 5 (Hot leaf standby) describes a "hot leaf standby" mechanism that uses a combination of these two mechanisms. This approach has similarities with the solution described in [I-D.karan-mofrr] to improve failover times when PIM routing is used in a network, given some topology and metric constraints.




2.  Terminology

The terminology used in this document is the terminology defined in [I-D.ietf-l3vpn-2547bis-mcast] and [I-D.ietf-l3vpn-2547bis-mcast-bgp].




3.  UMH Selection based on tunnel status

Current multicast VPN specifications [I-D.ietf-l3vpn-2547bis-mcast], section 5.1, describe the procedures used by a multicast VPN downstream PE to determine what the upstream multicast hop (UMH) is for a given (C-S,C-G).

The procedure described here is an OPTIONAL procedure that consists in having a downstream PE take into account the status of the P-tunnels rooted at each possible upstream PE when deciding whether to include that PE in the list of candidate UMHs for a given (C-S,C-G) state. The result is that, if a P-tunnel is "down" (see Section 3.1 (Determining the status of a tunnel)), the PE that is the root of the P-tunnel will not be considered for UMH selection, which causes the downstream PE to fail over to the upstream PE that is next in the list of candidates.

More precisely, UMH determination for a given (C-S,C-G) will consider the UMH candidates in the following order:

For a given downstream PE and a given VRF, the P-tunnel corresponding to a given upstream PE for a given (C-S,C-G) state is the S-PMSI tunnel advertised by that upstream PE for this (C-S,C-G) and imported into that VRF or, if there is no such S-PMSI, the I-PMSI tunnel advertised by that PE and imported into that VRF.
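The selection described above can be illustrated with a minimal Python sketch. All names here (UpstreamPE, tunnel_for, select_umh, the is_up callback) are hypothetical and introduced only for illustration. The sketch assumes the candidate list is already ordered by the regular UMH selection rules, keeps a candidate that has no applicable tunnel (an assumption, since no tunnel can then be "down"), and returns None when every candidate's tunnel is down, which is a placeholder rather than specified behaviour.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Flow = Tuple[str, str]  # (C-S, C-G)


@dataclass
class UpstreamPE:
    """Hypothetical view of one upstream PE as seen from one VRF."""
    name: str
    i_pmsi_tunnel: Optional[str] = None               # I-PMSI tunnel imported into the VRF
    s_pmsi_tunnels: Optional[Dict[Flow, str]] = None  # S-PMSI tunnels, keyed by flow


def tunnel_for(pe: UpstreamPE, flow: Flow) -> Optional[str]:
    """S-PMSI tunnel advertised for this flow if any, else the I-PMSI tunnel."""
    s_pmsi = (pe.s_pmsi_tunnels or {}).get(flow)
    return s_pmsi if s_pmsi is not None else pe.i_pmsi_tunnel


def select_umh(candidates: List[UpstreamPE], flow: Flow,
               is_up: Callable[[str], bool]) -> Optional[UpstreamPE]:
    """Return the first candidate whose P-tunnel is not seen as down."""
    for pe in candidates:
        tunnel = tunnel_for(pe, flow)
        # Assumption: a candidate without any applicable tunnel is kept.
        if tunnel is None or is_up(tunnel):
            return pe
    return None  # placeholder: the all-tunnels-down case is not modelled here
```

The is_up callback stands for whichever status criteria a downstream PE applies locally, so two downstream PEs running this logic may reach different results.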

Note that this document assumes that if a site of a given mVPN that contains C-S is dual-homed to two PEs, then all the other sites of that mVPN would have two unicast VPN routes (VPN-IPv4 or VPN-IPv6) to C-S, each with its own RD.




3.1.  Determining the status of a tunnel

Different factors can be considered to determine the "status" of a P-tunnel; they are described in the following sub-sections. The procedure proposed here also allows different downstream PEs to apply different rules to define the status of a P-tunnel (see Section 6 (Duplicate packets)), and some of these rules may produce a result that differs from one downstream PE to another. Thus the "status" of a P-tunnel in this section is not a characteristic of the tunnel itself, but the status of the tunnel as seen from a particular downstream PE.

Depending on the criteria used to determine the status of a P-tunnel, there may be an interaction with other resiliency mechanisms used for the P-tunnel itself, and the UMH update may happen immediately or may need to be delayed. Each case is covered in its own sub-section below.




3.1.1.  mVPN tunnel root tracking

A condition to consider that the status of a P-tunnel is up is that the root of the tunnel, as determined in the PMSI tunnel attribute, is reachable through unicast routing tables. In this case the downstream PE can immediately update its UMH when the reachability condition changes.

This is similar to BGP next-hop tracking for VPN routes, except that the address considered is not the BGP next-hop address, but the root address in the PMSI tunnel attribute.

If BGP next-hop tracking is done for VPN routes, and the root address of a given tunnel happens to be the same as the next-hop address in the BGP autodiscovery route advertising the tunnel, then this mechanism may be omitted for this tunnel, as it will not bring any specific benefit.




3.1.2.  PE-P Upstream link status

A condition to consider a tunnel status as up can be that the last-hop link of the P-tunnel is up.

In that case, if the PE can determine that there is no fast restoration mechanism (such as MPLS FRR [RFC4090]) in place for the P-tunnel, it can update the UMH immediately. Otherwise, it should wait before updating the UMH, to let the P-tunnel restoration mechanism operate. A configurable timer MUST be provided for this purpose, and it is recommended to provide a reasonable default value for this timer.
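The wait-before-update behaviour can be sketched as follows. This is an illustrative Python sketch only: the class name TunnelDownHold, its methods, and the 0.5 second default are assumptions made for the example, not values or interfaces taken from this document. Time is passed in explicitly so that the logic can be exercised without a real clock.

```python
from typing import Optional


class TunnelDownHold:
    """Delays the UMH update after a P-tunnel is seen going down.

    hold_time is the configurable timer, in seconds. The default below is
    an arbitrary placeholder, not a recommended value.
    """

    def __init__(self, fast_restoration: bool, hold_time: float = 0.5):
        self.fast_restoration = fast_restoration
        self.hold_time = hold_time
        self.down_since: Optional[float] = None

    def on_status(self, up: bool, now: float) -> None:
        """Record a status report from the detection mechanism."""
        if up:
            self.down_since = None      # restored: cancel any pending update
        elif self.down_since is None:
            self.down_since = now       # start of the outage

    def should_update_umh(self, now: float) -> bool:
        """True once the downstream PE should stop waiting and update the UMH."""
        if self.down_since is None:
            return False
        if not self.fast_restoration:
            return True                 # no restoration expected: act at once
        return now - self.down_since >= self.hold_time
```

If the tunnel is reported up again before hold_time elapses, no UMH update is triggered, which is the purpose of the timer when a restoration mechanism such as MPLS FRR protects the P-tunnel.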




3.1.3.  P2MP RSVP-TE tunnels

For P-Tunnels of type P2MP MPLS-TE, the status of the P-Tunnel is considered up if one or more of the P2MP RSVP-TE LSPs, identified by the P-Tunnel Attribute, are in up state. The determination of whether a P2MP RSVP-TE LSP is in up state requires Path and Resv state for the LSP and is based on procedures in [RFC4875]. In this case the downstream PE can immediately update its UMH when the reachability condition changes.

When signaling state for a P2MP TE LSP is removed (e.g. if the ingress of the P2MP TE LSP sends a PathTear message) or the P2MP TE LSP changes state from up to down as determined by procedures in [RFC4875], the status of the corresponding P-Tunnel SHOULD be re-evaluated. If the P-Tunnel transitions from up to down state, the upstream PE that is the ingress of the P-Tunnel SHOULD NOT be considered a valid UMH.
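As a rough illustration of the rule above, the following Python sketch derives a P-Tunnel status from its LSPs. The names P2mpLsp and rsvp_te_p_tunnel_is_up are hypothetical. The sketch is a simplification: it treats the presence of both Path and Resv state as sufficient for an LSP to be up, whereas the text only states that they are required and defers the full determination to the RFC 4875 procedures.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class P2mpLsp:
    """Hypothetical summary of the signaling state held for one P2MP LSP."""
    has_path_state: bool
    has_resv_state: bool

    def is_up(self) -> bool:
        # Simplification: both states present is taken to mean "up".
        return self.has_path_state and self.has_resv_state


def rsvp_te_p_tunnel_is_up(lsps: Iterable[P2mpLsp]) -> bool:
    """The P-Tunnel is up if one or more of its LSPs are up."""
    return any(lsp.is_up() for lsp in lsps)
```

Re-evaluation after a PathTear would amount to calling rsvp_te_p_tunnel_is_up again on the updated LSP set.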




3.1.4.  Leaf-initiated P-tunnels

A PE can be removed from the UMH candidate list for a given (S,G) if the P-tunnel for this (S,G) (I-PMSI or S-PMSI, depending) is leaf-triggered (PIM, mLDP), but for some reason internal to the protocol the upstream one-hop branch of the tunnel from P to PE cannot be built. In this case the downstream PE can immediately update its UMH when the reachability condition changes.




3.1.5.  P2MP LSP OAM

When a P2MP connectivity verification mechanism such as [I-D.katz-ward-bfd-multipoint], used in conjunction with the bootstrapping mechanisms described in [I-D.ietf-mpls-mcast-cv], has been set up for a tunnel, the result of the connectivity verification can be used to define the status of the tree.

If a MultipointHead session has been established on a P2MP MPLS LSP so that BFD packets are periodically sent from the root toward leaves, a condition to consider the status of the corresponding tunnel as up is that the BFD SessionState is Up.

When such a procedure is used in a context where fast restoration mechanisms are used for the P-tunnels, downstream PEs should be configured to wait before updating the UMH, to let the P-tunnel restoration mechanism operate. A configurable timer MUST be provided for this purpose, and it is recommended to provide a reasonable default value for this timer.




3.1.6.  (S,G) counter information

In cases where the downstream node can be configured so that the maximum inter-packet time is known for all the multicast flows mapped on a P-tunnel, the local per-(C-S,C-G) traffic counter information for traffic received on this P-tunnel can be used to determine the status of the P-tunnel.

When such a procedure is used in a context where fast restoration mechanisms are used for the P-tunnels, downstream PEs should be configured to wait before updating the UMH, to let the P-tunnel restoration mechanism operate. A configurable timer MUST be provided for this purpose, and it is recommended to provide a reasonable default value for this timer.

This method can be applicable, for instance, when an (S,G) flow is mapped on an S-PMSI.

In cases where this mechanism is used in conjunction with Hot leaf standby (Section 5), no prior knowledge of the rate of the multicast streams is required; downstream PEs can compare reception on the two P-tunnels to determine when one of them is down.
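The counter-based check for a flow with a known maximum inter-packet time might look like the following Python sketch. The class FlowActivityMonitor and its methods are hypothetical names; the sketch assumes the per-(C-S,C-G) packet counter is sampled periodically and that monitoring starts from a known counter value and time.

```python
class FlowActivityMonitor:
    """Tracks whether a per-(C-S,C-G) packet counter keeps progressing."""

    def __init__(self, max_inter_packet_time: float,
                 start_count: int, start_time: float):
        self.max_gap = max_inter_packet_time  # configured per flow, in seconds
        self.last_count = start_count
        self.last_progress = start_time       # last time the counter moved

    def sample(self, packet_count: int, now: float) -> None:
        """Feed one reading of the traffic counter."""
        if packet_count != self.last_count:
            self.last_count = packet_count
            self.last_progress = now

    def seen_up(self, now: float) -> bool:
        """False once no packet has arrived for longer than the known maximum gap."""
        return now - self.last_progress <= self.max_gap
```

Detection latency in this sketch is bounded by the sampling period plus the configured maximum inter-packet time; any hold timer for P-tunnel restoration would be applied on top of this result.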




4.  Standby C-multicast route

The procedures described below are limited to the case where the site that contains C-S is connected to exactly two PEs. The procedures require all the PEs of that mVPN to follow the single forwarder PE selection, as specified in [I-D.ietf-l3vpn-2547bis-mcast]. The procedures assume that if a site of a given mVPN that contains C-S is dual-homed to two PEs, then all the other sites of that mVPN would have two unicast VPN routes (VPN-IPv4 or VPN-IPv6) to C-S, each with its own RD.

As long as C-S is reachable via both PEs, a given downstream PE will select one of the PEs connected to C-S as its Upstream PE with respect to C-S; this PE is referred to below as the "Primary Upstream PE". We will refer to the other PE connected to C-S as the "Standby Upstream PE". Note that if the connectivity to C-S through the Primary Upstream PE becomes unavailable, then the PE will select the Standby Upstream PE as its Upstream PE with respect to C-S.

For readability, in the following sub-sections, the procedures are described for BGP C-multicast Source Tree Join routes, but they apply equally to BGP C-multicast Shared Tree Join routes failover for the case where the customer RP is dual-homed (substitute "C-RP" for "C-S").




4.1.  Downstream PE behavior

When a (downstream) PE connected to some site of an mVPN needs to send a C-multicast route (C-S, C-G), then following the procedures specified in Section "Originating C-multicast routes by a PE" of [I-D.ietf-l3vpn-2547bis-mcast-bgp] the PE sends the C-multicast route with an RT that identifies the Upstream PE selected by the PE originating the route. As long as C-S is reachable via the Primary Upstream PE, the Upstream PE is the Primary Upstream PE. If C-S is reachable only via the Standby Upstream PE, then the Upstream PE is the Standby Upstream PE.

If C-S is reachable via both the Primary and the Standby Upstream PE, then in addition to sending the C-multicast route with an RT that identifies the Primary Upstream PE, the PE also originates and sends a C-multicast route with an RT that identifies the Standby Upstream PE. This route, which has the semantics of a 'standby' C-multicast route, is hereafter called a "Standby BGP C-multicast route", and is constructed as follows:

The normal and the standby C-multicast routes must have their Local Preference attribute adjusted so that, if two C-multicast routes with the same NLRI are received by a BGP peer, one carrying the "Standby PE" community and the other one not carrying it, then preference is given to the one not carrying the "Standby PE" community. Such a situation can happen when, for instance due to transient unicast routing inconsistencies, two different downstream PEs consider different upstream PEs to be the primary one; in that case, without any precaution taken, both upstream PEs would process a standby C-multicast route and possibly stop forwarding at the same time. For this purpose a Standby BGP C-multicast route MUST have the LOCAL_PREF attribute set to zero.

If at some later point the local PE determines that C-S is no longer reachable through the Primary Upstream PE, the Standby Upstream PE becomes the Upstream PE, and the local PE re-sends the C-multicast route with an RT that identifies the Standby Upstream PE, except that now the route does not carry the Standby PE BGP Community (which results in replacing the old route with a new route, with the only difference between these routes being the presence/absence of the Standby PE BGP Community).
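The downstream PE behaviour described in this section can be summarized with a small Python sketch. CMulticastRoute and routes_to_originate are hypothetical names, and the route is reduced to the few fields discussed here (the RT target, the Standby PE community, LOCAL_PREF); this is not a BGP encoding. A local_pref of None stands for "left at whatever the implementation uses by default", which is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CMulticastRoute:
    """Simplified C-multicast route, limited to fields discussed in the text."""
    source: str                       # C-S
    group: str                        # C-G
    rt_upstream_pe: str               # PE identified by the Route Target
    standby: bool = False             # True if the Standby PE community is attached
    local_pref: Optional[int] = None  # None: implementation default (assumption)


def routes_to_originate(source: str, group: str, primary: str, standby: str,
                        primary_reachable: bool,
                        standby_reachable: bool) -> List[CMulticastRoute]:
    """Routes a downstream PE would originate for (C-S, C-G)."""
    if primary_reachable:
        routes = [CMulticastRoute(source, group, primary)]
        if standby_reachable:
            # Standby BGP C-multicast route: community set, LOCAL_PREF zero.
            routes.append(CMulticastRoute(source, group, standby,
                                          standby=True, local_pref=0))
        return routes
    if standby_reachable:
        # The standby PE becomes the Upstream PE: same RT, no community.
        return [CMulticastRoute(source, group, standby)]
    return []
```

On loss of reachability through the primary, the route toward the standby PE differs from the previously sent one only by the absence of the community (and, in this sketch, the LOCAL_PREF value), which matches the replacement described above.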




4.2.  Upstream PE behavior

When a PE receives a C-multicast route for a particular (C-S, C-G), and the RT carried in the route results in importing the route into a particular VRF on the PE, if the route carries the Standby PE BGP Community, then the PE proceeds as follows:

When the PE determines that C-S is not reachable through some other PE, the PE SHOULD install VRF PIM state corresponding to this Standby BGP C-multicast route (the result will be that a PIM Join message will be sent to the CE towards C-S, and that the PE will receive (C-S,C-G) traffic), and the PE SHOULD forward (C-S, C-G) traffic received by the PE to other PEs through a P-tunnel rooted at the PE.

Furthermore, irrespective of whether C-S carried in that route is reachable through some other PE:

a)  based on local policy, as soon as the PE receives this Standby BGP C-multicast route, the PE MAY install VRF PIM state corresponding to this BGP Source Tree Join route (the result will be that Join messages will be sent to the CE toward C-S, and that the PE will receive (C-S,C-G) traffic)

b)  based on local policy, as soon as the PE receives this Standby BGP C-multicast route, the PE MAY forward (C-S, C-G) traffic to other PEs through a P-tunnel independently of the reachability of C-S through some other PE. [note that this implies also doing (a)]

Doing neither (a) nor (b) for a given (C-S,C-G) is called "cold root standby".

Doing (a) but not (b) for a given (C-S,C-G) is called "warm root standby".

Doing (b) (which implies also doing (a)) for a given (C-S,C-G) is called "hot root standby".
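The three standby modes can be summarized as a small decision function. The Python sketch below is illustrative only: RootStandby and standby_actions are hypothetical names, and the function simply returns which of the two actions (installing VRF PIM state, forwarding on a P-tunnel) the standby upstream PE performs for a Standby BGP C-multicast route.

```python
from enum import Enum
from typing import Tuple


class RootStandby(Enum):
    COLD = "cold"   # neither (a) nor (b)
    WARM = "warm"   # (a) only
    HOT = "hot"     # (b), which implies (a)


def standby_actions(mode: RootStandby,
                    source_reachable_via_other_pe: bool) -> Tuple[bool, bool]:
    """Return (install_pim_state, forward_on_p_tunnel)."""
    if not source_reachable_via_other_pe:
        # C-S is not reachable through some other PE: install state and forward.
        return True, True
    if mode is RootStandby.HOT:
        return True, True
    if mode is RootStandby.WARM:
        return True, False
    return False, False
```

Because the mode is a local policy of the standby upstream PE, changing it does not require any change on the downstream PEs that advertise the standby route.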




4.3.  Reachability determination

The standby PE can use the following information to determine that C-S can or cannot be reached through the primary PE:




5.  Hot leaf standby

The mechanisms defined in the two previous sections can be used together as follows.

The principle is that, for a given VRF (or possibly only for a given (C-S,C-G)):

Note that the same level of protection would be achievable with a simple C-multicast Source Tree Join route advertised to both the primary and secondary upstream PEs (carrying, as Route Target extended communities, the values of the VRF Route Import attribute of each VPN route from each upstream PE). The advantage of using the hot leaf standby semantics is that, by making these downstream PEs always advertise a Standby C-multicast route to the secondary upstream PE, the protection level can be chosen through a change of configuration on the secondary upstream PE, without requiring any reconfiguration of the downstream PEs.

Other combinations of the mechanisms proposed in Section 4 (Standby C-multicast route) and Section 3 (UMH Selection based on tunnel status) are for further study.




6.  Duplicate packets

Multicast VPN specifications [I-D.ietf-l3vpn-2547bis-mcast] require that a PE forward to CEs only the packets coming from the expected upstream PE (Section 9.1).

We draw the reader's attention to the fact that compliance with this part of the multicast VPN specifications is especially important when two distinct upstream PEs may forward the same traffic on P-tunnels at the same time in steady state. This will be the case when "hot root standby" mode is used (Section 4 (Standby C-multicast route)), and can also be the case if the procedures of Section 3 (UMH Selection based on tunnel status) are used and (a) the rules determining the status of a tree are not the same on two distinct downstream PEs or (b) the rules determining the status of a tree depend on conditions local to a PE (e.g. the PE-P upstream link being up).




7.  IANA Considerations

Allocation is expected from IANA for the BGP "Standby PE" community. (TBC)

[Note to RFC Editor: this section may be removed on publication as an RFC.]




8.  Security Considerations




9.  Acknowledgements

The authors want to thank Ray Qiu for reviewing this document and providing useful feedback.




10.  References




10.1. Normative References

[I-D.ietf-l3vpn-2547bis-mcast] Aggarwal, R., Bandi, S., Cai, Y., Morin, T., Rekhter, Y., Rosen, E., Wijnands, I., and S. Yasukawa, “Multicast in MPLS/BGP IP VPNs,” draft-ietf-l3vpn-2547bis-mcast-10 (work in progress), January 2010.
[I-D.ietf-l3vpn-2547bis-mcast-bgp] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, “BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs,” draft-ietf-l3vpn-2547bis-mcast-bgp-08 (work in progress), September 2009.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[RFC4875] Aggarwal, R., Papadimitriou, D., and S. Yasukawa, “Extensions to Resource Reservation Protocol - Traffic Engineering (RSVP-TE) for Point-to-Multipoint TE Label Switched Paths (LSPs),” RFC 4875, May 2007.



10.2. Informative References

[I-D.ietf-mpls-mcast-cv] Swallow, G., “Connectivity Verification for Multicast Label Switched Paths,” draft-ietf-mpls-mcast-cv-00 (work in progress), April 2007.
[I-D.karan-mofrr] Karan, A., Filsfils, C., and D. Farinacci, “Multicast only Fast Re-Route,” draft-karan-mofrr-00 (work in progress), March 2009.
[I-D.katz-ward-bfd-multipoint] Katz, D. and D. Ward, “BFD for Multipoint Networks,” draft-katz-ward-bfd-multipoint-02 (work in progress), February 2009.
[RFC4090] Pan, P., Swallow, G., and A. Atlas, “Fast Reroute Extensions to RSVP-TE for LSP Tunnels,” RFC 4090, May 2005.



Authors' Addresses

  Thomas Morin
  France Telecom - Orange Labs
  2, avenue Pierre Marzin
  Lannion 22307
  France
Email:  thomas.morin@orange-ftgroup.com
  
  Yakov Rekhter
  Juniper Networks
  1194 North Mathilda Ave.
  Sunnyvale, CA 94089
  U.S.A.
Email:  yakov@juniper.net
  
  Rahul Aggarwal
  Juniper Networks
  1194 North Mathilda Ave.
  Sunnyvale, CA 94089
  U.S.A.
Email:  rahul@juniper.net
  
  Wim Henderickx
  Alcatel-Lucent
  Copernicuslaan 50
  Antwerp 2018
  Belgium
Email:  wim.henderickx@alcatel-lucent.com
  
  Praveen Muley
  Alcatel-Lucent
  701 East Middlefield Rd
  Mountain View, CA 94043
  U.S.A.
Email:  praveen.muley@alcatel-lucent.com