Network Working Group                                       D. Farinacci
Internet-Draft                                              IJ. Wijnands
Intended status: Experimental Protocol                         S. Venaas
Expires: March 01, 2012                                    cisco Systems
                                                            M. Napierala
                                                               AT&T Labs
                                                         August 29, 2011
A Reliable Transport Mechanism for PIM
draft-ietf-pim-port-08.txt
This document defines a reliable transport mechanism for the PIM protocol for transmission of Join/Prune messages. This eliminates the need for periodic Join/Prune message transmission and processing. The reliable transport mechanism can use either TCP or SCTP as the transport protocol.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 01, 2012.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The goals of this specification are:
The explicit non-goals of this specification are:
This document will specify how periodic Join/Prune message transmission can be eliminated by using TCP [RFC0793] or SCTP [RFC4960] as the reliable transport mechanism for Join/Prune messages.
This specification enables greater scalability in terms of control traffic overhead. However, for routers connected to multi-access links, this comes at the price of increased PIM state and the overhead required to maintain that state.
In many existing and emerging networks, particularly wireless and mobile satellite systems, link degradation due to weather, interference, and other impairments can result in temporary spikes in packet loss. In these environments, periodic PIM joining can cause join latency: when a message is lost, it is retransmitted only 60 seconds later. By applying a reliable transport, a lost join is retransmitted rapidly. Furthermore, when the last user leaves a multicast group, any lost prune is similarly repaired and the multicast stream is quickly removed from the wireless/satellite link. Without a reliable transport, the multicast transmission could otherwise continue until it timed out, roughly 3 minutes later. As network resources are at a premium in many of these environments, rapid termination of the multicast stream is critical for maintaining efficient use of bandwidth.
This is an experimental extension to PIM. It makes some fundamental changes to how PIM works in that Join/Prune state does not require periodic updates, and it partly turns PIM into a hard-state protocol. Also, using reliable delivery for PIM messages is a new concept, and it is likely that experience from early implementations and deployments will lead to at least minor changes in the protocol. Once there is some deployment experience, making this a Standards Track protocol should be considered. Experiments using this protocol only require support by pairs of PIM neighbors, and need not be constrained to isolated networks.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
PIM Over Reliable Transport (PORT) is a simple extension to PIMv2 for refresh reduction of PIM Join/Prune messages. It involves sending incremental rather than periodic Join/Prune messages over a TCP/SCTP connection between PIM neighbors.
PORT only applies to PIM Sparse-Mode [RFC4601] and Bidirectional PIM [RFC5015] Join/Prune messages.
This document does not restrict PORT to any specific link types. However, the use of PORT on, e.g., multi-access LANs with many PIM neighbors should be carefully evaluated. This is because there may be a full mesh of PORT connections, and because explicit tracking of all PIM PORT routers is required.
PORT can be incrementally used on a link between PORT capable neighbors. Routers that are not PORT capable can continue to use PIM in Datagram Mode. PORT capability is detected using new PORT Capable PIM Hello Options.
Once PORT is enabled on an interface and a PIM neighbor also announces that it is PORT enabled, only PORT Join/Prune messages will be used. That is, only PORT Join/Prune messages are accepted from, and sent to, that particular neighbor. Native Join/Prune messages are still used for PIM neighbors that are not PORT enabled.
PORT Join/Prune messages are sent using a TCP/SCTP connection. When two PIM neighbors are PORT enabled, both for TCP or both for SCTP, they will immediately, or on-demand, establish a connection. If the connection goes down, they will again immediately, or on-demand, try to reestablish the connection. No Join/Prune messages (neither Native nor PORT) are sent while there is no connection. Also, any received native Join/Prune messages from that neighbor are discarded, even when the connection is down.
When PORT is used, only incremental Join/Prune messages are sent from downstream routers to upstream routers. As such, downstream routers do not generate periodic Join/Prune messages for state for which the RPF neighbor is PORT-capable.
For Joins and Prunes that are received over a TCP/SCTP connection, the upstream router does not start or maintain timers on the outgoing interface entry. Instead, it keeps track of which downstream routers have expressed interest. An interface is deleted from the outgoing interface list only when all downstream routers on the interface no longer wish to receive traffic. If there are also native Joins/Prunes from a non-PORT neighbor, the router can maintain timers on the outgoing interface entry as usual, while at the same time keeping track of each of the downstream PORT Joins/Prunes.
This document does not update the PIM Join/Prune packet format. In the procedures described in this document, each PIM Join/Prune message is included in the payload of a PORT message carried over TCP/SCTP. See Section 5 for details on the PORT message.
Option Type: PIM-over-TCP Capable
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Type = 27            |         Length = 4 + X        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     TCP Connection ID AFI     |   Reserved    |      Exp      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       TCP Connection ID                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Allocated Hello Type values can be found in [HELLO-OPT].
When a router is configured to use PIM over TCP on a given interface, it MUST include the PIM-over-TCP Capable hello option in its Hello messages for that interface. If a router is explicitly disabled from using PIM over TCP, it MUST NOT include the PIM-over-TCP Capable hello option in its Hello messages.
All Hello messages containing the PIM-over-TCP Capable hello option MUST also contain the Interface ID hello option; see Section 3.3.
Implementations MAY provide a configuration option to enable or disable PORT functionality. It is RECOMMENDED that this capability be disabled by default.
Option Type: PIM-over-SCTP Capable
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Type = 28            |         Length = 4 + X        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    SCTP Connection ID AFI     |   Reserved    |      Exp      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      SCTP Connection ID                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Allocated Hello Type values can be found in [HELLO-OPT].
When a router is configured to use PIM over SCTP on a given interface, it MUST include the PIM-over-SCTP Capable hello option in its Hello messages for that interface. If a router is explicitly disabled from using PIM over SCTP, it MUST NOT include the PIM-over-SCTP Capable hello option in its Hello messages.
All Hello messages containing the PIM-over-SCTP Capable hello option MUST also contain the Interface ID hello option; see Section 3.3.
Implementations MAY provide a configuration option to enable or disable PORT functionality. It is RECOMMENDED that this capability be disabled by default.
All Hello messages containing PIM-over-TCP Capable or PIM-over-SCTP Capable hello options MUST also contain the Interface ID hello option [I-D.ietf-pim-hello-intid].
The Interface ID is used to associate a PORT Join/Prune message with the PIM neighbor that it is coming from. When unnumbered interfaces are used, or when a single Transport connection is used for sending and receiving Join/Prune messages over multiple interfaces, the Interface ID is used to convey the interface from the Join/Prune message sender to the Join/Prune message receiver. The value of the Interface ID hello option in Hellos sent on an interface MUST be the same as the Interface ID value in all PORT Join/Prune messages sent to a PIM neighbor on that interface.
The Interface ID need only uniquely identify an interface of a router; it does not need to identify which router the interface belongs to. This means that the Router ID part of the Interface ID MAY be 0. For details on the Router ID and the value 0, see [I-D.ietf-pim-hello-intid].
While a router interface is PORT enabled, a PIM-over-TCP or a PIM-over-SCTP option MUST be included in the PIM Hello messages sent on that interface. When a router on a PORT-enabled interface receives a Hello message containing a PIM-over-TCP/PIM-over-SCTP Option from a new neighbor, or an existing neighbor that did not previously include the option, it switches to PORT mode for that particular neighbor.
When a router switches to PORT mode for a neighbor, it stops sending and accepting Native Join/Prune messages for that neighbor. Any state from previous Native Join/Prune messages is left to expire as normal. It will also attempt to establish a Transport connection (TCP or SCTP) with the neighbor. If both the router and its neighbor have announced both PIM-over-TCP and PIM-over-SCTP options, SCTP MUST be used. This resolves the issue where two transports are both offered. The method prefers SCTP over TCP, because SCTP has benefits such as call collision handling and support for multiple streams, as discussed later in this document.
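The transport selection rule above can be sketched as follows. This is a minimal illustration, not part of the specification: the function name `select_transport` and the string labels "tcp" and "sctp" are hypothetical, and returning None when the two routers share no transport is an assumption, since that case is not spelled out in this text.

```python
from typing import Optional, Set


def select_transport(local: Set[str], remote: Set[str]) -> Optional[str]:
    """Pick the transport for a PORT neighbor from the announced Hello options.

    `local` and `remote` hold the labels "tcp" and/or "sctp" for the
    PIM-over-TCP Capable and PIM-over-SCTP Capable options each side
    announced (hypothetical representation).
    """
    common = local & remote
    if "sctp" in common:
        # Both sides offer SCTP; when both transports are offered,
        # SCTP MUST be used.
        return "sctp"
    if "tcp" in common:
        return "tcp"
    # Assumption: no shared transport means no PORT connection is opened.
    return None
```

For example, `select_transport({"tcp", "sctp"}, {"tcp", "sctp"})` yields "sctp", while a neighbor announcing only TCP yields "tcp".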
When the router is using TCP, it will compare the TCP Connection ID it announced in the PIM-over-TCP Capable Option with the TCP Connection ID in the Hello received from the neighbor. Unless connections are opened on-demand (see below), the router with the lower Connection ID MUST do an active Transport open to the neighbor Connection ID. The router with the higher Connection ID MUST do a passive Transport open. An implementation MAY open connections only on-demand; in that case, it may be the neighbor with the higher Connection ID that does the active open (see Section 4.5). If the router with the lower Connection ID chooses to do an active open only on-demand, it MUST do a passive open, allowing the neighbor to initiate the connection. Note that the source address of the active open MUST be the announced Connection ID.
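As an illustration of the comparison above, the following sketch decides which kind of TCP open a router performs when it learns a neighbor. The function name `tcp_open_role` and its parameters are hypothetical, and comparing the Connection IDs as unsigned integers of IP addresses from the same address family is an assumption made for this example.

```python
import ipaddress


def tcp_open_role(local_id: str, remote_id: str, on_demand: bool = False) -> str:
    """Return "active" or "passive" for the initial TCP open.

    `on_demand` is True when the local router is configured to open
    connections only when it has a Join/Prune message to send.
    """
    local = ipaddress.ip_address(local_id)
    remote = ipaddress.ip_address(remote_id)
    if local.version != remote.version:
        raise ValueError("Connection IDs must be in the same address family")
    if int(local) == int(remote):
        raise ValueError("Connection IDs must differ")
    if int(local) < int(remote) and not on_demand:
        # Lower Connection ID: active open to the neighbor Connection ID.
        return "active"
    # Higher Connection ID, or lower Connection ID deferring to on-demand:
    # do a passive open so that the neighbor can initiate the connection.
    return "passive"
```

Note that the comparison is numeric, not textual: "192.0.2.9" is lower than "192.0.2.10".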
When the router is using SCTP, the IP address comparison need not be done since the SCTP protocol can handle call collision.
If PORT is used both for IPv4 and IPv6, both IPv4 and IPv6 PIM Hello messages MUST be sent, both containing PORT Hello options. If two neighbors announce the same transport (TCP or SCTP) and the same Connection ID in the IPv4 and IPv6 Hello messages, then only one connection is established and is shared. Otherwise, two connections are established and are used separately.
The PIM router that performs the active open initiates the connection with a locally generated source transport port number and a well-known destination transport port number. The PIM router that performs the passive open listens on the well-known local transport port number and does not qualify the remote transport port number. See Section 5 for well-known port number assignment for PORT.
When a Transport connection is established (or reestablished), the two routers MUST both send a full set of Join/Prune messages for state for which the other router is the upstream neighbor. This is needed to ensure that the upstream neighbor has the correct state. When moving from Datagram mode, or when the connection has gone down, the router cannot be sure that all the previous Join/Prune state was received by the neighbor. Any state created before the connection was established (or reestablished) that is not refreshed, MUST be left to expire and be deleted. When the non-refreshed state has expired and been deleted, the two neighbors will be in sync.
When not running PORT, a full update is only needed when a router restarts; with PORT, it must be done every time a connection is established. This can be costly, although it is expected that it is a rare event for a PORT connection to go up and down. There may be a need for extensions to better handle this.
It is possible that a router starts sending Hello messages with a new Connection ID, e.g. due to configuration changes. A router MUST always use the last announced and last seen Connection IDs. A connection is identified by the local Connection ID (the one we are announcing on a particular interface), and the remote Connection ID (the one we are receiving from a neighbor on the same interface). When either the local or remote ID changes, the Connection ID pair we need a connection for changes. There may be an existing connection with the same pair, in which case the router will share that connection. Or a new connection may need to be established. Note that for link-local addresses, the interface should be regarded as part of the ID, so that connection sharing is not attempted when the same link-local addresses are seen on different interfaces.
When a Connection ID changes, if the previously used connection is not needed (there are no other PIM neighborships using the same Connection ID pair), both peers MUST attempt to reset the transport connection. Next (even if the old connection is still needed), they MUST, unless a connection already exists with the new Connection ID pair, immediately or on-demand attempt to establish a new connection with the new Connection ID pair.
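The connection sharing rules above can be pictured as a table keyed by the Connection ID pair. The sketch below is one possible bookkeeping structure, not a mandated design: `connection_key` and `ConnectionTable` are hypothetical names, and Connection IDs are assumed to be IP address strings.

```python
import ipaddress
from typing import Dict, Optional, Set, Tuple

Key = Tuple[str, str, Optional[str]]
User = Tuple[str, str]  # (interface name, neighbor address)


def connection_key(local_id: str, remote_id: str, ifname: str) -> Key:
    """Build the lookup key for a Connection ID pair.

    For link-local addresses the interface is regarded as part of the ID,
    so the same addresses seen on different interfaces never share.
    """
    scoped = (ipaddress.ip_address(local_id).is_link_local
              or ipaddress.ip_address(remote_id).is_link_local)
    return (local_id, remote_id, ifname if scoped else None)


class ConnectionTable:
    def __init__(self) -> None:
        # Connection ID pair -> PIM neighborships currently using it.
        self.users: Dict[Key, Set[User]] = {}

    def attach(self, key: Key, user: User) -> bool:
        """Register a neighborship; True means a new connection is needed."""
        is_new = key not in self.users
        self.users.setdefault(key, set()).add(user)
        return is_new

    def detach(self, key: Key, user: User) -> bool:
        """Remove a neighborship; True means the connection is now unused."""
        members = self.users.get(key, set())
        members.discard(user)
        if not members:
            self.users.pop(key, None)
            return True
        return False
```

When a Connection ID changes, an implementation along these lines would call `detach` with the old key (resetting the connection if it returns True) and `attach` with the new key (opening a connection, immediately or on-demand, if it returns True).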
Normally the Interface ID would not change while a connection is up. However, if it does, it does not affect the connection. It just means that when subsequent PORT join/prune messages are received, they should be matched against the last seen Interface ID.
Note that a Join sent over a Transport connection will only be seen by the upstream router, and thus will not cause routers on the link that do not use PIM PORT with the upstream router to delay the refresh of Join state for the same state. Similarly, a Prune sent over a Transport connection will only be seen by the upstream router, and thus will never cause routers on the link that do not use PIM PORT with the upstream router to send a Join to override this Prune.
Note also that a datagram PIM Join/Prune message for a given (S,G) or (*,G) sent by some router on a link will not cause routers on the same link that use a Transport connection with the upstream router for that state to suppress the refresh of that state to the upstream router (because they do not need to periodically refresh this state) or to send a Join to override a Prune (as the upstream router will only stop forwarding the traffic when all joined routers that use a Transport connection have explicitly sent a Prune for this state, as explained in Section 6).
TCP/SCTP packets used for PORT MUST be sent with a TTL/Hop Limit of 255 to facilitate enabling of the Generalized TTL Security Mechanism (GTSM) [RFC5082]. Implementations SHOULD provide a configuration option to enable the GTSM check at the receiver. The check verifies that inbound packets from directly connected neighbors have a TTL/Hop Limit of 255; implementations MAY also allow a different TTL/Hop Limit threshold, to check that the sender is within a certain number of router hops. The GTSM check SHOULD be disabled by default.
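On the sending side, the TTL/Hop Limit requirement can be met with standard socket options. The sketch below shows this for a TCP socket using the Python standard library; the helper name `set_gtsm_ttl` is hypothetical. The receiver-side GTSM check is platform specific and is not shown.

```python
import socket


def set_gtsm_ttl(sock: socket.socket) -> None:
    """Make all packets sent on `sock` carry a TTL/Hop Limit of 255."""
    if sock.family == socket.AF_INET6:
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_UNICAST_HOPS, 255)
    else:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, 255)


# Usage: configure the socket before connecting or listening.
port_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
set_gtsm_ttl(port_sock)
configured_ttl = port_sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL)
port_sock.close()
```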
Implementations SHOULD support the TCP Authentication Option (TCP-AO) [RFC5925] and SCTP Authenticated Chunks [RFC4895].
TCP is designed to keep connections up indefinitely during a period of network disconnection. If a PIM-over-TCP router fails, the TCP connection may stay up until the neighbor actually reboots, and even then it may continue to stay up until you actually try to send the neighbor some information. This is particularly relevant to PIM, since the flow of Join/Prune messages might be in only one direction, and the downstream neighbor might never get any indication via TCP that the other end of the connection is not really there.
SCTP has a heart beat mechanism that can be used to detect that a connection is working, even when no data is sent.
One can detect that a PORT connection is not working by regularly sending PORT messages. This applies to both TCP and SCTP. E.g., for TCP the connection will be reset if no TCP ACKs are received after a few retries. PORT in itself does not require any periodic signaling. PORT Join/Prune messages are only sent when there is a state change. If the state changes are not frequent enough, a PORT Keep-Alive message (defined in Section 5.2) can be sent instead. E.g., if an implementation wants to send a PORT message at least every 60 seconds to check that the connection is working, then whenever 60 seconds have passed since the previous message, a Keep-Alive message could be sent. If there were less than 60 seconds between each Join/Prune, no Keep-Alive messages would be needed. Implementations SHOULD support the use of PORT Keep-Alive messages. It is RECOMMENDED that a configuration option be available to network administrators to enable it when needed. Note that Keep-Alives can be used by a peer independently of whether the other peer supports them.
An implementation that supports Keep-Alive messages acts as follows when processing a received PORT message. When processing a Keep-Alive message with a non-zero Holdtime value, it MUST set a timer to the value. We call this timer the Connection Expiry Timer (CET). If the CET is already running, it MUST be reset to the new value. When processing a Keep-Alive message with a zero Holdtime value, the CET MUST be stopped if running. When processing a PORT message other than Keep-Alive, the CET MUST be reset to the last received Holdtime value if running. If the CET is not running, no action is taken. If the CET expires, the connection SHOULD be shut down. This specification does not mandate a specific default Holdtime value. However, the dynamic congestion and flow control in TCP and SCTP can result in variable transit delay between the endpoints: capacity may vary, and there may be loss in the network or variable link performance. Consistent behaviour therefore requires a sufficiently large Holdtime value, e.g., 60 seconds, to prevent premature termination.
It is possible that a router receives Join/Prune messages for an interface/link that is down. As long as the neighbor has not expired, it is RECOMMENDED to process those messages as usual. If they are ignored, then the router SHOULD ensure it gets a full update for that interface when it comes back up. This can be done by changing the GenID (Generation Identifier, see [RFC4601]), or by terminating and reestablishing the connection.
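The Connection Expiry Timer rules above amount to a small state machine. The following sketch models it with explicit timestamps rather than a real timer so that the behaviour is easy to follow; the class and method names are hypothetical, and times are assumed to be in seconds.

```python
from typing import Optional


class ConnectionExpiryTimer:
    """Models the CET for one PORT connection (illustrative only)."""

    def __init__(self) -> None:
        self.holdtime = 0                       # last received Holdtime
        self.deadline: Optional[float] = None   # None means "not running"

    def on_keepalive(self, holdtime: int, now: float) -> None:
        if holdtime == 0:
            # Zero Holdtime: stop the CET if it is running.
            self.deadline = None
        else:
            # Non-zero Holdtime: set (or reset) the CET to the new value.
            self.holdtime = holdtime
            self.deadline = now + holdtime

    def on_other_message(self, now: float) -> None:
        # Any other PORT message resets a running CET to the last
        # received Holdtime; a stopped CET is left alone.
        if self.deadline is not None:
            self.deadline = now + self.holdtime

    def expired(self, now: float) -> bool:
        # When this returns True, the connection SHOULD be shut down.
        return self.deadline is not None and now >= self.deadline
```

With a Holdtime of 60 received at time 0, the timer expires at time 60 unless a further PORT message arrives first; a Join/Prune at time 50 moves the expiry to time 110.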
If a PORT neighbor changes its GenID and a connection is established or attempting to be established, the local side should generally tear down the connection and do as described in Section 4.3. However, if the connection is shared by multiple interfaces and the GenID changes only for one of them, the local side SHOULD simply send a full update, similar to other cases when a GenID changes for an upstream neighbor.
A connection may go down for a variety of reasons. It may be due to an error condition, or a configuration change. A connection SHOULD be shut down as soon as there are no more PIM neighbors using it. That is, for the connection we have associated local and remote Connection IDs. When there is no PIM neighbor with that particular remote connection ID on any interface where we announce the local connection ID, the connection SHOULD be shut down. This may happen when a new connection ID is configured, PORT is disabled, or a PIM neighbor expires.
If a PIM neighbor expires, one should free connection state and downstream oif-list state for the neighbor. A downstream router, when an upstream neighboring router has expired, will simply update the RPF neighbor for the corresponding state to a new neighbor, toward which it would trigger Join/Prune messages. This behavior follows [RFC4601], which also defines the term RPF neighbor. A PIM router is required to clear its neighbor table of a neighbor that has timed out due to neighbor holdtime expiration.
When a connection is no longer available between two PORT enabled PIM neighbors, they MUST immediately, or on-demand, try to reestablish the connection following the normal rules for connection establishment. The neighbors MUST also start expiry timers so that all oif-list state for the neighbor using the connection, gets expired after J/P_Holdtime, unless it later gets refreshed by receiving new Join/Prunes.
The value of J/P_Holdtime is 210 seconds. This value is based on section 4.11 of [RFC4601], which says that J/P_HoldTime should be 3.5 * t_periodic, where the default for t_periodic is 60 seconds.
There may be situations where an administrator decides to stop using PORT. If PORT is disabled on a router interface, or a previously PORT enabled neighbor no longer announces any of the PORT Hello options, the router follows the rules in Section 4.3 for taking down connections and starting timers. Next, the router SHOULD trigger a full state update similar to what would be done if the GenID changed in Datagram Mode. The router SHOULD send Join/Prune messages for any state where the router switched from PORT to Datagram Mode for the upstream neighbor.
Transport connections could be established when they are needed or when a router interface to other PIM neighbors has come up. The advantage of on-demand Transport connection establishment is the reduction in the use of router resources, especially in the case where there is no need for a full mesh of connections on a network interface. The disadvantage is additional delay and queueing when a Join/Prune message needs to be sent and a Transport connection is not established yet.
If a router interface has become operational and PIM neighbors are learned from Hello messages, at that time, Transport connections may be established. The advantage is that a connection is ready to transport data by the time a Join/Prune message needs to be sent. The disadvantage is there can be more connections established than needed. This can occur when there is a small set of RPF neighbors for the active distribution trees compared to the total number of neighbors. Even when Transport connections are pre-established before they are needed, a connection can go down and an implementation will have to deal with an on-demand situation.
Note that for TCP, it is the router with the lower Connection ID that decides whether to open a connection immediately, or on-demand. The router with the higher Connection ID SHOULD only initiate a connection on-demand. That is, if it needs to send a Join/Prune message and there is no currently established connection.
Therefore, this specification RECOMMENDS but does not mandate the use of on-demand Transport connection establishment.
Based on this specification, a Transport connection cannot be established until a Hello message is received. One reason for this is to determine if the PIM neighbor supports this specification and the other is to determine the remote address to use to establish the Transport connection.
There are cases where it is desirable to suppress the transmission of Hello messages entirely. In such cases, both whether the PIM neighbor supports this specification and which remote address to use for the Transport connection have to be determined by an out-of-band method (outside of the PIM protocol). How this is done is outside the scope of this document.
To ensure that there is only one TCP connection between a pair of PIM neighbors, the following set of rules MUST be followed. Note that this section applies only to TCP, for SCTP this is not an issue. Let A and B be two PIM neighbors where A's Connection ID is numerically smaller than B's Connection ID, and each is known to the other as having a potential PIM adjacency relationship.
At node A:
At node B:
It may be desirable for scaling purposes to allow Join/Prune messages from different PIM protocol families to be sent over the same Transport connection. Also, it may be desirable to have a set of Join/Prune messages for one address-family sent over a Transport connection that is established over a different address-family network layer.
To be able to do this we need a common PORT message format. This will provide both record boundary and demux points when sending over a stream protocol like TCP/SCTP.
A PORT message may contain PORT options, see Section 5.3. We will define two PORT options for carrying PIM Join/Prune messages. One for IPv4 and one for IPv6. For each PIM Join/Prune message to be sent over the Transport connection, we send a PORT Join/Prune message containing exactly one such option.
Each PORT message will have the Type/Length/Value format. Multiple different TLV types can be sent over the same Transport connection.
To make sure PIM Join/Prune messages are delivered as soon as the TCP transport layer receives the Join/Prune buffer, the TCP Push flag will be set in all outgoing Join/Prune messages sent over a TCP transport connection.
PORT messages will be sent using destination TCP port number 8471. When using SCTP as the reliable transport, destination port number 8471 will be used. See Section 11 for IANA considerations.
PORT messages are error checked. The checks include detecting illegal type fields and truncated messages. If the PORT message contains a PIM Join/Prune Message, then that is subject to the normal PIM error checks. If any parsing errors occur in a PORT message, it is skipped, and we proceed to any following PORT messages.
The TLV type field is 16 bits. The range 65532 - 65535 is for experimental use [RFC3692].
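Because TCP delivers a byte stream, a receiver has to recover PORT message boundaries from the TLV header. The sketch below shows one way to do this. It assumes a 16-bit length field that counts only the bytes following the 4-byte Type/Length header, and it treats types 1 and 2 as the known message types; these assumptions and the name `extract_messages` are illustrative and are not taken from this text.

```python
import struct
from typing import List, Tuple

HEADER = struct.Struct("!HH")   # 16-bit type, 16-bit length, network order
KNOWN_TYPES = {1, 2}            # assumed: Join/Prune and Keep-Alive


def extract_messages(buf: bytearray) -> List[Tuple[int, bytes]]:
    """Remove every complete PORT message from `buf` and return them.

    Messages with an unknown type are skipped; an incomplete trailing
    message stays in `buf` until more bytes arrive.
    """
    messages = []
    while len(buf) >= HEADER.size:
        msg_type, length = HEADER.unpack_from(buf, 0)
        end = HEADER.size + length
        if len(buf) < end:
            break                       # wait for the rest of the message
        value = bytes(buf[HEADER.size:end])
        del buf[:end]
        if msg_type in KNOWN_TYPES:
            messages.append((msg_type, value))
        # Unknown type: skip it and proceed to the following message.
    return messages
```

A caller would append newly received bytes to the buffer and call `extract_messages` after each read.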
This document defines two message types.
PORT Join/Prune Message
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Type = 1             |        Message Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Reserved                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Interface                           |
|                              ID                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       PORT Option Type        |      Option Value Length      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Value                             |
|                               .                               |
|                               .                               |
|                               .                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                               .                               \
/                               .                               /
\                               .                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       PORT Option Type        |      Option Value Length      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Value                             |
|                               .                               |
|                               .                               |
|                               .                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The PORT Join/Prune Message is used for sending a PIM Join/Prune.
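To make the layout concrete, the following sketch builds a PORT Join/Prune message carrying a single Join/Prune option. It relies on several assumptions that are not stated in this text: the Message Length counts the bytes after the 4-byte Type/Length header, the Interface ID is an opaque 8-byte value, no padding is added, and option types 1 and 2 carry IPv4 and IPv6 Join/Prune messages. The function and constant names are hypothetical.

```python
import struct

PORT_MSG_JOIN_PRUNE = 1
OPT_IPV4_JOIN_PRUNE = 1
OPT_IPV6_JOIN_PRUNE = 2


def encode_port_join_prune(interface_id: bytes, pim_join_prune: bytes,
                           ipv6: bool = False) -> bytes:
    """Wrap one PIM Join/Prune message in a PORT Join/Prune message."""
    if len(interface_id) != 8:
        raise ValueError("Interface ID must be 8 bytes (64 bits)")
    opt_type = OPT_IPV6_JOIN_PRUNE if ipv6 else OPT_IPV4_JOIN_PRUNE
    # Option TLV: 16-bit type, 16-bit value length, then the PIM message.
    option = struct.pack("!HH", opt_type, len(pim_join_prune)) + pim_join_prune
    # Value part: 32-bit Reserved (zero), 64-bit Interface ID, options.
    value = struct.pack("!I", 0) + interface_id + option
    return struct.pack("!HH", PORT_MSG_JOIN_PRUNE, len(value)) + value
```

For a 2-byte placeholder payload, the result is 22 bytes long: a 4-byte header, 4 Reserved bytes, the 8-byte Interface ID, a 4-byte option header, and the payload.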
PORT Keep-alive Message
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Type = 2             |        Message Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Reserved                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Holdtime            |       PORT Option Type        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Option Value Length      |             Value             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               .               +
|                               .                               |
|                               .                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                               .                               \
/                               .                               /
\                               .                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       PORT Option Type        |      Option Value Length      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Value                             |
|                               .                               |
|                               .                               |
|                               .                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The PORT Keep-alive Message is used to regularly send PORT messages to verify that a connection is alive. They are used when other PORT messages are not sent at the desired frequency.
Each PORT Option is a TLV. The type is 16 bits. The PORT Option type space is split in two ranges. The types in the range 0 - 32767 (the most significant bit is not set) are for Critical Options. The types in the range 32768 - 65535 (the most significant bit is set) are for Non-Critical Options.
The behavior of a router receiving a message with an unknown PORT Option is determined by whether the option is a critical option. If the message contains an unknown critical option, the entire message must be ignored. If the option is non-critical, only that particular option is ignored, and the message is processed as if the option were not present.
PORT Option types are assigned by IANA, except the ranges 32764 - 32767 and 65532 - 65535 that are for experimental use [RFC3692]. The length specifies the length of the value in bytes. Below are the two options defined in this document.
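The critical/non-critical split can be checked with a single bit test on the option type. The sketch below parses the option portion of a PORT message and applies the rule for unknown options. The function names are hypothetical, the set of known options is limited to the two options defined in this document, and treating a truncated option as a reason to ignore the whole message is an assumption of this example.

```python
import struct
from typing import List, Optional, Tuple

KNOWN_OPTIONS = {1, 2}   # IPv4 and IPv6 Join/Prune options


def is_critical(opt_type: int) -> bool:
    """Critical Options have the most significant of the 16 bits clear."""
    return (opt_type & 0x8000) == 0


def parse_options(data: bytes) -> Optional[List[Tuple[int, bytes]]]:
    """Return the known options, or None if the message must be ignored."""
    options = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            return None                      # truncated option header
        opt_type, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if offset + length > len(data):
            return None                      # truncated option value
        value = data[offset:offset + length]
        offset += length
        if opt_type in KNOWN_OPTIONS:
            options.append((opt_type, value))
        elif is_critical(opt_type):
            return None                      # unknown critical option
        # Unknown non-critical option: ignore just this option.
    return options
```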
PIM IPv4 Join/Prune Option Format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     PORT Option Type = 1      |      Option Value Length      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   PIMv2 Join/Prune Message                    |
|                               .                               |
|                               .                               |
|                               .                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The IPv4 Join/Prune Option is used to carry a PIMv2 Join/Prune message that has all IPv4 encoded addresses in the PIM payload.
PIM IPv6 Join/Prune Option Format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     PORT Option Type = 2      |      Option Value Length      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   PIMv2 Join/Prune Message                    |
|                               .                               |
|                               .                               |
|                               .                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The IPv6 Join/Prune Option is used to carry a PIMv2 Join/Prune message that has all IPv6 encoded addresses in the PIM payload.
When explicit tracking is used, a router keeps track of join state for individual downstream neighbors on a given interface. This is done for all PORT joins and prunes. It may also be done for native join/prune messages, if all neighbors on the LAN have set the T bit of the LAN Prune Delay option (see definition in section 4.9.2 of [RFC4601]). In the discussion below we will talk about ET (explicit tracking) neighbors, and non-ET neighbors. The set of ET neighbors MUST include the PORT neighbors. The set of non-ET neighbors consists of all the non-PORT neighbors, unless all neighbors have set the LAN Prune Delay T bit. In that case, the set of ET neighbors contains all neighbors.
For some link-types, e.g. point-to-point, tracking neighbors is no different than tracking interfaces. It may also be possible for an implementation to treat different downstream neighbors as being on different logical interfaces, even if they are on the same physical link. Exactly how this is implemented, and for which link types, is left to the implementer.
For (*,G) and (S,G) state, the router starts forwarding traffic on an interface when a Join is received from a neighbor on that interface. When a non-ET neighbor sends a Prune, as specified in [RFC4601], if no Join is sent to override this Prune before the expiration of the Override Timer, the upstream router concludes that no non-ET neighbor is interested. If no ET neighbors are interested either, the interface can be removed from the oif-list. When an ET neighbor sends a Prune, the router removes the join state for that neighbor. If no other ET or non-ET neighbors are interested, the interface can be removed from the oif-list. When a PORT neighbor sends a Prune, there can be no Prune Override, since the Prune is not visible to other neighbors.
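The per-interface bookkeeping for (*,G) and (S,G) state described above might look like the following sketch. The class and method names are hypothetical. Non-ET interest is modeled as a single flag that is cleared when the Override Timer expires without an overriding Join, and timers themselves are left out.

```python
from dataclasses import dataclass, field


@dataclass
class InterfaceJoinState:
    """Join state for one (*,G) or (S,G) entry on one interface."""
    et_joined: set = field(default_factory=set)  # ET neighbors with join state
    non_et_interested: bool = False              # aggregate non-ET interest

    def et_join(self, neighbor: str) -> None:
        self.et_joined.add(neighbor)

    def et_prune(self, neighbor: str) -> None:
        # No Prune Override for ET neighbors: remove the state immediately.
        self.et_joined.discard(neighbor)

    def non_et_join(self) -> None:
        self.non_et_interested = True

    def non_et_override_expired(self) -> None:
        # A non-ET Prune was seen and no Join overrode it in time.
        self.non_et_interested = False

    def in_oif_list(self) -> bool:
        # Keep the interface while any ET or non-ET neighbor is interested.
        return bool(self.et_joined) or self.non_et_interested
```

The interface leaves the oif-list only when both the ET set is empty and the non-ET flag is clear, which matches the two removal conditions in the text.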
For (S,G,rpt) state, the router needs to track Prune state on the shared tree. It needs to know which ET neighbors have sent prunes, and whether any non-ET neighbors have sent prunes. Normally one would forward a packet from a source S to a group G out on an interface if a (*,G)-join has been received, but no (S,G,rpt)-prune. With ET one needs to do this check per ET neighbor. That is, the packet should be forwarded unless all ET neighbors that have sent (*,G)-joins have also sent (S,G,rpt)-prunes and, if a non-ET neighbor has sent a (*,G)-join, there is also non-ET (S,G,rpt)-prune state.
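The forwarding check for traffic from S to G on the shared tree can be written as a small predicate. This is a sketch under the assumption that the router keeps, per interface, the set of ET neighbors with (*,G) join state, the set of ET neighbors with (S,G,rpt) prune state, and two aggregate flags for non-ET neighbors. All names are hypothetical.

```python
def forward_on_shared_tree(
    et_star_g_joins: set,
    et_sg_rpt_prunes: set,
    non_et_star_g_join: bool,
    non_et_sg_rpt_prune: bool,
) -> bool:
    """Decide whether to forward (S,G) traffic out an interface via (*,G)."""
    # Forward if any ET neighbor joined (*,G) without pruning (S,G,rpt).
    if any(n not in et_sg_rpt_prunes for n in et_star_g_joins):
        return True
    # Otherwise forward only if non-ET neighbors joined (*,G) and there is
    # no non-ET (S,G,rpt) prune state.
    return non_et_star_g_join and not non_et_sg_rpt_prune
```

With ET neighbors A and B both joined to (*,G) and only A pruned for (S,G,rpt), the predicate is true because B still wants the traffic.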
To allow for efficient use of router resources, one can multiplex Join/Prune messages of different address families on the same Transport connection. There are two ways this can be accomplished: one uses a common message format over a TCP connection, and the other uses multiple streams over a single SCTP connection.
Using the common message format described previously in this specification, using different PORT options, both IPv4 and IPv6 based Join/Prune messages can be encoded within the same Transport connection.
When using SCTP multi-streaming, the common message format is still used to convey address family information but an SCTP association is used, on a per-family basis, to send data concurrently for multiple families. When data is sent concurrently, head of line blocking, which can occur when using TCP, is avoided.
There are no changes to processing of other PIM messages like PIM Asserts, Grafts, Graft-Acks, Registers, and Register-Stops. This goes for BSR and Auto-RP type messages as well.
This extension is applicable only to PIM-SM, PIM-SSM and Bidir-PIM. It does not take requirements for PIM-DM into consideration.
This document defines the use of TCP or SCTP transports between pairs of PIM neighbors. It is recommended that this mechanism be disabled by default. An administrator can then enable PORT TCP and/or SCTP on PIM enabled interfaces. If two neighbors both have PORT SCTP enabled, they will use only SCTP for PIM Join/Prune messages; otherwise, if both have PORT TCP enabled, they will use only TCP. This is the case even when the connection is down.
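The transport preference described above can be sketched as a simple selection function. The enum, the function name, and the string capability labels are hypothetical. The fallback to native Join/Prune messages when the neighbors share no PORT capability is an assumption of this sketch.

```python
from enum import Enum


class Transport(Enum):
    SCTP = "sctp"
    TCP = "tcp"
    NATIVE = "native"  # ordinary PIM Join/Prune messages, no PORT


def select_join_prune_transport(local: set, neighbor: set) -> Transport:
    """Pick the Join/Prune transport from the enabled PORT capabilities."""
    common = local & neighbor
    if "sctp" in common:
        return Transport.SCTP  # SCTP is preferred when both sides have it
    if "tcp" in common:
        return Transport.TCP
    return Transport.NATIVE  # assumption: no shared PORT capability
```

The function deliberately takes no connection state as input: the choice depends only on what the two neighbors have enabled, so it stays the same while the connection is down.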
When PORT support is enabled, a router sends PIM Hello messages announcing support for TCP and/or SCTP and also Connection IDs. It should be possible to configure a local Connection ID, and also to see what PORT capabilities and Connection IDs PIM neighbors are announcing. Based on these advertisements, pairs of PIM neighbors will decide whether to try to establish a PORT connection. There should be a way for an operator to check the current connection state. Statistics on the number of PORT messages sent and received (including the number of invalid messages) may also be helpful.
For connection security (see Section 4.1), it should be possible to enable a GTSM check to only accept connections (TCP/SCTP packets) when the sender is within a certain number of router hops. It should also be possible to configure the use of TCP-AO.
For connection maintenance (see Section 4.2), it is recommended to support Keep-Alive messages. It should be configurable whether to send Keep-Alives and, if so, whether to use a Holdtime and what Holdtime to use.
There should be some way to alert an operator when PORT connections are going down, or when there is a failure in establishing a PORT connection. Information such as the number of connection failures, and how long the connection has been up or down, is also useful.
There are several security issues related to the use of TCP or SCTP transports. The attacks would consist of sending packets with a spoofed source address, either to establish a connection or to inject packets into an existing connection. This might allow someone to send spoofed join/prune messages, and may also allow someone to reset the connection. Mechanisms that help protect against this are discussed in Section 4.1.
For authentication, TCP-AO [RFC5925] may be used with TCP, and Authenticated Chunks [RFC4895] with SCTP. GTSM [RFC5082] can also be used to help prevent spoofing.
This specification makes use of a TCP port number and an SCTP port number for PIM-Over-Reliable-Transport that have been allocated by IANA. It also makes use of IANA PIM Hello Options allocations that should be made permanent.
IANA has assigned a port number that is used as a destination port for PORT TCP and SCTP transports. The assigned number is 8471. References to this document should be added to the Service Name and Transport Protocol Port Number Registry.
In the Protocol Independent Multicast (PIM) Hello Options registry, the following options are needed for PORT.
Value    Length      Name                     Reference
-------  ----------  -----------------------  ---------------
27       Variable    PIM-over-TCP Capable     this document
28       Variable    PIM-over-SCTP Capable    this document
A registry for PORT message types is requested. The message type is a 16-bit integer, with values from 0 to 65535. An RFC is required for assignments in the range 0 - 65531. This document defines two PORT message types. Type 1, Join/Prune; and Type 2, Keep-alive. The type range 65532 - 65535 is for experimental use [RFC3692].
The initial content of the registry should be as follows:
Type           Name                             Reference
-------------  -------------------------------  ---------------
0              Reserved                         this document
1              Join/Prune                       this document
2              Keep-alive                       this document
3-65531        Unassigned
65532-65535    Experimental                     this document
A registry for PORT option types is requested. The option type is a 16-bit integer, with values from 0 to 65535. Option types are assigned by IANA, except the ranges 32764 - 32767 and 65532 - 65535 that are for experimental use [RFC3692]. An RFC is required for the IANA assignments. This document defines two PORT Option types. Type 1, PIM IPv4 Join/Prune Message; and Type 2, PIM IPv6 Join/Prune Message.
The initial content of the registry should be as follows:
Type           Name                                Reference
-------------  ----------------------------------  ---------------
0              Reserved                            this document
1              PIM IPv4 Join/Prune Message         this document
2              PIM IPv6 Join/Prune Message         this document
3-32763        Unassigned Critical Options
32764-32767    Experimental                        this document
32768-65531    Unassigned Non-Critical Options
65532-65535    Experimental                        this document
In addition to the persons listed as authors, significant contributions were provided by Apoorva Karan and Arjen Boers.
The authors would like to give a special thank you and appreciation to Nidhi Bhaskar for her initial design and early prototype of this idea.
Appreciation goes to Randall Stewart for his authoritative review and recommendation for using SCTP.
Thanks also go to the following for their ideas and review commentary on this specification: Mike McBride, Toerless Eckert, Yiqun Cai, Albert Tian, Suresh Boddapati, Nataraj Batchu, Daniel Voce, John Zwiebel, Yakov Rekhter, Lenny Giuliano, Gorry Fairhurst, Sameer Gulrajani, Thomas Morin, Dimitri Papadimitriou, Bharat Joshi, Rishabh Parekh, Manav Bhatia and Pekka Savola.
A special thank you goes to Eric Rosen for his very detailed review and commentary. Many of his comments are reflected as text in this specification.
[AFI]        IANA, "Address Family Numbers", ADDRESS FAMILY NUMBERS http://www.iana.org/assignments/address-family-numbers, February 2007.
[HELLO-OPT]  IANA, "PIM Hello Options", PIM-HELLO-OPTIONS per RFC4601 http://www.iana.org/assignments/pim-hello-options, March 2007.
[RFC3692]    Narten, T., "Assigning Experimental and Testing Numbers Considered Useful", BCP 82, RFC 3692, January 2004.