Internet-Draft | Node Liveness Protocol | January 2022 |
Li | Expires 22 July 2022 |
Prompt notification of the loss of node liveness or reachability is useful for restoring services in tunneled topologies. IGP summarization prevents a node from directly observing the status of remote nodes. This document proposes a service that, in conjunction with the IGP, provides prompt notifications without impacting IGP summarization.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 22 July 2022.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Overlay services are increasingly common and are implemented by creating tunnels over a physical infrastructure. The failure of one of the tunnel endpoints implies that the traffic towards that endpoint will be lost until the other endpoint recognizes the situation and takes remedial action. Prompt notification of the failure of the other endpoint is useful in minimizing the duration of the outage.¶
Some network designs have come to rely on examining the IGP's Link State Database (LSDB) to determine node liveness and, through the IGP SPF computation, the node's reachability. However, if the network is to scale, some form of summarization must be employed, resulting in this information no longer being directly available. This document proposes a protocol that will provide prompt notification of changes in node liveness, even in networks that employ IGP summarization.¶
The service itself runs on OSPF [RFC2328] [RFC5340] Area Border Routers (ABRs) or IS-IS [ISO10589] L1-L2 routers. For brevity, we will use the term 'ABRs' for both cases.¶
This service uses a simple, hierarchical publish-subscribe architecture. Clients are nodes within non-backbone OSPF areas or L1 IS-IS areas. They register with their local ABRs. The ABRs are fully meshed, with the exception that ABRs of the same area need not interact. Notifications initiated by an ABR flow to other ABRs and from there to client nodes.¶
The availability of this service is advertised as part of the IGP, so that discovery of the service is automatic. Clients can automatically detect their local ABRs and ABRs can detect each other and automatically form the necessary hierarchy.¶
The protocol runs on top of TCP [RFC0793] and/or QUIC [RFC9000] for reliability. Security is provided by conventional transport protocol mechanisms, such as TLS [RFC5246].¶
Node liveness should not be confused with service liveness. If a node is alive, then a service may or may not be up. This protocol only tries to convey node liveness.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The Node Liveness Protocol is run by ABRs and is advertised in the IGP for connections by clients and other ABRs. Advertisements are done both into the backbone (L2) and into non-backbone (L1) areas. The advertisements into the backbone allow ABRs to automatically mesh. The advertisements into the non-backbone areas allow clients to automatically determine where the service is available.¶
An ABR advertises the IS-IS Node Liveness sub-TLV as part of the IS-IS Router Capability TLV [RFC7981]. This is injected into the ABR's L1 and L2 LSPs. The format of the IS-IS Node Liveness sub-TLV is:¶
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |      TPI      |     Port      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Number     |
+-+-+-+-+-+-+-+-+
TPI: Transport Protocol Identifier, 1 octet¶
The advertisement of this capability indicates that the node is providing the Node Liveness service on the designated port using the designated protocol. The TPI indicates the transport protocol to be used and the Port Number indicates the associated port to be used. The TPI and Port Number pair may be included multiple times to indicate that multiple protocols and port numbers are available. The length of the sub-TLV can be used to determine the number of TPI and Port Number pairs.¶
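As an illustration of how the sub-TLV length determines the number of pairs, the following is a minimal Python sketch, not a normative encoding. It assumes a 2-octet Port Number in network byte order, as the diagram suggests. The type code `NODE_LIVENESS_SUBTLV_TYPE` and the TPI values used in the usage note are hypothetical placeholders, since no code points are given here.

```python
import struct
from typing import List, Tuple

# Hypothetical placeholder; the actual code point is not assigned in this text.
NODE_LIVENESS_SUBTLV_TYPE = 0xFF

# Assumed size of one (TPI, Port Number) pair: 1 octet + 2 octets.
PAIR_SIZE = 3


def encode_isis_node_liveness(pairs: List[Tuple[int, int]]) -> bytes:
    """Build Type, Length, then one (TPI, Port Number) pair per transport."""
    value = b"".join(struct.pack("!BH", tpi, port) for tpi, port in pairs)
    if len(value) > 255:
        raise ValueError("value does not fit in a 1-octet Length field")
    return struct.pack("!BB", NODE_LIVENESS_SUBTLV_TYPE, len(value)) + value


def decode_isis_node_liveness(data: bytes) -> List[Tuple[int, int]]:
    """Return the (TPI, port) pairs; the pair count is derived from Length."""
    if len(data) < 2:
        raise ValueError("truncated sub-TLV header")
    sub_type, length = struct.unpack_from("!BB", data, 0)
    if sub_type != NODE_LIVENESS_SUBTLV_TYPE:
        raise ValueError("not a Node Liveness sub-TLV")
    value = data[2:2 + length]
    if len(value) != length or length % PAIR_SIZE != 0:
        raise ValueError("Length must cover a whole number of pairs")
    return [struct.unpack_from("!BH", value, off)
            for off in range(0, length, PAIR_SIZE)]
```

For example, `encode_isis_node_liveness([(1, 4000), (2, 4001)])` yields a sub-TLV with Length 6, and decoding it returns the same two pairs.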
The availability of the Node Liveness service is provided by the OSPF Node Liveness Sub-TLV. The OSPF Node Liveness Sub-TLV is used by both OSPFv2 and OSPFv3. The semantics are the same as the IS-IS Node Liveness Sub-TLV. The format of the OSPF Node Liveness Sub-TLV is:¶
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Type             |             Length            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      TPI      |          Port Number          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
TPI: Transport Protocol Identifier, 1 octet¶
The TPI and Port Number fields are used in the same way as for IS-IS.¶
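The OSPF form differs from the IS-IS form mainly in its 16-bit Type and Length fields. The sketch below is illustrative only. It assumes that pairs are packed back to back, that the Length excludes padding, and that the TLV is padded to a 4-octet boundary as is conventional for OSPF TLVs. The type code `OSPF_NODE_LIVENESS_TYPE` is a hypothetical placeholder.

```python
import struct
from typing import List, Tuple

# Hypothetical placeholder; the actual code point is not assigned in this text.
OSPF_NODE_LIVENESS_TYPE = 0xFFFF


def encode_ospf_node_liveness(pairs: List[Tuple[int, int]]) -> bytes:
    """Encode 16-bit Type and Length, the pairs, then zero padding."""
    value = b"".join(struct.pack("!BH", tpi, port) for tpi, port in pairs)
    padding = b"\x00" * (-len(value) % 4)  # assumed 4-octet alignment
    header = struct.pack("!HH", OSPF_NODE_LIVENESS_TYPE, len(value))
    return header + value + padding


def decode_ospf_node_liveness(data: bytes) -> List[Tuple[int, int]]:
    """Decode the pairs, ignoring any trailing padding beyond Length."""
    if len(data) < 4:
        raise ValueError("truncated TLV header")
    tlv_type, length = struct.unpack_from("!HH", data, 0)
    if tlv_type != OSPF_NODE_LIVENESS_TYPE:
        raise ValueError("not a Node Liveness Sub-TLV")
    value = data[4:4 + length]
    if len(value) != length or length % 3 != 0:
        raise ValueError("Length must cover a whole number of pairs")
    return [struct.unpack_from("!BH", value, off)
            for off in range(0, length, 3)]
```

A single pair occupies 3 octets of value and 1 octet of padding, so the encoded TLV is 8 octets long in total.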
The Node Liveness Protocol sends messages in a stream inside of the selected transport protocol. The protocol uses two message types: Registration Messages and Notification Messages.¶
The client may determine the set of ABRs that it wishes to communicate with by examination of its LSDB. The client SHOULD open connections to at least two ABRs for redundancy. If the client cannot open two connections, then the management system should be informed.¶
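The client behaviour above can be sketched as follows. This is only an illustration: the candidate list is assumed to have been derived from the LSDB already, and `try_connect` is a hypothetical callable standing in for the real transport setup. The returned flag indicates whether the management system should be informed.

```python
from typing import Callable, Iterable, List, Tuple


def connect_to_abrs(candidates: Iterable[str],
                    try_connect: Callable[[str], bool],
                    wanted: int = 2) -> Tuple[List[str], bool]:
    """Try candidate ABRs in order until `wanted` connections are open.

    Returns the ABRs that accepted a connection and a flag that is True
    when fewer than `wanted` connections could be opened.
    """
    connected: List[str] = []
    for abr in candidates:
        if len(connected) >= wanted:
            break
        if try_connect(abr):
            connected.append(abr)
    return connected, len(connected) < wanted
```

A caller would raise an alarm towards the management system whenever the second element of the result is True.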
The client MAY send Registration Messages on each of its ABR connections. A client MAY register for any number of prefixes, but it is expected that a client will send a registration for each of the tunnel endpoints that it will correspond with. A client may register for a host (a /32 or /128 prefix) or a shorter prefix. A client MUST NOT send overlapping registrations.¶
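A client can guard against overlapping registrations before sending a new one. The sketch below uses Python's standard `ipaddress` module; the function name `find_overlap` and the representation of existing registrations as a list of prefix strings are assumptions made for illustration.

```python
import ipaddress
from typing import Iterable, Optional


def find_overlap(existing: Iterable[str], candidate: str) -> Optional[str]:
    """Return the first existing prefix that overlaps `candidate`, if any."""
    new = ipaddress.ip_network(candidate)
    for text in existing:
        old = ipaddress.ip_network(text)
        # Prefixes of different address families never overlap.
        if old.version == new.version and old.overlaps(new):
            return text
    return None
```

For example, with `192.0.2.0/24` already registered, a request for the host `192.0.2.7/32` overlaps and should not be sent, while `198.51.100.1/32` is acceptable.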
Clients never send Notification Messages and never receive Registration Messages.¶
The actions of the client on receiving a Notification Message are out of scope for this document.¶
Each ABR MUST advertise the availability of the Node Liveness service into the backbone (L2) area and into any non-backbone (L1) areas.¶
Each ABR MUST have a single connection to each other ABR that is part of a different non-backbone (L1) area. To prevent duplicate connections, only one ABR should initiate the connection. For IS-IS, the node with the lowest system ID should initiate the connection. For OSPFv2, the node with the lowest IPv4 router ID should initiate the connection. For OSPFv3, the node with the lowest IPv6 router ID should initiate the connection.¶
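The tie-break above can be sketched as a comparison of identifiers. The text does not say how "lowest" is evaluated, so this sketch assumes an unsigned numeric comparison. The helper names are hypothetical, and only the IS-IS system ID and dotted-decimal router ID forms are shown.

```python
import ipaddress


def system_id_to_int(text: str) -> int:
    """Convert an IS-IS system ID such as '0000.0000.0001' to an integer."""
    return int(text.replace(".", ""), 16)


def router_id_to_int(text: str) -> int:
    """Convert a dotted-decimal router ID such as '10.0.0.1' to an integer."""
    return int(ipaddress.IPv4Address(text))


def should_initiate(local_id: int, remote_id: int) -> bool:
    """True if the local ABR holds the lower ID and so opens the connection."""
    if local_id == remote_id:
        raise ValueError("identifiers must be unique")
    return local_id < remote_id
```

Because exactly one side of every pair holds the lower identifier, each pair of ABRs ends up with a single connection.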
Each ABR may receive Registration Messages, each containing a prefix. These are retained in a Registration Database (RDB) along with their associated connection information. If a transport connection closes, then all registrations associated with the connection should be removed from the RDB. If an ABR receives a Registration Message requesting a prefix be unregistered, then the prefix should be removed from the RDB for that connection.¶
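One possible in-memory shape for the RDB is a map from connection to the set of prefixes registered on it. The class and method names below are hypothetical, and `conn_id` stands for whatever handle an implementation uses to identify a transport connection.

```python
import ipaddress
from collections import defaultdict
from typing import Set


class RegistrationDatabase:
    """Registrations keyed by the connection they arrived on."""

    def __init__(self) -> None:
        self._by_conn = defaultdict(set)

    def register(self, conn_id: str, prefix: str) -> None:
        self._by_conn[conn_id].add(ipaddress.ip_network(prefix))

    def unregister(self, conn_id: str, prefix: str) -> None:
        prefixes = self._by_conn.get(conn_id)
        if prefixes is None:
            return
        prefixes.discard(ipaddress.ip_network(prefix))
        if not prefixes:
            del self._by_conn[conn_id]

    def connection_closed(self, conn_id: str) -> None:
        # Dropping the connection drops every registration made on it.
        self._by_conn.pop(conn_id, None)

    def prefixes(self, conn_id: str) -> Set[str]:
        return {str(p) for p in self._by_conn.get(conn_id, ())}
```

Keying by connection makes the connection-close case a single removal, at the cost of scanning all connections when an event must be matched against registrations.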
If an ABR receives a Registration Message for a prefix that is being injected by a non-attached area, then it should determine the set of ABRs that are advertising that prefix or a less specific prefix and register with those ABRs for that prefix.¶
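Selecting the remote ABRs to register with amounts to a covering-prefix test. In the sketch below, `advertised` is a hypothetical map from ABR identifier to the prefixes that ABR is seen advertising; how that map is built from the IGP is not shown.

```python
import ipaddress
from typing import Dict, Iterable, List


def abrs_covering(prefix: str,
                  advertised: Dict[str, Iterable[str]]) -> List[str]:
    """Return ABRs advertising `prefix` itself or a less specific prefix."""
    target = ipaddress.ip_network(prefix)
    result = []
    for abr, prefixes in advertised.items():
        for text in prefixes:
            net = ipaddress.ip_network(text)
            # subnet_of() is also true when the two prefixes are equal.
            if net.version == target.version and target.subnet_of(net):
                result.append(abr)
                break
    return sorted(result)
```

The ABR would then send its own Registration Message for the prefix to each ABR in the returned list.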
Each ABR should monitor its IGP LSDB for changes in node liveness. If an ABR sees an addition to the LSDB, then it is considered an Up Event for that node. If an ABR sees an LSP/LSA time out or become unreachable, then it is considered a Down Event for that node. Up Events and Down Events for non-host prefixes are out of scope for this document.¶
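One simple way to derive events is to compare successive snapshots of the set of live nodes. This sketch assumes the caller has already reduced the LSDB and SPF result to a set of node identifiers that are both present and reachable; the function name and the event tuples are hypothetical.

```python
from typing import List, Set, Tuple


def liveness_events(previous: Set[str],
                    current: Set[str]) -> List[Tuple[str, str]]:
    """Compare two snapshots and return ('up'|'down', node) events."""
    up = [("up", node) for node in sorted(current - previous)]
    down = [("down", node) for node in sorted(previous - current)]
    return up + down
```

For instance, if `r3` appears and `r1` disappears between snapshots, the result is `[("up", "r3"), ("down", "r1")]`.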
If an ABR receives a Notification Message with an Up Event for a prefix, then it is considered an Up Event for the prefix. If an ABR receives a Notification Message with a Down Event for a prefix, then it is considered a Down Event for the prefix.¶
If an ABR observes an Up Event for a host, it examines its RDB for registrations for that node or for any less specific prefixes. If there are any, then the ABR sends a Notification Message with an Up Event for that host to each node that registered.¶
Similarly, if an ABR observes a Down Event for a host, it examines its RDB for registrations for that node or for any less specific prefixes. If there are any, then the ABR sends a Notification Message with a Down Event for that host to each node that registered.¶
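The lookup described in the two paragraphs above is the same for Up and Down Events: find every connection holding a registration that matches the host or a less specific prefix. A minimal sketch follows, with the RDB represented as a hypothetical map from connection identifier to registered prefixes.

```python
import ipaddress
from typing import Dict, Iterable, List


def notification_targets(rdb: Dict[str, Iterable[str]],
                         host: str) -> List[str]:
    """Return connections with a registration covering the host address."""
    addr = ipaddress.ip_address(host)
    targets = []
    for conn_id, prefixes in rdb.items():
        for text in prefixes:
            net = ipaddress.ip_network(text)
            if net.version == addr.version and addr in net:
                targets.append(conn_id)
                break  # one Notification Message per connection
    return sorted(targets)
```

The ABR would send one Notification Message for the host on each returned connection, carrying the Up or Down indication.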
A Registration Message has the following format:¶
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |              AFI              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R|  Reserved   |  Prefix len   |  Prefix ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
R: 1 bit¶
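The following sketch shows one way the Registration Message could be serialized. Several details are assumptions rather than statements of the specification: 1-octet Type and Length fields, a 2-octet AFI using IANA address family numbers (1 for IPv4, 2 for IPv6), a Length that counts the octets following the Length field, a prefix carried in the minimum number of octets, and an R bit where 1 means register and 0 means unregister. The type value `REGISTRATION_TYPE` is a hypothetical placeholder.

```python
import ipaddress
import struct
from typing import Tuple

REGISTRATION_TYPE = 1        # hypothetical placeholder value
AFI_IPV4, AFI_IPV6 = 1, 2    # IANA address family numbers
R_BIT = 0x80                 # assumed: set = register, clear = unregister


def encode_registration(prefix: str, register: bool = True) -> bytes:
    """Serialize a registration (or unregistration) for one prefix."""
    net = ipaddress.ip_network(prefix)
    afi = AFI_IPV4 if net.version == 4 else AFI_IPV6
    nbytes = (net.prefixlen + 7) // 8  # minimum octets holding the prefix
    flags = R_BIT if register else 0
    body = struct.pack("!HBB", afi, flags, net.prefixlen)
    body += net.network_address.packed[:nbytes]
    return struct.pack("!BB", REGISTRATION_TYPE, len(body)) + body


def decode_registration(data: bytes) -> Tuple[bool, str]:
    """Return (register?, prefix) from a serialized Registration Message."""
    msg_type, length = struct.unpack_from("!BB", data, 0)
    if msg_type != REGISTRATION_TYPE or len(data) < 2 + length:
        raise ValueError("not a complete Registration Message")
    afi, flags, plen = struct.unpack_from("!HBB", data, 2)
    if afi == AFI_IPV4:
        size, cls = 4, ipaddress.IPv4Network
    elif afi == AFI_IPV6:
        size, cls = 16, ipaddress.IPv6Network
    else:
        raise ValueError("unsupported AFI")
    raw = data[6:6 + (plen + 7) // 8]
    net = cls((raw.ljust(size, b"\x00"), plen))
    return bool(flags & R_BIT), str(net)
```

Under these assumptions, a registration for the host `192.0.2.1/32` is 10 octets long: a 2-octet header followed by an 8-octet body.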
A Notification Message has the following format:¶
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |              AFI              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|U|  Reserved   |  Prefix len   |  Prefix ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
U: 1 bit¶
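Because messages arrive in a byte stream, a receiver has to split the stream into messages before decoding them. The sketch below illustrates this for Notification Messages. It assumes 1-octet Type and Length fields, a Length that counts the octets after the Length field, a 2-octet AFI using IANA address family numbers, and a U bit where 1 means Up and 0 means Down. All function names are hypothetical.

```python
import ipaddress
import struct
from typing import List, Tuple

U_BIT = 0x80  # assumed: set = Up Event, clear = Down Event


def split_messages(buffer: bytearray) -> List[Tuple[int, bytes]]:
    """Remove complete messages from the front of `buffer`.

    Returns (type, body) tuples; a trailing partial message stays
    in the buffer until more bytes arrive.
    """
    messages = []
    while len(buffer) >= 2 and len(buffer) >= 2 + buffer[1]:
        length = buffer[1]
        messages.append((buffer[0], bytes(buffer[2:2 + length])))
        del buffer[:2 + length]
    return messages


def decode_notification_body(body: bytes) -> Tuple[str, str]:
    """Return ('up'|'down', prefix) from the octets after Length."""
    afi, flags, plen = struct.unpack_from("!HBB", body, 0)
    if afi == 1:
        size, cls = 4, ipaddress.IPv4Network
    elif afi == 2:
        size, cls = 16, ipaddress.IPv6Network
    else:
        raise ValueError("unsupported AFI")
    raw = body[4:4 + (plen + 7) // 8]
    net = cls((raw.ljust(size, b"\x00"), plen))
    return ("up" if flags & U_BIT else "down"), str(net)
```

A receiver would append each chunk read from the transport to the buffer, call `split_messages`, and decode each returned body.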
This document requests the following code points from the "IS-IS Sub-TLVs for IS-IS Router CAPABILITY TLV" registry.¶
This document requests the following code points from the "OSPF Router Information (RI) TLVs" registry:¶
This document creates no new security issues. Security of transport protocol connections is addressed by the use of conventional transport protocol security techniques, such as TLS. IGP advertisements are not expected to have privacy, so the advertisement of the service is not a security issue.¶