Internet-Draft | LISP-PubSub | February 2023 |
Rodriguez-Natal, et al. | Expires 14 August 2023 |
This document specifies an extension to the request/reply based Locator/ID Separation Protocol (LISP) control plane to enable Publish/Subscribe (PubSub) operation.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 14 August 2023.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Locator/ID Separation Protocol (LISP) [RFC9300] [RFC9301] splits IP addresses in two different namespaces: Endpoint Identifiers (EIDs) and Routing Locators (RLOCs). LISP uses a map-and-encap approach that relies on (1) a Mapping System (basically a distributed database) that stores and disseminates EID-RLOC mappings and on (2) LISP tunnel routers (xTRs) that encapsulate and decapsulate data packets based on the content of those mappings.¶
Ingress Tunnel Routers (ITRs) / Re-encapsulating Tunnel Routers (RTRs) / Proxy Ingress Tunnel Routers (PITRs) pull EID-to-RLOC mapping information from the Mapping System by means of an explicit request message. Section 6.1 of [RFC9301] indicates how Egress Tunnel Routers (ETRs) can tell ITRs/RTRs/PITRs about mapping changes. This document presents a Publish/Subscribe (PubSub) extension in which the Mapping System can notify ITRs/RTRs/PITRs about mapping changes. When this mechanism is used, notifications of mapping changes can be delivered faster and can be managed in the Mapping System rather than at the LISP sites.¶
In general, when an ITR/RTR/PITR wants to be notified of mapping changes for a given EID-Prefix, the following steps occur:¶
This operation is repeated for all EID-Prefixes for which ITRs/RTRs/PITRs want to be notified. An ITR/RTR/PITR can set the N-bit for several EID-Prefixes within a single Map-Request. Please note that the steps above illustrate only the simplest scenario and that details for this and other scenarios are described later in the document.¶
The reader may refer to [I-D.boucadair-lisp-pubsub-flow-examples] for sample flows that illustrate the use of the PubSub specification.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The document uses the terms defined in Section 3 of [RFC9300].¶
In addition to the general assumptions and expectations that [RFC9301] makes for LISP deployments, this document makes the following two deployment assumptions:¶
If either assumption is not met, a subscription cannot be established, and the network will continue operating without this enhancement. The configuration of xTR-IDs (and Site-IDs) is out of the scope of this document.¶
Figure 1 shows the format of the updated Map-Request to support the PubSub functionality. In particular, this document associates a meaning with one of the reserved bits (see Section 8).¶
The following is added to the Map-Request message defined in Section 5.2 of [RFC9301]:¶
The xTR subscribes for changes, to a given EID-Prefix, by sending a Map-Request to the Mapping System with the N-bit set on the EID-Record. The xTR builds a Map-Request according to Section 5.3 of [RFC9301] but also does the following:¶
The Map-Request is forwarded to the appropriate Map-Server through the Mapping System. This document does not assume that a Map-Server is pre-assigned to handle the subscription state for a given xTR. The Map-Server that receives the Map-Request will be the Map-Server responsible for notifying that specific xTR about future mapping changes for the subscribed mapping records.¶
Upon receipt of the Map-Request, the Map-Server processes it as described in Section 8.3 of [RFC9301]. In addition, unless the xTR is using the procedure described in Section 7.1 to create a new security association, the Map-Server MUST verify that the nonce in the Map-Request is greater than the stored nonce (if any) associated with the EID-prefix and xTR-ID. Otherwise, the Map-Server silently drops the Map-Request message and logs the event to record that a replay attack could have occurred. Furthermore, upon processing, for the EID-Record that has the N-bit set to 1, the Map-Server proceeds to add the xTR-ID contained in the Map-Request to the list of xTRs that have requested to be subscribed to that EID-Prefix.¶
If an xTR-ID is successfully added to the list of subscribers for an EID-Prefix, the Map-Server MUST extract the nonce and ITR-RLOCs present in the Map-Request, and store the association between the EID-Prefix, xTR-ID, ITR-RLOCs, and nonce. Any already present state regarding ITR-RLOCs and/or nonce for the same xTR-ID MUST be overwritten. When the LISP deployment has a single Map-Server, the Map-Server can be configured to keep a single nonce per xTR-ID for all EID-Prefixes (when used, this option MUST be enabled at the Map-Server and all xTRs).¶
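The nonce check and the state update described above can be illustrated with the following non-normative Python sketch. The names `SubscriptionTable`, `Subscription`, and `handle_subscribe`, as well as the `new_security_association` flag (standing in for the Section 7.1 exception), are assumptions made for illustration and are not defined by this document.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass

log = logging.getLogger("pubsub-sketch")


@dataclass
class Subscription:
    itr_rlocs: list[str]
    nonce: int


class SubscriptionTable:
    """Hypothetical per-Map-Server subscription state."""

    def __init__(self) -> None:
        # Keyed by (EID-Prefix, xTR-ID), as in the association described above.
        self._state: dict[tuple[str, bytes], Subscription] = {}

    def handle_subscribe(self, eid_prefix: str, xtr_id: bytes,
                         itr_rlocs: list[str], nonce: int,
                         new_security_association: bool = False) -> bool:
        """Return True if the subscription state was created or overwritten."""
        key = (eid_prefix, xtr_id)
        stored = self._state.get(key)
        # The nonce must be greater than the stored one (if any), unless a
        # new security association is being created (assumed flag).
        if (stored is not None and not new_security_association
                and nonce <= stored.nonce):
            log.warning("possible replay: nonce %d for %s not greater than %d",
                        nonce, eid_prefix, stored.nonce)
            return False  # silently drop the Map-Request
        # Any existing ITR-RLOCs and nonce for this xTR-ID are overwritten.
        self._state[key] = Subscription(list(itr_rlocs), nonce)
        return True

    def subscribers(self, eid_prefix: str) -> set[bytes]:
        return {xtr for (prefix, xtr) in self._state if prefix == eid_prefix}
```

In this sketch, a `True` return value corresponds to the case where the Map-Server goes on to acknowledge the subscription with a Map-Notify.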
If the xTR-ID is added to the list, the Map-Server MUST send a Map-Notify message back to the xTR to acknowledge the successful subscription. The Map-Server builds the Map-Notify according to Sections 5.5 and 5.7 of [RFC9301] with the following considerations:¶
As a reminder, the initial transmission and retransmission of Map-Notify messages by a Map-Server follow the procedure specified in Section 5.7 of [RFC9301]. Some state changes may trigger an overload that would impact, e.g., the outbound capacity of a Map-Server. A similar problem may be experienced when a large number of states are simultaneously updated. To prevent such phenomena, Map-Servers SHOULD be configured with policies to control the maximum number of subscriptions and also the pace of Map-Notify messages. For example, the Map-Server may be instructed to limit the resources that are dedicated to unsolicited Map-Notify messages to a small fraction (e.g., less than 10%) of its overall processing and forwarding capacity. The exact details to characterize such policies are deployment and implementation specific. Likewise, this document does not specify which notifications take precedence when these policies are enforced.¶
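As one possible illustration of such a policy, the following non-normative Python sketch combines a cap on the number of subscriptions with a token bucket that paces Map-Notify messages. The class name `NotifyPacer` and its parameters are assumptions; this document does not mandate a token bucket or any other specific policy.

```python
class NotifyPacer:
    """Hypothetical policy: subscription cap plus token-bucket pacing."""

    def __init__(self, rate_per_sec: float, burst: int,
                 max_subscriptions: int) -> None:
        self.rate = rate_per_sec          # tokens added per second
        self.burst = float(burst)         # bucket capacity
        self.max_subscriptions = max_subscriptions
        self.tokens = float(burst)        # start with a full bucket
        self.last = 0.0                   # time of the last refill
        self.subscriptions = 0

    def admit_subscription(self) -> bool:
        """Accept a new subscription only while below the configured cap."""
        if self.subscriptions >= self.max_subscriptions:
            return False
        self.subscriptions += 1
        return True

    def may_send_notify(self, now: float) -> bool:
        """Consume one token if available; `now` is a clock value in seconds."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A Map-Notify that is not admitted by `may_send_notify` would be queued or dropped according to local policy, which is deliberately left out of the sketch.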
When the xTR receives a Map-Notify with a nonce that matches one in the list of outstanding Map-Request messages sent with an N-bit set, it knows that the Map-Notify is to acknowledge a successful subscription. The xTR processes this Map-Notify, as described in Section 5.7 of [RFC9301], and MUST use the Map-Notify to populate its Map-Cache with the returned EID-Prefix and RLOC-set. As a reminder, following Section 5.7 of [RFC9301], the xTR has to send a Map-Notify-Ack back to the Map-Server. If the Map-Server does not receive the Map-Notify-Ack after exhausting the Map-Notify retransmissions described in Section 5.7 of [RFC9301], the Map-Server can remove the subscription state. If the Map-Server removes the subscription state, it SHOULD notify the xTR by sending a single Map-Notify with the same nonce but with Loc-Count = 0 (and Loc-AFI = 0), and ACT bits set to 5 "Drop/Auth-Failure".¶
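The xTR-side handling described above can be sketched as follows. This is a non-normative illustration: the types `MapNotify` and `XtrState` are hypothetical, and the sketch assumes that the xTR keeps the nonce of each N-bit Map-Request after the acknowledgment so that a later removal notice carrying the same nonce can still be matched.

```python
from __future__ import annotations

from dataclasses import dataclass, field

ACT_DROP_AUTH_FAILURE = 5  # ACT value named "Drop/Auth-Failure" in the text


@dataclass
class MapNotify:
    nonce: int
    eid_prefix: str
    rlocs: list[str]          # empty list models Loc-Count = 0
    act: int = 0


@dataclass
class XtrState:
    # nonce -> EID-Prefix for Map-Requests sent with the N-bit set
    pending: dict[int, str] = field(default_factory=dict)
    map_cache: dict[str, list[str]] = field(default_factory=dict)

    def on_map_notify(self, msg: MapNotify) -> str:
        if msg.nonce not in self.pending:
            return "other"  # not related to a subscription request
        if not msg.rlocs and msg.act == ACT_DROP_AUTH_FAILURE:
            # The Map-Server signals that it removed the subscription state.
            del self.pending[msg.nonce]
            return "subscription-removed"
        # Successful subscription: populate the Map-Cache from the Map-Notify.
        self.map_cache[msg.eid_prefix] = list(msg.rlocs)
        return "subscribed"
```

When `on_map_notify` returns `"subscribed"`, the caller would also send the Map-Notify-Ack required by Section 5.7 of [RFC9301]; message encoding and transmission are outside the scope of the sketch.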
The subscription of an xTR-ID may fail for a number of reasons. For example, it fails because of local configuration policies (such as accept and drop lists of subscribers), because the Map-Server has exhausted the resources to dedicate to the subscription of that EID-Prefix (e.g., the number of subscribers exceeds the capacity of the Map-Server), or because the xTR tried but was not successful in establishing a new security association (Section 7.1).¶
If the subscription request fails, the Map-Server MUST send a Map-Reply to the originator of the Map-Request, as described in Section 8.3 of [RFC9301]. The xTR processes the Map-Reply as specified in Section 8.1 of [RFC9301]. If the subscription request fails, it is up to the implementation to try to subscribe again.¶
If the Map-Server receives a subscription request for an EID-Prefix not present in the mapping database, it SHOULD follow the same logic described in Section 8.4 of [RFC9301] and create a temporary subscription state for the xTR-ID to the least-specific prefix that both matches the original query and does not match any EID-Prefix known to exist in the LISP-capable infrastructure. Alternatively, the Map-Server can instead determine that such a subscription request fails, and send a Negative Map-Reply following Section 8.3 of [RFC9301]. In both cases, the TTL of the temporary subscription state or the Negative Map-Reply SHOULD be configurable, with a value of 15 minutes being RECOMMENDED.¶
The subscription state can also be created explicitly by configuration at the Map-Server (possible when a pre-shared security association exists, see Section 7). In this case, the initial nonce associated with the xTR-ID (and EID-Prefix) MUST be randomly generated by the Map-Server.¶
The following specifies the procedure to remove a subscription: If the Map-Request only has one ITR-RLOC with AFI = 0 (i.e., Unknown Address), the Map-Server MUST remove the subscription state for that xTR-ID. In this case, the Map-Server MUST send the Map-Notify to the source RLOC of the Map-Request. If the Map-Server has received this Map-Request for an EID-Prefix without explicit subscription state for that xTR-ID, but covered by a less-specific EID-Prefix for which subscription state exists for the xTR-ID, the Map-Server SHOULD stop publishing updates about this more-specific EID-Prefix to that xTR, until the xTR explicitly subscribes to the more-specific EID-Prefix. The same considerations regarding authentication, integrity protection, and nonce checks described in this section and Section 7 for Map-Requests used to update subscription state, apply for Map-Requests used to remove subscription state.¶
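The removal rule can be illustrated with the short, non-normative Python sketch below. ITR-RLOCs are modeled as hypothetical `(afi, address)` tuples and the subscription state as a dictionary keyed by `(EID-Prefix, xTR-ID)`; both representations are assumptions, and the authentication, integrity, and nonce checks that apply to such Map-Requests are omitted.

```python
from __future__ import annotations

from typing import Optional

AFI_UNKNOWN = 0  # AFI = 0, i.e., Unknown Address


def is_removal_request(itr_rlocs: list[tuple[int, str]]) -> bool:
    """True if the Map-Request carries exactly one ITR-RLOC with AFI = 0."""
    return len(itr_rlocs) == 1 and itr_rlocs[0][0] == AFI_UNKNOWN


def remove_if_requested(state: dict, eid_prefix: str, xtr_id: bytes,
                        itr_rlocs: list[tuple[int, str]],
                        source_rloc: str) -> Optional[str]:
    """Remove the subscription and return where to send the Map-Notify.

    Returns None when the Map-Request is not a removal request.
    """
    if not is_removal_request(itr_rlocs):
        return None
    state.pop((eid_prefix, xtr_id), None)
    # The Map-Notify goes to the source RLOC of the Map-Request, since the
    # single ITR-RLOC in the request is not a usable address.
    return source_rloc
```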
When an EID-Prefix is removed from the Map-Server (either when explicitly withdrawn or when its TTL expires), the Map-Server notifies its subscribers (if any) via a Map-Notify with a TTL equal to 0.¶
The publish procedure is implemented via Map-Notify messages that the Map-Server sends to xTRs. The xTRs acknowledge the reception of Map-Notifies via sending Map-Notify-Ack messages back to the Map-Server. The complete mechanism works as follows:¶
When a mapping stored in a Map-Server is updated (e.g., via a Map-Register from an ETR), the Map-Server MUST notify the subscribers of that mapping via sending Map-Notify messages with the most updated mapping information. If subscription state in the Map-Server exists for a less-specific EID-Prefix and a more-specific EID-Prefix is updated, then the Map-Notify is sent with the more-specific EID-Prefix mapping to the subscribers of the less-specific EID-Prefix mapping. The Map-Notify message sent to each of the subscribers as a result of an update event follows the encoding and logic defined in Section 5.7 of [RFC9301] for Map-Notify, except for the following:¶
When the xTR receives a Map-Notify with an EID not local to the xTR, the xTR knows that the Map-Notify has been received to update an entry on its Map-Cache. The xTR MUST keep track of the last nonce seen in a Map-Notify received as a publication from the Map-Server for the EID-Prefix. When the LISP deployment has a single Map-Server, the xTR can be configured to keep track of a single nonce for all EID-Prefixes (when used, this option MUST be enabled at the Map-Server and all xTRs). If a Map-Notify received as a publication has a nonce value that is not greater than the saved nonce, the xTR drops the Map-Notify message and logs the fact that a replay attack could have occurred. The same considerations discussed in Section 5.6 of [RFC9301] regarding Map-Register nonces apply here for Map-Notify nonces.¶
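A minimal, non-normative Python sketch of this xTR-side check is given below. The class `PublicationNonceTracker` and its `single_nonce` option (modeling the single-Map-Server configuration) are hypothetical names, and the sketch assumes that the first publication seen for a prefix is accepted because no nonce has been saved yet.

```python
from __future__ import annotations

import logging

log = logging.getLogger("pubsub-sketch")


class PublicationNonceTracker:
    """Hypothetical tracker of the last nonce seen in published Map-Notifies."""

    def __init__(self, single_nonce: bool = False) -> None:
        # single_nonce=True keeps one nonce for all EID-Prefixes.
        self.single_nonce = single_nonce
        self._last: dict[str, int] = {}

    def accept(self, eid_prefix: str, nonce: int) -> bool:
        key = "*" if self.single_nonce else eid_prefix
        last = self._last.get(key)
        if last is not None and nonce <= last:
            log.warning("possible replay for %s: nonce %d <= saved %d",
                        eid_prefix, nonce, last)
            return False  # drop the Map-Notify
        self._last[key] = nonce
        return True
```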
The xTR processes the received Map-Notify as specified in Section 5.7 of [RFC9301], with the following considerations: The xTR MUST use its security association with the Map-Server (Section 7.1) to validate the authentication data on the Map-Notify. The xTR MUST use the mapping information carried in the Map-Notify to update its internal Map-Cache. If after following Section 5.7 of [RFC9301] regarding retransmission of Map-Notify messages, the Map-Server has not received back the Map-Notify-Ack, it can try to send the Map-Notify to a different ITR-RLOC for that xTR-ID. If the Map-Server tries all the ITR-RLOCs without receiving a response, it may stop trying to send the Map-Notify.¶
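Two parts of the publish procedure lend themselves to a short illustration: selecting the subscribers of an updated EID-Prefix (including subscribers of covering, less-specific prefixes) and falling back to another ITR-RLOC when no Map-Notify-Ack is received. The following Python sketch is non-normative; the function names and the `send_and_wait_ack` callable (which stands for the transmission and retransmission procedure of Section 5.7 of [RFC9301] toward one RLOC) are assumptions.

```python
from __future__ import annotations

import ipaddress
from typing import Callable, Optional


def subscribers_for_update(updated_prefix: str,
                           subscriptions: dict[str, set[str]]) -> set[str]:
    """Return xTR-IDs subscribed to the updated prefix or to a covering one."""
    updated = ipaddress.ip_network(updated_prefix)
    result: set[str] = set()
    for sub_prefix, xtr_ids in subscriptions.items():
        net = ipaddress.ip_network(sub_prefix)
        # subnet_of() requires both networks to be of the same IP version.
        if net.version == updated.version and updated.subnet_of(net):
            result |= xtr_ids
    return result


def publish(itr_rlocs: list[str],
            send_and_wait_ack: Callable[[str], bool]) -> Optional[str]:
    """Try each ITR-RLOC in turn; return the one that acknowledged, if any."""
    for rloc in itr_rlocs:
        if send_and_wait_ack(rloc):
            return rloc
    return None  # all ITR-RLOCs tried without a Map-Notify-Ack
```

For instance, with subscriptions to `10.0.0.0/8` and `10.1.0.0/16`, an update of `10.1.2.0/24` selects the subscribers of both prefixes, and the Map-Notify carries the more-specific mapping.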
Generic security considerations related to LISP control messages are discussed in Section 9 of [RFC9301].¶
In the particular case of PubSub, cache poisoning via malicious Map-Notify messages is avoided by the use of nonce and the security association between the ITRs and the Map-Servers.¶
To prevent xTR-ID hijacking, it is RECOMMENDED to follow guidance from Section 9 of [RFC9301] to ensure integrity protection of Map-Request messages. It is also RECOMMENDED that the Map-Resolver verifies that the xTR is allowed to use PubSub and to use the xTR-ID and ITR-RLOCs included in the Map-Request. Map-Servers SHOULD be configured to only accept subscription requests from Map-Resolvers that verify Map-Requests as previously described.¶
Since Map-Notifies from the Map-Server to the ITR need to be authenticated, there is a need for a soft-state or hard-state security association (e.g., a PubSubKey) between the ITRs and the Map-Servers. For some controlled deployments, it might be possible to have a shared PubSubKey (or set of keys) between the ITRs and the Map-Servers. However, if pre-shared keys are not used in the deployment, LISP-SEC [RFC9303] can be used as follows to create a security association between the ITR and the MS.¶
First, when the ITR is sending a Map-Request with the N-bit set following Section 5, the ITR also performs the steps described in Section 5.4 of [RFC9303]. The ITR can then generate a PubSubKey by deriving a key from the One-Time Key (OTK) as follows: PubSubKey = KDF( OTK ), where KDF is the Key Derivation Function indicated by the OTK Wrapping ID. If OTK Wrapping ID equals NULL-KEY-WRAP-128 then the PubSubKey is the OTK. Note that as opposed to the pre-shared PubSubKey, this generated PubSubKey is different per EID-Prefix the ITR subscribes to (since the ITR will use a different OTK per Map-Request).¶
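The derivation `PubSubKey = KDF( OTK )` can be sketched as follows. This is a non-normative illustration under explicit assumptions: HKDF with SHA-256 (implemented here from the Python standard library following RFC 5869), an empty salt, an empty `info` string, and an output length equal to the OTK length are placeholders only. The KDF and its inputs that actually apply are those indicated by the OTK Wrapping ID as defined in [RFC9303].

```python
import hashlib
import hmac

NULL_KEY_WRAP_128 = "NULL-KEY-WRAP-128"


def hkdf_sha256(ikm: bytes, length: int,
                salt: bytes = b"", info: bytes = b"") -> bytes:
    """HKDF extract-and-expand with HMAC-SHA-256."""
    if not salt:
        salt = bytes(hashlib.sha256().digest_size)  # all-zero default salt
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()       # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                  # expand
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_pubsub_key(otk: bytes, wrapping_id: str) -> bytes:
    """Derive the PubSubKey from the OTK (assumed KDF choice, see lead-in)."""
    if wrapping_id == NULL_KEY_WRAP_128:
        return otk  # with NULL-KEY-WRAP-128 the PubSubKey is the OTK itself
    return hkdf_sha256(otk, length=len(otk))
```

Because the ITR uses a different OTK per Map-Request, calling `derive_pubsub_key` per subscription yields a different PubSubKey per EID-Prefix, as noted above.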
When the Map-Server receives the Map-Request, it follows the procedure specified in Section 5. The Map-Server SHOULD verify that the OTK has not been used before. If PubSub is being used in an environment where replay attacks might occur, then the Map-Server MUST verify that the OTK has not been used before. If the Map-Server has to reply with a Map-Reply (e.g., due to PubSub not supported, subscription not accepted, or OTK reused), then it follows the normal LISP-SEC procedure described in Section 5.7 of [RFC9303]. No PubSubKey or security association is created in this case.¶
Otherwise, if the Map-Server has to reply with a Map-Notify (e.g., due to subscription accepted) to a received Map-Request, the following extra steps take place:¶
Note that if the Map-Server replies with a Map-Notify, none of the regular LISP-SEC steps regarding Map-Reply described in Section 5.7 of [RFC9303] takes place.¶
Misbehaving nodes may send massive numbers of subscription requests, which may exhaust the resources of a Map-Server. Furthermore, frequently changing the state of a subscription may also be considered as an attack vector. To mitigate such issues, Section 5.3 of [RFC9301] discusses rate-limiting Map-Requests and Section 5.7 of [RFC9301] discusses rate-limiting Map-Notifies. Note that when the Map-Notify rate-limit threshold is met for a particular xTR-ID, the Map-Server will discard additional subscription requests from that xTR-ID and will fall back to [RFC9301] behavior when receiving a Map-Request from that xTR-ID (i.e., the Map-Server will send a Map-Reply).¶
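The fallback behavior can be illustrated with the following non-normative Python sketch, which counts Map-Notifies per xTR-ID over a sliding window. The class `NotifyRateGuard`, the window-based threshold, and the string return values are assumptions; [RFC9301] does not prescribe this particular rate-limiting algorithm.

```python
from collections import defaultdict, deque


class NotifyRateGuard:
    """Hypothetical per-xTR-ID Map-Notify rate-limit threshold."""

    def __init__(self, max_notifies: int, window_seconds: float) -> None:
        self.max_notifies = max_notifies
        self.window = window_seconds
        self._sent = defaultdict(deque)  # xTR-ID -> send times (seconds)

    def record_notify(self, xtr_id: str, now: float) -> None:
        self._sent[xtr_id].append(now)

    def response_for_request(self, xtr_id: str, now: float) -> str:
        """Decide how to answer an N-bit Map-Request from this xTR-ID."""
        sent = self._sent[xtr_id]
        while sent and now - sent[0] >= self.window:
            sent.popleft()  # forget Map-Notifies outside the window
        if len(sent) >= self.max_notifies:
            # Threshold met: ignore the subscription and answer as a
            # regular Map-Request, i.e., with a Map-Reply.
            return "map-reply"
        return "map-notify"
```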
This document requests IANA to assign a new bit from the "LISP Control Plane Header Bits: Map-Request" sub-registry under the "Locator/ID Separation Protocol (LISP) Parameters" registry available at [IANA-LISP]. The suggested position of this bit in the Map-Request message can be found in Figure 1.¶
Spec Name | IANA Name | Bit Position | Description | Reference |
---|---|---|---|---|
I | Map-Request-I | 11 | xTR-ID Bit | This-Document |
This document also requests the creation of a new sub-registry entitled "LISP Control Plane Header Bits: Map-Request-Record" under the "Locator/ID Separation Protocol (LISP) Parameters" registry available at [IANA-LISP].¶
The initial content of this sub-registry is shown in Table 2:¶
Spec Name | IANA Name | Bit Position | Description | Reference |
---|---|---|---|---|
N | Map-Request-N | 1 | Notification-Requested Bit | This-Document |
The remaining bits are Unassigned.¶
The policy for allocating new bits from this sub-registry is Specification Required (Section 4.6 of [RFC8126]).¶
Review requests are evaluated on the advice of one or more designated experts. Criteria that should be applied by the designated experts include determining whether the proposed registration duplicates existing entries and whether the registration description is sufficiently detailed and fits the purpose of this registry. These criteria are considered in addition to those already provided in Section 4.6 of [RFC8126] (e.g., the proposed registration must be documented in a permanent and readily available public specification). The designated experts will either approve or deny the registration request, communicating this decision to IANA. Denials should include an explanation and, if applicable, suggestions as to how to make the request successful.¶
Early implementations of PubSub have been running in production networks for some time. The following subsections provide an inventory of lessons learned from these deployments.¶
Some LISP deployments are using PubSub as a way to monitor EID-Prefixes (particularly, EID-to-RLOC mappings). To that aim, some LISP implementations have extended the LISP Internet Groper (lig) [RFC6835] tool to use PubSub. Such an extension is meant to support an interactive mode with lig, and request subscription for the EID of interest. If there are RLOC changes, the Map-Server sends a notification and then the lig client displays that change to the user.¶
Section 8.1 of [RFC9301] suggests two TTL values for Negative Map-Replies: either 15-minute (if the EID-Prefix does not exist) or 1-minute (if the prefix exists but has not been registered). While these values are based on the original operational experience of the LISP protocol designers, negative cache entries have two unintended effects that were observed in production.¶
First, if the xTR keeps receiving traffic for a negative EID destination (i.e., an EID-Prefix with no RLOCs associated with it), it will try to resolve the destination again once the cached state expires, even if the state has not changed in the Map-Server. It was observed in production that this happens often in networks that have a significant amount of traffic destined outside of the LISP network. This might result in excessive resolution signaling to keep retrieving the same state due to the cache expiring. PubSub is used to relax TTL values and cache negative mapping entries for longer periods of time, avoiding unnecessary refreshes of these forwarding entries, and drastically reducing signaling in these scenarios. In general, a TTL-based schema is a "polling mechanism" that leads to more signaling, whereas PubSub provides an "event triggered mechanism" at the cost of state.¶
Second, if the state does indeed change in the Map-Server, updates based on TTL timeouts might prevent the cached state at the xTR from being updated until the TTL expires. This behavior was observed during configuration (or reconfiguration) periods on the network, where no-longer-negative EID-Prefixes do not receive the traffic yet due to stale Map-Cache entries present in the network. With the activation of PubSub, stale caches can be updated as soon as the state changes.¶
An improved convergence time was observed in the presence of mobility events on LISP networks running PubSub as compared with running LISP [RFC9301]. As described in Section 4.1.2.1 of [I-D.ietf-lisp-eid-mobility], LISP can rely on data-driven Solicit-Map-Requests (SMRs) to ensure eventual network convergence. Generally, PubSub offers faster convergence due to (1) no need to wait for a data-triggered event and (2) less signaling as compared with the SMR-based flow. Note that when a Map-Server running PubSub has to update a large number of subscribers at once (i.e., when a popular mapping is updated), SMR-based convergence may be faster for a small subset of the subscribers (those receiving PubSub updates last). Deployment experience reveals that data-driven SMRs and PubSub mechanisms complement each other and provide a fast and resilient network infrastructure in the presence of mobility events.¶
Furthermore, experience showed that not all LISP entities on the network need to implement PubSub for the network to get the benefits. In scenarios with significant traffic coming from outside of the LISP network, the experience showed that enabling PubSub in the border routers significantly improves mobility latency overall. Even if edge xTRs do not implement PubSub, and traffic is exchanged between EID-Prefixes at the edge, xTRs still converge based on data-driven events and SMR-triggered updates.¶
There is a need to interconnect LISP networks with other networks that might or might not run LISP. Some of those scenarios are similar to the ones described in [I-D.haindl-lisp-gb-atn] and [I-D.moreno-lisp-uberlay]. When connecting LISP to other networks, the experience revealed that in many deployments the point of interaction with the other domains is not the Mapping System but rather the border router of the LISP site. For those cases the border router needs to be aware of the LISP prefixes to redistribute them to the other networks. Over the years different solutions have been used.¶
First, Map-Servers were collocated with the border routers, but this was hard to scale since border routers scale at a different pace than Map-Servers. Second, decoupled Map-Servers and border routers were used with static configuration of LISP entries on the border, which was problematic when modifications were made. Third, a routing protocol (e.g., BGP) can be used to redistribute LISP prefixes from the Map-Servers to a border router, but this comes with some implications; in particular, the Map-Servers need to implement an additional protocol, which consumes resources and needs to be properly configured. Therefore, once PubSub was available, deployments started to adopt it to enable border routers to dynamically learn the prefixes they need to redistribute without the need for extra protocols or extra configuration on the network.¶
In other words, PubSub can be used to discover EID-Prefixes so they can be imported into other routing domains that do not use LISP. Similarly, PubSub can also be used to discover when EID-Prefixes need to be withdrawn from other routing domains. That is, in a typical deployment, a border router will withdraw an EID-Prefix it has been announcing to external routing domains, if it receives a notification that the RLOC-set for that EID-Prefix is now empty.¶
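The border router behavior described above can be summarized by the following non-normative Python sketch. The function `apply_notification` and its return labels are hypothetical; how prefixes are actually announced to or withdrawn from external routing domains depends on the routing protocol in use and is not modeled.

```python
from __future__ import annotations


def apply_notification(announced: set[str], eid_prefix: str,
                       rloc_set: list[str]) -> str:
    """Update the set of externally announced EID-Prefixes.

    Returns "announce", "withdraw", or "unchanged".
    """
    if rloc_set:
        if eid_prefix in announced:
            return "unchanged"
        announced.add(eid_prefix)      # newly learned EID-Prefix
        return "announce"
    if eid_prefix in announced:
        announced.remove(eid_prefix)   # RLOC-set is now empty
        return "withdraw"
    return "unchanged"
```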
EID-to-RLOC mappings can have very long TTLs, sometimes on the order of several hours. Upon the expiry of that TTL, the xTR checks if these entries are being used and removes any entry that is not being used. The problem with a very long Map-Cache TTL is that (in the absence of PubSub) if a mapping changes but is not being used, the cache entry remains but is stale. This is due to no data traffic being sent to the old location to trigger an SMR-based Map-Cache update as described in Section 4.1.2.1 of [I-D.ietf-lisp-eid-mobility]. If the network operator runs a show command on a router to track the state of the Map-Cache, the router will display multiple entries waiting to expire but with stale RLOC information. This can be confusing for operators, particularly when they are debugging problems. With PubSub, the Map-Cache is updated with the correct RLOC information, even when an entry is not being used or is waiting to expire, and this helps with debugging.¶
We would like to thank Marc Portoles, Balaji Venkatachalapathy, Bernhard Haindl, Luigi Iannone, and Padma Pillay-Esnault for their great suggestions and help regarding this document.¶
Many thanks to Alvaro Retano for the careful AD review. Thanks to Chris M. Lonvick for the security directorate review, Al Morton for the OPS-DIR review, Roni Even for the Gen-ART review, Mike McBride for the rtg-dir review, Magnus Westerlund for the tsv directorate review, and Sheng Jiang for the int-dir review.¶
This work was partly funded by the ANR LISP-Lab project #ANR-13-INFR-009 (https://www.lisp-lab.org).¶
Dino Farinacci, lispers.net, San Jose, CA, USA. Email: farinacci@gmail.com
Johnson Leong. Email: johnsonleong@gmail.com
Fabio Maino, Cisco, San Jose, CA, USA. Email: fmaino@cisco.com
Christian Jacquenet, Orange, Rennes, France. Email: christian.jacquenet@orange.com
Stefano Secci, Cnam, France. Email: stefano.secci@cnam.fr¶