Internet-Draft | RPKI ROV Timing | February 2022 |
Bush, et al. | Expires 11 August 2022 |
This document explores, and makes recommendations for, timing of Resource Public Key Infrastructure publication of ROV data, their propagation, and their use in Relying Parties, caches, and routers.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 11 August 2022.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This document explores, and makes recommendations for, timing of Resource Public Key Infrastructure (RPKI) publication of ROV data, their propagation, and their use in Relying Parties (RP), caches, and routers.¶
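The RPKI ROA supply chain, from CAs to the point where the data reach routers, has the following structure: CAs publish to Publication Points (PPs), Relying Party caches pull from the PPs, and routers pull from their RP cache(s).¶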
As Resource Public Key Infrastructure based Route Origin Validation (ROV) becomes deployed in the Internet, the quality of the routing control plane, and hence timely and accurate delivery of packets in the data plane, increasingly depend on prompt and accurate propagation of the RPKI data from the originating Certification Authorities (CAs), to Relying Parties (RPs), to Border Gateway Protocol (BGP) speaking routers.¶
Origin Validation based on stale ROAs allows accidental or intentional mis-origination, i.e., announcement of a prefix by an AS which does not have the authority to do so. Delays in ROA propagation to ROV in routers might cause loss of good traffic. Therefore, minimizing the propagation time of data from CAs to routers is important.¶
Before the data can start on the CA to router supply chain, the resource holder (operator) MUST create, modify, or delete the relevant ROA(s) through the CA's operational interface(s). The operator is responsible for anticipating their future needs for ROAs and for being aware of the propagation time from ROA creation to effect on routing, and SHOULD create, delete, or modify ROAs sufficiently in advance of any needs in the routing system.¶
There are questions of how frequently a CA publishes, how often an RP pulls, and how often routers pull from their RP(s). Overall, the router(s) SHOULD react within an hour of ROA publication. In pessimistic circumstances, it could be two hours.¶
For CAs publishing to PPs, a few seconds to a minute seems easily achieved with reasonable software. See Section 3.¶
Relying Party validating caches periodically retrieve data from CA publication points. If RPs used rsync to poll publication points every ten minutes, the load on publication services would be a burden today; cf. one notorious repository which was structured against specification. RPs using RRDP impose less load. As the infrastructure moves from rsync to RRDP [I-D.ietf-sidrops-prefer-rrdp], polling can become more frequent: RRDP is designed for quite frequent polling, as long as Relying Parties use the If-Modified-Since header (see [RFC7232]) and there is a caching infrastructure. For rsync, the polling interval should be no longer than an hour and no shorter than half an hour. See Section 4.¶
For BGP speaking router(s) pulling from the RP(s), five minutes to an hour is a wide window. However, the RPKI-Rtr protocol has the Serial Notify PDU, the equivalent of DNS Notify [RFC1996], with which the cache tells the router that it has new data. See Section 5.¶
We discuss each of these in more detail below.¶
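The overall budget can be checked with simple arithmetic. The following Python sketch is illustrative only: the stage names and the `Stage` type are hypothetical, and the per-stage figures are the pessimistic ends of the ranges quoted above (about a minute for CA publication, an hour for the RP pull, and an hour for a router that misses the Serial Notify).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Stage:
    """One hop of the CA-to-router supply chain (hypothetical helper type)."""
    name: str
    worst_case_minutes: float


# Pessimistic per-stage delays, taken from the ranges quoted in the text.
STAGES = [
    Stage("CA publishes to publication point", 1),
    Stage("RP pulls from publication point", 60),
    Stage("Router pulls from RP cache (Serial Notify missed)", 60),
]


def worst_case_minutes(stages: Iterable[Stage]) -> float:
    """Sum the worst-case delay of every stage in the chain."""
    return sum(stage.worst_case_minutes for stage in stages)


if __name__ == "__main__":
    total = worst_case_minutes(STAGES)
    print(f"worst case: {total} minutes (about {total / 60:.1f} hours)")
```

Under these assumptions the stages add up to roughly two hours, which is consistent with the pessimistic bound given above.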
One constraint on publication timing can be ensuring the CRL and Manifest ([RFC6486]) are consistent with each other and with respect to the other repository data. With both rsync and RRDP protocols, the publication point MUST be consistent before it becomes current and is published.¶
Operators should be aware that there may be implementation-dependent delays between instructing their CAs to create and/or update ROAs and the publication of these changes in the PPs.¶
rsync puts a load on RPKI publication point servers. Therefore relying party caches have been discouraged from fetching more frequently than on the order of a half hour. Times as long as a day were even suggested. We specify that RPs using rsync SHOULD pull from CA publication points every 30 to 60 minutes.¶
With RRDP ([RFC8182]), such constraints are less relevant. [RFC8182] makes clear that polling as frequently as once a minute is acceptable if and only if Relying Parties use the If-Modified-Since header and there is caching. Absent use of the If-Modified-Since header, RPs MUST NOT poll RRDP more frequently than once every ten minutes. Use of the If-Modified-Since header is strongly RECOMMENDED.¶
Migration from rsync to RRDP, as described in [I-D.ietf-sidrops-prefer-rrdp], is recommended. During dual RRDP/rsync operation, should an RP need to fail over from RRDP to rsync, a uniformly distributed jittered delay with a mean of half the rsync interval SHOULD be used, so that clients failing over to rsync are as spread out as they would be if they had used rsync initially.¶
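As an illustration of the conditional polling described above, the following Python sketch builds an RRDP notification request carrying If-Modified-Since and treats an HTTP 304 reply as "nothing new". It is a minimal sketch, not a Relying Party implementation: the function names and the example URL are hypothetical, only the Python standard library is used, and the server-side caching that the text also requires is outside its scope.

```python
import urllib.error
import urllib.request
from email.utils import formatdate
from typing import Optional

# Minimum polling intervals in seconds, as stated in the text above.
MIN_INTERVAL_WITH_IMS = 60
MIN_INTERVAL_WITHOUT_IMS = 600


def min_poll_interval(uses_if_modified_since: bool) -> int:
    """Shortest acceptable RRDP polling interval for this client."""
    if uses_if_modified_since:
        return MIN_INTERVAL_WITH_IMS
    return MIN_INTERVAL_WITHOUT_IMS


def build_notification_request(
    url: str, last_fetch_epoch: Optional[float]
) -> urllib.request.Request:
    """Build a GET request, conditional if a previous fetch time is known."""
    headers = {}
    if last_fetch_epoch is not None:
        # HTTP-date in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
        headers["If-Modified-Since"] = formatdate(last_fetch_epoch, usegmt=True)
    return urllib.request.Request(url, headers=headers)


def fetch_if_changed(url: str, last_fetch_epoch: Optional[float]) -> Optional[bytes]:
    """Return the response body, or None if the server answers 304."""
    request = build_notification_request(url, last_fetch_epoch)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep the cached copy
            return None
        raise
```

A caller would record the time of each successful fetch and pass it as `last_fetch_epoch` on the next poll, for example `fetch_if_changed("https://rrdp.example.net/notification.xml", last_ok)`, where the URL is a placeholder.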
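One way to obtain a uniformly distributed delay whose mean is half the rsync interval is to draw uniformly from zero up to the full interval. The Python sketch below assumes that reading; the function name and the optional `rng` parameter are hypothetical and exist only to make the example testable.

```python
import random
from typing import Optional


def rsync_failover_delay(
    rsync_interval_s: float, rng: Optional[random.Random] = None
) -> float:
    """Delay in seconds before the first rsync fetch after RRDP has failed.

    Drawn uniformly from [0, rsync_interval_s], so the mean is half the
    rsync polling interval.
    """
    source = rng if rng is not None else random
    return source.uniform(0.0, rsync_interval_s)


if __name__ == "__main__":
    # Example: a 30 minute rsync interval gives a delay of 0 to 1800 seconds.
    print(f"wait {rsync_failover_delay(1800):.0f} s before falling back to rsync")
```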
A number of timers are embedded in the X.509 RPKI data which should also be considered, e.g., CRL publication commitments, expiration of EE certificates pointing to Manifests, and the Manifests themselves. Some CA operators commonly indicate that new CRL information should be available within the next 24 hours. These 24 hour sliding timers, when combined with fetching RPKI data once a day, would expose failure windows, especially in the face of transient network issues between the CA and RP. To ameliorate this, RPs SHOULD update from CAs at least as frequently as once an hour.¶
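The failure window can be made concrete with a deliberately simplified model, sketched in Python below. It assumes an object is fetched at the start of its validity window and that the RP retries only at its regular polling interval; the function names are hypothetical, and real CAs usually reissue well before expiry, which this model ignores.

```python
import math


def fetch_attempts_before_expiry(
    validity_hours: float, poll_interval_hours: float
) -> int:
    """Number of regular polls that fall within one validity window."""
    return math.floor(validity_hours / poll_interval_hours)


def tolerable_consecutive_failures(
    validity_hours: float, poll_interval_hours: float
) -> int:
    """How many polls in a row may fail before the data expires."""
    attempts = fetch_attempts_before_expiry(validity_hours, poll_interval_hours)
    return max(0, attempts - 1)
```

In this model a 24 hour timer combined with daily fetching leaves a single attempt and no margin for a transient failure, whereas hourly fetching leaves 24 attempts.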
In summary, the following timing constraints SHOULD be applied to data update: RPs SHOULD update from CAs at least once an hour. To avoid excess load, RPs SHOULD NOT update via rsync more frequently than every 30 minutes. RPs using RRDP SHOULD NOT need to update more frequently than every 10 minutes. Some form of timing jitter MUST be applied to ensure load distribution across the community. RPs SHOULD NOT force data fetch to be on the hour or similar times. Publication Points SHOULD deploy RRDP services which honor If-Modified-Since.¶
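The interval constraints in this summary can be expressed as a small configuration check. The Python sketch below is one possible encoding, not a normative one: the function name, the transport strings, and the warning texts are assumptions, and the jitter requirement is not covered.

```python
from typing import List


def check_rp_poll_interval(transport: str, interval_minutes: float) -> List[str]:
    """Return warnings for a configured RP polling interval.

    transport is "rsync" or "rrdp" (hypothetical configuration values).
    An empty list means the interval is within the summarized constraints.
    """
    if transport not in ("rsync", "rrdp"):
        raise ValueError(f"unknown transport: {transport!r}")

    warnings = []
    if interval_minutes > 60:
        warnings.append("RPs SHOULD update from CAs at least once an hour")
    if transport == "rsync" and interval_minutes < 30:
        warnings.append("rsync SHOULD NOT be used more often than every 30 minutes")
    if transport == "rrdp" and interval_minutes < 10:
        warnings.append("RRDP SHOULD NOT need polling more often than every 10 minutes")
    return warnings
```

For example, `check_rp_poll_interval("rsync", 45)` returns no warnings, while an rsync interval of 10 minutes or an RRDP interval of two hours each yields one.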
In general, CAs should have Manifest, CRL, ... timers of a few days to allow relying party operators to go away for the weekend and not fear for their control plane.¶
The rate of change of ROA data is estimated to remain small, on the order of a few ROAs a minute, but with bursts. Therefore, the routers may update from the (presumed local) relying party cache(s) quite frequently. Note that [I-D.ietf-sidrops-8210bis] recommends a polling interval of one hour. This polling timing is conservative because caches can send a Serial Notify PDU to tell routers when there are new data to be fetched. As the RP cache and the router belong to the same operator, routers are free to hammer the RPs as frequently as they wish.¶
A router SHOULD respond with a Serial Query when it receives a Serial Notify from a cache. If a router cannot respond appropriately to a Serial Notify, then it MUST send a periodic Serial Query no less frequently than once an hour.¶
Once a router has received an End of Data PDU from a cache, the new data MUST take effect on Route Origin Validation within a matter of seconds to a minute. The router MAY allow incoming VRPs to affect Origin Validation as they arrive instead of waiting for the End of Data PDU. See [I-D.ietf-sidrops-8210bis] for some cautions regarding the arrival and processing sequence of VRPs.¶
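The scheduling rule above can be sketched as a small state machine. The Python class below is hypothetical and models only the decision of when to send a Serial Query; it does not implement the RPKI-Rtr wire protocol, and time is passed in explicitly as seconds so the logic can be tested without a clock.

```python
class SerialQueryScheduler:
    """Decides when a router should send a Serial Query to its cache."""

    MAX_INTERVAL_S = 3600  # periodic query at least once an hour

    def __init__(self, now: float, handles_notify: bool = True) -> None:
        self.handles_notify = handles_notify
        self.last_query = now

    def on_serial_notify(self, now: float) -> bool:
        """Return True if a Serial Query should be sent in response."""
        if not self.handles_notify:
            return False  # fall back to the periodic timer
        self.last_query = now
        return True

    def on_timer(self, now: float) -> bool:
        """Return True if the periodic Serial Query is due."""
        if now - self.last_query >= self.MAX_INTERVAL_S:
            self.last_query = now
            return True
        return False
```

A router that can act on Serial Notify queries immediately and rarely reaches the hourly timer; one that cannot still queries at least once an hour.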
Should the supply chain include components or technologies other than those in IETF documents, the end effect SHOULD be the same; the router(s) SHOULD react to invalid AS origins within the same overall time constraint, one hour, two at most, from ROA creation at the CA publication point to effect in the router.¶
Assuming the above recommendations, in worst conditions such as an RPKI-Rtr Serial Notify PDU being ignored, it may take up to two hours for a new ROA to propagate from creation at the CA to BGP speaking routers. Therefore, it is RECOMMENDED that planned changes in ROAs take this propagation time into consideration. For example, if a new route is to be announced in BGP, the operators SHOULD create the ROA around three hours before the BGP announcement, or it may not have propagated globally.¶
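For planning purposes, the three hour lead time translates into a simple date calculation. The Python sketch below is illustrative; the function names are hypothetical, and the lead time is the approximate figure suggested above rather than a hard limit.

```python
from datetime import datetime, timedelta, timezone

# Approximate lead time suggested in the text above.
ROA_LEAD_TIME = timedelta(hours=3)


def latest_roa_creation(announce_at: datetime) -> datetime:
    """Latest time a ROA should be created for a planned BGP announcement."""
    return announce_at - ROA_LEAD_TIME


def roa_created_in_time(created_at: datetime, announce_at: datetime) -> bool:
    """True if the ROA was created early enough to have propagated."""
    return created_at <= latest_roa_creation(announce_at)


if __name__ == "__main__":
    planned = datetime(2022, 2, 7, 12, 0, tzinfo=timezone.utc)
    print("create the ROA no later than", latest_roa_creation(planned).isoformat())
```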
Despite common misconceptions and marketing, Route Origin Validation is not a magic security protocol. It is intended to catch operational errors, and is easily gamed and attacked through, for example, AS Path manipulation. It is one tool in the prudent operator's kit, and a good one.¶
If an attacker can add, delete, or modify RPKI data, either in repositories or in flight, they can affect routing and thereby steer or damage traffic. The RPKI system design does much to deter these attacks. But the 'last mile' from the cache to the router uses transport, as opposed to object, security and is vulnerable. This is discussed in [I-D.ietf-sidrops-8210bis].¶
Similarly, if an attacker can delay prompt propagation of RPKI data on the supply chain described in this document, they can affect routing, and therefore traffic flow, to their advantage.¶
This document has no IANA actions.¶
The authors wish to thank George Michaelson, Massimiliano Stucchi and Ties de Kock.¶