This section outlines a probing policy suitable for unilateral adoption by any recursive resolver.
Following this policy should not result in failed resolutions or significant delay.¶
In addition to querying on Do53, the recursive resolver will try either or both of DoT and DoQ concurrently.
The recursive resolver remembers what opportunistic encrypted transport protocols have worked recently based on a (clientIP, serverIP, protocol) tuple.¶
If a query needs to go to a given authoritative server, and the recursive resolver remembers a recent successful encrypted transport to that server, then it doesn't send the query over Do53 at all.
Rather, it only sends the query using the recently-good encrypted transport protocol.¶
If the encrypted transport protocol fails, the recursive resolver falls back to Do53 for that tuple.
When any encrypted transport fails, the recursive resolver remembers that failure for a reasonable amount of time to avoid flooding a non-compatible server with requests that it cannot accept.¶
See the subsections below for a more detailed description of this protocol.¶
In designing a probing strategy, the recursive resolver could record its knowledge about any given authoritative server with different strategies, including at least:¶
- the authoritative server's IP address,¶
- the authoritative server's name (the NS record used), or¶
- the zone that contains the record being looked up.¶
This document encourages the first strategy (keying state on the authoritative server's IP address) to minimize timeouts or accidental delays, and therefore does not describe the other two strategies.¶
A timeout (accidental delay) is most likely to happen when the recursive client believes that the authoritative server offers encrypted transport, but the actual server reached declines encrypted transport (or worse, filters the incoming traffic and does not even respond with an ICMP port closed message).¶
By associating state with the IP address, the recursive client is best able to avoid the delays that a heterogeneous deployment can cause.¶
For example, consider an authoritative server named ns0.example.com that is served by two installations (with two A records), one at 192.0.2.7 that follows this guidance, and one at 192.0.2.8 that is a legacy (cleartext port 53-only) deployment.
A recursive client who associates state with the NS name and reaches .7 first will "learn" that ns0.example.com supports encrypted transport.
A subsequent query over encrypted transport dispatched to .8 would fail, potentially delaying the response.¶
By associating the state with the authoritative IP address, the client can minimize the number of accidental delays introduced (see also Section 4.5.1 and Section 3.1).¶
A recursive resolver implementing this document needs to set system-wide values for some default parameters.
These parameters may be set independently for each supported encrypted transport, though a simple implementation may keep the parameters constant across encrypted transports.¶
Table 1: Recursive resolver system parameters per encrypted transport
Name | Description | Suggested Default |
persistence | How long should the recursive resolver remember successful encrypted transport connections? | 3 days (259200 seconds) |
damping | How long should the recursive resolver remember unsuccessful encrypted transport connections? | 1 day (86400 seconds) |
timeout | How long should the recursive resolver wait for an initiated encrypted connection to complete? | 4 seconds |
This document uses the notation E-foo to refer to the foo parameter for the encrypted transport E.¶
For example DoT-persistence would indicate the length of time that the recursive resolver will remember that an authoritative server had a successful connection over DoT.¶
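As an illustration only, the parameters in Table 1 and the E-foo notation could be modeled as follows. This is a minimal sketch: the names `TransportParams`, `PARAMS`, and `param` are hypothetical, and the shorter DoQ timeout is an arbitrary value chosen to show per-transport overrides, not a recommendation from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportParams:
    """System-wide parameters for one encrypted transport, in seconds."""
    persistence: int = 3 * 24 * 3600  # remember successes for 3 days
    damping: int = 24 * 3600          # remember failures for 1 day
    timeout: int = 4                  # wait this long for a handshake


# A simple implementation may use the same values for every transport.
# The DoQ timeout override below is purely illustrative (an assumption).
PARAMS = {
    "DoT": TransportParams(),
    "DoQ": TransportParams(timeout=2),
}


def param(transport: str, name: str) -> int:
    """Look up E-foo, e.g. param("DoT", "persistence") is DoT-persistence."""
    return getattr(PARAMS[transport], name)
```

With these definitions, `param("DoT", "persistence")` plays the role of DoT-persistence in the text.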
This document also assumes that the resolver maintains a list of outstanding cleartext queries destined for the authoritative server's IP address X.
This list is referred to as Do53-queries[X].
This document does not attempt to describe the specific operation of sending and receiving cleartext DNS queries (Do53) for a recursive resolver.
Instead, it describes a "bolt-on" mechanism that extends the recursive resolver's operation through a few simple hooks into its existing handling of Do53.¶
Implementers or deployers of DNS recursive resolvers that follow the strategies in this document are encouraged to report their preferred values of these parameters.¶
To follow this guidance, a recursive resolver MUST implement at least one of DoT or DoQ in its capacity as a client of authoritative nameservers.¶
A recursive resolver SHOULD implement the client side of DNS-over-TLS (DoT).
A recursive resolver SHOULD implement the client side of DNS-over-QUIC (DoQ).¶
DoT queries from the recursive resolver MUST target TCP port 853, with an ALPN of "dot".
DoQ queries from the recursive resolver MUST target UDP port 853, with an ALPN of "doq".
ALPN is described in [RFC7301].¶
While this document focuses on the recursive-to-authoritative hop, a recursive resolver implementing these strategies SHOULD also accept queries from its clients over some encrypted transport (current common transports are DoH or DoT).¶
The recursive resolver SHOULD keep a record of the state for each authoritative server it contacts, indexed by the IP address of the authoritative server and the encrypted transports supported by the recursive resolver.
In addition, the recursive resolver SHOULD also keep a record of its own IP addresses used for queries, as described in Section 4.5.1.¶
In addition to tracking the state of connection attempts and outcomes, a recursive resolver SHOULD record the state of established sessions for encrypted protocols.
The details of how sessions are identified are dependent on the transport protocol implementation (such as TLS session ticket or TLS session ID, QUIC connection ID, and so on).
The use of session resumption as recommended here is limited somewhat because the tickets are only stored within the context defined by the (clientIP, serverIP, protocol) tuples used to track client-server interaction by the recursive resolver in a table like the one below.
However, session resumption still offers the ability to optimize the handshake in some circumstances.¶
Each record should contain the following fields for each supported encrypted transport, each of which would initially be null:¶
Table 2: Recursive resolver state per authoritative IP, per encrypted transport
Name | Description | Retain Across Reset |
session | The associated state of any existing, established session (the structure of this value is dependent on the encrypted transport implementation). If session is not null, it may be in one of two states: pending or established | no |
initiated | Timestamp of most recent connection attempt | yes |
completed | Timestamp of most recent completed handshake (which can include one where an existing session is resumed) | yes |
status | Enumerated value of success or fail or timeout, associated with the completed handshake | yes |
last-response | A timestamp of the most recent response received on the connection | yes |
resumptions | A stack of resumption tickets (and associated parameters) that could be used to resume a prior successful session | yes |
queries | A queue of queries intended for this authoritative server, each of which has additional status early, unsent, or sent | no |
last-activity | A timestamp of the most recent activity on the connection | no |
Note that the session fields in aggregate constitute a pool of open connections to different servers.¶
With the exception of the session, queries, and last-activity fields, this cache information should be kept across restart of the server unless explicitly cleared by administrative action.¶
This document uses the notation E-foo[X] to indicate the value of field foo for encrypted transport E to IP address X.¶
For example, DoT-initiated[192.0.2.4] represents the timestamp when the most recent DoT connection packet was sent to IP address 192.0.2.4.¶
Note that the recursive resolver should record this per-authoritative-IP state for each source IP address it uses as it sends its queries.
For example, if a recursive resolver can send a packet to authoritative servers from IP addresses 192.0.2.100 and 192.0.2.200, it should keep two distinct sets of per-authoritative-IP state, one for each source address it uses.
Keeping these state tables distinct for each source address makes it possible for a pooled authoritative server behind a load balancer to do a partial rollout while minimizing accidental timeouts (see Section 3.1).¶
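The state described in Table 2, kept per source address, might be represented roughly as below. This is a sketch under stated assumptions: the class and function names (`TransportState`, `lookup`, `snapshot`) are hypothetical, timestamps are plain floats, and the session handle is left opaque because its structure depends on the transport implementation.

```python
from dataclasses import dataclass, field, fields
from typing import Optional

# Fields that Table 2 marks as not retained across a reset.
VOLATILE = ("session", "queries", "last_activity")


@dataclass
class TransportState:
    session: Optional[object] = None        # None, or a pending/established handle
    initiated: Optional[float] = None       # most recent connection attempt
    completed: Optional[float] = None       # most recent completed handshake
    status: Optional[str] = None            # "success", "fail", or "timeout"
    last_response: Optional[float] = None   # most recent response received
    resumptions: list = field(default_factory=list)  # stack of tickets
    queries: list = field(default_factory=list)      # queued queries
    last_activity: Optional[float] = None   # most recent activity


# Keyed by (client IP, server IP, transport), so that each source
# address keeps its own view of each authoritative IP address.
table = {}


def lookup(client_ip: str, server_ip: str, transport: str) -> TransportState:
    """Return E-*[X] for one source address; absent records start out null."""
    return table.setdefault((client_ip, server_ip, transport), TransportState())


def snapshot(state: TransportState) -> dict:
    """Fields to write to durable storage so they survive a restart."""
    return {f.name: getattr(state, f.name)
            for f in fields(state) if f.name not in VOLATILE}
```

For example, `lookup("192.0.2.100", "192.0.2.4", "DoT").initiated` corresponds to DoT-initiated[192.0.2.4] as seen from source address 192.0.2.100.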
When a recursive resolver discovers the need for an authoritative lookup to an authoritative DNS server using IP address X, it retrieves the records associated with X from its cache.¶
The following sections presume that the time of the discovery of the need for lookup is time T0.¶
If any of the records discussed here are absent, they are treated as null.¶
The recursive resolver must decide whether to initially send a query over Do53, or over any of the supported encrypted transports (DoT or DoQ).¶
Note that a resolver might initiate this query via any or all of the known transports.
When multiple queries are sent, the initial packets for each connection can be sent concurrently, similar to "Happy Eyeballs" ([RFC8305]).
However, unlike Happy Eyeballs, when one transport succeeds, the other connections do not need to be terminated, but can instead be continued to establish whether the IP address X is capable of communicating on the relevant transport.¶
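The difference from Happy Eyeballs described above can be sketched with asyncio. Everything here is a simulation under assumptions: `attempt` is a hypothetical stand-in that sleeps instead of doing network I/O, and the first attempt to finish is simply treated as the answer (a real resolver would skip failed attempts when choosing which result to return).

```python
import asyncio


async def attempt(transport: str, delay: float, ok: bool):
    """Stand-in for one transport attempt; `delay` simulates latency."""
    await asyncio.sleep(delay)
    return transport, ok


async def probe(attempts):
    """Start all attempts concurrently; answer from the first to finish,
    but let the others complete so their outcome can be recorded."""
    tasks = [asyncio.create_task(attempt(*a)) for a in attempts]
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_COMPLETED)
    first = min(done, key=tasks.index).result()
    # Unlike Happy Eyeballs, the slower attempts are not cancelled.
    others = await asyncio.gather(*pending)
    learned = dict([first, *others])
    return first[0], learned
```

Calling `asyncio.run(probe([("Do53", 0.0, True), ("DoT", 0.05, True), ("DoQ", 0.1, False)]))` answers from Do53 while still learning that this simulated server completed DoT and failed DoQ.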
For any of the supported encrypted transports E, if either of the following holds true, the resolver SHOULD NOT send a query to X over Do53:¶
- E-session[X] is in the established state, or¶
- E-status[X] is success, and (T0 - E-last-response[X]) < persistence¶
This indicates that one successful connection to a server that the client then closed cleanly would result in the client not sending the next query over Do53, as long as the most recent response on that connection arrived less than persistence ago.¶
Otherwise, if there is no outstanding session for any encrypted transport, and the last successful encrypted transport connection was long ago, the resolver sends a query to X over Do53.
When it does so, it inserts a handle for the query in Do53-queries[X].¶
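The decision about whether to send over Do53 can be sketched as a pair of predicates. The function names and the tuple layout are hypothetical; the per-transport state is passed in as plain values rather than read from a real state table.

```python
from typing import Iterable, Optional, Tuple

PERSISTENCE = 3 * 24 * 3600  # suggested default from Table 1, in seconds

# (session state, status, last-response timestamp) for one transport E
State = Tuple[Optional[str], Optional[str], Optional[float]]


def suppresses_do53(session: Optional[str], status: Optional[str],
                    last_response: Optional[float], t0: float,
                    persistence: float = PERSISTENCE) -> bool:
    """True if transport E's state says not to use Do53 for X."""
    if session == "established":
        return True
    return (status == "success"
            and last_response is not None
            and (t0 - last_response) < persistence)


def send_over_do53(states: Iterable[State], t0: float) -> bool:
    """Send over Do53 only if no supported transport suppresses it."""
    return not any(suppresses_do53(*s, t0) for s in states)
```

A record with a null last-response never suppresses Do53 in this sketch, which follows the rule that absent records are treated as null.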
When a response R for query Q arrives at the recursive resolver in cleartext sent over Do53 from authoritative server with IP address X, the recursive resolver should:¶
If Q is not in Do53-queries[X]:¶
- Discard R and process it no further (do not respond to a cleartext response to a query that is not outstanding)¶
Otherwise:¶
- Remove Q from Do53-queries[X]¶
If R is successful:¶
But if R is unsuccessful (e.g. SERVFAIL):¶
If any E-session[X] is in the established state, the recursive resolver SHOULD NOT initiate a new connection or resume a previous connection to X over Do53 or E, but should instead send queries to X through the existing session (see Section 4.6.8).¶
If the recursive resolver has a preferred encrypted transport, but only a different transport is in the established state, it MAY also initiate a new connection to X over its preferred transport while concurrently sending the query over the established transport E.¶
Before considering whether to initiate a new connection over an encrypted transport, the resolver should examine and possibly refresh its state for encrypted transport E to authoritative IP address X:¶
When resources are available to attempt a new encrypted transport, the resolver should only initiate a new connection to X over E as long as one of the following holds true:¶
- E-status[X] is success, or¶
- E-status[X] is fail or timeout and (T0 - E-completed[X]) > damping, or¶
- E-status[X] is null and E-initiated[X] is null¶
When initiating a session to X over encrypted transport E, if E-resumptions[X] is not empty, one ticket should be popped off the stack and used to try to resume a previous session.
Otherwise, the initial Client Hello handshake should not try to resume any session.¶
When initiating a connection, the resolver should take the following steps:¶
- set E-initiated[X] to T0¶
- store a handle for the new session (which should have pending state) in E-session[X]¶
- insert a handle for the query that prompted this connection in E-queries[X], with status unsent or early, as appropriate (see below).¶
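The bookkeeping steps above, together with popping a resumption ticket, could look like the following sketch. The `State` class and `initiate` function are hypothetical, the session handle is reduced to a string, and the rule that early data is used only when a ticket is available and the caller says early data is possible is an assumption of this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    initiated: Optional[float] = None
    session: Optional[str] = None                    # simplified handle
    resumptions: list = field(default_factory=list)  # stack of tickets
    queries: dict = field(default_factory=dict)      # query -> status


def initiate(state: State, query: str, t0: float,
             early_data_possible: bool = False):
    """Record a new connection attempt to X over E.

    Returns the ticket to resume with, or None for a fresh handshake.
    """
    # Pop the most recently stored ticket, if any.
    ticket = state.resumptions.pop() if state.resumptions else None
    state.initiated = t0        # set E-initiated[X] to T0
    state.session = "pending"   # store a pending session handle
    # Assumption: early data is only attempted on a resumed session.
    use_early = ticket is not None and early_data_possible
    state.queries[query] = "early" if use_early else "unsent"
    return ticket
```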
Modern encrypted transports like TLS 1.3 offer the chance to store "early data" from the client into the initial Client Hello in some contexts.
A resolver that initiates a connection over an encrypted transport according to this guidance in a context where early data is possible SHOULD send the DNS query that prompted the connection in the early data, according to the sending guidance in Section 4.6.8.¶
If it does so, the status of Q in E-queries[X] should be set to early instead of unsent.¶
When initiating a new connection (whether by resuming an old session or not), the recursive resolver SHOULD request a session resumption ticket from the authoritative server.
If the authoritative server supplies a resumption ticket, the recursive resolver pushes it into the stack at E-resumptions[X].¶
For modern encrypted transports like TLS 1.3, most client implementations expect to send a Server Name Indication (SNI) in the Client Hello.¶
There are two complications with selecting or sending SNI in this unilateral probing:¶
- Some authoritative servers are known by more than one name; selecting a single name to use for a given connection may be difficult or impossible.¶
- In most configurations, the contents of the SNI field are exposed on the wire to a passive adversary.
This potentially reveals additional information about which query is being made, based on the NS of the query itself.¶
To avoid additional leakage and complexity, a recursive resolver following this guidance SHOULD NOT send SNI to the authoritative server when attempting encrypted transport.¶
If the recursive resolver needs to send SNI to the authoritative server for some reason not found in this document, it is RECOMMENDED that it implement Encrypted Client Hello ([I-D.ietf-tls-esni]) to reduce leakage.¶
Because this probing policy is unilateral and opportunistic, the client connecting under this policy MUST accept any certificate presented by the server.
If the client cannot verify the server's identity, it MAY use that information for reporting, logging, or other analysis purposes.
But it MUST NOT reject the connection due to the authentication failure, as the result would be falling back to cleartext, which would leak the content of the session to a passive network monitor.¶
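For a DoT client built on Python's standard ssl module, the guidance on ALPN, SNI, and authentication could be combined roughly as follows. This is a sketch, not a complete client: the function name is hypothetical, and the connection lines are shown only as comments because they need a reachable server.

```python
import ssl


def opportunistic_dot_context() -> ssl.SSLContext:
    """TLS client context for unilateral, opportunistic DoT probing."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # check_hostname must be disabled before verify_mode can be relaxed.
    ctx.check_hostname = False
    # Accept any certificate: rejecting it would only cause a fallback
    # to cleartext, which is worse for confidentiality.
    ctx.verify_mode = ssl.CERT_NONE
    ctx.set_alpn_protocols(["dot"])
    return ctx


# Usage sketch (not executed here); 192.0.2.7 is a documentation address:
#   import socket
#   raw = socket.create_connection(("192.0.2.7", 853), timeout=4)
#   tls = opportunistic_dot_context().wrap_socket(raw)
# Leaving server_hostname unset means no SNI is sent in the Client Hello.
```

Because verification is disabled in this sketch, a resolver that wants to log authentication outcomes for analysis would need to inspect the presented certificate separately.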
When an encrypted transport connection actually completes (e.g., the TLS handshake completes) at time T1, the resolver sets E-completed[X] to T1 and does the following:¶
If the handshake completed successfully:¶
If, at time T2, an encrypted transport handshake completes with a failure (e.g. a TLS alert),¶
Note that this failure will trigger the recursive resolver to fall back to cleartext queries to the authoritative server at IP address X.
It will retry encrypted transport to X once the damping timer has elapsed.¶
Once established, an encrypted transport might fail for a number of reasons (e.g., decryption failure, or improper protocol sequence).¶
If this happens:¶
Note that this failure will trigger the recursive resolver to fall back to cleartext queries to the authoritative server at IP address X.
It will retry encrypted transport to X once the damping timer has elapsed.¶
For example, what if a TCP timeout closes an idle DoT connection?
What if a QUIC stream ends up timing out but other streams on the same QUIC connection are going through?
Do the described scenarios cover the case when an encrypted transport's port is made unavailable/closed?¶
At time T3, the recursive resolver may find that authoritative server X cleanly closes an existing outstanding connection (most likely due to resource exhaustion, see Section 3.4).¶
When this happens:¶
Note that this premature shutdown will trigger the recursive resolver to fall back to cleartext queries to the authoritative server at IP address X.
Any subsequent query to X will retry the encrypted connection promptly.¶
When sending a query to an authoritative server over encrypted transport at time T4, the recursive resolver should take a few reasonable steps to ensure privacy and efficiency.¶
After sending query Q, the recursive resolver should ensure that Q's state in E-queries[X] is set to sent.¶
The recursive resolver also sets E-last-activity[X] to T4.¶
In addition, the recursive resolver should consider the guidance in the following sections.¶
To increase the anonymity set for each query, the recursive resolver SHOULD use a sensible padding mechanism for all queries it sends.
Specifically, a DoT client SHOULD use EDNS(0) padding [RFC7830], and a DoQ client SHOULD follow the guidance in Section 5.4 of [RFC9250].
How much to pad is out of scope of this document, but a reasonable suggestion can be found in [RFC8467].¶
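As a worked example of one padding policy, [RFC8467] suggests that clients pad queries to the closest multiple of 128 octets. The arithmetic can be sketched as below; the function name is hypothetical, and the input is assumed to be the length of the DNS message including its OPT record but before the Padding option is added.

```python
BLOCK = 128        # query block length suggested by RFC 8467, in octets
OPTION_HEADER = 4  # EDNS(0) option code (2 octets) + option length (2 octets)


def padding_octets(unpadded_len: int, block: int = BLOCK) -> int:
    """Octets of padding to place in the EDNS(0) Padding option so that
    the final message length is a multiple of `block`."""
    return -(unpadded_len + OPTION_HEADER) % block
```

For a 40-octet query, `padding_octets(40)` is 84, since 40 + 4 + 84 = 128.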
When multiple queries are multiplexed on a single encrypted transport to a single authoritative server, the recursive resolver SHOULD pipeline queries and MUST be capable of receiving responses out of order.
For guidance on how to best achieve this on a given encrypted transport, see [RFC7766] (for DoT) and [RFC9250] (for DoQ).¶
When a response R for query Q arrives at the recursive resolver over encrypted transport E from authoritative server with IP address X at time T5, the recursive resolver should:¶
If Q is not in E-queries[X]:¶
- Discard R and process it no further (do not respond to an encrypted response to a query that is not outstanding)¶
Otherwise:¶
- Remove Q from E-queries[X]¶
- Set E-last-activity[X] to T5¶
- Set E-last-response[X] to T5¶
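The bookkeeping listed above for a response arriving over an encrypted transport might be sketched as follows. The `State` class and function name are hypothetical, and the sketch covers only these listed steps, not what happens afterwards depending on whether R is successful.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    queries: dict = field(default_factory=dict)  # query -> status
    last_activity: Optional[float] = None
    last_response: Optional[float] = None


def on_encrypted_response(state: State, query: str, t5: float) -> bool:
    """Apply the bookkeeping for response R to query Q at time T5.

    Returns False if R must be discarded because Q is not outstanding.
    """
    if query not in state.queries:
        return False               # discard R, process it no further
    del state.queries[query]       # remove Q from E-queries[X]
    state.last_activity = t5       # set E-last-activity[X]
    state.last_response = t5       # set E-last-response[X]
    return True
```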
If R is successful:¶
But if R is unsuccessful (e.g. SERVFAIL):¶
To keep resources under control, a recursive resolver should proactively manage outstanding encrypted connections.
Section 6.5 of [RFC9250] ("Connection Handling") offers useful guidance for clients managing DoQ connections.
Section 3.4 of [RFC7858] offers useful guidance for clients managing DoT connections.¶
Even with sensible connection management, a recursive resolver doing unilateral probing may find resources unexpectedly scarce, and may need to close some outstanding connections.¶
In such a situation, the recursive resolver SHOULD use a reasonable prioritization scheme to close outstanding connections.¶
One reasonable prioritization scheme would be:¶
- close outstanding established sessions based on E-last-activity[X] (oldest timestamp gets closed first)¶
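The prioritization scheme above amounts to sorting established sessions by their last-activity timestamp. A minimal sketch, assuming a hypothetical mapping from (server IP, transport) to E-last-activity[X] that contains only established sessions:

```python
def sessions_to_close(last_activity: dict, count: int) -> list:
    """Pick `count` established sessions to close, least recently
    active first."""
    return sorted(last_activity, key=last_activity.get)[:count]
```

For example, with last-activity values of 30.0, 10.0, and 20.0 for three sessions, asking for two sessions to close selects the ones with timestamps 10.0 and 20.0.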
Note that when resources are limited, a recursive resolver following this guidance may also choose not to initiate new connections for encrypted transport.¶
Some recursive resolvers looking to amortize connection costs and minimize latency MAY choose to synthesize queries to a particular authoritative server to keep an encrypted transport session active.¶
A recursive resolver that adopts this approach should try to align the synthesized queries with other optimizations.
For example, a recursive resolver that "pre-fetches" a particular resource record to keep its cache "hot" can send that query over an established encrypted transport session.¶
A recursive resolver's state table for an authoritative server can contain additional information beyond what is described above.
The recursive resolver might use that additional state to change the way it interacts with the authoritative server in the future.
Some examples of additional state include:¶
- Whether the server accepts "early data" over a transport such as DoQ¶
- Whether the server fails to respond to queries after the handshake succeeds¶
- The RTT (round-trip time) of queries to the server¶
- The number of timeouts (compared to the number of successful queries)¶