HyperText Transfer Protocol                                  D. Denicola
Internet-Draft                                                  J. Roman
Intended status: Informational                                Google LLC
Expires: 6 December 2024                                     4 June 2024


                             No-Vary-Search
                   draft-wicg-http-no-vary-search-00

Abstract

   A proposed HTTP header field for changing how URL search parameters
   impact caching.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://jeremyroman.github.io/http-no-vary-search/draft-wicg-http-no-vary-search.html.
   Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-wicg-http-no-vary-search/.

   Source for this draft and an issue tracker can be found at
   https://github.com/jeremyroman/http-no-vary-search.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 6 December 2024.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Revised BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Revised BSD License.

Table of Contents

   1.  Conventions and Definitions
   2.  HTTP header field definition
   3.  Data model
   4.  Parsing
     4.1.  Parse a URL search variance
     4.2.  Obtain a URL search variance
       4.2.1.  Examples
     4.3.  Parse a key
       4.3.1.  Examples
   5.  Comparing
     5.1.  Examples
   6.  Security Considerations
   7.  Privacy Considerations
   8.  IANA Considerations
   9.  Normative References
   Acknowledgments
   Index
   Authors' Addresses

1.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   This document also adopts some conventions and notation typical in
   WHATWG and W3C usage, especially as it relates to algorithms. See
   [WHATWG-INFRA].

2.
HTTP header field definition

   The No-Vary-Search HTTP header field is a structured field
   [STRUCTURED-FIELDS] whose value must be a dictionary (Section 3.2 of
   [STRUCTURED-FIELDS]).

   It has the following authoring conformance requirements:

   *  The dictionary must only contain entries whose keys are one of
      key-order, params, except.

   *  If present, the key-order entry's value must be a boolean
      (Section 3.3.6 of [STRUCTURED-FIELDS]).

   *  If present, the params entry's value must be either a boolean
      (Section 3.3.6 of [STRUCTURED-FIELDS]) or an inner list
      (Section 3.1.1 of [STRUCTURED-FIELDS]).

   *  If present, the except entry's value must be an inner list
      (Section 3.1.1 of [STRUCTURED-FIELDS]).

   *  The except entry must only be present if the params entry is also
      present, and the params entry's value is the boolean value true.

   |  As always, the authoring conformance requirements are not
   |  binding on implementations. Implementations instead need to
   |  implement the processing model given by the obtain a URL search
   |  variance algorithm (Section 4.2).

3.  Data model

   A _URL search variance_ consists of the following:

   no-vary params
      either the special value *wildcard* or a list of strings

   vary params
      either the special value *wildcard* or a list of strings

   vary on key order
      a boolean

   The _default URL search variance_ is a URL search variance whose
   no-vary params is an empty list, vary params is *wildcard*, and vary
   on key order is true.

   The obtain a URL search variance algorithm (Section 4.2) ensures that
   all URL search variances obey the following constraints:

   *  vary params is a list if and only if the no-vary params is
      *wildcard*; and

   *  no-vary params is a list if and only if the vary params is
      *wildcard*.

4.  Parsing

4.1.  Parse a URL search variance

   To _parse a URL search variance_ given _value_:

   1.
       If _value_ is null, then return the default URL search variance.

   2.  If _value_'s keys contain anything other than "key-order",
       "params", or "except", then return the default URL search
       variance.

   3.  Let _result_ be a new URL search variance.

   4.  Set _result_'s vary on key order to true.

   5.  If _value_["key-order"] exists:

       1.  If _value_["key-order"] is not a boolean, then return the
           default URL search variance.

       2.  Set _result_'s vary on key order to the boolean negation of
           _value_["key-order"].

   6.  If _value_["params"] exists:

       1.  If _value_["params"] is a boolean:

           1.  If _value_["params"] is true, then:

               1.  Set _result_'s no-vary params to *wildcard*.

               2.  Set _result_'s vary params to the empty list.

           2.  Otherwise:

               1.  Set _result_'s no-vary params to the empty list.

               2.  Set _result_'s vary params to *wildcard*.

       2.  Otherwise, if _value_["params"] is an array:

           1.  If any item in _value_["params"] is not a string, then
               return the default URL search variance.

           2.  Set _result_'s no-vary params to the result of applying
               parse a key (Section 4.3) to each item in
               _value_["params"].

           3.  Set _result_'s vary params to *wildcard*.

       3.  Otherwise, return the default URL search variance.

   7.  If _value_["except"] exists:

       1.  If _value_["params"] is not true, then return the default URL
           search variance.

       2.  If _value_["except"] is not an array, then return the default
           URL search variance.

       3.  If any item in _value_["except"] is not a string, then return
           the default URL search variance.

       4.  Set _result_'s vary params to the result of applying parse a
           key (Section 4.3) to each item in _value_["except"].

   8.  Return _result_.

   |  In general, this algorithm is strict and tends to return the
   |  default URL search variance whenever it sees something it
   |  doesn't recognize. This is because the default URL search
   |  variance behavior will just cause fewer cache hits, which is an
   |  acceptable fallback behavior.
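
   As a non-normative illustration, the steps of the parse a URL search
   variance algorithm, together with parse a key (Section 4.3), might be
   sketched in Python roughly as follows. The names WILDCARD,
   UrlSearchVariance, parse_key, and parse_url_search_variance are
   hypothetical and not defined by this document. The sketch assumes the
   structured field has already been parsed into a plain dict whose
   values are bool, or lists of str for inner lists of strings. It also
   assumes that a new URL search variance starts out with the default
   values, which the algorithm text does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union
from urllib.parse import unquote_to_bytes

WILDCARD = "*wildcard*"  # hypothetical sentinel for the special value


@dataclass
class UrlSearchVariance:
    # Field defaults mirror the default URL search variance (Section 3).
    no_vary_params: Union[str, List[str]] = field(default_factory=list)
    vary_params: Union[str, List[str]] = WILDCARD
    vary_on_key_order: bool = True


def parse_key(key_string: str) -> str:
    """Sketch of 'parse a key' (Section 4.3) for an ASCII string."""
    key_bytes = key_string.encode("latin-1")      # isomorphic encoding
    key_bytes = key_bytes.replace(b"+", b" ")     # 0x2B (+) -> 0x20 (SP)
    decoded = unquote_to_bytes(key_bytes)         # percent-decoding
    # Python's "utf-8" codec does not strip a BOM; bad bytes become U+FFFD.
    return decoded.decode("utf-8", errors="replace")


def parse_url_search_variance(value: Optional[dict]) -> UrlSearchVariance:
    """Sketch of 'parse a URL search variance' (Section 4.1)."""
    if value is None:
        return UrlSearchVariance()
    if any(k not in ("key-order", "params", "except") for k in value):
        return UrlSearchVariance()

    # Assumption: a "new URL search variance" starts from the defaults.
    result = UrlSearchVariance()

    if "key-order" in value:
        if not isinstance(value["key-order"], bool):
            return UrlSearchVariance()
        result.vary_on_key_order = not value["key-order"]

    if "params" in value:
        params = value["params"]
        if isinstance(params, bool):
            if params:
                result.no_vary_params = WILDCARD
                result.vary_params = []
            else:
                result.no_vary_params = []
                result.vary_params = WILDCARD
        elif isinstance(params, list):
            if not all(isinstance(p, str) for p in params):
                return UrlSearchVariance()
            result.no_vary_params = [parse_key(p) for p in params]
            result.vary_params = WILDCARD
        else:
            return UrlSearchVariance()

    if "except" in value:
        # "except" is only honored when params is the boolean true.
        if value.get("params") is not True:
            return UrlSearchVariance()
        exceptions = value["except"]
        if not isinstance(exceptions, list):
            return UrlSearchVariance()
        if not all(isinstance(e, str) for e in exceptions):
            return UrlSearchVariance()
        result.vary_params = [parse_key(e) for e in exceptions]

    return result
```

   For instance, under these assumptions,
   parse_url_search_variance({"params": True, "except": ["x"]}) yields a
   variance whose no-vary params is the wildcard and whose vary params
   is the list containing "x". Any unrecognized or ill-typed entry falls
   back to the default URL search variance.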
   |  The input to this algorithm is generally obtained by parsing a
   |  structured field (Section 4.2 of [STRUCTURED-FIELDS]) using
   |  field_type "dictionary".

4.2.  Obtain a URL search variance

   To _obtain a URL search variance_ given a response
   (https://fetch.spec.whatwg.org/#concept-response) _response_:

   1.  Let _fieldValue_ be the result of getting a structured field
       value
       (https://fetch.spec.whatwg.org/#concept-header-list-get-structured-header)
       [FETCH] given `No-Vary-Search` and "dictionary" from _response_'s
       header list.

   2.  Return the result of parsing a URL search variance (Section 4.1)
       given _fieldValue_.

4.2.1.  Examples

   The following illustrates how various inputs are parsed, in terms of
   their impact on the resulting no-vary params and vary params:

   +========================+============================+
   | Input                  | Result                     |
   +========================+============================+
   | No-Vary-Search: params | no-vary params: *wildcard* |
   |                        | vary params: (empty list)  |
   +------------------------+----------------------------+
   | No-Vary-Search:        | no-vary params: « "a" »    |
   | params=("a")           | vary params: *wildcard*    |
   +------------------------+----------------------------+
   | No-Vary-Search:        | no-vary params: *wildcard* |
   | params, except=("x")   | vary params: « "x" »       |
   +------------------------+----------------------------+

                          Table 1

   The following inputs are all invalid and will cause the default URL
   search variance to be returned:

   *  No-Vary-Search: unknown-key

   *  No-Vary-Search: key-order="not a boolean"

   *  No-Vary-Search: params="not a boolean or inner list"

   *  No-Vary-Search: params=(not-a-string)

   *  No-Vary-Search: params=("a"), except=("x")

   *  No-Vary-Search: params=(), except=()

   *  No-Vary-Search: params=?0, except=("x")

   *  No-Vary-Search: params, except=(not-a-string)

   *  No-Vary-Search: params, except="not an inner list"

   *  No-Vary-Search: params, except=?1

   *  No-Vary-Search: except=("x")

   *
      No-Vary-Search: except=()

   The following inputs are valid, but somewhat unconventional. They are
   shown alongside their more conventional form.

   +==============================+============================+
   | Input                        | Conventional form          |
   +==============================+============================+
   | No-Vary-Search: params=?1    | No-Vary-Search: params     |
   +------------------------------+----------------------------+
   | No-Vary-Search: key-order=?1 | No-Vary-Search: key-order  |
   +------------------------------+----------------------------+
   | No-Vary-Search: params,      | No-Vary-Search: key-order, |
   | key-order, except=("x")      | params, except=("x")       |
   +------------------------------+----------------------------+
   | No-Vary-Search: params=?0    | (omit the header)          |
   +------------------------------+----------------------------+
   | No-Vary-Search: params=()    | (omit the header)          |
   +------------------------------+----------------------------+
   | No-Vary-Search: key-order=?0 | (omit the header)          |
   +------------------------------+----------------------------+

                              Table 2

4.3.  Parse a key

   To _parse a key_ given an ASCII string _keyString_:

   1.  Let _keyBytes_ be the isomorphic encoding
       (https://infra.spec.whatwg.org/#isomorphic-encode) [WHATWG-INFRA]
       of _keyString_.

   2.  Replace any 0x2B (+) in _keyBytes_ with 0x20 (SP).

   3.  Let _keyBytesDecoded_ be the percent-decoding
       (https://url.spec.whatwg.org/#percent-decode) [WHATWG-URL] of
       _keyBytes_.

   4.  Let _keyStringDecoded_ be the UTF-8 decoding without BOM
       (https://encoding.spec.whatwg.org/#utf-8-decode-without-bom)
       [WHATWG-ENCODING] of _keyBytesDecoded_.

   5.  Return _keyStringDecoded_.

4.3.1.
Examples

   The parse a key algorithm allows encoding non-ASCII key strings in
   the ASCII structured header format, similar to how the
   application/x-www-form-urlencoded
   (https://url.spec.whatwg.org/#concept-urlencoded) format [WHATWG-URL]
   allows encoding an entire entry list of keys and values in ASCII URL
   format. For example,

   No-Vary-Search: params=("%C3%A9+%E6%B0%97")

   will result in a URL search variance whose no-vary params are
   « "é 気" ». As explained in a later example, the canonicalization
   process during equivalence testing means that URL strings such as the
   following are treated as equivalent:

   *  https://example.com/?é 気=1

   *  https://example.com/?é+気=2

   *  https://example.com/?%C3%A9%20気=3

   *  https://example.com/?%C3%A9+%E6%B0%97=4

   and so on, since they are all parsed
   (https://url.spec.whatwg.org/#concept-urlencoded-parser) [WHATWG-URL]
   as having the same key "é 気".

5.  Comparing

   Two URLs (https://url.spec.whatwg.org/#concept-url) [WHATWG-URL]
   _urlA_ and _urlB_ are _equivalent modulo search variance_ given a URL
   search variance _searchVariance_ if the following algorithm returns
   true:

   1.  If the scheme, username, password, host, port, or path of _urlA_
       and _urlB_ differ, then return false.

   2.  If _searchVariance_ is equivalent to the default URL search
       variance, then:

       1.  If _urlA_'s query equals _urlB_'s query, then return true.

       2.  Return false.

       In this case, even URL pairs that might appear the same after
       running the application/x-www-form-urlencoded parser
       (https://url.spec.whatwg.org/#concept-urlencoded-parser)
       [WHATWG-URL] on their queries, such as https://example.com/a and
       https://example.com/a?, or https://example.com/foo?a=b&&&c and
       https://example.com/foo?a=b&c=, will be treated as inequivalent.

   3.  Let _searchParamsA_ and _searchParamsB_ be empty lists.

   4.
       If _urlA_'s query is not null, then set _searchParamsA_ to the
       result of running the application/x-www-form-urlencoded parser
       (https://url.spec.whatwg.org/#concept-urlencoded-parser)
       [WHATWG-URL] given the isomorphic encoding
       (https://infra.spec.whatwg.org/#isomorphic-encode) [WHATWG-INFRA]
       of _urlA_'s query.

   5.  If _urlB_'s query is not null, then set _searchParamsB_ to the
       result of running the application/x-www-form-urlencoded parser
       (https://url.spec.whatwg.org/#concept-urlencoded-parser)
       [WHATWG-URL] given the isomorphic encoding
       (https://infra.spec.whatwg.org/#isomorphic-encode) [WHATWG-INFRA]
       of _urlB_'s query.

   6.  If _searchVariance_'s no-vary params is a list, then:

       1.  Set _searchParamsA_ to a list containing those items _pair_
           in _searchParamsA_ where _searchVariance_'s no-vary params
           does not contain _pair_[0].

       2.  Set _searchParamsB_ to a list containing those items _pair_
           in _searchParamsB_ where _searchVariance_'s no-vary params
           does not contain _pair_[0].

   7.  Otherwise, if _searchVariance_'s vary params is a list, then:

       1.  Set _searchParamsA_ to a list containing those items _pair_
           in _searchParamsA_ where _searchVariance_'s vary params
           contains _pair_[0].

       2.  Set _searchParamsB_ to a list containing those items _pair_
           in _searchParamsB_ where _searchVariance_'s vary params
           contains _pair_[0].

   8.  If _searchVariance_'s vary on key order is false, then:

       1.  Let _keyLessThan_ be an algorithm taking as inputs two pairs
           (_keyA_, _valueA_) and (_keyB_, _valueB_), which returns
           whether _keyA_ is code unit less than
           (https://infra.spec.whatwg.org/#code-unit-less-than)
           [WHATWG-INFRA] _keyB_.

       2.  Set _searchParamsA_ to the result of sorting _searchParamsA_
           in ascending order with _keyLessThan_.

       3.  Set _searchParamsB_ to the result of sorting _searchParamsB_
           in ascending order with _keyLessThan_.

   9.
       If _searchParamsA_'s size is not equal to _searchParamsB_'s size,
       then return false.

   10. Let _i_ be 0.

   11. While _i_ < _searchParamsA_'s size:

       1.  If _searchParamsA_[_i_][0] does not equal
           _searchParamsB_[_i_][0], then return false.

       2.  If _searchParamsA_[_i_][1] does not equal
           _searchParamsB_[_i_][1], then return false.

       3.  Set _i_ to _i_ + 1.

   12. Return true.

5.1.  Examples

   Due to how the application/x-www-form-urlencoded parser canonicalizes
   query strings, there are some cases where query strings that do not
   appear obviously equivalent will end up being treated as equivalent
   after parsing.

   So, for example, given any non-default value for No-Vary-Search, such
   as No-Vary-Search: key-order, we will have the following
   equivalences:

   https://example.com
   https://example.com/?
      A null query is parsed the same as an empty string

   https://example.com/?a=x
   https://example.com/?%61=%78
      Parsing performs percent-decoding

   https://example.com/?a=é
   https://example.com/?a=%C3%A9
      Parsing performs percent-decoding

   https://example.com/?a=%f6
   https://example.com/?a=%ef%bf%bd
      Both values are parsed as U+FFFD (�)

   https://example.com/?a=x&&&&
   https://example.com/?a=x
      Parsing splits on & and discards empty strings

   https://example.com/?a=
   https://example.com/?a
      Both parse as having an empty string value for a

   https://example.com/?a=%20
   https://example.com/?a=+
   https://example.com/?a= &
      + and %20 are both parsed as U+0020 SPACE

6.  Security Considerations

   TODO Security

7.  Privacy Considerations

   TODO Privacy

8.  IANA Considerations

   TODO IANA

9.  Normative References

   [FETCH]    van Kesteren, A., "Fetch Living Standard", WHATWG, n.d.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.
   [STRUCTURED-FIELDS]
              Nottingham, M. and P. Kamp, "Structured Field Values for
              HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021.

   [WHATWG-ENCODING]
              van Kesteren, A., "Encoding Living Standard", WHATWG, n.d.

   [WHATWG-INFRA]
              van Kesteren, A. and D. Denicola, "Infra Living Standard",
              WHATWG, n.d.

   [WHATWG-URL]
              van Kesteren, A., "URL Living Standard", WHATWG, n.d.

Acknowledgments

   TODO acknowledge.

Index

   D E O P

   D

      default URL search variance
         *_Section 3, Paragraph 3_*; Section 4.1, Paragraph 2.1.1;
         Section 4.1, Paragraph 2.2.1; Section 4.1, Paragraph 2.5.2.1.1;
         Section 4.1, Paragraph 2.6.2.2.2.1.1;
         Section 4.1, Paragraph 2.6.2.3.1;
         Section 4.1, Paragraph 2.7.2.1.1;
         Section 4.1, Paragraph 2.7.2.2.1;
         Section 4.1, Paragraph 2.7.2.3.1; Section 4.1, Paragraph 3.1;
         Section 4.2.1, Paragraph 3; Section 5, Paragraph 2.2.1

   E

      equivalent modulo search variance
         *_Section 5, Paragraph 1_*

   O

      obtain a URL search variance
         Section 2, Paragraph 4.1; Section 3, Paragraph 4;
         *_Section 4.2, Paragraph 1_*

   P

      parse a key
         Section 4.1, Paragraph 2.6.2.2.2.2.1;
         Section 4.1, Paragraph 2.7.2.4.1;
         *_Section 4.3, Paragraph 1_*; Section 4.3.1, Paragraph 1

      parse a URL search variance
         *_Section 4.1, Paragraph 1_*; Section 4.2, Paragraph 2.2.1

Authors' Addresses

   Domenic Denicola
   Google LLC
   Email: d@domenic.me


   Jeremy Roman
   Google LLC
   Email: jbroman@chromium.org