There is a lot of confusion about media-types, content-types, and
related terminology.¶
This memo is an attempt at clearing it up, so we can use consistent
terminology in CoRE and related specifications.
It also defines some ABNF that can be used in these specifications.¶
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF). Note that other groups may also distribute working
documents as Internet-Drafts. The list of current Internet-Drafts is
at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 22 August 2021.¶
Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 [RFC2119][RFC8174] when, and only when, they
appear in all capitals, as shown here.¶
[RFC1590] introduced media types and their registration.
That document took MIME types from [RFC1521] and gave them a new name.
At that time, the term "media type" was often used just for the major
type ("text", "audio"), and what we call a media-type now was the
combination of a type and a subtype. This lives on in [RFC6838],
which does not even have an ABNF [RFC5234] production for media type.
[RFC6838]'s predecessor, [RFC4288], supplied the ABNF shown in Figure 1.¶
[RFC6838], which obsoletes [RFC4288], restricts the
first character of a name to an alphanumeric character and calls the
resulting rule restricted-name instead of reg-name.
Its ABNF, shown in Figure 2, is otherwise semantically equivalent,
but adds prose comments that further
limit the use of "." and "+".¶
type-name = restricted-name
subtype-name = restricted-name
restricted-name = restricted-name-first *126restricted-name-chars
restricted-name-first = ALPHA / DIGIT
restricted-name-chars = ALPHA / DIGIT / "!" / "#" /
                        "$" / "&" / "-" / "^" / "_"
restricted-name-chars =/ "." ; Characters before first dot always
                             ; specify a facet name
restricted-name-chars =/ "+" ; Characters after last plus always
                             ; specify a structured syntax suffix
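As an illustration of the restricted-name rules above, the following is a minimal Python sketch that checks a type-name/subtype-name pair. The names RESTRICTED_NAME, MEDIA_TYPE_NAME_RE, and split_media_type_name are hypothetical, and the sketch checks only the character rules; it does not enforce the facet and structured-syntax-suffix conventions stated in the ABNF comments.

```python
import re

# restricted-name = restricted-name-first *126restricted-name-chars
# First character: ALPHA / DIGIT; up to 126 further characters from the
# restricted-name-chars set (including "." and "+").
RESTRICTED_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]{0,126}"

# Hypothetical helper pattern: a type-name and a subtype-name joined by "/".
MEDIA_TYPE_NAME_RE = re.compile(rf"({RESTRICTED_NAME})/({RESTRICTED_NAME})")


def split_media_type_name(text: str) -> tuple[str, str]:
    """Return (type-name, subtype-name), or raise ValueError."""
    match = MEDIA_TYPE_NAME_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid media type name: {text!r}")
    return match.group(1), match.group(2)
```

With this sketch, "application/senml+cbor" is accepted, while "application/+json" is rejected because the subtype-name does not start with an alphanumeric character.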
Today, the term "media type" is generally used for a registered
combination of a type-name and a subtype-name, as well as for the
specification that defines the semantics of this combination.
We further disambiguate by calling the former a media type name.
An ABNF definition of Media-Type-Name:¶
Media types can have parameters [RFC6838], some of which are
defined by the media type specification to be mandatory.
In HTTP and many other protocols, media-type-names and parameters are
then used together in a "Content-Type" header field.
HTTP [RFC7231] uses the ABNF in Figure 4:¶
In the ABNF as established by
[RFC2616], parts of which became [RFC7231], the rule name
media-type is used for a Media-Type-Name with parameters attached.
We don't follow this inclusive use of media-type; note that
[RFC2616] itself was quite confused about this term, claiming in Section 3.7 of [RFC2616]:¶
Media-type values are registered with the Internet Assigned Number
Authority (IANA [19]).¶
This clearly reverts to the understanding of Media-Type-Name we use.¶
Instead of prolonging this confusion, we define as a separate term:¶
Content-Type:
A Media-Type-Name, optionally associated with parameters (separated from
the media type name and from each other by a semicolon).¶
Removing the legacy HTAB characters now shunned in polite conversation,
as well as some other cobwebs, we define the conventional textual
representation of a Content-Type with the ABNF in Figure 5:¶
Note that there is a slight inconsistency between the "token" used
here and the "reg-name"/"restricted-name" used above; since media type
parameters probably will be defined within the guard rails set by
[RFC7231], we need to use HTTP's more comprehensive definition here.¶
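To make the definition of a Content-Type more concrete, here is a rough Python sketch of a parser for its textual representation. It assumes HTTP's token and quoted-string syntax from [RFC7231] for parameters, the restricted-name syntax for the media type name, and SP (but not HTAB) around the semicolons. The function name parse_content_type and the regular expressions are illustrative assumptions, not the normative ABNF of this memo, and the quoted-string pattern is slightly more permissive than HTTP's qdtext.

```python
import re

# Media type name: two restricted-names (RFC 6838) joined by "/".
RESTRICTED_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]{0,126}"
# token as in HTTP (RFC 7231): one or more tchar.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# quoted-string: DQUOTE, unescaped or backslash-escaped characters, DQUOTE.
# Simplification: any character other than DQUOTE and backslash is accepted.
QUOTED = r'"(?:[^"\\]|\\.)*"'

NAME_RE = re.compile(rf"{RESTRICTED_NAME}/{RESTRICTED_NAME}")
# Parameters are separated by ";" with optional spaces (no HTAB).
PARAM_RE = re.compile(rf" *; *({TOKEN})=({TOKEN}|{QUOTED})")


def parse_content_type(text: str) -> tuple[str, dict[str, str]]:
    """Return (media type name, parameters), or raise ValueError."""
    match = NAME_RE.match(text)
    if match is None:
        raise ValueError(f"no media type name in {text!r}")
    name, pos = match.group(0), match.end()
    params: dict[str, str] = {}
    while pos < len(text):
        param = PARAM_RE.match(text, pos)
        if param is None:
            raise ValueError(f"malformed parameter at offset {pos} in {text!r}")
        key, value = param.group(1), param.group(2)
        if value.startswith('"'):
            # Strip the quotes and undo backslash escapes.
            value = re.sub(r"\\(.)", r"\1", value[1:-1])
        params[key] = value
        pos = param.end()
    return name, params
```

For example, parse_content_type("text/plain; charset=utf-8") yields the media type name "text/plain" and the single parameter "charset" with the value "utf-8". The sketch keeps parameter names as written, since case-insensitivity is not settled here.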
Section 3.5 of [RFC2616] also introduced the term Content-Coding, a
registered name for an encoding transformation that has been or can be
applied to a representation:¶
Confusingly, in HTTP the Content-Coding is then given in a header
field called "Content-Encoding"; we never use this term (except when
we are in error). Instead we define:¶
Content-Coding:
a registered name for an encoding transformation that has been or
can be applied to a representation.¶
Content-Codings are registered in the HTTP Content Coding Registry, a
subregistry of [IANA.http-parameters]. We often use the "identity"
Content-Coding, which is the identity transformation, and often fail
to identify that Content-Coding by name, instead calling it "no
Content-Coding".¶
CoAP, in Section 1 of [RFC7252], defines a Content-Format as the
combination of a Content-Type and a Content-Coding, identified by a
numeric identifier defined in the "CoAP Content-Formats" registry (a
subregistry of [IANA.core-parameters]), but in more confusing words (it
did not have the benefit of the present specification).¶
Content-Format:
the combination of a Content-Type and a Content-Coding, identified
by a numeric identifier defined by the "CoAP Content-Formats"
subregistry of [IANA.core-parameters].¶
Note that there has not been a conventional string representation of
just the combination of a Content-Type and a Content-Coding;
Content-Formats so far always are identified by their registered
Content-Format numbers. However, there are applications where such a
string representation is useful [I-D.keranen-core-senml-data-ct], so we define:¶
This allows the use of Content-Format-Strings such as
"application/json@deflate" in place of the less self-describing
content-format "11050". It also allows naming combinations that do not
have a content-format number defined yet.¶
Content-Format-Strings MUST NOT explicitly use the content-coding value of
"identity" (i.e., if an identity content-coding is desired, the entire
optional part including the "@" sign is left out).¶
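The rule about "identity" can be sketched in Python as follows. The table CONTENT_FORMATS is a small illustrative excerpt (50 is the registered number for application/json without a content-coding, 11050 is the number mentioned above), and the function name content_format_string is hypothetical. Comparing the content-coding case-insensitively is an assumption borrowed from HTTP.

```python
from typing import Optional

# Illustrative excerpt of the "CoAP Content-Formats" registry:
# number -> (Content-Type, Content-Coding).
CONTENT_FORMATS = {
    50: ("application/json", "identity"),
    11050: ("application/json", "deflate"),
}


def content_format_string(content_type: str,
                          content_coding: Optional[str] = None) -> str:
    """Build a Content-Format-String, leaving out an identity coding."""
    # Assumption: content-coding names compare case-insensitively, as in HTTP.
    if content_coding is None or content_coding.lower() == "identity":
        return content_type  # no "@" part at all
    return f"{content_type}@{content_coding}"
```

Under these assumptions, content_format_string(*CONTENT_FORMATS[11050]) gives "application/json@deflate", while content_format_string(*CONTENT_FORMATS[50]) gives just "application/json".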
Note that a quoted string inside a content-type parameter might
contain an "@" sign, so Content-Format-Strings cannot be parsed
naively (for example, by splitting at the first "@" sign).¶
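One possible way to handle the "@" sign correctly is to scan the string while tracking whether the current position is inside a quoted string. The Python sketch below does this; the function name split_content_format_string is hypothetical, the content-coding is assumed to be an HTTP token, and rejecting an explicit "identity" on input is a design choice of this sketch rather than something required of parsers.

```python
import re

# token as in HTTP (RFC 7231); the content-coding is assumed to be a token.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def split_content_format_string(text: str) -> tuple[str, str]:
    """Return (content-type, content-coding); the coding defaults to "identity"."""
    in_quotes = False
    escaped = False
    for index, char in enumerate(text):
        if escaped:
            escaped = False  # character after a backslash is taken literally
        elif in_quotes and char == "\\":
            escaped = True
        elif char == '"':
            in_quotes = not in_quotes
        elif char == "@" and not in_quotes:
            # The first "@" outside a quoted string is the separator.
            content_type, content_coding = text[:index], text[index + 1:]
            if not content_type or TOKEN_RE.fullmatch(content_coding) is None:
                raise ValueError(f"malformed Content-Format-String: {text!r}")
            if content_coding.lower() == "identity":
                raise ValueError('"identity" must not be given explicitly')
            return content_type, content_coding
    if in_quotes:
        raise ValueError(f"unterminated quoted string in {text!r}")
    return text, "identity"
```

The returned content-type part is not validated further here; it could be handed to a separate Content-Type parser.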
Media type names are sometimes abbreviated as "mt", and Content-Types
as "ct". We propose not to use those abbreviations: Where the long
form of the values can be used, the long form "Content-Type" can also
be used to name them.¶
For historical reasons, both [RFC6690] and [RFC7252] use the
abbreviation "ct" for Content-Format (think first and last character).¶
For Content-Coding, the abbreviation "cc" can be used.¶
The ABNF given here is provisional and may need some more cleanup,
such as unifying the various forms of reg-name, token, etc.¶
(ABNF just shown for illustration is centered, in a blockquote, and tagged with
<artwork type="abnf;old"...> in the XML, while the normative ABNF of this memo is
left-aligned and tagged with <sourcecode type="abnf"...>.)¶
The XPath expression //sourcecode[@type='abnf']/text() can be used
on the XML form of this specification to extract the ABNF defined here.¶
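As a rough sketch of how that extraction could be scripted, the following Python function uses only the standard library. The function name extract_abnf is hypothetical. Because ElementTree implements only a subset of XPath (it has no text() step), the sketch selects the sourcecode elements and reads their text instead; it also assumes the XML parses without external entity definitions.

```python
import xml.etree.ElementTree as ET


def extract_abnf(xml_text: str) -> str:
    """Collect the text of all <sourcecode type="abnf"> elements."""
    root = ET.fromstring(xml_text)
    blocks = []
    # ".//" searches all descendants of the root element.
    for element in root.iterfind(".//sourcecode[@type='abnf']"):
        # itertext() also covers CDATA sections, which arrive as plain text.
        blocks.append("".join(element.itertext()).strip("\n"))
    return "\n".join(blocks)
```

Elements tagged as artwork with type "abnf;old" are not matched by this selection, so only the ABNF defined in this memo is returned.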
We need to discuss case-insensitivity at some point, which is usually
rather insensitive.¶
Section 3.1 of [RFC8152] defines a common COSE header parameter
(number 3) called "content type" in the description, to indicate the
type of the data in the payload or ciphertext fields.¶
This header parameter can be either an unsigned integer, indicating a CoRE
Content-Format number, or a text string that is only defined in
general terms.
It points to Section 4.2 of [RFC6838] for 'text values following the
syntax of "<type-name>/<subtype-name>"...', but also discusses the
use of parameters and subparameters; no ABNF or similarly detailed
specification is provided.
The text does not discuss the use of Content-Coding in the text string
form, probably because nothing like the present document existed at
the time, creating a weird gap compared with the numeric
Content-Format form.
The text has only trivial changes in Section 3.1 of [I-D.ietf-cose-rfc8152bis-struct-15].¶
The present specification suggests using the production
Content-Format-String as a more formal definition of the text string
that can go into the "content type" (number 3) common header parameter
in COSE.¶
Confusion about terminology may, in the worst case, cause security
problems, as can loosely defined syntax elements of a specification.
No other security considerations are known to be raised by the present
specification.¶
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[RFC1521]
Borenstein, N. and N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1521, DOI 10.17487/RFC1521, <https://www.rfc-editor.org/info/rfc1521>.
[RFC2616]
Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, DOI 10.17487/RFC2616, <https://www.rfc-editor.org/info/rfc2616>.
[RFC4288]
Freed, N. and J. Klensin, "Media Type Specifications and Registration Procedures", RFC 4288, DOI 10.17487/RFC4288, <https://www.rfc-editor.org/info/rfc4288>.
[RFC5234]
Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, <https://www.rfc-editor.org/info/rfc5234>.
[RFC6838]
Freed, N., Klensin, J., and T. Hansen, "Media Type Specifications and Registration Procedures", BCP 13, RFC 6838, DOI 10.17487/RFC6838, <https://www.rfc-editor.org/info/rfc6838>.
[RFC7231]
Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content", RFC 7231, DOI 10.17487/RFC7231, <https://www.rfc-editor.org/info/rfc7231>.
[RFC7252]
Shelby, Z., Hartke, K., and C. Bormann, "The Constrained Application Protocol (CoAP)", RFC 7252, DOI 10.17487/RFC7252, <https://www.rfc-editor.org/info/rfc7252>.
[RFC8866]
Begen, A., Kyzivat, P., Perkins, C., and M. Handley, "SDP: Session Description Protocol", RFC 8866, DOI 10.17487/RFC8866, <https://www.rfc-editor.org/info/rfc8866>.
Matthias Kovatsch forced the authors to make up their minds about this.
Ari Keränen then forced them to write it up and created a convincing
use case for Content-Format-Strings.
John Mattsson alerted us to a mistake.
Alexey Melnikov suggested reviving this draft after a year of dormancy.¶