Matroska Codec

Internet-Draft	Matroska	October 2019
Lhomme, et al.	Expires 29 April 2020

Abstract

This document defines the Matroska codec mappings, including the codec ID, layout of data in a Block Element and in an optional CodecPrivate Element.¶

6. Codec Mappings

A Codec Mapping is a set of attributes to identify, name, and contextualize the format and characteristics of encoded data that can be contained within Matroska Clusters.¶

Each TrackEntry used within Matroska MUST reference a defined Codec Mapping using the Codec ID to identify and describe the format of the encoded data in its associated Clusters. This Codec ID is a unique registered identifier that represents the encoding stored within the Track. Certain encodings MAY also require some form of codec initialization in order to provide their decoders with context and technical metadata.¶

The intention behind this list is not to enumerate all existing audio and video codecs, but rather to list those codecs that are currently supported in Matroska and therefore need a well-defined Codec ID, so that all developers supporting Matroska use the same Codec ID. If you feel we missed support for a very important codec, please tell us on our development mailing list (cellar at ietf.org).¶

6.1. Defining Matroska Codec Support

Support for a codec is defined in Matroska with the following values.¶

6.1.1. Codec ID

Each codec supported for storage in Matroska MUST have a unique Codec ID. Each Codec ID MUST be prefixed with the string from the following table according to the associated type of the codec. All characters of a Codec ID Prefix MUST be capital letters (A-Z) except for the last character of a Codec ID Prefix which MUST be an underscore ("_").¶

Table 1
Codec Type	Codec ID Prefix
Video	"V_"
Audio	"A_"
Subtitle	"S_"
Button	"B_"

Each Codec ID MUST include a Major Codec ID immediately following the Codec ID Prefix. A Major Codec ID MAY be followed by an OPTIONAL Codec ID Suffix to communicate a refinement of the Major Codec ID. If a Codec ID Suffix is used, then the Codec ID MUST include a forward slash ("/") as a separator between the Major Codec ID and the Codec ID Suffix. The Major Codec ID MUST be composed of only capital letters (A-Z) and numbers (0-9). The Codec ID Suffix MUST be composed of only capital letters (A-Z), numbers (0-9), underscore ("_"), and forward slash ("/").¶

The following table provides examples of valid Codec IDs and their components:¶

Table 2
Codec ID Prefix	Major Codec ID	Separator	Codec ID Suffix	Codec ID
A_	AAC	/	MPEG2/LC/SBR	A_AAC/MPEG2/LC/SBR
V_	MPEG4	/	ISO/ASP	V_MPEG4/ISO/ASP
V_	MPEG1			V_MPEG1
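The composition rules above can be sketched as a small validator. This is an illustrative sketch rather than a normative grammar: the regular expression, the `KNOWN_PREFIXES` table, and the function name `split_codec_id` are assumptions made for the example, and the check is purely character-level (it does not consult any registry of defined Codec Mappings).

```python
import re

# Character-level pattern assembled from the rules in this section:
# an uppercase prefix ending in "_", a Major Codec ID of A-Z and 0-9,
# and an optional suffix introduced by a "/" separator.
CODEC_ID_RE = re.compile(
    r"(?P<prefix>[A-Z]+_)"
    r"(?P<major>[A-Z0-9]+)"
    r"(?:/(?P<suffix>[A-Z0-9_/]+))?"
)

# Prefixes listed in Table 1.
KNOWN_PREFIXES = {"V_": "Video", "A_": "Audio", "S_": "Subtitle", "B_": "Button"}


def split_codec_id(codec_id):
    """Return (prefix, major, suffix); suffix is None when absent."""
    match = CODEC_ID_RE.fullmatch(codec_id)
    if match is None or match.group("prefix") not in KNOWN_PREFIXES:
        raise ValueError("not a well-formed Codec ID: %r" % codec_id)
    return match.group("prefix"), match.group("major"), match.group("suffix")
```

For instance, `split_codec_id("V_MPEG4/ISO/ASP")` separates the string into the same components shown in Table 2.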

6.1.2. Codec Name

Each encoding supported for storage in Matroska MUST have a Codec Name. The Codec Name provides a readable label for the encoding.¶

6.1.3. Description

An optional description for the encoding. This value is only intended for human consumption.¶

6.1.4. Initialization

Each encoding supported for storage in Matroska MUST have a defined Initialization. The Initialization MUST describe the storage of data necessary to initialize the decoder, which MUST be stored within the CodecPrivate Element. When the Initialization is updated within a track, the updated Initialization data MUST be written into the CodecState Element of the first Cluster to require it. If the encoding does not require any form of Initialization, then the value "none" MUST be used to define the Initialization, and the CodecPrivate Element SHOULD NOT be written and MUST be ignored. Data that the Initialization defines to be stored in the CodecPrivate Element is known as Private Data.¶

6.1.5. Codec BlockAdditions

Additional data that contextualizes or supplements a Block can be stored within the BlockAdditional Element of a BlockMore Element. This BlockAdditional data MAY be passed to the associated decoder along with the content of the Block Element. Each BlockAdditional is coupled with a BlockAddID that identifies the kind of data it contains. The following table defines the meanings of BlockAddID values.¶

Table 3
BlockAddID Value	Definition
0	Invalid.
1	Indicates that the context of the BlockAdditional data is defined by the corresponding Codec Mapping.
2 or greater	BlockAddID values of 2 and greater are mapped to the BlockAddIDValue of the BlockAdditionMapping of the associated Track.

The values of BlockAddID that are 2 or greater have no semantic meaning, but simply associate the BlockMore Element with a BlockAdditionMapping of the associated Track. See the section on Block Additional Mappings for more information.¶
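The BlockAddID rules in Table 3 can be sketched as a small lookup. The function name and the `track_mappings` dictionary (a stand-in for the BlockAdditionMapping entries of a Track, keyed by BlockAddIDValue) are hypothetical and exist only for this illustration.

```python
def resolve_block_add_id(block_add_id, track_mappings):
    """Describe how a BlockAdditional with this BlockAddID is interpreted.

    track_mappings is a hypothetical dict of BlockAddIDValue -> description,
    standing in for the BlockAdditionMapping entries of the Track.
    """
    if block_add_id <= 0:
        # 0 is explicitly invalid; negative values cannot occur in an
        # unsigned field and are rejected here as well.
        raise ValueError("invalid BlockAddID: %d" % block_add_id)
    if block_add_id == 1:
        return "defined by the Codec Mapping"
    # Values of 2 or greater carry no meaning of their own; they only
    # point at a mapping declared for the Track (None if undeclared).
    return track_mappings.get(block_add_id)
```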

The following XML depicts the nested Elements of a BlockGroup Element with an example of BlockAdditions:¶

<BlockGroup>
  <Block>{Binary data of a VP9 video frame in YUV}</Block>
  <BlockAdditions>
    <BlockMore>
      <BlockAddID>1</BlockAddID>
      <BlockAdditional>
        {alpha channel encoding to supplement the VP9 frame}
      </BlockAdditional>
    </BlockMore>
  </BlockAdditions>
</BlockGroup>

6.1.6. Citation

Documentation of the associated normative and informative references for the codec is RECOMMENDED.¶

6.1.7. Deprecation Date

A timestamp, expressed in the format defined by [RFC3339], that notes when support for the Codec Mapping within Matroska was deprecated. If a Codec Mapping is defined with a Deprecation Date, then Matroska writers SHOULD NOT use the Codec Mapping after the Deprecation Date.¶

6.1.8. Superseded By

A Codec Mapping MAY be defined with a Superseded By value only if it has an expressed Deprecation Date. If used, the Superseded By value MUST store the Codec ID of another Codec Mapping that has superseded the Codec Mapping.¶

6.2. Recommendations for the Creation of New Codec Mappings

Creators of new Codec Mappings to be used in the context of Matroska:¶

SHOULD assume that all Codec Mappings they create might become standardized, public, commonly deployed, or usable across multiple implementations.¶
SHOULD employ meaningful values for Codec ID and Codec Name that they have reason to believe are currently unused.¶
SHOULD NOT prefix their Codec ID with "X_" or similar constructs.¶

These recommendations are based upon Section 3 of [RFC6648].¶

6.3. Video Codec Mappings

6.3.1. V_MS/VFW/FOURCC

Codec ID: V_MS/VFW/FOURCC¶

Codec Name: Microsoft (TM) Video Codec Manager (VCM)¶

Description: The private data contains the VCM structure BITMAPINFOHEADER including the extra private bytes, as defined by Microsoft. The data are stored in little-endian format (like on IA32 machines). Open questions: where is the Huffman table for HuffYUV stored, if not in AVISTREAMINFO? And where is the FourCC stored, if not in AVISTREAMINFO.fccHandler?¶

Initialization: Private Data contains the VCM structure BITMAPINFOHEADER including the extra private bytes, as defined by Microsoft in https://msdn.microsoft.com/en-us/library/windows/desktop/dd183376(v=vs.85).aspx.¶

Citation: https://msdn.microsoft.com/en-us/library/windows/desktop/dd183376(v=vs.85).aspx ¶

6.3.2. V_UNCOMPRESSED

Codec ID: V_UNCOMPRESSED¶

Codec Name: Video, raw uncompressed video frames¶

Description: All details about the color specification and bit depth in use are to be written to and read from the KaxCodecColourSpace elements.¶

Initialization: none¶

6.3.3. V_MPEG4/ISO/SP

Codec ID: V_MPEG4/ISO/SP¶

Codec Name: MPEG4 ISO simple profile (DivX4)¶

Description: The stream was created via the improved codec API (UCI) or transmuxed from AVI (there are no B-frames in Simple Profile); the frame order is the coding order.¶

Initialization: none¶

6.3.4. V_MPEG4/ISO/ASP

Codec ID: V_MPEG4/ISO/ASP¶

Codec Name: MPEG4 ISO advanced simple profile (DivX5, XviD, FFMPEG)¶

Description: The stream was created via the improved codec API (UCI) or transmuxed from MP4, not simply transmuxed from AVI. Note that B-frames are handled differently in these native streams than in a VfW-created stream: no dummy frames are inserted, and the frame order is exactly the same as the coding order, as in MP4 streams.¶

Initialization: none¶

6.3.5. V_MPEG4/ISO/AP

Codec ID: V_MPEG4/ISO/AP¶

Codec Name: MPEG4 ISO advanced profile¶

Initialization: none¶

6.3.6. V_MPEG4/MS/V3

Codec ID: V_MPEG4/MS/V3¶

Codec Name: Microsoft (TM) MPEG4 V3¶

Description: Microsoft (TM) MPEG4 V3 and derivatives, i.e., DivX3, Angelpotion, SMR, etc. The stream was created using a VfW codec or transmuxed from AVI. Note that V1/V2 are covered in VfW compatibility mode.¶

Initialization: none¶

6.3.7. V_MPEG1

Codec ID: V_MPEG1¶

Codec Name: MPEG 1¶

Description: The Matroska video stream will contain a demuxed Elementary Stream (ES), where block boundaries are still to be defined. It is RECOMMENDED to use MPEG2MKV.exe for creating those files, and to compare the results with self-made implementations.¶

Initialization: none¶

6.3.8. V_MPEG2

Codec ID: V_MPEG2¶

Codec Name: MPEG 2¶

Initialization: none¶

6.3.9. V_REAL/RV10

Codec ID: V_REAL/RV10¶

Codec Name: RealVideo 1.0 aka RealVideo 5¶

Description: Individual slices from the Real container are combined into a single frame.¶

Initialization: The Private Data contains a real_video_props_t structure in Big Endian byte order as found in librmff.¶

6.3.10. V_REAL/RV20

Codec ID: V_REAL/RV20¶

Codec Name: RealVideo G2 and RealVideo G2+SVT¶

Description: Individual slices from the Real container are combined into a single frame.¶

Initialization: The Private Data contains a real_video_props_t structure in Big Endian byte order as found in librmff.¶

6.3.11. V_REAL/RV30

Codec ID: V_REAL/RV30¶

Codec Name: RealVideo 8¶

Description: Individual slices from the Real container are combined into a single frame.¶

Initialization: The Private Data contains a real_video_props_t structure in Big Endian byte order as found in librmff.¶

6.3.12. V_REAL/RV40

Codec ID: V_REAL/RV40¶

Codec Name: rv40 : RealVideo 9¶

Description: Individual slices from the Real container are combined into a single frame.¶

Initialization: The Private Data contains a real_video_props_t structure in Big Endian byte order as found in librmff.¶

6.3.13. V_QUICKTIME

Codec ID: V_QUICKTIME¶

Codec Name: Video taken from QuickTime(TM) files¶

Description: Several codecs as stored in QuickTime, e.g. Sorenson or Cinepak.¶

Initialization: The Private Data contains all additional data that is stored in the 'stsd' (sample description) atom in the QuickTime file after the mandatory video descriptor structure (starting with the size and FourCC fields). For an explanation of the QuickTime file format, see the QuickTime File Format Specification.¶

6.3.14. V_THEORA

Codec ID: V_THEORA¶

Codec Name: Theora¶

Initialization: The Private Data contains the first three Theora packets in order. The lengths of the packets precede them. The actual layout is:¶

Byte 1: number of distinct packets '#p' minus one inside the CodecPrivate block. This MUST be '2' for current (as of 2016-07-08) Theora headers.¶
Bytes 2..n: lengths of the first '#p' packets, coded in Xiph-style lacing. The length of the last packet is the length of the CodecPrivate block minus the lengths coded in these bytes minus one.¶
Bytes n+1..: The Theora identification header, followed by the comment header, followed by the codec setup header. Those are described in the Theora specs.¶
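The layout above can be read back with a short routine. The sketch below is one possible reading, assuming the usual Xiph-style lacing convention in which each length is the sum of a run of 255-valued bytes and one terminating byte below 255; the function name `split_xiph_private` is made up for the example.

```python
def split_xiph_private(private_data):
    """Split a Xiph-laced CodecPrivate blob into its header packets."""
    if not private_data:
        raise ValueError("empty CodecPrivate")
    packet_count = private_data[0] + 1  # byte 1 stores the count minus one
    pos = 1
    sizes = []
    # Every packet except the last has an explicit laced length.
    for _ in range(packet_count - 1):
        size = 0
        while True:
            if pos >= len(private_data):
                raise ValueError("truncated lacing data")
            value = private_data[pos]
            pos += 1
            size += value
            if value != 255:  # a byte below 255 ends this length
                break
        sizes.append(size)
    # The last packet takes whatever remains after the other packets.
    last_size = len(private_data) - pos - sum(sizes)
    if last_size < 0:
        raise ValueError("laced lengths exceed CodecPrivate size")
    sizes.append(last_size)
    packets = []
    for size in sizes:
        packets.append(bytes(private_data[pos:pos + size]))
        pos += size
    return packets
```

For Theora, the three returned packets would be the identification, comment, and setup headers, in that order.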

6.3.15. V_PRORES

Codec ID: V_PRORES¶

Codec Name: Apple ProRes¶

Initialization: The Private Data contains the FourCC as found in MP4 movies:¶

ap4x: ProRes 4444 XQ¶
ap4h: ProRes 4444¶
apch: ProRes 422 High Quality¶
apcn: ProRes 422 Standard Definition¶
apcs: ProRes 422 LT¶
apco: ProRes 422 Proxy¶
aprh: ProRes RAW High Quality¶
aprn: ProRes RAW Standard Definition¶

this page for more technical details on ProRes ¶
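A reader could map the FourCC found in the Private Data to the names listed above with a simple table. This is only a sketch: the dictionary copies the list in this section, and the function name and the choice to return None for an unlisted FourCC are assumptions of the example.

```python
# FourCC values and names copied from the list in this section.
PRORES_FOURCCS = {
    "ap4x": "ProRes 4444 XQ",
    "ap4h": "ProRes 4444",
    "apch": "ProRes 422 High Quality",
    "apcn": "ProRes 422 Standard Definition",
    "apcs": "ProRes 422 LT",
    "apco": "ProRes 422 Proxy",
    "aprh": "ProRes RAW High Quality",
    "aprn": "ProRes RAW Standard Definition",
}


def prores_profile(private_data):
    """Return the listed name for the FourCC, or None if it is not listed."""
    if len(private_data) < 4:
        raise ValueError("CodecPrivate too short to hold a FourCC")
    fourcc = bytes(private_data[:4]).decode("ascii", errors="replace")
    return PRORES_FOURCCS.get(fourcc)
```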

6.3.16. V_VP8

Codec ID: V_VP8¶

Codec Name: VP8 Codec format¶

Description: VP8 is an open and royalty-free video compression format developed by Google and created by On2 Technologies as a successor to VP7 [RFC6386].¶

Codec BlockAdditions: A single-channel encoding of an alpha channel MAY be stored in BlockAdditions. The BlockAddID of the BlockMore containing these data MUST be 1.¶

Initialization: none¶

6.3.17. V_VP9

Codec ID: V_VP9¶

Codec Name: VP9 Codec format¶

Description: VP9 is an open and royalty-free video compression format developed by Google as a successor to VP8. See the Draft VP9 Bitstream and Decoding Process Specification.¶

Codec BlockAdditions: A single-channel encoding of an alpha channel MAY be stored in BlockAdditions. The BlockAddID of the BlockMore containing these data MUST be 1.¶

Initialization: none¶

6.3.18. V_FFV1

Codec ID: V_FFV1¶

Codec Name: FF Video Codec 1¶

Description: FFV1 is a lossless intra-frame video encoding format designed to efficiently compress video data in a variety of pixel formats. Compared to uncompressed video, FFV1 offers storage compression, frame fixity, and self-description, which makes FFV1 useful as a preservation or intermediate video format. See the Draft FFV1 Specification.¶

Initialization: For FFV1 versions 0 or 1, Private Data SHOULD NOT be written. For FFV1 version 3 or greater, the Private Data MUST contain the FFV1 Configuration Record structure, as defined in https://tools.ietf.org/html/draft-ietf-cellar-ffv1-04#section-4.2, and no other data.¶

6.4. Audio Codec Mappings

6.4.1. A_MPEG/L3

Codec ID: A_MPEG/L3¶

Codec Name: MPEG Audio 1, 2, 2.5 Layer III¶

Description: The data contain everything needed for playback in the MPEG Audio header of each frame. Corresponding ACM wFormatTag : 0x0055¶

Initialization: none¶

6.4.2. A_MPEG/L2

Codec ID: A_MPEG/L2¶

Codec Name: MPEG Audio 1, 2 Layer II¶

Description: The data contain everything needed for playback in the MPEG Audio header of each frame. Corresponding ACM wFormatTag : 0x0050¶

Initialization: none¶

6.4.3. A_MPEG/L1

Codec ID: A_MPEG/L1¶

Codec Name: MPEG Audio 1, 2 Layer I¶

Description: The data contain everything needed for playback in the MPEG Audio header of each frame. Corresponding ACM wFormatTag : 0x0050¶

Initialization: none¶

6.4.4. A_PCM/INT/BIG

Codec ID: A_PCM/INT/BIG¶

Codec Name: PCM Integer Big Endian¶

Description: The audio bit depth MUST be read and set from the BitDepth Element. Audio samples MUST be considered as signed values, except if the audio bit depth is 8 which MUST be interpreted as unsigned values. Corresponding ACM wFormatTag : ???¶

Initialization: none¶
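The signedness rule in the Description can be sketched as follows. The example assumes the bit depth is a whole number of bytes and treats the payload as a flat run of samples (channel interleaving is left to the caller); the function name `decode_pcm_int_big` is hypothetical.

```python
def decode_pcm_int_big(payload, bit_depth):
    """Decode big-endian integer PCM samples into a list of ints."""
    if bit_depth <= 0 or bit_depth % 8 != 0:
        raise ValueError("this sketch only handles whole-byte bit depths")
    width = bit_depth // 8
    if len(payload) % width != 0:
        raise ValueError("payload is not a whole number of samples")
    # 8-bit samples are unsigned; every other depth is signed.
    signed = bit_depth != 8
    return [
        int.from_bytes(payload[i:i + width], "big", signed=signed)
        for i in range(0, len(payload), width)
    ]
```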

6.4.5. A_PCM/INT/LIT

Codec ID: A_PCM/INT/LIT¶

Codec Name: PCM Integer Little Endian¶

Initialization: none¶

6.4.6. A_PCM/FLOAT/IEEE

Codec ID: A_PCM/FLOAT/IEEE¶

Codec Name: Floating Point, IEEE compatible¶

Description: The audio bit depth MUST be read and set from the BitDepth Element (32 bit in most cases). The floats are stored as defined in [IEEE.754.1985] and in little endian order. Corresponding ACM wFormatTag : 0x0003¶

Initialization: none¶
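Decoding such samples might look like the sketch below, which assumes only the 32-bit and 64-bit IEEE 754 widths and again treats the payload as a flat run of samples. The function name `decode_pcm_float` is an assumption of the example.

```python
import struct

# struct format codes for the IEEE 754 widths handled by this sketch.
_FLOAT_CODES = {32: "f", 64: "d"}


def decode_pcm_float(payload, bit_depth):
    """Decode little-endian IEEE 754 float samples into a list of floats."""
    code = _FLOAT_CODES.get(bit_depth)
    if code is None:
        raise ValueError("unsupported float bit depth: %r" % bit_depth)
    width = bit_depth // 8
    if len(payload) % width != 0:
        raise ValueError("payload is not a whole number of samples")
    count = len(payload) // width
    # "<" selects little-endian byte order with no padding.
    return list(struct.unpack("<%d%s" % (count, code), payload))
```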

6.4.7. A_MPC

Codec ID: A_MPC¶

Codec Name: MPC (musepack) SV8¶

Description: The main developer for musepack has requested that we wait until the SV8 framing has been fully defined for musepack before defining how to store it in Matroska.¶

6.4.8. A_AC3

Codec ID: A_AC3¶

Codec Name: (Dolby[TM]) AC3¶

Description: BSID <= 8. The private data is void (to be confirmed). Corresponding ACM wFormatTag: 0x2000. The number of channels has to be read from the corresponding audio element.¶

6.4.9. A_AC3/BSID9

Codec ID: A_AC3/BSID9¶

Codec Name: (Dolby[TM]) AC3¶

Description: The AC-3 frame header has, similar to the MPEG audio header, a version field. Normal AC-3 is defined as bitstream ID (BSID) 8; the field is 5 bits wide, so values range from 0 to 31. Everything below 8 is still compatible with all decoders that handle 8 correctly. Everything higher is an addition that breaks decoder compatibility. For the sample rates 24 kHz (00), 22.05 kHz (01), and 16 kHz (10), the BSID is 9. For the sample rates 12 kHz (00), 11.025 kHz (01), and 8 kHz (10), the BSID is 10.¶

Initialization: none¶
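A muxer choosing between the AC-3 Codec IDs could follow the BSID as sketched below. The position of the BSID (the top 5 bits of the sixth byte of a frame that starts with the 0x0B77 sync word) reflects the commonly documented AC-3 frame layout and is an assumption of this example rather than something this document states; the function names are hypothetical.

```python
def ac3_bsid(frame):
    """Extract the 5-bit BSID from the start of an AC-3 frame."""
    if len(frame) < 6 or frame[0] != 0x0B or frame[1] != 0x77:
        raise ValueError("not the start of an AC-3 frame")
    # Assumed layout: sync word (2 bytes), CRC (2 bytes), sample-rate and
    # frame-size codes (1 byte), then BSID in the top 5 bits of byte 6.
    return frame[5] >> 3


def ac3_codec_id(bsid):
    """Pick the Codec ID described in this section for a given BSID."""
    if bsid < 0:
        raise ValueError("BSID cannot be negative")
    if bsid <= 8:
        return "A_AC3"
    if bsid == 9:
        return "A_AC3/BSID9"
    if bsid == 10:
        return "A_AC3/BSID10"
    raise ValueError("no Codec ID described here for BSID %d" % bsid)
```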

6.4.10. A_AC3/BSID10

Codec ID: A_AC3/BSID10¶

Codec Name: (Dolby[TM]) AC3¶

Initialization: none¶

6.4.11. A_ALAC

Codec ID: A_ALAC¶

Codec Name: ALAC (Apple Lossless Audio Codec)¶

Initialization: The Private Data contains ALAC's magic cookie (both the codec specific configuration as well as the optional channel layout information). Its format is described in ALAC's official source code.¶

6.4.12. A_DTS

Codec ID: A_DTS¶

Codec Name: Digital Theatre System¶

Description: Supports DTS, DTS-ES, DTS-96/24, DTS-HD High Resolution Audio and DTS-HD Master Audio. The private data is void. Corresponding ACM wFormatTag : 0x2001¶

Initialization: none¶

6.4.13. A_DTS/EXPRESS

Codec ID: A_DTS/EXPRESS¶

Codec Name: Digital Theatre System Express¶

Description: DTS Express (a.k.a. LBR) audio streams. The private data is void. Corresponding ACM wFormatTag : 0x2001¶

Initialization: none¶

6.4.14. A_DTS/LOSSLESS

Codec ID: A_DTS/LOSSLESS¶

Codec Name: Digital Theatre System Lossless¶

Description: DTS Lossless audio that does not have a core substream. The private data is void. Corresponding ACM wFormatTag : 0x2001¶

Initialization: none¶

6.4.15. A_VORBIS

Codec ID: A_VORBIS¶

Codec Name: Vorbis¶

Initialization: The Private Data contains the first three Vorbis packets in order. The lengths of the packets precede them. The actual layout is:¶

Byte 1: number of distinct packets '#p' minus one inside the CodecPrivate block. This MUST be '2' for current (as of 2016-07-08) Vorbis headers.¶
Bytes 2..n: lengths of the first '#p' packets, coded in Xiph-style lacing. The length of the last packet is the length of the CodecPrivate block minus the lengths coded in these bytes minus one.¶
Bytes n+1..: The Vorbis identification header, followed by the Vorbis comment header, followed by the codec setup header.¶
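On the writing side, the same layout can be produced from the three header packets. The sketch below assumes the usual Xiph-style lacing convention (a length is written as a run of 255-valued bytes followed by one byte below 255); the helper names are invented for the example.

```python
def xiph_lace_size(size):
    """Encode one packet length in Xiph-style lacing."""
    if size < 0:
        raise ValueError("packet length cannot be negative")
    # A run of 255s, then the remainder (which may be 0).
    return bytes([255] * (size // 255) + [size % 255])


def build_xiph_private(packets):
    """Assemble CodecPrivate from header packets (three for Vorbis)."""
    if not 1 <= len(packets) <= 256:
        raise ValueError("packet count must fit in one byte minus one")
    out = bytearray([len(packets) - 1])  # byte 1: count minus one
    for packet in packets[:-1]:          # the last length is implicit
        out += xiph_lace_size(len(packet))
    for packet in packets:
        out += packet
    return bytes(out)
```

The last packet's length is deliberately not written, since a reader derives it from the total size of the CodecPrivate block.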

6.4.16. A_FLAC

Codec ID: A_FLAC¶

Codec Name: FLAC (Free Lossless Audio Codec)¶

Initialization: The Private Data contains all the header/metadata packets before the first data packet. These include the first header packet containing only the word fLaC as well as all metadata packets.¶
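A reader might sanity-check this Private Data by confirming the fLaC marker and walking the metadata blocks that follow. The block header layout used here (one byte holding a last-block flag and a 7-bit type, then a 24-bit big-endian length) comes from the FLAC format documentation as commonly described and is an assumption of this sketch; the function name is hypothetical.

```python
def list_flac_metadata_blocks(private_data):
    """Return [(block_type, block_body), ...] from a FLAC CodecPrivate."""
    if bytes(private_data[:4]) != b"fLaC":
        raise ValueError("missing fLaC marker")
    pos = 4
    blocks = []
    while pos < len(private_data):
        if pos + 4 > len(private_data):
            raise ValueError("truncated metadata block header")
        flags = private_data[pos]
        is_last = bool(flags & 0x80)  # top bit marks the final block
        block_type = flags & 0x7F     # low 7 bits give the block type
        length = int.from_bytes(private_data[pos + 1:pos + 4], "big")
        pos += 4
        if pos + length > len(private_data):
            raise ValueError("truncated metadata block body")
        blocks.append((block_type, bytes(private_data[pos:pos + length])))
        pos += length
        if is_last:
            break
    return blocks
```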