Internet-Draft | CARP - a CMAF compliant implementation o | September 2025 |
Law | Expires 28 March 2026 |
This document updates [WARP] by defining a new optional feature for the streaming format. It specifies the syntax and semantics for adding CMAF-packaged media [CMAF] to WARP.¶
This note is to be removed before publishing as an RFC.¶
The latest revision of this draft can be found at https://wilaw.github.io/carp/draft-law-moq-carp.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-law-moq-carp/.¶
Discussion of this document takes place on the Media Over QUIC Working Group mailing list (mailto:moq@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/moq/. Subscribe at https://www.ietf.org/mailman/listinfo/moq/.¶
Source for this draft and an issue tracker can be found at https://github.com/wilaw/carp.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 28 March 2026.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
CARP Streaming Format (CARP) is a media format designed to deliver CMAF [CMAF] and LOC [LOC] compliant media content over MOQ Transport (MOQT) [MoQTransport]. CARP extends WARP and retains all the scope, capabilities and features of WARP including the catalog format, timeline, ABR switching and LOC support. CARP is targeted at real-time and interactive levels of live latency, as well as VOD content.¶
This document describes version 1 of the CARP streaming format.¶
All of the specifications, requirements, and terminology defined in [WARP] apply to implementations of this extension unless explicitly noted otherwise in this document.¶
A CMAF header is a sequence of CMAF constrained ISO BMFF boxes that do not reference any media samples, but are associated with a CMAF track and are necessary for initializing the decoding of the subsequent CMAF fragments.¶
The header for a given MOQT Track MUST be packaged by encoding the header using [BASE64] and then inserting that payload as the value of the Initialization data "initData" field in the catalog entry for that Track.¶
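As a non-normative illustration, the packaging step above could be sketched as follows. This sketch assumes that [BASE64] refers to the standard Base64 alphabet with padding; the function names and the placeholder header bytes are hypothetical, and a real CMAF header would contain complete ftyp and moov boxes.

```python
import base64


def catalog_entry_with_init_data(name: str, cmaf_header: bytes) -> dict:
    """Build a catalog track entry whose initData carries the Base64-encoded header.

    Only the fields relevant to this illustration are populated.
    """
    return {
        "name": name,
        "packaging": "cmaf",
        "initData": base64.b64encode(cmaf_header).decode("ascii"),
    }


def decode_init_data(entry: dict) -> bytes:
    """Recover the CMAF header bytes from a catalog track entry."""
    return base64.b64decode(entry["initData"], validate=True)


# Placeholder bytes standing in for the start of a real CMAF header.
header = b"\x00\x00\x00\x20ftypiso5"
entry = catalog_entry_with_init_data("hd", header)
```

A subscriber would call decode_init_data on the catalog entry and pass the resulting bytes to its decoder before processing any Objects from that Track.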
This specification defines a direct mapping between CMAF Tracks ([CMAF] Sect 3.2.1) and MOQT tracks ([MoQTransport] Sect 2.3).¶
A CMAF switching set is a set of one or more CMAF tracks (3.2.1), where each track is an alternative encoding of the same source content and the tracks are constrained to enable seamless track switching (3.3.9).¶
Each CMAF track in a switching set MUST be transmitted as a separate MOQT Track. The catalog entry for each track in the switching set MUST carry an Alternate group (altGroup) key with a common value.¶
The MOQT Group numbers within these switching set tracks MUST be media time-aligned. Media time-alignment requires that, for any given Group number, the presentation time of the first media sample contained within the first MOQT Object of that MOQT Group is identical across all tracks in the switching set.¶
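The switching set constraints can be checked with a sketch like the one below. It is non-normative and rests on assumptions: the function names are hypothetical, and the first-sample presentation times are assumed to have already been extracted from the media and converted to a common unit (for example milliseconds) before comparison.

```python
from collections import defaultdict


def switching_sets(tracks: list) -> dict:
    """Group catalog track names by their altGroup value."""
    sets = defaultdict(list)
    for track in tracks:
        if "altGroup" in track:
            sets[track["altGroup"]].append(track["name"])
    return dict(sets)


def misaligned_groups(first_pts: dict) -> list:
    """Return the Group numbers whose first-sample presentation time differs.

    first_pts maps a track name to {group number: presentation time of the
    first media sample in the first Object of that Group}.
    """
    group_ids = set()
    for per_track in first_pts.values():
        group_ids.update(per_track)

    bad = []
    for gid in sorted(group_ids):
        times = {pt[gid] for pt in first_pts.values() if gid in pt}
        if len(times) > 1:  # more than one distinct start time: not aligned
            bad.append(gid)
    return bad


tracks = [
    {"name": "hd", "altGroup": 1},
    {"name": "md", "altGroup": 1},
    {"name": "audio"},
]
observed = {"hd": {0: 0, 1: 2000}, "md": {0: 0, 1: 2040}}
```

In this example, Group 1 of "md" starts 40 time units later than Group 1 of "hd", so misaligned_groups(observed) reports Group 1.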
The payload of each Object is subject to the following requirements:¶
MUST contain at least one Movie Fragment Box (moof) followed by a Media Data Box (mdat). This is equivalent to requiring that each Object hold at least one CMAF Chunk. The Movie Fragment Box (moof) MUST contain a Movie Fragment Header Box (mfhd) and a Track Fragment Box (traf) with a Track ID (track_ID) matching a Track Box (trak) in the initialization fragment.¶
MAY contain multiple successive CMAF Chunks.¶
MUST contain a single [ISOBMFF] track.¶
MUST contain media content encoded in decode order.¶
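The first of the payload requirements above can be checked by walking the top-level ISOBMFF boxes of an Object payload. The following is a minimal, non-normative sketch. The function names are hypothetical, and "followed by" is interpreted here as "immediately followed by", which is an assumption of this sketch. It does not inspect the contents of the moof box or verify the track ID.

```python
import struct


def top_level_boxes(payload: bytes) -> list:
    """Return (type, size) for each top-level ISOBMFF box in the payload."""
    boxes = []
    offset = 0
    while offset < len(payload):
        if offset + 8 > len(payload):
            raise ValueError("truncated box header")
        size, box_type = struct.unpack_from(">I4s", payload, offset)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit largesize follows the box type.
            if offset + 16 > len(payload):
                raise ValueError("truncated largesize field")
            (size,) = struct.unpack_from(">Q", payload, offset + 8)
            header_len = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the payload.
            size = len(payload) - offset
        if size < header_len or offset + size > len(payload):
            raise ValueError("invalid box size")
        boxes.append((box_type.decode("latin-1"), size))
        offset += size
    return boxes


def has_cmaf_chunk(payload: bytes) -> bool:
    """True if some moof box is immediately followed by an mdat box."""
    types = [t for t, _ in top_level_boxes(payload)]
    return any(a == "moof" and b == "mdat" for a, b in zip(types, types[1:]))


def make_box(box_type: bytes, body: bytes = b"") -> bytes:
    """Helper for the example: serialize a box with a 32-bit size field."""
    return struct.pack(">I4s", 8 + len(body), box_type) + body


# A synthetic payload: moof (holding an empty-bodied mfhd) followed by mdat.
chunk = make_box(b"moof", make_box(b"mfhd", b"\x00" * 8)) + make_box(b"mdat", b"abc")
```

A payload holding multiple successive CMAF Chunks, such as chunk + chunk, also passes has_cmaf_chunk, while a payload holding only an mdat box does not.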
Each MOQT Group¶
This specification extends the allowed packaging values defined in WARP Section 5.2.10 to include a new entry, as defined in Table 1 below:¶
Name | Value | Reference |
---|---|---|
CMAF | cmaf | This RFC |
Every Track entry in a CARP catalog MUST declare a "packaging" type value of "cmaf".¶
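A receiver could verify this requirement with a check along the following lines. This is a non-normative sketch; the function name is hypothetical, and the catalog is assumed to be a JSON object with a "tracks" array as in the examples in this document.

```python
import json


def non_cmaf_tracks(catalog_json: str) -> list:
    """Return the names of tracks that do not declare packaging "cmaf"."""
    catalog = json.loads(catalog_json)
    return [
        track.get("name", "<unnamed>")
        for track in catalog.get("tracks", [])
        if track.get("packaging") != "cmaf"
    ]


example = json.dumps({
    "version": 1,
    "tracks": [
        {"name": "hd", "packaging": "cmaf"},
        {"name": "legacy", "packaging": "loc"},
        {"name": "untyped"},
    ],
})
```

An empty result means every Track entry satisfies the requirement. Here the "legacy" and "untyped" entries are reported.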
This specification adds two track-level catalog fields, as defined in Table 2 below:¶
Field | Name | Definition |
---|---|---|
Max Group SAP starting type | maxGrpSapStartingType | Section 3.5.2.1 |
Max Object SAP starting type | maxObjSapStartingType | Section 3.5.2.2 |
Location: T; Required: Optional; JSON Type: Number¶
A number indicating the maximum SAP type the MOQT Groups in the track start with. [Ed.Note: This field, when the SAP terminology is translated to video codec terminology of Random Access Point (RAP) pictures such as IDR, CRA, etc, would also apply to WARP.]¶
Location: T; Required: Optional; JSON Type: Number¶
A number indicating the maximum SAP type the MOQT Objects in the track start with.¶
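Because both fields are optional and typed as JSON Numbers, a catalog parser might validate them as sketched below. This is non-normative and the function name is hypothetical. The sketch checks only the JSON type, since this document does not state a permitted range for either field.

```python
SAP_FIELDS = ("maxGrpSapStartingType", "maxObjSapStartingType")


def invalid_sap_fields(track: dict) -> list:
    """Return the SAP starting type fields that are present but not Numbers."""
    bad = []
    for field in SAP_FIELDS:
        if field not in track:
            continue  # both fields are optional
        value = track[field]
        # JSON true/false decode to bool, which Python treats as an int
        # subclass, so booleans are rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            bad.append(field)
    return bad


track = {"name": "hd", "packaging": "cmaf", "maxGrpSapStartingType": 2}
```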
This specification extends the METADATA element of the timeline track, defined in WARP Section 7, in the following four aspects:¶
Specification of a general scheme for metadata signalled through the METADATA element of the timeline track [Ed.Note: This aspect should also be applied to WARP, thus should be moved to WARP later on.]¶
Definition of a namespace for CARP-specific metadata signalled through the METADATA element of the timeline track¶
Addition of metadata signalling of SAP type¶
Addition of metadata signalling of earliest presentation time¶
When not empty, the string in the METADATA field MUST contain one or more comma-separated key=value pairs, formatted as Strings as specified in Section 3.3.3 of [RFC9651]. For example, the METADATA field can be "key1=1,key2=3268". As another example, the METADATA field can be "key1=1,key2=""hello-world"",key3=3268".¶
A key name MAY be prefixed with a namespace. When a namespace is present, the separator between the namespace prefix and the key name is '.'.¶
For CARP-specific metadata signalled through the METADATA element, the namespace is "timeline:metadata:carp".¶
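The general scheme and namespace rules can be illustrated with a small parser. This is a non-normative sketch under stated assumptions: the function names are hypothetical, each value is assumed to be either a bare integer or a double-quoted string in which a backslash escapes the next character, and the last '.' in a key is taken as the namespace separator. Other quoting conventions are not handled.

```python
from typing import Optional, Tuple

CARP_NAMESPACE = "timeline:metadata:carp"


def parse_metadata(field: str) -> dict:
    """Parse "k=v,k=v" into a dict of int or str values."""
    pairs = {}
    i, n = 0, len(field)
    while i < n:
        eq = field.find("=", i)
        if eq <= i:
            raise ValueError("missing key or '=' at position %d" % i)
        key = field[i:eq]
        i = eq + 1
        if i < n and field[i] == '"':
            # Quoted string: read up to the closing unescaped quote.
            i += 1
            chars = []
            while i < n and field[i] != '"':
                if field[i] == "\\":
                    i += 1
                    if i >= n:
                        raise ValueError("dangling escape")
                chars.append(field[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated string")
            i += 1  # skip the closing quote
            value = "".join(chars)
        else:
            # Bare token: assumed to be an integer in this sketch.
            end = field.find(",", i)
            if end == -1:
                end = n
            value = int(field[i:end])
            i = end
        pairs[key] = value
        if i < n:
            if field[i] != "," or i + 1 == n:
                raise ValueError("expected ',' followed by another pair")
            i += 1
    return pairs


def split_key(key: str) -> Tuple[Optional[str], str]:
    """Split an optionally namespaced key into (namespace, name)."""
    namespace, sep, name = key.rpartition(".")
    return (namespace if sep else None, name)
```

For example, split_key("timeline:metadata:carp.SAP_TYPE") yields the CARP namespace and the key name "SAP_TYPE", while split_key("key1") yields no namespace.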
When the key name of a key=value pair is "SAP_TYPE", the value indicates the SAP type the Object begins with. The namespace-prefixed key is "timeline:metadata:carp.SAP_TYPE".¶
A value of 0 indicates that the Object does not start with an ISOBMFF stream access point. A value of 1, 2, or 3 indicates that the Object begins with a stream access point of SAP type 1, 2, or 3, respectively. When the Object is the first Object in the Group, the value MUST be equal to 1 or 2.¶
When the key name of a key=value pair is "EARLIEST_PTS", the value indicates the earliest media presentation timestamp of all media samples in the Object, rounded to the nearest millisecond. The namespace-prefixed key is "timeline:metadata:carp.EARLIEST_PTS".¶
When the SAP type the Object begins with is 2 or 3, the EARLIEST_PTS key SHOULD be present.¶
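The SAP_TYPE and EARLIEST_PTS rules can be expressed as a check over already-parsed METADATA key=value pairs. The sketch below is non-normative. The function name and the message strings are hypothetical, and it assumes that a key may appear either bare or with the "timeline:metadata:carp." prefix.

```python
NAMESPACE_PREFIX = "timeline:metadata:carp."


def check_object_metadata(meta: dict, first_in_group: bool) -> list:
    """Return a list of problems found in one Object's parsed METADATA."""

    def lookup(name):
        # Prefer the namespace-prefixed key, fall back to the bare key.
        return meta.get(NAMESPACE_PREFIX + name, meta.get(name))

    problems = []
    sap = lookup("SAP_TYPE")
    if sap is None:
        return problems  # nothing to check without a SAP_TYPE key
    if sap not in (0, 1, 2, 3):
        problems.append("error: SAP_TYPE value is not defined by this document")
    elif first_in_group and sap not in (1, 2):
        problems.append("error: first Object in a Group MUST have SAP_TYPE 1 or 2")
    if sap in (2, 3) and lookup("EARLIEST_PTS") is None:
        problems.append("warning: EARLIEST_PTS SHOULD be present for SAP_TYPE 2 or 3")
    return problems


ok = {
    "timeline:metadata:carp.SAP_TYPE": 2,
    "timeline:metadata:carp.EARLIEST_PTS": 1000,
}
```

Here check_object_metadata(ok, first_in_group=True) returns an empty list, whereas an Object with SAP_TYPE 3 at the start of a Group and no EARLIEST_PTS produces one error and one warning.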
The following section provides non-normative JSON examples of various catalogs compliant with this draft.¶
This example shows a catalog for a media producer capable of sending three time-aligned video tracks at high definition, medium definition and low definition qualities, along with an audio track.¶
{
  "version": 1,
  "generatedAt": 1746104606044,
  "tracks": [
    {
      "name": "hd",
      "renderGroup": 1,
      "packaging": "cmaf",
      "isLive": true,
      "initData": "AAAAIGZ0eXBpc281AAA...AAAAAAAAAAAAA",
      "role": "video",
      "codec": "avc1.640028",
      "width": 1920,
      "height": 1080,
      "bitrate": 5000000,
      "framerate": 30,
      "altGroup": 1
    },
    {
      "name": "md",
      "renderGroup": 1,
      "packaging": "cmaf",
      "isLive": true,
      "initData": "AAAAHGZ0eXBpc281AAA...AAAAAAAAAAAAAA",
      "role": "video",
      "codec": "avc1.64001e",
      "width": 720,
      "height": 640,
      "bitrate": 3000000,
      "framerate": 30,
      "altGroup": 1
    },
    {
      "name": "sd",
      "renderGroup": 1,
      "packaging": "cmaf",
      "isLive": true,
      "initData": "AAAAHGZ0eXBpc281AAA...AAAAAAAAAAAAAA",
      "role": "video",
      "codec": "avc1.64000d",
      "width": 192,
      "height": 144,
      "bitrate": 500000,
      "framerate": 30,
      "altGroup": 1
    },
    {
      "name": "audio",
      "renderGroup": 1,
      "packaging": "cmaf",
      "isLive": true,
      "initData": "AAAAHGZ0eXBpc281AAA...AAAAAAAAAAAAAA",
      "role": "audio",
      "codec": "mp4a.40.5",
      "samplerate": 48000,
      "channelConfig": "2",
      "bitrate": 67071
    }
  ]
}¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
TODO Security¶
This document has no IANA actions.¶
TODO acknowledge.¶