ISO IEC 14496-30-2018 PDF
Name in English:
St ISO IEC 14496-30-2018
Name in Russian:
Ст ISO IEC 14496-30-2018
Original standard ISO IEC 14496-30-2018, full version in PDF. Additional information and a preview are available on request.
Full title and description
ISO/IEC 14496-30:2018 — Information technology — Coding of audio-visual objects — Part 30: Timed text and other visual overlays in ISO base media file format. Defines how timed text, subtitles and other visual overlays are carried inside files based on the ISO Base Media File Format (used by MP4 and related formats), enabling multiple subtitle formats to be stored and delivered alongside audio/video.
Abstract
This part of ISO/IEC 14496 specifies carriage for certain forms of timed text and subtitle streams in ISO base media file format files. It documents sample entries, container boxes and timing and format conventions for embedding caption and subtitle data (for example TTML and WebVTT), without precluding other definitions of carriage for timed text.
General information
- Status: Published.
- Publication date: November 2018 (Edition 2).
- Publisher: ISO / IEC (ISO/IEC JTC 1/SC 29).
- ICS / categories: 35.040.40 (Coding of audio, video, multimedia and hypermedia information).
- Edition / version: Edition 2 (2018); the document has at least one published amendment (Amendment 1, 2022).
- Number of pages: 15 pages (main 2018 edition).
Scope
Specifies the storage and timing conventions for carrying timed text, subtitles and visual overlays in ISO Base Media File Format files (MP4-family), including sample entry formats and container box structures for multiple subtitle encodings. The scope covers carriage of formats such as TTML (using 'stpp' sample entries) and WebVTT (using 'wvtt' sample entries, with cues wrapped in boxes such as 'vttc' and 'vtte'), and compatibility aspects with existing timed-text systems like 3GPP Timed Text. It focuses on file-level carriage, in both unfragmented files (VOD) and fragmented MP4 (streaming), rather than on authoring rules for the subtitle markup languages themselves.
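As a rough illustration of how these sample entry codes show up in a file, the following Python sketch walks the ISOBMFF box hierarchy and reports the subtitle sample entries found in each track's 'stsd' (sample description) box. It is a minimal sketch, not a conformant parser: the function names iter_boxes and subtitle_sample_entries are hypothetical, the whole file is assumed to be in memory, and only the moov/trak/mdia/minf/stbl path is followed.

```python
import struct

# Sample entry four-character codes named in this description.
SUBTITLE_SAMPLE_ENTRIES = {
    b"stpp": "TTML (XML subtitle sample entry)",
    b"wvtt": "WebVTT",
    b"tx3g": "3GPP Timed Text",
}

# Plain container boxes on the path from the file root down to 'stsd'.
CONTAINERS = (b"moov", b"trak", b"mdia", b"minf", b"stbl")


def iter_boxes(data, offset=0, end=None):
    """Yield (type, payload_start, box_end) for each box in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit 'largesize' field follows the type field.
            if offset + 16 > end:
                break
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing container.
            size = end - offset
        if size < header or offset + size > end:
            break  # malformed or truncated box: stop rather than guess
        yield box_type, offset + header, offset + size
        offset += size


def subtitle_sample_entries(data, offset=0, end=None):
    """Return the subtitle sample entry codes found in data, in file order."""
    found = []
    for box_type, start, stop in iter_boxes(data, offset, end):
        if box_type in CONTAINERS:
            found += subtitle_sample_entries(data, start, stop)
        elif box_type == b"stsd":
            # Skip version/flags (4 bytes) and entry_count (4 bytes);
            # the sample entries that follow are themselves boxes.
            for entry_type, _, _ in iter_boxes(data, start + 8, stop):
                if entry_type in SUBTITLE_SAMPLE_ENTRIES:
                    found.append(entry_type.decode("ascii"))
    return found
```

Calling subtitle_sample_entries(open(path, "rb").read()) on a file with one TTML track would be expected to return ["stpp"]; files that keep 'moov' at the end or use very large boxes are still handled, but reading the entire file into memory is a simplification.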
Key topics and requirements
- Definition of sample entry and box structures for carrying timed text and subtitle samples inside ISOBMFF tracks (examples include stpp for TTML and wvtt for WebVTT in MP4 containers).
- Timing and sample-fragmentation rules for subtitle delivery in both progressive and fragmented MP4 (supporting streaming use cases).
- Guidance on interaction with other MP4 features (hint tracks, sample tables, timestamps and sync samples) to ensure proper playout alongside audio/video.
- Compatibility notes referencing related timed-text formats (e.g., 3GPP Timed Text / MPEG-4 timed text) and how they map into ISOBMFF carriage.
- Boxes and metadata required for subtitle sample description, codec identification, and optional timing improvements added by later amendment(s).
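The timing rules in the list above can be illustrated with a small Python sketch. In ISOBMFF carriage, the samples of a track follow one another without overlap, so overlapping cues have to be split at every cue start and end time, and periods with no active cue need a sample of their own. The sketch below is an assumption-laden simplification: the function name cues_to_samples is hypothetical, times are integers in track timescale units, the track is assumed to start at time 0, and the extra identification that the standard defines for cues spanning several samples is ignored.

```python
def cues_to_samples(cues, track_end=None):
    """Split possibly overlapping cues into contiguous, non-overlapping samples.

    cues: list of (start, end, text) tuples, times in timescale units.
    Returns a list of (start, duration, texts); an empty texts list stands
    for a sample that covers a gap with no active cue.
    """
    boundaries = {0}  # assumption: the track timeline starts at 0
    for start, end, _ in cues:
        if end <= start:
            raise ValueError("cue end must be after cue start")
        boundaries.update((start, end))
    if track_end is not None:
        boundaries.add(track_end)

    times = sorted(boundaries)
    samples = []
    for t0, t1 in zip(times, times[1:]):
        # Every cue start and end is a boundary, so a cue is active over
        # the whole interval [t0, t1) exactly when it encloses it.
        active = [text for start, end, text in cues if start <= t0 and end >= t1]
        samples.append((t0, t1 - t0, active))
    return samples
```

For two cues spanning 1000 to 4000 and 3000 to 5000, this produces four samples: an empty one of duration 1000, one with the first cue (duration 2000), one with both cues (duration 1000) and one with the second cue (duration 1000). The sample durations sum to the covered timeline, which is what lets a player keep subtitles aligned with audio and video.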
Typical use and users
Used by streaming platform engineers, encoder/transcoder and packager developers, media player implementers, subtitle authoring and delivery tool vendors, OTT/CDN integrators and broadcasters who need to store or stream captions/subtitles inside MP4/fragmented MP4 files for interoperable playback on devices and services. Implementations include packaging of TTML/TTML-derived captions and WebVTT for adaptive streaming and live fragmented MP4 workflows.
Related standards
Closely related to ISO/IEC 14496-12 (ISO Base Media File Format) and ISO/IEC 14496-14 (MP4 file format). It complements earlier text-related parts such as ISO/IEC 14496-17 (MPEG-4 Timed Text / 3GPP Timed Text) and references external caption formats and specifications (W3C WebVTT, W3C TTML and 3GPP TS 26.245). Amendment 1 (2022) provides timing improvements to the 2018 edition.
Keywords
timed text, subtitles, captions, ISO Base Media File Format, MP4, TTML, stpp, WebVTT, wvtt, tx3g, fragmented MP4, sample entry, sample table, timed overlays.
FAQ
Q: What is this standard?
A: It is Part 30 of ISO/IEC 14496 (MPEG‑4 family) that defines how to carry timed text, subtitles and visual overlay data inside ISO Base Media File Format (MP4-family) files for VOD and streaming.
Q: What does it cover?
A: It covers sample entry types, box structures and timing/sample-fragment conventions for embedding subtitle formats (notably TTML and WebVTT) in ISOBMFF files so they can be synchronized with audio/video and delivered via progressive or fragmented MP4 workflows.
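To make the box structures concrete, here is a hedged Python sketch of how the payload of one WebVTT sample might be serialized: each cue is wrapped in a 'vttc' box containing a 'payl' box with the cue text, and a sample with no active cue consists of a single empty 'vtte' box. The helper names box and webvtt_sample are hypothetical, optional child boxes (cue identifier, settings and so on) are left out, and the cue text is assumed to be UTF-8 that fills the rest of its box without a terminator; check the standard before relying on these details.

```python
import struct


def box(box_type, payload=b""):
    """Serialize a basic box: 32-bit size, four-character type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def webvtt_sample(cue_texts):
    """Build the payload of one WebVTT sample from the active cue texts."""
    if not cue_texts:
        return box(b"vtte")  # empty-cue box for a period without cues
    return b"".join(
        box(b"vttc", box(b"payl", text.encode("utf-8"))) for text in cue_texts
    )
```

A sample carrying the single cue "Hi" is 18 bytes long under these assumptions: an 8-byte 'vttc' header around a 10-byte 'payl' box.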
Q: Who typically uses it?
A: Streaming engineers, packager and encoder vendors, player developers, OTT platforms, broadcasters and anyone implementing MP4-based delivery of captions/subtitles.
Q: Is it current or superseded?
A: The 2018 edition (Edition 2) is the current published edition for Part 30; a published amendment (Amendment 1, 2022) updates timing-related details. As with all ISO standards, the document is subject to periodic review.
Q: Is it part of a series?
A: Yes — it is Part 30 of the ISO/IEC 14496 series (the MPEG‑4 / Coding of audio‑visual objects family), which includes related parts such as Part 12 (ISO Base Media File Format) and Part 14 (MP4 file format), and it complements earlier text parts such as Part 17.
Q: What are the key keywords?
A: Timed text, subtitles, captions, MP4, ISO Base Media File Format, TTML, WebVTT, stpp, wvtt, tx3g, fragmented MP4.