Best Practice for Content Preparation - Technical recommendations
Preparing your media for OTT is important to enable efficient delivery and a good end-user experience.
Also, it is key for processing your content later for purposes like archiving, clipping and replay.
Having source content correctly formatted will make the processing by Unified Packager and Unified Origin easier and improve performance.
This section is about the format of the content at the media source. For dynamic delivery of VOD or Live, Unified Origin will repackage this source on-the-fly (i.e., ‘just-in-time’) for delivery in the requested output format (to support different end user devices). For static delivery of VOD, this source may be repackaged with Unified Packager into the intended delivery format.
In general, source content must be stored as fragmented MP4 (preferably CMAF). Some advantages:
Efficient for cloud storage
Used by DASH-IF Live ingest
Used in HLS and DASH protocols
Popular international standard
Single-track-per-file fMP4 that is not strictly CMAF compliant is also acceptable in many cases.
The first sample in a segment contains an Instantaneous Decoding Refresh (IDR) frame that is signaled as a sync sample, so that the segment can be decoded independently of earlier segments. This enables the player to switch between adaptive bitrate video components without significant degradation of the rendered video.
In case more than 1 audio track is used, the mdhd box and/or elng box shall contain the audio language.
In case 1 or more timed text tracks are used, the mdhd box and/or elng box shall contain the language of the timed text track.
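As an illustration of how the language is carried in the mdhd box: ISO/IEC 14496-12 packs a three-letter ISO 639-2/T code into 15 bits, five bits per letter, each stored as the character value minus 0x60. The sketch below shows that packing in Python. The function names are hypothetical and not part of any Unified Streaming tool, and the elng box (which carries a BCP 47 string instead) is not covered.

```python
def decode_mdhd_language(packed):
    """Decode the 15-bit packed language field of an mdhd box into 3 letters."""
    packed &= 0x7FFF  # the top bit of the 16-bit field is padding
    # Three 5-bit values, most significant first; each is (character - 0x60).
    return "".join(chr(((packed >> shift) & 0x1F) + 0x60) for shift in (10, 5, 0))


def encode_mdhd_language(code):
    """Pack a lowercase three-letter ISO 639-2/T code into the mdhd field."""
    if len(code) != 3 or not code.isascii() or not code.isalpha() or not code.islower():
        raise ValueError("expected three lowercase ASCII letters, e.g. 'eng'")
    value = 0
    for ch in code:
        value = (value << 5) | (ord(ch) - 0x60)
    return value
```

A track whose field decodes to "und" (undetermined) has no usable language set, which is the case to look out for when several audio or timed text tracks are present.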
Audio and video tracks shall contain a bitrate box (btrt) indicating the average and maximum bitrate. In case of MP4 audio (AAC), this may be omitted if the bitrate is signaled in the MP4 audio sample entry.
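To make the bitrate signaling concrete, the following minimal sketch reads the three 32-bit fields of a bitrate box payload (bufferSizeDB, maxBitrate, avgBitrate, in that order per ISO/IEC 14496-12) and flags values that look unset. It assumes the 8-byte box header has already been stripped. The function names and the two sanity checks are illustrative assumptions, not the checks performed by Unified Packager or Unified Origin.

```python
import struct


def parse_btrt(payload):
    """Parse the payload of a 'btrt' box (box header already removed)."""
    if len(payload) < 12:
        raise ValueError("btrt payload must be at least 12 bytes")
    buffer_size_db, max_bitrate, avg_bitrate = struct.unpack(">III", payload[:12])
    return {
        "buffer_size_db": buffer_size_db,
        "max_bitrate": max_bitrate,   # bits per second
        "avg_bitrate": avg_bitrate,   # bits per second
    }


def btrt_problems(info):
    """Return human-readable warnings for suspicious bitrate values."""
    problems = []
    if info["avg_bitrate"] == 0:
        problems.append("average bitrate is not set")
    if info["max_bitrate"] < info["avg_bitrate"]:
        problems.append("maximum bitrate is lower than average bitrate")
    return problems
```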
To be able to calculate and signal the framerate of the video, the following must be signaled in an AVC video track (avc1/avc3/encv):
The timescale and the number of units in a tick must be set in the respective VUI parameters (‘time_scale’ and ‘num_units_in_tick’)
In addition, the Boolean VUI parameters ‘timing_info_present_flag’ and ‘fixed_frame_rate_flag’ must both be set to ‘true’, to signal that the timing info is present and that the framerate is fixed.
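The calculation that these VUI parameters enable can be sketched as follows. In H.264 a clock tick is num_units_in_tick / time_scale seconds, and under the common convention a frame lasts two ticks, so the framerate is time_scale / (2 * num_units_in_tick). The function name is hypothetical, and the factor of two is an assumption that holds for ordinary frame-coded content; it is not taken from this document.

```python
from fractions import Fraction


def avc_frame_rate(time_scale, num_units_in_tick,
                   timing_info_present_flag, fixed_frame_rate_flag):
    """Return the frame rate as an exact Fraction, or None if it is not signaled."""
    if not (timing_info_present_flag and fixed_frame_rate_flag):
        return None  # without both flags the frame rate cannot be relied upon
    if time_scale <= 0 or num_units_in_tick <= 0:
        return None  # timing info present but unusable
    # Assumption: one frame spans two clock ticks.
    return Fraction(time_scale, 2 * num_units_in_tick)
```

For example, time_scale 60000 with num_units_in_tick 1001 yields 30000/1001 (about 29.97 fps), and time_scale 50 with num_units_in_tick 1 yields 25 fps.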
For HEVC tracks, constant_frame_rate shall be 1 or 2 in the HEVCDecoderConfigurationRecord in the sample entry of type hev1, hvc1 or encv. The average_frame_rate shall also be set.
All tracks shall start at the same media presentation time, and that time shall be zero.
An stss box must be present. Its absence indicates that every sample is a sync sample. This must be fixed, or the HLS playlist generated by Origin would include only keyframe video tracks, which is not the desired outcome.
For VP8 and VP9 tracks, a list of checks is performed on the content of the VP Codec Configuration Box. See https://www.webmproject.org/vp9/mp4/ for reference. In particular, the checks verify that:
“version” is 1
“profile” has a legal and defined value
“bitDepth” has a legal value
“chromaSubsampling” has a legal value and is consistent with “matrixCoefficients”
“codecIntializationData” is not used for VP8 and VP9
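The checks listed above can be approximated with a short sketch. It assumes the version 1 vpcC layout as described in the referenced WebM binding document: a FullBox version and flags, followed by profile, level, a byte packing bitDepth (4 bits), chromaSubsampling (3 bits) and videoFullRangeFlag (1 bit), then colourPrimaries, transferCharacteristics, matrixCoefficients and a 16-bit codecIntializationDataSize. The function name, the exact sets of legal values, and the rule that RGB (matrixCoefficients 0) requires 4:4:4 are stated here as assumptions drawn from that document, not as the validator's actual implementation.

```python
import struct


def check_vpcc(payload):
    """Return a list of problems in a vpcC payload (FullBox header included)."""
    if len(payload) < 12:
        return ["payload shorter than the 12 bytes of fixed fields"]
    version = payload[0]  # FullBox version; bytes 1 to 3 are flags
    (profile, _level, packed, _primaries, _transfer,
     matrix, init_size) = struct.unpack(">BBBBBBH", payload[4:12])
    bit_depth = packed >> 4        # upper four bits
    chroma = (packed >> 1) & 0x07  # next three bits; lowest bit is the range flag

    problems = []
    if version != 1:
        problems.append("version is %d, expected 1" % version)
    if profile not in (0, 1, 2, 3):
        problems.append("profile %d is not defined" % profile)
    if bit_depth not in (8, 10, 12):
        problems.append("bitDepth %d is not legal" % bit_depth)
    if chroma not in (0, 1, 2, 3):
        problems.append("chromaSubsampling %d is not legal" % chroma)
    elif matrix == 0 and chroma != 3:
        # Assumption: RGB content must be signaled as 4:4:4.
        problems.append("matrixCoefficients 0 (RGB) requires chromaSubsampling 3")
    if init_size != 0:
        problems.append("codec initialization data must be empty for VP8 and VP9")
    return problems
```

An empty list means none of the sketched checks failed; it does not guarantee that the box is valid in every other respect.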
Either a ‘sidx’ or ‘mfra’ box must be present in the file
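One way to verify the presence of an index is to walk the top-level boxes of the file and look for ‘sidx’ or ‘mfra’. The sketch below does this for data already held in memory, handling the 64-bit size (size field 1) and extends-to-end-of-file (size field 0) cases defined in ISO/IEC 14496-12. The function names are hypothetical; for large files, a real tool would read box headers and seek rather than load everything.

```python
import struct


def top_level_box_types(data):
    """Return the four-character types of all top-level boxes in data."""
    types = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack(">I4s", data[offset:offset + 8])
        header = 8
        if size == 1:  # 64-bit size follows the type field
            if offset + 16 > len(data):
                break
            size = struct.unpack(">Q", data[offset + 8:offset + 16])[0]
            header = 16
        elif size == 0:  # box extends to the end of the file
            size = len(data) - offset
        if size < header:
            break  # malformed box; stop instead of looping forever
        types.append(box_type.decode("ascii", "replace"))
        offset += size
    return types


def has_segment_index(data):
    """True if a top-level 'sidx' or 'mfra' box is present."""
    types = top_level_box_types(data)
    return "sidx" in types or "mfra" in types
```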
All video fragments except the last shall have an equal duration. In case of audio, small variation in durations may occur. If you are able to align audio segments to video segment boundaries, audio segment durations may also be constant.
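The equal-duration rule for video can be expressed as a simple check over a list of fragment durations, given in track timescale units. This is a minimal sketch with a hypothetical function name; how the durations are obtained (for example from the ‘sidx’ box or from the fragments themselves) is left out.

```python
def uneven_fragments(durations):
    """Return indexes of fragments, excluding the last, that differ from the first."""
    if len(durations) < 2:
        return []  # zero or one fragment: nothing to compare
    expected = durations[0]
    # The final fragment is allowed to be shorter, so it is skipped.
    return [i for i, d in enumerate(durations[:-1]) if d != expected]
```

An empty result means every fragment before the last has the same duration as the first.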
To avoid potential timing issues, audio tracks should use a timescale that matches the sample rate. If the timescale and the sample rate do not match (that is, one is not an integer multiple of the other), some samples will not be accurately addressable, which may cause discontinuities.
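A small worked example of why a mismatched timescale is a problem: the duration of an audio frame, expressed in timescale ticks, is samples_per_frame * timescale / sample_rate, and this is only guaranteed to be a whole number when the timescale is a multiple of the sample rate. The sketch below uses exact fractions to show the rounding that would otherwise be needed. The function names are hypothetical, and the figure of 1024 samples per frame is the usual AAC-LC value, used here as an assumption.

```python
from fractions import Fraction


def ticks_per_frame(samples_per_frame, sample_rate, timescale):
    """Exact duration of one audio frame in timescale ticks."""
    return Fraction(samples_per_frame * timescale, sample_rate)


def timescale_is_safe(sample_rate, timescale):
    """True if every audio sample lands exactly on a timescale tick."""
    return timescale % sample_rate == 0
```

With a 48000 Hz track and a timescale of 48000, one 1024-sample frame lasts exactly 1024 ticks. With a 44100 Hz track and a timescale of 90000, the same frame lasts 102400/49 ticks, which is not a whole number, so every stored duration has to be rounded and the error accumulates.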
For codecs like AVC/H.264 and HEVC/H.265, codec parameters can be carried in the SampleEntry or in the NAL units in the samples. Carriage of codec parameters in the SampleEntry is preferred. This corresponds to codec configurations such as ‘avc1’ and ‘hvc1’ (i.e., as opposed to ‘avc3’ and ‘hev1’). In case of avc3 or hev1, the codec parameters shall also be present in the SampleEntry.
Note that as of version 1.10.28, Packager can be used to convert avc3 content to avc1 by using the --no_inband_parameter_sets option when packaging content.
The VisualSampleEntry box (ISO/IEC 14496-12:2020) includes optional CleanApertureBox (‘clap’) and PixelAspectRatioBox (‘pasp’) boxes.
These boxes are used, if present, to specify the pixel aspect ratio and clean aperture of the video. For maximum compatibility, these boxes should follow, not precede, any boxes defined in or required by derived specifications.