04.2019 - 1.3


The Interoperable Master Format (IMF) is a family of standards for interchange of audiovisual material. IMF is designed to address the problems of content producers who make multi-variant content and want to distribute that content on a changing variety of consumer platforms. For these producers, the cost of editing and encoding materials for a new platform can make or break the profit potential of that platform. To be successful, producers must employ specialized content preparation workflows that make efficient use of resources, to allow rapid and inexpensive entry to new markets.

There are many proprietary examples of such workflows in operation today and, not surprisingly, their engineers often attend the same conferences. As a result of all this interaction, a common understanding of how best to organize content and metadata has evolved. Simultaneously, the explosion in consumer channels has caused interest in the subject to widen and, more importantly, interoperability between these workflows is now a requirement across the motion picture and television industries.

This workflow concept, known in some circles as Componentized Media Workflow (CMW), is an environment in which images, sound, text, and other bits of audiovisual content are stored separately and are encoded in such a way as to allow recombination of materials to produce derivative versions. These derivatives are defined by metadata which define the sequence of images, sound, text, etc. that comprise the derivative timeline.

What is IMF?

The Interoperable Master Format (IMF) is a family of standards for implementation of CMW. Published by SMPTE, the ST 2067 family defines methods for interchange of a wide variety of audiovisual content, including all common television and cinema formats. The promise of IMF is reliable CMW, assembled from off-the-shelf components and local requirements.

In addition to content encoding, IMF defines the metadata needed to support automated content processing. Copying, encoding, encryption, transformation and assembly of content can all be performed by automation, which improves the reliability of the result and utilization of resources. IMF metadata includes message digest items that support de-duplication, reliable transmission, and integrity checking, and digital signatures for authenticity.
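The role of message digests can be illustrated with a minimal Python sketch. It assumes a Base64-encoded SHA-1 digest, which is a common choice in packing lists; the function names and the in-memory "files" are hypothetical and chosen only for illustration.

```python
import base64
import hashlib
import io


def digest_stream(stream, chunk_size=1 << 20):
    """Return the Base64-encoded SHA-1 digest of a binary stream."""
    h = hashlib.sha1()
    # Read in chunks so that large track files never sit fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return base64.b64encode(h.digest()).decode("ascii")


def verify(stream, expected_digest):
    """Integrity check: does the received data match the recorded digest?"""
    return digest_stream(stream) == expected_digest


def deduplicate(named_streams):
    """Map each distinct digest to the first name seen with that digest."""
    unique = {}
    for name, stream in named_streams:
        unique.setdefault(digest_stream(stream), name)
    return unique


if __name__ == "__main__":
    recorded = digest_stream(io.BytesIO(b"frame data"))
    print(verify(io.BytesIO(b"frame data"), recorded))
```

Two files with the same digest can be treated as the same asset, so only one copy needs to be stored or transmitted.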

How does IMF do all of that?

IMF uses the word "composition" to refer to a complete audiovisual work and distinguishes "essence data" from "metadata" when referring to encoded audiovisual material and the description of that encoding, respectively. In the case of images, for example, the pixel values are the essence data and the description of how they are to be decoded is metadata.

In IMF, essence data and codec metadata are stored in "track files" that conform to the Material eXchange Format (MXF) standards. MXF is widely used in professional media and so represents a well understood basis for interchange. IMF further encourages interoperability by prescribing a narrow MXF formulation, limiting the complexity required of decoders. IMF also limits the contents of any MXF file to a single clip of a single type of essence.

MXF mappings exist for many codecs. Common codecs in IMF applications are JPEG 2000, ACES and ProRes. PCM audio and Immersive Audio Bitstream are supported. Subtitles and captions are expressed using IMSC 1.

An IMF composition is described in terms of timeline metadata (by referring to clips in track files), versioning metadata, and a copy of the codec metadata. All of these metadata items are stored together in an XML "composition playlist" document (often referred to as a CPL). An IMF CPL tells the decoder everything it needs to know about codecs, files and frames to play the specific version that it represents.
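To make the timeline idea concrete, here is a minimal Python sketch that reads a deliberately simplified playlist. The XML below is NOT the real CPL schema (a real CPL uses SMPTE namespaces, UUIDs and many more elements); every element name, attribute name and identifier in it is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a CPL. Not the SMPTE schema.
SIMPLIFIED_CPL = """
<CompositionPlaylist>
  <EditRate>24 1</EditRate>
  <Segment>
    <Sequence kind="image">
      <Resource trackFileId="urn:uuid:video-a" entryPoint="0" duration="48"/>
      <Resource trackFileId="urn:uuid:video-b" entryPoint="24" duration="24"/>
    </Sequence>
    <Sequence kind="audio">
      <Resource trackFileId="urn:uuid:audio-a" entryPoint="0" duration="72"/>
    </Sequence>
  </Segment>
</CompositionPlaylist>
"""


def read_timeline(cpl_xml):
    """Return (edit_rate, {sequence kind: [(track file, entry point, duration)]})."""
    root = ET.fromstring(cpl_xml)
    numerator, denominator = (int(x) for x in root.findtext("EditRate").split())
    timeline = {}
    for sequence in root.iter("Sequence"):
        clips = [
            (r.get("trackFileId"), int(r.get("entryPoint")), int(r.get("duration")))
            for r in sequence.iter("Resource")
        ]
        timeline[sequence.get("kind")] = clips
    return (numerator, denominator), timeline


def sequence_duration(clips):
    """Total duration of a sequence, in edit units."""
    return sum(duration for _, _, duration in clips)


if __name__ == "__main__":
    rate, timeline = read_timeline(SIMPLIFIED_CPL)
    for kind, clips in timeline.items():
        print(kind, sequence_duration(clips), "edit units at", rate)
```

Each resource names a track file, a starting frame within it, and a duration, so a new version is just a different list of references into the same files.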

The IMF Output Profile List (OPL) is an XML document that defines signal processing that may be applied to the essence in an IMF composition. For example, a resize-and-letterbox operation may be performed on a 1080i HD Rec. 709 image to produce a 4:3 input to a standard definition codec. Signal processing is another IMF feature that is aimed squarely at fully automated stream production.

By allowing IMF compositions to be processed in a deterministic way, the OPL reduces the total inventory needed to service channels that offer the same content but which differ by signal format. It would not be possible to create something like the OPL without the codec metadata provided in the IMF composition model.
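The arithmetic behind a resize-and-letterbox operation can be sketched in a few lines of Python. This is an illustration only: it assumes square pixels on both sides (real SD formats often do not have them), and the function name and the 640x480 target are hypothetical rather than anything defined by the OPL.

```python
from fractions import Fraction


def letterbox(src_w, src_h, dst_w, dst_h):
    """Scale a source frame to fit inside a target frame, preserving aspect ratio.

    Returns the scaled picture size and the bars needed on each side.
    Assumes square pixels for both source and target.
    """
    # The smaller of the two ratios is the largest scale that still fits.
    scale = min(Fraction(dst_w, src_w), Fraction(dst_h, src_h))
    out_w = int(src_w * scale)
    out_h = int(src_h * scale)
    pad_x = dst_w - out_w
    pad_y = dst_h - out_h
    return {
        "scaled": (out_w, out_h),
        "left": pad_x // 2,
        "right": pad_x - pad_x // 2,
        "top": pad_y // 2,
        "bottom": pad_y - pad_y // 2,
    }


if __name__ == "__main__":
    # A 16:9 HD frame placed in a 4:3 square-pixel frame.
    print(letterbox(1920, 1080, 640, 480))
```

Because the result depends only on the source and target dimensions, the same instruction always yields the same output, which is what makes this kind of processing suitable for automation.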

Finally, the Interoperable Master Package (IMP) is a collection of IMF files packaged for transfer between systems. The IMP is appropriate for use on filesystems and URL-based access methods such as HTTP.

That seems like a lot of work. Why?

In addition to storage reduction multiplied by versions per title, splitting essence data from the composition timeline allows essence to be reused without re-encoding. This preserves quality and the Quality Control (QC) investment in the original track file because only the new essence and edit points need to be checked. As with storage reduction, this advantage is multiplied by the number of versions based on the same core set of frames.
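The storage argument can be shown with a small worked calculation in Python. The figures are invented for illustration (they are not measurements), and the model is deliberately crude: it assumes every version shares one common body of essence and adds a fixed amount of its own.

```python
def flattened_storage(shared_gb, unique_gb_per_version, versions):
    """Each version is stored as a complete, separately encoded file."""
    return versions * (shared_gb + unique_gb_per_version)


def componentized_storage(shared_gb, unique_gb_per_version, versions):
    """Shared essence is stored once; each version adds only what is new."""
    return shared_gb + versions * unique_gb_per_version


if __name__ == "__main__":
    # Hypothetical title: 500 GB of common essence, 5 GB unique per version.
    shared, unique, versions = 500, 5, 10
    print("flattened:", flattened_storage(shared, unique, versions), "GB")
    print("componentized:", componentized_storage(shared, unique, versions), "GB")
```

Under these assumed numbers the flattened library needs 5050 GB and the componentized one 550 GB, and the gap widens with every additional version.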

IMF is designed to enable the development of fully automated stream production for consumer servicing. IMF's efficient use of essence data is a natural fit for CDNs, and CPL-to-the-edge capability allows a content provider to alter a deployed title with a minimum of data transmission, saving both time and carriage.

How do I approach this beast?

The IMF Users Group (IMFUG) meets quarterly, usually coincident with a related conference such as NAB or IBC. IMFUG members span content creators, manufacturers, service providers and academics. Whatever your level of interest, you are likely to find like-minded participants and useful pointers to further your inquiry.

The IMF standards are published by the Society of Motion Picture and Television Engineers (SMPTE). If you need to understand IMF at the encoder level, SMPTE offers access to the standards-making process and the ongoing conversation about best practices in a componentized media world.

IMF “Application” documents define the complete signal specification for common use cases like 60p HD Rec. 709 with stereo audio, 4K UHD Rec. 2020 with 5.1 audio, and many others. The Application documents are the best place to start reading about IMF if you are trying to understand a particular use case and want to work with the official documentation.

IMF “Plugin” documents provide a lightweight method for defining a track file type and a way to reference it from a CPL. A Plugin is meant to be used in conjunction with an Application. For example, the Plugin for Immersive Audio Bitstream can be used with Application 2 to add object-based audio to a standard 4K UHD composition.

If you are developing or troubleshooting IMF workflows, the low-level detail needed to interpret track files, CPLs and other IMF artifacts can be found in the MXF and IMF core specifications.

The IMF Composition Playlist and core constraints are the heart and soul of IMF. These documents define the timeline model, audio channel labels, limits on MXF file layout, and many other details that make IMF what it is. The common use of these principles throughout the applications is a powerful design element, allowing re-use of software components in applications that differ by codec. For example, a program which transmits track files based on CPL references can operate without any knowledge of the codec in use.
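A codec-agnostic transfer tool of the kind just mentioned can be sketched as follows. This Python example assumes the CPL has already been reduced to a list of track file identifiers and that a separate mapping from identifier to file path is available; the identifiers, paths and function name are all hypothetical.

```python
def files_to_send(track_file_ids, asset_map, already_at_destination=frozenset()):
    """List the file paths to transmit for one composition.

    track_file_ids: identifiers in timeline order; repeats are expected
        because a composition may reuse the same track file many times.
    asset_map: identifier -> file path.
    already_at_destination: identifiers the receiver is known to hold.
    """
    seen = set()
    paths = []
    for track_file_id in track_file_ids:
        if track_file_id in seen or track_file_id in already_at_destination:
            continue
        seen.add(track_file_id)
        paths.append(asset_map[track_file_id])
    return paths


if __name__ == "__main__":
    asset_map = {
        "id-video-a": "video_a.mxf",
        "id-video-b": "video_b.mxf",
        "id-audio-a": "audio_a.mxf",
    }
    references = ["id-video-a", "id-video-b", "id-video-a", "id-audio-a"]
    print(files_to_send(references, asset_map, {"id-audio-a"}))
```

Nothing in the function inspects the contents of a track file, so the same code serves JPEG 2000, ProRes or audio-only compositions alike.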

What if I’m not ready for IMF workflow but need to make IMF deliverables?

How does PixelStrings fit in all this?

PixelStrings is a cloud platform built for media conversion and picture finishing that supports a wide variety of image formats. IMF is an available output format appropriate for creating basic IMF compositions.

PixelStrings supports much of IMF Application 2, with frame rates from 23.98 to 60 including interlace support, in SD, HD, 2k, and 4k frame sizes. PCM audio is supported in stereo and 5.1 channel arrangements. The PixelStrings UI collects all of the required metadata items for the CPL, ensuring that the finished files will contain everything needed to pass industry-accepted validation. PixelStrings has licensed CineCert’s Anini solution to do IMF wrapping; it is connected to the back end of the PixelStrings transcode engine to create a single-pass solution for authoring IMF.

PixelStrings uses all of Cinnafilm’s image processing technology to apply high-quality spatial resolution upscaling, de-noise and re-texturing, de-interlacing and frame rate conversion to the picture asset when authoring the final IMP, all in a single workflow job. The goal is to make the highest quality version for playback.

In addition, PixelStrings can remove broken cadence patterns that televisions cannot pick up, ensuring there is no stuttering in playback. Think of PixelStrings as the optimal machine to make progressive frames look and play as well as mathematically possible. Furthermore, in cases where a frame rate conversion introduces an output duration mismatch (an issue that cannot generally be avoided in conversions between certain pairs of frame rates), PixelStrings pads output tracks as needed to match the output durations, generating a standards-compliant IMP.

PixelStrings ensures that your resulting IMP output contains essence tracks that match exactly in duration, down to the exact number of audio samples, even if your original input asset contains essence tracks whose durations do not match. For example, an input asset can contain audio that is slightly longer (or shorter) in duration than the image track. PixelStrings uses padding, and not trimming, to ensure the entirety of your input asset contents are preserved in the final IMP.
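The duration-matching arithmetic can be illustrated with a short Python sketch. It is not a description of how PixelStrings is implemented: the function name and the sample figures are hypothetical, and the sketch only handles frame rates at which each frame spans a whole number of audio samples (for example, 48 kHz audio at 24000/1001 frames per second gives exactly 2002 samples per frame).

```python
from fractions import Fraction


def padding_plan(image_frames, audio_samples, edit_rate, sample_rate=48000):
    """Work out how much padding makes image and audio durations identical.

    Nothing is trimmed: the longer track sets the target, rounded up to a
    whole number of frames, and the shorter track is padded to match.
    """
    samples_per_frame = Fraction(sample_rate) / edit_rate
    if samples_per_frame.denominator != 1:
        raise ValueError("this sketch needs a whole number of samples per frame")
    spf = samples_per_frame.numerator
    # Ceiling division: frames needed to contain all of the audio.
    frames_for_audio = -(-audio_samples // spf)
    target_frames = max(image_frames, frames_for_audio)
    return {
        "samples_per_frame": spf,
        "pad_frames": target_frames - image_frames,
        "pad_samples": target_frames * spf - audio_samples,
    }


if __name__ == "__main__":
    # Audio slightly longer than a 240-frame image track at 23.976 fps.
    print(padding_plan(240, 481000, Fraction(24000, 1001)))
```

In this example the audio runs past the last image frame, so one frame is added to the picture and the audio is then padded up to the end of that frame.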

Note that Netflix does not accept IMPs where the original material has undergone any frame rate conversion or spatial resolution upscaling, and IMPs must satisfy requirements on allowed frame rates, bit rates, and spatial resolutions. To address these requirements and to help you generate Netflix-compliant IMPs, PixelStrings provides pre-defined "Netflix-compliance" workflows (available soon).

Do I need a different IMP for each picture format?

Yes, you will need one for each frame rate and each resolution. This is one of the benefits of using PixelStrings to create these multiple assets, as high-quality image processing is critical to ensuring that each version looks as visibly pure as possible at final target playout. Think of PixelStrings as the ultimate progressive-frame-based creation tool.