October 18, 2016

Efficiency in Cloud Transcoding

by Prashanth Dixit (Principal Engineer, Media Server Technologies)

The latest Intel devices targeted at media-centric workloads, such as the Intel Xeon E3v4 processors, include integrated GPUs and dedicated video hardware that significantly improve their video processing capabilities. These devices feature Intel’s QuickSync Video (QSV), which offers very high encoding channel densities. At the fastest speed setting, QSV can transcode up to 16 concurrent 1080p30 H.264 channels on a single Intel Xeon E3 1285L v4 processor.

Hardware platforms such as Intel’s Visual Compute Accelerator (VCA) include 3 Xeon E3v4 processors on a PCIe card. With up to 4 such cards in a 1 RU chassis, servers with Intel VCA are highly efficient in cost, space and power consumption. Offering up to 192 channels of HD transcoding (16 channels on each of 12 processors), VCA-based servers have the potential to deliver exceptionally high channel density for cloud solutions.

Incorporating QSV into an existing solution can be accomplished through FFMPEG plug-ins that are readily available from Intel. The FFMPEG build with QSV can transcode a total of 120 HD channels (without ABR) on a server with 4 VCA cards, without using any of the host CPU capabilities. This solution uses only about 62% of the QSV capabilities (120 of 192 channels), indicating significant gaps in the efficiency of FFMPEG in dense transcoding scenarios.

In comparison, Ittiam’s Media Cloud software transcodes 192 channels (100% of the claimed capabilities of QSV on the platform). In other words, Ittiam’s Media Cloud is 60% more efficient than FFMPEG in utilizing the platform’s video capabilities (192 versus 120 channels). The image below consolidates the comparison in terms of channels and efficiency.

[Image: efficiency-cloud-transcoding]
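The channel and efficiency figures quoted above follow from simple arithmetic. The short Python sketch below reproduces them from the numbers given in this article; the constant and variable names are illustrative only and do not come from any Intel or Ittiam tool.

```python
# Figures taken from the article; the names are illustrative assumptions.
CHANNELS_PER_PROCESSOR = 16   # 1080p30 H.264 channels per Xeon E3 v4 (QSV, fastest setting)
PROCESSORS_PER_CARD = 3       # Xeon E3 v4 processors on one VCA card
CARDS_PER_SERVER = 4          # VCA cards in a 1 RU chassis
FFMPEG_CHANNELS = 120         # channels reported for the FFMPEG build with QSV
MEDIA_CLOUD_CHANNELS = 192    # channels reported for Ittiam's Media Cloud

# Peak channel count claimed for the platform.
peak_channels = CHANNELS_PER_PROCESSOR * PROCESSORS_PER_CARD * CARDS_PER_SERVER

# Fraction of the platform's claimed capacity that each solution reaches.
ffmpeg_utilization = FFMPEG_CHANNELS / peak_channels
media_cloud_utilization = MEDIA_CLOUD_CHANNELS / peak_channels

# Relative gain of Media Cloud over the FFMPEG build, in channels.
relative_gain = MEDIA_CLOUD_CHANNELS / FFMPEG_CHANNELS - 1

print(f"Peak channels:           {peak_channels}")
print(f"FFMPEG utilization:      {ffmpeg_utilization:.1%}")
print(f"Media Cloud utilization: {media_cloud_utilization:.1%}")
print(f"Relative gain:           {relative_gain:.0%}")
```

The peak works out to 192 channels, the FFMPEG build reaches 62.5% of it, and 192 channels is 60% more than 120.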

It is easy to conclude that channel density is driven primarily by video encoding/decoding performance. However, once video performance increases this significantly, the efficiency of the rest of the system matters just as much. Here are a few key considerations in the design and implementation of a media pipeline that enable Ittiam’s Media Cloud to achieve such high levels of efficiency:

  1. Scheduling: A well-designed multi-threaded solution with intelligent scheduling helps to utilize the GPU/video hardware functions to the maximum possible extent. Leaving the GPU/video subsystem idle while it waits for stream processing on the host is the largest contributor to loss of efficiency.
  2. Platform resource management: GPU-accelerated and hardware-assisted platforms have multiple independent video processing units that can operate in parallel. A thorough understanding of the platform makes it possible to parallelize processing across these units as far as possible, which in turn aids efficient scheduling.
  3. Buffer management: Poor buffer management leads to wait cycles on the GPU and increased processing overheads on the CPU. Understanding the buffer sharing mechanisms between CPU and GPU and minimizing copy overheads in the pipeline is critical to increasing efficiency.
  4. ABR processing: Adaptive bitrate (ABR) transcoding requires multiple output streams per input stream, several of which are at lower resolutions. This leads to a significant increase in the total number of encoding instances in the transcoder. Additional intelligence for efficient multi-instance ABR transcoding, instead of brute-force multi-channel transcoding, helps to address the efficiency concerns.
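To make the ABR point concrete, the Python sketch below counts decoder and encoder instances for a set of input channels. It is only an illustration under stated assumptions: the four-rung ladder is hypothetical, and sharing one decode per input across all renditions is one common way to avoid brute-force transcoding, not a description of how Ittiam’s Media Cloud is implemented.

```python
# Hypothetical ABR ladder (output heights in pixels); not taken from the article.
LADDER = [1080, 720, 480, 360]


def instance_counts(inputs, ladder, share_decode):
    """Count decoder and encoder instances needed for ABR transcoding.

    inputs       -- number of input channels
    ladder       -- output renditions produced for every input
    share_decode -- True: decode each input once and feed all renditions;
                    False: brute force, one full transcode per rendition
    """
    encodes = inputs * len(ladder)              # one encoder per output stream
    decodes = inputs if share_decode else encodes
    return {"decodes": decodes, "encodes": encodes}


# 16 input channels, as on a single processor in the article.
brute_force = instance_counts(16, LADDER, share_decode=False)
shared = instance_counts(16, LADDER, share_decode=True)

print("Brute force:", brute_force)
print("Shared decode:", shared)
```

With this ladder, 16 inputs need 64 encoder instances either way, but sharing the decode cuts the decoder instances from 64 to 16. The encoder count is the part that grows with every rendition added to the ladder.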

Next-generation cloud platforms are making significant leaps in video processing capabilities, and it is clear that many current transcoding software solutions are unable to utilize these capabilities efficiently. Because it is specifically designed to maximize the performance of such platforms, Ittiam’s Media Cloud software is a great fit for next-generation cloud transcoding solutions.

For more information, reach out to us at mkt@ittiam.com.