September 9, 2021

Content adaptive encoding of User generated content using Ittiam THINKode

Preethi Konda (General Manager, Advanced Video BU)
Harish S (Senior Manager, Technical Sales)

Reading time: 8 min

Introduction

Global video streaming volumes are growing at a staggering pace, necessitating efficient video encoding schemes such as content adaptive encoding in OTT streaming workflows. Lately, with the advent of the Short Format Video (SFV) segment and hugely successful social media apps (SM apps) like TikTok, User-Generated Content (UGC) sharing volumes are also exploding. Budget-friendly handheld devices with built-in high-resolution cameras make video easy to capture and create, SM apps make it easy to share, and storage costs end-users little or nothing. As a result, our desire to freely share thoughts and content with our social networks is at an all-time high.

As users lap up this new sharing experience, the exponential rise in UGC sharing places the burden of increased storage costs on the services driving this revolution. The onus is now on SM app providers and IoT / handheld device OEMs to adopt novel video encoding schemes for UGC that cut storage costs while maintaining a high Quality of Experience (QoE) for users. This is easier said than done.

THINKode, Ittiam’s content adaptive encoding solution, is well proven in OTT VOD streaming. In this article, we explore how THINKode and its variants can help our customers achieve significant cost savings in UGC streaming. We describe two deployment options for THINKode: one in the cloud and the other on edge (IoT / handheld) devices.

Introduction to THINKode

THINKode is a machine learning-based, non-iterative, content adaptive encoding solution. Compared to classical fixed bitrate ladder encoding, THINKode, being content-aware, provides good bandwidth savings with consistent video quality and a faster turnaround (Figure 1). THINKode has low computational complexity, is codec agnostic, and can be used with a customer’s currently deployed encoder.

Figure 1: Highlights of THINKode

Processing UGC content

Typical UGC capture and share use-cases on handhelds include a) recording content using built-in cameras, b) uploading video content to cloud storage, and c) sharing video content with social networks using SM apps (Figure 2). These use-cases may involve video encoding as part of camera recording, and video transcoding for cloud storage or for sharing through SM apps. The transcode for cloud storage or sharing may be carried out on the handheld device or on a cloud server.

Figure 2: THINKode deployment options for UGC

Unlike professionally generated content (PGC) such as movies, UGC poses some unique challenges to achieving optimal video encoding:

  • The complexity of UGC is spread across a wide spectrum, varying from news clips to lectures and sports.
  • The capabilities of UGC capture devices vary widely, from low-end handhelds to high-end capture devices. As a result, the quality of such videos varies significantly.

Ittiam’s THINKode uses content-adaptive encoding and has been trained and validated with UGC of different genres. It has proven to deliver good content-aware bit savings with UGC. In the following sections, we illustrate how THINKode and THINKode Lite can be used in the cloud and on handheld devices respectively to transcode UGC optimally.

Transcoding of UGC in Cloud

Ittiam’s THINKode-based AI Media Transcoder is well proven for video transcoding in the cloud and has been deployed by leading OTT providers. To illustrate the benefits of using THINKode with UGC, we compare its performance against fixed bitrate encoding on UGC of multiple genres.

To carry out this comparison, we first downloaded 30 video clips spanning 15 different genres from the YouTube UGC Dataset. They were then encoded with an x264 encoder using the medium preset at a fixed bitrate of 4 Mbps.
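The article does not say which tools drove the baseline encode or the quality measurement. As a rough sketch, assuming ffmpeg built with libx264 and libvmaf is used, the two steps could be expressed as the command lines below. The helper names and file names are hypothetical, and dropping the audio track with `-an` is our own simplification.

```python
from typing import List


def x264_fixed_bitrate_cmd(src: str, dst: str,
                           bitrate: str = "4M",
                           preset: str = "medium") -> List[str]:
    """Build an ffmpeg command for a fixed-bitrate x264 encode.

    Assumption: ffmpeg with libx264 is installed; audio is dropped (-an)
    so that only the video bitrate is compared.
    """
    return ["ffmpeg", "-y", "-i", src,
            "-c:v", "libx264", "-preset", preset,
            "-b:v", bitrate, "-an", dst]


def vmaf_cmd(distorted: str, reference: str) -> List[str]:
    """Build an ffmpeg command that scores `distorted` against `reference`.

    Assumption: this ffmpeg build includes the libvmaf filter, and both
    files have the same resolution and frame rate.
    """
    return ["ffmpeg", "-i", distorted, "-i", reference,
            "-lavfi", "libvmaf", "-f", "null", "-"]


# Hypothetical file names, printed rather than executed.
print(" ".join(x264_fixed_bitrate_cmd("clip_source.mp4", "clip_4mbps.mp4")))
print(" ".join(vmaf_cmd("clip_4mbps.mp4", "clip_source.mp4")))
```

The commands are only built and printed here; running them (for example with `subprocess.run`) requires the tools and the clips to be present locally.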

  • The encoded content showed a wide range of VMAF values, from 56 to 97. Based on this distribution, 90 VMAF was set as the target quality for THINKode (90 VMAF corresponds to the 70th percentile of the VMAF distribution for this content).
  • The maximum average encoding bitrate that THINKode could allocate to a segment was set to 4 Mbps. This ensures that the average bitrate for any content does not exceed the fixed bitrate encoding level of 4 Mbps.
  • The original content was then encoded using THINKode, with x264 at the medium preset as the target encoder.
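To make the two settings above concrete, here is a minimal sketch of how a target quality can be related to a percentile of the baseline VMAF distribution, and how a per-segment bitrate cap behaves. The VMAF scores are made-up illustrative values, not the measured results, and the function names are hypothetical. We also assume that "percentile" means the share of clips scoring at or below the target.

```python
from typing import Sequence


def percentile_rank(scores: Sequence[float], value: float) -> float:
    """Return the percentage of scores that are less than or equal to value."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_or_below = sum(1 for s in scores if s <= value)
    return 100.0 * at_or_below / len(scores)


def cap_bitrate(requested_kbps: float, cap_kbps: float = 4000.0) -> float:
    """Clamp a requested average segment bitrate to the configured maximum."""
    return min(requested_kbps, cap_kbps)


# Made-up baseline VMAF scores for ten clips (illustrative only).
baseline_vmaf = [56.0, 68.5, 74.0, 81.2, 85.0, 88.3, 90.0, 93.1, 95.4, 97.0]

rank = percentile_rank(baseline_vmaf, 90.0)
print(f"A 90 VMAF target sits at percentile {rank:.0f} of this sample")

# A complex segment asking for more than the cap is held at 4 Mbps,
# while a simple segment keeps its lower bitrate.
print(cap_bitrate(5200.0), cap_bitrate(2500.0))
```

With the cap in place, a content-adaptive encode can only match or undercut the fixed bitrate on average, so any difference shows up as savings rather than as extra bits.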

The resulting bitrates and VMAF values from the two encodes, “THINKode + x264” and “x264 fixed bitrate encode”, are shown in Figure 3 and Figure 4 respectively. For this set of content, THINKode provides 28% bitrate savings over fixed bitrate encoding. Most of the bit savings come from simpler content that scores above 90 VMAF with fixed bitrate encoding, since such content can reach the 90 VMAF target at less than 4 Mbps.
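The article does not state whether the 28% figure is aggregated over total bits or averaged per clip. As a sketch of the former, assuming savings are computed from the summed bitrates of equal-length clips, the calculation looks like this. The per-clip bitrates are made-up values and the function name is hypothetical.

```python
from typing import Sequence


def bitrate_savings_percent(baseline_kbps: Sequence[float],
                            adaptive_kbps: Sequence[float]) -> float:
    """Aggregate savings of the adaptive encode relative to the baseline.

    Assumption: all clips have the same duration, so summing bitrates is
    proportional to summing file sizes.
    """
    if len(baseline_kbps) != len(adaptive_kbps):
        raise ValueError("both encodes must cover the same clips")
    total_baseline = sum(baseline_kbps)
    if total_baseline <= 0:
        raise ValueError("baseline bitrates must sum to a positive value")
    return 100.0 * (1.0 - sum(adaptive_kbps) / total_baseline)


# Made-up example: four clips, fixed 4 Mbps baseline versus adaptive encode.
fixed = [4000.0, 4000.0, 4000.0, 4000.0]
adaptive = [4000.0, 3000.0, 2000.0, 1000.0]

print(f"{bitrate_savings_percent(fixed, adaptive):.1f}% bitrate savings")
```

Because file size is bitrate multiplied by duration, the same percentage carries over directly to storage when clip durations are comparable.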

Figure 3: Content-aware bit savings for UGC using THINKode
Figure 4: VMAF values achieved from the tests

Transcoding of UGC in Handhelds

The phenomenon of UGC sharing has placed IoT / handheld (edge) devices at the center of the video workflow, necessitating efficient video encode and transcode schemes on these devices. Edge devices typically rely on the hardware video codecs available on ARM® based SoCs for video encode and decode. However, this approach has some inefficiencies:

  • Hardware codecs use a single encode recipe for all types of video content, which results in video quality fluctuations.
  • To compensate for quality, bitrates are typically increased, which makes the encode inefficient and pushes up storage costs.

THINKode Lite is a variant of THINKode that targets UGC encoding and is optimized for deployment on battery-powered ARM® SoC-based IoT devices.

Highlights of THINKode Lite

  • Delivers good bitrate savings on handheld devices compared to fixed bitrate encoding.
  • Uses the hardware video codecs available on the SoC (software codecs can optionally be used instead).
  • Has a low processing overhead, which allows THINKode Lite to run on ARM® CPUs with no explicit need for dedicated compute resources like an NPU / GPU.
  • Extensively trained with UGC content of different genres and verified on many leading smartphone models.
  • Available as a file I/O based Transcoder SDK (Figure 5) for ARM® CPUs, offering an easy integration path with social media apps.

Figure 5: THINKode Lite Transcoder for IoT devices

Conclusion

With massive volumes of video content being stored and managed by online video services, cost-efficient storage of video content of diverse complexities has become an absolute necessity. With the recent rise of UGC and IoT-centered content sharing, the use-case matrix for video delivery and consumption has only become more diverse. Hence, the need for efficient encoding and transcoding of UGC cannot be overemphasized.

THINKode is well proven to deliver good bit savings with perceptually consistent, high-quality video and is the solution of choice for leading OTT providers. THINKode and its low-power variant THINKode Lite are ideal options for handheld OEMs and social media app providers looking to achieve good bitrate and cost savings with UGC streaming.

Want to know more about Ittiam THINKode and its variants? Reach us at mkt@ittiam.com and learn why top OTT services, leading handheld brands, and social media app providers choose Ittiam for their products and services. We will be glad to help with further details, including product datasheets and evaluations.