Unravel the advantages of the open source video codec, led by the Alliance for Open Media (AOM)
The Internet streaming media industry is hoping for greater innovation in video encoding with the advent of AV1, a royalty-free video codec currently under standardization by the Alliance for Open Media.
The AV1 advantage
Prior to the formation of AOM, Google, Mozilla and Cisco were focused on developing VP9, Daala and Thor, respectively, as royalty-free video codecs. AOM – a consortium of industry leaders – quickly combined the best features of these three formats to form AV1 – a single royalty-free video codec.
The new video codec brings a distinct advantage in addressing the evolving requirements of the video compression market.
AV1 is a block-based video compression codec similar to contemporary video codecs like VP9, and uses many traditional video coding tools including intra frame prediction, inter frame prediction, transform, quantization, entropy coding and loop filtering. What is exciting is that the improvements to these traditional tools, together with the additional toolsets introduced in AV1, offer significant enhancements over VP9 and promise compression gains of roughly 30-50% in comparison with the incumbent standard.
A Glimpse into Key Tools
As an early member of AOM, Ittiam has been actively working on AV1 to enable usable and optimized real world implementations of the codec, when the standard is ready for deployment. In this blog, we provide you deeper insights into some of the new coding tools in AV1, and their benefits.
The AV1 encoder has consolidated the following extensions to VP9 as part of its default toolset.
1. Intra prediction
In addition to the DC, horizontal and vertical modes in VP9, AV1 adds extended angular intra prediction modes as default tools. With extended intra modes, you can improve angular prediction accuracy by signaling the prediction direction at a finer granularity.
AV1 also gives you an alternative to the true-motion mode in VP9. It consists of smooth mode and the Paeth predictor, which work better on regions with smooth variations in texture.
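As a rough illustration of how a Paeth-style predictor works, the sketch below picks, for each sample, whichever of the left, top and top-left neighbors is closest to the gradient estimate left + top - top_left. The function names and the tie-breaking order (left, then top, then top-left, as in the classic PNG definition) are assumptions made for this example, not a transcription of the AV1 reference code.

```python
def paeth_predict(left: int, top: int, top_left: int) -> int:
    """Return the neighbor closest to the estimate left + top - top_left."""
    base = left + top - top_left
    d_left = abs(base - left)
    d_top = abs(base - top)
    d_top_left = abs(base - top_left)
    # Assumed tie-breaking order: left, then top, then top-left.
    if d_left <= d_top and d_left <= d_top_left:
        return left
    if d_top <= d_top_left:
        return top
    return top_left


def predict_block(top_row, left_col, top_left):
    """Predict a block from the row above, the column to the left and the corner sample."""
    return [[paeth_predict(left_col[r], top_row[c], top_left)
             for c in range(len(top_row))]
            for r in range(len(left_col))]


block = predict_block(top_row=[20, 20], left_col=[10, 10], top_left=10)
```

Because the predictor follows whichever neighbor best matches the local gradient, it tends to behave well on smoothly varying regions.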
2. Inter prediction
AV1 extends the number of reference frames to up to six, which means you can achieve better temporal prediction.
Interpolation filter types for motion compensation
AV1 introduces two filter types in addition to what VP9 offers to enable improved motion compensation. By also allowing different filter types to be signaled for horizontal and vertical interpolation, AV1 offers even greater control.
AV1 supports wedge prediction as an experimental tool to handle scenarios where an oblique line divides a block into two partitions. While one of these partitions needs to be inter coded, the other partition can be intra or inter coded. A codebook defined in the standard is used to signal the demarcation of the line separating the partition.
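The idea of combining two predictions along an oblique line can be sketched as follows. This is a simplified illustration only: the line is described here by hypothetical coefficients (a, b, c) and the mask is binary, whereas the codec selects its masks from the codebook defined in the standard and may blend softly near the boundary.

```python
def wedge_mask(size, a, b, c):
    """Binary mask: 1 where a*x + b*y + c >= 0, else 0 (hypothetical line parameters)."""
    return [[1 if a * x + b * y + c >= 0 else 0 for x in range(size)]
            for y in range(size)]


def wedge_blend(pred_a, pred_b, mask):
    """Take pred_a where the mask is 1 and pred_b where it is 0."""
    return [[pa if m else pb for pa, pb, m in zip(row_a, row_b, row_m)]
            for row_a, row_b, row_m in zip(pred_a, pred_b, mask)]


size = 4
mask = wedge_mask(size, a=1, b=-1, c=0)      # diagonal split of the block
pred_a = [[100] * size for _ in range(size)]  # e.g. an inter prediction
pred_b = [[50] * size for _ in range(size)]   # e.g. an intra or second inter prediction
blended = wedge_blend(pred_a, pred_b, mask)
```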
Overlapped Block Motion Compensation (OBMC) allows the predicted value for a given pixel in a current block to be derived as a weighted average of multiple motion compensations done using the motion vectors of the current and neighboring blocks. You can therefore reduce blocking artifacts and enable the selection of larger motion partitions.
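A minimal sketch of the weighted averaging behind OBMC is shown below for a single row of pixels next to a block edge. The weights (in units of 1/64, strongest at the edge and fading toward the block interior) are illustrative assumptions, not the values used by AV1.

```python
def obmc_blend_row(cur_pred, nbr_pred, nbr_weights):
    """Blend two predictions per pixel; weights are in 1/64 units with rounding."""
    return [(w * n + (64 - w) * c + 32) >> 6
            for c, n, w in zip(cur_pred, nbr_pred, nbr_weights)]


cur_pred = [100, 100, 100, 100]   # prediction using the current block's motion vector
nbr_pred = [60, 60, 60, 60]       # prediction using the neighboring block's motion vector
weights = [32, 16, 8, 0]          # assumed ramp: neighbor matters most at the shared edge
smoothed = obmc_blend_row(cur_pred, nbr_pred, weights)
```

The gradual ramp is what softens the discontinuity at the block edge instead of leaving a hard step between the two motion-compensated regions.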
3. Transform tools
In addition to DCT and ADST, AV1 introduces two other transforms, namely flipped ADST and the identity transform, as extended transform types. With the identity transform, you can effectively code residual blocks with edges and lines. Because each of the four one-dimensional types can be chosen independently for the horizontal and vertical directions, AV1 offers you a total of sixteen horizontal and vertical transform type combinations.
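The count of sixteen follows from pairing each of the four one-dimensional transform types with each of the four in the other direction. A short sketch, using illustrative labels for the four types:

```python
from itertools import product

# Illustrative labels for the four 1-D transform types named in the text.
ONE_D_TRANSFORMS = ("DCT", "ADST", "FLIPADST", "IDENTITY")

# Every (vertical, horizontal) pairing of the 1-D types.
transform_pairs = list(product(ONE_D_TRANSFORMS, repeat=2))

for vertical, horizontal in transform_pairs:
    print(f"vertical={vertical:<9} horizontal={horizontal}")
```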
By default, AV1 supports recursive tree based partitioning in inter blocks and rectangular transform.
4. In-loop filters
When it comes to loop filtering algorithms in video encoding, the primary goal is to reduce artifacts introduced by block-based processing operations and quantization, and avoid their propagation through motion compensation. By adopting AV1, you also enjoy the advantage of directional de-ringing filter and constrained low pass filter, in addition to the default de-blocking filter.
Directional de-ringing filter
AV1 inherits this filter from the Daala codec. The de-ringing filter helps you suppress ringing artifacts caused by quantization of transform coefficients. It also conditionally replaces pixels affected by ringing artifacts.
Constrained Low Pass Filter (CLPF)
AV1 inherits this filter from the Thor codec. By using CLPF, you can correct the artifacts introduced by quantization errors and interpolation filters.
5. Miscellaneous tools
Global motion is a default tool that supports multiple motion models such as translation, rotation, zoom, affine, and perspective. Because a single set of parameters can describe camera motion for the whole frame, you can reduce the motion vector signaling overheads in the presence of camera motion.
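To show how one small parameter set can describe motion for every pixel, the sketch below expresses translation and rotation-plus-zoom as 3×3 homogeneous matrices and applies them to a coordinate. The function names and the floating-point parameterization are assumptions for illustration; a real codec would signal fixed-point parameters.

```python
import math


def translation(tx, ty):
    """Pure translation model (2 parameters)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rot_zoom(angle_rad, zoom, tx=0.0, ty=0.0):
    """Rotation + zoom + translation model (4 parameters)."""
    c = zoom * math.cos(angle_rad)
    s = zoom * math.sin(angle_rad)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]


def warp_point(m, x, y):
    """Map (x, y) through a 3x3 model; the last row allows perspective models too."""
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return xh / w, yh / w


shifted = warp_point(translation(3, -2), 10, 10)
zoomed = warp_point(rot_zoom(0.0, 2.0), 1, 1)
rotated = warp_point(rot_zoom(math.pi / 2, 1.0), 1, 0)
```

An affine model would fill the top two rows with six free parameters, and a perspective model would also use the bottom row.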
Tile and extended tiles
Unlike the VP9 encoder, AV1 supports configurable prediction dependency between row tiles in a tile column. Since the tile size is also configurable, you can effectively improve the multi-threading performance of the encoder.
AV1 provides the ability to change the quantization step size on a super-block basis, so that the QP can be adapted at the super-block level.
Non-binary arithmetic coding
In addition to the basic coding tools, AV1 offers more than 30 tools as experimental features.
Take a look at some of these tools and their intended objectives.
1. In-loop filtering
AV1 includes an experimental loop restoration filter to remove blur artifacts caused by block-based processing. Currently, it supports the optional Wiener restoration filter and self-guided restoration filter.
2. Transform tools
AV1 supports super transform and 64×64 transform experimentally. Super transform allows a single transform across multiple motion partitions in inter blocks, and 64×64 transform is efficient in coding larger blocks.
3. Partition types
By default, AV1 supports recursive partitioning of 64×64 super-blocks. Experimentally, it also supports a super-block size of 128×128. Larger partition sizes are useful in coding ultra-high definition videos.
In addition to the four default partition types, four more partition types are included as experimental types, as shown in Figure 1.
Figure 1: Picture partition types in AV1
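Recursive partitioning of a super-block can be sketched as a simple quadtree walk. This example models only the square four-way split; the other partition shapes are not covered, and the `should_split` callback and the minimum block size of 8 are hypothetical stand-ins for the encoder's actual decision logic.

```python
def partition(x, y, size, should_split, min_size=8):
    """Return a list of (x, y, size) leaf blocks for a square region."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        blocks = []
        for dy in (0, half):
            for dx in (0, half):
                blocks.extend(partition(x + dx, y + dy, half, should_split, min_size))
        return blocks
    return [(x, y, size)]


# Keep a flat 64x64 super-block whole.
flat = partition(0, 0, 64, lambda x, y, s: False)

# Split once at the top level, then keep the four 32x32 blocks.
split_once = partition(0, 0, 64, lambda x, y, s: s == 64)
```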
4. Intra prediction
AV1 adds filter intra as an experimental intra tool, which introduces a new 4-tap filtering technique based on four causal neighbor samples within the block.
5. Quantization
AV1 supports quantization matrix and Perceptual Vector Quantization (PVQ) experimentally.
Based on gain-shape vector quantization, PVQ codes scalar gain (magnitude) and shape vector of a band of transform coefficients separately. It improves perceptual quality through better preservation of contrast, more accurate representation of coefficients, and easy adaptation of quantization to contrast based masking.
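The gain-shape split itself is easy to illustrate: the gain is the magnitude (L2 norm) of a band of coefficients and the shape is the unit-length direction. The sketch below shows only that separation and its inverse; it deliberately omits the quantization of the gain and of the shape, and the function names are made up for the example.

```python
import math


def gain_shape_split(coeffs):
    """Split a band of coefficients into a scalar gain and a unit-norm shape vector."""
    gain = math.sqrt(sum(c * c for c in coeffs))
    if gain == 0:
        return 0.0, [0.0] * len(coeffs)
    return gain, [c / gain for c in coeffs]


def gain_shape_merge(gain, shape):
    """Rebuild the coefficients from the gain and the shape."""
    return [gain * s for s in shape]


gain, shape = gain_shape_split([3, 4])
rebuilt = gain_shape_merge(gain, shape)
```

Coding the gain separately is what lets the energy (contrast) of a band be preserved even when the shape is represented coarsely.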
6. Chroma from Luma
Chroma from Luma (CfL) is a tool that exploits the correlation between the luma and chroma planes. CfL predicts chroma samples from the previously encoded luma plane.
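One simple way to picture this is a linear model in which the chroma prediction is a DC value plus a scaled copy of the luma variation around its mean. The sketch below assumes that model, with a hypothetical scaling factor `alpha`, floating-point arithmetic, and luma already at chroma resolution; it is not meant to reproduce the exact AV1 procedure.

```python
def cfl_predict(recon_luma, alpha, chroma_dc):
    """Predict chroma as chroma_dc + alpha * (luma - mean luma), sample by sample."""
    flat = [v for row in recon_luma for v in row]
    luma_mean = sum(flat) / len(flat)
    return [[chroma_dc + alpha * (v - luma_mean) for v in row]
            for row in recon_luma]


recon_luma = [[10, 20], [30, 40]]   # previously reconstructed luma block
chroma_pred = cfl_predict(recon_luma, alpha=0.5, chroma_dc=128)
```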
Here is a consolidated view of AV1 tools and their key benefits.
|Category|Tool|Key benefit|
|---|---|---|
|In-loop Filtering|Directional De-ringing Filter|Remove ringing artifacts due to transform and quantization|
|In-loop Filtering|Constrained Low Pass Filter|Remove artifacts introduced by quantization and interpolation filter|
|In-loop Filtering|Loop Restoration Filter|Remove blur artifacts due to block processing|
|Transform|Extended Transform|Improve energy compaction efficiency with 16 transform pairs|
|Transform|Recursive Transform|Flexibility in compacting regions with high energy|
|Transform|Super Transform|Single transform for multiple prediction blocks. Less overheads to signal transform size for multiple blocks|
|Transform|Rectangular Transform|Effective in transforming non-square regions|
|Picture Partition|Up to 128×128 Partition Size|Effective for coding homogeneous regions in UHD|
|Picture Partition|8 Partition types + Wedge Prediction|More flexibility to partition close to underlying texture and motion|
|Inter Prediction|Overlapped Block Motion Compensation|Reduce discontinuities in block edges|
|Inter Prediction|Extended Reference Frames|Improve temporal prediction and reduce intra coding need|
|Inter Prediction|Extended Interpolation Filter types|Adapt interpolation to local content properties|
|Intra Prediction|Extended Intra Prediction, Filter Intra|Improve angular prediction accuracy|
|Intra Prediction|Smooth Mode|Better prediction on regions with smooth variations in textures|
|Quantization|Perceptual Vector Quantization|Improve perceptual video quality|
|Others|Global Motion Parameter|Reduce motion vector signaling overheads|
|Others|Non-Binary Arithmetic Coding|Improve symbol coding throughput|
|Others|Extended Tiles|Facilitate better multi-threading to improve parallelism|