One-for-All: an Efficient Variable Convolution Neural Network for In-Loop Filter of VVC
IEEE Transactions on Circuits and Systems for Video Technology (2021)
Peking University
Abstract
Recently, many convolutional neural network (CNN) based in-loop filters have been proposed to improve coding efficiency. However, most existing CNN-based filters train and deploy multiple networks for different quantization parameters (QPs) and frame types (FTs), which drastically increases the training cost and the memory burden on the video codec. In this paper, we propose a novel variable CNN (VCNN) based in-loop filter for VVC, which can effectively handle compressed videos with different QPs and FTs via a single model. Specifically, an efficient and flexible attention module is developed to recalibrate features according to the QP or FT. We then embed the module into the residual block so that these informative features can be continuously utilized in the residual learning process. To minimize information loss in the learning process of the entire network, we employ a residual feature aggregation (RFA) module for more efficient feature extraction. On this basis, an efficient network architecture, VCNN, is designed that not only effectively reduces compression artifacts but also adapts to various QPs and FTs. To address the training-data imbalance across QPs and FTs and to improve the robustness of the model, a focal mean squared error loss function is employed to train the proposed network. We then integrate VCNN into VVC as an additional in-loop filtering tool placed after the deblocking filter. Extensive experimental results show that our VCNN approach achieves on average 3.63%, 4.36%, 4.23%, and 3.56% BD-rate reduction under the all-intra, low-delay P, low-delay B, and random access configurations, respectively, which even outperforms QP-separate models.
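The abstract describes the design only at a high level. As a rough illustration, a minimal PyTorch sketch of two of the central ideas, a QP/FT-conditioned attention module embedded in a residual block and a focal mean squared error loss, could look as follows. All names (QPAttention, AttentionResBlock, focal_mse), layer widths, the QP normalization, and the exact focal weighting scheme are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QPAttention(nn.Module):
    """Hypothetical attention module: recalibrates feature channels
    from a scalar side input (a normalized QP or a frame-type code)."""
    def __init__(self, channels, hidden=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, channels), nn.Sigmoid())

    def forward(self, x, qp):
        # qp: (N, 1) tensor, e.g. QP / 63.0 (VVC QP range is 0..63)
        w = self.mlp(qp).unsqueeze(-1).unsqueeze(-1)  # (N, C, 1, 1)
        return x * w  # channel-wise recalibration

class AttentionResBlock(nn.Module):
    """Residual block with the attention module embedded, so the
    QP/FT-conditioned features flow through residual learning."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.attn = QPAttention(channels)

    def forward(self, x, qp):
        y = self.conv2(F.relu(self.conv1(x)))
        return x + self.attn(y, qp)

def focal_mse(pred, target, gamma=1.0):
    """Assumed focal MSE: up-weights pixels with large reconstruction
    error, mitigating imbalance across QPs/FTs in the training data."""
    err = (pred - target) ** 2
    weight = (err.detach() / (err.detach().mean() + 1e-8)) ** gamma
    return (weight * err).mean()

if __name__ == "__main__":
    block = AttentionResBlock(channels=64)
    feats = torch.randn(2, 64, 32, 32)           # feature maps
    qp = torch.tensor([[32.0], [42.0]]) / 63.0   # normalized QPs
    out = block(feats, qp)                       # same shape as feats
    loss = focal_mse(out, torch.randn_like(out))
```

In this sketch the scalar QP (or, analogously, a frame-type code) is mapped to per-channel weights, so a single set of convolution weights can serve all QPs and FTs, which is the "one-for-all" property the title refers to.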
Key words
Encoding, Videos, Feature extraction, Convolution, Adaptation models, Visualization, Training, Variable, in-loop filter, attention, versatile video coding (VVC)