Journal Article

Multi-Modal Learning-Based Blind Video Quality Assessment Metric for Synthesized Views
Document Type
Periodical
Source
IEEE Transactions on Broadcasting, 70(1):208-222, Mar. 2024
Subject
Communication, Networking and Broadcast Technologies
Measurement
Distortion
Feature extraction
Quality assessment
Video recording
Visualization
Convolutional neural networks
Synthesized video quality assessment
no-reference
multi-modal learning
sparse dictionary
ISSN
0018-9316
1557-9611
Abstract
The quality degradation of synthesized video directly affects the widespread adoption of immersive video, so it is crucial to design a quality assessment model that can determine whether a synthesized video meets the requirements of commercial broadcasting. However, designing a general-purpose no-reference quality assessment metric for synthesized videos is difficult due to imperfect view synthesis technology and scene diversity. Existing quality assessment algorithms for synthesized views are mostly based on handcrafted feature extraction. Inspired by the theory that input stimuli are hierarchically and sparsely processed in the cerebral cortex, we combine Convolutional Neural Network (CNN) learning and sparse dictionary learning mechanisms and propose a Multi-Modal Learning based Blind Synthesized Video Quality Assessment (MML-BSVQA) metric. First, to better capture spatio-temporal distortions, we convert the synthesized video into the Spatial Domain (SD), Vertical Temporal Domain (VTD) and Horizontal Temporal Domain (HTD) using a video decomposition operation combined with optical flow estimation. Second, we extract deep semantic features from the three domains with a pre-trained CNN model. Third, we represent the sparse features of the three domains using separately trained over-complete sparse dictionaries. Note that both the CNN model and the sparse dictionaries are trained on natural videos to preserve the general-purpose nature of the proposed MML-BSVQA metric. Finally, the score of a synthesized video is generated by weighted regression. Experimental results on three synthesized video databases demonstrate that the proposed metric outperforms classic and state-of-the-art quality assessment metrics.
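Two of the steps the abstract names, the three-domain video decomposition and the over-complete sparse representation, can be sketched in plain NumPy. This is an illustrative sketch, not the authors' implementation: the function names `decompose_video` and `omp`, the `(T, H, W)` tensor layout, and the greedy Orthogonal Matching Pursuit solver are all assumptions, and the optical flow estimation, pre-trained CNN, dictionary training, and weighted regression stages are omitted.

```python
import numpy as np

def decompose_video(video):
    """Slice a grayscale video tensor of shape (T, H, W) into the three
    domains named in the abstract (assumed layout, for illustration):
      SD  -> T spatial slices of shape (H, W)    (the original frames)
      VTD -> W vertical temporal slices (T, H)   (fix a column, vary time)
      HTD -> H horizontal temporal slices (T, W) (fix a row, vary time)
    """
    sd = video                       # (T, H, W)
    vtd = video.transpose(2, 0, 1)   # (W, T, H)
    htd = video.transpose(1, 0, 2)   # (H, T, W)
    return sd, vtd, htd

def omp(D, x, k):
    """Greedy Orthogonal Matching Pursuit: approximate signal x with at
    most k atoms of an over-complete dictionary D (n x m, columns
    assumed unit-norm). Returns a sparse coefficient vector of length m."""
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit x on the selected atoms by least squares
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    code = np.zeros(D.shape[1])
    code[support] = coef
    return code
```

In the paper's pipeline, the pre-trained CNN would run over each domain's slices and the resulting features, rather than raw pixels, would be sparse-coded against the corresponding trained dictionary; here the dictionary and signal in any usage would be toy placeholders.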