학술논문

Blind Image Quality Index With Cross-Domain Interaction and Cross-Scale Integration
Document Type
Periodical
Source
IEEE Transactions on Multimedia IEEE Trans. Multimedia Multimedia, IEEE Transactions on. 26:2729-2739 2024
Subject
Components, Circuits, Devices and Systems
Communication, Networking and Broadcast Technologies
Computing and Processing
General Topics for Engineers
Measurement
Feature extraction
Distortion
Image quality
Transformers
Databases
Indexes
Image quality assessment
deep learning
cross-domain interaction
cross-scale integration
Language
ISSN
1520-9210
1941-0077
Abstract
With the assistance of Convolutional Neural Networks (CNNs), Image Quality Assessment (IQA) models have made great progress in evaluating both simulated distortion and authentic distortion. However, most of the existing IQA models only learn the features of distorted images, and thus do not make full use of the available feature representation of other domains. Furthermore, the common multi-scale fusion strategies are relatively simple, such as downsampling and concatenating, which further limits the prediction performance. To this end, we propose a novel blind image quality index with cross-domain interaction and cross-scale integration, which is designed based on the combination of CNN and Transformer. First, the hierarchical spatial-domain and gradient-domain representations are obtained through a typical CNN architecture. Then, based on the proposed gradient-query cross-attention, these two types of features are fully interacted in the Cross-Domain Interaction (CDI) module. To represent the distortion information more comprehensively, the Cross-Scale Integration (CSI) module is proposed to combine the information between different scales progressively. Finally, the quality score is obtained through a simple regression module. The experimental results on five public IQA databases of both simulated and authentic scenes show that the proposed model outperforms the compared state-of-the-art metrics. In addition, cross-database experiments show that the proposed model has strong generalization performance.