1. College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
2. East China Sea Survey Center, State Oceanic Administration, Shanghai 200137, China
[ "宋巍(1977—),女,博士,教授,2012年于昆士兰科技大学获得博士学位,主要从事计算机视觉、海洋大数据方面的研究。E-mail:wsong@shou.edu.cn" ]
SONG Wei, LI Jia-jin, LIU Xiao-chen, et al. No-reference image quality assessment based on feature tokenizer and Transformer[J]. Chinese Journal of Liquid Crystals and Displays, 2023,38(3):356-367. DOI: 10.37188/CJLCD.2022-0220.
Deep-learning-based no-reference image quality assessment (IQA) methods currently suffer from insufficient semantic relevance or demanding training requirements. To address this, this paper proposes a no-reference IQA method based on semantic feature tokenization and a Transformer (VTT-IQA). First, a deep convolutional neural network extracts high-level semantic features from the image, and these features are mapped to visual feature tokens. The relationships among the tokens are then modeled with the Transformer self-attention mechanism to extract global image features. In parallel, a shallow neural network extracts low-level local features that capture low-level distortion information. Finally, the global and local information are combined to predict image quality. To verify the accuracy and robustness of the model, we compared it with 15 traditional and deep-learning-based no-reference IQA methods on five mainstream IQA datasets and one underwater IQA dataset, using the correlation coefficients PLCC and SROCC as evaluation metrics. The results show that the proposed method achieves superior performance on all datasets with a small parameter footprint (about 1.56 MB). In particular, it raises SROCC to 0.958 on the multiply-distorted LIVE-MD dataset, demonstrating that image quality can still be assessed accurately under complex distortions and that the network is suitable for practical applications.
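For illustration, the following is a minimal PyTorch sketch of the pipeline described above, not the authors' released implementation: the ResNet-50 backbone, the attention-based tokenizer, and all widths (16 tokens, 256-dimensional embeddings, 2 encoder layers) are assumptions made for demonstration.

import torch
import torch.nn as nn
from torchvision.models import resnet50

class VTTIQASketch(nn.Module):
    def __init__(self, num_tokens=16, dim=256, heads=4, depth=2):
        super().__init__()
        # Deep CNN branch: high-level semantic feature maps.
        # weights=None keeps the sketch self-contained; load pretrained
        # ImageNet weights in practice.
        backbone = resnet50(weights=None)
        self.deep = nn.Sequential(*list(backbone.children())[:-2])  # -> 2048 ch
        # Feature tokenizer: project each spatial position to `dim`, then pool
        # positions into `num_tokens` visual feature tokens via learned attention.
        self.proj = nn.Conv2d(2048, dim, kernel_size=1)
        self.token_attn = nn.Conv2d(dim, num_tokens, kernel_size=1)
        # Transformer encoder models relations among the tokens (global features).
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        # Shallow CNN branch: low-level local distortion features.
        self.shallow = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Fuse global and local descriptors, regress a single quality score.
        self.head = nn.Sequential(nn.Linear(2 * dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):
        f = self.proj(self.deep(x))                    # (B, dim, h, w)
        a = self.token_attn(f).flatten(2).softmax(-1)  # (B, T, h*w) attention
        tokens = a @ f.flatten(2).transpose(1, 2)      # (B, T, dim) feature tokens
        g = self.transformer(tokens).mean(dim=1)       # global descriptor
        l = self.shallow(x).flatten(1)                 # local descriptor
        return self.head(torch.cat([g, l], dim=1)).squeeze(-1)

scores = VTTIQASketch()(torch.randn(2, 3, 224, 224))   # -> tensor of shape (2,)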
image quality; no-reference image quality assessment; Transformer; self-attention; feature tokens
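PLCC (Pearson linear correlation coefficient) and SROCC (Spearman rank-order correlation coefficient), the two metrics named in the abstract, measure prediction accuracy and prediction monotonicity against subjective scores. A minimal illustration with SciPy; the score values below are made up for demonstration only.

import numpy as np
from scipy.stats import pearsonr, spearmanr

mos = np.array([62.3, 45.1, 78.9, 30.4, 55.0])    # subjective scores (illustrative)
pred = np.array([60.1, 48.7, 75.2, 33.8, 52.9])   # model predictions (illustrative)

plcc, _ = pearsonr(pred, mos)    # linear correlation: prediction accuracy
srocc, _ = spearmanr(pred, mos)  # rank correlation: prediction monotonicity
print(f"PLCC={plcc:.3f}, SROCC={srocc:.3f}")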