1. College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
2. East China Sea Survey Center, State Oceanic Administration, Shanghai 200137, China
[ "宋巍(1977—),女,博士,教授,2012年于昆士兰科技大学获得博士学位,主要从事计算机视觉、海洋大数据方面的研究。E-mail:wsong@shou.edu.cn" ]
SONG Wei, LI Jia-jin, LIU Xiao-chen, et al. No-reference image quality assessment based on feature tokenizer and Transformer[J]. Chinese Journal of Liquid Crystals and Displays, 2023,38(3):356-367. DOI: 10.37188/CJLCD.2022-0220.
Deep-learning-based no-reference image quality assessment (IQA) methods currently suffer from insufficient semantic relevance or demanding training requirements. To address this, this paper proposes a no-reference IQA method based on semantic feature tokenization and a Transformer (VTT-IQA). First, a deep convolutional neural network extracts high-level semantic features from the image, and these features are mapped to visual feature tokens. The relationships among the tokens are then modeled with the Transformer self-attention mechanism to extract global image features. In parallel, a shallow neural network extracts low-level local features that capture low-level distortion information. Finally, the global and local information are combined to predict image quality. To verify the accuracy and robustness of the model, we compared it with 15 traditional and deep-learning-based no-reference IQA methods on five mainstream IQA datasets and one underwater IQA dataset, using the correlation coefficients PLCC and SROCC as evaluation metrics. The results show that the proposed method achieves superior performance on all datasets with a small parameter footprint (about 1.56 MB). In particular, it raises SROCC to 0.958 on the multiply-distorted LIVE-MD dataset, demonstrating that image quality can still be assessed accurately under complex distortions and that the network is suitable for practical applications.
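For illustration, the following is a minimal PyTorch sketch of the pipeline described above, not the authors' released implementation: the ResNet-50 backbone, the attention-based tokenizer, and all widths (16 tokens, 256-dimensional embeddings, 2 encoder layers) are assumptions made for demonstration.

import torch
import torch.nn as nn
from torchvision.models import resnet50

class VTTIQASketch(nn.Module):
    def __init__(self, num_tokens=16, dim=256, heads=4, depth=2):
        super().__init__()
        # Deep CNN branch: high-level semantic feature maps.
        # weights=None keeps the sketch self-contained; load pretrained
        # ImageNet weights in practice.
        backbone = resnet50(weights=None)
        self.deep = nn.Sequential(*list(backbone.children())[:-2])  # -> 2048 ch
        # Feature tokenizer: project each spatial position to `dim`, then pool
        # positions into `num_tokens` visual feature tokens via learned attention.
        self.proj = nn.Conv2d(2048, dim, kernel_size=1)
        self.token_attn = nn.Conv2d(dim, num_tokens, kernel_size=1)
        # Transformer encoder models relations among the tokens (global features).
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        # Shallow CNN branch: low-level local distortion features.
        self.shallow = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Fuse global and local descriptors, regress a single quality score.
        self.head = nn.Sequential(nn.Linear(2 * dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):
        f = self.proj(self.deep(x))                    # (B, dim, h, w)
        a = self.token_attn(f).flatten(2).softmax(-1)  # (B, T, h*w) attention
        tokens = a @ f.flatten(2).transpose(1, 2)      # (B, T, dim) feature tokens
        g = self.transformer(tokens).mean(dim=1)       # global descriptor
        l = self.shallow(x).flatten(1)                 # local descriptor
        return self.head(torch.cat([g, l], dim=1)).squeeze(-1)

scores = VTTIQASketch()(torch.randn(2, 3, 224, 224))   # -> tensor of shape (2,)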
image quality; no-reference image quality assessment; Transformer; self-attention; feature tokens
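PLCC (Pearson linear correlation coefficient) and SROCC (Spearman rank-order correlation coefficient), the two metrics named in the abstract, measure prediction accuracy and prediction monotonicity against subjective scores. A minimal illustration with SciPy; the score values below are made up for demonstration only.

import numpy as np
from scipy.stats import pearsonr, spearmanr

mos = np.array([62.3, 45.1, 78.9, 30.4, 55.0])    # subjective scores (illustrative)
pred = np.array([60.1, 48.7, 75.2, 33.8, 52.9])   # model predictions (illustrative)

plcc, _ = pearsonr(pred, mos)    # linear correlation: prediction accuracy
srocc, _ = spearmanr(pred, mos)  # rank correlation: prediction monotonicity
print(f"PLCC={plcc:.3f}, SROCC={srocc:.3f}")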