1. School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, Shaanxi 710021
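ZHAO Xiao (born 1978), female, from Lintong, Shaanxi, Ph.D., associate professor. She received her Ph.D. from Shaanxi University of Science and Technology in 2019, and her research focuses on digital image processing and pattern recognition. E-mail: zhaox@sust.edu.cn
YANG Chen (born 1997), male, from Xi'an, Shaanxi, master's student. He received his bachelor's degree from Shaanxi University of Science and Technology in 2021, and his research focuses on image classification and facial expression recognition. E-mail: 2427135900@qq.com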
ZHAO Xiao, YANG Chen, WANG Ruo-nan, et al. Facial expression recognition based on attention mechanism ResNet lightweight network[J]. Chinese Journal of Liquid Crystals and Displays, 2023,38(11):1503-1510. DOI: 10.37188/CJLCD.2023-0046.
To address the large model size and low accuracy of ResNet18 in facial expression recognition, this paper proposes a lightweight ResNet based on a multi-scale CBAM (Convolutional Block Attention Module) attention mechanism, named MCLResNet (Multi-Scale CBAM Lightweight ResNet), which recognizes facial expressions with fewer parameters and higher accuracy. First, ResNet18 is used as the backbone network for feature extraction, and group convolution is introduced to reduce its parameter count; an inverted residual structure increases the network depth and improves image feature extraction. Second, the shared fully connected layer in the CBAM channel attention module is replaced with a 1×3 convolution module, which reduces the loss of channel information, and a multi-scale convolution module is added to the CBAM spatial attention module to capture spatial feature information at different scales. Finally, the resulting multi-scale CBAM module (MSCBAM) is added to the lightweight ResNet model to strengthen its feature representation, and an additional fully connected layer is appended at the output of the MSCBAM-equipped network to increase the nonlinearity of the output representation. Experiments on the FER2013 and CK+ datasets show that the proposed model has 82.58% fewer parameters than ResNet18 while achieving good recognition accuracy.
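The parameter saving from group convolution mentioned in the abstract can be illustrated with a small calculation. The sketch below is not the authors' code: the helper name `conv2d_params`, the 64-channel 3×3 layer, and the group count of 4 are assumptions chosen for illustration, since this page does not give the network's layer configuration. It shows the per-layer effect only and does not reproduce the 82.58% whole-network figure.

```python
def conv2d_params(c_in: int, c_out: int, kernel: int,
                  groups: int = 1, bias: bool = False) -> int:
    """Count learnable parameters in a 2-D convolution layer.

    With `groups` > 1, each output channel only sees c_in / groups
    input channels, so the weight count shrinks by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    weights = (c_in // groups) * c_out * kernel * kernel
    return weights + (c_out if bias else 0)


# Hypothetical layer: 64 -> 64 channels, 3x3 kernel, no bias.
standard = conv2d_params(64, 64, 3)            # ordinary convolution
grouped = conv2d_params(64, 64, 3, groups=4)   # same layer split into 4 groups
reduction = 1 - grouped / standard

print(standard, grouped, f"{reduction:.2%}")   # prints: 36864 9216 75.00%
```

For a fixed layer shape, the weight count falls in proportion to the number of groups, which is why grouping is a common way to slim a ResNet-style backbone.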
lightweight ResNet network; multi-scale spatial feature fusion; facial expression recognition; attention mechanism