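1. School of Software, Liaoning Technical University, Huludao 125105, Liaoning, China
2. College of Mining, Liaoning Technical University, Fuxin 123000, Liaoning, China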
Received: 06 February 2023
Revised: 13 March 2023
Published: 05 December 2023
LI Jian-dong, LI Jia-qi, QU Hai-cheng. Attention and cross-scale fusion for vehicle and pedestrian detection[J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(12): 1707-1716. DOI: 10.37188/CJLCD.2023-0037.
Targets in road traffic scenes appear in complex environments, so detection models tend to extract key features insufficiently and to localize targets inaccurately. Taking the SSD model as the base framework, this paper studies feature extraction, key-information enhancement, and non-local feature localization. First, to handle the multi-scale nature of targets in road traffic scenes, a skip-style reverse feature pyramid structure is proposed to generate more discriminative features. Second, because information at different semantic levels contributes unequally to feature fusion, an attention-based adaptive feature fusion module is designed that strengthens the expression of key features at the channel level without relying on prior knowledge. Finally, a criss-cross attention module is introduced to improve the model's sensitivity to target position. Experimental results show that, compared with the original SSD model and while maintaining real-time performance, the proposed method improves average precision by 2.6% on a PASCAL VOC sub-dataset and by 3.9% on a self-built road traffic dataset. Overall, the improved algorithm is broadly applicable to road vehicle and pedestrian detection tasks.
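The idea of fusing features with learned, per-channel weights rather than fixed ones can be illustrated with a minimal sketch. The abstract does not specify the module's internals, so everything below is an assumption made for illustration: the function names, the single-layer sigmoid gate, and the convex blend `g * shallow + (1 - g) * deep` are hypothetical and are not the authors' implementation. Feature maps are plain nested lists of shape C x H x W.

```python
import math


def global_avg_pool(feat):
    """Squeeze a C x H x W feature map (nested lists) to one mean per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def channel_gate(desc, weights, bias):
    """Hypothetical one-layer gate: sigmoid(W @ desc + b), one value per channel."""
    gate = []
    for w_row, b in zip(weights, bias):
        z = sum(w * d for w, d in zip(w_row, desc)) + b
        gate.append(1.0 / (1.0 + math.exp(-z)))
    return gate


def adaptive_fuse(shallow, deep, weights, bias):
    """Blend two same-shaped feature maps with per-channel weights g and 1 - g.

    The gate is computed from the element-wise sum of both inputs, so the
    blend ratio depends on the data instead of being fixed in advance.
    """
    summed = [[[a + b for a, b in zip(ra, rb)] for ra, rb in zip(ca, cb)]
              for ca, cb in zip(shallow, deep)]
    gate = channel_gate(global_avg_pool(summed), weights, bias)
    fused = [[[g * a + (1.0 - g) * b for a, b in zip(ra, rb)]
              for ra, rb in zip(ca, cb)]
             for g, ca, cb in zip(gate, shallow, deep)]
    return fused, gate


if __name__ == "__main__":
    # Two 2-channel, 2 x 2 feature maps; zero weights give a neutral gate of 0.5.
    shallow = [[[1, 2], [3, 4]], [[0, 0], [0, 0]]]
    deep = [[[3, 2], [1, 0]], [[2, 2], [2, 2]]]
    fused, gate = adaptive_fuse(shallow, deep, [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
    print(gate)
    print(fused)
```

In a trained network the gate parameters would be learned, so channels carrying more useful information from one input would receive a larger share of the blend. With all-zero parameters, as in the usage example, the gate is neutral and the fusion reduces to a plain average of the two inputs.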