1.西安建筑科技大学 信息与控制工程学院, 陕西 西安 710055
2.中国社会科学院 考古研究所, 北京 100101
3.陕西省文物保护研究院, 陕西 西安 710075
LI Jian-yu (1999—), female, born in Weinan, Shaanxi. M.S. candidate; received her B.S. degree from Xi'an University of Architecture and Technology in 2020. Her research focuses on machine vision and image processing. E-mail: ljy09@foxmail.com
WANG Hui-qin (1970—), female, born in Changzhi, Shanxi. Ph.D., professor; received her Ph.D. from Xi'an Jiaotong University in 2000. Her research interests include intelligent information processing, information theory and applications, information technology and management, and digital architecture. E-mail: hqwang@xauat.edu.cn
李健昱, 王慧琴, 刘瑞, 等. 复杂纹理背景下的密集骨签文字检测算法[J]. 液晶与显示, 2023,38(9):1293-1303. DOI: 10.37188/CJLCD.2022-0393.
LI Jian-yu, WANG Hui-qin, LIU Rui, et al. Dense bone stick text detection algorithm in complex texture background[J]. Chinese Journal of Liquid Crystals and Displays, 2023,38(9):1293-1303. DOI: 10.37188/CJLCD.2022-0393.
骨签是记载西汉时期地方工官向中央上缴产品的重要文物,准确检测其文字内容具有重要意义。针对复杂纹理背景下骨签文字特征难以提取及文字密集、粘连导致检测框冗余的问题,提出融合自注意力卷积和改进损失函数的骨签文字检测算法。首先,在YOLOv5特征提取端加入自注意力卷积模块,增强网络对骨签文字特征的注意,同时使模型捕捉更丰富的全局信息,抑制裂痕对文字特征提取的干扰。其次,使用Focal-EIOU损失函数替换原网络的CIOU进行优化,Focal-EIOU使用宽高损失降低预测框与真实框的宽高差距,剔除大于真实框的预测框,解决文字密集与粘连产生的检测框冗余问题,进而提高模型精准预测能力。实验结果表明,本文算法的平均精确率达到93.35%,相比YOLOv5提高了3.08%,对于复杂纹理背景下的密集粘连骨签文字检测任务更为适用。
Bone sticks are important cultural relics that record the products handed over by local government workshops to the central government in the Western Han Dynasty, so accurately detecting the text they carry is of great significance. To address the difficulty of extracting bone stick text features against complex texture backgrounds, and the redundant detection boxes caused by dense and adhesive characters, a bone stick text detection algorithm combining self-attention convolution with an improved loss function is proposed. First, a self-attention convolution module is added to the feature extraction stage of YOLOv5 to strengthen the network's attention to bone stick text features, enable the model to capture richer global information, and suppress the interference of cracks with feature extraction. Second, the Focal-EIOU loss function replaces the CIOU loss of the original network. Focal-EIOU uses a width-height loss to narrow the width and height gaps between the predicted box and the ground-truth box and to eliminate predicted boxes larger than the ground truth, which resolves the detection-box redundancy caused by dense and adhesive text and thus improves the model's prediction precision. Experimental results show that the average precision of the proposed algorithm reaches 93.35%, which is 3.08% higher than that of YOLOv5, making it better suited to detecting dense, adhesive bone stick text against complex texture backgrounds.
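The Focal-EIOU loss mentioned in the abstract (Zhang et al., listed in the references) augments IoU with separate center-distance, width, and height penalties and then reweights the result by IoU^γ so that high-quality boxes dominate the gradient. A minimal pure-Python sketch for a single pair of axis-aligned boxes, assuming corner-format boxes (x1, y1, x2, y2); γ = 0.5 is an illustrative choice, not necessarily the value used in the paper:

```python
def eiou_loss(pred, gt):
    """EIOU loss for one predicted box and one ground-truth box,
    both given as (x1, y1, x2, y2). Returns (iou, loss)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection-over-union of the two boxes
    inter = max(0.0, min(px2, gx2) - max(px1, gx1)) * \
            max(0.0, min(py2, gy2) - max(py1, gy1))
    union = pw * ph + gw * gh - inter
    iou = inter / union

    # Smallest enclosing box (its width, height, and squared diagonal)
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2

    # Squared distance between box centers
    rho2 = ((px1 + px2) - (gx1 + gx2)) ** 2 / 4 + \
           ((py1 + py2) - (gy1 + gy2)) ** 2 / 4

    # EIOU = IoU term + center-distance term + width term + height term;
    # the last two are the "width-height loss" the abstract refers to.
    loss = 1 - iou + rho2 / c2 + (pw - gw) ** 2 / cw ** 2 \
                               + (ph - gh) ** 2 / ch ** 2
    return iou, loss


def focal_eiou_loss(pred, gt, gamma=0.5):
    """Focal-EIOU: reweight EIOU by IoU**gamma so well-overlapping
    (high-IoU) boxes contribute more to the regression loss."""
    iou, eiou = eiou_loss(pred, gt)
    return iou ** gamma * eiou
```

For identical boxes every term vanishes and the loss is zero; as the predicted box grows wider or taller than the ground truth, the width and height terms grow independently, which is what lets the loss shrink oversized predictions around dense, adhesive characters.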
文字检测; 骨签; 注意力机制; YOLOv5; 损失函数
text detection; bone stick; attention mechanism; YOLOv5; loss function
吴至录.西汉长安城未央宫骨签书法研究[D].北京:中国艺术研究院,2018.
WU Z L. Research on the bones calligraph of Weiyang Palace in Chang’an city in Western Han dynasty [D]. Beijing: Chinese National Academy of Arts, 2018. (in Chinese)
王海勇.未央宫遗址出土骨签书法研究[D].北京:中央美术学院,2018.
WANG H Y. A study on the calligraphy of the bone markers unearthed at the Weiyang Palace site [D]. Beijing: China Central Academy of Fine Arts, 2018. (in Chinese)
陈玺,何斌,龙勇机,等.复杂海背景下的自适应舰船目标检测[J].液晶与显示,2022,37(3):405-414. doi: 10.37188/cjlcd.2021-0219
CHEN X, HE B, LONG Y J, et al. Adaptive ship target detection in complex background [J]. Chinese Journal of Liquid Crystals and Displays, 2022, 37(3): 405-414. (in Chinese). doi: 10.37188/cjlcd.2021-0219
LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector [C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam: Springer, 2016: 21-37. doi: 10.1007/978-3-319-46448-0_2
GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation [C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus: IEEE, 2014: 580-587. doi: 10.1109/cvpr.2014.81
REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/tpami.2016.2577031
LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection [C]//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2980-2988. doi: 10.1109/iccv.2017.324
REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788. doi: 10.1109/cvpr.2016.91
REDMON J, FARHADI A. YOLO9000: better, faster, stronger [C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 6517-6525. doi: 10.1109/cvpr.2017.690
刘杰,朱旋,宋密密.改进YOLOv2的端到端自然场景中文字符检测[J].控制与决策,2021,36(10):2483-2489. doi: 10.13195/j.kzyjc.2020.0270
LIU J, ZHU X, SONG M M. End-to-end Chinese character detection in natural scene based on improved YOLOv2 [J]. Control and Decision, 2021, 36(10): 2483-2489. (in Chinese). doi: 10.13195/j.kzyjc.2020.0270
REDMON J, FARHADI A. YOLOv3: an incremental improvement [J/OL]. arXiv, 2018: 1804.02767.
殷航,张智,王耀林.基于YOLOv3与MSER的自然场景中文文本检测研究与实现[J].计算机应用与软件,2021,38(10):168-172,195. doi: 10.3969/j.issn.1000-386x.2021.10.026
YIN H, ZHANG Z, WANG Y L. Research and implementation of Chinese text detection in natural scene based on YOLOv3 and MSER [J]. Computer Applications and Software, 2021, 38(10): 168-172, 195. (in Chinese). doi: 10.3969/j.issn.1000-386x.2021.10.026
BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection [J/OL]. arXiv, 2020: 2004.10934.
ULTRALYTICS. YOLOv5 [EB/OL]. (2020-06-03)[2021-04-15]. https://github.com/ultralytics/yolov5.
徐芳,刘晶红,孙辉,等.光学遥感图像海面船舶目标检测技术进展[J].光学 精密工程,2021,29(4):916-931. doi: 10.37188/OPE.2020.0419
XU F, LIU J H, SUN H, et al. Research progress on vessel detection using optical remote sensing image [J]. Optics and Precision Engineering, 2021, 29(4): 916-931. (in Chinese). doi: 10.37188/OPE.2020.0419
高梦婷,孙晗,唐云祁,等.基于改进YOLOv5的指纹二级特征检测方法[J].激光与光电子学进展,2023,60(10):1010006.
GAO M T, SUN H, TANG Y Q, et al. Fingerprint second-order minutiae detection method based on improved YOLOv5 [J]. Laser & Optoelectronics Progress, 2023, 60(10): 1010006. (in Chinese)
董乙杉,李兆鑫,郭靖圆,等.一种改进YOLOv5的X光违禁品检测模型[J].激光与光电子学进展,2023,60(4):0415005. doi: 10.3788/LOP212848
DONG Y S, LI Z X, GUO J Y, et al. Improved YOLOv5 model for X-ray prohibited item detection [J]. Laser & Optoelectronics Progress, 2023, 60(4): 0415005. (in Chinese). doi: 10.3788/LOP212848
奉志强,谢志军,包正伟,等.基于改进YOLOv5的无人机实时密集小目标检测算法[J].航空学报,2023,44(7):327106.
FENG Z Q, XIE Z J, BAO Z W, et al. Real-time dense small object detection algorithm for UAV based on improved YOLOv5 [J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(7): 327106. (in Chinese)
LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 5936-5944. doi: 10.1109/cvpr.2017.106
LI H, XIONG P, AN J, et al. Pyramid attention network for semantic segmentation [C]//Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 1701-1709. doi: 10.1109/cvpr.2019.00770
ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression [C]//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York: AAAI, 2020: 12993-13000. doi: 10.1609/aaai.v34i07.6999
ZHANG Y F, REN W Q, ZHANG Z, et al. Focal and efficient IOU loss for accurate bounding box regression [J]. Neurocomputing, 2022, 506(C): 146-157. doi: 10.1016/j.neucom.2022.07.042
ZHOU B L, KHOSLA A, LAPEDRIZA A, et al. Learning deep features for discriminative localization [C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 2921-2929. doi: 10.1109/cvpr.2016.319
TAN M X, PANG R M, LE Q V. EfficientDet: scalable and efficient object detection [C]//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10781-10790. doi: 10.1109/cvpr42600.2020.01079