Visual-pattern-based fixation classification and its application in image annotation

QI Zheng-yi (漆正溢); FANG Hong-ping (方红萍); WAN Zhong-hua (万中华); ZHANG Han-yuan (张瀚源); WU Shi-qian (伍世虔)

doi:10.37188/CJLCD.2022-0245

Image Processing | Updated: 2023-04-10

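- Visual-pattern-based fixation classification and its application in image annotation
- Chinese Journal of Liquid Crystals and Displays (液晶与显示), 2023, Vol. 38, No. 4, pp. 515-523
- Author affiliations:
  
  1. School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
  2. School of Machinery and Automation, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
- Author biographies:
  
  QI Zheng-yi (b. 1998), male, from Huanggang, Hubei; master's student; received his bachelor's degree from Hubei Normal University in 2020; research interests: eye tracking and computer vision. E-mail: wana2333@163.com
  FANG Hong-ping (b. 1977), female, from Wuhan, Hubei; Ph.D., associate professor; received her Ph.D. from Wuhan University of Science and Technology in 2015; research interests: image processing and pattern recognition. E-mail: fanghongping@wust.edu.cn
- Funding:
  
  National Natural Science Foundation of China (61775172); Major Science and Technology Project Cultivation Program of Wuhan University of Science and Technology (2018TDX06)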
QI Zheng-yi, FANG Hong-ping, WAN Zhong-hua, et al. Visual-pattern-based fixation classification and its application in image annotation[J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(4): 515-523. DOI: 10.37188/CJLCD.2022-0245.

Abstract

In existing eye-movement image annotation algorithms, fixation points that rest on non-target regions tend to introduce localization interference and lower the annotation accuracy. To address this problem, the eye-movement patterns that arise in the annotation task are first explored experimentally. The fixation sequence recorded during annotation is then divided into two stages, visual search and visual recognition, and a classification method based on a parameter-adaptive DBSCAN algorithm is designed to separate search fixations from recognition fixations, so that only the extracted recognition fixations are passed to the eye-movement image annotation algorithm, with the aim of improving the accuracy of the annotation results. Finally, comparative experiments and analysis are carried out on the 2014 DIMITRIOS P dataset. The results show that, compared with existing related algorithms, the F1 score improves by 4%, the running efficiency nearly doubles, and the accuracy of the eye-movement image annotation algorithm improves by 3.34%, meeting the requirements of stability, reliability, high accuracy, and fast running speed.
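The abstract does not state how the DBSCAN parameters are adapted or how clusters are mapped to the two stages, so the Python sketch below is only an illustration under stated assumptions, not the authors' method. It assumes a hypothetical rule in which `eps` is a multiple (`scale`) of the median nearest-neighbour distance between fixations, and it assumes that the most populated cluster corresponds to recognition fixations while every other fixation belongs to the search stage. The function names, the default `min_pts`, and the sample coordinates are all invented for the example.

```python
import math
from statistics import median


def adaptive_eps(points, scale=2.0):
    """Hypothetical adaptive rule: eps = scale * median nearest-neighbour distance."""
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    return scale * median(nearest)


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # The point itself is included, as in the standard definition.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


def classify_fixations(fixations, min_pts=3, scale=2.0):
    """Label each fixation 'search' or 'recognition' (assumed mapping, see lead-in)."""
    labels = dbscan(fixations, adaptive_eps(fixations, scale), min_pts)
    clusters = {label for label in labels if label != -1}
    if not clusters:
        return ["search"] * len(fixations)
    target = max(clusters, key=labels.count)  # assumption: densest cluster = target
    return ["recognition" if label == target else "search" for label in labels]


def f1_score(predicted, truth, positive="recognition"):
    """F1 of the positive class, computed from paired label lists."""
    tp = sum(p == positive and t == positive for p, t in zip(predicted, truth))
    fp = sum(p == positive and t != positive for p, t in zip(predicted, truth))
    fn = sum(p != positive and t == positive for p, t in zip(predicted, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


# Invented sample: four scattered fixations followed by five on one region.
fixations = [(100, 80), (420, 300), (250, 500), (610, 120),
             (305, 210), (300, 215), (310, 205), (298, 208), (307, 212)]
stages = classify_fixations(fixations)
print(stages)
```

In this sketch only the fixations labelled `recognition` would be handed to a downstream annotation step. The `f1_score` helper shows how such a classification could be scored against hand-labelled stages, in the spirit of the F1 comparison reported in the abstract.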

Keywords

eye-movement fixation classification; eye-movement image annotation; visual search; visual recognition; parameter-adaptive DBSCAN
