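1. School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876
REN Shang-en (born 1999), male, from Jinan, Shandong; master's student. He received his bachelor's degree from Xidian University in 2021. His research focuses on gesture interaction and computer graphics. E-mail: shangen_ren@bupt.edu.cn
SANG Xin-zhu (born 1977), male, from Heze, Shandong; Ph.D., professor. He received his doctoral degree from Beijing University of Posts and Telecommunications in 2005. His research focuses on glasses-free 3D light field display, intelligent information processing, and optoelectronic information processing. E-mail: xzsang@bupt.edu.cn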
REN Shang-en, XING Shu-jun, CHEN Shuo, et al. Method for improving the accuracy of 3D light field interaction based on a small dataset of hand key points using an MLP network[J]. Chinese Journal of Liquid Crystals and Displays, 2023,38(9):1198-1204. DOI: 10.37188/CJLCD.2023-0072.
Current 3D light field gesture interaction suffers from low recognition rates, slow recognition, and deep learning networks that require large numbers of training samples. To address these problems, this paper proposes a method that improves the accuracy of 3D light field interaction using a multi-layer perceptron (MLP) network trained on a small dataset of hand key points, with recognition speed reaching the millisecond level. During key point acquisition, the three-dimensional key point data of the same gesture differ significantly when captured from different positions. To eliminate these differences, the simplified gesture model is pose-transformed within a single right-handed Cartesian coordinate system, using translation and the Rodrigues rotation formula, so that instances of the same gesture are normalized. An MLP neural network then extracts hand features from the transition relationships among the normalized hand key points. Experimental results show that the recognition rate of the proposed method in 3D light field interaction is above 95% for simple gestures and above 90% for complex gestures. The method also performs well when trained on a small dataset, meeting the requirements of accurate and fast gesture recognition. Finally, an application of the proposed method to a 3D light field interaction scenario is demonstrated.
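The pose normalization step (translation followed by a Rodrigues rotation) can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of a root key point and a reference key point, the target axis, and the function names `rodrigues` and `normalize_hand` are all assumptions made for illustration. The sketch aligns only one direction of the hand; a full normalization would also need to fix the remaining roll about that axis.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)

def rodrigues(v, k, theta):
    """Rotate v by theta radians about the unit axis k:
    v' = v cos(t) + (k x v) sin(t) + k (k . v)(1 - cos(t))."""
    c, s = math.cos(theta), math.sin(theta)
    kxv = cross(k, v)
    kdv = dot(k, v)
    return tuple(v[i] * c + kxv[i] * s + k[i] * kdv * (1.0 - c)
                 for i in range(3))

def normalize_hand(points, root=0, ref=1, target=(0.0, 1.0, 0.0)):
    """Hypothetical normalization: move points[root] to the origin, then
    rotate so the root-to-ref direction lies along the unit vector target."""
    root_pt = points[root]
    shifted = [sub(p, root_pt) for p in points]   # translation step
    a = shifted[ref]
    length = norm(a)
    if length == 0.0:
        return shifted                            # direction undefined
    a_unit = scale(a, 1.0 / length)
    axis = cross(a_unit, target)
    sin_t = norm(axis)
    cos_t = dot(a_unit, target)
    if sin_t < 1e-12:
        if cos_t > 0.0:
            return shifted                        # already aligned
        # Opposite direction: rotate by pi about any axis normal to target.
        helper = (1.0, 0.0, 0.0) if abs(target[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = cross(target, helper)
        k = scale(axis, 1.0 / norm(axis))
        theta = math.pi
    else:
        k = scale(axis, 1.0 / sin_t)
        theta = math.atan2(sin_t, cos_t)
    return [rodrigues(p, k, theta) for p in shifted]

# Usage: three made-up key points; index 0 is the root, index 1 the reference.
sample = [(1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (1.0, 2.0, 1.0)]
normalized = normalize_hand(sample)
```

After the call, the root sits at the origin and the reference point lies on the positive y axis at its original distance from the root, so the same gesture captured at a different position or heading maps to a comparable set of coordinates before classification.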
Keywords: interaction; gesture recognition; MLP; small dataset