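1. School of Computer Science and Engineering, Northeastern University, Shenyang 110167, Liaoning, China
2. Network and Information Technology Center, Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, Jilin, China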
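QI Yi-chen (born 1997), female, from Changchun, Jilin; master's student. She received her bachelor's degree from Yanbian University in 2020. Her research focuses on pattern recognition and multimodal learning. E-mail: qiyichen_619@163.com
ZHAO Wei-chao (born 1992), male, from Liaoyuan, Jilin; M.S., assistant researcher. He received his master's degree from Northwestern Polytechnical University in 2016. His research focuses on pervasive computing, edge computing, and natural language processing. E-mail: zhaoweichao@ciomp.ac.cn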
QI Yi-chen, ZHAO Wei-chao. Aerospace information acquisition and image generation based on supervised contrastive learning [J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(11): 1531-1541. DOI: 10.37188/CJLCD.2023-0056.
To make the acquisition of open-source aerospace information more efficient, and to address the problems that such texts are long, limited in quantity, handled with poor robustness by commonly used text classification models, and not intuitive when presented as plain text, this paper proposes an aerospace information classification method based on supervised contrastive learning. The method builds on a bidirectional long short-term memory (BiLSTM) network with an attention mechanism and integrates contrastive learning. It processes and analyzes open-source information, efficiently identifies the aerospace-related items, and then uses the unCLIP (un-Contrastive Language-Image Pre-Training) model to generate an image corresponding to each item. Experimental results show that, compared with commonly used text classification methods such as CNN (Convolutional Neural Networks), BiLSTM, Transformer, and BiLSTM-Attention, the proposed method performs well in accuracy, recall, and F1-score, with the F1-score reaching 0.97. Presenting the information as images also makes it clearer and more intuitive. The method makes full use of publicly available data on the web, effectively extracts open-source aerospace information, and generates corresponding images, which is of significant value for the analysis and study of aerospace information.
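The abstract names supervised contrastive learning as the core training signal but does not give the loss itself. As a rough illustration only, the sketch below implements the widely used form of the supervised contrastive loss: embeddings that share a label are pulled together and all others are pushed apart. The function names, the temperature value of 0.1, and the toy two-dimensional vectors are assumptions made for this example; the paper's actual encoder (BiLSTM with attention) and its exact loss weighting are not reproduced here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Mean supervised contrastive loss over anchors that have a positive.

    embeddings: list of equal-length float vectors (hypothetical text encodings).
    labels: list of class ids, one per embedding.
    temperature: assumed value; the source does not state one.
    """
    if len(embeddings) != len(labels):
        raise ValueError("embeddings and labels must have the same length")
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class partner is skipped
        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        # Log-sum-exp over all non-anchor samples, stabilised by the maximum.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability of the anchor's positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supervised_contrastive_loss(toy_embeddings, [1, 1, 0, 0])
loss_shuffled = supervised_contrastive_loss(toy_embeddings, [1, 0, 1, 0])
```

With labels that match the clusters the loss is close to zero, while labels that cut across the clusters give a much larger value, which is the behaviour a classifier trained with this objective exploits. In practice such a term is usually combined with a cross-entropy classification loss; how the two are weighted in the paper is not stated in this section.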
supervised text classification; contrastive learning; text-to-image synthesis; aerospace information