Latest Issue

    Issue 11, 2023

      Object Detection and Recognition

    • LIU Ying,SUN Hai-jiang,ZHAO Yong-xian
      Vol. 38, Issue 11, Pages: 1455-1467(2023) DOI: 10.37188/CJLCD.2023-0030
      Abstract: Small targets in infrared images occupy few pixels and lack distinct feature details in complex scenes, which makes their features difficult to extract and typically results in low detection accuracy. This paper proposes an infrared small-target detection method based on an attention mechanism for complex backgrounds. Building on the YOLOv5 (You Only Look Once) network, a SimAMC3 attention module is designed to optimize the feature-extraction layers. The detection head is redesigned by adding a feature-fusion layer that changes the depth of feature extraction, yielding a new weak-target detection layer so that the shallow feature layers better retain the spatial information of weak targets. Finally, the prediction-box screening method is improved to raise the detection accuracy for closely spaced or overlapping objects. In the experiments, two SIRST infrared dim-small-target image datasets are labeled and used for training. The results show that, compared with the original YOLOv5 algorithm, the improved algorithm raises the mean average precision (mAP) by 4.8% and 7.1% on the two datasets, and it effectively detects infrared dim-small targets under different complex backgrounds with good robustness and adaptability. (An illustrative attention sketch follows this entry.)
      Keywords: deep learning; infrared dim-small target; target detection; attention mechanism
      Published: 2023-11-03
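The SimAM attention named above is parameter-free: each activation is re-weighted by an energy-based saliency term computed from its channel statistics. A minimal PyTorch sketch is given below; the exact SimAMC3 integration into YOLOv5 is not reproduced, and `e_lambda` is an assumed regularization constant.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free SimAM attention (a sketch, not the paper's SimAMC3 block)."""
    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda  # assumed regularization constant

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        n = h * w - 1
        mu = x.mean(dim=(2, 3), keepdim=True)          # per-channel spatial mean
        d = (x - mu).pow(2)
        v = d.sum(dim=(2, 3), keepdim=True) / n        # per-channel spatial variance
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5    # inverse energy: higher = more salient
        return x * torch.sigmoid(e_inv)

# usage: y = SimAM()(torch.randn(1, 64, 40, 40))
```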
    • XIAO Zhen-jiu,ZHAO Hao-ze,ZHANG Li-li,XIA Yu,GUO Jie-long,YU Hui,LI Cheng-long,WANG Li-wen
      Vol. 38, Issue 11, Pages: 1468-1480(2023) DOI: 10.37188/CJLCD.2023-0005
      Abstract: In object detection, the quantity regressed by the traditional bounding-box regression loss is not correlated with the IoU (Intersection over Union) evaluation metric, and parts of the formulation are unreasonable with respect to the regression attributes of the bounding box, which reduces detection accuracy and convergence speed. In addition, sample imbalance also exists in the regression task, and a large number of low-quality samples hinder convergence of the loss. This paper proposes a novel loss function, CRIoU (Complete Relativity Intersection over Union), to improve detection accuracy and convergence speed. Firstly, the design idea and the normal form of the improved IoU loss are determined. Secondly, on the basis of the IoU loss, the ratio of the perimeter of the rectangle formed by the two center points to that of the minimum enclosing region of the two boxes is introduced as a penalty term on the center-point distance, and this improved IoU loss is also applied to non-maximum suppression. Then, the side-length error of the two boxes relative to the squared side lengths of the minimum enclosing box is introduced as a side penalty term, which completes the CRIoU loss. Finally, an adaptive weighting factor is added on top of CRIoU to up-weight the regression loss of high-quality samples, giving AF-CRIoU (Adaptive Focal CRIoU). Experimental results show that detection accuracy with the AF-CRIoU loss is up to 8.52% higher than with traditional non-IoU losses and up to 2.69% higher than with CIoU-series losses, and the A-CRIoU-NMS (Around CRIoU Non-Maximum Suppression) method improves accuracy by up to 0.14% over the original NMS. AF-CRIoU is also applied to safety-helmet detection and achieves good results. (A generic IoU-with-penalty sketch follows this entry.)
      Keywords: object detection; bounding box regression; IoU loss function; non-maximum suppression; adaptive focal loss
      Published: 2023-11-03
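The CRIoU family extends the plain IoU loss with penalty terms. The sketch below shows the general pattern with a center-distance penalty expressed as a perimeter ratio; it is an assumed simplification for illustration, not the paper's exact CRIoU/AF-CRIoU formula.

```python
import torch

def penalized_iou_loss(pred, target, eps=1e-7):
    """IoU loss plus a center-distance penalty (illustrative, not the exact CRIoU).
    pred, target: (N, 4) boxes as (x1, y1, x2, y2)."""
    # intersection and IoU
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # perimeter of the rectangle spanned by the two centers
    cp = (pred[:, :2] + pred[:, 2:]) / 2
    ct = (target[:, :2] + target[:, 2:]) / 2
    center_perim = 2 * (cp - ct).abs().sum(dim=1)
    # perimeter of the minimum enclosing box of both boxes
    ew = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    eh = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    enclose_perim = 2 * (ew + eh)
    penalty = center_perim / (enclose_perim + eps)
    return (1 - iou + penalty).mean()
```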
    • WANG Ting-yu,WANG Zhi-yi,YANG Yong-qiang,MI Xiao-tao,WANG Jian-li,YAO Kai-nan,CHENG Xue
      Vol. 38, Issue 11, Pages: 1481-1489(2023) DOI: 10.37188/CJLCD.2023-0045
      Abstract: To address the problem that a traditional laser differential confocal microscope (LDCM) cannot measure the tilt angle with high precision while measuring distance, an angle-measurement sensor based on the LDCM is proposed. When measuring an inclined surface, the sensor first uses the differential response signal obtained by axial scanning to locate the focal position accurately, then analyzes the field-intensity distribution in the microscope pupil and extracts the peak position of the spot image, thereby achieving accurate measurement of the inclination. First, a field-distribution model of the focused beam reflected by the tilted surface under test is established, and the field-intensity distribution in the pupil plane is analyzed at different tilt angles. Then, based on an analysis of the characteristics of the slanted spot, a method for extracting the spot peak position with an improved Meanshift algorithm is proposed. Finally, the effectiveness of the sensor for tilt-angle measurement is verified experimentally. The results show an average error of 0.011° for the tilt magnitude (0°~8°) and 0.128° for the tilt direction (0°~360°), which meets the requirements for measuring the tilt angle of a surface while detecting three-dimensional topography with the LDCM. The sensor provides a new approach to high-precision contour measurement of free-form surfaces. (A generic mean-shift peak-localization sketch follows this entry.)
      Keywords: non-contact optical probe; differential confocal; 3D detection; tilt measurement; peak extraction
      Published: 2023-11-03
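Peak extraction of the pupil-plane spot can be illustrated with a mean-shift style iteration that moves a window to the local intensity-weighted centroid. The sketch below is generic; the paper's improved Meanshift algorithm and the mapping from peak position to tilt angle are not reproduced, and `win` is an assumed half-window size.

```python
import numpy as np

def meanshift_peak(img, win=7, max_iter=50, tol=1e-3):
    """Sub-pixel peak localization of a spot image by mean-shift iteration."""
    y, x = np.unravel_index(np.argmax(img), img.shape)   # coarse start at brightest pixel
    cy, cx = float(y), float(x)
    for _ in range(max_iter):
        y0 = max(int(round(cy)) - win, 0)
        x0 = max(int(round(cx)) - win, 0)
        patch = img[y0:y0 + 2 * win + 1, x0:x0 + 2 * win + 1].astype(float)
        ys, xs = np.mgrid[y0:y0 + patch.shape[0], x0:x0 + patch.shape[1]]
        w = patch.sum()
        ny, nx = (ys * patch).sum() / w, (xs * patch).sum() / w   # weighted centroid
        if abs(ny - cy) < tol and abs(nx - cx) < tol:
            break
        cy, cx = ny, nx
    return cy, cx
```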
    • ZHANG Xin,QIAO Ji-hong,ZHANG Hui-yan,ZHANG Yan,ZHANG Xin,XU Ji-ping
      Vol. 38, Issue 11, Pages: 1490-1502(2023) DOI: 10.37188/CJLCD.2023-0007
      Abstract: To fully extract the relevant features of color images and evaluate image color by simulating the visual perception characteristics of the human eye, an automatic evaluation method for camera subjective-scene imaging color and white balance (CIQA for short) is proposed. Firstly, the positions of the twenty-four patches of the ColorChecker standard chart in the subjective image are located by combining SIFT (Scale-Invariant Feature Transform) matching with a perspective transformation. Then, an expert grading method and the entropy-weight method are used to construct a deviation least-squares model that determines the weight proportions of the color-restoration and white-balance indicators. The closeness of each scheme to the positive and negative ideal schemes is calculated by a TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) method optimized with multi-attribute weights, so as to rank the smartphones. Experiments on pictures collected in real scenes, compared with two existing decision-making methods, show that the proposed method improves evaluation efficiency, saves manpower, and yields evaluation results consistent with subjective human judgment. (A minimal TOPSIS ranking sketch follows this entry.)
      Keywords: target recognition; indicator; deviation least square method; color; smart phone
      Published: 2023-11-03
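The TOPSIS step described above ranks candidate devices by their closeness to positive and negative ideal solutions. A minimal sketch follows; the paper's deviation least-squares weighting and its optimized multi-attribute variant are not reproduced, and the weights shown in the usage line are placeholders.

```python
import numpy as np

def topsis_rank(X, weights, benefit=None):
    """Rank m schemes over n criteria by closeness to the ideal solution."""
    X = np.asarray(X, dtype=float)
    if benefit is None:
        benefit = [True] * X.shape[1]          # True = larger is better
    V = X / np.linalg.norm(X, axis=0) * np.asarray(weights)   # normalize, then weight
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))   # positive ideal solution
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))    # negative ideal solution
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)
    closeness = d_neg / (d_pos + d_neg)        # larger = closer to the ideal
    return np.argsort(-closeness), closeness

# usage: order, score = topsis_rank([[0.9, 0.2], [0.7, 0.1]], weights=[0.6, 0.4], benefit=[True, False])
```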
    • ZHAO Xiao,YANG Chen,WANG Ruo-nan,LI Yue-chen
      Vol. 38, Issue 11, Pages: 1503-1510(2023) DOI: 10.37188/CJLCD.2023-0046
      Abstract: Aiming at the large model size and limited accuracy of the ResNet18 network in facial expression recognition, a lightweight ResNet based on a multi-scale CBAM (Convolutional Block Attention Module) attention mechanism, MCLResNet, is proposed, which achieves facial expression recognition with fewer parameters and higher accuracy. Firstly, ResNet18 is used as the backbone for feature extraction; group convolution is introduced to reduce its parameter count, and an inverted residual structure is used to increase network depth and improve feature extraction. Secondly, the shared fully connected layer in the CBAM channel-attention module is replaced with a 1×3 convolution, which effectively reduces the loss of channel information, and a multi-scale convolution module is added to the CBAM spatial-attention module to obtain spatial features at different scales. Finally, the resulting multi-scale CBAM module (MSCBAM) is inserted into the lightweight ResNet, which effectively increases the feature-expression ability of the model, and an additional fully connected layer is added at the output to increase its nonlinear representation capability. Experiments on the FER2013 and CK+ datasets show that the proposed model reduces the parameter count by 82.58% compared with ResNet18 while achieving better recognition accuracy. (A sketch of a channel-attention block with a 1×3 convolution follows this entry.)
      Keywords: lightweight ResNet network; multi-scale spatial feature fusion; facial expression recognition; attention mechanism
      Published: 2023-11-03
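Replacing the shared fully connected layers of CBAM's channel attention with a 1×3 convolution can be sketched as below (PyTorch). This illustrates the idea only, not the paper's exact MSCBAM module; the multi-scale spatial branch is omitted.

```python
import torch
import torch.nn as nn

class ChannelAttention1x3(nn.Module):
    """CBAM-style channel attention with a 1x3 convolution instead of shared FC layers."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = x.mean(dim=(2, 3))                     # global average pooling, (B, C)
        mx = x.amax(dim=(2, 3))                      # global max pooling, (B, C)
        a = self.conv(avg.unsqueeze(1)) + self.conv(mx.unsqueeze(1))  # local cross-channel interaction
        w = torch.sigmoid(a).view(b, c, 1, 1)
        return x * w                                 # re-weight channels

# usage: y = ChannelAttention1x3()(torch.randn(2, 64, 28, 28))
```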
    • XU Yan-long,PAN Hao,DING Bai-yuan
      Vol. 38, Issue 11, Pages: 1511-1520(2023) DOI: 10.37188/CJLCD.2023-0052
      Abstract: Synthetic aperture radar (SAR) target recognition is an important application of SAR image interpretation. To improve its robustness, this paper proposes an attribute scattering center matching method based on a deep belief network (DBN). Attribute scattering centers have rich parameters that reflect the local scattering characteristics of the target well. The DBN exploits deep learning to achieve robust matching between the scattering-center sets of test samples and template samples, and adapts well to noise interference, partial absence of scattering centers, and other conditions. Based on the matching correspondence of the attribute scattering center sets, a similarity measure is defined, and the target label of the test sample is determined by the maximum-similarity principle. Experiments on the MSTAR dataset show that the proposed method is effective and robust for SAR target recognition.
      Keywords: synthetic aperture radar; target recognition; attribute scattering center; deep belief network
      Published: 2023-11-03
    • NIU Zhao-xu,SUN Hai-jiang
      Vol. 38, Issue 11, Pages: 1521-1530(2023) DOI: 10.37188/CJLCD.2023-0013
      Abstract: To accelerate convolutional neural networks in low-power, edge-computing and similar scenarios, a Winograd-algorithm convolutional neural network accelerator based on a field programmable gate array (FPGA) is designed. Firstly, the image data and weights are quantized to 8-bit fixed-point numbers, and the quantization steps are integrated into the hardware convolution pipeline to improve data transmission and computation speed. Secondly, an input-data buffer multiplexing module is designed, which fuses and transmits the data of multiple input channels and reuses row-overlapping data. Then, a Winograd pipelined convolution module is designed to reuse column data in combination, maximizing on-chip data reuse and reducing on-chip storage occupation and bandwidth pressure. Finally, the accelerator is deployed on a Xilinx ZCU104 development board. Experiments show that the convolution-layer computing performance of the accelerator reaches 354.5 GOPS and the on-chip DSP computing efficiency reaches 0.69, more than 1.6 times higher than related work. The accelerator completes a remote-sensing image classification task based on the VGG-16 network with a high energy-efficiency ratio. (A minimal Winograd F(2,3) sketch follows this entry.)
      Keywords: convolutional neural network; field programmable gate array; Winograd algorithm; pipeline; parallel computing
      Published: 2023-11-03
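The Winograd algorithm trades multiplications for additions, which is what the pipelined convolution module exploits. The minimal F(2,3) example below computes two outputs of a 3-tap convolution with four multiplications; the accelerator's 8-bit fixed-point arithmetic and 2-D tiling are not shown.

```python
import numpy as np

def winograd_f23(d, g):
    """Winograd F(2,3): 2 outputs of a 1-D, 3-tap convolution with 4 multiplications."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * (g0 + g1 + g2) / 2
    m3 = (d2 - d1) * (g0 - g1 + g2) / 2
    m4 = (d1 - d3) * g2
    return np.array([m1 + m2 + m3, m2 - m3 - m4])

# check against the direct sliding-window computation
d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, 1.0, -1.0])
assert np.allclose(winograd_f23(d, g), [d[0:3] @ g, d[1:4] @ g])
```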
    • QI Yi-chen,ZHAO Wei-chao
      Vol. 38, Issue 11, Pages: 1531-1541(2023) DOI: 10.37188/CJLCD.2023-0056
      Abstract: To improve the efficiency of obtaining open-source aerospace information and to address the long texts, relatively limited quantity, poor robustness of common text-classification models, and unintuitive presentation of such information, this paper proposes an aerospace-information text classification method based on supervised contrastive learning. The method builds on a bidirectional long short-term memory (BiLSTM) network with an attention mechanism and integrates contrastive learning to process and analyze open-source information, efficiently screening out aerospace information, and then uses the unCLIP (un-Contrastive Language-Image Pre-Training) model to generate an image corresponding to the information. Experimental results show that, compared with commonly used text-classification methods such as CNN (Convolutional Neural Network), BiLSTM, Transformer and BiLSTM-Attention, the proposed method performs well on accuracy, recall and F1-score, with the F1-score reaching 0.97. Presenting the information as images also makes it clearer and more intuitive. The method makes full use of open data resources on the network and effectively extracts open-source aerospace information and generates corresponding images, which is of great value for the analysis and study of aerospace information. (A compact BiLSTM-with-attention classifier sketch follows this entry.)
      Keywords: supervised text classification; contrastive learning; text-to-image synthesis; aerospace information
      Published: 2023-11-03
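The backbone described above is a BiLSTM with attention pooling over the token sequence. A compact PyTorch sketch follows; the supervised contrastive loss and the unCLIP image-generation stage are not included, and the hyper-parameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMAttnClassifier(nn.Module):
    """BiLSTM text classifier with additive attention pooling (generic sketch)."""
    def __init__(self, vocab_size=30000, embed_dim=128, hidden=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)        # scores each time step
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        h, _ = self.lstm(self.embed(tokens))        # (B, T, 2H)
        a = torch.softmax(self.attn(h), dim=1)      # attention weights over time
        ctx = (a * h).sum(dim=1)                    # weighted sentence representation
        return self.fc(ctx)                         # class logits

# usage: logits = BiLSTMAttnClassifier()(torch.randint(1, 30000, (4, 64)))
```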
    • ZHANG Run-jiang,GUO Jie-long,YU Hui,LAN Hai,WANG Xi-hao,WEI Xian
      Vol. 38, Issue 11, Pages: 1542-1553(2023) DOI: 10.37188/CJLCD.2022-0419
      Abstract: In response to the current situation that all targets in incremental learning have fixed poses, this paper considers a more rigorous setting, online class-incremental learning for multi-pose targets, and proposes a pose-ignoring replay method to alleviate catastrophic forgetting when facing multi-pose targets in online class-incremental learning. Firstly, 2D/3D targets are converted to point clouds to facilitate the extraction of useful geometric information. Secondly, the network is modified for equivariance based on the SE(d) (d=2,3) group, enabling it to extract richer geometric information and reducing the impact of target pose on the model in each task. Finally, specific samples are selected for replay according to loss variation to mitigate catastrophic forgetting. Experimental results show that for the fixed-pose targets MNIST and CIFAR-10, the final average accuracy reaches 88% and 42.6% respectively, comparable to the comparison methods, while the final average forgetting is significantly better, reduced by about 3% and 15% respectively. For the multi-pose targets RotMNIST and trCIFAR-10, the proposed method performs as well as it does on fixed-pose targets and is largely independent of target pose, and its performance on the 3D datasets ModelNet40 and trModelNet40 also remains stable. The proposed method is thus independent of target pose in online class-incremental learning while mitigating catastrophic forgetting, with excellent stability and plasticity.
      Keywords: online class-incremental learning; catastrophic forgetting; ignoring pose replay; equivariance; point cloud classification
      Published: 2023-11-03

      Image Enhancement

    • WANG De-xing,YANG Yu-rui,YUAN Hong-chun,GAO Kai
      Vol. 38, Issue 11, Pages: 1554-1566(2023) DOI: 10.37188/CJLCD.2022-0382
      Abstract: To address the severe color cast and low contrast caused by light absorption and scattering, an underwater image enhancement method combining a lightweight feature-fusion network with multi-color-model correction is proposed. Firstly, a feature-fusion network with a convolutional encoder-decoder structure is used to correct the color cast of the underwater image. The improved feature-fusion module in the network reduces the damage that fully connected layers cause to the spatial structure of the image, preserves spatial features, and reduces the module's parameter count; meanwhile, the improved attention module parallelizes the pooling operations to extract texture details and preserve background information. Then, a multi-color-model correction module adjusts the image according to the relationships between pixels to further reduce color cast and improve contrast and brightness. Experimental results show that, compared with recent enhancement methods, the average NRMSE, PSNR and SSIM on the full-reference dataset improve by 9.30%, 3.70% and 2.30% respectively over the second-best comparison algorithm, and the average UCIQE, IE and NIQE on the no-reference dataset are 6%, 2.9% and 4.5% better than the second-best comparison algorithm. Combining subjective perception and objective evaluation, the method corrects the color cast of underwater images and improves contrast, brightness and overall image quality. (A simple multi-color-space correction sketch follows this entry.)
      Keywords: image processing; neural networks; attention mechanism; color model; encoding and decoding structure
      Published: 2023-11-03
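Correction across more than one color model can be illustrated with a toy two-stage pipeline: gray-world white balance in RGB followed by contrast equalization of the lightness channel in CIELAB. This is an assumed simplification for illustration only, not the paper's learned correction module.

```python
import cv2
import numpy as np

def simple_multicolor_correction(bgr):
    """Toy two-color-model correction: gray-world balance (RGB) + CLAHE on L (CIELAB)."""
    img = bgr.astype(np.float32)
    means = img.reshape(-1, 3).mean(axis=0)
    img *= means.mean() / (means + 1e-6)          # gray-world: equalize channel means
    img = np.clip(img, 0, 255).astype(np.uint8)
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)  # lift contrast/brightness
    return cv2.cvtColor(cv2.merge([l, a, b]), cv2.COLOR_LAB2BGR)
```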
    • LI Ming-qing,WANG Yu-qing,SUN Hai-jiang
      Vol. 38, Issue 11, Pages: 1567-1579(2023) DOI: 10.37188/CJLCD.2022-0423
      Abstract: In the nonuniformity correction of infrared focal plane array (IRFPA) detectors, traditional neural-network algorithms produce blurred image edges, low contrast, ghosting artifacts and other defects. To address these phenomena, this paper proposes an improved neural-network nonuniformity correction algorithm based on side window filtering. The algorithm first applies side window filtering to the input image to obtain the desired image, removing nonuniform noise while protecting the edge details of the target and improving image quality. On this basis, a saturating nonlinear function suppresses local divergence of the correction parameters, effectively avoiding ghosting artifacts in the corrected image. Experimental results show that the proposed algorithm effectively removes nonuniform noise without obvious ghosting, and the average image roughness of three test image sequences is reduced by 30.17%. The maximum time for continuously processing 400 frames on the experimental computer is 37.417 0 s, which is 95.05% less than the comparison algorithm based on bilateral filtering and 45.81% less than the one based on wavelet principal component analysis. The algorithm has clear advantages in correction quality and efficiency, providing a new approach to real-time nonuniformity correction on mobile platforms with limited computing power and low power consumption. (A sketch of the classic neural-network gain/offset update follows this entry.)
      Keywords: infrared focal plane array; nonuniformity correction; side window filter; neural network
      Published: 2023-11-03
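The underlying neural-network correction is the classic per-pixel gain/offset model updated by an LMS rule toward a spatially filtered "desired" image. The sketch below uses a plain box filter where the paper uses side window filtering, and the learning rate is an assumed value.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def nuc_step(frame, gain, offset, lr=0.01):
    """One frame of LMS-style nonuniformity correction (box filter as a stand-in
    for the side window filter used in the paper)."""
    corrected = gain * frame + offset             # per-pixel correction
    desired = uniform_filter(corrected, size=3)   # desired (smoothed) image
    err = corrected - desired
    gain -= lr * err * frame                      # LMS updates of gain and offset
    offset -= lr * err
    return corrected, gain, offset

# usage over a sequence:
# gain, offset = np.ones(frame_shape), np.zeros(frame_shape)
# for frame in frames:
#     out, gain, offset = nuc_step(frame, gain, offset)
```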
    • SONG Wei,SHI Li-biao,GENG Li-jia,MA Zhen-ling,DU Yan-ling
      Vol. 38, Issue 11, Pages: 1580-1589(2023) DOI: 10.37188/CJLCD.2022-0387
      Abstract: Geometric distortion correction is a key pre-processing step for many computer vision applications, yet current deep-learning correction methods mainly handle a single type of distortion. This paper therefore proposes a hybrid distortion correction method based on an improved U-Net. Firstly, a method for constructing hybrid-distortion image datasets is proposed, which alleviates the problems of sparse training data and single distortion types. Secondly, a U-Net with spatial attention is used to extract image features and reconstruct the distortion coordinate map, turning image correction into per-pixel prediction of the coordinate displacement of the distorted image, and a loss combining a coordinate-difference term and an image-resampling term is designed to improve correction accuracy. Finally, ablation experiments verify each module of the method. Compared with recent deep-learning distortion-correction methods, the proposed method performs better on quantitative metrics and subjective evaluation, reaching a mean absolute error of 0.251 9 for the coordinate correction of distorted images. Calibration experiments on optical images captured with GoPro cameras further verify its effectiveness in practice. (A sketch of resampling an image with a predicted coordinate map follows this entry.)
      Keywords: hybrid distortion correction; U-Net; spatial attention; coordinate difference loss; resampling loss
      Published: 2023-11-03
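Turning correction into per-pixel coordinate prediction means the corrected image is obtained by resampling the distorted image at the predicted coordinates, which is also how a resampling loss can be computed. The PyTorch sketch below assumes the network outputs a pixel-displacement map `flow`; the U-Net itself is not shown.

```python
import torch
import torch.nn.functional as F

def resample_with_flow(img, flow):
    """Warp img (B, C, H, W) with a per-pixel displacement map flow (B, 2, H, W) in pixels."""
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(img.device)   # (2, H, W) pixel grid
    coords = base.unsqueeze(0) + flow                            # displaced sampling positions
    gx = 2 * coords[:, 0] / (w - 1) - 1                          # normalize to [-1, 1]
    gy = 2 * coords[:, 1] / (h - 1) - 1
    grid = torch.stack((gx, gy), dim=-1)                         # (B, H, W, 2)
    return F.grid_sample(img, grid, align_corners=True)

# resampling loss against the ground-truth undistorted image:
# loss = (resample_with_flow(distorted, pred_flow) - target).abs().mean()
```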

      Image Segmentation

    • JIANG Shi-yi,XU Yang,LI Dan-yang,FAN Run-ze
      Vol. 38, Issue 11, Pages: 1590-1599(2023) DOI: 10.37188/CJLCD.2023-0010
      Abstract: Traditional knowledge distillation schemes for semantic segmentation still suffer from incomplete distillation and weak transfer of feature information, which limit network performance, and the knowledge transferred by the teacher network is complex, so the positional information of features is easily lost. To solve these problems, this paper presents a feature-refinement semantic segmentation network based on knowledge distillation. Firstly, a feature-extraction method is designed to separate foreground content from background noise in the distilled knowledge and to filter out the pseudo-knowledge of the teacher network, so that more accurate feature content is passed to the student network and feature quality is improved. At the same time, inter-class and intra-class distances are extracted from the implicit encoding of the feature space to obtain the corresponding feature-coordinate masks. Then, the student network imitates the teacher's feature-location information to minimize the discrepancy of feature-location outputs, and the corresponding distillation losses are computed, which improves the segmentation accuracy of the student network and helps it converge faster. Finally, the method achieves strong segmentation performance on the public Pascal VOC and Cityscapes datasets, with mIoU reaching 74.19% and 76.53% respectively, 2.04% and 4.48% higher than the original student network. Compared with mainstream methods, the proposed method offers better segmentation performance and robustness, providing a new approach to knowledge distillation for semantic segmentation. (A basic segmentation-distillation loss sketch follows this entry.)
      Keywords: semantic segmentation; neural network; knowledge distillation; feature refine; deep learning
      Published: 2023-11-03
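A baseline distillation objective for segmentation combines pixel-wise cross-entropy on the labels with a temperature-softened KL term against the teacher. The sketch below shows only this generic baseline; the feature-refinement and feature-coordinate-mask losses described above are specific to the paper and not reproduced.

```python
import torch.nn.functional as F

def segmentation_kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Cross-entropy + soft KL distillation for (B, C, H, W) logits and (B, H, W) labels."""
    ce = F.cross_entropy(student_logits, labels, ignore_index=255)
    p_t = F.softmax(teacher_logits / T, dim=1)             # softened teacher distribution
    log_p_s = F.log_softmax(student_logits / T, dim=1)     # softened student log-probs
    kd = F.kl_div(log_p_s, p_t, reduction="batchmean") * (T * T)
    return alpha * ce + (1 - alpha) * kd
```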
    • HUANG Cong,ZOU Yao-bin
      Vol. 38, Issue 11, Pages: 1600-1614(2023) DOI: 10.37188/CJLCD.2022-0427
      Abstract: For images with a bimodal gray-level histogram, traditional two-dimensional histogram threshold segmentation methods are effective, but when the histogram is peakless, unimodal or multimodal, their segmentation results are poor. Considering that the two-dimensional survival function obtained by mapping the two-dimensional histogram has continuous density and uniform morphology, a fast two-dimensional cumulative residual Tsallis entropy thresholding method based on the image's two-dimensional survival function is proposed. The method first constructs the two-dimensional survival function from the two-dimensional histogram, then defines a two-dimensional cumulative residual Tsallis entropy objective function on the survival function to compute the segmentation threshold. A recursive algorithm reduces the time complexity of evaluating the objective function to O(L²). Finally, the optimal threshold vector is obtained from the recursive form of the two-dimensional cumulative residual Tsallis entropy criterion and used for threshold segmentation. On 26 synthetic and 76 real-world images, the proposed method is compared with two fast two-dimensional thresholding methods, two clustering methods and one active-contour method in terms of running time and misclassification error (ME). The results show that, compared with the second-best method, the running time is shortened by 0.013 s and the ME value is reduced by 0.051~0.089 on average on both synthetic and real-world images. The proposed method is not only more efficient than the five comparison methods but also has clear advantages in segmentation adaptability and accuracy. (A one-dimensional thresholding sketch follows this entry.)
      Keywords: threshold segmentation; two-dimensional histogram; two-dimensional survival function; cumulative residual Tsallis entropy; fast recursive algorithm
      Published: 2023-11-03
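The thresholding criterion can be illustrated in one dimension: for each candidate threshold, compute the cumulative residual Tsallis entropy of the class-conditional survival functions and keep the threshold that maximizes their sum. The paper works on the two-dimensional survival function with a recursive O(L²) evaluation; the brute-force 1-D sketch below only illustrates the criterion, and the entropic index q is an assumed value.

```python
import numpy as np

def crte_threshold(gray, q=0.8):
    """1-D cumulative residual Tsallis entropy thresholding (illustrative brute force)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()

    def crte(prob):
        s = prob.sum()
        if s <= 0:
            return 0.0
        surv = np.cumsum(prob[::-1])[::-1] / s        # class-conditional survival function
        return (surv - surv ** q).sum() / (q - 1)     # cumulative residual Tsallis entropy

    scores = [crte(p[:t + 1]) + crte(p[t + 1:]) for t in range(255)]
    return int(np.argmax(scores))

# usage: t = crte_threshold(image_uint8); mask = image_uint8 > t
```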