Latest Issue

    Volume 39, Issue 2, 2024

      Image Segmentation

    • LIU Lamei,DU Baochang,HUANG Huiling,ZHANG Yongjian,HAN Jun
      Vol. 39, Issue 2, Pages: 121-130(2024) DOI: 10.37188/CJLCD.2023-0036
      Abstract: To tackle the challenges posed by the cumbersome computation and intricate decoding structure of encoder-decoder semantic segmentation networks, we present DFNet, a decoder-free binary semantic segmentation model. By discarding the complex decoding structure and skip connections that are ubiquitous in conventional segmentation networks, the model adopts a convolutional remolding upsampling method to reshape the feature encoding directly into precise segmentation results, significantly streamlining the network architecture. Moreover, the encoder integrates a lightweight dual attention mechanism, EC&SA, to facilitate effective communication of channel and spatial information and bolster the network's encoding capability. To further enhance segmentation accuracy, the traditional segmentation loss is replaced with the PolyCE loss, which resolves the imbalance between positive and negative samples. Experimental results on binary segmentation datasets such as DeepGlobe road segmentation and Crack Forest defect detection show that the model reaches a mean F1 of 84.69% and a mean IoU of 73.95%, with a segmentation speed as high as 94 FPS, far exceeding mainstream semantic segmentation models and greatly improving the efficiency of the segmentation task. (A hedged sketch of the reshaping-based upsampling idea follows this entry.)
      Keywords: binary segmentation; convolution remolding upsampling; EC&SA; PolyCE; road segmentation; defect detection
      Published: 2024-02-27
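      The abstract's "convolutional remolding upsampling" suggests reshaping encoder features directly into a full-resolution class map. Below is a minimal PyTorch sketch of that idea using a 1×1 convolution followed by pixel shuffle; the class count, the stride of 16, and the channel width are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RemoldUpsampleHead(nn.Module):
    """Hypothetical decoder-free head: a 1x1 conv remolds encoder features into
    num_classes * r * r channels, then PixelShuffle rearranges them into a
    full-resolution segmentation map. A sketch of one plausible reading of
    'convolutional remolding upsampling', not the authors' implementation."""
    def __init__(self, in_channels, num_classes=2, scale=16):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, num_classes * scale * scale, kernel_size=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, feats):
        return self.shuffle(self.proj(feats))   # (N, num_classes, H*scale, W*scale)

x = torch.randn(1, 512, 32, 32)        # assumed encoder output at 1/16 resolution
logits = RemoldUpsampleHead(512)(x)    # -> (1, 2, 512, 512)
```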
    • LÜ Huanhuan,HUANG Yucheng,ZHANG Hui,WANG Yali
      Vol. 39, Issue 2, Pages: 131-145(2024) DOI: 10.37188/CJLCD.2023-0054
      Abstract: To make full use of the spatial-spectral features contained in hyperspectral images, a semi-supervised spatial-spectral local Fisher discriminant analysis algorithm (S4LFDA) for hyperspectral image feature extraction is proposed. Given the spatial consistency of hyperspectral datasets, the pixels are first spatially reconstructed to preserve the neighbor relationships of the hyperspectral data, and spectral information divergence is introduced to reconstruct the similarity between pixels. To exploit the large number of unlabeled samples and improve performance, the fuzzy C-means clustering algorithm is used to cluster the samples and obtain pseudo-labels. A regularization term is then added to the intra-class and inter-class scatter matrices of the local FDA algorithm to maintain the consistency of the cluster structure of the unlabeled samples. Finally, the local FDA algorithm maximizes the inter-class scatter and minimizes the intra-class scatter of the labeled samples and solves for the best projection vectors. The S4LFDA algorithm not only maintains the separability of the dataset in the spectral domain but also preserves the spatial neighbor relationships of the pixels, making rational use of both labeled and unlabeled samples and improving classification performance. Experiments on the Pavia University and Indian Pines datasets achieve overall classification accuracies of 95.60% and 94.38%, respectively, effectively improving feature classification performance compared with other dimensionality reduction algorithms. (A hedged sketch of the discriminant-analysis core follows this entry.)
      Keywords: hyperspectral image; semi-supervision; spatial-spectral; discriminant analysis; feature extraction; feature classification
      Published: 2024-02-27
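      As a rough illustration of the discriminant-analysis core that S4LFDA builds on, the sketch below constructs between-class and within-class scatter matrices from (pseudo-)labeled samples and solves the generalized eigenproblem; the spatial reconstruction, spectral information divergence weighting, and fuzzy C-means pseudo-labeling described in the abstract are not reproduced, and all parameters are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def fda_projection(X, y, n_components=10, reg=1e-3):
    """Sketch of a plain FDA step: build between-class (Sb) and within-class (Sw)
    scatter matrices from labeled (or pseudo-labeled) samples and solve the
    generalized eigenproblem Sb w = lambda Sw w. Illustrative only; not the full
    S4LFDA objective."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - mean_all, mc - mean_all)
        Sw += (Xc - mc).T @ (Xc - mc)
    Sw += reg * np.eye(d)                      # regularize for numerical stability
    vals, vecs = eigh(Sb, Sw)                  # generalized symmetric eigenproblem
    order = np.argsort(vals)[::-1]             # largest discriminant ratio first
    return vecs[:, order[:n_components]]       # d x n_components projection matrix

# usage: Z = X @ fda_projection(X, labels, n_components=15)
```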
    • ZHANG Huanhuan,WANG Huiqin,WANG Ke,WANG Zhan,ZHEN Gang,HE Zhang
      Vol. 39, Issue 2, Pages: 146-156(2024) DOI: 10.37188/CJLCD.2023-0110
      Abstract: Extracting line drawings of ancient frescoes with existing edge detection methods suffers from strong noise interference and loss of information. This paper proposes a method that fuses pixel difference convolution with optimal-band selection to extract mural line drawings. The minimum noise fraction method is used to separate effective information from noise in the multispectral data of the mural, and the optimal principal-component band is selected for line-drawing extraction. To address the difficulty traditional convolution has in capturing image gradient information, pixel difference convolution is introduced to strengthen gradient cues for edge detection, and a scale enhancement module (SEM) is added to the side-output network to enrich multiscale features. Meanwhile, to tackle pixel misclassification caused by pixel-level class imbalance, a Dice loss strategy based on image similarity is designed to progressively minimize the pixel distance and obtain clear image edges, and a model fine-tuned with prior knowledge of the mural dataset is used to compensate for the limited training data. The experimental results show that the method can extract clearer line drawings in faded and noisy mural scenes; the SSIM and RMSE of the line-drawing images are better than those of other algorithms, improving by 2%~10% and 2%~4%, respectively, compared with PiDiNet. The model is also validated on the public BIPED dataset, where the ODS and OIS of the proposed method improve on PiDiNet by 0.005 and 0.007, respectively. The method can extract clear and complete line drawings of faded and deteriorated murals. (A hedged sketch of pixel difference convolution follows this entry.)
      Keywords: sketch extraction; spectral imaging; pixel difference convolution; pixel-level balancing; mural
      Published: 2024-02-27
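      Pixel difference convolution, as popularized by PiDiNet-style edge detectors, filters pixel differences instead of raw intensities. The sketch below shows the central-difference form, which reduces to an ordinary convolution minus a 1×1 convolution with the summed kernel weights; kernel sizes and initialization are illustrative, not the paper's exact layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CentralPixelDiffConv(nn.Module):
    """Sketch of a central pixel-difference convolution: each weight acts on the
    difference between a neighbour and the centre pixel, i.e.
    sum_i w_i * (x_i - x_c) = conv(x, W) - x_c * sum(W)."""
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.1)
        self.padding = padding

    def forward(self, x):
        vanilla = F.conv2d(x, self.weight, padding=self.padding)
        # 1x1 convolution with the summed kernel weights approximates x_c * sum(W)
        center = F.conv2d(x, self.weight.sum(dim=(2, 3), keepdim=True))
        return vanilla - center

edge_feat = CentralPixelDiffConv(3, 16)(torch.randn(1, 3, 64, 64))  # -> (1, 16, 64, 64)
```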
    • PENG Yanfei,WANG Jing,LIU Xiaoxuan,GONG Shengjie
      Vol. 39, Issue 2, Pages: 157-167(2024) DOI: 10.37188/CJLCD.2023-0104
      Abstract: Aiming at the poor visual quality and slow processing speed of existing image retargeting methods, a content-aware image retargeting method based on principal component analysis and blocking is proposed. First, principal component analysis is used to fuse the gradient map and the saliency map, extracting richer image features and avoiding distortion of the main content. Then, adjacent seams are replaced by their mean value to avoid pixel incoherence. Finally, according to the column energy values in the energy map, the image is divided into salient and non-salient regions, and the blocks are scaled in parallel, which focuses on image features and improves operating efficiency. Experiments are carried out on the MIT RetargetMe, DUT-OMRON and NJU2000 datasets, using subjective perception together with running time and SIFT-flow as objective evaluation indicators, and the method is compared with several commonly used algorithms. The results show that the method preserves the integrity of the image subject, and its average running time is 1/3 that of the seam carving algorithm. The proposed method not only produces better visual results but also reduces computational complexity. (A hedged sketch of the PCA-based map fusion follows this entry.)
      Keywords: principal component analysis; energy map; blocking; seams; scaling
      Published: 2024-02-27
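      One simple way to realize "PCA fusion" of a gradient map and a saliency map is to treat the two maps as two variables per pixel and weight them by the loadings of the first principal component, as sketched below; this is an assumed reading of the fusion step, not the authors' exact procedure.

```python
import numpy as np

def pca_fuse(gradient_map, saliency_map):
    """Fuse two maps by the loadings of the first principal component of their
    joint pixel distribution. Simplified illustration of PCA-based fusion."""
    g = (gradient_map - gradient_map.mean()) / (gradient_map.std() + 1e-8)
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-8)
    data = np.stack([g.ravel(), s.ravel()])           # 2 x N observation matrix
    eigvals, eigvecs = np.linalg.eigh(np.cov(data))   # 2 x 2 covariance spectrum
    w = np.abs(eigvecs[:, np.argmax(eigvals)])        # loadings of the 1st component
    w = w / w.sum()
    return w[0] * gradient_map + w[1] * saliency_map  # fused energy map

energy = pca_fuse(np.random.rand(240, 320), np.random.rand(240, 320))
```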

      Image Enhancement

    • ZHAO Zewei,CHE Jin,LÜ Wenhan
      Vol. 39, Issue 2, Pages: 168-179(2024) DOI: 10.37188/CJLCD.2023-0076
      Abstract: To solve the problem that the text encoder cannot mine text information deeply in text-to-image generation, which leads to semantic inconsistency in the generated images, a text-to-image generation method based on an improved DMGAN model is proposed. Firstly, the XLNet pre-trained model is used to encode the text. Pre-trained on a large-scale corpus, this model captures a large amount of prior knowledge from the text and mines contextual information deeply. Then, a channel attention module is added to the initial image-generation stage and the image-refinement stage of the DMGAN model to highlight important feature channels, further improving the semantic consistency and spatial layout of the generated images as well as the convergence speed and stability of the model. Experimental results show that, compared with the original DMGAN model, images generated by the proposed model on the CUB dataset gain 0.47 in the IS index and drop 2.78 in the FID index, which fully indicates that the model has better cross-modal generation ability. (A hedged sketch of a channel attention module follows this entry.)
      Keywords: text-to-image; XLNet model; generative adversarial networks; channel attention
      Published: 2024-02-27
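      A channel attention module of the squeeze-and-excitation kind is one standard way to "highlight important feature channels"; the minimal PyTorch sketch below shows the idea, with the reduction ratio and placement being assumptions rather than the exact module used in the improved DMGAN.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention: global average pooling,
    a two-layer bottleneck MLP, and a sigmoid gate over channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (N, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))          # global average pool -> (N, C)
        return x * w.unsqueeze(-1).unsqueeze(-1) # reweight channels

feat = ChannelAttention(64)(torch.randn(2, 64, 32, 32))
```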
    • LIAO Yonghui,ZHANG Haitao,JIN Haibo
      Vol. 39, Issue 2, Pages: 180-191(2024) DOI: 10.37188/CJLCD.2023-0107
      Abstract: Current hierarchical text-to-image generation methods use only up-sampling for feature extraction in the initial image-generation stage, yet up-sampling is essentially a convolutional operation, and the limitations of convolution cause global information to be ignored and prevent remote semantics from interacting. Although some methods add self-attention mechanisms to the models, problems such as missing image details and structural errors remain. To address these problems, a generative adversarial network model, SAF-GAN, based on self-supervised attention and image feature fusion is proposed. A self-supervised module based on CotNet is added to the initial feature-generation stage, using an attention mechanism for autonomous mapping learning between image features. The dynamic attention matrix is guided by the contextual relationships of the features, tightly combining context mining with self-attention learning and improving feature generation for low-resolution images; high-resolution images are then refined and generated through alternating training of the networks at different stages. At the same time, a feature fusion enhancement module is added: by fusing the low-resolution features of the previous stage with the features of the current stage, the generation network can make full use of the rich semantic information of the earlier features and the high-resolution detail of the later features, further guaranteeing the semantic consistency of feature maps at different resolutions and achieving realistic high-resolution image generation. Experimental results show that, compared with the baseline model (AttnGAN), SAF-GAN increases the IS score by 0.31 and decreases the FID index by 3.45 on the CUB dataset, and increases the IS score by 2.68 and decreases the FID index by 5.18 on the COCO dataset. The proposed model can effectively generate more realistic images, which proves the effectiveness of the method. (A hedged sketch of the feature-fusion step follows this entry.)
      Keywords: computer vision; generative adversarial networks; text-to-image; CotNet; image feature fusion
      Published: 2024-02-27
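      The feature fusion enhancement step described above can be illustrated by upsampling the previous stage's low-resolution features and mixing them with the current stage's features, as in the sketch below; the channel widths and nearest-neighbor upsampling are assumptions, not the exact SAF-GAN design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusionEnhance(nn.Module):
    """Upsample previous-stage features, concatenate with current-stage features,
    and mix with a 3x3 convolution. Illustrative fusion sketch only."""
    def __init__(self, prev_ch, cur_ch):
        super().__init__()
        self.mix = nn.Sequential(
            nn.Conv2d(prev_ch + cur_ch, cur_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(cur_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, prev_feat, cur_feat):
        prev_up = F.interpolate(prev_feat, size=cur_feat.shape[2:], mode="nearest")
        return self.mix(torch.cat([prev_up, cur_feat], dim=1))

fused = FeatureFusionEnhance(128, 64)(torch.randn(1, 128, 32, 32),
                                      torch.randn(1, 64, 64, 64))
```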

      Object Tracking and Recognition

    • HE Zemin,ZENG Juntao,YUAN Baoxi,LIANG Dejian,MIAO Zongcheng
      Vol. 39, Issue 2, Pages: 192-204(2024) DOI: 10.37188/CJLCD.2023-0113
      Abstract: In computer vision, Siamese-network-based tracking algorithms improve accuracy and speed over traditional algorithms, but they are still affected by target occlusion, deformation and environmental changes, which degrade their performance. To gain an in-depth understanding of Siamese-network-based single-target tracking, existing Siamese tracking algorithms are summarized and analyzed along three lines: introducing attention mechanisms, hyper-parameter inference, and template updating. The survey reviews the tracking algorithms in each of these three categories and details the recent research and development of Siamese-network-based algorithms at home and abroad. Representative algorithms from the three categories are compared experimentally on the VOT2016, VOT2017, VOT2018 and OTB-2015 datasets to obtain the performance of multiple Siamese-network-based trackers. Finally, Siamese-network-based target tracking algorithms are summarized and future development directions are discussed. (A hedged sketch of the cross-correlation operation at the core of such trackers follows this entry.)
      Keywords: computer vision; target tracking; Siamese networks; deep learning
      Published: 2024-02-27
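      The operation shared by SiamFC-style Siamese trackers is a cross-correlation between the template embedding and the search-region embedding; the sketch below shows only that step, with backbone feature extraction omitted and feature shapes chosen for illustration.

```python
import torch
import torch.nn.functional as F

def siamese_xcorr(template_feat, search_feat):
    """Use the template embedding as a convolution kernel over the search-region
    embedding; the response-map peak indicates the target location."""
    # template_feat: (1, C, Ht, Wt), search_feat: (1, C, Hs, Ws)
    return F.conv2d(search_feat, template_feat)   # -> (1, 1, Hs-Ht+1, Ws-Wt+1)

response = siamese_xcorr(torch.randn(1, 256, 6, 6), torch.randn(1, 256, 22, 22))
print(response.shape)  # torch.Size([1, 1, 17, 17])
```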
    • YANG Yun,WANG Jing,JIANG Jiale
      Vol. 39, Issue 2, Pages: 205-216(2024) DOI: 10.37188/CJLCD.2023-0123
      Abstract: A deep-learning-based license plate number recognition method is proposed to address the low accuracy and missed detections of license plate recognition in hazy weather. Firstly, the AOD-Net algorithm is used to dehaze the vehicle image. Then, a license plate detection network, ACG_YOLOv5s, is designed based on the YOLOv5 network. ACG_YOLOv5s integrates the CBAM attention mechanism into the YOLOv5s network to improve the model's resistance to interference, and introduces adaptively spatial feature fusion (ASFF), which assigns weights to different feature layers according to weights the model learns adaptively, thereby highlighting important feature information. Traditional convolution is replaced with the Ghost convolution module, reducing the number of parameters during training while preserving model performance. Finally, LPRNet is used to recognize the detected license plate images. The experimental results indicate that the improved ACG_YOLOv5s network achieves a license plate detection accuracy of 99.6% and an LPRNet recognition accuracy of 96% with a small memory footprint. Combining the AOD-Net algorithm with the YOLO algorithm detects license plate numbers in hazy images more effectively. (A hedged sketch of the CBAM attention idea follows this entry.)
      Keywords: license plate number recognition; AOD-Net algorithm; YOLOv5 network; attention mechanism
      Published: 2024-02-27
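      CBAM, referenced in the abstract, chains channel attention (from average- and max-pooled descriptors) with spatial attention (a convolution over channel-pooled maps). The compact sketch below illustrates that structure; the reduction ratio and kernel size are typical defaults, not necessarily the ACG_YOLOv5s configuration.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Compact CBAM sketch: channel attention from avg/max pooled vectors through a
    shared MLP, then spatial attention from channel-wise avg/max maps."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)   # channel attention
        sp = torch.cat([x.mean(dim=1, keepdim=True),
                        x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(sp))                    # spatial attention

out = CBAM(64)(torch.randn(1, 64, 40, 40))
```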
    • FU Huichen,GAO Junwei,CHE Luyang
      Vol. 39, Issue 2, Pages: 217-227(2024) DOI: 10.37188/CJLCD.2023-0127
      Abstract: Human pose estimation and motion recognition have important applications in security, medical treatment and sports. To address human pose estimation and motion recognition for various movements against complex backgrounds, an improved YOLOv7-POSE algorithm is proposed, trained on self-built datasets covering various shooting angles. Based on YOLOv7, the algorithm adds a classification branch to the original network model. The CA convolutional attention mechanism is introduced into the backbone network, improving the recognition of features important for classifying human skeleton keypoints and actions. The CBS convolution blocks of the original model are replaced with the HorNet network structure, improving the detection accuracy of human keypoints and the accuracy of action classification. The pyramid structure in the Head layer is replaced with atrous spatial pyramid pooling, which improves precision and speeds up model convergence. The CIoU regression loss for the detection box is replaced with EIoU, improving the precision of coordinate regression. Datasets of bodybuilding movements against complex backgrounds and from various shooting angles were captured by the authors, and comparison experiments are carried out on this self-made dataset. Experimental results show that the mAP of the improved YOLOv7-POSE on the test set is 95.7%, 4% higher than that of the original YOLOv7 algorithm; the recognition accuracy of all movement classes increases significantly, and missed and erroneous keypoint detections decrease markedly. (A hedged sketch of the EIoU loss follows this entry.)
      Keywords: image processing; key point detection; pose estimation; convolutional attention mechanism; atrous spatial pyramid pooling
      Published: 2024-02-27
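      The EIoU box-regression loss mentioned above adds explicit width and height penalties to the IoU and centre-distance terms; a sketch for boxes in (x1, y1, x2, y2) format is given below. It follows the commonly cited EIoU formulation and may differ in detail from the paper's implementation.

```python
import torch

def eiou_loss(pred, target, eps=1e-7):
    """EIoU sketch: 1 - IoU + centre-distance penalty + width/height penalties,
    each normalised by the enclosing box. pred/target: (N, 4) as (x1, y1, x2, y2)."""
    ix1, iy1 = torch.max(pred[:, 0], target[:, 0]), torch.max(pred[:, 1], target[:, 1])
    ix2, iy2 = torch.min(pred[:, 2], target[:, 2]), torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # smallest enclosing box
    cw = (torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])).clamp(min=eps)
    ch = (torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])).clamp(min=eps)
    # centre-distance, width and height differences
    dx = (pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) / 2
    dy = (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) / 2
    dw = (pred[:, 2] - pred[:, 0]) - (target[:, 2] - target[:, 0])
    dh = (pred[:, 3] - pred[:, 1]) - (target[:, 3] - target[:, 1])
    return 1 - iou + (dx**2 + dy**2) / (cw**2 + ch**2) + dw**2 / cw**2 + dh**2 / ch**2
```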
    • WANG Yin,JIANG Zheng,LIU Bin
      Vol. 39, Issue 2, Pages: 228-236(2024) DOI: 10.37188/CJLCD.2023-0085
      Abstract: Aiming at the high complexity of the traditional SIFT matching algorithm, its many redundant feature points and its difficulty in meeting real-time requirements, this paper proposes a fast SIFT image matching algorithm with a locally adaptive threshold. Building on SIFT, the method optimizes the construction of the Gaussian pyramid, eliminating redundant feature points by reducing the number of pyramid layers to improve detection efficiency. The threshold of the FAST detector is derived from the local contrast of the image to achieve high-quality feature point detection, and feature points with strong robustness are screened out for more accurate matching. Secondly, a Gaussian circular window is used to build a reduced 32-dimensional feature descriptor, improving the computational efficiency of the algorithm. Finally, the feature points are purified according to the geometric consistency between matching pairs, effectively reducing false matches. The experimental results show that the proposed method outperforms the SIFT algorithm and the other matching algorithms compared in overall matching accuracy and computational efficiency: matching accuracy improves by about 10% and execution time is shortened by about 49% compared with traditional SIFT, and the correct matching rate stays above 93% under changes in image scale, rotation and lighting. (A hedged sketch of such a pipeline with standard OpenCV calls follows this entry.)
      Keywords: SIFT algorithm; Gaussian pyramid; adaptive threshold; feature descriptor; image matching
      Published: 2024-02-27
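      A pipeline in the spirit of the described method, built from standard OpenCV calls, is sketched below: FAST keypoints with a contrast-derived threshold, SIFT descriptors, ratio-test matching, and RANSAC-based geometric purification. The threshold heuristic is an assumption, and the paper's reduced 32-dimensional descriptor is not reproduced.

```python
import cv2
import numpy as np

def match_with_adaptive_fast(img1, img2, ratio=0.75):
    """Sketch only: adaptive-threshold FAST detection, SIFT description,
    ratio-test matching, and RANSAC homography purification."""
    sift = cv2.SIFT_create()

    def detect(img):
        thr = max(5, int(img.std() * 0.5))                 # contrast-based threshold (assumed heuristic)
        fast = cv2.FastFeatureDetector_create(threshold=thr)
        kps = fast.detect(img, None)
        return sift.compute(img, kps)                      # (keypoints, descriptors), kept aligned

    kps1, des1 = detect(img1)
    kps2, des2 = detect(img2)
    good, pts1, pts2 = [], [], []
    for m, n in cv2.BFMatcher().knnMatch(des1, des2, k=2):
        if m.distance < ratio * n.distance:                # Lowe's ratio test
            good.append(m)
            pts1.append(kps1[m.queryIdx].pt)
            pts2.append(kps2[m.trainIdx].pt)
    # geometric-consistency purification with a RANSAC homography
    _, mask = cv2.findHomography(np.float32(pts1), np.float32(pts2), cv2.RANSAC, 3.0)
    return [g for g, keep in zip(good, mask.ravel()) if keep]
```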