Latest Issue

    Vol. 40, Issue 11, 2025

      Material Physics

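    • Synthesis strategies for soft-matter quasicrystals

      Recent research shows that soft-matter quasicrystals form through molecular self-assembly, which greatly reduces synthesis difficulty and opens new routes for functional material design. The authors propose two construction strategies as a reference for the rational design and application of soft quasicrystals.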
      LIU Yaohong, XIE Mingsi, CHEN Peng, LIU Heng, JIA Gaojun, ZHANG Chunxiu, YU Haifeng
      Vol. 40, Issue 11, Pages: 1569-1587(2025) DOI: 10.37188/CJLCD.2025-0169
      Abstract: Soft matter quasicrystals are realized through molecular self-assembly, which confers unique advantages in design and regulation. This approach not only significantly reduces synthesis complexity but also offers novel pathways for achieving multiscale order and functionalized materials. To date, quasicrystalline phases have been observed in soft matter systems such as dendrimers, ABC three-arm star polymers, block copolymers, surfactants, and binary colloidal nanoparticles, predominantly forming dodecagonal axisymmetric quasicrystals. Research on soft matter quasicrystals intersects with supramolecular chemistry, organic chemistry, colloid chemistry, polymer chemistry, and soft matter physics. The formation of soft matter quasicrystals relies on sophisticated molecular design and the regulation of hierarchical self-assembled structures. As novel self-assembled systems, their unique non-periodic ordered structures offer fresh perspectives for functional material design. Although a few quasicrystals have been discovered in block copolymers and dendritic macromolecules, current research remains largely focused on structural analysis of serendipitously obtained phases. The rational design and synthesis of quasicrystals from first principles remains a major challenge in the field. Building on insights from metallic quasicrystals, there is an urgent need to establish new research paradigms for soft matter quasicrystals. Such paradigms could significantly accelerate the rational design and synthesis of thermodynamically stable or long-lived metastable soft quasicrystalline structures, thus providing new material platforms for applications in diverse fields. This paper systematically reviews the fundamental concepts and research background of quasicrystals. Building on a summary of the conditions for soft matter quasicrystal formation and typical case studies, it focuses on construction strategies for soft matter quasicrystals. Two approaches for designing soft matter quasicrystals are proposed: an empirical method based on approximating quasicrystal analogues and a theoretical approach based on computational prediction through reverse synthesis. These strategies provide a reference for the rational design and functional applications of soft quasicrystals.
      Keywords: soft matter quasicrystals; self-assembly; synthesis; twelvefold rotational symmetry; quasiperiodic structure
      Updated: 2025-12-12
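    • Recent research uses organically modified Laponite nanosheets to optimize the barrier and adhesion properties of ultra-thin encapsulation films, offering a new approach for flexible encapsulation technology.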
      LI Xuyang, ZHANG Jinhui, XU Haifei, CHENG Jin, CHEN Zuxin, LIANG Haifeng, ZHU Xueliang, XUE Jianshe, YUAN Guangcai, YU Zhinong
      Vol. 40, Issue 11, Pages: 1588-1596(2025) DOI: 10.37188/CJLCD.2025-0202
      Abstract: Commercially applied inorganic/organic composite encapsulation films suffer from loss of flexibility and an increased risk of side penetration, issues that often result from overly thick organic layers. To address them, this study investigates strategies for synergistically optimizing the barrier properties and interlayer adhesion of thinned organic layers. We constructed a thinned "sandwich-structured" composite film by incorporating synthesized lithium magnesium silicate (Laponite) nanosheets, both pristine and modified with cetyltrimethylammonium chloride (CTAC), into a polymethyl methacrylate (PMMA) matrix, which was then laminated with silicon nitride (SiNx) layers. The results demonstrate that the incorporated nanosheets significantly enhance the water vapor barrier performance by creating a tortuous diffusion path. Compared with the pristine Laponite system, the CTAC-modified nanosheets, owing to their hydrophobic surface and superior dispersion within the PMMA matrix, led to a remarkable improvement in the overall performance of the CTAC-Laponite-PMMA composite. The resulting film exhibited an ultra-low surface roughness of 0.312 nm, ensuring excellent interfacial adhesion with the adjacent inorganic layers. Consequently, the water vapor transmission rate (WVTR) of the entire sandwich structure was further reduced to 5.90×10⁻⁵ g/(m²·day). This work demonstrates a novel approach to co-optimizing barrier properties and interfacial adhesion in thinned organic layers via organically modified nanosheets, offering a promising pathway for developing high-performance, ultra-thin flexible encapsulation.
      Keywords: thin film encapsulation; nanosheets; water vapor transmission rate; organically modified
      Updated: 2025-12-12
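
      The tortuous-path effect mentioned in the encapsulation abstract above can be illustrated with the classical Nielsen model. This is a standard approximation and not necessarily the analysis used by the authors. The sketch assumes impermeable platelets aligned parallel to the film surface; the function name, aspect ratio, and volume fraction are hypothetical and chosen only for illustration.

```python
def nielsen_relative_permeability(volume_fraction: float, aspect_ratio: float) -> float:
    """Relative permeability P/P0 of a platelet-filled film (Nielsen model).

    volume_fraction: filler volume fraction phi, with 0 <= phi < 1
    aspect_ratio: platelet width divided by platelet thickness (assumed value)
    """
    if not 0.0 <= volume_fraction < 1.0:
        raise ValueError("volume_fraction must be in [0, 1)")
    if aspect_ratio <= 0.0:
        raise ValueError("aspect_ratio must be positive")
    # Diffusing molecules must detour around the platelets, which lengthens
    # the path by the tortuosity factor 1 + (aspect_ratio / 2) * phi.
    tortuosity = 1.0 + (aspect_ratio / 2.0) * volume_fraction
    # The platelets also remove a fraction phi of the permeable matrix.
    return (1.0 - volume_fraction) / tortuosity


if __name__ == "__main__":
    # Hypothetical inputs: 4 vol% of platelets with aspect ratio 25.
    print(nielsen_relative_permeability(0.04, 25.0))
```

      In this model, a higher aspect ratio or a better-dispersed filler lowers the relative permeability, which is consistent with the qualitative trend the abstract reports for the CTAC-modified nanosheets.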

      Device Physics and Device Preparation

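    • High-resolution liquid crystal spatial light modulators driven by high-frequency pulses are widely used in holographic imaging, near-eye AR/VR, optical communication, and other fields. However, insufficient fast-response performance and output jitter of the liquid crystal cell remain technical challenges. Existing studies mostly focus on static or slowly varying low-frequency electric fields, and the complex relationship between high-frequency pulse driving and switching dynamics is not yet clear.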
      GE Boqun, HU Yongbing, WEI Sui, SHEN Chuan, HAO Xuefeng, ZHONG Zhaohan
      Vol. 40, Issue 11, Pages: 1597-1605(2025) DOI: 10.37188/CJLCD.2025-0168
      Abstract: High-resolution liquid crystal spatial light modulators driven by high-frequency pulses are widely used in holographic imaging, near-eye AR/VR, optical communication, and other fields, but the insufficient fast-response performance and output jitter of the liquid crystal cell remain pressing technical problems. Existing studies are mostly limited to static or slowly varying low-frequency electric fields, and the complex relationship between high-frequency pulse driving and switching dynamics is not clear. Based on the Ericksen-Leslie dynamic theory of nematic liquid crystals, this paper discusses and simulates the characteristic curves of the governing equations in the inertia-dominated and viscosity-dominated regimes. In the experiment, a parallel-aligned liquid crystal cell with a thickness of approximately 1 μm was driven by rectangular pulses from 1 kHz to 10 kHz, and the relaxation process of the liquid crystal was observed by the light intensity method. The results show that after the driving field is turned off, the relaxation of the nematic liquid crystal deformation is jointly governed by inertial and viscous properties. Based on this finding, we conclude that the proposed model provides a more precise interpretation of the switching dynamics of liquid crystal cells, which will facilitate the optimization of liquid crystal spatial light modulators. This work is also expected to promote deeper investigation of the switching dynamics of liquid crystal cells under high-frequency pulse driving.
      Keywords: liquid crystal spatial light modulator; high-frequency pulse drive; Ericksen-Leslie kinetic theory; director switching dynamics; inertia and viscosity effects
      Updated: 2025-12-12

      Display Technology

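    • Recent research overcomes the performance limitations of augmented reality optical display systems, providing a highly efficient solution for next-generation AR devices.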
      XIE Yue, TIAN Zhongtao, ZHENG Baorong, LIU Zhengbiao, ZHANG Qinbo, SUN Xiaowei
      Vol. 40, Issue 11, Pages: 1606-1614(2025) DOI: 10.37188/CJLCD.2025-0157
      Abstract: Augmented reality (AR) is emerging as the next-generation computing platform after mobile computing; however, its core optical display systems still face significant challenges in field of view (FOV), full-color reproduction, and lightweight design. This paper presents an optical metasurface-based AR waveguide technology that precisely manipulates light fields using subwavelength structures, with the aim of systematically addressing the inherent limitations of conventional diffractive waveguide solutions. The paper first analyzes the current technological landscape and commercialization progress of the AR market, identifies existing constraints, and examines the core physical principles and design methodologies of metasurface waveguides. To facilitate industrialization, it introduces an integrated metasurface waveguide design and manufacturing platform. The crucial role of high-directionality light sources in the evolution of AR displays is also highlighted. The results indicate that metasurface waveguides can effectively overcome the performance limitations of traditional diffractive waveguides, demonstrating strong potential for expanding FOV, achieving high-fidelity full-color displays, and significantly reducing device weight. This offers a highly efficient, integrated solution for next-generation AR display systems. Finally, the paper explores the strategic implications of this technological pathway for the future spatial computing industry and anticipates its foundational role in the widespread adoption and diverse applications of AR devices.
      Keywords: augmented reality; optical waveguide; metasurface; field of view; full-color; lightweight
      Updated: 2025-12-12
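    • In the digital economy era, future display technology will integrate bionic engineering to achieve multifunctional, intelligent development. The authors introduce the concept of bioinspired displays, offering new ideas for the development of display technology.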
      ZHOU Xiongtu, YANG Weiquan, PENG Yuyan, ZHANG Jiawei, ZHANG Yongai, WU Chaoxing, GUO Tailiang
      Vol. 40, Issue 11, Pages: 1615-1635(2025) DOI: 10.37188/CJLCD.2025-0149
      Abstract: With the advent of the digital economy era, future display technologies are expected to develop along several lines at once: multi-functionality, intelligence, high presence, and personalization. In these advanced display devices, biomimetic methods can significantly enhance human visual capabilities, thereby promoting innovations in information display functions, spectral extension, threshold expansion, form factor, and more. Consequently, the deep integration of display technology with bionic engineering is anticipated to better realize the vision of clear, distant, and realistic viewing experiences, offering vast prospects for development. This paper introduces the concept of bioinspired displays. It begins by summarizing the bioinspired strategies currently employed in information display devices from two perspectives: mimicking human attributes and mimicking other organisms. Next, it explores the anticipated development trends and ideas for bioinspired displays. Finally, it recommends focusing on four key research and development tasks: bioinspired display materials and devices; bioinspired display device and system integration technology; bioinspired information processing and human-computer interaction technology; and visual perception and health-oriented display evaluation.
      Keywords: bioinspired displays; future displays; near-eye display; artificial intelligence; visual enhancement
      Updated: 2025-12-12
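    • In the field of new displays, AMOLED pixel driving technology has advanced rapidly, from the 2T1C architecture to the LTPO wide-frequency driving architecture and then to the OTD architecture, significantly reducing dynamic display power consumption and opening a new technical path for low-cost, low-power AMOLED displays.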
      ZHOU Lei, LIU Jiajie, XU Miao, WU Weijing, PENG Junbiao
      Vol. 40, Issue 11, Pages: 1636-1646(2025) DOI: 10.37188/CJLCD.2025-0151
      Abstract: Active-matrix organic light-emitting diode (AMOLED) display technology has become a mainstream solution in next-generation displays due to its high contrast ratio, rapid response, and flexibility. The pixel driving architecture critically determines display quality and energy efficiency. This paper systematically reviews the technical evolution of AMOLED pixel driving circuits: from the early simplified 2T1C architecture, to pipelined compensation architectures (e.g., 6T1C/7T1C) that enable real-time internal compensation of the threshold voltage (Vth); then to LTPO wide-frequency driving architectures, which leverage hybrid low-temperature polycrystalline silicon and oxide (LTPS and oxide) TFT backplane technology to support adaptive refresh rate adjustment (1 to 120 Hz), significantly reducing dynamic power consumption; and finally to the One-Time Driving (OTD) architecture, which adopts nonvolatile memory and an in-memory computing design. The OTD approach decouples Vth latching from data refreshing, reducing the compensation frequency to 1/N of the data refresh rate (e.g., N=20, compensating Vth once every 20 frames), thereby cutting dynamic power consumption by more than 50% at high refresh rates. The paper thoroughly analyzes the circuit principles, timing diagrams, and power models of these architectures. It highlights that integrating high-mobility rare-earth-doped oxide TFTs (e.g., Ln-IZO) with the OTD architecture will pave a new technical path for low-cost, low-power AMOLED displays.
      Keywords: AMOLED pixel driving circuit; threshold voltage compensation; LTPO technology; one-time driving (OTD) architecture; dynamic power consumption
      Updated: 2025-12-12
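
      The 1/N compensation schedule described in the OTD abstract above can be made concrete with a little arithmetic. The sketch below only counts how often Vth compensation would run; it does not model power. The function names are hypothetical, and placing the first compensation on frame 0 is an assumption made for illustration.

```python
def compensation_rate_hz(refresh_rate_hz: float, n_frames: int) -> float:
    """Vth compensation frequency when compensation runs once every n_frames frames."""
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    return refresh_rate_hz / n_frames


def compensation_frames(total_frames: int, n_frames: int) -> list[int]:
    """Indices of the frames on which Vth is compensated.

    Assumption: compensation happens on frame 0 and then every n_frames
    frames; the paper's actual timing may differ.
    """
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    return [frame for frame in range(total_frames) if frame % n_frames == 0]


if __name__ == "__main__":
    # With N = 20 (the example in the abstract) at a 120 Hz refresh rate,
    # compensation runs 6 times per second instead of 120 times.
    print(compensation_rate_hz(120.0, 20))
    print(compensation_frames(60, 20))
```

      All other frames only refresh data, which is the decoupling of Vth latching from data refreshing that the abstract describes.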

      Image Processing

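    • In remote sensing image super-resolution reconstruction, the TRCDSR network uses a two-stage residual conditional diffusion mechanism to improve reconstruction quality and computational efficiency.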
      BU Lijing, CHEN Xiangxue, ZHANG Zhengpeng, WU Jun
      Vol. 40, Issue 11, Pages: 1647-1660(2025) DOI: 10.37188/CJLCD.2025-0158
      Abstract: When a traditional diffusion model is used for super-resolution reconstruction of remote sensing images, it suffers from insufficient use of prior conditions, lengthy sampling, and poor recovery of high-frequency details. In this paper, we propose a two-stage residual conditional diffusion super-resolution network (TRCDSR). The first stage generates preliminary super-resolution results with a pre-trained lightweight CNN model to provide a high-quality structural prior for the diffusion model. In the second stage, we introduce a residual conditional diffusion mechanism, which takes the residual signals as input so that the noise prediction network focuses on reconstructing high-frequency details. By improving the DDIM inverse sampling formula, the residual correction process is decoupled into a deterministic prediction term and a random noise term, and high-quality reconstruction is completed in 20 to 50 steps. A multi-scale prior condition enhancement module (PCEM) and a fused spatial and channel attention mechanism (FAN) are further introduced to enhance the model's adaptability to complex remote sensing scenes. Experiments on several remote sensing datasets, including AID, SECOND, and RSSCN, show that TRCDSR outperforms other diffusion models as well as GAN-based and Transformer-based methods in reconstruction quality, computational efficiency, and generalization ability.
      Keywords: diffusion model; remote sensing super-resolution reconstruction; residual network; enhancement of a priori conditions
      Updated: 2025-12-12
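
      The TRCDSR abstract above mentions splitting a DDIM sampling step into a deterministic prediction term and a random noise term. The sketch below shows that split for the standard DDIM update on a single scalar value; it is not the paper's improved formula, which the abstract does not give. The function name and all numeric values are hypothetical.

```python
import math
import random


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev, sigma=0.0, rng=None):
    """One standard DDIM update from step t to the previous step (scalar sketch).

    alpha_bar_t and alpha_bar_prev are cumulative noise-schedule products in (0, 1].
    sigma = 0 gives the fully deterministic sampler.
    """
    # Clean-sample estimate implied by the predicted noise.
    x0_pred = (x_t - math.sqrt(1.0 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_bar_t)
    # Deterministic prediction term: re-noise the estimate to the previous level.
    direction = math.sqrt(max(1.0 - alpha_bar_prev - sigma ** 2, 0.0)) * eps_pred
    deterministic = math.sqrt(alpha_bar_prev) * x0_pred + direction
    # Random noise term: present only when sigma > 0.
    noise = sigma * (rng or random).gauss(0.0, 1.0) if sigma > 0.0 else 0.0
    return deterministic + noise


if __name__ == "__main__":
    # Hypothetical values: with the exact noise and sigma = 0, the step lands
    # on the same clean sample re-noised at the previous level.
    x0, eps = 0.8, -0.3
    x_t = math.sqrt(0.5) * x0 + math.sqrt(0.5) * eps
    print(ddim_step(x_t, eps, alpha_bar_t=0.5, alpha_bar_prev=0.9))
```

      Because the deterministic term carries most of the update, samplers of this kind can skip timesteps, which is how step counts in the tens (the abstract reports 20 to 50) become feasible.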
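    • In remote sensing image change detection, the authors propose DDSE-Net, which effectively improves detection performance and offers a new solution to information loss and pseudo-change interference.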
      XIONG Jing, DONG Ting, GUAN Zongsheng
      Vol. 40, Issue 11, Pages: 1661-1674(2025) DOI: 10.37188/CJLCD.2025-0155
      Abstract: In remote sensing image change detection, continuous downsampling often results in severe information loss and pseudo-change interference. To address these issues, we propose the dual-domain difference and scale selection enhanced change detection network (DDSE-Net), which is built on a dual-branch encoder and a U-Net architecture. The proposed framework incorporates three key innovations: (1) A dual-domain joint difference enhancement module (DJEM), which first employs channel attention to strengthen difference representations along the channel dimension, and then integrates the wavelet transform with spatial attention to achieve enhancement in both the spatial and frequency domains. This design effectively emphasizes true change information while suppressing pseudo changes. (2) A scale selection enhancement downsampling module (SEDM), which captures multi-scale features through convolution and pooling operations with varying receptive fields during downsampling. The extracted features are subsequently refined with spatial and channel attention, thereby alleviating information loss. (3) A gated difference perception module (GDPM), which introduces a gating mechanism to adaptively weight and fuse multi-scale change features, enabling more comprehensive integration and enhancing the network's multi-scale representation capability. The proposed DDSE-Net achieves F1-score improvements of at least 4.48%, 2.18%, and 1.16% on the WHU, Google, and LEVIR datasets, respectively, compared with eight mainstream change detection networks (FC-EF, FC-Conc, IFN, SNUNet, BIT, MSCANet, LightCDNet, and STADE-CDNet), demonstrating its effectiveness.
      Keywords: remote sensing change detection; dual-domain joint difference enhancement; scale selection enhancement; gated difference perception
      Updated: 2025-12-12
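    • For autonomous driving and embodied intelligence, the researchers propose a fast 3D weak-feature object detection method enhanced by local context, which effectively improves detection accuracy for distant, small, or occluded objects and provides a solution for 3D object detection in complex scenes.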
      ZHANG Yang, SUN Haijiang, ZHANG Xiaowen, JI Yong
      Vol. 40, Issue 11, Pages: 1675-1687(2025) DOI: 10.37188/CJLCD.2025-0145
      Abstract: 3D object detection has extensive applications in autonomous driving and embodied intelligence, yet it struggles with poor discrimination and high detection difficulty for weak-feature objects in scenes, such as distant, small, or occluded objects. To address this, this paper proposes a fast 3D weak-feature object detection method enhanced by local context. First, to address the sparse feature representation of weak targets, we introduce the Local Sparse Feature Enhancement Module (LSFE). This module adaptively adjusts feature weights at local spatial positions to enhance the expressive power of sparse features, thereby increasing the model's sensitivity to sparse characteristics. Second, to mitigate background interference affecting weak-feature objects, the Multi-Scale Context Learning Module (MSCL) is introduced. It integrates spatial and channel-wise attention mechanisms to acquire multi-scale contextual information and suppress background noise. Finally, to better utilize shallow-layer features, a high-resolution feature layer is added to the network's detection head, enhancing the perception of object details. Experimental results on the KITTI dataset demonstrate that our method significantly improves detection accuracy for weak-feature objects compared with baseline approaches: mAP increases by 12.78% for Pedestrians, 2.69% for Cyclists, and 6.84% for Cars. Our method achieves high-precision detection while maintaining real-time inference speed, providing an effective solution for 3D object detection in complex scenes.
      Keywords: autonomous driving; point cloud data; 3D object detection; weak feature target detection; local context learning
      Updated: 2025-12-12
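    • In aviation early warning, infrared dim and small target detection has seen new progress: the authors propose a detection method based on a spatio-temporal three-dimensional convolutional network that effectively improves recall and average precision.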
      LI Shigang, WANG Weijia, ZHU Shengjie, LIANG Zhongyi, MA Mingyang, WANG Dejiang, BAI Jincheng
      Vol. 40, Issue 11, Pages: 1688-1699(2025) DOI: 10.37188/CJLCD.2025-0161
      Abstract: In the field of aviation early warning, infrared dim and small target detection is crucial for long-range, all-weather battlefield perception. Infrared dim and small targets in complex backgrounds occupy few pixels and lack distinctive features, which leads to a low detection probability and a high false alarm rate. To address this problem, a detection method based on a spatio-temporal three-dimensional convolutional network is proposed. The method uses a feature extraction backbone that combines 2D and 3D convolution, fusing spatial texture features with inter-frame motion features to achieve collaborative perception of target structure and temporal change. According to the characteristics of infrared dim and small targets, a local contrast module is designed as a feature enhancement module that expands the receptive field. In addition, an asymmetric attention mechanism is introduced for feature fusion to better preserve texture and positional information. Finally, a point regression loss function is used to compute the detection results. Experiments were carried out on a public dataset and a self-built dataset, both labeled and used for training and comparison. The results show that, compared with conventional multi-frame target detection networks, the improved algorithm raises recall by no less than 7.52% and average precision by no less than 6.46%. It can be effectively applied to infrared dim and small target detection in complex backgrounds and shows good robustness and adaptability.
      Keywords: infrared dim and small target; deep learning; object detection; spatio-temporal three-dimensional convolution
      Updated: 2025-12-12
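    • Photovoltaic panel defect detection technology based on improved YOLOv8n

      In solar power generation, the researchers propose the SCA-YOLOv8n detection model, which effectively improves the accuracy and reliability of photovoltaic panel defect detection.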
      DENG Wanyu, YUAN Zhaoyang
      Vol. 40, Issue 11, Pages: 1700-1709(2025) DOI: 10.37188/CJLCD.2025-0166
      Abstract: Photovoltaic panels are a core component of solar power generation systems, and defects on their surfaces can seriously affect photovoltaic conversion efficiency and service life. To address the difficulty of identifying small defects and the low contrast between defects and background in photovoltaic panel defect detection, this study proposes the SCA-YOLOv8n detection model. First, an SCConv cross-coupling module is designed to enhance the model's ability to extract multi-scale defect features while reducing redundant information through interactive reconstruction of spatial and channel features. Second, a coordinate attention (CoordAtt) mechanism is constructed to focus on defect regions along the channel and spatial dimensions and suppress background interference. Finally, a lightweight adaptive downsampling (ADown) module is embedded to replace traditional strided convolution, reducing computational complexity while minimizing the loss of feature information. The experimental results show that the improved model achieves an mAP@0.5 of 94.4%, a 2.0% improvement over the original YOLOv8n model. In addition, the number of parameters is reduced by 5.0% and GFLOPs decrease by 4.9%. These results demonstrate that the proposed improvements not only make the model more lightweight but also significantly enhance the accuracy and reliability of photovoltaic panel defect detection.
      Keywords: photovoltaic panel defect detection; YOLOv8n; SCConv; CoordAtt; ADown
      Updated: 2025-12-12
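    • Object tracking algorithm based on Transformer and tracking trajectory

      In single-object tracking, the researchers propose an algorithm based on a Transformer and tracking trajectories that effectively improves tracking performance under target occlusion and similar-object interference.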
      WANG Xin, CHEN Zhiwang, WEI Yanqiao, SUN Yixuan, PENG Yong
      Vol. 40, Issue 11, Pages: 1710-1728(2025) DOI: 10.37188/CJLCD.2025-0174
      Abstract: To address the decline in tracking performance caused by target occlusion and similar-object interference in single-object tracking, this paper proposes a target tracking algorithm based on a Transformer and tracking trajectories. The algorithm employs a vision Transformer (ViT) as its backbone network. To counter the inherent sensitivity of Transformers to background information during feature extraction, we introduce a dedicated focusing layer. This layer dynamically adjusts the attention distribution, increasing the weight of the target region while suppressing background noise. Furthermore, a hybrid attention module is designed to decouple the features of the template and search regions. Specifically, the template region uses a self-attention mechanism to reinforce target-specific features, while the search region integrates global contextual information through a cross-attention mechanism. The algorithm also incorporates a tracking trajectory-based post-processor. This module constructs a robust target trajectory from a sequence of historical tracking results and employs a Kalman filter to assess the reliability of the predicted bounding box. If the reliability of the predicted bounding box exceeds a predefined threshold, the box is output directly. Otherwise, a reverse tracking mechanism is activated for both the predicted and candidate boxes. This process generates multiple potential trajectories, whose matching degree with the established target trajectory is then calculated so that the best bounding box can be selected to refine the tracking outcome. During training, the EIoU loss function is used for bounding box regression, which further improves localization accuracy. Experimental results demonstrate that the proposed algorithm achieves an average overlap (AO) of 74.6% on the GOT-10K dataset and a precision (P) of 91.4% on the UAV123 dataset. Moreover, it exhibits superior tracking performance across various challenging benchmarks, including LaSOT, TrackingNet, and OTB100. Visualized tracking results further validate the algorithm's ability to maintain stable and accurate tracking even in complex scenarios with severe occlusion and significant interference from similar objects.
      Keywords: object tracking; attention mechanism; tracking trajectory; object occlusion; similar-object interference
      Updated: 2025-12-12
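
      The tracking abstract above names the EIoU loss for bounding box regression. A minimal sketch of the commonly published form of that loss is given below, under the assumption that boxes are axis-aligned and given as (x1, y1, x2, y2) with positive width and height. The function name and the sample boxes are hypothetical; the paper's training code may differ, for example by adding a focal weighting.

```python
def eiou_loss(box, gt):
    """EIoU loss between a predicted box and a ground-truth box.

    Both boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w * h + gw * gh - inter)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between the two box centers.
    center_dist2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
        + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # IoU term + center-distance term + separate width and height terms.
    return (1.0 - iou) \
        + center_dist2 / (cw ** 2 + ch ** 2) \
        + (w - gw) ** 2 / cw ** 2 \
        + (h - gh) ** 2 / ch ** 2


if __name__ == "__main__":
    print(eiou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
    print(eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # shifted box, same size
```

      Penalizing width and height differences directly, rather than through an aspect-ratio term, is what distinguishes EIoU from CIoU.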
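    • In video summarization, the researchers propose a new model that uses multi-scale time shift and a deformable local attention mechanism to improve the accuracy and effectiveness of video summarization.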
      LI Zehui, ZHANG Lin, SHAN Xianying, SHEN Ganjie
      Vol. 40, Issue 11, Pages: 1729-1743(2025) DOI: 10.37188/CJLCD.2025-0189
      Abstract: To address insufficient multi-scale temporal modeling and local feature modeling in video summarization, this paper proposes a video summarization model that combines multi-scale time shift with a deformable local attention mechanism. First, a multi-scale adaptive bidirectional time shift module (MAB-TSM) is designed. Through learnable dynamic shift step size prediction and multi-scale dilated convolution, it adaptively models long- and short-term temporal dependencies in video. Second, a deformable local attention module (DALAM) is designed; combined with a dynamic video segmentation strategy and an adaptive sampling position adjustment mechanism, it reduces computational complexity while enhancing the fine-grained feature representation of local key regions. In addition, the BiFPN network for cross-scale fusion is improved: a cross-scale attention enhancement module is added to BiFPN to strengthen the complementary representation of multi-scale features. The proposed model was evaluated in multiple experiments on the SumMe and TVSum datasets. Its F1 scores in the canonical setting reached 56.8% and 62.6%, respectively, outperforming existing methods. Moreover, the Kendall and Spearman rank correlation coefficients reached 0.153 and 0.200, respectively, demonstrating excellent consistency. The experimental results confirm the accuracy and effectiveness of the model in the video summarization task.
      Keywords: computer vision; edge detection; geometric figure; curve fitting; subpixel
      Updated: 2025-12-12
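
      The video summarization abstract above reports Kendall and Spearman rank correlation coefficients. As a minimal sketch of how such coefficients are computed between predicted frame-importance scores and human annotations, the code below implements both in plain Python, assuming no tied values. The function names and the sample scores are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall rank correlation (tau-a) of two equal-length sequences without ties."""
    n = len(a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1  # the pair is ordered the same way in both sequences
        elif sign < 0:
            discordant += 1  # the pair is ordered in opposite ways
    return (concordant - discordant) / (n * (n - 1) / 2)


def _ranks(values):
    """Rank positions starting at 1, valid when all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_rho(a, b):
    """Spearman rank correlation of two equal-length sequences without ties."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(_ranks(a), _ranks(b)))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


if __name__ == "__main__":
    predicted = [0.9, 0.1, 0.5, 0.7]  # hypothetical model scores
    human = [0.8, 0.2, 0.6, 0.3]      # hypothetical annotator scores
    print(kendall_tau(predicted, human), spearman_rho(predicted, human))
```

      Both coefficients range from -1 to 1 and depend only on the ordering of the scores, not on their absolute values.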
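    • Target detection algorithm based on improved YOLOv11s UAV aerial image

      In small object detection for UAV aerial images, the HMD-YOLO algorithm optimizes the model structure and loss function to effectively improve detection accuracy and efficiency.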
      LÜ Xuehan, LI Fu, QI Mingrui, XU Jingjing, YANG Xinmeng, GONG Yuan
      Vol. 40, Issue 11, Pages: 1744-1756(2025) DOI: 10.37188/CJLCD.2025-0193
      Abstract: Small object detection in UAV aerial images is challenged by tiny target sizes, complex backgrounds, and limited computational resources. Existing UAV object detection models generally show insufficient accuracy and struggle to balance detection precision and efficiency. To address these challenges, this paper proposes an improved small object detection algorithm based on YOLOv11s, named HMD-YOLO. First, we design the HR-MSCA (High-Resolution Multi-Scale Convolutional Attention) module, which improves small object detection through a joint design of resolution enhancement and multi-scale convolutional attention. Second, a lightweight and efficient dynamic upsampler, Litesample, replaces the original upsampling module in the neck of the model. In addition, the Wise-IoU loss function is introduced to improve the accuracy of bounding box regression and overall model performance. Finally, a Dynamic Detection Head (DynamicHead) is incorporated to further enhance the model's ability to detect small objects. Experimental results on the VisDrone2019 dataset demonstrate that the improved model achieves 49.98% mAP@0.5 and 30.73% mAP@0.95, improvements of 12.15% and 8.22%, respectively, over YOLOv11s. The effectiveness of the proposed approach is further validated through generalization experiments on the TinyPerson dataset, where the model also achieves a significant performance gain.
      Keywords: drone aerial surveying; small object recognition; YOLOv11; HMD-YOLO
      Updated: 2025-12-12