

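1. College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, Fujian, China
2. Jinjiang Bogan Electronic Technology Co., Ltd., Jinjiang 362200, Fujian, China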
Received: 24 July 2021
Revised: 26 September 2021
Published: January 2022
De-cai LI, Qun YAN, Jian-min YAO, et al. Video inpainting based on residual convolution attention network[J]. Chinese journal of liquid crystals and displays, 2022, 37(1): 86-96. DOI: 10.37188/CJLCD.2021-0196.
Video inpainting, which aims to fill in missing regions of a video, remains challenging because it is difficult to preserve precise spatial and temporal coherence in the restored content. Existing methods suffer from semantically discontinuous results, video blurriness, and temporal artifacts, while increasingly complex network designs slow down the network as a whole. To address these problems, this paper proposes a residual convolution attention network (RCAN) for video inpainting. Introducing a self-attention mechanism and a global attention mechanism into the residual network strengthens its ability to learn the spatio-temporal features of all input frames, and a spatial-temporal adversarial loss function is used to optimize RCAN and improve inpainting quality. The number of layers and parameters of the network can also be defined with a high degree of freedom, which improves its practical applicability. Experimental results show that the network achieves average inpainting results of 30.68 dB PSNR, 0.961 SSIM, and 0.113 FID on the DAVIS and YouTube-VOS datasets. These results largely meet the inpainting quality requirements of practical scenarios and offer a new approach to video inpainting.
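For readers unfamiliar with the PSNR figure quoted in the abstract, the following is a minimal sketch of how PSNR is conventionally computed between a ground-truth frame and a restored frame. It is not the authors' evaluation code: the function names `psnr` and `mean_psnr`, the flat-list pixel representation, the 8-bit peak value of 255, and the simple per-frame averaging are all assumptions made for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are flat sequences of numbers in [0, max_value].
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def mean_psnr(reference_frames, restored_frames, max_value=255.0):
    """Average per-frame PSNR over a clip (hypothetical averaging protocol)."""
    if len(reference_frames) != len(restored_frames) or len(reference_frames) == 0:
        raise ValueError("clips must be non-empty and of equal length")
    scores = [psnr(a, b, max_value)
              for a, b in zip(reference_frames, restored_frames)]
    return sum(scores) / len(scores)
```

Higher PSNR means the restored frame is closer to the ground truth. A uniform error of 10 gray levels on 8-bit frames, for example, gives roughly 28.13 dB.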