Deep video inpainting detection and localization based on ConvNeXt dual-stream network

Ye Yao, Tingfeng Han, Xudong Gao, Yizhi Ren, Weizhi Meng

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

Deep learning-based video inpainting algorithms can now fill a specified video region with visually plausible content, usually leaving only imperceptible traces. Because these methods can be used to maliciously manipulate video content, there is an urgent need for an effective method to detect and localize deep video inpainting. In this paper, we propose a dual-stream video inpainting detection network consisting of a ConvNeXt dual-stream encoder and a multi-scale feature cross-fusion decoder. To further explore the spatial and temporal traces left by deep inpainting, we extract motion residuals and enhance them using 3D convolution and spatial rich model (SRM) filtering. In addition, we extract filtered residuals using Laplacian of Gaussian (LoG) and Laplacian filtering. Both kinds of residuals are then fed into ConvNeXt to learn discriminative inpainting features. To improve detection accuracy, we design a top-down pyramid decoder that deeply fuses multi-dimensional, multi-scale features so that information from different dimensions and levels is fully exploited. We created two datasets using state-of-the-art video inpainting algorithms and conducted various experiments to evaluate our approach. The experimental results demonstrate that our approach outperforms existing methods and remains competitive even on unseen inpainting algorithms.
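
The residual extraction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the kernels, kernel sizes, or the exact motion-residual definition, so everything below is an assumption for illustration: a frame difference stands in for the motion residual, the Laplacian is the common 3x3 four-neighbour kernel, and the LoG kernel is a 5x5 sampling with sigma = 1.0. The helper names (`motion_residual`, `log_kernel`, `filter2d`) are hypothetical, and the 3D convolution, SRM filtering, and ConvNeXt stages are not shown.

```python
import math

# Assumed 3x3 four-neighbour Laplacian kernel (not taken from the paper).
LAPLACIAN_3X3 = [[0, 1, 0],
                 [1, -4, 1],
                 [0, 1, 0]]


def log_kernel(size=5, sigma=1.0):
    """Sample a Laplacian-of-Gaussian kernel (up to a constant scale factor)
    on an odd size x size grid, then shift it so its entries sum to zero."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4
                       * math.exp(-r2 / (2 * sigma ** 2)))
        kernel.append(row)
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]


def filter2d(img, kernel):
    """Correlate a 2D image (list of rows) with a kernel, using replicate
    padding at the borders so the output has the same size as the input."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                yy = min(max(i + a - oy, 0), h - 1)
                for b in range(kw):
                    xx = min(max(j + b - ox, 0), w - 1)
                    acc += kernel[a][b] * img[yy][xx]
            out[i][j] = acc
    return out


def motion_residual(prev_frame, frame):
    """Assumed stand-in for a motion residual: per-pixel frame difference."""
    return [[c - p for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(prev_frame, frame)]


# Toy grayscale frames: a single bright pixel appears in the second frame.
frame0 = [[0.0] * 5 for _ in range(5)]
frame1 = [row[:] for row in frame0]
frame1[2][2] = 1.0

residual = motion_residual(frame0, frame1)
laplacian_residual = filter2d(frame1, LAPLACIAN_3X3)
log_residual = filter2d(frame1, log_kernel())
```

Both kernels sum to zero, so flat regions give no response and only local intensity changes survive. That high-pass behaviour is the usual reason for applying such filters before a forensic classifier.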

Original language: English
Article number: 123331
Journal: Expert Systems with Applications
Volume: 247
Number of pages: 10
ISSN: 0957-4174
Publication status: Published - 2024

Keywords

  • Convolutional neural network
  • LoG and Laplace filtering
  • Multi-scale feature
  • Video inpainting detection
