CV · Aug 5, 2024
Dense Feature Interaction Network for Image Inpainting Localization
Ye Yao, Tingfeng Han, Shan Jia et al.
Image inpainting, the process of filling in missing areas in an image, is a common image editing technique. Inpainting can be used to conceal or alter image contents in malicious manipulation of images, driving the need for research in image inpainting detection. Most existing methods use a basic encoder-decoder structure, which often produces many false positives or misses the inpainted regions, especially when dealing with targets of varying semantics and scales. Additionally, the lack of an effective approach to capture boundary artifacts leads to less accurate edge localization. In this paper, we describe a new method for inpainting detection based on a Dense Feature Interaction Network (DeFI-Net). DeFI-Net uses a novel feature pyramid architecture to capture and amplify multi-scale representations across various stages, improving the detection of image inpainting by strengthening feature-level interactions. Additionally, the network can adaptively direct the lower-level features, which carry edge and shape information, to refine the localization of manipulated regions while integrating the higher-level semantic features. Using DeFI-Net, we develop a method combining complementary representations to accurately identify inpainted areas. Evaluation on seven image inpainting datasets demonstrates the effectiveness of our approach, which achieves state-of-the-art performance in detecting inpainting across diverse models. Code and models are available at https://github.com/Boombb/DeFI-Net_Inpainting.
DC · Jul 30, 2025
Low-Communication Resilient Distributed Estimation Algorithm Based on Memory Mechanism
Wei Li, Limei Hu, Feng Chen et al.
In multi-task adversarial networks, the accurate estimation of unknown parameters in a distributed algorithm is hindered by attacked nodes or links. To tackle this challenge, this brief proposes a low-communication resilient distributed estimation algorithm. First, a reputation-based node selection strategy is introduced that allows each node to communicate with a more reliable subset of its neighbors. Subsequently, to discern trustworthy intermediate estimates, a Weighted Support Vector Data Description (W-SVDD) model is trained on the memory data. The trained model helps reinforce the resilience of the distributed estimation process against the impact of attacked nodes or links. Additionally, an event-triggered mechanism is introduced to minimize ineffective updates to the W-SVDD model, and a suitable threshold is derived based on assumptions. The convergence of the algorithm is analyzed. Finally, simulation results demonstrate that the proposed algorithm achieves superior performance with less communication cost compared to other algorithms.
CV · Mar 30, 2025
TraceMark-LDM: Authenticatable Watermarking for Latent Diffusion Models via Binary-Guided Rearrangement
Wenhao Luo, Zhangyi Shen, Ye Yao et al.
Image generation algorithms are increasingly integral to diverse aspects of human society, driven by their practical applications. However, insufficient oversight of artificial intelligence generated content (AIGC) can facilitate the spread of malicious content and increase the risk of copyright infringement. Among the diverse range of image generation models, the Latent Diffusion Model (LDM) is the most widely used, dominating the Text-to-Image model market. Most current attribution methods for LDMs rely on directly embedding watermarks into the generated images or their intermediate noise, a practice that compromises both the quality and the robustness of the generated content. To address these limitations, we introduce TraceMark-LDM, a novel algorithm that integrates watermarking to attribute generated images while guaranteeing non-destructive performance. Unlike current methods, TraceMark-LDM leverages watermarks as guidance to rearrange random variables sampled from a Gaussian distribution. To mitigate potential deviations caused by inversion errors, the elements with small absolute values are grouped and rearranged. Additionally, we fine-tune the LDM encoder to enhance the robustness of the watermark. Experimental results show that images synthesized using TraceMark-LDM exhibit superior quality and attribution accuracy compared to state-of-the-art (SOTA) techniques. Notably, TraceMark-LDM demonstrates exceptional robustness against various common attack methods, consistently outperforming SOTA methods.
MM · Mar 26, 2018
Distinguishing Computer-generated Graphics from Natural Images Based on Sensor Pattern Noise and Deep Learning
Ye Yao, Weitong Hu, Wei Zhang et al.
Computer-generated graphics (CGs) are images generated by computer software. The rapid development of computer graphics technologies has made it easier to generate photorealistic computer graphics, and these graphics are quite difficult to distinguish from natural images (NIs) with the naked eye. In this paper, we propose a method based on sensor pattern noise (SPN) and deep learning to distinguish CGs from NIs. Before being fed into our convolutional neural network (CNN)-based model, the images (both CGs and NIs) are clipped into image patches. Furthermore, three high-pass filters (HPFs) are used to remove low-frequency signals, which represent the image content. These filters also reveal the residual signal as well as the SPN introduced by the digital camera device. Unlike traditional methods of distinguishing CGs from NIs, the proposed method uses a five-layer CNN to classify the input image patches. Based on the classification results of the image patches, we apply a majority vote scheme to obtain the classification results for the full-size images. The experiments demonstrate that (1) the proposed method with three HPFs achieves better results than with only one HPF or no HPF and that (2) the proposed method with three HPFs achieves 100% accuracy even when the NIs have undergone JPEG compression with a quality factor of 75.
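The patch-to-image majority vote mentioned in the abstract can be illustrated with a minimal sketch. The function name `majority_vote` and the string labels `"CG"` and `"NI"` are hypothetical, and the tie-breaking rule (first label seen wins) is an assumption, since the abstract does not specify one.

```python
from collections import Counter


def majority_vote(patch_labels):
    """Return the most frequent label among patch-level predictions.

    patch_labels: list of labels, one per image patch (e.g. "CG" or "NI").
    Ties resolve to the label that appears first in the list; this is an
    assumption of the sketch, not something stated in the abstract.
    """
    if not patch_labels:
        raise ValueError("need at least one patch prediction")
    counts = Counter(patch_labels)
    # most_common(1) returns a one-element list of (label, count)
    label, _ = counts.most_common(1)[0]
    return label


if __name__ == "__main__":
    # Three of five patches were classified as computer-generated.
    print(majority_vote(["CG", "NI", "CG", "CG", "NI"]))
```

Voting over many patches means a few misclassified patches do not change the decision for the full-size image.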
MM · Feb 7, 2018
Computer-Aided Annotation for Video Tampering Dataset of Forensic Research
Ye Yao
Annotating a video tampering dataset is a tedious task that consumes considerable manpower and financial resources. At present, no published work addresses how to improve the efficiency of annotating forged videos. In this paper, we present a computer-aided annotation method for video tampering datasets that can be used to label the frames of forged video sequences. By comparing the original video frames with the forged ones, we locate the position and trajectory of the forged areas. We then select several key points in the temporal domain according to that trajectory and mark the forged area in each key frame with a mouse. Finally, a linear prediction algorithm based on the coordinates of the key positions in the temporal domain generates the annotation information for the forged areas in the remaining frames, which are not manually labeled. If a bounding box generated by the computer-aided algorithm deviates from the actual location of the forged area, its position can be adjusted with the mouse during the preview period. The method combines manual annotation with computer-aided annotation. It addresses both the inaccuracy of purely computer-aided annotation and the low efficiency of purely manual annotation, and meets the need to annotate the large volume of forged videos used in research on passive video forensics.
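One way to picture the step that fills in frames between manually marked key frames is plain linear interpolation of bounding-box coordinates. This is a sketch under assumptions: the abstract only says "linear prediction", so the interpolation rule, the `(x, y, w, h)` box layout, and the name `interpolate_boxes` are all illustrative rather than the author's implementation.

```python
def interpolate_boxes(keyframes):
    """Linearly interpolate bounding boxes between manually marked key frames.

    keyframes: dict mapping frame index -> (x, y, w, h) marked by hand.
    Returns a dict with a box for every frame from the first key frame to
    the last one. The (x, y, w, h) layout is an assumption of this sketch.
    """
    frames = sorted(keyframes)
    if len(frames) < 2:
        raise ValueError("need at least two key frames")
    boxes = {}
    for start, end in zip(frames, frames[1:]):
        box_start, box_end = keyframes[start], keyframes[end]
        span = end - start
        for frame in range(start, end):
            t = (frame - start) / span  # 0.0 at start, approaches 1.0 at end
            boxes[frame] = tuple(
                a + t * (b - a) for a, b in zip(box_start, box_end)
            )
    # The last key frame is not covered by the loop above.
    boxes[frames[-1]] = tuple(float(v) for v in keyframes[frames[-1]])
    return boxes


if __name__ == "__main__":
    marked = {0: (0, 0, 10, 10), 4: (8, 4, 10, 10)}
    for frame, box in sorted(interpolate_boxes(marked).items()):
        print(frame, box)
```

A box that drifts from the true forged area could then be corrected by hand and added to `keyframes` as an extra key frame before interpolating again, which mirrors the manual adjustment the abstract describes.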
LG · Oct 1, 2017
DeepTFP: Mobile Time Series Data Analytics based Traffic Flow Prediction
Yuanfang Chen, Falin Chen, Yizhi Ren et al.
Traffic flow prediction is an important research problem for avoiding congestion in transportation systems: congestion can be avoided by knowing the traffic flow and planning transportation accordingly. Predicting traffic flow is challenging because it is affected by many complex factors, such as inter-region traffic, relations between vehicles, and sudden events. However, because the mobile data of vehicles is now widely collected by sensor-embedded devices in transportation systems, it is possible to predict traffic flow by analysing that data. This study proposes a deep learning based prediction algorithm, DeepTFP, to collectively predict the traffic flow on every road of a city. The algorithm uses three deep residual neural networks to model the temporal closeness, period, and trend properties of traffic flow. Each residual neural network consists of a branch of residual convolutional units. DeepTFP aggregates the outputs of the three residual neural networks to optimize the parameters of a time series prediction model. Comparative experiments on mobile time series data from the transportation system of England demonstrate that DeepTFP outperforms a Long Short-Term Memory (LSTM) based method in prediction accuracy.
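To make the closeness, period, and trend inputs concrete, the sketch below picks the past time steps that each of the three branches might read when predicting step `t`. The abstract does not define these windows, so the daily period, the weekly trend, the window lengths, and the name `select_inputs` are all assumptions made for illustration.

```python
def select_inputs(t, len_closeness=3, len_period=2, len_trend=1,
                  steps_per_day=24):
    """Return the past time-step indices fed to each of three branches.

    closeness: the most recent steps before t.
    period:    the same time of day on previous days (assumed daily period).
    trend:     the same time of day in previous weeks (assumed weekly trend).
    All window lengths and intervals are hypothetical defaults.
    """
    closeness = [t - i for i in range(1, len_closeness + 1)]
    period = [t - i * steps_per_day for i in range(1, len_period + 1)]
    trend = [t - i * steps_per_day * 7 for i in range(1, len_trend + 1)]
    if min(closeness + period + trend) < 0:
        raise ValueError("not enough history before step t")
    return closeness, period, trend


if __name__ == "__main__":
    closeness, period, trend = select_inputs(200)
    print("closeness:", closeness)
    print("period:", period)
    print("trend:", trend)
```

With hourly data (`steps_per_day=24`), the trend branch needs at least one week of history, which is why the function rejects values of `t` that are too small.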