Nasir Memon

h-index41

35papers

655citations

Novelty49%

AI Score44

Ranked #72,026 of 201,326 authors (top 36%)#25,272 in CV (top 43%)

35 Papers

CROct 12, 2022Code

GOTCHA: Real-Time Video Deepfake Detection via Challenge-Response

Govind Mittal, Chinmay Hegde, Nasir Memon

With the rise of AI-enabled Real-Time Deepfakes (RTDFs), the integrity of online video interactions has become a growing concern. RTDFs have now made it feasible to replace an imposter's face with their victim in live video interactions. Such advancement in deepfakes also coaxes detection to rise to the same standard. However, existing deepfake detection techniques are asynchronous and hence ill-suited for RTDFs. To bridge this gap, we propose a challenge-response approach that establishes authenticity in live settings. We focus on talking-head style video interaction and present a taxonomy of challenges that specifically target inherent limitations of RTDF generation pipelines. We evaluate representative examples from the taxonomy by collecting a unique dataset comprising eight challenges, which consistently and visibly degrades the quality of state-of-the-art deepfake generators. These results are corroborated both by humans and a new automated scoring function, leading to 88.6% and 80.1% AUC, respectively. The findings underscore the promising potential of challenge-response systems for explainable and scalable real-time deepfake detection in practical scenarios. We provide access to data and code at \url{https://github.com/mittalgovind/GOTCHA-Deepfakes}.

CRSep 21, 2023

Information Forensics and Security: A quarter-century-long journey

Mauro Barni, Patrizio Campisi, Edward J. Delp et al.

Information Forensics and Security (IFS) is an active R&D area whose goal is to ensure that people use devices, data, and intellectual properties for authorized purposes and to facilitate the gathering of solid evidence to hold perpetrators accountable. For over a quarter century since the 1990s, the IFS research area has grown tremendously to address the societal needs of the digital information era. The IEEE Signal Processing Society (SPS) has emerged as an important hub and leader in this area, and the article below celebrates some landmark technical contributions. In particular, we highlight the major technological advances on some selected focus areas in the field developed in the last 25 years from the research community and present future trends.

SDApr 24, 2022

Dictionary Attacks on Speaker Verification

Mirko Marras, Pawel Korus, Anubhav Jain et al.

In this paper, we propose dictionary attacks against speaker verification - a novel attack vector that aims to match a large fraction of speaker population by chance. We introduce a generic formulation of the attack that can be used with various speech representations and threat models. The attacker uses adversarial optimization to maximize raw similarity of speaker embeddings between a seed speech sample and a proxy population. The resulting master voice successfully matches a non-trivial fraction of people in an unknown population. Adversarial waveforms obtained with our approach can match on average 69% of females and 38% of males enrolled in the target system at a strict decision threshold calibrated to yield false alarm rate of 1%. By using the attack with a black-box voice cloning system, we obtain master voices that are effective in the most challenging conditions and transferable between speaker encoders. We also show that, combined with multiple attempts, this attack opens even more to serious issues on the security of these systems.

CVDec 5, 2022

A Dataless FaceSwap Detection Approach Using Synthetic Images

Anubhav Jain, Nasir Memon, Julian Togelius

Face swapping technology used to create "Deepfakes" has advanced significantly over the past few years and now enables us to create realistic facial manipulations. Current deep learning algorithms to detect deepfakes have shown promising results, however, they require large amounts of training data, and as we show they are biased towards a particular ethnicity. We propose a deepfake detection methodology that eliminates the need for any real data by making use of synthetically generated data using StyleGAN3. This not only performs at par with the traditional training methodology of using real data but it shows better generalization capabilities when finetuned with a small amount of real data. Furthermore, this also reduces biases created by facial image datasets that might have sparse data from particular ethnicities.

CVSep 11, 2022

Diversity and Novelty MasterPrints: Generating Multiple DeepMasterPrints for Increased User Coverage

M Charity, Nasir Memon, Zehua Jiang et al.

This work expands on previous advancements in genetic fingerprint spoofing via the DeepMasterPrints and introduces Diversity and Novelty MasterPrints. This system uses quality diversity evolutionary algorithms to generate dictionaries of artificial prints with a focus on increasing coverage of users from the dataset. The Diversity MasterPrints focus on generating solution prints that match with users not covered by previously found prints, and the Novelty MasterPrints explicitly search for prints with more that are farther in user space than previous prints. Our multi-print search methodologies outperform the singular DeepMasterPrints in both coverage and generalization while maintaining quality of the fingerprint image output.

CVJul 17, 2023

Identity-Preserving Aging of Face Images via Latent Diffusion Models

Sudipta Banerjee, Govind Mittal, Ameya Joshi et al.

The performance of automated face recognition systems is inevitably impacted by the facial aging process. However, high quality datasets of individuals collected over several years are typically small in scale. In this work, we propose, train, and validate the use of latent text-to-image diffusion models for synthetically aging and de-aging face images. Our models succeed with few-shot training, and have the added benefit of being controllable via intuitive textual prompting. We observe high degrees of visual realism in the generated images while maintaining biometric fidelity measured by commonly used metrics. We evaluate our method on two benchmark datasets (CelebA and AgeDB) and observe significant reduction (~44%) in the False Non-Match Rate compared to existing state-of the-art baselines.

CVAug 16, 2023

Fair GANs through model rebalancing for extremely imbalanced class distributions

Anubhav Jain, Nasir Memon, Julian Togelius

Deep generative models require large amounts of training data. This often poses a problem as the collection of datasets can be expensive and difficult, in particular datasets that are representative of the appropriate underlying distribution (e.g. demographic). This introduces biases in datasets which are further propagated in the models. We present an approach to construct an unbiased generative adversarial network (GAN) from an existing biased GAN by rebalancing the model distribution. We do so by generating balanced data from an existing imbalanced deep generative model using an evolutionary algorithm and then using this data to train a balanced generative model. Additionally, we propose a bias mitigation loss function that minimizes the deviation of the learned class distribution from being equiprobable. We show results for the StyleGAN2 models while training on the Flickr Faces High Quality (FFHQ) dataset for racial fairness and see that the proposed approach improves on the fairness metric by almost 5 times, whilst maintaining image quality. We further validate our approach by applying it to an imbalanced CIFAR10 dataset where we show that we can obtain comparable fairness and image quality as when training on a balanced CIFAR10 dataset which is also twice as large. Lastly, we argue that the traditionally used image quality metrics such as Frechet inception distance (FID) are unsuitable for scenarios where the class distributions are imbalanced and a balanced reference set is not available.

CVDec 19, 2025Code

FPBench: A Comprehensive Benchmark of Multimodal Large Language Models for Fingerprint Analysis

Ekta Balkrishna Gavas, Sudipta Banerjee, Chinmay Hegde et al.

Multimodal LLMs (MLLMs) have gained significant traction in complex data analysis, visual question answering, generation, and reasoning. Recently, they have been used for analyzing the biometric utility of iris and face images. However, their capabilities in fingerprint understanding are yet unexplored. In this work, we design a comprehensive benchmark, \textsc{FPBench} that evaluates the performance of 20 MLLMs (open-source and proprietary) across 7 real and synthetic datasets on 8 biometric and forensic tasks using zero-shot and chain-of-thought prompting strategies. We discuss our findings in terms of performance, explainability and share our insights into the challenges and limitations. We establish \textsc{FPBench} as the first comprehensive benchmark for fingerprint domain understanding with MLLMs paving the path for foundation models for fingerprints.

CVNov 20, 2023

Alpha-wolves and Alpha-mammals: Exploring Dictionary Attacks on Iris Recognition Systems

Sudipta Banerjee, Anubhav Jain, Zehua Jiang et al.

A dictionary attack in a biometric system entails the use of a small number of strategically generated images or templates to successfully match with a large number of identities, thereby compromising security. We focus on dictionary attacks at the template level, specifically the IrisCodes used in iris recognition systems. We present an hitherto unknown vulnerability wherein we mix IrisCodes using simple bitwise operators to generate alpha-mixtures - alpha-wolves (combining a set of "wolf" samples) and alpha-mammals (combining a set of users selected via search optimization) that increase false matches. We evaluate this vulnerability using the IITD, CASIA-IrisV4-Thousand and Synthetic datasets, and observe that an alpha-wolf (from two wolves) can match upto 71 identities @FMR=0.001%, while an alpha-mammal (from two identities) can match upto 133 other identities @FMR=0.01% on the IITD dataset.

CVApr 8, 2025Code

FaceCloak: Learning to Protect Face Templates

Sudipta Banerjee, Anubhav Jain, Chinmay Hegde et al.

Generative models can reconstruct face images from encoded representations (templates) bearing remarkable likeness to the original face, raising security and privacy concerns. We present \textsc{FaceCloak}, a neural network framework that protects face templates by generating smart, renewable binary cloaks. Our method proactively thwarts inversion attacks by cloaking face templates with unique disruptors synthesized from a single face template on the fly while provably retaining biometric utility and unlinkability. Our cloaked templates can suppress sensitive attributes while generalizing to novel feature extraction schemes and outperform leading baselines in terms of biometric matching and resiliency to reconstruction attacks. \textsc{FaceCloak}-based matching is extremely fast (inference time =0.28 ms) and light (0.57 MB). We have released our \href{https://github.com/sudban3089/FaceCloak.git}{code} for reproducible research.

CVFeb 20, 2021Code

Hard-Attention for Scalable Image Classification

Athanasios Papadopoulos, Paweł Korus, Nasir Memon

Can we leverage high-resolution information without the unsustainable quadratic complexity to input scale? We propose Traversal Network (TNet), a novel multi-scale hard-attention architecture, which traverses image scale-space in a top-down fashion, visiting only the most informative image regions along the way. TNet offers an adjustable trade-off between accuracy and complexity, by changing the number of attended image locations. We compare our model against hard-attention baselines on ImageNet, achieving higher accuracy with less resources (FLOPs, processing time and memory). We further test our model on fMoW dataset, where we process satellite images of size up to $896 \times 896$ px, getting up to $2.5$x faster processing compared to baselines operating on the same resolution, while achieving higher accuracy as well. TNet is modular, meaning that most classification models could be adopted as its backbone for feature extraction, making the reported performance gains orthogonal to benefits offered by existing optimized deep models. Finally, hard-attention guarantees a degree of interpretability to our model's predictions, without any extra cost beyond inference. Code is available at $\href{https://github.com/Tpap/TNet}{github.com/Tpap/TNet}$.

CVDec 10, 2024

TraSCE: Trajectory Steering for Concept Erasure

Anubhav Jain, Yuya Kobayashi, Takashi Shibuya et al.

Recent advancements in text-to-image diffusion models have brought them to the public spotlight, becoming widely accessible and embraced by everyday users. However, these models have been shown to generate harmful content such as not-safe-for-work (NSFW) images. While approaches have been proposed to erase such abstract concepts from the models, jail-breaking techniques have succeeded in bypassing such safety measures. In this paper, we propose TraSCE, an approach to guide the diffusion trajectory away from generating harmful content. Our approach is based on negative prompting, but as we show in this paper, a widely used negative prompting strategy is not a complete solution and can easily be bypassed in some corner cases. To address this issue, we first propose using a specific formulation of negative prompting instead of the widely used one. Furthermore, we introduce a localized loss-based guidance that enhances the modified negative prompting technique by steering the diffusion trajectory. We demonstrate that our proposed method achieves state-of-the-art results on various benchmarks in removing harmful content, including ones proposed by red teams, and erasing artistic styles and objects. Our proposed approach does not require any training, weight modifications, or training data (either image or prompt), making it easier for model owners to erase new concepts.

CVApr 27, 2025

Forging and Removing Latent-Noise Diffusion Watermarks Using a Single Image

Anubhav Jain, Yuya Kobayashi, Naoki Murata et al.

Watermarking techniques are vital for protecting intellectual property and preventing fraudulent use of media. Most previous watermarking schemes designed for diffusion models embed a secret key in the initial noise. The resulting pattern is often considered hard to remove and forge into unrelated images. In this paper, we propose a black-box adversarial attack without presuming access to the diffusion model weights. Our attack uses only a single watermarked example and is based on a simple observation: there is a many-to-one mapping between images and initial noises. There are regions in the clean image latent space pertaining to each watermark that get mapped to the same initial noise when inverted. Based on this intuition, we propose an adversarial attack to forge the watermark by introducing perturbations to the images such that we can enter the region of watermarked images. We show that we can also apply a similar approach for watermark removal by learning perturbations to exit this region. We report results on multiple watermarking schemes (Tree-Ring, RingID, WIND, and Gaussian Shading) across two diffusion models (SDv1.4 and SDv2.0). Our results demonstrate the effectiveness of the attack and expose vulnerabilities in the watermarking methods, motivating future research on improving them.

CVNov 23, 2024

Classifier-Free Guidance inside the Attraction Basin May Cause Memorization

Anubhav Jain, Yuya Kobayashi, Takashi Shibuya et al.

Diffusion models are prone to exactly reproduce images from the training data. This exact reproduction of the training data is concerning as it can lead to copyright infringement and/or leakage of privacy-sensitive information. In this paper, we present a novel perspective on the memorization phenomenon and propose a simple yet effective approach to mitigate it. We argue that memorization occurs because of an attraction basin in the denoising process which steers the diffusion trajectory towards a memorized image. However, this can be mitigated by guiding the diffusion trajectory away from the attraction basin by not applying classifier-free guidance until an ideal transition point occurs from which classifier-free guidance is applied. This leads to the generation of non-memorized images that are high in image quality and well-aligned with the conditioning mechanism. To further improve on this, we present a new guidance technique, opposite guidance, that escapes the attraction basin sooner in the denoising process. We demonstrate the existence of attraction basins in various scenarios in which memorization occurs, and we show that our proposed approach successfully mitigates memorization.

CVMar 12, 2024

Mitigating the Impact of Attribute Editing on Face Recognition

Sudipta Banerjee, Sai Pranaswi Mullangi, Shruti Wagle et al.

Through a large-scale study over diverse face images, we show that facial attribute editing using modern generative AI models can severely degrade automated face recognition systems. This degradation persists even with identity-preserving generative models. To mitigate this issue, we propose two novel techniques for local and global attribute editing. We empirically ablate twenty-six facial semantic, demographic and expression-based attributes that have been edited using state-of-the-art generative models, and evaluate them using ArcFace and AdaFace matchers on CelebA, CelebAMaskHQ and LFW datasets. Finally, we use LLaVA, an emerging visual question-answering framework for attribute prediction to validate our editing techniques. Our methods outperform the current state-of-the-art at facial editing (BLIP, InstantID) while improving identity retention by a significant extent.

CVJul 17, 2025

DiffClean: Diffusion-based Makeup Removal for Accurate Age Estimation

Ekta Balkrishna Gavas, Chinmay Hegde, Nasir Memon et al.

Accurate age verification can protect underage users from unauthorized access to online platforms and e-commerce sites that provide age-restricted services. However, accurate age estimation can be confounded by several factors, including facial makeup that can induce changes to alter perceived identity and age to fool both humans and machines. In this work, we propose DiffClean which erases makeup traces using a text-guided diffusion model to defend against makeup attacks. DiffClean improves age estimation (minor vs. adult accuracy by 4.8%) and face verification (TMR by 8.9% at FMR=0.01%) over competing baselines on digitally simulated and real makeup images.

IRDec 23, 2024

WavePulse: Real-time Content Analytics of Radio Livestreams

Govind Mittal, Sarthak Gupta, Shruti Wagle et al.

Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real-time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to track answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at \url{https://wave-pulse.io}.

CVMay 12, 2023

Zero-shot racially balanced dataset generation using an existing biased StyleGAN2

Anubhav Jain, Nasir Memon, Julian Togelius

Facial recognition systems have made significant strides thanks to data-heavy deep learning models, but these models rely on large privacy-sensitive datasets. Further, many of these datasets lack diversity in terms of ethnicity and demographics, which can lead to biased models that can have serious societal and security implications. To address these issues, we propose a methodology that leverages the biased generative model StyleGAN2 to create demographically diverse images of synthetic individuals. The synthetic dataset is created using a novel evolutionary search algorithm that targets specific demographic groups. By training face recognition models with the resulting balanced dataset containing 50,000 identities per race (13.5 million images in total), we can improve their performance and minimize biases that might have been present in a model trained on a real dataset.

SINov 11, 2020

The Role of the Crowd in Countering Misinformation: A Case Study of the COVID-19 Infodemic

Nicholas Micallef, Bing He, Srijan Kumar et al.

Fact checking by professionals is viewed as a vital defense in the fight against misinformation.While fact checking is important and its impact has been significant, fact checks could have limited visibility and may not reach the intended audience, such as those deeply embedded in polarized communities. Concerned citizens (i.e., the crowd), who are users of the platforms where misinformation appears, can play a crucial role in disseminating fact-checking information and in countering the spread of misinformation. To explore if this is the case, we conduct a data-driven study of misinformation on the Twitter platform, focusing on tweets related to the COVID-19 pandemic, analyzing the spread of misinformation, professional fact checks, and the crowd response to popular misleading claims about COVID-19. In this work, we curate a dataset of false claims and statements that seek to challenge or refute them. We train a classifier to create a novel dataset of 155,468 COVID-19-related tweets, containing 33,237 false claims and 33,413 refuting arguments.Our findings show that professional fact-checking tweets have limited volume and reach. In contrast, we observe that the surge in misinformation tweets results in a quick response and a corresponding increase in tweets that refute such misinformation. More importantly, we find contrasting differences in the way the crowd refutes tweets, some tweets appear to be opinions, while others contain concrete evidence, such as a link to a reputed source. Our work provides insights into how misinformation is organically countered in social platforms by some of their users and the role they play in amplifying professional fact checks.These insights could lead to development of tools and mechanisms that can empower concerned citizens in combating misinformation. The code and data can be found in http://claws.cc.gatech.edu/covid_counter_misinformation.html.

IVApr 4, 2020

Empirical Evaluation of PRNU Fingerprint Variation for Mismatched Imaging Pipelines

Sharad Joshi, Pawel Korus, Nitin Khanna et al.

We assess the variability of PRNU-based camera fingerprints with mismatched imaging pipelines (e.g., different camera ISP or digital darkroom software). We show that camera fingerprints exhibit non-negligible variations in this setup, which may lead to unexpected degradation of detection statistics in real-world use-cases. We tested 13 different pipelines, including standard digital darkroom software and recent neural-networks. We observed that correlation between fingerprints from mismatched pipelines drops on average to 0.38 and the PCE detection statistic drops by over 40%. The degradation in error rates is the strongest for small patches commonly used in photo manipulation detection, and when neural networks are used for photo development. At a fixed 0.5% FPR setting, the TPR drops by 17 ppt (percentage points) for 128 px and 256 px patches.

IVFeb 24, 2020

Fusion of Camera Model and Source Device Specific Forensic Methods for Improved Tamper Detection

Ahmet Gökhan Poyraz, Ahmet Emir Dirik, Ahmet Karaküçük et al.

PRNU based camera recognition method is widely studied in the image forensic literature. In recent years, CNN based camera model recognition methods have been developed. These two methods also provide solutions to tamper localization problem. In this paper, we propose their combination via a Neural Network to achieve better small-scale tamper detection performance. According to the results, the fusion method performs better than underlying methods even under high JPEG compression. For forgeries as small as 100$\times$100 pixel size, the proposed method outperforms the state-of-the-art, which validates the usefulness of fusion for localization of small-size image forgeries. We believe the proposed approach is feasible for any tamper-detection pipeline using the PRNU based methodology.

MMSep 10, 2019

Camera Fingerprint Extraction via Spatial Domain Averaged Frames

Samet Taspinar, Manoranjan Mohanty, Nasir Memon

Photo Response Non-Uniformity (PRNU) based camera attribution is an effective method to determine the source camera of visual media (an image or a video). To apply this method, images or videos need to be obtained from a camera to create a "camera fingerprint" which then can be compared against the PRNU of the query media whose origin is under question. The fingerprint extraction process can be time-consuming when a large number of video frames or images have to be denoised. This may need to be done when the individual images have been subjected to high compression or other geometric processing such as video stabilization. This paper investigates a simple, yet effective and efficient technique to create a camera fingerprint when so many still images need to be denoised. The technique utilizes Spatial Domain Averaged (SDA) frames. An SDA-frame is the arithmetic mean of multiple still images. When it is used for fingerprint extraction, the number of denoising operations can be significantly decreased with little or no performance loss. Experimental results show that the proposed method can work more than 50 times faster than conventional methods while providing similar matching results.

CRAug 16, 2019

FiFTy: Large-scale File Fragment Type Identification using Neural Networks

Govind Mittal, Pawel Korus, Nasir Memon

We present FiFTy, a modern file type identification tool for memory forensics and data carving. In contrast to previous approaches based on hand-crafted features, we design a compact neural network architecture, which uses a trainable embedding space, akin to successful natural language processing models. Our approach dispenses with explicit feature extraction which is a bottleneck in legacy systems. We evaluate the proposed method on a novel dataset with 75 file types - the most diverse and balanced dataset reported to date. FiFTy consistently outperforms all baselines in terms of speed, accuracy and individual misclassification rates. We achieved an average accuracy of 77.5% with processing speed of approx 38 sec/GB, which is better and more than an order of magnitude faster than the previous state-of-the-art tool - Sceadan (69% at 9 min/GB). Our tool and the corresponding dataset are available publicly online.

MMApr 2, 2019

Source Camera Attribution of Multi-Format Devices

Samet Taspinar, Manoranjan Mohanty, Nasir Memon

Photo Response Non-Uniformity (PRNU) based source camera attribution is an effective method to determine the origin camera of visual media (an image or a video). However, given that modern devices, especially smartphones, capture images, and videos at different resolutions using the same sensor array, PRNU attribution can become ineffective as the camera fingerprint and query visual media can be misaligned. We examine different resizing techniques such as binning, line-skipping, cropping and scaling that cameras use to downsize the raw sensor image to different media. Taking such techniques into account, this paper studies the problem of source camera attribution. We define the notion of Ratio of Alignment, which is a measure of shared sensor elements among spatially corresponding pixels within two media objects resized with different techniques. We then compute the Ratio of Alignment between the different combinations of three common resizing methods under simplified conditions and experimentally validate our analysis. Based on the insights drawn from the different techniques used by cameras and the RoA analysis, the paper proposes an algorithm for matching the source of a video with an image and vice versa. We also present an efficient search method resulting in significantly improved performance in matching as well as computation time.

MMMar 23, 2019

Analysis of Rolling Shutter Effect on ENF based Video Forensics

Saffet Vatansever, Ahmet Emir Dirik, Nasir Memon

ENF is a time-varying signal of the frequency of mains electricity in a power grid. It continuously fluctuates around a nominal value (50/60 Hz) due to changes in supply and demand of power over time. Depending on these ENF variations, the luminous intensity of a mains-powered light source also fluctuates. These fluctuations in luminance can be captured by video recordings. Accordingly, ENF can be estimated from such videos by analysis of steady content in the video scene. When videos are captured by using a rolling shutter sampling mechanism, as is done mostly with CMOS cameras, there is an idle period between successive frames. Consequently, a number of illumination samples of the scene are effectively lost due to the idle period. These missing samples affect ENF estimation, in the sense of the frequency shift caused and the power attenuation that results. This work develops an analytical model for videos captured using a rolling shutter mechanism. The model illustrates how the frequency of the main ENF harmonic varies depending on the idle period length, and how the power of the captured ENF attenuates as idle period increases. Based on this, a novel idle period estimation method for potential use in camera forensics that is able to operate independently of video frame rate is proposed. Finally, a novel time-of-recording verification approach based on use of multiple ENF components, idle period assumptions and interpolation of missing ENF samples is also proposed.

MMMar 23, 2019

Detecting the Presence of ENF Signal in Digital Videos: a Superpixel based Approach

Saffet Vatansever, Ahmet Emir Dirik, Nasir Memon

ENF (Electrical Network Frequency) instantaneously fluctuates around its nominal value (50/60 Hz) due to a continuous disparity between generated power and consumed power. Consequently, luminous intensity of a mains-powered light source varies depending on ENF fluctuations in the grid network. Variations in the luminance over time can be captured from video recordings and ENF can be estimated through content analysis of these recordings. In ENF based video forensics, it is critical to check whether a given video file is appropriate for this type of analysis. That is, if ENF signal is not present in a given video, it would be useless to apply ENF based forensic analysis. In this work, an ENF signal presence detection method is introduced for videos. The proposed method is based on multiple ENF signal estimations from steady superpixels, i.e. pixels that are most likely uniform in color, brightness, and texture, and intraclass similarity of the estimated signals. Subsequently, consistency among these estimates is then used to determine the presence or absence of an ENF signal in a given video. The proposed technique can operate on video clips as short as 2 minutes and is independent of the camera sensor type, i.e. CCD or CMOS.

CVFeb 27, 2019

Neural Imaging Pipelines - the Scourge or Hope of Forensics?

Pawel Korus, Nasir Memon

Forensic analysis of digital photographs relies on intrinsic statistical traces introduced at the time of their acquisition or subsequent editing. Such traces are often removed by post-processing (e.g., down-sampling and re-compression applied upon distribution in the Web) which inhibits reliable provenance analysis. Increasing adoption of computational methods within digital cameras further complicates the process and renders explicit mathematical modeling infeasible. While this trend challenges forensic analysis even in near-acquisition conditions, it also creates new opportunities. This paper explores end-to-end optimization of the entire image acquisition and distribution workflow to facilitate reliable forensic analysis at the end of the distribution channel, where state-of-the-art forensic techniques fail. We demonstrate that a neural network can be trained to replace the entire photo development pipeline, and jointly optimized for high-fidelity photo rendering and reliable provenance analysis. Such optimized neural imaging pipeline allowed us to increase image manipulation detection accuracy from approx. 45% to over 90%. The network learns to introduce carefully crafted artifacts, akin to digital watermarks, which facilitate subsequent manipulation detection. Analysis of performance trade-offs indicates that most of the gains can be obtained with only minor distortion. The findings encourage further research towards building more reliable imaging pipelines with explicit provenance-guaranteeing properties.

CVDec 4, 2018

Content Authentication for Neural Imaging Pipelines: End-to-end Optimization of Photo Provenance in Complex Distribution Channels

Pawel Korus, Nasir Memon

Forensic analysis of digital photo provenance relies on intrinsic traces left in the photograph at the time of its acquisition. Such analysis becomes unreliable after heavy post-processing, such as down-sampling and re-compression applied upon distribution in the Web. This paper explores end-to-end optimization of the entire image acquisition and distribution workflow to facilitate reliable forensic analysis at the end of the distribution channel. We demonstrate that neural imaging pipelines can be trained to replace the internals of digital cameras, and jointly optimized for high-fidelity photo development and reliable provenance analysis. In our experiments, the proposed approach increased image manipulation detection accuracy from 45% to over 90%. The findings encourage further research towards building more reliable imaging pipelines with explicit provenance-guaranteeing properties.

HCAug 5, 2018

Kid on The Phone! Toward Automatic Detection of Children on Mobile Devices

Toan Nguyen, Aditi Roy, Nasir Memon

Studies have shown that children can be exposed to smart devices at a very early age. This has important implications on research in children-computer interaction, children online safety and early education. Many systems have been built based on such research. In this work, we present multiple techniques to automatically detect the presence of a child on a smart device, which could be used as the first step on such systems. Our methods distinguish children from adults based on behavioral differences while operating a touch-enabled modern computing device. Behavioral differences are extracted from data recorded by the touchscreen and built-in sensors. To evaluate the effectiveness of the proposed methods, a new data set has been created from 50 children and adults who interacted with off-the-shelf applications on smart phones. Results show that it is possible to achieve 99% accuracy and less than 0.5% error rate after 8 consecutive touch gestures using only touch information or 5 seconds of sensor reading. If information is used from multiple sensors, then only after 3 gestures, similar performance could be achieved.

CRJul 2, 2018

Tap-based User Authentication for Smartwatches

Toan Nguyen, Nasir Memon

This paper presents TapMeIn, an eyes-free, two-factor authentication method for smartwatches. It allows users to tap a memorable melody (tap-password) of their choice anywhere on the touchscreen to unlock their watch. A user is verified based on the tap-password as well as her physiological and behavioral characteristics when tapping. Results from preliminary experiments with 41 participants show that TapMeIn could achieve an accuracy of 98.7% with a False Positive Rate of only 0.98%. In addition, TapMeIn retains its performance in different conditions such as sitting and walking. In terms of speed, TapMeIn has an average authentication time of 2 seconds. A user study with the System Usability Scale (SUS) tool suggests that TapMeIn has a high usability score.

CRDec 22, 2017

An HMM-based behavior modeling approach for continuous mobile authentication

Aditi Roy, Tzipora Halevi, Nasir Memon

This paper studies continuous authentication for touch interface based mobile devices. A Hidden Markov Model (HMM) based behavioral template training approach is presented, which does not require training data from other subjects other than the owner of the mobile. The stroke patterns of a user are modeled using a continuous left-right HMM. The approach models the horizontal and vertical scrolling patterns of a user since these are the basic and mostly used interactions on a mobile device. The effectiveness of the proposed method is evaluated through extensive experiments using the Toucha-lytics database which comprises of touch data over time. The results show that the performance of the proposed approach is better than the state-of-the-art method.

CRDec 22, 2017

An HMM-based Multi-sensor Approach for Continuous Mobile Authentication

Aditi Roy, Tzipora Halevi, Nasir Memon

With the increased popularity of smart phones, there is a greater need to have a robust authentication mechanism that handles various security threats and privacy leakages effectively. This paper studies continuous authentication for touch interface based mobile devices. A Hidden Markov Model (HMM) based behavioral template training approach is presented, which does not require training data from other subjects other than the owner of the mobile device and can get updated with new data over time. The gesture patterns of the user are modeled from multiple sensors - touch, accelerometer and gyroscope data using a continuous left-right HMM. The approach models the tap and stroke patterns of a user since these are the basic and most frequently used interactions on a mobile device. To evaluate the effectiveness of the proposed method a new data set has been created from 42 users who interacted with off-the-shelf applications on their smart phones. Results show that the performance of the proposed approach is promising and potentially better than other state-of-the-art approaches.

CRAug 22, 2017

IllusionPIN: Shoulder-Surfing Resistant Authentication Using Hybrid Images

Athanasios Papadopoulos, Toan Nguyen, Emre Durmus et al.

We address the problem of shoulder-surfing attacks on authentication schemes by proposing IllusionPIN (IPIN), a PIN-based authentication method that operates on touchscreen devices. IPIN uses the technique of hybrid images to blend two keypads with different digit orderings in such a way, that the user who is close to the device is seeing one keypad to enter her PIN, while the attacker who is looking at the device from a bigger distance is seeing only the other keypad. The user's keypad is shuffled in every authentication attempt since the attacker may memorize the spatial arrangement of the pressed digits. To reason about the security of IllusionPIN, we developed an algorithm which is based on human visual perception and estimates the minimum distance from which an observer is unable to interpret the keypad of the user. We tested our estimations with 84 simulated shoulder-surfing attacks from 21 different people. None of the attacks was successful against our estimations. In addition, we estimated the minimum distance from which a camera is unable to capture the visual information from the keypad of the user. Based on our analysis, it seems practically almost impossible for a surveillance camera to capture the PIN of a smartphone user when IPIN is in use.

CVMay 21, 2017

DeepMasterPrints: Generating MasterPrints for Dictionary Attacks via Latent Variable Evolution

Philip Bontrager, Aditi Roy, Julian Togelius et al.

Recent research has demonstrated the vulnerability of fingerprint recognition systems to dictionary attacks based on MasterPrints. MasterPrints are real or synthetic fingerprints that can fortuitously match with a large number of fingerprints thereby undermining the security afforded by fingerprint systems. Previous work by Roy et al. generated synthetic MasterPrints at the feature-level. In this work we generate complete image-level MasterPrints known as DeepMasterPrints, whose attack accuracy is found to be much superior than that of previous methods. The proposed method, referred to as Latent Variable Evolution, is based on training a Generative Adversarial Network on a set of real fingerprint images. Stochastic search in the form of the Covariance Matrix Adaptation Evolution Strategy is then used to search for latent input variables to the generator network that can maximize the number of impostor matches as assessed by a fingerprint recognizer. Experiments convey the efficacy of the proposed method in generating DeepMasterPrints. The underlying method is likely to have broad applications in fingerprint security as well as fingerprint synthesis.

HCJan 31, 2013

Phishing, Personality Traits and Facebook

Tzipora Halevi, Jim Lewis, Nasir Memon

Phishing attacks have become an increasing threat to online users. Recent research has begun to focus on the factors that cause people to respond to them. Our study examines the correlation between the Big Five personality traits and email phishing response. We also examine how these factors affect users behavior on Facebook, including posting personal information and choosing Facebook privacy settings. Our research shows that when using a prize phishing email, we find a strong correlation between gender and the response to the phishing email. In addition, we find that the neuroticism is the factor most correlated to responding to this email. Our study also found that people who score high on the openness factor tend to both post more information on Facebook as well as have less strict privacy settings, which may cause them to be susceptible to privacy attacks. In addition, our work detected no correlation between the participants estimate of being vulnerable to phishing attacks and actually being phished, which suggests susceptibility to phishing is not due to lack of awareness of the phishing risks and that realtime response to phishing is hard to predict in advance by online users. We believe that better understanding of the traits which contribute to online vulnerability can help develop methods for increasing users privacy and security in the future.