COMP-PHDec 2, 2025
Towards a fully differentiable digital twin for solar cellsMarie Louise Schubert, Houssam Metni, Jan David Fischbach et al.
Maximizing energy yield (EY) - the total electric energy generated by a solar cell within a year at a specific location - is crucial in photovoltaics (PV), especially for emerging technologies. Computational methods provide the necessary insights and guidance for future research. However, existing simulations typically focus on only isolated aspects of solar cells. This lack of consistency highlights the need for a framework unifying all computational levels, from material to cell properties, for accurate prediction and optimization of EY prediction. To address this challenge, a differentiable digital twin, Sol(Di)$^2$T, is introduced to enable comprehensive end-to-end optimization of solar cells. The workflow starts with material properties and morphological processing parameters, followed by optical and electrical simulations. Finally, climatic conditions and geographic location are incorporated to predict the EY. Each step is either intrinsically differentiable or replaced with a machine-learned surrogate model, enabling not only accurate EY prediction but also gradient-based optimization with respect to input parameters. Consequently, Sol(Di)$^2$T extends EY predictions to previously unexplored conditions. Demonstrated for an organic solar cell, the proposed framework marks a significant step towards tailoring solar cells for specific applications while ensuring maximal performance.
CVJun 1, 2022
Attack-Agnostic Adversarial DetectionJiaxin Cheng, Mohamed Hussein, Jay Billa et al.
The growing number of adversarial attacks in recent years gives attackers an advantage over defenders, as defenders must train detectors after knowing the types of attacks, and many models need to be maintained to ensure good performance in detecting any upcoming attacks. We propose a way to end the tug-of-war between attackers and defenders by treating adversarial attack detection as an anomaly detection problem so that the detector is agnostic to the attack. We quantify the statistical deviation caused by adversarial perturbations in two aspects. The Least Significant Component Feature (LSCF) quantifies the deviation of adversarial examples from the statistics of benign samples and Hessian Feature (HF) reflects how adversarial examples distort the landscape of the model's optima by measuring the local loss curvature. Empirical results show that our method can achieve an overall ROC AUC of 94.9%, 89.7%, and 94.6% on CIFAR10, CIFAR100, and SVHN, respectively, and has comparable performance to adversarial detectors trained with adversarial examples on most of the attacks.
ASDec 24, 2025
Speak the Art: A Direct Speech to Image Generation FrameworkMariam Saeed, Manar Amr, Farida Adel et al.
Direct speech-to-image generation has recently shown promising results. However, compared to text-to-image generation, there is still a large gap to enclose. Current approaches use two stages to tackle this task: speech encoding network and image generative adversarial network (GAN). The speech encoding networks in these approaches produce embeddings that do not capture sufficient linguistic information to semantically represent the input speech. GANs suffer from issues such as non-convergence, mode collapse, and diminished gradient, which result in unstable model parameters, limited sample diversity, and ineffective generator learning, respectively. To address these weaknesses, we introduce a framework called Speak the Art (STA) which consists of a speech encoding network and a VQ-Diffusion network conditioned on speech embeddings. To improve speech embeddings, the speech encoding network is supervised by a large pre-trained image-text model during training. Replacing GANs with diffusion leads to more stable training and the generation of diverse images. Additionally, we investigate the feasibility of extending our framework to be multilingual. As a proof of concept, we trained our framework with two languages: English and Arabic. Finally, we show that our results surpass state-of-the-art models by a large margin.
ROSep 9, 2020Code
SEAN: Social Environment for Autonomous NavigationNathan Tsoi, Mohamed Hussein, Jeacy Espinoza et al.
Social navigation research is performed on a variety of robotic platforms, scenarios, and environments. Making comparisons between navigation algorithms is challenging because of the effort involved in building these systems and the diversity of platforms used by the community; nonetheless, evaluation is critical to understanding progress in the field. In a step towards reproducible evaluation of social navigation algorithms, we propose the Social Environment for Autonomous Navigation (SEAN). SEAN is a high visual fidelity, open source, and extensible social navigation simulation platform which includes a toolkit for evaluation of navigation algorithms. We demonstrate SEAN and its evaluation toolkit in two environments with dynamic pedestrians and using two different robots.
CYSep 23, 2025
A Mega-Study of Digital Twins Reveals Strengths, Weaknesses and Opportunities for Further ImprovementTianyi Peng, George Gui, Daniel J. Merlau et al.
Digital representations of individuals ("digital twins") promise to transform social science and decision-making. Yet it remains unclear whether such twins truly mirror the people they emulate. We conducted 19 preregistered studies with a representative U.S. panel and their digital twins, each constructed from rich individual-level data, enabling direct comparisons between human and twin behavior across a wide range of domains and stimuli (including never-seen-before ones). Twins reproduced individual responses with 75% accuracy and seemingly low correlation with human answers (approximately 0.2). However, this apparently high accuracy was no higher than that achieved by generic personas based on demographics only. In contrast, correlation improved when twins incorporated detailed personal information, even outperforming traditional machine learning benchmarks that require additional data. Twins exhibited systematic strengths and weaknesses - performing better in social and personality domains, but worse in political ones - and were more accurate for participants with higher education, higher income, and moderate political views and religious attendance. Together, these findings delineate both the promise and the current limits of digital twins: they capture some relative differences among individuals but not yet the unique judgments of specific people. All data and code are publicly available to support the further development and evaluation of digital twin pipelines.
CVAug 27, 2021
Detection and Continual Learning of Novel Face Presentation AttacksMohammad Rostami, Leonidas Spinoulas, Mohamed Hussein et al.
Advances in deep learning, combined with availability of large datasets, have led to impressive improvements in face presentation attack detection research. However, state-of-the-art face antispoofing systems are still vulnerable to novel types of attacks that are never seen during training. Moreover, even if such attacks are correctly detected, these systems lack the ability to adapt to newly encountered attacks. The post-training ability of continually detecting new types of attacks and self-adaptation to identify these attack types, after the initial detection phase, is highly appealing. In this paper, we enable a deep neural network to detect anomalies in the observed input data points as potential new types of attacks by suppressing the confidence-level of the network outside the training samples' distribution. We then use experience replay to update the model to incorporate knowledge about new types of attacks without forgetting the past learned attack types. Experimental results are provided to demonstrate the effectiveness of the proposed method on two benchmark datasets as well as a newly introduced dataset which exhibits a large variety of attack types.
RODec 22, 2020
An Approach to Deploy Interactive Robotic Simulators on the Web for HRI Experiments: Results in Social Robot NavigationNathan Tsoi, Mohamed Hussein, Olivia Fugikawa et al.
Evaluation of social robot navigation inherently requires human input due to its qualitative nature. Motivated by the need to scale human evaluation, we propose a general method for deploying interactive, rich-client robotic simulations on the web. Prior approaches implement specific web-compatible simulators or provide tools to build a simulator for a specific study. Instead, our approach builds on standard Linux tools to share a graphical desktop with remote users. We leverage these tools to deploy simulators on the web that would typically be constrained to desktop computing environments. As an example implementation of our approach, we introduce the SEAN Experimental Platform (SEAN-EP). With SEAN-EP, remote users can virtually interact with a mobile robot in the Social Environment for Autonomous Navigation, without installing any software on their computer or needing specialized hardware. We validated that SEAN-EP could quickly scale the collection of human feedback and its usability through an online survey. In addition, we compared human feedback from participants that interacted with a robot using SEAN-EP with feedback obtained through a more traditional video survey. Our results suggest that human perceptions of robots may differ based on whether they interact with the robots in simulation or observe them in videos. Also, they suggest that people perceive the surveys with interactive simulations as less mentally demanding than video surveys.
CVJun 12, 2020
Multi-Modal Fingerprint Presentation Attack Detection: Evaluation On A New DatasetLeonidas Spinoulas, Hengameh Mirzaalian, Mohamed Hussein et al.
Fingerprint presentation attack detection is becoming an increasingly challenging problem due to the continuous advancement of attack preparation techniques, which generate realistic-looking fake fingerprint presentations. In this work, rather than relying on legacy fingerprint images, which are widely used in the community, we study the usefulness of multiple recently introduced sensing modalities. Our study covers front-illumination imaging using short-wave-infrared, near-infrared, and laser illumination; and back-illumination imaging using near-infrared light. Toward studying the effectiveness of each of these unconventional sensing modalities and their fusion for liveness detection, we conducted a comprehensive analysis using a fully convolutional deep neural network framework. Our evaluation compares different combination of the new sensing modalities to legacy data from one of our collections as well as the public LivDet2015 dataset, showing the superiority of the new sensing modalities in most cases. It also covers the cases of known and unknown attacks and the cases of intra-dataset and inter-dataset evaluations. Our results indicate that the power of our approach stems from the nature of the captured data rather than the employed classification framework, which justifies the extra cost for hardware-based (or hybrid) solutions. We plan to publicly release one of our dataset collections.
CVJun 12, 2020
Multispectral Biometrics System Framework: Application to Presentation Attack DetectionLeonidas Spinoulas, Mohamed Hussein, David Geissbühler et al.
In this work, we present a general framework for building a biometrics system capable of capturing multispectral data from a series of sensors synchronized with active illumination sources. The framework unifies the system design for different biometric modalities and its realization on face, finger and iris data is described in detail. To the best of our knowledge, the presented design is the first to employ such a diverse set of electromagnetic spectrum bands, ranging from visible to long-wave-infrared wavelengths, and is capable of acquiring large volumes of data in seconds. Having performed a series of data collections, we run a comprehensive analysis on the captured data using a deep-learning classifier for presentation attack detection. Our study follows a data-centric approach attempting to highlight the strengths and weaknesses of each spectral band at distinguishing live from fake samples.
LGJun 6, 2019
On the Effectiveness of Laser Speckle Contrast Imaging and Deep Neural Networks for Detecting Known and Unknown Fingerprint Presentation AttacksHengameh Mirzaalian, Mohamed Hussein, Wael Abd-Almageed
Fingerprint presentation attack detection (FPAD) is becoming an increasingly challenging problem due to the continuous advancement of attack techniques, which generate `realistic-looking' fake fingerprint presentations. Recently, laser speckle contrast imaging (LSCI) has been introduced as a new sensing modality for FPAD. LSCI has the interesting characteristic of capturing the blood flow under the skin surface. Toward studying the importance and effectiveness of LSCI for FPAD, we conduct a comprehensive study using different patch-based deep neural network architectures. Our studied architectures include 2D and 3D convolutional networks as well as a recurrent network using long short-term memory (LSTM) units. The study demonstrates that strong FPAD performance can be achieved using LSCI. We evaluate the different models over a new large dataset. The dataset consists of 3743 bona fide samples, collected from 335 unique subjects, and 218 presentation attack samples, including six different types of attacks. To examine the effect of changing the training and testing sets, we conduct a 3-fold cross validation evaluation. To examine the effect of the presence of an unseen attack, we apply a leave-one-attack out strategy. The FPAD classification results of the networks, which are separately optimized and tuned for the temporal and spatial patch-sizes, indicate that the best performance is achieved by LSTM.