57.2CVApr 18
Adverse-to-the-eXtreme Panoptic Segmentation: URVIS 2026 Study and BenchmarkYiting Wang, Nolwenn Peyratout, Tim Brodermann et al.
This paper presents the report of the URVIS 2026 challenge on adverse-to-extreme panoptic segmentation. As the first challenge of its kind, it attracted 17 registered participants and 47 submissions, with 4 teams reaching the final phase. The challenge is based on the MUSES dataset, a multi-sensor benchmark for panoptic segmentation in adverse-to-extreme weather, including RGB frame camera, LiDAR, radar, and event camera data. Weighted Panoptic Quality (wPQ) is designed and adopted as the official ranking metric for fair evaluation across weather conditions. In this report, we summarise the challenge setting and benchmark results, analyse the performance of the submitted methods, and discuss current progress and remaining challenges for robust multimodal panoptic segmentation. Link: https://urvis-workshop.github.io/challenge-Muses.html
CVJul 7, 2022
Semi-supervised Object Detection via Virtual Category LearningChangrui Chen, Kurt Debattista, Jungong Han
Due to the costliness of labelled data in real-world applications, semi-supervised object detectors, underpinned by pseudo labelling, are appealing. However, handling confusing samples is nontrivial: discarding valuable confusing samples would compromise the model generalisation while using them for training would exacerbate the confirmation bias issue caused by inevitable mislabelling. To solve this problem, this paper proposes to use confusing samples proactively without label correction. Specifically, a virtual category (VC) is assigned to each confusing sample such that they can safely contribute to the model optimisation even without a concrete label. It is attributed to specifying the embedding distance between the training sample and the virtual category as the lower bound of the inter-class distance. Moreover, we also modify the localisation loss to allow high-quality boundaries for location regression. Extensive experiments demonstrate that the proposed VC learning significantly surpasses the state-of-the-art, especially with small amounts of available labels.
LGJul 28, 2023
SAFE: Saliency-Aware Counterfactual Explanations for DNN-based Automated Driving SystemsAmir Samadi, Amir Shirian, Konstantinos Koufos et al.
A CF explainer identifies the minimum modifications in the input that would alter the model's output to its complement. In other words, a CF explainer computes the minimum modifications required to cross the model's decision boundary. Current deep generative CF models often work with user-selected features rather than focusing on the discriminative features of the black-box model. Consequently, such CF examples may not necessarily lie near the decision boundary, thereby contradicting the definition of CFs. To address this issue, we propose in this paper a novel approach that leverages saliency maps to generate more informative CF explanations. Source codes are available at: https://github.com/Amir-Samadi//Saliency_Aware_CF.
CVSep 26, 2024
Good Data Is All Imitation Learning NeedsAmir Samadi, Konstantinos Koufos, Kurt Debattista et al.
In this paper, we address the limitations of traditional teacher-student models, imitation learning, and behaviour cloning in the context of Autonomous/Automated Driving Systems (ADS), where these methods often struggle with incomplete coverage of real-world scenarios. To enhance the robustness of such models, we introduce the use of Counterfactual Explanations (CFEs) as a novel data augmentation technique for end-to-end ADS. CFEs, by generating training samples near decision boundaries through minimal input modifications, lead to a more comprehensive representation of expert driver strategies, particularly in safety-critical scenarios. This approach can therefore help improve the model's ability to handle rare and challenging driving events, such as anticipating darting out pedestrians, ultimately leading to safer and more trustworthy decision-making for ADS. Our experiments in the CARLA simulator demonstrate that CF-Driver outperforms the current state-of-the-art method, achieving a higher driving score and lower infraction rates. Specifically, CF-Driver attains a driving score of 84.2, surpassing the previous best model by 15.02 percentage points. These results highlight the effectiveness of incorporating CFEs in training end-to-end ADS. To foster further research, the CF-Driver code is made publicly available.
39.6CVMar 16
AURORA-KITTI: Any-Weather Depth Completion and Denoising in the WildYiting Wang, Tim Brödermann, Hamed Haghighi et al.
Robust depth completion is fundamental to real-world 3D scene understanding, yet existing RGB-LiDAR fusion methods degrade significantly under adverse weather, where both camera images and LiDAR measurements suffer from weather-induced corruption. In this paper, we introduce AURORA-KITTI, the first large-scale multi-modal, multi-weather benchmark for robust depth completion in the wild. We further formulate Depth Completion and Denoising (DCD) as a unified task that jointly reconstructs a dense depth map from corrupted sparse inputs while suppressing weather-induced noise. AURORA-KITTI contains over \textit{82K} weather-consistent RGBL pairs with metric depth ground truth, spanning diverse weather types, three severity levels, day and night scenes, paired clean references, lens occlusion conditions, and textual descriptions. Moreover, we introduce DDCD, an efficient distillation-based baseline that leverages depth foundation models to inject clean structural priors into in-the-wild DCD training. DDCD achieves state-of-the-art performance on AURORA-KITTI and the real-world DENSE dataset while maintaining efficiency. Notably, our results further show that weather-aware, physically consistent data contributes more to robustness than architectural modifications alone. Data and code will be released upon publication.
LGApr 28, 2024Code
SAFE-RL: Saliency-Aware Counterfactual Explainer for Deep Reinforcement Learning PoliciesAmir Samadi, Konstantinos Koufos, Kurt Debattista et al.
While Deep Reinforcement Learning (DRL) has emerged as a promising solution for intricate control tasks, the lack of explainability of the learned policies impedes its uptake in safety-critical applications, such as automated driving systems (ADS). Counterfactual (CF) explanations have recently gained prominence for their ability to interpret black-box Deep Learning (DL) models. CF examples are associated with minimal changes in the input, resulting in a complementary output by the DL model. Finding such alternations, particularly for high-dimensional visual inputs, poses significant challenges. Besides, the temporal dependency introduced by the reliance of the DRL agent action on a history of past state observations further complicates the generation of CF examples. To address these challenges, we propose using a saliency map to identify the most influential input pixels across the sequence of past observed states by the agent. Then, we feed this map to a deep generative model, enabling the generation of plausible CFs with constrained modifications centred on the salient regions. We evaluate the effectiveness of our framework in diverse domains, including ADS, Atari Pong, Pacman and space-invaders games, using traditional performance metrics such as validity, proximity and sparsity. Experimental results demonstrate that this framework generates more informative and plausible CFs than the state-of-the-art for a wide range of environments and DRL agents. In order to foster research in this area, we have made our datasets and codes publicly available at https://github.com/Amir-Samadi/SAFE-RL.
CVApr 8, 2024Code
Taming Transformers for Realistic Lidar Point Cloud GenerationHamed Haghighi, Amir Samadi, Mehrdad Dianati et al.
Diffusion Models (DMs) have achieved State-Of-The-Art (SOTA) results in the Lidar point cloud generation task, benefiting from their stable training and iterative refinement during sampling. However, DMs often fail to realistically model Lidar raydrop noise due to their inherent denoising process. To retain the strength of iterative sampling while enhancing the generation of raydrop noise, we introduce LidarGRIT, a generative model that uses auto-regressive transformers to iteratively sample the range images in the latent space rather than image space. Furthermore, LidarGRIT utilises VQ-VAE to separately decode range images and raydrop masks. Our results show that LidarGRIT achieves superior performance compared to SOTA models on KITTI-360 and KITTI odometry datasets. Code available at:https://github.com/hamedhaghighi/LidarGRIT.
45.9CVMay 14
DriveCtrl: Conditioned Sim-to-Real Driving Video GenerationHaonan Zhao, Yiting Wang, Jingkun Chen et al.
Large-scale labelled driving video data is essential for training autonomous driving systems. Although simulation offers scalable and fully annotated data, the domain gap between synthetic and real-world driving videos significantly limits its utility for downstream deployment. Existing video generation methods are not well-suited for this task, as they fail to simultaneously preserve scene structure, object dynamics, temporal consistency, and visual realism, all of which are critical for maintaining annotation validity in generated data. In this paper, we present DriveCtrl, a depth-conditioned controllable sim-to-real video generation framework for realistic driving video synthesis. Built upon a pretrained video foundation model, DriveCtrl introduces a structure-aware adapter that enables depth-guided generation while preserving the scene layout and motion patterns of the source simulation, producing temporally coherent driving videos that remain aligned with the original simulated sequences. We further introduce a scalable data generation pipeline that transforms simulator videos into realistic driving footage matching the visual style of a target real-world dataset. The pipeline supports three conditioning signals: structural depth, reference-dataset style, and text prompts, while preserving frame-level annotations for downstream perception tasks. To better assess this task, we propose a driving-domain-specific knowledge-informed evaluation metric called Driving Video Realism Score (DVRS) that assesses the realism of generated videos. Experiments demonstrate that DriveCtrl consistently outperforms the base model and competing alternatives in realism, temporal quality, and perception task performance, substantially narrowing the sim-to-real gap for driving video generation.
CVJun 8, 2024Code
HDRT: A Large-Scale Dataset for Infrared-Guided HDR ImagingJingchao Peng, Thomas Bashford-Rogers, Francesco Banterle et al.
Capturing images with enough details to solve imaging tasks is a long-standing challenge in imaging, particularly due to the limitations of standard dynamic range (SDR) images which often lose details in underexposed or overexposed regions. Traditional high dynamic range (HDR) methods, like multi-exposure fusion or inverse tone mapping, struggle with ghosting and incomplete data reconstruction. Infrared (IR) imaging offers a unique advantage by being less affected by lighting conditions, providing consistent detail capture regardless of visible light intensity. In this paper, we introduce the HDRT dataset, the first comprehensive dataset that consists of HDR and thermal IR images. The HDRT dataset comprises 50,000 images captured across three seasons over six months in eight cities, providing a diverse range of lighting conditions and environmental contexts. Leveraging this dataset, we propose HDRTNet, a novel deep neural method that fuses IR and SDR content to generate HDR images. Extensive experiments validate HDRTNet against the state-of-the-art, showing substantial quantitative and qualitative quality improvements. The HDRT dataset not only advances IR-guided HDR imaging but also offers significant potential for broader research in HDR imaging, multi-modal fusion, domain transfer, and beyond. The dataset is available at https://huggingface.co/datasets/jingchao-peng/HDRTDataset.
CVDec 25, 2023Code
A Unified Generative Framework for Realistic Lidar Simulation in Autonomous Driving SystemsHamed Haghighi, Mehrdad Dianati, Valentina Donzella et al.
Simulation models for perception sensors are integral components of automotive simulators used for the virtual Verification and Validation (V\&V) of Autonomous Driving Systems (ADS). These models also serve as powerful tools for generating synthetic datasets to train deep learning-based perception models. Lidar is a widely used sensor type among the perception sensors for ADS due to its high precision in 3D environment scanning. However, developing realistic Lidar simulation models is a significant technical challenge. In particular, unrealistic models can result in a large gap between the synthesised and real-world point clouds, limiting their effectiveness in ADS applications. Recently, deep generative models have emerged as promising solutions to synthesise realistic sensory data. However, for Lidar simulation, deep generative models have been primarily hybridised with conventional algorithms, leaving unified generative approaches largely unexplored in the literature. Motivated by this research gap, we propose a unified generative framework to enhance Lidar simulation fidelity. Our proposed framework projects Lidar point clouds into depth-reflectance images via a lossless transformation, and employs our novel Controllable Lidar point cloud Generative model, CoLiGen, to translate the images. We extensively evaluate our CoLiGen model, comparing it with the state-of-the-art image-to-image translation models using various metrics to assess the realness, faithfulness, and performance of a downstream perception model. Our results show that CoLiGen exhibits superior performance across most metrics. The dataset and source code for this research are available at https://github.com/hamedhaghighi/CoLiGen.git.
LGMay 25, 2023Code
Counterfactual Explainer Framework for Deep Reinforcement Learning Models Using Policy DistillationAmir Samadi, Konstantinos Koufos, Kurt Debattista et al.
Deep Reinforcement Learning (DRL) has demonstrated promising capability in solving complex control problems. However, DRL applications in safety-critical systems are hindered by the inherent lack of robust verification techniques to assure their performance in such applications. One of the key requirements of the verification process is the development of effective techniques to explain the system functionality, i.e., why the system produces specific results in given circumstances. Recently, interpretation methods based on the Counterfactual (CF) explanation approach have been proposed to address the problem of explanation in DRLs. This paper proposes a novel CF explanation framework to explain the decisions made by a black-box DRL. To evaluate the efficacy of the proposed explanation framework, we carried out several experiments in the domains of automated driving systems and Atari Pong game. Our analysis demonstrates that the proposed framework generates plausible and meaningful explanations for various decisions made by deep underlying DRLs. Source codes are available at: \url{https://github.com/Amir-Samadi/Counterfactual-Explanation}
CVApr 14, 2024
Exploring Generative AI for Sim2Real in Driving Data SynthesisHaonan Zhao, Yiting Wang, Thomas Bashford-Rogers et al.
Datasets are essential for training and testing vehicle perception algorithms. However, the collection and annotation of real-world images is time-consuming and expensive. Driving simulators offer a solution by automatically generating various driving scenarios with corresponding annotations, but the simulation-to-reality (Sim2Real) domain gap remains a challenge. While most of the Generative Artificial Intelligence (AI) follows the de facto Generative Adversarial Nets (GANs)-based methods, the recent emerging diffusion probabilistic models have not been fully explored in mitigating Sim2Real challenges for driving data synthesis. To explore the performance, this paper applied three different generative AI methods to leverage semantic label maps from a driving simulator as a bridge for the creation of realistic datasets. A comparative analysis of these methods is presented from the perspective of image quality and perception. New synthetic datasets, which include driving images and auto-generated high-quality annotations, are produced with low costs and high scene variability. The experimental results show that although GAN-based methods are adept at generating high-quality images when provided with manually annotated labels, ControlNet produces synthetic datasets with fewer artefacts and more structural fidelity when using simulator-generated labels. This suggests that the diffusion-based approach may provide improved stability and an alternative method for addressing Sim2Real challenges.
CVMay 24, 2024
Semantic Aware Diffusion Inverse Tone MappingAbhishek Goswami, Aru Ranjan Singh, Francesco Banterle et al.
The range of real-world scene luminance is larger than the capture capability of many digital camera sensors which leads to details being lost in captured images, most typically in bright regions. Inverse tone mapping attempts to boost these captured Standard Dynamic Range (SDR) images back to High Dynamic Range (HDR) by creating a mapping that linearizes the well exposed values from the SDR image, and provides a luminance boost to the clipped content. However, in most cases, the details in the clipped regions cannot be recovered or estimated. In this paper, we present a novel inverse tone mapping approach for mapping SDR images to HDR that generates lost details in clipped regions through a semantic-aware diffusion based inpainting approach. Our method proposes two major contributions - first, we propose to use a semantic graph to guide SDR diffusion based inpainting in masked regions in a saturated image. Second, drawing inspiration from traditional HDR imaging and bracketing methods, we propose a principled formulation to lift the SDR inpainted regions to HDR that is compatible with generative inpainting methods. Results show that our method demonstrates superior performance across different datasets on objective metrics, and subjective experiments show that the proposed method matches (and in most cases outperforms) state-of-art inverse tone mapping operators in terms of objective metrics and outperforms them for visual fidelity.
CVFeb 23, 2024
Benchmarking the Robustness of Panoptic Segmentation for Automated DrivingYiting Wang, Haonan Zhao, Daniel Gummadi et al.
Precise situational awareness is required for the safe decision-making of assisted and automated driving (AAD) functions. Panoptic segmentation is a promising perception technique to identify and categorise objects, impending hazards, and driveable space at a pixel level. While segmentation quality is generally associated with the quality of the camera data, a comprehensive understanding and modelling of this relationship are paramount for AAD system designers. Motivated by such a need, this work proposes a unifying pipeline to assess the robustness of panoptic segmentation models for AAD, correlating it with traditional image quality. The first step of the proposed pipeline involves generating degraded camera data that reflects real-world noise factors. To this end, 19 noise factors have been identified and implemented with 3 severity levels. Of these factors, this work proposes novel models for unfavourable light and snow. After applying the degradation models, three state-of-the-art CNN- and vision transformers (ViT)-based panoptic segmentation networks are used to analyse their robustness. The variations of the segmentation performance are then correlated to 8 selected image quality metrics. This research reveals that: 1) certain specific noise factors produce the highest impact on panoptic segmentation, i.e. droplets on lens and Gaussian noise; 2) the ViT-based panoptic segmentation backbones show better robustness to the considered noise factors; 3) some image quality metrics (i.e. LPIPS and CW-SSIM) correlate strongly with panoptic segmentation performance and therefore they can be used as predictive metrics for network performance.
CVNov 25, 2024
CapHDR2IR: Caption-Driven Transfer from Visible Light to Infrared DomainJingchao Peng, Thomas Bashford-Rogers, Zhuang Shao et al.
Infrared (IR) imaging offers advantages in several fields due to its unique ability of capturing content in extreme light conditions. However, the demanding hardware requirements of high-resolution IR sensors limit its widespread application. As an alternative, visible light can be used to synthesize IR images but this causes a loss of fidelity in image details and introduces inconsistencies due to lack of contextual awareness of the scene. This stems from a combination of using visible light with a standard dynamic range, especially under extreme lighting, and a lack of contextual awareness can result in pseudo-thermal-crossover artifacts. This occurs when multiple objects with similar temperatures appear indistinguishable in the training data, further exacerbating the loss of fidelity. To solve this challenge, this paper proposes CapHDR2IR, a novel framework incorporating vision-language models using high dynamic range (HDR) images as inputs to generate IR images. HDR images capture a wider range of luminance variations, ensuring reliable IR image generation in different light conditions. Additionally, a dense caption branch integrates semantic understanding, resulting in more meaningful and discernible IR outputs. Extensive experiments on the HDRT dataset show that the proposed CapHDR2IR achieves state-of-the-art performance compared with existing general domain transfer methods and those tailored for visible-to-infrared image translation.
CVNov 25, 2024
Luminance Component Analysis for Exposure CorrectionJingchao Peng, Thomas Bashford-Rogers, Jingkun Chen et al.
Exposure correction methods aim to adjust the luminance while maintaining other luminance-unrelated information. However, current exposure correction methods have difficulty in fully separating luminance-related and luminance-unrelated components, leading to distortions in color, loss of detail, and requiring extra restoration procedures. Inspired by principal component analysis (PCA), this paper proposes an exposure correction method called luminance component analysis (LCA). LCA applies the orthogonal constraint to a U-Net structure to decouple luminance-related and luminance-unrelated features. With decoupled luminance-related features, LCA adjusts only the luminance-related components while keeping the luminance-unrelated components unchanged. To optimize the orthogonal constraint problem, LCA employs a geometric optimization algorithm, which converts the constrained problem in Euclidean space to an unconstrained problem in orthogonal Stiefel manifolds. Extensive experiments show that LCA can decouple the luminance feature from the RGB color space. Moreover, LCA achieves the best PSNR (21.33) and SSIM (0.88) in the exposure correction dataset with 28.72 FPS.
GRFeb 11, 2022
Unsupervised HDR Imaging: What Can Be Learned from a Single 8-bit Video?Francesco Banterle, Demetris Marnerides, Kurt Debattista et al.
Recently, Deep Learning-based methods for inverse tone-mapping standard dynamic range (SDR) images to obtain high dynamic range (HDR) images have become very popular. These methods manage to fill over-exposed areas convincingly both in terms of details and dynamic range. Typically, these methods, to be effective, need to learn from large datasets and to transfer this knowledge to the network weights. In this work, we tackle this problem from a completely different perspective. What can we learn from a single SDR video? With the presented zero-shot approach, we show that, in many cases, a single SDR video is sufficient to be able to generate an HDR video of the same quality or better than other state-of-the-art methods.
CVJun 17, 2021
Deep HDR Hallucination for Inverse Tone MappingDemetris Marnerides, Thomas Bashford-Rogers, Kurt Debattista
Inverse Tone Mapping (ITM) methods attempt to reconstruct High Dynamic Range (HDR) information from Low Dynamic Range (LDR) image content. The dynamic range of well-exposed areas must be expanded and any missing information due to over/under-exposure must be recovered (hallucinated). The majority of methods focus on the former and are relatively successful, while most attempts on the latter are not of sufficient quality, even ones based on Convolutional Neural Networks (CNNs). A major factor for the reduced inpainting quality in some works is the choice of loss function. Work based on Generative Adversarial Networks (GANs) shows promising results for image synthesis and LDR inpainting, suggesting that GAN losses can improve inverse tone mapping results. This work presents a GAN-based method that hallucinates missing information from badly exposed areas in LDR images and compares its efficacy with alternative variations. The proposed method is quantitatively competitive with state-of-the-art inverse tone mapping methods, providing good dynamic range expansion for well-exposed areas and plausible hallucinations for saturated and under-exposed areas. A density-based normalisation method, targeted for HDR content, is also proposed, as well as an HDR data augmentation method targeted for HDR hallucination.
IVAug 19, 2020
Deep Controllable Backlight DimmingLvyin Duan, Demetris Marnerides, Alan Chalmers et al.
Dual-panel displays require local dimming algorithms in order to reproduce content with high fidelity and high dynamic range. In this work, a novel deep learning based local dimming method is proposed for rendering HDR images on dual-panel HDR displays. The method uses a Convolutional Neural Network to predict backlight values, using as input the HDR image that is to be displayed. The model is designed and trained via a controllable power parameter that allows a user to trade off between power and quality. The proposed method is evaluated against six other methods on a test set of 105 HDR images, using a variety of quantitative quality metrics. Results demonstrate improved display quality and better power consumption when using the proposed method compared to the best alternatives.
IVApr 22, 2020
Spectrally Consistent UNet for High Fidelity Image TransformationsDemetris Marnerides, Thomas Bashford-Rogers, Kurt Debattista
Convolutional Neural Networks (CNNs) are the current de-facto models used for many imaging tasks due to their high learning capacity as well as their architectural qualities. The ubiquitous UNet architecture provides an efficient and multi-scale solution that combines local and global information. Despite the success of UNet architectures, the use of upsampling layers can cause artefacts. In this work, a method for assessing the structural biases of UNets and the effects these have on the outputs is presented, characterising their impact in the Fourier domain. A new upsampling module is proposed, based on a novel use of the Guided Image Filter, that provides spectrally consistent outputs when used in a UNet architecture, forming the Guided UNet (GUNet). The GUNet architecture is applied and evaluated for example applications of inverse tone mapping/dynamic range expansion and colourisation from grey-scale images and is shown to provide higher fidelity outputs.
GRFeb 7, 2020
Audio-Visual-Olfactory Resource Allocation for Tri-modal Virtual EnvironmentsEfstratios Doukakis, Kurt Debattista, Thomas Bashford-Rogers et al.
Virtual Environments (VEs) provide the opportunity to simulate a wide range of applications, from training to entertainment, in a safe and controlled manner. For applications which require realistic representations of real world environments, the VEs need to provide multiple, physically accurate sensory stimuli. However, simulating all the senses that comprise the human sensory system (HSS) is a task that requires significant computational resources. Since it is intractable to deliver all senses at the highest quality, we propose a resource distribution scheme in order to achieve an optimal perceptual experience within the given computational budgets. This paper investigates resource balancing for multi-modal scenarios composed of aural, visual and olfactory stimuli. Three experimental studies were conducted. The first experiment identified perceptual boundaries for olfactory computation. In the second experiment, participants (N=25) were asked, across a fixed number of budgets (M=5), to identify what they perceived to be the best visual, acoustic and olfactory stimulus quality for a given computational budget. Results demonstrate that participants tend to prioritise visual quality compared to other sensory stimuli. However, as the budget size is increased, users prefer a balanced distribution of resources with an increased preference for having smell impulses in the VE. Based on the collected data, a quality prediction model is proposed and its accuracy is validated against previously unused budgets and an untested scenario in a third and final experiment.
HCJan 30, 2020
Visuohaptic augmented feedback for enhancing motor skills acquisitionAli Asadipour, Kurt Debattista, Alan Chalmers
Serious games are accepted as an effective approach to deliver augmented feedback in motor (re-) learning processes. The multi-modal nature of the conventional computer games (e.g. audiovisual representation) plus the ability to interact via haptic-enabled inputs provides a more immersive experience. Thus, particular disciplines such as medical education in which frequent hands on rehearsals play a key role in learning core motor skills (e.g. physical palpations) may benefit from this technique. Challenges such as the impracticality of verbalising palpation experience by tutors and ethical considerations may prevent the medical students from correctly learning core palpation skills. This work presents a new data glove, built from off-the-shelf components which captures pressure sensitivity designed to provide feedback for palpation tasks. In this work the data glove is used to control a serious game adapted from the infinite runner genre to improve motor skill acquisition. A comparative evaluation on usability and effectiveness of the method using multimodal visualisations, as part of a larger study to enhance pressure sensitivity, is presented. Thirty participants divided into a game-playing group (n = 15) and a control group (n = 15) were invited to perform a simple palpation task. The game-playing group significantly outperformed the control group in which abstract visualisation of force was provided to the users in a blind-folded transfer test. The game-based training approach was positively described by the game-playing group as enjoyable and engaging.
CVMar 6, 2018
ExpandNet: A Deep Convolutional Neural Network for High Dynamic Range Expansion from Low Dynamic Range ContentDemetris Marnerides, Thomas Bashford-Rogers, Jonathan Hatchett et al.
High dynamic range (HDR) imaging provides the capability of handling real world lighting as opposed to the traditional low dynamic range (LDR) which struggles to accurately represent images with higher dynamic range. However, most imaging content is still available only in LDR. This paper presents a method for generating HDR content from LDR content based on deep Convolutional Neural Networks (CNNs) termed ExpandNet. ExpandNet accepts LDR images as input and generates images with an expanded range in an end-to-end fashion. The model attempts to reconstruct missing information that was lost from the original signal due to quantization, clipping, tone mapping or gamma correction. The added information is reconstructed from learned features, as the network is trained in a supervised fashion using a dataset of HDR images. The approach is fully automatic and data driven; it does not require any heuristics or human expertise. ExpandNet uses a multiscale architecture which avoids the use of upsampling layers to improve image quality. The method performs well compared to expansion/inverse tone mapping operators quantitatively on multiple metrics, even for badly exposed inputs.