Joaquim Jorge

HC
h-index: 7
15 papers
326 citations
Novelty: 34%
AI Score: 37

15 Papers

IV Feb 26, 2023 Code
MDF-Net for abnormality detection by fusing X-rays with clinical data

Chihcheng Hsieh, Isabel Blanco Nobre, Sandra Costa Sousa et al.

This study investigates the effects of including patients' clinical information on the performance of deep learning (DL) classifiers for disease location in chest X-ray images. Although current classifiers achieve high performance using chest X-ray images alone, our interviews with radiologists indicate that clinical data is highly informative and essential for interpreting images and making proper diagnoses. In this work, we propose a novel architecture consisting of two fusion methods that enable the model to simultaneously process patients' clinical data (structured data) and chest X-rays (image data). Since these data modalities are in different dimensional spaces, we propose a spatial arrangement strategy, spatialization, to facilitate the multimodal learning process in a Mask R-CNN model. We performed an extensive experimental evaluation using MIMIC-Eye, a dataset comprising the following modalities: MIMIC-CXR (chest X-ray images), MIMIC IV-ED (patients' clinical data), and REFLACX (annotations of disease locations in chest X-rays). Results show that incorporating patients' clinical data in a DL model together with the proposed fusion methods improves disease localization in chest X-rays by 12% in terms of Average Precision compared to a standard Mask R-CNN using only chest X-rays. Further ablation studies also emphasize the importance of multimodal DL architectures and the incorporation of patients' clinical data in disease localization. The architecture proposed in this work is publicly available to promote the scientific reproducibility of our study (https://github.com/ChihchengHsieh/multimodal-abnormalities-detection).
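
The abstract does not describe how spatialization is implemented. As a rough illustration of the general idea of moving tabular data into image space, the sketch below tiles each clinical value into a constant map and concatenates it with image feature maps along the channel axis. This tiling is a stand-in technique chosen for simplicity, not the paper's method, and the function names and values are hypothetical.

```python
def tile_clinical(clinical, height, width):
    """Turn a 1-D clinical vector into len(clinical) constant H x W maps.

    Assumption: plain tiling, used here only as a simple stand-in for
    the learned spatialization described in the abstract.
    """
    return [[[value] * width for _ in range(height)] for value in clinical]


def fuse_by_concat(image_maps, clinical_maps):
    """Concatenate two (C, H, W) nested lists along the channel axis."""
    height, width = len(image_maps[0]), len(image_maps[0][0])
    for channel in clinical_maps:
        if len(channel) != height or len(channel[0]) != width:
            raise ValueError("spatial sizes must match before fusion")
    return image_maps + clinical_maps


# Two image feature channels of size 2 x 3 (all zeros for illustration).
image_maps = [[[0.0] * 3 for _ in range(2)] for _ in range(2)]
# Two made-up clinical values, e.g. normalised heart rate and temperature.
clinical_maps = tile_clinical([0.7, 36.5], height=2, width=3)
fused = fuse_by_concat(image_maps, clinical_maps)
```

After fusion, `fused` holds four channels of the same spatial size, so a convolutional backbone can process both modalities together.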

HC Feb 6, 2023
Development of an Immersive Virtual Colonoscopy Viewer for Colon Growths Diagnosis

João Serras, Anderson Maciel, Soraia Paulo et al.

Desktop-based virtual colonoscopy has been proven to be an asset in the identification of colon anomalies. The process is accurate, although time-consuming. The use of immersive interfaces for virtual colonoscopy is incipient and not yet understood. In this work, we present a new design exploring elements of the VR paradigm to make the immersive analysis more efficient while still effective. We also plan to conduct experiments with experts to assess the multi-factor influences of coverage, duration, and diagnostic accuracy.

LG Mar 4, 2022
Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Box

Catarina Moreira, Yu-Liang Chou, Chihcheng Hsieh et al.

This study investigates the impact of machine learning models on the generation of counterfactual explanations by conducting a benchmark evaluation over three different types of models: a decision tree (fully transparent, interpretable, white-box model), a random forest (semi-interpretable, grey-box model), and a neural network (fully opaque, black-box model). We tested the counterfactual generation process using four algorithms from the literature (DiCE, WachterCF, prototype, and GrowingSpheresCF) on 25 different datasets. Our findings indicate that: (1) Different machine learning models have little impact on the generation of counterfactual explanations; (2) Counterfactual algorithms based uniquely on proximity loss functions are not actionable and will not provide meaningful explanations; (3) One cannot have meaningful evaluation results without guaranteeing plausibility in the counterfactual generation. Algorithms that do not consider plausibility in their internal mechanisms will lead to biased and unreliable conclusions if evaluated with the current state-of-the-art metrics; (4) A counterfactual inspection analysis is strongly recommended to ensure a robust examination of counterfactual explanations and the potential identification of biases.
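
To illustrate finding (2), here is a minimal sketch of a counterfactual search driven only by proximity. It assumes a hypothetical linear classifier with made-up weights and feature names (none of which come from the paper) and uses the closed-form nearest point across the decision boundary, not any of the four benchmarked algorithms.

```python
# Hypothetical linear classifier: score = w . x + b, predict 1 if score >= 0.
# Feature names, weights, and the input are illustrative assumptions.
FEATURES = ["age", "income", "debt"]
W = [0.5, 1.0, -2.0]
B = -1.0


def score(x):
    return sum(w_i * x_i for w_i, x_i in zip(W, x)) + B


def predict(x):
    return 1 if score(x) >= 0 else 0


def nearest_counterfactual(x, margin=1e-6):
    """Closest point (Euclidean distance) just across the decision boundary.

    Minimising distance alone moves x along the weight vector, so every
    feature with a non-zero weight changes, including ones a person
    cannot act on (here, "age").
    """
    s = score(x)
    norm_sq = sum(w_i * w_i for w_i in W)
    # Aim slightly past the boundary on the opposite side of the current score.
    target = margin if s < 0 else -margin
    step = (target - s) / norm_sq
    return [x_i + step * w_i for x_i, w_i in zip(x, W)]


x = [0.4, 0.2, 0.3]
cf = nearest_counterfactual(x)
changed = [name for name, a, b in zip(FEATURES, x, cf) if a != b]
```

In this toy setup the prediction flips, but `changed` includes "age": nothing in a pure proximity objective distinguishes features a person can act on from those they cannot.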

HC Mar 3, 2022
Improving X-ray Diagnostics through Eye-Tracking and XR

Catarina Moreira, Isabel Blanco Nobre, Sandra Costa Sousa et al.

There is a growing need to assist radiologists in performing X-ray readings and diagnoses quickly, comfortably, and effectively. As radiologists strive to maximize productivity, it is essential to consider the impact of reading rooms in interpreting complex examinations and ensure that higher volume and reporting speeds do not compromise patient outcomes. Virtual Reality (VR) is a disruptive technology for clinical practice in assessing X-ray images. We argue that combining eye-tracking with VR devices and Machine Learning may overcome obstacles posed by inadequate ergonomic postures and poor room conditions that often cause erroneous diagnoses when professionals examine digital images.

CV Feb 6, 2023
Integrating Eye-Gaze Data into CXR DL Approaches: A Preliminary study

André Luís, Chihcheng Hsieh, Isabel Blanco Nobre et al.

This paper proposes a novel multimodal DL architecture incorporating medical images and eye-tracking data for abnormality detection in chest X-rays. Our results show that applying eye gaze data directly into DL architectures does not show superior predictive performance in abnormality detection in chest X-rays. These results support other works in the literature and suggest that human-generated data, such as eye gaze, needs a more thorough investigation before being applied to DL architectures.

AI Jul 11, 2024
DALL-M: Context-Aware Clinical Data Augmentation with LLMs

Chihcheng Hsieh, Catarina Moreira, Isabel Blanco Nobre et al.

X-ray images are vital in medical diagnostics, but their effectiveness is limited without clinical context. Radiologists often find chest X-rays insufficient for diagnosing underlying diseases, necessitating the integration of structured clinical features with radiology reports. To address this, we introduce DALL-M, a novel framework that enhances clinical datasets by generating contextual synthetic data. DALL-M augments structured patient data, including vital signs (e.g., heart rate, oxygen saturation), radiology findings (e.g., lesion presence), and demographic factors. It integrates this tabular data with contextual knowledge extracted from radiology reports and domain-specific resources (e.g., Radiopaedia, Wikipedia), ensuring clinical consistency and reliability. DALL-M follows a three-phase process: (i) clinical context storage, (ii) expert query generation, and (iii) context-aware feature augmentation. Using large language models (LLMs), it generates both contextual synthetic values for existing clinical features and entirely new, clinically relevant features. Applied to 799 cases from the MIMIC-IV dataset, DALL-M expanded the original 9 clinical features to 91. Empirical validation with machine learning models (including Decision Trees, Random Forests, XGBoost, and TabNET) demonstrated a 16.5% improvement in F1 score and a 25% increase in Precision and Recall. DALL-M bridges an important gap in clinical data augmentation by preserving data integrity while enhancing predictive modeling in healthcare. Our results show that integrating LLM-generated synthetic features significantly improves model performance, making DALL-M a scalable and practical approach for AI-driven medical diagnostics.

AI Oct 21, 2024 Code
PROMPTHEUS: A Human-Centered Pipeline to Streamline SLRs with LLMs

João Pedro Fernandes Torres, Catherine Mulligan, Joaquim Jorge et al.

The growing volume of academic publications poses significant challenges for researchers conducting timely and accurate Systematic Literature Reviews, particularly in fast-evolving fields like artificial intelligence. This growth of academic literature also makes it increasingly difficult for lay people to access scientific knowledge effectively, meaning academic literature is often misrepresented in the popular press and, more broadly, in society. Traditional SLR methods are labor-intensive and error-prone, and they struggle to keep up with the rapid pace of new research. To address these issues, we developed PROMPTHEUS: an AI-driven pipeline solution that automates the SLR process using Large Language Models. We aimed to enhance efficiency by reducing the manual workload while maintaining the precision and coherence required for comprehensive literature synthesis. PROMPTHEUS automates key stages of the SLR process, including systematic search, data extraction, topic modeling using BERTopic, and summarization with transformer models. Evaluations conducted across five research domains demonstrate that PROMPTHEUS reduces review time, achieves high precision, and provides coherent topic organization, offering a scalable and effective solution for conducting literature reviews in an increasingly crowded research landscape. In addition, such tools may reduce the increasing mistrust in science by making summarization more accessible to laypeople. The code for this project can be found on the GitHub repository at https://github.com/joaopftorres/PROMPTHEUS.git

CV Jan 21
FlowSSC: Universal Generative Monocular Semantic Scene Completion via One-Step Latent Diffusion

Zichen Xi, Hao-Xiang Chen, Nan Xue et al.

Semantic Scene Completion (SSC) from monocular RGB images is a fundamental yet challenging task due to the inherent ambiguity of inferring occluded 3D geometry from a single view. While feed-forward methods have made progress, they often struggle to generate plausible details in occluded regions and preserve the fundamental spatial relationships of objects. Such accurate generative reasoning capability for the entire 3D space is critical in real-world applications. In this paper, we present FlowSSC, the first generative framework applied directly to monocular semantic scene completion. FlowSSC treats the SSC task as a conditional generation problem and can seamlessly integrate with existing feed-forward SSC methods to significantly boost their performance. To achieve real-time inference without compromising quality, we introduce Shortcut Flow-matching that operates in a compact triplane latent space. Unlike standard diffusion models that require hundreds of steps, our method utilizes a shortcut mechanism to achieve high-fidelity generation in a single step, enabling practical deployment in autonomous systems. Extensive experiments on SemanticKITTI demonstrate that FlowSSC achieves state-of-the-art performance, significantly outperforming existing baselines.

CV Jun 5, 2024
SelfReDepth: Self-Supervised Real-Time Depth Restoration for Consumer-Grade Sensors

Alexandre Duarte, Francisco Fernandes, João M. Pereira et al.

Depth maps produced by consumer-grade sensors suffer from inaccurate measurements and missing data from either system or scene-specific sources. Data-driven denoising algorithms can mitigate such problems. However, they require vast amounts of ground truth depth data. Recent research has tackled this limitation using self-supervised learning techniques, but it requires multiple RGB-D sensors. Moreover, most existing approaches focus on denoising single isolated depth maps or specific subjects of interest, highlighting a need for methods to effectively denoise depth maps in real-time dynamic environments. This paper extends state-of-the-art approaches to depth denoising for commodity depth devices, proposing SelfReDepth, a self-supervised deep learning technique for depth restoration, via denoising and hole-filling by inpainting full-depth maps captured with RGB-D sensors. The algorithm targets depth data in video streams, utilizing multiple sequential depth frames coupled with color data to achieve high-quality depth videos with temporal coherence. Finally, SelfReDepth is designed to be compatible with various RGB-D sensors and usable in real-time scenarios as a pre-processing step before applying other depth-dependent algorithms. Our results demonstrate our approach's real-time performance on real-world datasets. They show that it outperforms state-of-the-art methods in denoising and restoration while running at over 30 fps on commercial depth cameras, with potential benefits for augmented and mixed-reality applications.

HC Jan 8, 2022
Controlling camera movement in VR colonography

Soraia F. Paulo, Daniel Medeiros, Daniel Lopes et al.

Immersive Colonography allows medical professionals to navigate inside the intricate tubular geometries of subject-specific 3D colon images using Virtual Reality displays. Typically, camera travel is performed via Fly-Through or Fly-Over techniques that enable semi-automatic traveling through a constrained, well-defined path at user-controlled speeds. However, Fly-Through is known to limit the visibility of lesions located behind or inside haustral folds. At the same time, Fly-Over requires splitting the entire colon visualization into two specific halves. In this paper, we study the effect of immersive Fly-Through and Fly-Over techniques on lesion detection and introduce a camera travel technique that maintains a fixed camera orientation throughout the entire medial axis path. While these techniques have been studied in non-VR desktop environments, their performance is not well understood in VR setups. We performed a comparative study to ascertain which camera travel technique is more appropriate for constrained path navigation in Immersive Colonography and validated our conclusions with two radiologists. To this end, we asked 18 participants to navigate inside a 3D colon to find specific marks. Our results suggest that the Fly-Over technique may lead to enhanced lesion detection at the cost of higher task completion times. Nevertheless, the Fly-Through method may offer a more balanced trade-off between speed and effectiveness, whereas the fixed camera orientation technique provided seemingly inferior performance results. Our study further provides design guidelines and informs future work.

AI Mar 7, 2021
Counterfactuals and Causability in Explainable Artificial Intelligence: Theory, Algorithms, and Applications

Yu-Liang Chou, Catarina Moreira, Peter Bruza et al.

There has been a growing interest in model-agnostic methods that can make deep learning models more transparent and explainable to a user. Some researchers recently argued that for a machine to achieve a certain degree of human-level explainability, this machine needs to provide causally understandable explanations to humans, also known as causability. A specific class of algorithms that have the potential to provide causability are counterfactuals. This paper presents an in-depth systematic review of the diverse existing body of literature on counterfactuals and causability for explainable artificial intelligence. We performed an LDA topic modelling analysis under a PRISMA framework to find the most relevant literature articles. This analysis resulted in a novel taxonomy that considers the grounding theories of the surveyed algorithms, together with their underlying properties and applications in real-world data. This research suggests that current model-agnostic counterfactual algorithms for explainable AI are not grounded on a causal theoretical formalism and, consequently, cannot promote causability to a human decision-maker. Our findings suggest that the explanations derived from major algorithms in the literature provide spurious correlations rather than cause-effect relationships, leading to sub-optimal, erroneous or even biased explanations. This paper also advances the literature with new directions and challenges on promoting causability in model-agnostic approaches for explainable artificial intelligence.

HC Oct 15, 2020
Camera Travel for Immersive Colonography

Soraia F. Paulo, Daniel Medeiros, Pedro Borges et al.

Immersive Colonography allows medical professionals to navigate inside the intricate tubular geometries of subject-specific 3D colon images using Virtual Reality displays. Typically, camera travel is performed via Fly-Through or Fly-Over techniques that enable semi-automatic traveling through a constrained, well-defined path at user-controlled speeds. However, Fly-Through is known to limit the visibility of lesions located behind or inside haustral folds, while Fly-Over requires splitting the entire colon visualization into two specific halves. In this paper, we study the effect of immersive Fly-Through and Fly-Over techniques on lesion detection, and introduce a camera travel technique that maintains a fixed camera orientation throughout the entire medial axis path. While these techniques have been studied in non-VR desktop environments, their performance is not yet well understood in VR setups. We performed a comparative study to ascertain which camera travel technique is more appropriate for constrained path navigation in Immersive Colonography. To this end, we asked 18 participants to navigate inside a 3D colon to find specific marks. Our results suggest that the Fly-Over technique may lead to enhanced lesion detection at the cost of higher task completion times. The Fly-Through method may offer a more balanced trade-off between speed and effectiveness, whereas the fixed camera orientation technique provided seemingly inferior performance results. Our study further provides design guidelines and informs future work.

HC Nov 29, 2019
Safe Walking In VR using Augmented Virtuality

Maurício Sousa, Daniel Mendes, Joaquim Jorge

New technologies allow ordinary people to access Virtual Reality at affordable prices in their homes. One of the most important tasks when interacting with immersive Virtual Reality is to navigate the virtual environments (VEs). Arguably, the best methods to accomplish this make use of direct control interfaces. Among those, natural walking (NW) makes for an enjoyable user experience. However, common techniques to support direct control interfaces in VEs feature constraints that make it difficult to use those methods in cramped home environments. Indeed, NW requires unobstructed and open space. To approach this problem, we propose a new virtual locomotion technique, Combined Walking in Place (CWIP). CWIP allows people to take advantage of the available physical space and empowers them to use NW to navigate in the virtual world. For longer distances, we adopt Walking in Place (WIP) to enable them to move in the virtual world beyond the confines of a cramped real room. However, roaming in an immersive alternate reality while moving in the confines of a cluttered environment can lead people to stumble and fall. To approach these problems, we developed Augmented Virtual Reality (AVR) to inform users about real-world hazards, such as chairs, drawers, and walls, via proxies and signs placed in the virtual world. We thus propose CWIP-AVR as a way to safely explore VR in the cramped confines of your own home. To our knowledge, this is the first approach to combine different locomotion modalities in a safe manner. We evaluated it in a user study with 20 participants to validate their ability to navigate a virtual world while walking in a confined and cluttered real space. Our results show that CWIP-AVR allows people to navigate VR safely, switching between locomotion modes flexibly while maintaining a good immersion.
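
The switching between natural walking and walking in place can be pictured as a simple rule on how close the user is to the edge of the free physical space. The abstract does not say how CWIP decides when to switch, so the rectangular room model, the safety margin, and the function names below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Room:
    """Axis-aligned free floor area in metres (an assumed simplification)."""
    width: float
    depth: float


def distance_to_edge(room: Room, x: float, y: float) -> float:
    """Shortest distance from (x, y) to the room boundary; negative if outside."""
    return min(x, room.width - x, y, room.depth - y)


def locomotion_mode(room: Room, x: float, y: float, margin: float = 0.5) -> str:
    """Pick a locomotion mode from the user's tracked floor position.

    Returns "NW" (natural walking) while there is space to step, and
    "WIP" (walking in place) once the user is within `margin` of an edge.
    The threshold rule is an assumption made for this sketch.
    """
    return "NW" if distance_to_edge(room, x, y) > margin else "WIP"


room = Room(width=2.0, depth=2.0)
centre_mode = locomotion_mode(room, 1.0, 1.0)  # middle of the room
edge_mode = locomotion_mode(room, 1.8, 1.0)    # close to one wall
```

A real system would also need hysteresis or smoothing so that tracking jitter near the threshold does not cause rapid mode flips.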

HC Oct 8, 2019
Negative Space: Workspace Awareness in 3D Face-to-Face Remote Collaboration

Maurício Sousa, Daniel Mendes, Rafael Kuffner dos Anjos et al.

Face-to-face telepresence promotes the sense of "being there" and can improve collaboration by allowing immediate understanding of remote people's nonverbal cues. Several approaches successfully explored interactions with 2D content using a see-through whiteboard metaphor. However, with 3D content, there is a decrease in awareness due to ambiguities arising from participants' opposing points of view. In this paper, we investigate how people and content should be presented for discussing 3D renderings within face-to-face collaborative sessions. To this end, we performed a user evaluation to compare four different conditions, in which we varied reflections of both workspace and remote people representation. Results suggest potentially more benefits to remote collaboration from workspace consistency rather than people's representation fidelity. We contribute a novel design space, the Negative Space, for remote face-to-face collaboration focusing on 3D content.

HC Feb 5, 2014
Understanding Individual Differences: Towards Effective Mobile Interface Design and Adaptation for the Blind

Tiago Guerreiro, Hugo Nicolau, João Oliveira et al.

No two people are alike. We usually ignore this diversity as we have the capability to adapt and, without noticing, become experts in interfaces that were probably poorly adjusted to begin with. This adaptation is not always within the user's reach. One neglected group is the blind. Spatial ability, memory, and tactile sensitivity are some characteristics that diverge between users. Regardless, all are presented with the same methods, ignoring their capabilities and needs. Interaction with mobile devices is highly visually demanding, which widens the gap between blind people. Our research goal is to identify the individual attributes that influence mobile interaction, considering the blind, and match them with mobile interaction modalities in a comprehensive and extensible design space. We aim to provide knowledge for device design, device prescription, and interface adaptation.