Manfred Huber

LG
h-index: 3
14 papers
111 citations
Novelty: 49%
AI Score: 44

14 Papers

IV May 20, 2022
A SSIM Guided cGAN Architecture For Clinically Driven Generative Image Synthesis of Multiplexed Spatial Proteomics Channels

Jillur Rahman Saurav, Mohammad Sadegh Nasr, Paul Koomey et al.

Here we present a structural similarity index measure (SSIM) guided conditional Generative Adversarial Network (cGAN) that performs image-to-image (i2i) synthesis to generate photo-accurate protein channels in multiplexed spatial proteomics images. This approach can be used to accurately generate missing spatial proteomics channels that were not included during experimental data collection either at the bench or the clinic. Experimental spatial proteomic data from the Human BioMolecular Atlas Program (HuBMAP) were used to generate spatial representations of missing proteins through a U-Net based image synthesis pipeline. HuBMAP channels were hierarchically clustered by SSIM as a heuristic to obtain the minimal set needed to recapitulate the underlying biology represented by the spatial landscape of proteins. We subsequently prove that our SSIM based architecture allows for scaling of generative image synthesis to slides with up to 100 channels, which is better than current state of the art algorithms, which are limited to data with 11 channels. We validate these claims by generating a new experimental spatial proteomics data set from human lung adenocarcinoma tissue sections and show that a model trained on HuBMAP can accurately synthesize channels from our new data set. The ability to recapitulate experimental data from sparsely stained multiplexed histological slides containing spatial proteomic data will have tremendous impact on medical diagnostics and drug development, and also raises important questions on the medical ethics of utilizing data produced by generative image synthesis in the clinical setting. The algorithm that we present in this paper will allow researchers and clinicians to save time and costs in proteomics based histological staining while also increasing the amount of data that they can generate through their experiments.
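The abstract above describes clustering channels by SSIM. As a rough illustration only, the sketch below builds the pairwise distance matrix (1 - SSIM) that a hierarchical clustering step could consume. It is not the authors' pipeline: it uses a single-window (global) SSIM instead of the usual sliding-window version, and the function names `global_ssim` and `ssim_distance_matrix` as well as the `data_range` parameter are hypothetical.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two equally sized images (simplification)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standard stabilizing constants from the SSIM definition
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

def ssim_distance_matrix(channels, data_range=1.0):
    """Symmetric matrix of 1 - SSIM for every pair of channels."""
    n = len(channels)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - global_ssim(channels[i], channels[j], data_range)
            dist[i, j] = dist[j, i] = d
    return dist
```

Channels with a small mutual distance are structurally redundant, so a clustering step over this matrix could keep one representative per cluster as the minimal channel set.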

LG Aug 6, 2023
Weakly Supervised Multi-Task Representation Learning for Human Activity Analysis Using Wearables

Taoran Sheng, Manfred Huber

Sensor data streams from wearable devices and smart environments are widely studied in areas like human activity recognition (HAR), person identification, or health monitoring. However, most of the previous works in activity and sensor stream analysis have been focusing on one aspect of the data, e.g. only recognizing the type of the activity or only identifying the person who performed the activity. We instead propose an approach that uses a weakly supervised multi-output siamese network that learns to map the data into multiple representation spaces, where each representation space focuses on one aspect of the data. The representation vectors of the data samples are positioned in the space such that the data with the same semantic meaning in that aspect are closely located to each other. Therefore, as demonstrated with a set of experiments, the trained model can provide metrics for clustering data based on multiple aspects, allowing it to address multiple tasks simultaneously and even to outperform single task supervised methods in many situations. In addition, further experiments are presented that in more detail analyze the effect of the architecture and of using multiple tasks within this framework, that investigate the scalability of the model to include additional tasks, and that demonstrate the ability of the framework to combine data for which only partial relationship information with respect to the target tasks is available.

HC Jul 18, 2023
Siamese Networks for Weakly Supervised Human Activity Recognition

Taoran Sheng, Manfred Huber

Deep learning has been successfully applied to human activity recognition. However, training deep neural networks requires explicitly labeled data which is difficult to acquire. In this paper, we present a model with multiple siamese networks that are trained by using only the information about the similarity between pairs of data samples without knowing the explicit labels. The trained model maps the activity data samples into fixed size representation vectors such that the distance between the vectors in the representation space approximates the similarity of the data samples in the input space. Thus, the trained model can work as a metric for a wide range of different clustering algorithms. The training process minimizes a similarity loss function that forces the distance metric to be small for pairs of samples from the same kind of activity, and large for pairs of samples from different kinds of activities. We evaluate the model on three datasets to verify its effectiveness in segmentation and recognition of continuous human activity sequences.
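The similarity loss described above (small distance for pairs from the same activity, large distance for pairs from different activities) resembles the classic contrastive loss. The sketch below is a minimal NumPy version under that assumption; the paper's exact loss may differ, and the `margin` value and the function name are hypothetical.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Mean contrastive loss over a batch of embedding pairs.

    emb_a, emb_b: arrays of shape (n_pairs, dim), the two branch outputs
    same: array of shape (n_pairs,), 1 if the pair shares an activity, else 0
    margin: hypothetical hyperparameter, not taken from the paper
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    same = np.asarray(same, dtype=float)
    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    # Similar pairs are penalized by their squared distance
    pull = same * dist ** 2
    # Dissimilar pairs are penalized only while closer than the margin
    push = (1.0 - same) * np.maximum(0.0, margin - dist) ** 2
    return float(np.mean(pull + push))
```

Only pairwise similar/dissimilar flags enter the loss, which is why no explicit activity labels are needed during training.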

LG Jul 21, 2023
Unsupervised Embedding Learning for Human Activity Recognition Using Wearable Sensor Data

Taoran Sheng, Manfred Huber

The embedded sensors in widely used smartphones and other wearable devices make the data of human activities more accessible. However, recognizing different human activities from the wearable sensor data remains a challenging research problem in ubiquitous computing. One of the reasons is that the majority of the acquired data has no labels. In this paper, we present an unsupervised approach, which is based on the nature of human activity, to project the human activities into an embedding space in which similar activities will be located closely together. Using this, subsequent clustering algorithms can benefit from the embeddings, forming behavior clusters that represent the distinct activities performed by a person. Results of experiments on three labeled benchmark datasets demonstrate the effectiveness of the framework and show that our approach can help the clustering algorithm achieve improved performance in identifying and categorizing the underlying human activities compared to unsupervised techniques applied directly to the original data set.

SP Jul 29, 2024
Consistency Based Weakly Self-Supervised Learning for Human Activity Recognition with Wearables

Taoran Sheng, Manfred Huber

While the widely available embedded sensors in smartphones and other wearable devices make it easier to obtain data of human activities, recognizing different types of human activities from sensor-based data remains a difficult research topic in ubiquitous computing. One reason for this is that most of the collected data is unlabeled. However, many current human activity recognition (HAR) systems are based on supervised methods, which heavily rely on the labels of the data. We describe a weakly self-supervised approach in this paper that consists of two stages: (1) In stage one, the model learns from the nature of human activities by projecting the data into an embedding space where similar activities are grouped together; (2) In stage two, the model is fine-tuned in a few-shot learning fashion using the similarity information of the data. This allows downstream classification or clustering tasks to benefit from the embeddings. Experiments on three benchmark datasets demonstrate the framework's effectiveness and show that our approach can help the clustering algorithm achieve performance in identifying and categorizing the underlying human activities comparable to that of purely supervised techniques applied directly to a corresponding fully labeled data set.

IV Mar 23, 2023
Clinically Relevant Latent Space Embedding of Cancer Histopathology Slides through Variational Autoencoder Based Image Compression

Mohammad Sadegh Nasr, Amir Hajighasemi, Paul Koomey et al.

In this paper, we introduce a Variational Autoencoder (VAE) based training approach that can compress and decompress cancer pathology slides at a compression ratio of 1:512, which is better than the previously reported state of the art (SOTA) in the literature, while still maintaining accuracy in clinical validation tasks. The compression approach was tested on more common computer vision datasets such as CIFAR10, and we explore which image characteristics enable this compression ratio on cancer imaging data but not generic images. We generate and visualize embeddings from the compressed latent space and demonstrate how they are useful for clinical interpretation of data, and how in the future such latent embeddings can be used to accelerate search of clinical imaging data.

LG Aug 9, 2022
Generalized Reinforcement Learning: Experience Particles, Action Operator, Reinforcement Field, Memory Association, and Decision Concepts

Po-Hsiang Chiu, Manfred Huber

Learning a control policy capable of adapting to time-varying and potentially evolving system dynamics has been a great challenge for mainstream reinforcement learning (RL). Mainly, the ever-changing system properties would continuously affect how the RL agent interacts with the state space through its actions, which effectively (re-)introduces concept drifts to the underlying policy learning process. We postulated that higher adaptability for the control policy can be achieved by characterizing and representing actions with extra "degrees of freedom", thereby adjusting with greater flexibility to variations in the actions' "behavioral" outcomes, including how these actions get carried out in real time and the shift in the action set itself. This paper proposes a Bayesian-flavored generalized RL framework by first establishing the notion of a parametric action model to better cope with uncertainty and fluid action behaviors, followed by introducing the notion of a reinforcement field as a physics-inspired construct established through "polarized experience particles" maintained in the RL agent's working memory. These particles effectively encode the agent's dynamic learning experience that evolves over time in a self-organizing way. Using the reinforcement field as a substrate, we will further generalize the policy search to incorporate high-level decision concepts by viewing the past memory as an implicit graph structure, in which the memory instances, or particles, are interconnected with their degrees of associability/similarity defined and quantified such that the "associative memory" principle can be consistently applied to establish and augment the learning agent's evolving world model.

IV Jun 29, 2023
Histopathology Slide Indexing and Search: Are We There Yet?

Helen H. Shang, Mohammad Sadegh Nasr, Jai Prakash Veerla et al.

The search and retrieval of digital histopathology slides is an important task that has yet to be solved. In this case study, we investigate the clinical readiness of three state-of-the-art histopathology slide search engines, Yottixel, SISH, and RetCCL, on three patients with solid tumors. We provide a qualitative assessment of each model's performance in providing retrieval results that are reliable and useful to pathologists. We found that all three image search engines fail to produce consistently reliable results and have difficulties in capturing granular and subtle features of malignancy, limiting their diagnostic accuracy. Based on our findings, we also propose a minimal set of requirements to further advance the development of accurate and reliable histopathology image search engines for successful clinical adoption.

RO Dec 15, 2024
Volumetric Mapping with Panoptic Refinement via Kernel Density Estimation for Mobile Robots

Khang Nguyen, Tuan Dang, Manfred Huber

Reconstructing three-dimensional (3D) scenes with semantic understanding is vital in many robotic applications. Robots need to identify objects, along with their positions and shapes, to manipulate them precisely for given tasks. Mobile robots, especially, usually use lightweight networks to segment objects on RGB images and then localize them via depth maps; however, they often encounter out-of-distribution scenarios where masks over-cover the objects. In this paper, we address the problem of panoptic segmentation quality in 3D scene reconstruction by refining segmentation errors using non-parametric statistical methods. To enhance mask precision, we map the predicted masks into a depth frame to estimate their distribution via kernel densities. The outliers in depth perception are then rejected adaptively, without the need for additional parameters, in out-of-distribution scenarios, followed by 3D reconstruction using projective signed distance functions (SDFs). We validate our method on a synthetic dataset, which shows improvements in both quantitative and qualitative results for panoptic mapping. Through real-world testing, the results furthermore show our method's capability to be deployed on a real-robot system. Our source code is available at: https://github.com/mkhangg/refined panoptic mapping.
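To make the depth-outlier idea above concrete, here is a minimal sketch of kernel density estimation over the depth values under one predicted mask, keeping only pixels in the dense part of the distribution. It is a simplification and not the authors' implementation: the paper states that its rejection step needs no additional parameters, whereas this sketch uses a hypothetical `density_ratio` cutoff and a normal-reference rule-of-thumb bandwidth.

```python
import numpy as np

def kde_inlier_mask(depths, density_ratio=0.2):
    """Return a boolean array marking depth values that lie in dense regions.

    depths: 1D array of depth values under one predicted mask (at least 2)
    density_ratio: hypothetical cutoff relative to the peak density
    """
    d = np.asarray(depths, dtype=float)
    n = d.size
    # Normal-reference rule of thumb for the Gaussian kernel bandwidth
    h = max(1.06 * d.std(ddof=1) * n ** (-1 / 5), 1e-6)
    # Gaussian KDE evaluated at every sample (O(n^2) pairwise differences)
    diffs = (d[:, None] - d[None, :]) / h
    density = np.exp(-0.5 * diffs ** 2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    return density >= density_ratio * density.max()
```

Pixels flagged `False` would be dropped from the mask before projecting it into 3D, so background depth that the mask over-covers does not leak into the object.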

LG Dec 15, 2025
Reducing Label Dependency in Human Activity Recognition with Wearables: From Supervised Learning to Novel Weakly Self-Supervised Approaches

Taoran Sheng, Manfred Huber

Human activity recognition (HAR) using wearable sensors has advanced through various machine learning paradigms, each with inherent trade-offs between performance and labeling requirements. While fully supervised techniques achieve high accuracy, they demand extensive labeled datasets that are costly to obtain. Conversely, unsupervised methods eliminate labeling needs but often deliver suboptimal performance. This paper presents a comprehensive investigation across the supervision spectrum for wearable-based HAR, with particular focus on novel approaches that minimize labeling requirements while maintaining competitive accuracy. We develop and empirically compare: (1) traditional fully supervised learning, (2) basic unsupervised learning, (3) a weakly supervised learning approach with constraints, (4) a multi-task learning approach with knowledge sharing, (5) a self-supervised approach based on domain expertise, and (6) a novel weakly self-supervised learning framework that leverages domain knowledge and minimal labeled data. Experiments across benchmark datasets demonstrate that: (i) our weakly supervised methods achieve performance comparable to fully supervised approaches while significantly reducing supervision requirements; (ii) the proposed multi-task framework enhances performance through knowledge sharing between related tasks; (iii) our weakly self-supervised approach demonstrates remarkable efficiency with just 10% of labeled data. These results not only highlight the complementary strengths of different learning paradigms, offering insights into tailoring HAR solutions based on the availability of labeled data, but also establish that our novel weakly self-supervised framework offers a promising solution for practical HAR applications where labeled data are limited.

RO Sep 5, 2025
DeGuV: Depth-Guided Visual Reinforcement Learning for Generalization and Interpretability in Manipulation

Tien Pham, Xinyun Chi, Khang Nguyen et al.

Reinforcement learning (RL) agents can learn to solve complex tasks from visual inputs, but generalizing these learned skills to new environments remains a major challenge in RL applications, especially robotics. While data augmentation can improve generalization, it often compromises sample efficiency and training stability. This paper introduces DeGuV, an RL framework that enhances both generalization and sample efficiency. Specifically, we leverage a learnable masker network that produces a mask from the depth input, preserving only critical visual information while discarding irrelevant pixels. Through this, we ensure that our RL agents focus on essential features, improving robustness under data augmentation. In addition, we incorporate contrastive learning and stabilize Q-value estimation under augmentation to further enhance sample efficiency and training stability. We evaluate our proposed method on the RL-ViGen benchmark using the Franka Emika robot and demonstrate its effectiveness in zero-shot sim-to-real transfer. Our results show that DeGuV outperforms state-of-the-art methods in both generalization and sample efficiency while also improving interpretability by highlighting the most relevant regions in the visual input.

LG Aug 19, 2025
Categorical Policies: Multimodal Policy Learning and Exploration in Continuous Control

SM Mazharul Islam, Manfred Huber

A policy in deep reinforcement learning (RL), either deterministic or stochastic, is commonly parameterized as a Gaussian distribution alone, limiting the learned behavior to be unimodal. However, the nature of many practical decision-making problems favors a multimodal policy that facilitates robust exploration of the environment and thus helps address learning challenges arising from sparse rewards, complex dynamics, or the need for strategic adaptation to varying contexts. This issue is exacerbated in continuous control domains where exploration usually takes place in the vicinity of the predicted optimal action, either through additive Gaussian noise or the sampling process of a stochastic policy. In this paper, we introduce Categorical Policies to model multimodal behavior modes with an intermediate categorical distribution, and then generate an output action that is conditioned on the sampled mode. We explore two sampling schemes that ensure differentiable discrete latent structure while maintaining efficient gradient-based optimization. By utilizing a latent categorical distribution to select the behavior mode, our approach naturally expresses multimodality while remaining fully differentiable via the sampling tricks. We evaluate our multimodal policy on a set of DeepMind Control Suite environments, demonstrating that through better exploration, our learned policies converge faster and outperform standard Gaussian policies. Our results indicate that the Categorical distribution serves as a powerful tool for structured exploration and multimodal behavior representation in continuous control.
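The abstract does not name its two sampling schemes. As an illustration of the general idea only, the sketch below assumes a Gumbel-softmax relaxation, a widely used way to draw approximately one-hot samples from a categorical distribution while keeping the operation differentiable in an autodiff framework. It shows the forward pass in NumPy; `gumbel_softmax`, `sample_action`, the per-mode means and standard deviations, and the `temperature` value are all hypothetical.

```python
import numpy as np

def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample from a categorical distribution over modes."""
    u = np.clip(rng.uniform(size=logits.shape), 1e-12, 1.0 - 1e-12)
    gumbel = -np.log(-np.log(u))          # standard Gumbel noise
    z = (logits + gumbel) / temperature
    e = np.exp(z - z.max())               # subtract max for numerical stability
    return e / e.sum()

def sample_action(mode_logits, mode_means, mode_stds, rng, temperature=0.5):
    """Select a behavior mode, then draw a Gaussian action conditioned on it.

    mode_logits: shape (n_modes,)
    mode_means, mode_stds: shape (n_modes, action_dim)
    """
    weights = gumbel_softmax(mode_logits, temperature, rng)
    mean = weights @ mode_means           # soft selection of the mode's mean
    std = weights @ mode_stds
    action = mean + std * rng.standard_normal(mean.shape)
    return action, weights
```

As the temperature approaches zero the weights approach a hard one-hot choice, so each sampled action is drawn from a single mode and the overall action distribution becomes multimodal.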

LG Nov 5, 2021
Increasing Fairness in Predictions Using Bias Parity Score Based Loss Function Regularization

Bhanu Jain, Manfred Huber, Ramez Elmasri

Increasing utilization of machine learning based decision support systems emphasizes the need for resulting predictions to be both accurate and fair to all stakeholders. In this work we present a novel approach to increase a Neural Network model's fairness during training. We introduce a family of fairness enhancing regularization components that we use in conjunction with the traditional binary-cross-entropy based accuracy loss. These loss functions are based on Bias Parity Score (BPS), a score that helps quantify bias in the models with a single number. In the current work we investigate the behavior and effect of these regularization components on bias. We deploy them in the context of a recidivism prediction task as well as on a census-based adult income dataset. The results demonstrate that with a good choice of fairness loss function we can reduce the trained model's bias without deteriorating accuracy, even on unbalanced datasets.

RO Mar 4, 2021
Learning the Next Best View for 3D Point Clouds via Topological Features

Christopher Collander, William J. Beksi, Manfred Huber

In this paper, we introduce a reinforcement learning approach utilizing a novel topology-based information gain metric for directing the next best view of a noisy 3D sensor. The metric combines the disjoint sections of an observed surface to focus on high-detail features such as holes and concave sections. Experimental results show that our approach can aid in establishing the placement of a robotic sensor to optimize the information provided by its streaming point cloud data. Furthermore, a labeled dataset of 3D objects, a CAD design for a custom robotic manipulator, and software for the transformation, union, and registration of point clouds have been publicly released to the research community.