33.7 | CV | May 30
CodeCytos: AI-assisted spatial molecular imaging analysis via code-augmented agent action space
Hung Q. Vo, Huy Q. Vo, Son T. Ly et al.
Conventional tissue image analysis software provides foundational capabilities for cellular analysis, including segmentation, basic morphological feature extraction, and spatial organization analysis. However, these tools often require manual intervention and are not well integrated with code-driven automation, limiting efficiency and scalability for complex spatial tissue studies. In addition, they offer limited flexibility for custom analyses, as they typically support only a fixed set of pre-implemented spatial cellular features. To address these limitations, we propose CodeCytos, a coding-based reasoning agent framework that enables dynamic, programmable interaction with spatial molecular imaging data to improve automation and customization. CodeCytos is designed to streamline the exploration of custom spatial cellular features and adapt to diverse research needs. We demonstrate its utility through case studies on four expert-curated datasets from distinct tissue types: frontal cortex, non-small-cell lung cancer, pancreas, and tonsil. We evaluate CodeCytos under a realistic minimal prompt setting, where bioscientists pose simple questions without task-specific instructions or contextual information about spatial cellular analysis, and benchmark multiple LLM backbones with strong coding capabilities. We further show that incorporating tailored, domain-agnostic few-shot in-context coding-reasoning examples (randomly sampled demonstrations outside the spatial analysis domain) can substantially improve performance without requiring costly, expert-crafted in-domain demonstrations. Overall, CodeCytos outperforms baseline approaches, highlighting the potential of code-action agents to assist with custom feature exploration in spatial molecular imaging and to accelerate biomarker discovery.
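To make the idea of a "custom spatial cellular feature" concrete, the following is a minimal sketch of the kind of short program a code-action agent might write in response to a simple question such as "how close are tumor cells to immune cells?". It is an illustration only, not code from CodeCytos: the function name, the `(x, y, cell_type)` tuple layout, and the cell-type labels are all assumptions made for this example.

```python
import math

def mean_nearest_neighbor_distance(cells, source_type, target_type):
    """Average distance from each `source_type` cell to its nearest `target_type` cell.

    `cells` is a list of (x, y, cell_type) tuples; this layout is a
    hypothetical stand-in for segmented cell centroids with type labels.
    Returns None if either cell type is absent.
    """
    sources = [(x, y) for x, y, t in cells if t == source_type]
    targets = [(x, y) for x, y, t in cells if t == target_type]
    if not sources or not targets:
        return None
    # For every source cell, find the closest target cell, then average.
    nearest = [min(math.dist(s, t) for t in targets) for s in sources]
    return sum(nearest) / len(nearest)

# Toy data (made up for illustration): two tumor cells, two immune cells.
cells = [
    (0.0, 0.0, "tumor"),
    (3.0, 4.0, "immune"),
    (10.0, 0.0, "tumor"),
    (10.0, 1.0, "immune"),
]
result = mean_nearest_neighbor_distance(cells, "tumor", "immune")
```

A feature like this is not part of a fixed, pre-implemented menu; it is written on demand, which is the flexibility the abstract attributes to a code-augmented action space.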
CV | Nov 25, 2023
Segmentation of diagnostic tissue compartments on whole slide images with renal thrombotic microangiopathies (TMAs)
Huy Q. Vo, Pietro A. Cicalese, Surya Seshan et al.
The thrombotic microangiopathies (TMAs) manifest in renal biopsy histology with a broad spectrum of acute and chronic findings. Precise diagnostic criteria for a renal biopsy diagnosis of TMA are missing. As a first step towards a machine learning- and computer vision-based analysis of whole slide images from renal biopsies, we trained a segmentation model for the decisive diagnostic kidney tissue compartments (artery, arteriole, and glomerulus) on a set of whole slide images from renal biopsies with TMAs and mimickers (distinct diseases with a nephropathological appearance similar to TMA, such as severe benign nephrosclerosis, various vasculitides, Bevacizumab-plug glomerulopathy, and arteriolar light chain deposition disease). Our segmentation model combines U-Net-based tissue detection with a shifted-window (Swin) transformer architecture to reach excellent segmentation results for even the most severely altered glomeruli, arterioles, and arteries, including on unseen staining domains from a different nephropathology lab. With accurate automatic segmentation of the decisive renal biopsy compartments in human renal vasculopathies, we have laid the foundation for large-scale compartment-specific machine learning and computer vision analysis of renal biopsy repositories with TMAs.
CV | Sep 7, 2021 | Code
Fine-grained Hand Gesture Recognition in Multi-viewpoint Hand Hygiene
Huy Q. Vo, Tuong Do, Vi C. Pham et al.
This paper contributes a new high-quality dataset for hand gesture recognition in hand hygiene systems, named "MFH". Current datasets generally do not focus on (i) fine-grained actions or (ii) data mismatch between different viewpoints, both of which arise in realistic settings. To address these issues, the MFH dataset contains a total of 731,147 samples captured from different camera views in 6 non-overlapping locations. Each sample belongs to one of the seven steps introduced by the World Health Organization (WHO). As a minor contribution, inspired by advances in fine-grained image recognition and distribution adaptation, this paper recommends a self-supervised learning method to handle these problems. Extensive experiments on the MFH benchmark show that the introduced method yields competitive performance in both Accuracy and Macro F1-score. The code and the MFH dataset are available at https://github.com/willogy-team/hand-gesture-recognition-smc2021.
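The two reported metrics behave differently when some classes are rare, which is why both are worth reporting on a multi-class task like this one. The sketch below shows the standard definitions of Accuracy and Macro F1 in plain Python. It is not taken from the MFH repository; the function names and the toy labels are assumptions for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so every class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels (made up): class 1 dominates and is always right,
# while the two rare classes are confused with each other.
y_true = [1, 1, 1, 1, 2, 3]
y_pred = [1, 1, 1, 1, 3, 2]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

In this toy case the accuracy is 4/6 while the Macro F1 is only 1/3, because the two rare classes each score zero and are weighted the same as the dominant class.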
20.6 | IT | Mar 26
Investigating the Fundamental Limit: A Feasibility Study of Hybrid-Neural Archival
Marcus Armstrong, ZiWei Qiu, Huy Q. Vo et al.
Large Language Models (LLMs) possess a theoretical capability to model information density far beyond the limits of classical statistical methods (e.g., Lempel-Ziv). However, utilizing this capability for lossless compression involves navigating severe system constraints, including non-deterministic hardware and prohibitive computational costs. In this work, we present an exploratory study into the feasibility of LLM-based archival systems. We introduce Hybrid-LLM, a proof-of-concept architecture designed to investigate the "entropic capacity" of foundation models in a storage context. We identify a critical barrier to deployment: the "GPU Butterfly Effect," where microscopic hardware non-determinism precludes data recovery. We resolve this via a novel logit quantization protocol, enabling the rigorous measurement of neural compression rates on real-world data. Our experiments reveal a distinct divergence between "retrieval-based" density (0.39 BPC on memorized literature) and "predictive" density (0.75 BPC on unseen news). While current inference latency (≈2600× slower than Zstd) limits immediate deployment to ultra-cold storage, our findings demonstrate that LLMs successfully capture semantic redundancy inaccessible to classical algorithms, establishing a baseline for future research into semantic file systems.
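Two ideas in this abstract lend themselves to a small illustration: snapping logits to a fixed grid so that tiny floating-point differences between runs do not change the probabilities used for coding, and measuring compression in bits per character (BPC). The sketch below is a simplified stand-in, not the paper's protocol: the step size, function names, and toy numbers are assumptions, and values that fall very close to a grid boundary could still round differently across machines, which a real scheme would have to handle.

```python
import math

def quantize_logits(logits, step=0.01):
    """Map each logit to an integer grid index (assumed step size of 0.01)."""
    return [int(round(x / step)) for x in logits]

def probs_from_quantized(q, step=0.01):
    """Softmax computed from the integer grid indices only."""
    m = max(q)  # subtract the max for numerical stability
    exps = [math.exp((v - m) * step) for v in q]
    total = sum(exps)
    return [e / total for e in exps]

def bits_per_character(token_probs, n_chars):
    """Ideal code length, sum of -log2(p) over coded tokens, per source character."""
    total_bits = -sum(math.log2(p) for p in token_probs)
    return total_bits / n_chars

# Two runs whose logits differ only by tiny noise (toy values).
run_a = [1.2345678, -0.5, 3.1000001]
run_b = [1.2345679, -0.5, 3.1000002]
same_grid = quantize_logits(run_a) == quantize_logits(run_b)

# Two tokens coded with probabilities 0.5 and 0.25, covering 4 characters.
bpc = bits_per_character([0.5, 0.25], n_chars=4)
```

Here both runs land on the same grid indices, so an encoder and a decoder would derive identical probabilities, and the toy BPC works out to 3 bits over 4 characters.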