Masayuki Ikebe

IV
h-index22
10papers
39citations
Novelty41%
AI Score45

10 Papers

IVSep 11, 2024Code
BLS-GAN: A Deep Layer Separation Framework for Eliminating Bone Overlap in Conventional Radiographs

Haolin Wang, Yafei Ou, Prasoon Ambalathankandy et al.

Conventional radiography is the widely used imaging technology in diagnosing, monitoring, and prognosticating musculoskeletal (MSK) diseases because of its easy availability, versatility, and cost-effectiveness. In conventional radiographs, bone overlaps are prevalent, and can impede the accurate assessment of bone characteristics by radiologists or algorithms, posing significant challenges to conventional and computer-aided diagnoses. This work initiated the study of a challenging scenario - bone layer separation in conventional radiographs, in which separate overlapped bone regions enable the independent assessment of the bone characteristics of each bone layer and lay the groundwork for MSK disease diagnosis and its automation. This work proposed a Bone Layer Separation GAN (BLS-GAN) framework that can produce high-quality bone layer images with reasonable bone characteristics and texture. This framework introduced a reconstructor based on conventional radiography imaging principles, which achieved efficient reconstruction and mitigates the recurrent calculations and training instability issues caused by soft tissue in the overlapped regions. Additionally, pre-training with synthetic images was implemented to enhance the stability of both the training process and the results. The generated images passed the visual Turing test, and improved performance in downstream tasks. This work affirms the feasibility of extracting bone layer images from conventional radiographs, which holds promise for leveraging bone layer separation technology to facilitate more comprehensive analytical research in MSK diagnosis, monitoring, and prognosis. Code and dataset: https://github.com/pokeblow/BLS-GAN.git.

87.3CVMar 14Code
Step-CoT: Stepwise Visual Chain-of-Thought for Medical Visual Question Answering

Lin Fan, Yafei Ou, Zhipeng Deng et al.

Chain-of-thought (CoT) reasoning has advanced medical visual question answering (VQA), yet most existing CoT rationales are free-form and fail to capture the structured reasoning process clinicians actually follow. This work asks: Can traceable, multi-step reasoning supervision improve reasoning accuracy and the interpretability of Medical VQA? To this end, we introduce Step-CoT, a large-scale medical reasoning dataset with expert-curated, structured multi-step CoT aligned to clinical diagnostic workflows, implicitly grounding the model's reasoning in radiographic evidence. Step-CoT comprises more than 10K real clinical cases and 70K VQA pairs organized around diagnostic workflows, providing supervised intermediate steps that guide models to follow valid reasoning trajectories. To effectively learn from Step-CoT, we further introduce a teacher-student framework with a dynamic graph-structured focusing mechanism that prioritizes diagnostically informative steps while filtering out less relevant contexts. Our experiments show that using Step-CoT can improve reasoning accuracy and interpretability. Benchmark: github.com/hahaha111111/Step-CoT. Dataset Card: huggingface.co/datasets/fl-15o/Step-CoT

IVApr 27, 2023
A Deep Registration Method for Accurate Quantification of Joint Space Narrowing Progression in Rheumatoid Arthritis

Haolin Wang, Yafei Ou, Wanxuan Fang et al.

Rheumatoid arthritis (RA) is a chronic autoimmune inflammatory disease that results in progressive articular destruction and severe disability. Joint space narrowing (JSN) progression has been regarded as an important indicator for RA progression and has received sustained attention. In the diagnosis and monitoring of RA, radiology plays a crucial role to monitor joint space. A new framework for monitoring joint space by quantifying JSN progression through image registration in radiographic images has been developed. This framework offers the advantage of high accuracy, however, challenges do exist in reducing mismatches and improving reliability. In this work, a deep intra-subject rigid registration network is proposed to automatically quantify JSN progression in the early stage of RA. In our experiments, the mean-square error of Euclidean distance between moving and fixed image is 0.0031, standard deviation is 0.0661 mm, and the mismatching rate is 0.48\%. The proposed method has sub-pixel level accuracy, exceeding manual measurements by far, and is equipped with immune to noise, rotation, and scaling of joints. Moreover, this work provides loss visualization, which can aid radiologists and rheumatologists in assessing quantification reliability, with important implications for possible future clinical applications. As a result, we are optimistic that this proposed work will make a significant contribution to the automatic quantification of JSN progression in RA.

IVMay 19, 2022
A Sub-pixel Accurate Quantification of Joint Space Narrowing Progression in Rheumatoid Arthritis

Yafei Ou, Prasoon Ambalathankandy, Ryunosuke Furuya et al.

Rheumatoid arthritis (RA) is a chronic autoimmune disease that primarily affects peripheral synovial joints, like fingers, wrist and feet. Radiology plays a critical role in the diagnosis and monitoring of RA. Limited by the current spatial resolution of radiographic imaging, joint space narrowing (JSN) progression of RA with the same reason above can be less than one pixel per year with universal spatial resolution. Insensitive monitoring of JSN can hinder the radiologist/rheumatologist from making a proper and timely clinical judgment. In this paper, we propose a novel and sensitive method that we call partial image phase-only correlation which aims to automatically quantify JSN progression in the early stages of RA. The majority of the current literature utilizes the mean error, root-mean-square deviation and standard deviation to report the accuracy at pixel level. Our work measures JSN progression between a baseline and its follow-up finger joint images by using the phase spectrum in the frequency domain. Using this study, the mean error can be reduced to 0.0130mm when applied to phantom radiographs with ground truth, and 0.0519mm standard deviation for clinical radiography. With its sub-pixel accuracy far beyond manual measurement, we are optimistic that our work is promising for automatically quantifying JSN progression.

31.7CVMay 7
RAM-H1200: A Unified Evaluation and Dataset on Hand Radiographs for Rheumatoid Arthritis

Songxiao Yang, Haolin Wang, Yao Fu et al.

Rheumatoid arthritis (RA) assessment from hand radiographs requires multi-level analysis and modeling of anatomical structures and fine-grained local pathological changes. However, existing public resources do not support such unified multi-level analysis, often lacking full-hand coverage, fine-grained annotations, and consistent integration with clinical scoring systems. In particular, annotations that enable quantitative analysis of bone erosion (BE) remain scarce. RAM-H1200 contains 1,200 hand radiographs collected from six medical centers, with multi-level annotations including (i) whole-hand bone structure instance segmentation, (ii) pixel-level BE masks, (iii) SvdH-defined joint regions of interest, and (iv) joint-level SvdH scores for both BE and joint space narrowing (JSN). It is designed to evaluate whether models can jointly capture anatomical structure, localized erosive pathology, and clinically standardized RA severity from hand radiographs. The proposed BE masks enable, for the first time, quantitative BE analysis beyond coarse categorical grading by providing explicit spatial supervision for lesion extent and morphology. To our knowledge, RAM-H1200 is the first public large-scale benchmark that jointly supports whole-hand bone structure instance segmentation, pixel-level BE delineation, and clinically grounded joint-level SvdH scoring for both BE and JSN. Results across benchmark tasks show that anatomical modeling is substantially more mature than quantitative BE analysis: whole-hand bone segmentation achieves strong performance, whereas BE segmentation remains a major open challenge. By unifying anatomical structure modeling, quantitative lesion analysis, and clinically grounded SvdH scoring, RAM-H1200 provides a single benchmark for comprehensive RA analysis on hand radiographs.

CVFeb 7, 2024
A Psychological Study: Importance of Contrast and Luminance in Color to Grayscale Mapping

Prasoon Ambalathankandy, Yafei Ou, Sae Kaneko et al.

Grayscale images are essential in image processing and computer vision tasks. They effectively emphasize luminance and contrast, highlighting important visual features, while also being easily compatible with other algorithms. Moreover, their simplified representation makes them efficient for storage and transmission purposes. While preserving contrast is important for maintaining visual quality, other factors such as preserving information relevant to the specific application or task at hand may be more critical for achieving optimal performance. To evaluate and compare different decolorization algorithms, we designed a psychological experiment. During the experiment, participants were instructed to imagine color images in a hypothetical "colorless world" and select the grayscale image that best resembled their mental visualization. We conducted a comparison between two types of algorithms: (i) perceptual-based simple color space conversion algorithms, and (ii) spatial contrast-based algorithms, including iteration-based methods. Our experimental findings indicate that CIELAB exhibited superior performance on average, providing further evidence for the effectiveness of perception-based decolorization algorithms. On the other hand, the spatial contrast-based algorithms showed relatively poorer performance, possibly due to factors such as DC-offset and artificial contrast generation. However, these algorithms demonstrated shorter selection times. Notably, no single algorithm consistently outperformed the others across all test images. In this paper, we will delve into a comprehensive discussion on the significance of contrast and luminance in color-to-grayscale mapping based on our experimental results and analysis.

CVFeb 9, 2024
Halo Reduction in Display Systems through Smoothed Local Histogram Equalization and Human Visual System Modeling

Prasoon Ambalathankandy, Yafei Ou, Masayuki Ikebe

Halo artifacts significantly impact display quality. We propose a method to reduce halos in Local Histogram Equalization (LHE) algorithms by separately addressing dark and light variants. This approach results in visually natural images by exploring the relationship between lateral inhibition and halo artifacts in the human visual system.

IVFeb 4, 2025
Layer Separation: Adjustable Joint Space Width Images Synthesis in Conventional Radiography

Haolin Wang, Yafei Ou, Prasoon Ambalathankandy et al.

Rheumatoid arthritis (RA) is a chronic autoimmune disease characterized by joint inflammation and progressive structural damage. Joint space width (JSW) is a critical indicator in conventional radiography for evaluating disease progression, which has become a prominent research topic in computer-aided diagnostic (CAD) systems. However, deep learning-based radiological CAD systems for JSW analysis face significant challenges in data quality, including data imbalance, limited variety, and annotation difficulties. This work introduced a challenging image synthesis scenario and proposed Layer Separation Networks (LSN) to accurately separate the soft tissue layer, the upper bone layer, and the lower bone layer in conventional radiographs of finger joints. Using these layers, the adjustable JSW images can be synthesized to address data quality challenges and achieve ground truth (GT) generation. Experimental results demonstrated that LSN-based synthetic images closely resemble real radiographs, and significantly enhanced the performance in downstream tasks. The code and dataset will be available.

IVJul 7, 2025
RAM-W600: A Multi-Task Wrist Dataset and Benchmark for Rheumatoid Arthritis

Songxiao Yang, Haolin Wang, Yao Fu et al.

Rheumatoid arthritis (RA) is a common autoimmune disease that has been the focus of research in computer-aided diagnosis (CAD) and disease monitoring. In clinical settings, conventional radiography (CR) is widely used for the screening and evaluation of RA due to its low cost and accessibility. The wrist is a critical region for the diagnosis of RA. However, CAD research in this area remains limited, primarily due to the challenges in acquiring high-quality instance-level annotations. (i) The wrist comprises numerous small bones with narrow joint spaces, complex structures, and frequent overlaps, requiring detailed anatomical knowledge for accurate annotation. (ii) Disease progression in RA often leads to osteophyte, bone erosion (BE), and even bony ankylosis, which alter bone morphology and increase annotation difficulty, necessitating expertise in rheumatology. This work presents a multi-task dataset for wrist bone in CR, including two tasks: (i) wrist bone instance segmentation and (ii) Sharp/van der Heijde (SvdH) BE scoring, which is the first public resource for wrist bone instance segmentation. This dataset comprises 1048 wrist conventional radiographs of 388 patients from six medical centers, with pixel-level instance segmentation annotations for 618 images and SvdH BE scores for 800 images. This dataset can potentially support a wide range of research tasks related to RA, including joint space narrowing (JSN) progression quantification, BE detection, bone deformity evaluation, and osteophyte detection. It may also be applied to other wrist-related tasks, such as carpal bone fracture localization. We hope this dataset will significantly lower the barrier to research on wrist RA and accelerate progress in CAD research within the RA-related domain.

LGJun 15, 2024
MDA: An Interpretable and Scalable Multi-Modal Fusion under Missing Modalities and Intrinsic Noise Conditions

Lin Fan, Yafei Ou, Cenyang Zheng et al.

Multi-modal learning has shown exceptional performance in various tasks, especially in medical applications, where it integrates diverse medical information for comprehensive diagnostic evidence. However, there still are several challenges in multi-modal learning, 1. Heterogeneity between modalities, 2. uncertainty in missing modalities, 3. influence of intrinsic noise, and 4. interpretability for fusion result. This paper introduces the Modal-Domain Attention (MDA) model to address the above challenges. MDA constructs linear relationships between modalities through continuous attention, due to its ability to adaptively allocate dynamic attention to different modalities, MDA can reduce attention to low-correlation data, missing modalities, or modalities with inherent noise, thereby maintaining SOTA performance across various tasks on multiple public datasets. Furthermore, our observations on the contribution of different modalities indicate that MDA aligns with established clinical diagnostic imaging gold standards and holds promise as a reference for pathologies where these standards are not yet clearly defined. The code and dataset will be available.