LGAICLCVNov 8, 2021

Auto-Encoding Knowledge Graph for Unsupervised Medical Report Generation

arXiv:2111.04318v2130 citations
Originality Highly original
AI Analysis

This addresses the challenge of expensive and time-consuming paired data collection in medical imaging for clinicians and researchers, representing a novel approach rather than an incremental improvement.

The paper tackles the problem of generating medical reports from images without requiring paired image-report training data by proposing an unsupervised model called Knowledge Graph Auto-Encoder (KGAE), which uses a knowledge graph as a shared latent space to bridge visual and textual domains, and it consistently outperforms state-of-the-art models on two datasets after fine-tuning.

Medical report generation, which aims to automatically generate a long and coherent report of a given medical image, has been receiving growing research interests. Existing approaches mainly adopt a supervised manner and heavily rely on coupled image-report pairs. However, in the medical domain, building a large-scale image-report paired dataset is both time-consuming and expensive. To relax the dependency on paired data, we propose an unsupervised model Knowledge Graph Auto-Encoder (KGAE) which accepts independent sets of images and reports in training. KGAE consists of a pre-constructed knowledge graph, a knowledge-driven encoder and a knowledge-driven decoder. The knowledge graph works as the shared latent space to bridge the visual and textual domains; The knowledge-driven encoder projects medical images and reports to the corresponding coordinates in this latent space and the knowledge-driven decoder generates a medical report given a coordinate in this space. Since the knowledge-driven encoder and decoder can be trained with independent sets of images and reports, KGAE is unsupervised. The experiments show that the unsupervised KGAE generates desirable medical reports without using any image-report training pairs. Moreover, KGAE can also work in both semi-supervised and supervised settings, and accept paired images and reports in training. By further fine-tuning with image-report pairs, KGAE consistently outperforms the current state-of-the-art models on two datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes