CVJun 27, 2025Code
CERBERUS: Crack Evaluation & Recognition Benchmark for Engineering Reliability & Urban StabilityJustin Reinman, Sunwoong Choi
CERBERUS is a synthetic benchmark designed to help train and evaluate AI models for detecting cracks and other defects in infrastructure. It includes a crack image generator and realistic 3D inspection scenarios built in Unity. The benchmark features two types of setups: a simple Fly-By wall inspection and a more complex Underpass scene with lighting and geometry challenges. We tested a popular object detection model (YOLO) using different combinations of synthetic and real crack data. Results show that combining synthetic and real data improves performance on real-world images. CERBERUS provides a flexible, repeatable way to test defect detection systems and supports future research in automated infrastructure inspection. CERBERUS is publicly available at https://github.com/justinreinman/Cerberus-Defect-Generator.
LGFeb 28, 2024
Data augmentation method for modeling health records with applications to clopidogrel treatment failure detectionSunwoong Choi, Samuel Kim
We present a novel data augmentation method to address the challenge of data scarcity in modeling longitudinal patterns in Electronic Health Records (EHR) of patients using natural language processing (NLP) algorithms. The proposed method generates augmented data by rearranging the orders of medical records within a visit where the order of elements are not obvious, if any. Applying the proposed method to the clopidogrel treatment failure detection task enabled up to 5.3% absolute improvement in terms of ROC-AUC (from 0.908 without augmentation to 0.961 with augmentation) when it was used during the pre-training procedure. It was also shown that the augmentation helped to improve performance during fine-tuning procedures, especially when the amount of labeled training data is limited.