CLDec 29, 2023
Automatic Essay Scoring in a Brazilian ScenarioFelipe Akio Matsuoka
This paper presents a novel Automatic Essay Scoring (AES) algorithm tailored for the Portuguese-language essays of Brazil's Exame Nacional do Ensino Médio (ENEM), addressing the challenges in traditional human grading systems. Our approach leverages advanced deep learning techniques to align closely with human grading criteria, targeting efficiency and scalability in evaluating large volumes of student essays. This research not only responds to the logistical and financial constraints of manual grading in Brazilian educational assessments but also promises to enhance fairness and consistency in scoring, marking a significant step forward in the application of AES in large-scale academic settings.
CVNov 28, 2025
Evaluating the Clinical Impact of Generative Inpainting on Bone Age EstimationFelipe Akio Matsuoka, Eduardo Moreno J. M. Farina, Augusto Sarquis Serpa et al.
Generative foundation models can remove visual artifacts through realistic image inpainting, but their impact on medical AI performance remains uncertain. Pediatric hand radiographs often contain non-anatomical markers, and it is unclear whether inpainting these regions preserves features needed for bone age and gender prediction. To evaluate the clinical reliability of generative model-based inpainting for artifact removal, we used the RSNA Bone Age Challenge dataset, selecting 200 original radiographs and generating 600 inpainted versions with gpt-image-1 using natural language prompts to target non-anatomical artifacts. Downstream performance was assessed with deep learning ensembles for bone age estimation and gender classification, using mean absolute error (MAE) and area under the ROC curve (AUC) as metrics, and pixel intensity distributions to detect structural alterations. Inpainting markedly degraded model performance: bone age MAE increased from 6.26 to 30.11 months, and gender classification AUC decreased from 0.955 to 0.704. Inpainted images displayed pixel-intensity shifts and inconsistencies, indicating structural modifications not corrected by simple calibration. These findings show that, although visually realistic, foundation model-based inpainting can obscure subtle but clinically relevant features and introduce latent bias even when edits are confined to non-diagnostic regions, underscoring the need for rigorous, task-specific validation before integrating such generative tools into clinical AI workflows.