LGCYAug 27, 2025

Centralized vs. Federated Learning for Educational Data Mining: A Comparative Study on Student Performance Prediction with SAEB Microdata

arXiv:2509.00086v11 citationsh-index: 3
Originality Synthesis-oriented
AI Analysis

This addresses privacy concerns in educational data mining for Brazilian schools, though it is incremental as it applies an existing federated learning method to a new dataset.

The study tackled the challenge of predicting student performance while complying with Brazil's privacy laws by comparing federated learning with centralized methods, finding that federated learning achieved 61.23% accuracy with only a marginal loss compared to 63.96% for centralized models.

The application of data mining and artificial intelligence in education offers unprecedented potential for personalizing learning and early identification of at-risk students. However, the practical use of these techniques faces a significant barrier in privacy legislation, such as Brazil's General Data Protection Law (LGPD), which restricts the centralization of sensitive student data. To resolve this challenge, privacy-preserving computational approaches are required. The present study evaluates the feasibility and effectiveness of Federated Learning, specifically the FedProx algorithm, to predict student performance using microdata from the Brazilian Basic Education Assessment System (SAEB). A Deep Neural Network (DNN) model was trained in a federated manner, simulating a scenario with 50 schools, and its performance was rigorously benchmarked against a centralized eXtreme Gradient Boosting (XGBoost) model. The analysis, conducted on a universe of over two million student records, revealed that the centralized model achieved an accuracy of 63.96%. Remarkably, the federated model reached a peak accuracy of 61.23%, demonstrating a marginal performance loss in exchange for a robust privacy guarantee. The results indicate that Federated Learning is a viable and effective solution for building collaborative predictive models in the Brazilian educational context, in alignment with the requirements of the LGPD.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes