LG | Sep 29, 2025

A Fuzzy Logic-Based Framework for Explainable Machine Learning in Big Data Analytics

arXiv:2510.05120v1 | 2 citations | h-index: 2

Originality: Incremental advance

AI Analysis

The work addresses the need for explainable and fair ML in domains such as environmental monitoring, but it appears incremental because it builds on existing fuzzy logic and clustering techniques.

The paper addresses interpretability and fairness in machine learning for big data analytics by proposing a novel framework that combines type-2 fuzzy sets, granular computing, and clustering. It reports a 4% improvement in silhouette score over type-1 methods and, as its fairness result, an entropy reduction of up to 1%.
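To make the type-2 idea concrete: a type-1 fuzzy set assigns each reading a single membership grade, while an interval type-2 set assigns a lower and an upper grade, and the gap between them expresses uncertainty about the membership function itself. The following is a minimal Python sketch of one common construction, a Gaussian membership function with a fixed mean and an uncertain standard deviation. The function names, the "high CO" term, and all parameter values are hypothetical illustrations and are not taken from the paper, whose exact membership functions are not given on this page.

```python
import math


def gaussian(x, mean, sigma):
    """Type-1 Gaussian membership grade of x, in [0, 1]."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def interval_type2_membership(x, mean, sigma_low, sigma_high):
    """Return (lower, upper) membership grades for x.

    Assumed construction: Gaussian primary membership function with a
    fixed mean and a standard deviation known only to lie in
    [sigma_low, sigma_high]. For a fixed mean, the narrower curve is
    never above the wider one, so it gives the lower grade.
    """
    if not 0 < sigma_low <= sigma_high:
        raise ValueError("require 0 < sigma_low <= sigma_high")
    lower = gaussian(x, mean, sigma_low)
    upper = gaussian(x, mean, sigma_high)
    return lower, upper


# Hypothetical linguistic term "high CO" centred at 4.0 (made-up units).
lower, upper = interval_type2_membership(3.0, mean=4.0, sigma_low=0.8, sigma_high=1.2)
print(f"membership of 3.0 in 'high CO': [{lower:.3f}, {upper:.3f}]")
```

A wider interval signals a reading whose membership is less certain, which is the property that makes type-2 sets attractive for noisy sensor data.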

The growing complexity of machine learning (ML) models in big data analytics, especially in domains such as environmental monitoring, makes interpretability and explainability critical for trust, ethical practice, and regulatory compliance (e.g., GDPR). Traditional "black-box" models obstruct transparency, while post-hoc explainable AI (XAI) techniques such as LIME and SHAP often compromise accuracy or fail to deliver inherent insights. This paper presents a novel framework that combines type-2 fuzzy sets, granular computing, and clustering to improve explainability and fairness in big data environments. Applied to the UCI Air Quality dataset, the framework handles uncertainty in noisy sensor data, produces linguistic rules, and assesses fairness using silhouette scores and entropy. The key contributions are: (1) a type-2 fuzzy clustering approach that improves cohesion by about 4% over type-1 methods (silhouette 0.365 vs. 0.349) and improves fairness (entropy 0.918); (2) fairness measures that mitigate biases in unsupervised scenarios; (3) a rule-based component for intrinsic XAI with an average coverage of 0.65; (4) scalability assessments showing linear runtime (roughly 0.005 seconds for sampled big data sizes). Experiments show that the method outperforms baselines such as DBSCAN and Agglomerative Clustering in interpretability, fairness (entropy reduced by up to 1%), and efficiency.
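The two evaluation measures named in the abstract can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the paper's evaluation code: it assumes Euclidean distance for the silhouette score and assumes the entropy is the Shannon entropy (in bits) of the cluster-size distribution, which the abstract does not specify. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(labels)


def cluster_size_entropy(labels):
    """Shannon entropy (bits) of the cluster-size distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def relative_improvement(new, old):
    """Fractional change of `new` relative to `old`."""
    return (new - old) / old


# Toy data: two well-separated, equally sized clusters.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
print(silhouette_score(points, labels))
print(cluster_size_entropy(labels))
print(relative_improvement(0.365, 0.349))
```

Applied to the reported silhouette scores, `relative_improvement(0.365, 0.349)` gives roughly 0.046, so the "about 4%" in the abstract corresponds to a relative gain of about 4.6%.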
