Afshin Khadangi

h-index 7

6 papers

12 citations

Novelty 55%

AI Score 43

Ranked #54,376 of 194,257 authors (top 28%); #10,732 in CL (top 35%)

6 Papers

2.8 CV Jul 4, 2023

DeepFlorist: Rethinking Deep Neural Networks and Ensemble Learning as A Meta-Classifier For Object Classification

Afshin Khadangi

In this paper, we propose a novel learning paradigm called "DeepFlorist" for flower classification using ensemble learning as a meta-classifier. DeepFlorist combines the power of deep learning with the robustness of ensemble methods to achieve accurate and reliable flower classification results. The proposed network architecture leverages a combination of dense convolutional and convolutional neural networks (DCNNs and CNNs) to extract high-level features from flower images, followed by a fully connected layer for classification. To enhance the performance and generalization of DeepFlorist, an ensemble learning approach is employed, incorporating multiple diverse models to improve the classification accuracy. Experimental results on benchmark flower datasets demonstrate the effectiveness of DeepFlorist, outperforming state-of-the-art methods in terms of accuracy and robustness. The proposed framework holds significant potential for automated flower recognition systems in real-world applications, enabling advancements in plant taxonomy, conservation efforts, and ecological studies.
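
The abstract above describes combining several diverse models through an ensemble step. The following is a minimal sketch of that general idea, not the DeepFlorist method itself: it uses accuracy-weighted soft voting as a much simpler stand-in for a learned meta-classifier. The function names (`fit_weights`, `ensemble_predict`) and the toy probabilities are assumptions made for illustration.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=values.__getitem__)


def accuracy(probs, labels):
    """Fraction of samples whose most probable class matches the label."""
    hits = sum(1 for p, y in zip(probs, labels) if argmax(p) == y)
    return hits / len(labels)


def fit_weights(base_probs, labels):
    """Weight each base model by its accuracy on held-out data (assumed scheme)."""
    accs = [accuracy(p, labels) for p in base_probs]
    total = sum(accs)
    if total == 0:
        return [1.0 / len(accs)] * len(accs)
    return [a / total for a in accs]


def ensemble_predict(base_probs, weights):
    """Combine per-model class probabilities with a weighted average."""
    n_samples = len(base_probs[0])
    n_classes = len(base_probs[0][0])
    preds = []
    for i in range(n_samples):
        combined = [
            sum(w * probs[i][c] for w, probs in zip(weights, base_probs))
            for c in range(n_classes)
        ]
        preds.append(argmax(combined))
    return preds


# Toy example: two hypothetical base models, three samples, two classes.
labels = [0, 1, 1]
model_a = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]
model_b = [[0.6, 0.4], [0.7, 0.3], [0.3, 0.7]]
weights = fit_weights([model_a, model_b], labels)
predictions = ensemble_predict([model_a, model_b], weights)
```

In this toy case the weaker model misclassifies the second sample, but the weighted average still recovers the correct label. A stacked meta-classifier would replace `fit_weights` with a model trained on the concatenated base-model outputs.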

2.0 LG Aug 21, 2023 Code

We Don't Need No Adam, All We Need Is EVE: On The Variance of Dual Learning Rate And Beyond

Afshin Khadangi

In the rapidly advancing field of deep learning, optimising deep neural networks is paramount. This paper introduces a novel method, Enhanced Velocity Estimation (EVE), which innovatively applies different learning rates to distinct components of the gradients. By bifurcating the learning rate, EVE enables more nuanced control and faster convergence, addressing the challenges associated with traditional single learning rate approaches. Utilising a momentum term that adapts to the learning landscape, the method achieves a more efficient navigation of the complex loss surface, resulting in enhanced performance and stability. Extensive experiments demonstrate that EVE significantly outperforms existing optimisation techniques across various benchmark datasets and architectures.
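
The abstract does not give the EVE update rule, so the following is only an illustrative sketch of what "different learning rates for distinct components of the gradient" could look like. It assumes the two components are the raw gradient and an exponential moving average of it (momentum), each with its own step size. The names `lr_grad`, `lr_momentum`, and `beta`, and the split itself, are assumptions rather than the paper's definitions.

```python
def dual_lr_minimise(grad_fn, x0, lr_grad=0.05, lr_momentum=0.1,
                     beta=0.9, steps=200):
    """Gradient descent with separate step sizes for the raw gradient
    and for its moving average (an assumed decomposition)."""
    x = x0
    momentum = 0.0
    for _ in range(steps):
        g = grad_fn(x)
        # Exponential moving average of past gradients.
        momentum = beta * momentum + (1.0 - beta) * g
        # Each component is scaled by its own learning rate.
        x = x - lr_grad * g - lr_momentum * momentum
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x_min = dual_lr_minimise(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

On this quadratic the iterate settles close to the minimiser at 3. The sketch says nothing about how EVE behaves on real networks.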

1.4 LG Feb 25

Efficient Continual Learning in Language Models via Thalamically Routed Cortical Columns

Afshin Khadangi

Continual learning is a core requirement for deployed language models, yet standard training and fine-tuning pipelines remain brittle under non-stationary data. Online updates often induce catastrophic forgetting, while methods that improve stability frequently increase latency, memory footprint, or dense computation in ways that do not scale well to long contexts. We introduce TRC$^{2}$ (Thalamically Routed Cortical Columns), a decoder-only backbone that addresses continual learning at the architectural level. TRC$^{2}$ combines sparse thalamic routing over cortical columns with mechanisms for modulation, prediction, memory, and feedback, together with a fast corrective pathway that supports rapid adaptation without destabilizing slower parameters. The resulting block is sparse and chunk-parallel, enabling efficient training and inference while preserving clean ablations of each subsystem. We instantiate a reproducible training and evaluation stack and a continual-learning harness that measures proxy forgetting under streaming domain shifts. Across language modeling and continual learning benchmarks, TRC$^{2}$ improves the stability-plasticity tradeoff at comparable compute, enabling rapid on-stream adaptation while preserving previously acquired behavior.
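
Sparse routing over a set of modules is often implemented as top-k gating. The sketch below illustrates that generic technique only; it is not the TRC$^{2}$ block, whose details are not given in the abstract. The "columns" here are plain Python functions and the router scores are supplied by hand, both of which are assumptions for illustration.

```python
import math


def top_k_route(scores, k):
    """Pick the k highest-scoring columns and softmax-normalise their gates."""
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in chosen)
    exps = {i: math.exp(scores[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def routed_output(x, columns, scores, k):
    """Evaluate only the selected columns and mix their outputs by gate weight."""
    gates = top_k_route(scores, k)
    return sum(gate * columns[i](x) for i, gate in gates.items())


# Three hypothetical columns; the third has a low score and is never evaluated.
columns = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x]
scores = [0.0, 0.0, -5.0]
y = routed_output(3.0, columns, scores, k=2)
```

Because only `k` columns run per input, compute grows with `k` rather than with the total number of columns.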

4.9 CL May 8, 2025 Code

KG-HTC: Integrating Knowledge Graphs into LLMs for Effective Zero-shot Hierarchical Text Classification

Qianbo Zang, Christophe Zgrzendek, Igor Tchappi et al.

Hierarchical Text Classification (HTC) involves assigning documents to labels organized within a taxonomy. Most previous research on HTC has focused on supervised methods. However, in real-world scenarios, employing supervised HTC can be challenging due to a lack of annotated data. Moreover, HTC often faces issues with large label spaces and long-tail distributions. In this work, we present Knowledge Graphs for zero-shot Hierarchical Text Classification (KG-HTC), which aims to address these challenges of HTC in applications by integrating knowledge graphs with Large Language Models (LLMs) to provide structured semantic context during classification. Our method retrieves relevant subgraphs from knowledge graphs related to the input text using a Retrieval-Augmented Generation (RAG) approach. Our KG-HTC can enhance LLMs to understand label semantics at various hierarchy levels. We evaluate KG-HTC on three open-source HTC datasets: WoS, DBpedia, and Amazon. Our experimental results show that KG-HTC significantly outperforms three baselines in the strict zero-shot setting, particularly achieving substantial improvements at deeper levels of the hierarchy. This evaluation demonstrates the effectiveness of incorporating structured knowledge into LLMs to address HTC's challenges in large label spaces and long-tailed label distributions. Our code is available at: https://github.com/QianboZang/KG-HTC.
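
The retrieval step described above can be illustrated with a small sketch. This is not the KG-HTC implementation (see the authors' repository for that): it assumes a two-level taxonomy stored as a dictionary, uses token overlap as a stand-in for embedding similarity, and formats the retrieved subgraph as prompt context. The taxonomy, `retrieve_subgraph`, and `build_prompt` are hypothetical.

```python
def tokens(text):
    """Lower-cased whitespace tokens as a set."""
    return set(text.lower().split())


def similarity(a, b):
    """Jaccard overlap between token sets (stand-in for embedding similarity)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve_subgraph(text, taxonomy, top_k=1):
    """Keep the top_k parent labels whose own name or children best match the text."""
    def best_match(parent):
        return max(similarity(text, label) for label in [parent] + taxonomy[parent])
    ranked = sorted(taxonomy, key=best_match, reverse=True)
    return {parent: taxonomy[parent] for parent in ranked[:top_k]}


def build_prompt(text, subgraph):
    """Serialise the retrieved label paths as context for an LLM prompt."""
    lines = ["Candidate label paths:"]
    for parent, children in subgraph.items():
        for child in children:
            lines.append(f"- {parent} -> {child}")
    lines.append(f"Document: {text}")
    lines.append("Answer with one label path from the list.")
    return "\n".join(lines)


# Hypothetical two-level taxonomy.
taxonomy = {
    "computer science": ["machine learning", "computer vision"],
    "medicine": ["cardiology", "oncology"],
}
doc = "a study of machine learning for images"
prompt = build_prompt(doc, retrieve_subgraph(doc, taxonomy, top_k=1))
```

Restricting the prompt to the retrieved subgraph keeps the candidate list short even when the full label space is large, which is the motivation the abstract gives for the approach.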

9.6 CL Feb 4, 2025

CognArtive: Large Language Models for Automating Art Analysis and Decoding Aesthetic Elements

Afshin Khadangi, Amir Sartipi, Igor Tchappi et al.

Art, as a universal language, can be interpreted in diverse ways, with artworks embodying profound meanings and nuances. The advent of Large Language Models (LLMs) and the availability of Multimodal Large Language Models (MLLMs) raise the question of how these transformative models can be used to assess and interpret the artistic elements of artworks. While research has been conducted in this domain, to the best of our knowledge, a deep and detailed understanding of the technical and expressive features of artworks using LLMs has not been explored. In this study, we investigate the automation of a formal art analysis framework to analyze a large number of artworks rapidly and examine how their patterns evolve over time. We explore how LLMs can decode artistic expressions, visual elements, composition, and techniques, revealing emerging patterns that develop across periods. Finally, we discuss the strengths and limitations of LLMs in this context, emphasizing their ability to process vast quantities of art-related data and generate insightful interpretations. Due to the exhaustive and granular nature of the results, we have developed interactive data visualizations, available online at https://cognartive.github.io/, to enhance understanding and accessibility.

4.9 CL Apr 4, 2025 Code

Noise Augmented Fine Tuning for Mitigating Hallucinations in Large Language Models

Afshin Khadangi, Amir Sartipi, Igor Tchappi et al.

Large language models (LLMs) often produce inaccurate or misleading content, known as hallucinations. To address this challenge, we introduce Noise-Augmented Fine-Tuning (NoiseFiT), a novel framework that leverages adaptive noise injection based on the signal-to-noise ratio (SNR) to enhance model robustness. In particular, NoiseFiT selectively perturbs layers identified as either high-SNR (more robust) or low-SNR (potentially under-regularized) using dynamically scaled Gaussian noise. We further propose a hybrid loss that combines standard cross-entropy, soft cross-entropy, and consistency regularization to ensure stable and accurate outputs under noisy training conditions. Our theoretical analysis shows that adaptive noise injection is both unbiased and variance-preserving, providing strong guarantees for convergence in expectation. Empirical results on multiple test and benchmark datasets demonstrate that NoiseFiT significantly reduces hallucination rates, often improving or matching baseline performance in key tasks. These findings highlight the promise of noise-driven strategies for achieving robust, trustworthy language modeling without incurring prohibitive computational overhead. Given the comprehensive and detailed nature of our experiments, we have publicly released the fine-tuning logs, benchmark evaluation artifacts, and source code online at W&B, Hugging Face, and GitHub, respectively, to foster further research, accessibility, and reproducibility.
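
The layer-selection and noise-injection steps can be sketched as follows. This is an illustration under stated assumptions, not the NoiseFiT code: the abstract does not define the SNR, so the sketch assumes SNR = mean² / variance of a layer's activations, and it scales the Gaussian noise by each layer's activation standard deviation. The helper names and the toy activations are hypothetical.

```python
import random
import statistics


def snr(values):
    """Assumed SNR definition: squared mean over population variance."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values)
    return float("inf") if var == 0 else mean * mean / var


def select_layers(activations, k, highest=True):
    """Indices of the k layers with the highest (or lowest) SNR."""
    ranked = sorted(range(len(activations)),
                    key=lambda i: snr(activations[i]), reverse=highest)
    return sorted(ranked[:k])


def inject_noise(values, base_std, rng):
    """Add zero-mean Gaussian noise scaled by the layer's own spread."""
    scale = base_std * statistics.pstdev(values)
    return [v + rng.gauss(0.0, scale) for v in values]


# Toy activations for three hypothetical layers.
activations = [
    [1.0, 1.1, 0.9],   # moderate SNR
    [0.0, 2.0, -2.0],  # zero mean, so SNR is 0
    [5.0, 5.0, 5.1],   # large mean, small spread, so highest SNR
]
rng = random.Random(0)
high = select_layers(activations, k=1, highest=True)
low = select_layers(activations, k=1, highest=False)
noisy = inject_noise(activations[high[0]], base_std=0.1, rng=rng)
```

Because the noise has zero mean, the perturbed activations equal the originals in expectation. The paper's variance-preservation claim depends on its own scaling rule, which this sketch does not reproduce.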