IM · LG · Apr 25, 2023

Morphological Classification of Extragalactic Radio Sources Using Gradient Boosting Methods

arXiv:2304.12729v2 · h-index: 17
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient classification in radio astronomy as data volumes grow, but it is incremental, since it applies existing gradient boosting methods to a known problem.

The paper tackles the automatic classification of extragalactic radio sources by morphology, proposing gradient boosting methods combined with principal component analysis as data-efficient alternatives to convolutional neural networks. All three methods outperformed a state-of-the-art CNN-based classifier while using less than a quarter of the images. CatBoost achieved the highest accuracy, and the gain came mainly from 3–4% higher recall on Fanaroff-Riley class II sources.

The field of radio astronomy is witnessing a boom in the amount of data produced per day due to newly commissioned radio telescopes. One of the most crucial problems in this field is the automatic classification of extragalactic radio sources based on their morphologies. Most recent contributions in the field of morphological classification of extragalactic radio sources have proposed classifiers based on convolutional neural networks. Alternatively, this work proposes gradient boosting machine learning methods accompanied by principal component analysis as data-efficient alternatives to convolutional neural networks. Recent findings have shown the efficacy of gradient boosting methods in outperforming deep learning methods for classification problems with tabular data. The gradient boosting methods considered in this work are based on the XGBoost, LightGBM, and CatBoost implementations. This work also studies the effect of dataset size on classifier performance. A three-class classification problem is considered in this work based on the three main Fanaroff-Riley classes: class 0, class I, and class II, using radio sources from the Best-Heckman sample. All three proposed gradient boosting methods outperformed a state-of-the-art convolutional neural network-based classifier using less than a quarter of the number of images, with CatBoost having the highest accuracy. This was mainly due to the superior accuracy of gradient boosting methods in classifying Fanaroff-Riley class II sources, with 3–4% higher recall.
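The approach described in the abstract (flatten each image, reduce it with principal component analysis, then train a gradient boosting classifier on the components) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost, LightGBM, or CatBoost, uses synthetic toy images in place of the Best-Heckman sample, and picks an arbitrary number of principal components. The helper names `make_synthetic_images` and `fit_and_score` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def make_synthetic_images(n_per_class=100, size=16, seed=0):
    """Toy stand-in for radio-source cutouts (NOT the Best-Heckman sample).

    Each toy class places two Gaussian blobs at a different separation,
    loosely mimicking compact versus extended double-lobed morphologies.
    """
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[0:size, 0:size]
    centre = (size - 1) / 2
    images, labels = [], []
    for label, offset in enumerate((0.0, 2.0, 5.0)):  # assumed blob offsets
        for _ in range(n_per_class):
            img = rng.normal(0.0, 0.1, (size, size))  # background noise
            for sign in (-1, 1):
                cx = centre + sign * offset
                img += np.exp(-((xx - cx) ** 2 + (yy - centre) ** 2) / 4.0)
            images.append(img.ravel())  # flatten image into a feature row
            labels.append(label)
    return np.array(images), np.array(labels)


def fit_and_score(train_fraction, n_components=10, seed=0):
    """Train PCA + gradient boosting on a fraction of the data.

    Returns overall accuracy and an array of per-class recall values,
    both measured on the held-out remainder.
    """
    X, y = make_synthetic_images(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=train_fraction, stratify=y, random_state=seed
    )
    model = make_pipeline(
        PCA(n_components=n_components, random_state=seed),
        # Stand-in booster; the paper uses XGBoost, LightGBM, and CatBoost.
        GradientBoostingClassifier(n_estimators=50, random_state=seed),
    )
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, pred)
    # average=None yields one recall value per class, in label order.
    recalls = recall_score(y_test, pred, average=None, labels=[0, 1, 2])
    return accuracy, recalls


if __name__ == "__main__":
    for fraction in (0.25, 0.75):
        acc, rec = fit_and_score(fraction)
        print(f"train fraction {fraction}: accuracy={acc:.3f}, recall={rec}")
```

Calling `fit_and_score` with different values of `train_fraction` gives a rough analogue of the dataset-size study mentioned in the abstract. The last entry of the returned recall array corresponds to the per-class recall the abstract quotes for class II, here computed for the toy class with the widest blob separation.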

Code Implementations: 1 repo