Rahul Katarya

h-index: 5

7 papers

103 citations

Novelty: 21%

AI Score: 33

Ranked #132,027 of 201,326 authors (top 66%); #23,441 in CL (top 72%)

7 Papers

CV · Apr 14

A Hybrid Architecture for Benign-Malignant Classification of Mammography ROIs

Mohammed Asad, Mohit Bajpai, Sudhir Singh et al.

Accurate characterization of suspicious breast lesions in mammography is important for early diagnosis and treatment planning. While Convolutional Neural Networks (CNNs) are effective at extracting local visual patterns, they are less suited to modeling long-range dependencies. Vision Transformers (ViTs) address this limitation through self-attention, but their quadratic computational cost can be prohibitive. This paper presents a hybrid architecture that combines EfficientNetV2-M for local feature extraction with Vision Mamba, a State Space Model (SSM), for efficient global context modeling. The proposed model performs binary classification of abnormality-centered mammography regions of interest (ROIs) from the CBIS-DDSM dataset into benign and malignant classes. By combining a strong CNN backbone with a linear-complexity sequence model, the approach achieves strong lesion-level classification performance in an ROI-based setting.

LG · Jan 2, 2025

Empirical Analysis of Nature-Inspired Algorithms for Autism Spectrum Disorder Detection Using 3D Video Dataset

Aneesh Panchal, Kainat Khan, Rahul Katarya

Autism Spectrum Disorder (ASD) is a chronic neurodevelopmental disorder whose symptoms include repetitive behaviour and a lack of social and communication skills. Even though these symptoms can be seen very clearly in social settings, a large number of individuals with ASD remain undiagnosed. In this paper, we develop a methodology for the detection of ASD from a 3-dimensional walking video dataset, using supervised machine learning (ML) classification algorithms together with nature-inspired optimization algorithms for feature extraction. The proposed methodology extracts the important and relevant features from the dataset using nature-inspired optimization algorithms and then classifies ASD using a supervised ML classification algorithm. We also include ranking coefficients to find the initial leading particle. This selection of the particle significantly reduces the computation time and hence improves the overall efficiency and accuracy of ASD detection. To evaluate the efficiency of the proposed methodology, we deployed various combinations of classification algorithms and nature-inspired algorithms, resulting in an outstanding classification accuracy of $100\%$ using the random forest classification algorithm with the gravitational search algorithm for feature selection. Applying the proposed methodology to different datasets would enhance its robustness and generalizability. Owing to its high accuracy and low total computation time, the proposed methodology will offer a significant contribution to the medical and academic fields, providing a foundation for future research and advancements in ASD diagnosis.

CL · Apr 23, 2021

Analysis of Online Toxicity Detection Using Machine Learning Approaches

Anjum, Rahul Katarya

Social media and the internet have become an integral part of how people spread and consume information. Over time, social media has evolved dramatically, and almost half of the population now uses social media to express views and opinions. Online hate speech is one of the drawbacks of social media today, and it needs to be controlled. In this paper, we examine how hate speech originated, what its consequences are, and the trends in machine-learning algorithms for solving the online hate speech problem. This study contributes a systematic approach that helps researchers identify new research directions, elucidates the shortcomings of existing studies and models, and provides future directions to advance the field.

CL · Apr 23, 2021

Automated News Summarization Using Transformers

Anushka Gupta, Diksha Chugh, Anjum et al.

The amount of text data available online is increasing at a very fast pace; hence, text summarization has become essential. Most modern recommender and text classification systems require going through a huge amount of data. Manually generating precise and fluent summaries of lengthy articles is a very tiresome and time-consuming task. Hence, generating automated summaries for the data and using them to train machine learning models will make these models space- and time-efficient. Extractive summarization and abstractive summarization are two separate methods of generating summaries. The extractive technique identifies the relevant sentences in the original document and extracts only those from the text. In abstractive summarization techniques, by contrast, the summary is generated after interpreting the original text, which makes the task more complicated. In this paper, we present a comprehensive comparison of a few pre-trained models based on the transformer architecture for text summarization. For analysis and comparison, we use the BBC news dataset, which contains text data that can be used for summarization and human-generated summaries for evaluating and comparing the summaries generated by machine learning models.
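The extractive idea described in this abstract (pick the most relevant sentences and return them unchanged) can be illustrated with a minimal sketch. This is not the paper's method, which compares pre-trained transformer models; it is a plain word-frequency baseline, and the names `STOPWORDS`, `split_sentences`, and `summarize` are hypothetical choices made for the example.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not a curated one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "on", "for"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, k=1):
    """Return the k highest-scoring sentences, kept in document order.

    A sentence's score is the mean document-level frequency of its tokens,
    so sentences built from frequently repeated words rank higher.
    """
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[w] for w in tokens) / max(len(tokens), 1)

    top = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    article = "Rain flooded the city. Rain and flood warnings continued in the city. A cat slept."
    print(summarize(article, k=1))
```

Because the output sentences are copied verbatim from the input, this is extractive by construction; an abstractive system would instead generate new wording.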

CL · Apr 23, 2021

Analysing Cyberbullying using Natural Language Processing by Understanding Jargon in Social Media

Bhumika Bhatia, Anuj Verma, Anjum et al.

Cyberbullying is extremely prevalent today. Online hate comments, toxicity, and cyberbullying amongst children and other vulnerable groups are only growing with online classes and increased access to social platforms, especially post COVID-19. It is paramount to ensure minors' safety across social platforms so that any violence or hate crime is automatically detected and strict action is taken against it. In our work, we explore binary classification using a combination of datasets from various social media platforms that cover a wide range of cyberbullying, such as sexism, racism, abusive language, and hate speech. We experiment with multiple models such as Bi-LSTM, GloVe, and state-of-the-art models like BERT, and apply a unique preprocessing technique by introducing a slang-abusive corpus, achieving higher precision than models without slang preprocessing.
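The abstract does not describe how the slang-abusive corpus is applied, so the following is only a sketch of one common form of slang preprocessing: replacing slang tokens with standard equivalents before the text reaches a classifier. The mapping `SLANG_MAP` and the function `normalize_slang` are hypothetical and are not taken from the paper.

```python
import re

# Assumption: a toy slang lexicon; a real corpus would be far larger.
SLANG_MAP = {
    "u": "you",
    "r": "are",
    "gr8": "great",
    "idk": "i do not know",
    "stfu": "shut up",
}


def normalize_slang(text, slang_map=SLANG_MAP):
    """Lowercase, tokenize, and expand any token found in the slang map."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return " ".join(slang_map.get(tok, tok) for tok in tokens)


if __name__ == "__main__":
    print(normalize_slang("U r gr8!"))
```

The intent of such a step is that a downstream model sees vocabulary it is more likely to have embeddings for, rather than rare abbreviations.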

CL · Apr 23, 2021

Comparative Analysis of Machine Learning and Deep Learning Algorithms for Detection of Online Hate Speech

Tashvik Dhamija, Anjum, Rahul Katarya

In the age of social media, users have become prone to online hate speech. Several attempts have been made to classify hate speech using machine learning, but the state-of-the-art models are not robust enough for practical applications. This is attributed to the use of primitive NLP feature engineering techniques. In this paper, we explore various feature engineering techniques, ranging from different embeddings to conventional NLP algorithms. We also experiment with combinations of different features. Our experiments show that RoBERTa (Robustly Optimized BERT Approach) based sentence embeddings classified using decision trees give the best result, an F1 score of 0.9998. We conclude that BERT-based embeddings give the most useful features for this problem and have the capacity to be made into a practical, robust model.

LG · Nov 19, 2020

Utkarsh Nath, Shikha Asrani, Rahul Katarya

Clustering is spotting patterns in a group of objects and, as a result, grouping similar objects together. Objects have attributes that are not always numerical; sometimes attributes have domains or categories to which they belong. Such data is called categorical data. To group categorical data, many clustering algorithms are used, among which the k-modes algorithm has so far given the most significant results. Nevertheless, there is still a lot that could be improved. Algorithms like k-means, fuzzy c-means, or hierarchical clustering have given far better accuracies with numerical data. In this paper, we propose a novel distance metric, similarity-based distance (SBD), to find the distance between objects of categorical data. Experiments show that our proposed distance (SBD), when used with an SBC (space structure based clustering) type algorithm, significantly outperforms existing algorithms such as k-modes and other SBC type algorithms on categorical datasets.
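For context on the baseline mentioned in this abstract, the sketch below shows the simple matching dissimilarity that k-modes uses in place of Euclidean distance, together with the per-attribute mode that serves as a cluster centre. It does not implement the proposed SBD metric, whose definition is not given here; the function names are hypothetical.

```python
from collections import Counter


def matching_distance(a, b):
    """Simple matching dissimilarity: the number of attributes that differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Per-attribute most frequent category, used as the cluster centre."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


def assign(record, modes):
    """Index of the mode closest to the record under matching distance."""
    return min(range(len(modes)), key=lambda i: matching_distance(record, modes[i]))


if __name__ == "__main__":
    cluster = [("red", "small"), ("red", "large"), ("blue", "small")]
    print(cluster_mode(cluster))
    print(matching_distance(("red", "small", "round"), ("red", "large", "round")))
```

A full k-modes run alternates `assign` over all records with `cluster_mode` over each resulting cluster until assignments stop changing. Matching distance treats every mismatch as equally large, which is the kind of limitation a similarity-aware categorical distance aims to address.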