CVJul 1, 2023Code
DeepMediX: A Deep Learning-Driven Resource-Efficient Medical Diagnosis Across the SpectrumKishore Babu Nampalle, Pradeep Singh, Uppala Vivek Narayan et al.
In the rapidly evolving landscape of medical imaging diagnostics, achieving high accuracy while preserving computational efficiency remains a formidable challenge. This work presents \texttt{DeepMediX}, a groundbreaking, resource-efficient model that significantly addresses this challenge. Built on top of the MobileNetV2 architecture, DeepMediX excels in classifying brain MRI scans and skin cancer images, with superior performance demonstrated on both binary and multiclass skin cancer datasets. It provides a solution to labor-intensive manual processes, the need for large datasets, and complexities related to image properties. DeepMediX's design also includes the concept of Federated Learning, enabling a collaborative learning approach without compromising data privacy. This approach allows diverse healthcare institutions to benefit from shared learning experiences without the necessity of direct data access, enhancing the model's predictive power while preserving the privacy and integrity of sensitive patient data. Its low computational footprint makes DeepMediX suitable for deployment on handheld devices, offering potential for real-time diagnostic support. Through rigorous testing on standard datasets, including the ISIC2018 for dermatological research, DeepMediX demonstrates exceptional diagnostic capabilities, matching the performance of existing models on almost all tasks and even outperforming them in some cases. The findings of this study underline significant implications for the development and deployment of AI-based tools in medical imaging and their integration into point-of-care settings. The source code and models generated would be released at https://github.com/kishorebabun/DeepMediX.
LGJun 30, 2023
Vision Through the Veil: Differential Privacy in Federated Learning for Medical Image ClassificationKishore Babu Nampalle, Pradeep Singh, Uppala Vivek Narayan et al.
The proliferation of deep learning applications in healthcare calls for data aggregation across various institutions, a practice often associated with significant privacy concerns. This concern intensifies in medical image analysis, where privacy-preserving mechanisms are paramount due to the data being sensitive in nature. Federated learning, which enables cooperative model training without direct data exchange, presents a promising solution. Nevertheless, the inherent vulnerabilities of federated learning necessitate further privacy safeguards. This study addresses this need by integrating differential privacy, a leading privacy-preserving technique, into a federated learning framework for medical image classification. We introduce a novel differentially private federated learning model and meticulously examine its impacts on privacy preservation and model performance. Our research confirms the existence of a trade-off between model accuracy and privacy settings. However, we demonstrate that strategic calibration of the privacy budget in differential privacy can uphold robust image classification performance while providing substantial privacy protection.
CVJun 27, 2023
See Through the Fog: Curriculum Learning with Progressive Occlusion in Medical ImagingPradeep Singh, Kishore Babu Nampalle, Uppala Vivek Narayan et al.
In recent years, deep learning models have revolutionized medical image interpretation, offering substantial improvements in diagnostic accuracy. However, these models often struggle with challenging images where critical features are partially or fully occluded, which is a common scenario in clinical practice. In this paper, we propose a novel curriculum learning-based approach to train deep learning models to handle occluded medical images effectively. Our method progressively introduces occlusion, starting from clear, unobstructed images and gradually moving to images with increasing occlusion levels. This ordered learning process, akin to human learning, allows the model to first grasp simple, discernable patterns and subsequently build upon this knowledge to understand more complicated, occluded scenarios. Furthermore, we present three novel occlusion synthesis methods, namely Wasserstein Curriculum Learning (WCL), Information Adaptive Learning (IAL), and Geodesic Curriculum Learning (GCL). Our extensive experiments on diverse medical image datasets demonstrate substantial improvements in model robustness and diagnostic accuracy over conventional training methodologies.
83.0AIMay 23
GlobalDentBench: A Multinational Benchmark for Evaluating LLM Clinical Reasoning in Dentistry with Expert CalibrationJunjie Zhao, Jingyi Liang, Zhenyang Cai et al.
While large language models (LLMs) hold transformative potential for medicine, their reasoning robustness and safety in real-world clinical scenarios remain critically underexplored, particularly in dentistry. Here we introduce GlobalDentBench, the first multinational dental benchmark, featuring a taxonomy that encompasses 14 dental specialties across 88 countries and regions spanning six continents. The benchmark comprises 8,978 expert-validated questions across three formats (multiple-choice, short-answer, and case-based questions) and assesses three progressive reasoning levels: knowledge recall (L1), routine reasoning (L2), and individualized reasoning (L3). To ensure data quality, the automated construction framework was calibrated by six senior dentists, achieving expert agreement rates of 99.98% for multiple-choice and short-answer questions and 96.78% for the more complex case-based questions. Evaluation of 12 frontier LLMs on GlobalDentBench revealed a sharp, stepwise performance degradation with increasing reasoning complexity. Specifically, accuracy plummeted from 81.34% on multiple-choice to 64.53% on short-answer and 22.34% on case-based questions, while declining markedly from 74.01% at L1 to 55.64% at L2 and 35.71% at L3. More critically, risk analysis of real-world dental cases demonstrated an alarming overall unsafe rate of 31.01% in LLM-generated clinical recommendations, with 4.51% posing risks of irreversible patient harm and risks particularly pronounced in specialties such as orthodontics. These findings expose fundamental limitations in the medical reasoning and safety of current LLMs. Consequently, GlobalDentBench provides a scalable foundation for trustworthy clinical AI evaluation, underscoring the urgent need for rigorous validation before the safe deployment of these models in healthcare.
NEJun 2, 2023
ErfReLU: Adaptive Activation Function for Deep Neural NetworkAshish Rajanand, Pradeep Singh
Recent research has found that the activation function (AF) selected for adding non-linearity into the output can have a big impact on how effectively deep learning networks perform. Developing activation functions that can adapt simultaneously with learning is a need of time. Researchers recently started developing activation functions that can be trained throughout the learning process, known as trainable, or adaptive activation functions (AAF). Research on AAF that enhance the outcomes is still in its early stages. In this paper, a novel activation function 'ErfReLU' has been developed based on the erf function and ReLU. This function exploits the ReLU and the error function (erf) to its advantage. State of art activation functions like Sigmoid, ReLU, Tanh, and their properties have been briefly explained. Adaptive activation functions like Tanhsoft1, Tanhsoft2, Tanhsoft3, TanhLU, SAAF, ErfAct, Pserf, Smish, and Serf have also been described. Lastly, performance analysis of 9 trainable activation functions along with the proposed one namely Tanhsoft1, Tanhsoft2, Tanhsoft3, TanhLU, SAAF, ErfAct, Pserf, Smish, and Serf has been shown by applying these activation functions in MobileNet, VGG16, and ResNet models on CIFAR-10, MNIST, and FMNIST benchmark datasets.
AIDec 23, 2025
Benchmarking LLMs for Predictive Applications in the Intensive Care UnitsChehak Malhotra, Mehak Gopal, Akshaya Devadiga et al.
With the advent of LLMs, various tasks across the natural language processing domain have been transformed. However, their application in predictive tasks remains less researched. This study compares large language models, including GatorTron-Base (trained on clinical data), Llama 8B, and Mistral 7B, against models like BioBERT, DocBERT, BioClinicalBERT, Word2Vec, and Doc2Vec, setting benchmarks for predicting Shock in critically ill patients. Timely prediction of shock can enable early interventions, thus improving patient outcomes. Text data from 17,294 ICU stays of patients in the MIMIC III database were scored for length of stay > 24 hours and shock index (SI) > 0.7 to yield 355 and 87 patients with normal and abnormal SI-index, respectively. Both focal and cross-entropy losses were used during finetuning to address class imbalances. Our findings indicate that while GatorTron Base achieved the highest weighted recall of 80.5%, the overall performance metrics were comparable between SLMs and LLMs. This suggests that LLMs are not inherently superior to SLMs in predicting future clinical events despite their strong performance on text-based tasks. To achieve meaningful clinical outcomes, future efforts in training LLMs should prioritize developing models capable of predicting clinical trajectories rather than focusing on simpler tasks such as named entity recognition or phenotyping.
AODec 18, 2025
The Universe Learning Itself: On the Evolution of Dynamics from the Big Bang to Machine IntelligencePradeep Singh, Mudasani Rushikesh, Bezawada Sri Sai Anurag et al.
We develop a unified, dynamical-systems narrative of the universe that traces a continuous chain of structure formation from the Big Bang to contemporary human societies and their artificial learning systems. Rather than treating cosmology, astrophysics, geophysics, biology, cognition, and machine intelligence as disjoint domains, we view each as successive regimes of dynamics on ever-richer state spaces, stitched together by phase transitions, symmetry-breaking events, and emergent attractors. Starting from inflationary field dynamics and the growth of primordial perturbations, we describe how gravitational instability sculpts the cosmic web, how dissipative collapse in baryonic matter yields stars and planets, and how planetary-scale geochemical cycles define long-lived nonequilibrium attractors. Within these attractors, we frame the origin of life as the emergence of self-maintaining reaction networks, evolutionary biology as flow on high-dimensional genotype-phenotype-environment manifolds, and brains as adaptive dynamical systems operating near critical surfaces. Human culture and technology-including modern machine learning and artificial intelligence-are then interpreted as symbolic and institutional dynamics that implement and refine engineered learning flows which recursively reshape their own phase space. Throughout, we emphasize recurring mathematical motifs-instability, bifurcation, multiscale coupling, and constrained flows on measure-zero subsets of the accessible state space. Our aim is not to present any new cosmological or biological model, but a cross-scale, theoretical perspective: a way of reading the universe's history as the evolution of dynamics itself, culminating (so far) in biological and artificial systems capable of modeling, predicting, and deliberately perturbing their own future trajectories.
CVDec 15, 2024
Facial Surgery Preview Based on the Orthognathic Treatment PredictionHuijun Han, Congyi Zhang, Lifeng Zhu et al.
Orthognathic surgery consultation is essential to help patients understand the changes to their facial appearance after surgery. However, current visualization methods are often inefficient and inaccurate due to limited pre- and post-treatment data and the complexity of the treatment. To overcome these challenges, this study aims to develop a fully automated pipeline that generates accurate and efficient 3D previews of postsurgical facial appearances for patients with orthognathic treatment without requiring additional medical images. The study introduces novel aesthetic losses, such as mouth-convexity and asymmetry losses, to improve the accuracy of facial surgery prediction. Additionally, it proposes a specialized parametric model for 3D reconstruction of the patient, medical-related losses to guide latent code prediction network optimization, and a data augmentation scheme to address insufficient data. The study additionally employs FLAME, a parametric model, to enhance the quality of facial appearance previews by extracting facial latent codes and establishing dense correspondences between pre- and post-surgery geometries. Quantitative comparisons showed the algorithm's effectiveness, and qualitative results highlighted accurate facial contour and detail predictions. A user study confirmed that doctors and the public could not distinguish between machine learning predictions and actual postoperative results. This study aims to offer a practical, effective solution for orthognathic surgery consultations, benefiting doctors and patients.
LGAug 25, 2025
HypER: Hyperbolic Echo State Networks for Capturing Stretch-and-Fold Dynamics in Chaotic FlowsPradeep Singh, Sutirtha Ghosh, Ashutosh Kumar et al.
Forecasting chaotic dynamics beyond a few Lyapunov times is difficult because infinitesimal errors grow exponentially. Existing Echo State Networks (ESNs) mitigate this growth but employ reservoirs whose Euclidean geometry is mismatched to the stretch-and-fold structure of chaos. We introduce the Hyperbolic Embedding Reservoir (HypER), an ESN whose neurons are sampled in the Poincare ball and whose connections decay exponentially with hyperbolic distance. This negative-curvature construction embeds an exponential metric directly into the latent space, aligning the reservoir's local expansion-contraction spectrum with the system's Lyapunov directions while preserving standard ESN features such as sparsity, leaky integration, and spectral-radius control. Training is limited to a Tikhonov-regularized readout. On the chaotic Lorenz-63 and Roessler systems, and the hyperchaotic Chen-Ueta attractor, HypER consistently lengthens the mean valid-prediction horizon beyond Euclidean and graph-structured ESN baselines, with statistically significant gains confirmed over 30 independent runs; parallel results on real-world benchmarks, including heart-rate variability from the Santa Fe and MIT-BIH datasets and international sunspot numbers, corroborate its advantage. We further establish a lower bound on the rate of state divergence for HypER, mirroring Lyapunov growth.
CVJan 31, 2025
Training-free Quantum-Inspired Image Edge Extraction MethodArti Jain, Pradeep Singh
Edge detection is a cornerstone of image processing, yet existing methods often face critical limitations. Traditional deep learning edge detection methods require extensive training datasets and fine-tuning, while classical techniques often fail in complex or noisy scenarios, limiting their real-world applicability. To address these limitations, we propose a training-free, quantum-inspired edge detection model. Our approach integrates classical Sobel edge detection, the Schrödinger wave equation refinement, and a hybrid framework combining Canny and Laplacian operators. By eliminating the need for training, the model is lightweight and adaptable to diverse applications. The Schrödinger wave equation refines gradient-based edge maps through iterative diffusion, significantly enhancing edge precision. The hybrid framework further strengthens the model by synergistically combining local and global features, ensuring robustness even under challenging conditions. Extensive evaluations on datasets like BIPED, Multicue, and NYUD demonstrate superior performance of the proposed model, achieving state-of-the-art metrics, including ODS, OIS, AP, and F-measure. Noise robustness experiments highlight its reliability, showcasing its practicality for real-world scenarios. Due to its versatile and adaptable nature, our model is well-suited for applications such as medical imaging, autonomous systems, and environmental monitoring, setting a new benchmark for edge detection.
LGSep 4, 2025
Echo State Networks as State-Space Models: A Systems PerspectivePradeep Singh, Balasubramanian Raman
Echo State Networks (ESNs) are typically presented as efficient, readout-trained recurrent models, yet their dynamics and design are often guided by heuristics rather than first principles. We recast ESNs explicitly as state-space models (SSMs), providing a unified systems-theoretic account that links reservoir computing with classical identification and modern kernelized SSMs. First, we show that the echo-state property is an instance of input-to-state stability for a contractive nonlinear SSM and derive verifiable conditions in terms of leak, spectral scaling, and activation Lipschitz constants. Second, we develop two complementary mappings: (i) small-signal linearizations that yield locally valid LTI SSMs with interpretable poles and memory horizons; and (ii) lifted/Koopman random-feature expansions that render the ESN a linear SSM in an augmented state, enabling transfer-function and convolutional-kernel analyses. This perspective yields frequency-domain characterizations of memory spectra and clarifies when ESNs emulate structured SSM kernels. Third, we cast teacher forcing as state estimation and propose Kalman/EKF-assisted readout learning, together with EM for hyperparameters (leak, spectral radius, process/measurement noise) and a hybrid subspace procedure for spectral shaping under contraction constraints.
LGAug 25, 2025
Frozen in Time: Parameter-Efficient Time Series Transformers via Reservoir-Induced Feature Expansion and Fixed Random DynamicsPradeep Singh, Mehak Sharma, Anupriya Dey et al.
Transformers are the de-facto choice for sequence modelling, yet their quadratic self-attention and weak temporal bias can make long-range forecasting both expensive and brittle. We introduce FreezeTST, a lightweight hybrid that interleaves frozen random-feature (reservoir) blocks with standard trainable Transformer layers. The frozen blocks endow the network with rich nonlinear memory at no optimisation cost; the trainable layers learn to query this memory through self-attention. The design cuts trainable parameters and also lowers wall-clock training time, while leaving inference complexity unchanged. On seven standard long-term forecasting benchmarks, FreezeTST consistently matches or surpasses specialised variants such as Informer, Autoformer, and PatchTST; with substantially lower compute. Our results show that embedding reservoir principles within Transformers offers a simple, principled route to efficient long-term time-series prediction.
LGApr 16, 2025
Dynamics and Computational Principles of Echo State Networks: A Mathematical PerspectivePradeep Singh, Ashutosh Kumar, Sutirtha Ghosh et al.
Reservoir computing (RC) represents a class of state-space models (SSMs) characterized by a fixed state transition mechanism (the reservoir) and a flexible readout layer that maps from the state space. It is a paradigm of computational dynamical systems that harnesses the transient dynamics of high-dimensional state spaces for efficient processing of temporal data. Rooted in concepts from recurrent neural networks, RC achieves exceptional computational power by decoupling the training of the dynamic reservoir from the linear readout layer, thereby circumventing the complexities of gradient-based optimization. This work presents a systematic exploration of RC, addressing its foundational properties such as the echo state property, fading memory, and reservoir capacity through the lens of dynamical systems theory. We formalize the interplay between input signals and reservoir states, demonstrating the conditions under which reservoirs exhibit stability and expressive power. Further, we delve into the computational trade-offs and robustness characteristics of RC architectures, extending the discussion to their applications in signal processing, time-series prediction, and control systems. The analysis is complemented by theoretical insights into optimization, training methodologies, and scalability, highlighting open challenges and potential directions for advancing the theoretical underpinnings of RC.
CVMay 17, 2023
Transcending Grids: Point Clouds and Surface Representations Powering Neurological ProcessingKishore Babu Nampalle, Pradeep Singh, Vivek Narayan Uppala et al.
In healthcare, accurately classifying medical images is vital, but conventional methods often hinge on medical data with a consistent grid structure, which may restrict their overall performance. Recent medical research has been focused on tweaking the architectures to attain better performance without giving due consideration to the representation of data. In this paper, we present a novel approach for transforming grid based data into its higher dimensional representations, leveraging unstructured point cloud data structures. We first generate a sparse point cloud from an image by integrating pixel color information as spatial coordinates. Next, we construct a hypersurface composed of points based on the image dimensions, with each smooth section within this hypersurface symbolizing a specific pixel location. Polygonal face construction is achieved using an adjacency tensor. Finally, a dense point cloud is generated by densely sampling the constructed hypersurface, with a focus on regions of higher detail. The effectiveness of our approach is demonstrated on a publicly accessible brain tumor dataset, achieving significant improvements over existing classification techniques. This methodology allows the extraction of intricate details from the original image, opening up new possibilities for advanced image analysis and processing tasks.
IVNov 8, 2021
Dense Representative Tooth Landmark/axis Detection Network on 3D ModelGuangshun Wei, Zhiming Cui, Jie Zhu et al.
Artificial intelligence (AI) technology is increasingly used for digital orthodontics, but one of the challenges is to automatically and accurately detect tooth landmarks and axes. This is partly because of sophisticated geometric definitions of them, and partly due to large variations among individual tooth and across different types of tooth. As such, we propose a deep learning approach with a labeled dataset by professional dentists to the tooth landmark/axis detection on tooth model that are crucial for orthodontic treatments. Our method can extract not only tooth landmarks in the form of point (e.g. cusps), but also axes that measure the tooth angulation and inclination. The proposed network takes as input a 3D tooth model and predicts various types of the tooth landmarks and axes. Specifically, we encode the landmarks and axes as dense fields defined on the surface of the tooth model. This design choice and a set of added components make the proposed network more suitable for extracting sparse landmarks from a given 3D tooth model. Extensive evaluation of the proposed method was conducted on a set of dental models prepared by experienced dentists. Results show that our method can produce tooth landmarks with high accuracy. Our method was examined and justified via comparison with the state-of-the-art methods as well as the ablation studies.
SEAug 25, 2020
Software Effort Estimation using parameter tuned ModelsAkanksha Baghel, Meemansa Rathod, Pradeep Singh
Software estimation is one of the most important activities in the software project. The software effort estimation is required in the early stages of software life cycle. Project Failure is the major problem undergoing nowadays as seen by software project managers. The imprecision of the estimation is the reason for this problem. Assize of software size grows, it also makes a system complex, thus difficult to accurately predict the cost of software development process. The greatest pitfall of the software industry was the fast-changing nature of software development which has made it difficult to develop parametric models that yield high accuracy for software development in all domains. We need the development of useful models that accurately predict the cost of developing a software product. This study presents the novel analysis of various regression models with hyperparameter tuning to get the effective model. Nine different regression techniques are considered for model development
AIApr 16, 2020
Multi-Objective Evolutionary approach for the Performance Improvement of Learners using Ensembling Feature selection and Discretization Technique on Medical dataDeepak Singh, Dilip Singh Sisodia, Pradeep Singh
Biomedical data is filled with continuous real values; these values in the feature set tend to create problems like underfitting, the curse of dimensionality and increase in misclassification rate because of higher variance. In response, pre-processing techniques on dataset minimizes the side effects and have shown success in maintaining the adequate accuracy. Feature selection and discretization are the two necessary preprocessing steps that were effectively employed to handle the data redundancies in the biomedical data. However, in the previous works, the absence of unified effort by integrating feature selection and discretization together in solving the data redundancy problem leads to the disjoint and fragmented field. This paper proposes a novel multi-objective based dimensionality reduction framework, which incorporates both discretization and feature reduction as an ensemble model for performing feature selection and discretization. Selection of optimal features and the categorization of discretized and non-discretized features from the feature subset is governed by the multi-objective genetic algorithm (NSGA-II). The two objective, minimizing the error rate during the feature selection and maximizing the information gain while discretization is considered as fitness criteria.