MLApr 1, 2022
Probabilistic AutoRegressive Neural Networks for Accurate Long-range ForecastingMadhurima Panja, Tanujit Chakraborty, Uttam Kumar et al.
Forecasting time series data is a critical area of research with applications spanning from stock prices to early epidemic prediction. While numerous statistical and machine learning methods have been proposed, real-life prediction problems often require hybrid solutions that bridge classical forecasting approaches and modern neural network models. In this study, we introduce the Probabilistic AutoRegressive Neural Networks (PARNN), capable of handling complex time series data exhibiting non-stationarity, nonlinearity, non-seasonality, long-range dependence, and chaotic patterns. PARNN is constructed by improving autoregressive neural networks (ARNN) using autoregressive integrated moving average (ARIMA) feedback error, combining the explainability, scalability, and "white-box-like" prediction behavior of both models. Notably, the PARNN model provides uncertainty quantification through prediction intervals, setting it apart from advanced deep learning tools. Through comprehensive computational experiments, we evaluate the performance of PARNN against standard statistical, machine learning, and deep learning models, including Transformers, NBeats, and DeepAR. Diverse real-world datasets from macroeconomics, tourism, epidemiology, and other domains are employed for short-term, medium-term, and long-term forecasting evaluations. Our results demonstrate the superiority of PARNN across various forecast horizons, surpassing the state-of-the-art forecasters. The proposed PARNN model offers a valuable hybrid solution for accurate long-range forecasting. By effectively capturing the complexities present in time series data, it outperforms existing methods in terms of accuracy and reliability. The ability to quantify uncertainty through prediction intervals further enhances the model's usefulness in decision-making processes.
PEDec 16, 2022
An ensemble neural network approach to forecast Dengue outbreak based on climatic conditionMadhurima Panja, Tanujit Chakraborty, Sk Shahid Nadim et al.
Dengue fever is a virulent disease spreading over 100 tropical and subtropical countries in Africa, the Americas, and Asia. This arboviral disease affects around 400 million people globally, severely distressing the healthcare systems. The unavailability of a specific drug and ready-to-use vaccine makes the situation worse. Hence, policymakers must rely on early warning systems to control intervention-related decisions. Forecasts routinely provide critical information for dangerous epidemic events. However, the available forecasting models (e.g., weather-driven mechanistic, statistical time series, and machine learning models) lack a clear understanding of different components to improve prediction accuracy and often provide unstable and unreliable forecasts. This study proposes an ensemble wavelet neural network with exogenous factor(s) (XEWNet) model that can produce reliable estimates for dengue outbreak prediction for three geographical regions, namely San Juan, Iquitos, and Ahmedabad. The proposed XEWNet model is flexible and can easily incorporate exogenous climate variable(s) confirmed by statistical causality tests in its scalable framework. The proposed model is an integrated approach that uses wavelet transformation into an ensemble neural network framework that helps in generating more reliable long-term forecasts. The proposed XEWNet allows complex non-linear relationships between the dengue incidence cases and rainfall; however, mathematically interpretable, fast in execution, and easily comprehensible. The proposal's competitiveness is measured using computational experiments based on various statistical metrics and several statistical comparison tests. In comparison with statistical, machine learning, and deep learning methods, our proposed XEWNet performs better in 75% of the cases for short-term and long-term forecasting of dengue incidence.
LGJun 21, 2022
Epicasting: An Ensemble Wavelet Neural Network (EWNet) for Forecasting EpidemicsMadhurima Panja, Tanujit Chakraborty, Uttam Kumar et al.
Infectious diseases remain among the top contributors to human illness and death worldwide, among which many diseases produce epidemic waves of infection. The unavailability of specific drugs and ready-to-use vaccines to prevent most of these epidemics makes the situation worse. These force public health officials and policymakers to rely on early warning systems generated by reliable and accurate forecasts of epidemics. Accurate forecasts of epidemics can assist stakeholders in tailoring countermeasures, such as vaccination campaigns, staff scheduling, and resource allocation, to the situation at hand, which could translate to reductions in the impact of a disease. Unfortunately, most of these past epidemics exhibit nonlinear and non-stationary characteristics due to their spreading fluctuations based on seasonal-dependent variability and the nature of these epidemics. We analyse a wide variety of epidemic time series datasets using a maximal overlap discrete wavelet transform (MODWT) based autoregressive neural network and call it EWNet model. MODWT techniques effectively characterize non-stationary behavior and seasonal dependencies in the epidemic time series and improve the nonlinear forecasting scheme of the autoregressive neural network in the proposed ensemble wavelet network framework. From a nonlinear time series viewpoint, we explore the asymptotic stationarity of the proposed EWNet model to show the asymptotic behavior of the associated Markov Chain. We also theoretically investigate the effect of learning stability and the choice of hidden neurons in the proposal. From a practical perspective, we compare our proposed EWNet framework with several statistical, machine learning, and deep learning models. Experimental results show that the proposed EWNet is highly competitive compared to the state-of-the-art epidemic forecasting methods.
LGJun 9, 2023
Prediction of Transportation Index for Urban Patterns in Small and Medium-sized Indian Cities using Hybrid RidgeGAN ModelRahisha Thottolil, Uttam Kumar, Tanujit Chakraborty
The rapid urbanization trend in most developing countries including India is creating a plethora of civic concerns such as loss of green space, degradation of environmental health, clean water availability, air pollution, traffic congestion leading to delays in vehicular transportation, etc. Transportation and network modeling through transportation indices have been widely used to understand transportation problems in the recent past. This necessitates predicting transportation indices to facilitate sustainable urban planning and traffic management. Recent advancements in deep learning research, in particular, Generative Adversarial Networks (GANs), and their modifications in spatial data analysis such as CityGAN, Conditional GAN, and MetroGAN have enabled urban planners to simulate hyper-realistic urban patterns. These synthetic urban universes mimic global urban patterns and evaluating their landscape structures through spatial pattern analysis can aid in comprehending landscape dynamics, thereby enhancing sustainable urban planning. This research addresses several challenges in predicting the urban transportation index for small and medium-sized Indian cities. A hybrid framework based on Kernel Ridge Regression (KRR) and CityGAN is introduced to predict transportation index using spatial indicators of human settlement patterns. This paper establishes a relationship between the transportation index and human settlement indicators and models it using KRR for the selected 503 Indian cities. The proposed hybrid pipeline, we call it RidgeGAN model, can evaluate the sustainability of urban sprawl associated with infrastructure development and transportation systems in sprawling cities. Experimental results show that the two-step pipeline approach outperforms existing benchmarks based on spatial and statistical measures.
LGJul 5, 2024
Spatiotemporal Forecasting of Traffic Flow using Wavelet-based Temporal AttentionYash Jakhmola, Madhurima Panja, Nitish Kumar Mishra et al.
Spatiotemporal forecasting of traffic flow data represents a typical problem in the field of machine learning, impacting urban traffic management systems. In general, spatiotemporal forecasting problems involve complex interactions, nonlinearities, and long-range dependencies due to the interwoven nature of the temporal and spatial dimensions. Due to this, traditional statistical and machine learning methods cannot adequately handle the temporal and spatial dependencies in these complex traffic flow datasets. A prevalent approach in the field combines graph convolutional networks and multi-head attention mechanisms for spatiotemporal processing. This paper proposes a wavelet-based temporal attention model, namely a wavelet-based dynamic spatiotemporal aware graph neural network (W-DSTAGNN), for tackling the traffic forecasting problem. Wavelet decomposition can help by decomposing the signal into components that can be analyzed independently, reducing the impact of non-stationarity and handling long-range dependencies of traffic flow datasets. Benchmark experiments using three popularly used statistical metrics confirm that our proposal efficiently captures spatiotemporal correlations and outperforms ten state-of-the-art models (including both temporal and spatiotemporal benchmarks) on three publicly available traffic datasets. Our proposed ensemble method can better handle dynamic temporal and spatial dependencies and make reliable long-term forecasts. In addition to point forecasts, our proposed model can generate interval forecasts that significantly enhance probabilistic forecasting for traffic datasets.
LGSep 26, 2024
Caption, Create, Continue: Continual Learning with Pre-trained Generative Vision-Language ModelsIndu Solomon, Aye Phyu Phyu Aung, Uttam Kumar et al.
Continual learning (CL) enables models to adapt to evolving data streams without catastrophic forgetting, a fundamental requirement for real-world AI systems. However, the current methods often depend on large replay buffers or heavily annotated datasets which are impractical due to storage, privacy, and cost constraints. We propose CLTS (Continual Learning via Text-Image Synergy), a novel class-incremental framework that mitigates forgetting without storing real task data. CLTS leverages pre-trained vision-language models, BLIP (Bootstrapping Language-Image Pre-training) for caption generation and stable diffusion for sample generation. Each task is handled by a dedicated Task Head, while a Task Router learns to assign inputs to the correct Task Head using the generated data. On three benchmark datasets, CLTS improves average task accuracy by up to 54% and achieves 63 times better memory efficiency compared to four recent continual learning baselines, demonstrating improved retention and adaptability. CLTS introduces a novel perspective by integrating generative text-image augmentation for scalable continual learning.
DCApr 20
Unlocking the Edge deployment and ondevice acceleration of multi-LoRA enabled one-for-all foundational LLMSravanth Kodavanti, Sowmya Vajrala, Srinivas Miriyala et al.
Deploying large language models (LLMs) on smartphones poses significant engineering challenges due to stringent constraints on memory, latency, and runtime flexibility. In this work, we present a hardware-aware framework for efficient on-device inference of a LLaMA-based multilingual foundation model supporting multiple use cases on Samsung Galaxy S24 and S25 devices with SM8650 and SM8750 Qualcomm chipsets respectively. Our approach integrates application-specific LoRAs as runtime inputs to a single frozen inference graph, enabling dynamic task switching without recompilation or memory overhead. We further introduce a multi-stream decoding mechanism that concurrently generates stylistic variations - such as formal, polite, or jovial responses - within a single forward pass, reducing latency by up to 6x. To accelerate token generation, we apply Dynamic Self-Speculative Decoding (DS2D), a tree-based strategy that predicts future tokens without requiring a draft model, yielding up to 2.3x speedup in decode time. Combined with quantization to INT4 and architecture-level optimizations, our system achieves 4-6x overall improvements in memory and latency while maintaining accuracy across 9 languages and 8 tasks. These results demonstrate practical feasibility of deploying multi-use-case LLMs on edge devices, advancing the commercial viability of Generative AI in mobile platforms.
CRJan 30, 2021Code
Efficient CNN Building Blocks for Encrypted DataNayna Jain, Karthik Nandakumar, Nalini Ratha et al.
Machine learning on encrypted data can address the concerns related to privacy and legality of sharing sensitive data with untrustworthy service providers. Fully Homomorphic Encryption (FHE) is a promising technique to enable machine learning and inferencing while providing strict guarantees against information leakage. Since deep convolutional neural networks (CNNs) have become the machine learning tool of choice in several applications, several attempts have been made to harness CNNs to extract insights from encrypted data. However, existing works focus only on ensuring data security and ignore security of model parameters. They also report high level implementations without providing rigorous analysis of the accuracy, security, and speed trade-offs involved in the FHE implementation of generic primitive operators of a CNN such as convolution, non-linear activation, and pooling. In this work, we consider a Machine Learning as a Service (MLaaS) scenario where both input data and model parameters are secured using FHE. Using the CKKS scheme available in the open-source HElib library, we show that operational parameters of the chosen FHE scheme such as the degree of the cyclotomic polynomial, depth limitations of the underlying leveled HE scheme, and the computational precision parameters have a major impact on the design of the machine learning model (especially, the choice of the activation function and pooling method). Our empirical study shows that choice of aforementioned design parameters result in significant trade-offs between accuracy, security level, and computational time. Encrypted inference experiments on the MNIST dataset indicate that other design choices such as ciphertext packing strategy and parallelization using multithreading are also critical in determining the throughput and latency of the inference process.
LGMay 23, 2024
U-TELL: Unsupervised Task Expert Lifelong LearningIndu Solomon, Aye Phyu Phyu Aung, Uttam Kumar et al.
Continual learning (CL) models are designed to learn new tasks arriving sequentially without re-training the network. However, real-world ML applications have very limited label information and these models suffer from catastrophic forgetting. To address these issues, we propose an unsupervised CL model with task experts called Unsupervised Task Expert Lifelong Learning (U-TELL) to continually learn the data arriving in a sequence addressing catastrophic forgetting. During training of U-TELL, we introduce a new expert on arrival of a new task. Our proposed architecture has task experts, a structured data generator and a task assigner. Each task expert is composed of 3 blocks; i) a variational autoencoder to capture the task distribution and perform data abstraction, ii) a k-means clustering module, and iii) a structure extractor to preserve latent task data signature. During testing, task assigner selects a suitable expert to perform clustering. U-TELL does not store or replay task samples, instead, we use generated structured samples to train the task assigner. We compared U-TELL with five SOTA unsupervised CL methods. U-TELL outperformed all baselines on seven benchmarks and one industry dataset for various CL scenarios with a training time over 6 times faster than the best performing baseline.
CLDec 7, 2021
Semantic Answer Type and Relation Prediction Task (SMART 2021)Nandana Mihindukulasooriya, Mohnish Dubey, Alfio Gliozzo et al.
Each year the International Semantic Web Conference organizes a set of Semantic Web Challenges to establish competitions that will advance state-of-the-art solutions in some problem domains. The Semantic Answer Type and Relation Prediction Task (SMART) task is one of the ISWC 2021 Semantic Web challenges. This is the second year of the challenge after a successful SMART 2020 at ISWC 2020. This year's version focuses on two sub-tasks that are very important to Knowledge Base Question Answering (KBQA): Answer Type Prediction and Relation Prediction. Question type and answer type prediction can play a key role in knowledge base question answering systems providing insights about the expected answer that are helpful to generate correct queries or rank the answer candidates. More concretely, given a question in natural language, the first task is, to predict the answer type using a target ontology (e.g., DBpedia or Wikidata. Similarly, the second task is to identify relations in the natural language query and link them to the relations in a target ontology. This paper discusses the task descriptions, benchmark datasets, and evaluation metrics. For more information, please visit https://smart-task.github.io/2021/.