LGApr 3, 2023
X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMsGiacomo Pedretti, John Moon, Pedro Bruel et al.
Structured, or tabular, data is the most common format in data science. While deep learning models have proven formidable in learning from unstructured data such as images or speech, they are less accurate than simpler approaches when learning from tabular data. In contrast, modern tree-based Machine Learning (ML) models shine in extracting relevant information from structured data. An essential requirement in data science is to reduce model inference latency in cases where, for example, models are used in a closed loop with simulation to accelerate scientific discovery. However, the hardware acceleration community has mostly focused on deep neural networks and largely ignored other forms of machine learning. Previous work has described the use of an analog content addressable memory (CAM) component for efficiently mapping random forests. In this work, we develop an analog-digital architecture that implements a novel increased precision analog CAM and a programmable chip for inference of state-of-the-art tree-based ML models, such as XGBoost, CatBoost, and others. Thanks to hardware-aware training, X-TIME reaches state-of-the-art accuracy and 119x higher throughput at 9740x lower latency with >150x improved energy efficiency compared with a state-of-the-art GPU for models with up to 4096 trees and depth of 8, with a 19W peak power consumption.
ARNov 29, 2023
RACE-IT: A Reconfigurable Analog Computing Engine for In-Memory Transformer AccelerationLei Zhao, Aishwarya Natarajan, Luca Buonanno et al.
Transformer models represent the cutting edge of Deep Neural Networks (DNNs) and excel in a wide range of machine learning tasks. However, processing these models demands significant computational resources and results in a substantial memory footprint. While In-memory Computing (IMC)offers promise for accelerating Vector-Matrix Multiplications(VMMs) with high computational parallelism and minimal data movement, employing it for other crucial DNN operators remains a formidable task. This challenge is exacerbated by the extensive use of complex activation functions, Softmax, and data-dependent matrix multiplications (DMMuls) within Transformer models. To address this challenge, we introduce a Reconfigurable Analog Computing Engine (RACE) by enhancing Analog Content Addressable Memories (ACAMs) to support broader operations. Based on the RACE, we propose the RACE-IT accelerator (meaning RACE for In-memory Transformers) to enable efficient analog-domain execution of all core operations of Transformer models. Given the flexibility of our proposed RACE in supporting arbitrary computations, RACE-IT is well-suited for adapting to emerging and non-traditional DNN architectures without requiring hardware modifications. We compare RACE-IT with various accelerators. Results show that RACE-IT increases performance by 453x and 15x, and reduces energy by 354x and 122x over the state-of-the-art GPUs and existing Transformer-specific IMC accelerators, respectively.
LGNov 5, 2025
Forecast2Anomaly (F2A): Adapting Multivariate Time Series Foundation Models for Anomaly PredictionAtif Hassan, Tarun Kumar, Ashish Mishra et al.
Forecasting anomalies (anomaly prediction) in multivariate time series from different real-world, dynamic, and complex systems is vital for preempting critical failures, leading to a substantial minimization in operational costs and human labor. Yet, existing methods are limited to specific systems while failing to generalize to evolving anomaly patterns over time. In contrast, pretrained Time Series Foundation Models (TSFMs) have recently demonstrated strong generalization and zero-shot forecasting capabilities. However, their potential remains untapped for anomaly prediction, a task fundamentally different from forecasting normal behavior. Thus, we present Forecast2Anomaly (F2A), a novel framework that empowers TSFMs with anomaly prediction abilities through two key innovations. First, we propose a joint forecast-anomaly loss that fine-tunes TSFMs to accurately forecast future signals even at anomalous time points. Second, we introduce a Retrieval-Augmented Generation (RAG) module that retrieves historically relevant horizons and conditions predictions on them. This component dynamically adapts to distributional shifts at inference time, enabling F2A to track evolving anomalies without requiring model updates. By combining targeted fine-tuning with dynamic retrieval, F2A bridges the gap between robust TSFM zero-shot forecasting and zero-shot anomaly prediction. Extensive experiments across 16 diverse datasets and multiple TSFM backbones show that F2A consistently outperforms state-of-the-art methods, offering a scalable, zero-shot anomaly prediction solution for real-world applications.