Sina Khoshfetrat Pakazad

h-index10

7papers

375citations

Novelty54%

AI Score43

Ranked #54,070 of 194,257 authors (top 28%)#12,325 in LG (top 31%)

7 Papers

6.6LGMay 29

Giving Sensors a Voice: Multimodal JEPA for Semantic Time-Series Embeddings

Utsav Dutta, Gerardo Pastrana, Sina Khoshfetrat Pakazad et al.

Transformer-based architectures have advanced sequence modeling in language and vision, yet general-purpose representation learning for heterogeneous multivariate time series remains underexplored. We introduce CHARM (Channel-Aware Representation Model), which incorporates channel-level textual descriptions into a Transformer encoder equivariant to channel order. CHARM is trained with a Joint Embedding Predictive Architecture (JEPA) and a novel loss promoting informative, temporally stable embeddings; latent-space prediction encourages robustness to sensor noise while description-aware gating provides interpretability through learned inter-channel relationships. Across anomaly detection, classification, and short- and long-term forecasting, the learned embeddings achieve strong performance using only a linear probe. Performance is driven primarily by the JEPA objective and conditioning architecture, with text descriptions serving as channel identifiers for cross-dataset generalization.

2.4AIJan 29

NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents

Yang Song, Anoushka Vyas, Zirui Wei et al.

In this paper, we present NEMO, a system that translates Natural-language descriptions of decision problems into formal Executable Mathematical Optimization implementations, operating collaboratively with users or autonomously. Existing approaches typically rely on specialized large language models (LLMs) or bespoke, task-specific agents. Such methods are often brittle, complex and frequently generating syntactically invalid or non-executable code. NEMO instead centers on remote interaction with autonomous coding agents (ACAs), treated as a first-class abstraction analogous to API-based interaction with LLMs. This design enables the construction of higher-level systems around ACAs that structure, consolidate, and iteratively refine task specifications. Because ACAs execute within sandboxed environments, code produced by NEMO is executable by construction, allowing automated validation and repair. Building on this, we introduce novel coordination patterns with and across ACAs, including asymmetric validation loops between independently generated optimizer and simulator implementations (serving as a high-level validation mechanism), external memory for experience reuse, and robustness enhancements via minimum Bayes risk (MBR) decoding and self-consistency. We evaluate NEMO on nine established optimization benchmarks. As depicted in Figure 1, it achieves state-of-the-art performance on the majority of tasks, with substantial margins on several datasets, demonstrating the power of execution-aware agentic architectures for automated optimization modeling.

18.2MAJun 17

Data Intelligence Agents: Interpreting, Modeling, and Querying Enterprise Data via Autonomous Coding Agents

Anoushka Vyas, Aarushi Dhanuka, Sina Khoshfetrat Pakazad et al.

Production data integration is bottlenecked by repeated, lossy handoffs between data owners, engineers, and analysts who must collaboratively discover, structure, and query enterprise data. We present Data Intelligence Agents (DIA), a system of three agents (Data Interpreter, Schema Creator, and Query Generator) that compresses this workflow by treating autonomous coding agents (ACAs) as a first-class abstraction: rather than emitting text, the agents generate, execute, validate, and repair concrete artifacts, draw on a shared memory for experience reuse, and surface each for review by domain experts. DIA is deployed in production for enterprise customers. We study the Query Generator in depth and evaluate it in fully autonomous mode across seven SQL benchmarks spanning four task categories and four dialects. It matches or surpasses the best published results on all seven, demonstrating that an architecture grounded in execution, built on ACAs and a shared memory, generalizes across the data intelligence workload with adaptation confined to natural-language instructions.

7.1LGMay 20, 2025

Time to Embed: Unlocking Foundation Models for Time Series with Channel Descriptions

Utsav Dutta, Sina Khoshfetrat Pakazad, Henrik Ohlsson

Traditional time series models are task-specific and often depend on dataset-specific training and extensive feature engineering. While Transformer-based architectures have improved scalability, foundation models, commonplace in text, vision, and audio, remain under-explored for time series and are largely restricted to forecasting. We introduce $\textbf{CHARM}$, a foundation embedding model for multivariate time series that learns shared, transferable, and domain-aware representations. To address the unique difficulties of time series foundation learning, $\textbf{CHARM}$ incorporates architectural innovations that integrate channel-level textual descriptions while remaining invariant to channel order. The model is trained using a Joint Embedding Predictive Architecture (JEPA), with novel augmentation schemes and a loss function designed to improve interpretability and training stability. Our $7$M-parameter model achieves state-of-the-art performance across diverse downstream tasks, setting a new benchmark for time series representation learning.

1.0LGDec 9, 2019

Expert-guided Regularization via Distance Metric Learning

Shouvik Mani, Mehdi Maasoumy, Sina Pakazad et al.

High-dimensional prediction is a challenging problem setting for traditional statistical models. Although regularization improves model performance in high dimensions, it does not sufficiently leverage knowledge on feature importances held by domain experts. As an alternative to standard regularization techniques, we propose Distance Metric Learning Regularization (DMLreg), an approach for eliciting prior knowledge from domain experts and integrating that knowledge into a regularized linear model. First, we learn a Mahalanobis distance metric between observations from pairwise similarity comparisons provided by an expert. Then, we use the learned distance metric to place prior distributions on coefficients in a linear model. Through experimental results on a simulated high-dimensional prediction problem, we show that DMLreg leads to improvements in model performance when the domain expert is knowledgeable.

1.2SYMar 21, 2016

A Scalable and Distributed Solution to the Inertial Motion Capture Problem

Manon Kok, Sina Khoshfetrat Pakazad, Thomas B. Schön et al.

In inertial motion capture, a multitude of body segments are equipped with inertial sensors, consisting of 3D accelerometers and 3D gyroscopes. Using an optimization-based approach to solve the motion capture problem allows for natural inclusion of biomechanical constraints and for modeling the connection of the body segments at the joint locations. The computational complexity of solving this problem grows both with the length of the data set and with the number of sensors and body segments considered. In this work, we present a scalable and distributed solution to this problem using tailored message passing, capable of exploiting the structure that is inherent in the problem. As a proof-of-concept we apply our algorithm to data from a lower body configuration.

1.6LGSep 20, 2013

Scalable Anomaly Detection in Large Homogenous Populations

Henrik Ohlsson, Tianshi Chen, Sina Khoshfetrat Pakazad et al.

Anomaly detection in large populations is a challenging but highly relevant problem. The problem is essentially a multi-hypothesis problem, with a hypothesis for every division of the systems into normal and anomal systems. The number of hypothesis grows rapidly with the number of systems and approximate solutions become a necessity for any problems of practical interests. In the current paper we take an optimization approach to this multi-hypothesis problem. We first observe that the problem is equivalent to a non-convex combinatorial optimization problem. We then relax the problem to a convex problem that can be solved distributively on the systems and that stays computationally tractable as the number of systems increase. An interesting property of the proposed method is that it can under certain conditions be shown to give exactly the same result as the combinatorial multi-hypothesis problem and the relaxation is hence tight.