Amit Saha

AI
h-index: 43
Papers: 5
Citations: 45
Novelty: 33%
AI Score: 44

5 Papers

AI | May 17 | Code
CBT-Audio: Evaluating Audio Language Models for Patient-Side Distress Intensity Estimation in CBT Session Recordings

Qixuan Hu, Shuchang Ye, Xumou Zhang et al.

Cognitive behavioural therapy is widely used to help patients understand and manage psychological distress. It is often delivered through spoken conversation, where therapists attend not only to what patients say, but also to how they say it, because these cues can help therapists decide how to respond and adapt treatment. Progress in building AI systems for CBT remains largely limited to text, partly because most available datasets are text-based and shareable spoken CBT data are scarce under ethical and privacy constraints. This creates a blind spot because text-based models and evaluations cannot capture the mismatch between the transcript and the patient's voice, even though therapists often rely on this mismatch to understand patient distress. We introduce CBT-Audio, a dataset for evaluating patient distress estimation from spoken CBT sessions with audio language models. CBT-Audio contains 1,802 patient turns from 96 publicly available CBT recordings, with turn-level distress labels validated on an expert-annotated subset. We evaluate 10 open-source audio language models under three input conditions, where models receive only patient audio, only the transcript, or both audio and transcript. Our results show that audio can provide useful information beyond text, especially when combined with transcripts. Adding audio to transcript input improves distress estimation over using the transcript alone in 8 of 10 model families, with significant gains in 4, and case studies show the clearest benefit when verbal content and vocal delivery diverge. CBT-Audio makes spoken patient behaviour measurable for AI evaluation in CBT-related tasks and supports future work on audio language models for mental health interaction.

QUANT-PH | Mar 31
Noise Inference by Recycling Test Rounds in Verification Protocols

Amit Saha, Harold Ollivier

Interactive verification protocols for quantum computations build trust between a client and a service provider, assuring the former that the instructed computation was carried out faithfully. They come in two variants: one without quantum communication, which requires a large overhead on the server side to coherently implement quantum-resistant cryptographic primitives, and one with quantum communication, whose only overhead on the service provider's side is repetition. Given the limited number of qubits available on current machines, only quantum-communication-based protocols have yielded proofs of concept. In this work, we show that the repetition overhead of protocols with quantum communication can be further mitigated if one examines the task of operating a quantum machine from the service provider's point of view. Specifically, we show that the test-round data, whose collection is necessary to provide security, can be recycled to continuously monitor noise model parameters for the service provider. This exemplifies the versatility of these protocols, whose template can serve multiple purposes, and strengthens the case for integrating them early into the development roadmaps of quantum machines.

HEP-EX | Oct 8, 2025
Locality-Sensitive Hashing-Based Efficient Point Transformer for Charged Particle Reconstruction

Shitij Govil, Jack P. Rodgers, Yuan-Tang Chou et al.

Charged particle track reconstruction is a foundational task in collider experiments and the main computational bottleneck in particle reconstruction. Graph neural networks (GNNs) have shown strong performance for this problem, but costly graph construction, irregular computations, and random memory access patterns substantially limit their throughput. The recently proposed Hashing-based Efficient Point Transformer (HEPT) offers theoretically guaranteed near-linear complexity for large point cloud processing via locality-sensitive hashing (LSH) in attention computations; however, its evaluations have largely focused on embedding quality, and the object condensation pipeline on which HEPT relies requires a post-hoc clustering step (e.g., DBSCAN) that can dominate runtime. In this work, we make two contributions. First, we present a unified, fair evaluation of physics tracking performance for HEPT and a representative GNN-based pipeline under the same dataset and metrics. Second, we introduce HEPTv2 by extending HEPT with a lightweight decoder that eliminates the clustering stage and directly predicts track assignments. This modification preserves HEPT's regular, hardware-friendly computations while enabling ultra-fast end-to-end inference. On the TrackML dataset, optimized HEPTv2 achieves approximately 28 ms per event on an A100 while maintaining competitive tracking efficiency. These results position HEPTv2 as a practical, scalable alternative to GNN-based pipelines for fast tracking.
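
To make the LSH idea mentioned in the abstract concrete, the following is a minimal sketch of bucketing points by a locality-sensitive hash so that later work (such as attention) could be restricted to points sharing a bucket. It uses random-hyperplane (sign) hashing as a stand-in; this is an assumption chosen for brevity and is not claimed to be the hashing scheme HEPT or HEPTv2 uses. All function names are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw `num_bits` random hyperplane normals in `dim` dimensions (illustrative choice)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_point(point, hyperplanes):
    """Return one bit per hyperplane: 1 if the point lies on its non-negative side."""
    return tuple(
        int(sum(w * x for w, x in zip(plane, point)) >= 0.0)
        for plane in hyperplanes
    )


def bucket_points(points, hyperplanes):
    """Group point indices by hash code; nearby directions tend to share a bucket."""
    buckets = defaultdict(list)
    for index, point in enumerate(points):
        buckets[hash_point(point, hyperplanes)].append(index)
    return dict(buckets)


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=4, dim=3)
    hits = [(1.0, 2.0, 3.0), (2.0, 4.0, 6.0), (-3.0, 0.5, 1.0)]
    for code, members in bucket_points(hits, planes).items():
        print(code, members)
```

Hashing costs one pass over the points, and any pairwise computation is then confined to each bucket, which is where the near-linear behaviour of LSH-based methods comes from when buckets stay small.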

IR | Apr 12, 2018
On Using Non-Volatile Memory in Apache Lucene

Ramdoot Pydipaty, Amit Saha

Apache Lucene is a widely used information retrieval library that provides search functionality in an extremely wide variety of applications. Naturally, it has to efficiently index and search a large number of documents. With non-volatile memory in DIMM form factor (NVDIMM), software now has access to durable, byte-addressable memory with write latency within an order of magnitude of DRAM write latency. In this preliminary article, we present the first reported work on the impact of using NVDIMM on the performance of committing, searching, and near-real-time searching in Apache Lucene. We show modest improvements from using NVM, but our empirical study suggests that a bigger impact requires redesigning Lucene to access NVM as byte-addressable memory using loads and stores, instead of accessing NVM via the file system.
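
The contrast drawn at the end of the abstract, file-system access versus load/store access, can be illustrated with a rough sketch. It uses Python's standard `mmap` module on an ordinary temporary file as a stand-in for an NVDIMM-backed region; that substitution, the file name, and the helper names are assumptions for illustration only and do not reflect Lucene's code or the authors' setup.

```python
import mmap
import os
import tempfile


def write_via_file_api(path, offset, data):
    """File-system path: seek, write, then force the data to stable storage."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def write_via_mapping(path, offset, data):
    """Byte-addressable path: map the file and store directly into the mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as region:
            region[offset:offset + len(data)] = data  # plain memory store
            region.flush()  # ask the OS to persist the mapped pages


def demo():
    """Write with both methods into one scratch file and return the first 8 bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "segment.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"\x00" * 4096)
        write_via_file_api(path, 0, b"term")
        write_via_mapping(path, 4, b"docs")
        with open(path, "rb") as f:
            return f.read(8)


if __name__ == "__main__":
    print(demo())
```

Both helpers end with the same bytes on disk; the difference is that the mapped version updates data through ordinary memory stores rather than through read and write calls.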

CL | Jan 10, 2012
Recognizing Bangla Grammar using Predictive Parser

K. M. Azharul Hasan, Al-Mahmud, Amit Mondal et al.

We describe a context-free grammar (CFG) for the Bangla language and propose a Bangla parser based on that grammar. Our approach is general enough to apply to Bangla sentences, and the method is well established for parsing the language of a grammar. The proposed parser is a predictive parser, and we construct the parse table for recognizing Bangla grammar. Using the parse table, we detect syntactic mistakes in Bangla sentences when there is no entry for a terminal in the parse table. If a natural language can be parsed successfully, then grammar checking for that language becomes possible. The proposed scheme is based on the top-down parsing method, and we have avoided the left recursion of the CFG using the idea of left factoring.
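
The table-driven recognition described above can be sketched as follows. The grammar here is a tiny hypothetical toy over part-of-speech tags (`PRON`, `NOUN`, `VERB`), not the paper's Bangla CFG, and the parse table was filled in by hand for this toy; the sketch only illustrates how a missing table entry signals a syntax error.

```python
END = "$"
NONTERMINALS = {"S", "NP", "VP"}

# Hand-built LL(1) parse table for a toy grammar (an assumption for illustration):
#   S  -> NP VP
#   NP -> PRON | NOUN
#   VP -> NP VERB | VERB
TABLE = {
    ("S", "PRON"): ["NP", "VP"],
    ("S", "NOUN"): ["NP", "VP"],
    ("NP", "PRON"): ["PRON"],
    ("NP", "NOUN"): ["NOUN"],
    ("VP", "PRON"): ["NP", "VERB"],
    ("VP", "NOUN"): ["NP", "VERB"],
    ("VP", "VERB"): ["VERB"],
}


def recognize(tokens):
    """Return True if the tag sequence is derivable from S, using the parse table."""
    stack = [END, "S"]
    stream = list(tokens) + [END]
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = stream[pos]
        if top in NONTERMINALS:
            production = TABLE.get((top, lookahead))
            if production is None:
                return False  # empty table entry: syntactic mistake
            stack.extend(reversed(production))  # leftmost symbol ends up on top
        elif top == lookahead:
            pos += 1  # terminal (or end marker) matched
        else:
            return False  # terminal mismatch
    return pos == len(stream)


if __name__ == "__main__":
    print(recognize(["PRON", "NOUN", "VERB"]))
    print(recognize(["VERB", "PRON"]))
```

Because each table cell holds at most one production, the parser never backtracks; a real grammar would need its table derived from FIRST and FOLLOW sets rather than written by hand.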