CLJun 24, 2022
Using BERT Embeddings to Model Word Importance in Conversational Transcripts for Deaf and Hard of Hearing UsersAkhter Al Amin, Saad Hassan, Cecilia O. Alm et al.
Deaf and hard of hearing individuals regularly rely on captioning while watching live TV. Live TV captioning is evaluated by regulatory agencies using various caption evaluation metrics. However, caption evaluation metrics are often not informed by preferences of DHH users or how meaningful the captions are. There is a need to construct caption evaluation metrics that take the relative importance of words in a transcript into account. We conducted correlation analysis between two types of word embeddings and human-annotated labeled word-importance scores in existing corpus. We found that normalized contextualized word embeddings generated using BERT correlated better with manually annotated importance scores than word2vec-based word embeddings. We make available a pairing of word embeddings and their human-annotated importance scores. We also provide proof-of-concept utility by training word importance models, achieving an F1-score of 0.57 in the 6-class word importance classification task.
LGNov 3, 2023
LLMs-augmented Contextual BanditAli Baheri, Cecilia O. Alm
Contextual bandits have emerged as a cornerstone in reinforcement learning, enabling systems to make decisions with partial feedback. However, as contexts grow in complexity, traditional bandit algorithms can face challenges in adequately capturing and utilizing such contexts. In this paper, we propose a novel integration of large language models (LLMs) with the contextual bandit framework. By leveraging LLMs as an encoder, we enrich the representation of the context, providing the bandit with a denser and more informative view. Preliminary results on synthetic datasets demonstrate the potential of this approach, showing notable improvements in cumulative rewards and reductions in regret compared to traditional bandit algorithms. This integration not only showcases the capabilities of LLMs in reinforcement learning but also opens the door to a new era of contextually-aware decision systems.
AIMar 10, 2025
Hierarchical Neuro-Symbolic Decision TransformerAli Baheri, Cecilia O. Alm
We present a hierarchical neuro-symbolic control framework that tightly couples a classical symbolic planner with a transformer-based policy to address long-horizon decision-making under uncertainty. At the high level, the planner assembles an interpretable sequence of operators that guarantees logical coherence with task constraints, while at the low level each operator is rendered as a sub-goal token that conditions a decision transformer to generate fine-grained actions directly from raw observations. This bidirectional interface preserves the combinatorial efficiency and explainability of symbolic reasoning without sacrificing the adaptability of deep sequence models, and it permits a principled analysis that tracks how approximation errors from both planning and execution accumulate across the hierarchy. Empirical studies in stochastic grid-world domains demonstrate that the proposed method consistently surpasses purely symbolic, purely neural and existing hierarchical baselines in both success and efficiency, highlighting its robustness for sequential tasks.
CLMar 28, 2019
Modeling Acoustic-Prosodic Cues for Word Importance Prediction in Spoken DialoguesSushant Kafle, Cecilia O. Alm, Matt Huenerfauth
Prosodic cues in conversational speech aid listeners in discerning a message. We investigate whether acoustic cues in spoken dialogue can be used to identify the importance of individual words to the meaning of a conversation turn. Individuals who are Deaf and Hard of Hearing often rely on real-time captions in live meetings. Word error rate, a traditional metric for evaluating automatic speech recognition, fails to capture that some words are more important for a system to transcribe correctly than others. We present and evaluate neural architectures that use acoustic features for 3-class word importance prediction. Our model performs competitively against state-of-the-art text-based word-importance prediction models, and it demonstrates particular benefits when operating on imperfect ASR output.