Yongming Qin

7.6CLJun 2

See, Infer, Intervene: Proactive World Modeling for Goal-Oriented Social Intelligence

Honghui Zhang, Chenmeinian Guo, Yichen Yu et al.

Multimodal retail agents should not only recognize what a customer is doing, but also decide whether and how to assist before an explicit request is made. We study this setting through the See--Infer--Intervene (SII) framework, where a device must see pre-interaction behavior, infer latent customer intent, and act by selecting an appropriate service intervention or choosing to wait. We instantiate SII with the Proactive Intent World Model (PIWM), which represents customer state with AIDA (Attention, Interest, Desire, Action) purchasing phases and BDI (belief, desire, intention) psychological fields, predicts action-conditioned intent transitions, and selects from five response classes: Greet, Elicit, Inform, Recommend, and Hold. We further construct GuidanceSalesBench, a smart-retail benchmark containing state manifests, pre-interaction videos, candidate responses, action-conditioned outcomes, and best-action labels. When conditioned on ground-truth customer state to isolate action selection, PIWM achieves 0.641 macro F1 on 30 held-out target videos, outperforming a zero-shot Qwen2.5-VL-7B baseline and training variants without balanced action supervision; end-to-end video-only selection drops to 0.295, below the 5-class balanced random baseline of 0.414, identifying video-to-state grounding as the dominant deployment-time bottleneck. A preliminary staged real-store pilot (recorded with paid participants performing scripted customer behaviors) reaches 0.579 action macro F1 on 20 fully annotated videos, with 10 additional accessible videos released with index-level labels.

SYJul 17, 2018Code

Experimental Resilience Assessment of An Open-Source Driving Agent

Abu Hasnat Mohammad Rubaiyat, Yongming Qin, Homa Alemzadeh

Autonomous vehicles (AV) depend on the sensors like RADAR and camera for the perception of the environment, path planning, and control. With the increasing autonomy and interactions with the complex environment, there have been growing concerns regarding the safety and reliability of AVs. This paper presents a Systems-Theoretic Process Analysis (STPA) based fault injection framework to assess the resilience of an open-source driving agent, called openpilot, under different environmental conditions and faults affecting sensor data. To increase the coverage of unsafe scenarios during testing, we use a strategic software fault-injection approach where the triggers for injecting the faults are derived from the unsafe scenarios identified during the high-level hazard analysis of the system. The experimental results show that the proposed strategic fault injection approach increases the hazard coverage compared to random fault injection and, thus, can help with more effective simulation of safety-critical faults and testing of AVs. In addition, the paper provides insights on the performance of openpilot safety mechanisms and its ability in timely detection and recovery from faulty inputs.

Yongming Qin

2 Papers