CL · Jun 2
SaliMory: Orchestrating Cognitive Memory for Conversational Agents
Kai Zhang, Xinyuan Zhang, Hongda Jiang et al.
Conversational agents that serve as lifelong companions must maintain persistent memory across all interactions. However, simply expanding context windows with raw retrieval degrades reasoning quality, while training memory agents via standard reinforcement learning creates a severe credit assignment bottleneck in a multi-stage pipeline. To address this, we introduce SaliMory, a framework that trains a single language model to manage a cognitively structured memory spanning user facts, preferences, and working memory. By introducing a hierarchical stage-wise process reward and reward-decomposed contrastive refinement, SaliMory provides isolated supervision for distinct memory operations (selective filtering, consolidation, and cue-driven recall) while still training end-to-end. SaliMory cuts memory-attributed failures by one-third, outperforms the state of the art by over 10% in end-to-end accuracy, and more than doubles the Good Personalization rate.
CL · Oct 12, 2025
AssoMem: Scalable Memory QA with Multi-Signal Associative Retrieval
Kai Zhang, Xinyuan Zhang, Ejaz Ahmed et al. · amazon-science
Accurate recall from large-scale memories remains a core challenge for memory-augmented AI assistants performing question answering (QA), especially in similarity-dense scenarios where existing methods mainly rely on semantic distance to the query for retrieval. Inspired by how humans link information associatively, we propose AssoMem, a novel framework that constructs an associative memory graph anchoring dialogue utterances to automatically extracted clues. This structure provides a rich organizational view of the conversational context and facilitates importance-aware ranking. Further, AssoMem integrates multi-dimensional retrieval signals (relevance, importance, and temporal alignment) using an adaptive mutual-information (MI) driven fusion strategy. Extensive experiments across three benchmarks and a newly introduced dataset, MeetingQA, demonstrate that AssoMem consistently outperforms SOTA baselines, verifying its superiority in context-aware memory recall.
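As a rough illustration of what MI-driven fusion of several retrieval signals could look like, the sketch below weights each signal by its mutual information with relevance labels on a small labeled set, then ranks candidates by the weighted sum. This is not AssoMem's actual algorithm, which the abstract does not specify: the binarization threshold, the normalization into weights, the function names, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def mutual_information(xs: Sequence, ys: Sequence) -> float:
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def mi_weights(signals: Dict[str, List[float]], labels: List[int],
               threshold: float = 0.5) -> Dict[str, float]:
    """Weight each signal by its MI with the labels (assumed scheme, not the paper's)."""
    raw = {
        name: mutual_information([v >= threshold for v in values], labels)
        for name, values in signals.items()
    }
    total = sum(raw.values())
    if total == 0.0:  # no signal is informative: fall back to uniform weights
        return {name: 1.0 / len(raw) for name in raw}
    return {name: value / total for name, value in raw.items()}


def fuse(candidate: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of a candidate's per-signal scores."""
    return sum(weights[name] * candidate[name] for name in weights)


# Toy labeled set: one score per signal for four utterances, plus relevance labels.
dev_signals = {
    "relevance": [0.9, 0.8, 0.2, 0.1],
    "importance": [0.9, 0.2, 0.8, 0.1],
    "temporal": [0.9, 0.8, 0.7, 0.1],
}
dev_labels = [1, 1, 0, 0]
weights = mi_weights(dev_signals, dev_labels)

# Rank two hypothetical candidates with the learned weights.
candidates = {
    "utterance_a": {"relevance": 0.9, "importance": 0.1, "temporal": 0.6},
    "utterance_b": {"relevance": 0.3, "importance": 0.9, "temporal": 0.4},
}
ranking = sorted(candidates, key=lambda k: fuse(candidates[k], weights), reverse=True)
```

In this toy set the importance signal is statistically independent of the labels, so it receives zero weight, and the ranking is driven by relevance and temporal alignment.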
AI · Sep 22, 2025
Memory-QA: Answering Recall Questions Based on Multimodal Memories
Hongda Jiang, Xinyuan Zhang, Siddhant Garg et al. · amazon-science
We introduce Memory-QA, a novel real-world task that involves answering recall questions about visual content from previously stored multimodal memories. This task poses unique challenges, including the creation of task-oriented memories, the effective utilization of temporal and location information within memories, and the ability to draw upon multiple memories to answer a recall question. To address these challenges, we propose a comprehensive pipeline, Pensieve, integrating memory-specific augmentation, time- and location-aware multi-signal retrieval, and multi-memory QA fine-tuning. We created a multimodal benchmark to illustrate various real challenges in this task, and show that Pensieve outperforms state-of-the-art solutions (by up to 14% in QA accuracy).
CV · Nov 2, 2018
What evidence does deep learning model use to classify Skin Lesions?
Xiaoxiao Li, Junyan Wu, Eric Z. Chen et al.
Melanoma is a type of skin cancer with the most rapidly increasing incidence. Early detection of melanoma using dermoscopy images significantly increases patients' survival rate. However, accurately classifying skin lesions by eye, especially in the early stage of melanoma, is extremely challenging for dermatologists. Hence, the discovery of reliable biomarkers will be meaningful for melanoma diagnosis. In recent years, the value of deep-learning-empowered computer-assisted diagnosis has been shown in biomedical-imaging-based decision making. However, much research focuses on improving disease detection accuracy rather than exploring the evidence of pathology. In this paper, we propose a method to interpret deep learning classification findings. First, we propose an accurate neural network architecture to classify skin lesions. Second, we use a prediction difference analysis method that examines each patch of the image through patch-wise corruption to detect the biomarkers. Last, we validate that our biomarker findings correspond to the patterns reported in the literature. The findings can be significant and useful for guiding clinical diagnosis.
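The patch-wise corruption idea can be sketched in a few lines: corrupt one patch at a time, re-run the classifier, and record how much the predicted score drops. The version below is a simplified occlusion-style variant that overwrites each patch with a constant value; full prediction difference analysis typically marginalizes over sampled patch values instead. The `toy_classifier`, the patch size, and the fill value are hypothetical stand-ins, not the paper's network or settings.

```python
from typing import Callable, List

Image = List[List[float]]


def patch_prediction_difference(image: Image,
                                predict: Callable[[Image], float],
                                patch: int = 2,
                                fill: float = 0.0) -> List[List[float]]:
    """Heat map of score drops when each patch is overwritten with `fill`."""
    height, width = len(image), len(image[0])
    baseline = predict(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            corrupted = [row[:] for row in image]  # copy so the input is untouched
            for r in rows:
                for c in cols:
                    corrupted[r][c] = fill
            drop = baseline - predict(corrupted)  # large drop = influential patch
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


def toy_classifier(image: Image) -> float:
    """Hypothetical stand-in for a trained model: mean of the top-left 2x2 block."""
    return (image[0][0] + image[0][1] + image[1][0] + image[1][1]) / 4.0


image = [
    [1.0, 1.0, 0.2, 0.2],
    [1.0, 1.0, 0.2, 0.2],
    [0.2, 0.2, 0.2, 0.2],
    [0.2, 0.2, 0.2, 0.2],
]
heat = patch_prediction_difference(image, toy_classifier)
```

Because the toy classifier only looks at the top-left block, that patch is the only one with a nonzero score drop; with a real lesion classifier, the high-drop patches would be the candidate evidence regions to compare against known dermoscopic patterns.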