Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions • Paper • 2506.00421 • Published May 31, 2025
Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation • Paper • 2504.17207 • Published Apr 24, 2025
Response Tuning: Aligning Large Language Models without Instruction • Paper • 2410.02465 • Published Oct 3, 2024