Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams Paper • 2603.07392 • Published Mar 8 • 18
MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents Paper • 2603.09827 • Published 29 days ago • 29
AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories Paper • 2602.14941 • Published Feb 16 • 6
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning Paper • 2602.08236 • Published Feb 9 • 9
Reliable and Responsible Foundation Models: A Comprehensive Survey Paper • 2602.08145 • Published Feb 4 • 8
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning Paper • 2507.06485 • Published Jul 9, 2025 • 5
Movie Facts and Fibs (MF$^2$): A Benchmark for Long Movie Understanding Paper • 2506.06275 • Published Jun 6, 2025
MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation Paper • 2506.17113 • Published Jun 20, 2025 • 5
Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning? Paper • 2510.06036 • Published Oct 7, 2025 • 7
Planning with Sketch-Guided Verification for Physics-Aware Video Generation Paper • 2511.17450 • Published Nov 21, 2025 • 4
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning Paper • 2512.02425 • Published Dec 2, 2025 • 25
MedForget: Hierarchy-Aware Multimodal Unlearning Testbed for Medical AI Paper • 2512.09867 • Published Dec 10, 2025
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 57