VideoSSM: Autoregressive Long Video Generation with Hybrid State-Space Memory Paper • 2512.04519 • Published Dec 4, 2025 • 4
Self-Evaluation Unlocks Any-Step Text-to-Image Generation Paper • 2512.22374 • Published 10 days ago • 15
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published 13 days ago • 49
MG-Nav: Dual-Scale Visual Navigation via Sparse Spatial Memory Paper • 2511.22609 • Published Nov 27, 2025 • 48
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments Paper • 2507.10548 • Published Jul 14, 2025 • 36
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation Paper • 2507.08441 • Published Jul 11, 2025 • 61