AtlasVA: Self-Evolving Visual Skill Memory for Teacher-Free VLM Agents Paper • 2605.17933 • Published 2 days ago • 5
StableVLA: Towards Robust Vision-Language-Action Models without Extra Data Paper • 2605.18287 • Published 2 days ago • 13
Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis Paper • 2605.18451 • Published 2 days ago • 35
MMSkills: Towards Multimodal Skills for General Visual Agents Paper • 2605.13527 • Published 6 days ago • 110
SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer Paper • 2605.15178 • Published 6 days ago • 75
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation Paper • 2605.10912 • Published 9 days ago • 45
TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking Paper • 2605.12587 • Published 8 days ago • 36
PanoWorld: Towards Spatial Supersensing in 360° Panorama World Paper • 2605.13169 • Published 7 days ago • 20
VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Prediction Paper • 2605.15186 • Published 6 days ago • 25
MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory Paper • 2605.15128 • Published 6 days ago • 60
Nexus: An Agentic Framework for Time Series Forecasting Paper • 2605.14389 • Published 6 days ago • 3
IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation Paper • 2605.14712 • Published 6 days ago • 16
LiVeAction: a Lightweight, Versatile, and Asymmetric Neural Codec Design for Real-time Operation Paper • 2605.06628 • Published 13 days ago • 6
RoboMemArena: A Comprehensive and Challenging Robotic Memory Benchmark Paper • 2605.10921 • Published 9 days ago • 4
FlashEvolve: Accelerating Agent Self-Evolution with Asynchronous Stage Orchestration Paper • 2605.08520 • Published 12 days ago • 6
SlimQwen: Exploring the Pruning and Distillation in Large MoE Model Pre-training Paper • 2605.08738 • Published 11 days ago • 13