-
MagicWorld: Interactive Geometry-driven Video World Exploration
Paper • 2511.18886 • Published • 19 -
EvoVLA: Self-Evolving Vision-Language-Action Model
Paper • 2511.16166 • Published • 6 -
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots
Paper • 2511.17889 • Published • 5 -
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
Paper • 2601.00393 • Published • 133
Collections
Discover the best community collections!
Collections including paper arxiv:2511.17889
-
Unified Vision-Language-Action Model
Paper • 2506.19850 • Published • 28 -
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper • 2506.01844 • Published • 157 -
3D-VLA: A 3D Vision-Language-Action Generative World Model
Paper • 2403.09631 • Published • 12 -
QUAR-VLA: Vision-Language-Action Model for Quadruped Robots
Paper • 2312.14457 • Published • 1
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 60 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 53 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 45 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 64
-
Gemini Robotics: Bringing AI into the Physical World
Paper • 2503.20020 • Published • 31 -
Magma: A Foundation Model for Multimodal AI Agents
Paper • 2502.13130 • Published • 58 -
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 51 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 49
-
OpenVLA: An Open-Source Vision-Language-Action Model
Paper • 2406.09246 • Published • 46 -
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Paper • 2411.19650 • Published -
Octo: An Open-Source Generalist Robot Policy
Paper • 2405.12213 • Published • 29 -
Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression
Paper • 2412.03293 • Published
-
MagicWorld: Interactive Geometry-driven Video World Exploration
Paper • 2511.18886 • Published • 19 -
EvoVLA: Self-Evolving Vision-Language-Action Model
Paper • 2511.16166 • Published • 6 -
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots
Paper • 2511.17889 • Published • 5 -
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
Paper • 2601.00393 • Published • 133
-
Unified Vision-Language-Action Model
Paper • 2506.19850 • Published • 28 -
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper • 2506.01844 • Published • 157 -
3D-VLA: A 3D Vision-Language-Action Generative World Model
Paper • 2403.09631 • Published • 12 -
QUAR-VLA: Vision-Language-Action Model for Quadruped Robots
Paper • 2312.14457 • Published • 1
-
Gemini Robotics: Bringing AI into the Physical World
Paper • 2503.20020 • Published • 31 -
Magma: A Foundation Model for Multimodal AI Agents
Paper • 2502.13130 • Published • 58 -
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 51 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 49
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 60 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 53 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 45 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 64
-
OpenVLA: An Open-Source Vision-Language-Action Model
Paper • 2406.09246 • Published • 46 -
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Paper • 2411.19650 • Published -
Octo: An Open-Source Generalist Robot Policy
Paper • 2405.12213 • Published • 29 -
Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression
Paper • 2412.03293 • Published