Stoney Kang

sikang99

AI & ML interests

Remote Control based on Vision

upvoted 3 papers about 17 hours ago

AtlasVA: Self-Evolving Visual Skill Memory for Teacher-Free VLM Agents

Paper • 2605.17933 • Published 2 days ago • 5

StableVLA: Towards Robust Vision-Language-Action Models without Extra Data

Paper • 2605.18287 • Published 2 days ago • 13

Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis

Paper • 2605.18451 • Published 2 days ago • 35

upvoted 2 papers 1 day ago

Unlocking Dense Metric Depth Estimation in VLMs

Paper • 2605.15876 • Published 5 days ago • 9

MMSkills: Towards Multimodal Skills for General Visual Agents

Paper • 2605.13527 • Published 6 days ago • 110

upvoted a paper 2 days ago

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published 6 days ago • 75

upvoted 2 papers 3 days ago

Orchard: An Open-Source Agentic Modeling Framework

Paper • 2605.15040 • Published 6 days ago • 18

WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

Paper • 2605.10912 • Published 9 days ago • 45

upvoted a paper 4 days ago

TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking

Paper • 2605.12587 • Published 8 days ago • 36

upvoted 6 papers 5 days ago

PanoWorld: Towards Spatial Supersensing in 360° Panorama World

Paper • 2605.13169 • Published 7 days ago • 20

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 6 days ago • 100

VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Prediction

Paper • 2605.15186 • Published 6 days ago • 25

MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

Paper • 2605.15128 • Published 6 days ago • 60

Nexus : An Agentic Framework for Time Series Forecasting

Paper • 2605.14389 • Published 6 days ago • 3

IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation

Paper • 2605.14712 • Published 6 days ago • 16

upvoted a paper 6 days ago

The DAWN of World-Action Interactive Models

Paper • 2605.11550 • Published 8 days ago • 22

upvoted 4 papers 7 days ago

LiVeAction: a Lightweight, Versatile, and Asymmetric Neural Codec Design for Real-time Operation

Paper • 2605.06628 • Published 13 days ago • 6

RoboMemArena: A Comprehensive and Challenging Robotic Memory Benchmark

Paper • 2605.10921 • Published 9 days ago • 4

FlashEvolve: Accelerating Agent Self-Evolution with Asynchronous Stage Orchestration

Paper • 2605.08520 • Published 12 days ago • 6

SlimQwen: Exploring the Pruning and Distillation in Large MoE Model Pre-training

Paper • 2605.08738 • Published 11 days ago • 13