Harold Chen

Harold328

https://haroldchen19.github.io/

HaroldChen19

AI & ML interests

Computer Vision

Recent Activity

upvoted a paper 1 day ago

MosaicMem: Hybrid Spatial Memory for Controllable Video World Models

upvoted a paper 1 day ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

upvoted a paper 1 day ago

FASTER: Rethinking Real-Time Flow VLAs

Organizations

None yet

upvoted 4 papers 1 day ago

MosaicMem: Hybrid Spatial Memory for Controllable Video World Models

Paper • 2603.17117 • Published 4 days ago • 82

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published 4 days ago • 115

FASTER: Rethinking Real-Time Flow VLAs

Paper • 2603.19199 • Published 2 days ago • 43

ESPIRE: A Diagnostic Benchmark for Embodied Spatial Reasoning of Vision-Language Models

Paper • 2603.13033 • Published 8 days ago • 13

upvoted 2 papers 3 days ago

GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent

Paper • 2603.13875 • Published 7 days ago • 29

Demystifing Video Reasoning

Paper • 2603.16870 • Published 4 days ago • 346

upvoted a paper 4 days ago

Learning Latent Proxies for Controllable Single-Image Relighting

Paper • 2603.15555 • Published 5 days ago • 8

upvoted 3 papers 5 days ago

Panoramic Affordance Prediction

Paper • 2603.15558 • Published 5 days ago • 9

Attention Residuals

Paper • 2603.15031 • Published 5 days ago • 131

From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space

Paper • 2603.12648 • Published 9 days ago • 12

upvoted a paper 9 days ago

DVD: Deterministic Video Depth Estimation with Generative Priors

Paper • 2603.12250 • Published 9 days ago • 26

upvoted 9 papers about 1 month ago

Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models

Paper • 2602.10224 • Published Feb 10 • 19

When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

Paper • 2602.10560 • Published Feb 11 • 30

PhyCritic: Multimodal Critic Models for Physical AI

Paper • 2602.11124 • Published Feb 11 • 53

VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model

Paper • 2602.10098 • Published Feb 10 • 19

BagelVLA: Enhancing Long-Horizon Manipulation via Interleaved Vision-Language-Action Generation

Paper • 2602.09849 • Published Feb 10 • 16

Olaf-World: Orienting Latent Actions for Video World Modeling

Paper • 2602.10104 • Published Feb 10 • 27

Reinforcement World Model Learning for LLM-based Agents

Paper • 2602.05842 • Published Feb 5 • 27

LatentMem: Customizing Latent Memory for Multi-Agent Systems

Paper • 2602.03036 • Published Feb 3 • 14

MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents

Paper • 2602.02474 • Published Feb 2 • 60