Zhongang Cai

caizhongang

http://caizhongang.com/

AI & ML interests

Multimodal, Spatial Intelligence, Embodied AI, Virtual Humans.

Recent Activity

updated a dataset 10 days ago

sensenova/MessyTable-SI

upvoted a paper 16 days ago

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

liked a dataset 17 days ago

sensenova/SenseNova-SI-800K

upvoted a paper 16 days ago

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Paper • 2512.19693 • Published 17 days ago • 62

upvoted a paper 23 days ago

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

Paper • 2512.13604 • Published 24 days ago • 73

upvoted a collection 30 days ago

SenseNova-SI

Collection

Scaling Spatial Intelligence with Multimodal Foundation Models • 10 items • Updated about 2 hours ago • 14

upvoted 4 papers about 2 months ago

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 92

upvoted 3 papers 2 months ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5, 2025 • 128

Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals

Paper • 2510.27684 • Published Oct 31, 2025 • 22

The Quest for Generalizable Motion Generation: Data, Model, and Evaluation

Paper • 2510.26794 • Published Oct 30, 2025 • 26

upvoted 3 papers 3 months ago

DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior

Paper • 2508.00599 • Published Aug 1, 2025 • 7

VChain: Chain-of-Visual-Thought for Reasoning in Video Generation

Paper • 2510.05094 • Published Oct 6, 2025 • 37

Visual Jigsaw Post-Training Improves MLLMs

Paper • 2509.25190 • Published Sep 29, 2025 • 36

upvoted 3 papers 5 months ago

EgoTwin: Dreaming Body and View in First Person

Paper • 2508.13013 • Published Aug 18, 2025 • 21

Has GPT-5 Achieved Spatial Intelligence? An Empirical Study

Paper • 2508.13142 • Published Aug 18, 2025 • 34

4DNeX: Feed-Forward 4D Generative Modeling Made Easy

Paper • 2508.13154 • Published Aug 18, 2025 • 62

upvoted a collection 8 months ago

EgoLife

Collection

CVPR 2025 - EgoLife: Towards Egocentric Life Assistant. Homepage: https://egolife-ai.github.io/ • 10 items • Updated Mar 7, 2025 • 20

upvoted a paper 8 months ago

EgoLife: Towards Egocentric Life Assistant

Paper • 2503.03803 • Published Mar 5, 2025 • 46

upvoted a paper about 1 year ago

SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters

Paper • 2412.00174 • Published Nov 29, 2024 • 23

upvoted a paper over 1 year ago

Disco4D: Disentangled 4D Human Generation and Animation from a Single Image

Paper • 2409.17280 • Published Sep 25, 2024 • 10
