sherry

rain305

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 6 days ago

Kwai Keye-VL-2.0 Technical Report

Paper • 2606.10651 • Published 11 days ago • 187

upvoted 4 papers 9 days ago

Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation

Paper • 2605.18740 • Published May 18 • 5

Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts

Paper • 2606.05922 • Published 15 days ago • 52

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Paper • 2606.11025 • Published 11 days ago • 41

SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

Paper • 2606.10804 • Published 11 days ago • 43

upvoted 2 papers 2 months ago

RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time

Paper • 2604.11626 • Published Apr 13 • 102

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 111

upvoted 2 papers 3 months ago

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Paper • 2603.25746 • Published Mar 26 • 155

Evaluating and Steering Modality Preferences in Multimodal Large Language Model

Paper • 2505.20977 • Published May 27, 2025 • 10

upvoted an article 3 months ago

NEO-unify: Building Native Multimodal Unified Models End to End

Article • sensenova • Mar 5 • 165

upvoted 6 papers 3 months ago

Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 198

CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

Paper • 2603.08652 • Published Mar 9 • 41

Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning

Paper • 2505.03318 • Published May 6, 2025 • 94

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7, 2025 • 124

Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders

Paper • 2601.10332 • Published Jan 15 • 32

Enhancing Spatial Understanding in Image Generation via Reward Modeling

Paper • 2602.24233 • Published Feb 27 • 60

upvoted 4 papers 4 months ago

UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?

Paper • 2603.03241 • Published Mar 3 • 87

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published Mar 3 • 106

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Paper • 2602.12279 • Published Feb 12 • 20

CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation

Paper • 2601.10061 • Published Jan 15 • 32