- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 29
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 14
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23
Collections
Collections including paper arxiv:2511.20256
- lightx2v/Hy1.5-Distill-Models
  Text-to-Video • Updated • 795 • 28
- Plan-X: Instruct Video Generation via Semantic Planning
  Paper • 2511.17986 • Published • 17
- In-Video Instructions: Visual Signals as Generative Control
  Paper • 2511.19401 • Published • 31
- The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation
  Paper • 2511.20256 • Published • 27
- VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control
  Paper • 2412.20800 • Published • 11
- Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models
  Paper • 2501.06751 • Published • 32
- Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
  Paper • 2501.09732 • Published • 71
- Learnings from Scaling Visual Tokenizers for Reconstruction and Generation
  Paper • 2501.09755 • Published • 35
- MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
  Paper • 2506.22434 • Published • 10
- VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
  Paper • 2507.13348 • Published • 77
- RewardDance: Reward Scaling in Visual Generation
  Paper • 2509.08826 • Published • 73
- Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
  Paper • 2510.18876 • Published • 36