Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria Paper • 2605.08354 • Published 5 days ago • 20
BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning Paper • 2603.04918 • Published Mar 5 • 56
GEditBench v2: A Human-Aligned Benchmark for General Image Editing Paper • 2603.28547 • Published Mar 30 • 32
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights Paper • 2512.01816 • Published Dec 1, 2025 • 94