Article Did GPT 5.2 make a breakthrough discovery in theoretical physics? 7 days ago • 56
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 7 days ago • 465
Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL Paper • 2602.03773 • Published 23 days ago • 10
Article Scaling OpenEnv: From Free Usage to Thousands of Concurrent Environments Jan 20 • 11
Article Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face Feb 11, 2025 • 105
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 22 days ago • 35
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR Paper • 2602.05261 • Published 22 days ago • 49
Article Preference Tuning LLMs with Direct Preference Optimization Methods +3 Jan 18, 2024 • 79
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 302
Article Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models Dec 15, 2025 • 109
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 228