Surprisal-Guided Selection: Compute-Optimal Test-Time Strategies for Execution-Grounded Code Generation Paper • 2602.07670 • Published 4 days ago • 1
view article Article Where should test-time compute go? Surprisal-guided selection in verifiable environments 4 days ago • 1
Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents Paper • 2601.18217 • Published 16 days ago • 11
OpenSec: Measuring Incident Response Agent Calibration Under Adversarial Evidence Paper • 2601.21083 • Published 14 days ago • 1
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano v3. • 8 items • Updated 7 days ago • 62
view article Article Frontier Security Agents Don't Lack Detection. They Lack Restraint 19 days ago • 2
PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models Paper • 2601.11087 • Published 26 days ago • 11
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking Paper • 2601.06487 • Published Jan 10 • 52
CausalARC: Abstract Reasoning with Causal World Models Paper • 2509.03636 • Published Sep 3, 2025 • 1
Parakeet Collection NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 16 items • Updated 7 days ago • 54
view article Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance Apr 16, 2025 • 61
Reinforcement Learning for Self-Improving Agent with Skill Library Paper • 2512.17102 • Published Dec 18, 2025 • 35
view article Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 119