Lost in the Noise: How Reasoning Models Fail with Contextual Distractors Paper • 2601.07226 • Published 27 days ago • 32
Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning Paper • 2601.03872 • Published Jan 7 • 42
OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment Paper • 2601.01576 • Published Jan 4 • 18
COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs Paper • 2601.01836 • Published Jan 5 • 10
OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs Paper • 2601.01592 • Published Jan 4 • 12
Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models Paper • 2601.01321 • Published Jan 4 • 18
The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving Paper • 2601.00747 • Published Jan 2 • 20
SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving Paper • 2601.01426 • Published Jan 4 • 22
MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning Paper • 2512.23412 • Published Dec 29, 2025 • 40
SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning Paper • 2512.24330 • Published Dec 30, 2025 • 35
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization Paper • 2601.01554 • Published Jan 4 • 57
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization Paper • 2512.24615 • Published Dec 31, 2025 • 119