Rethinking State Tracking in Recurrent Models Through Error Control Dynamics Paper • 2605.07755 • Published 15 days ago • 23
Article Adding Benchmaxxer Repellant to the Open ASR Leaderboard bezzam, Steveeeeeeen, eustlb, SBruccoleriAppen, jmss-appen, c-e-ford-appen, wgb14, YukaiHuang, like2026, logicbean, ally-lxl • 17 days ago • 16
Investigating Efficiently Extending Transformers for Long Input Summarization Paper • 2208.04347 • Published Aug 8, 2022 • 1
Article EMO: Pretraining mixture of experts for emergent modularity allenai • 14 days ago • 37
Article Multimodal Embedding & Reranker Models with Sentence Transformers tomaarsen • Apr 9 • 59
OlmPool Collection Collection of models from the paper "Cracks in the Foundation: Seemingly Minor Architectural Choices Impact Long Context Extension". • 26 items • Updated 22 days ago • 5
Efficient Training on Multiple Consumer GPUs with RoundPipe Paper • 2604.27085 • Published 24 days ago • 40
Why Fine-Tuning Encourages Hallucinations and How to Fix It Paper • 2604.15574 • Published Apr 16 • 23
Olmo 3.1 Collection The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated Dec 23, 2025 • 52
Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora Paper • 2604.24819 • Published 26 days ago • 88
Laguna XS.2 Collection Designed for agentic coding and long-horizon work on a local machine. Apache 2.0. • 5 items • Updated 15 days ago • 21
Parakeet ASR Collection NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 16 items • Updated 3 days ago • 74
BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation Paper • 2604.09497 • Published Apr 10 • 29