Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models Paper • 2605.07721 • Published 12 days ago • 29
Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck Paper • 2603.08462 • Published Mar 9 • 22
Efficient Training-Free Multi-Token Prediction via Embedding-Space Probing Paper • 2603.17942 • Published Mar 18 • 8