Data and models for the paper "How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models"
Kristian Schwethelm
KristianS7
AI & ML interests
Large Language Models
Recent Activity
liked a model about 15 hours ago: KristianS7/Ouro-1.4B
updated a model about 15 hours ago: KristianS7/Ouro-1.4B
upvoted an article about 16 hours ago: Mixture of Experts (MoEs) in Transformers