Article How I contributed a new model to the Transformers library using Codex nielsr • Mar 30 • 52
🤗 SmolLM2 Automatic Essay Grading Collection Automatic Essay Grading - SmolLM2 • 15 items • Updated Jun 9, 2025 • 1
🪅 Qwen2.5 Automatic Essay Grading Collection Automatic Essay Grading - Qwen2.5 • 15 items • Updated Jun 9, 2025 • 1
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published Jan 22, 2025 • 130
Article Efficient LLM Pretraining: Packed Sequences and Masked Attention sirluk • Oct 7, 2024 • 71
Article Mixture of Experts Explained osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge NormalUhr • Feb 7, 2025 • 293
Multilingual LLM Evaluation Collection Multilingual Evaluation Benchmarks • 8 items • Updated Jul 31, 2025 • 34
Article Open-R1: a fully open reproduction of DeepSeek-R1 eliebak, lvwerra, lewtun • Jan 28, 2025 • 889
Article Towards a Fully Arabic Retrieval-Augmented Generation (RAG) Pipeline: Omartificial-Intelligence-Space • Nov 30, 2024 • 28
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding Paper • 2404.05726 • Published Apr 8, 2024 • 23
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 254
Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models Paper • 2312.17661 • Published Dec 29, 2023 • 15
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 264
Distributed Representations of Words and Phrases and their Compositionality Paper • 1310.4546 • Published Oct 16, 2013 • 3
Efficient Estimation of Word Representations in Vector Space Paper • 1301.3781 • Published Jan 16, 2013 • 8
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning Paper • 2012.13255 • Published Dec 22, 2020 • 5