Knowledge Distillation

Models

- shayekh/aya8b-distillkit-hidden
- shayekh/aya8b-distillkit-logits
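The two aya8b-distillkit checkpoints reflect the two standard distillation signals their suffixes suggest: matching the teacher's output logits versus matching its intermediate hidden states. A minimal PyTorch sketch of both losses, assuming a shared tokenizer and a learned projection for mismatched hidden widths (the temperature and projection are illustrative placeholders, not the checkpoints' actual training settings):

import torch.nn.functional as F

def logit_distill_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions and match them with KL divergence; the
    # T^2 factor keeps gradient magnitudes comparable across temperatures.
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

def hidden_distill_loss(student_hidden, teacher_hidden, proj):
    # Match intermediate representations instead of outputs; proj is a
    # learned linear map from the student width to the teacher width.
    return F.mse_loss(proj(student_hidden), teacher_hidden)

Logit matching transfers the teacher's full output distribution; hidden-state matching supervises internal representations, the signal that layer-wise methods such as the task-aware distillation paper below build on.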
Papers

- Less is More: Task-aware Layer-wise Distillation for Language Model Compression (arXiv:2210.01351)
- A Survey on Knowledge Distillation of Large Language Models (arXiv:2402.13116)
- Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling (arXiv:2311.00430)
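Distil-Whisper's key data step is pseudo-labelling: the teacher transcribes a large audio corpus, and only examples whose pseudo-label stays close to the human reference are kept for student training. A hedged sketch of that filter, using the jiwer package for word error rate (the 10% threshold and field names are illustrative):

import jiwer

def filter_pseudo_labels(examples, wer_threshold=0.10):
    # Keep an example only if the teacher's transcript is close enough
    # to the reference; high-WER pseudo-labels are treated as unreliable.
    kept = []
    for ex in examples:
        wer = jiwer.wer(ex["reference_text"], ex["pseudo_label"])
        if wer <= wer_threshold:
            kept.append(ex)
    return kept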
- On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes (arXiv:2306.13649)
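On-policy distillation (GKD) trains the student on sequences it samples itself, with the teacher scoring those same tokens, so the training distribution matches what the student sees at inference. A simplified sketch of one update, assuming Hugging Face-style causal LMs and using reverse KL on the student's samples (the paper's full objective is a generalized Jensen-Shannon divergence with an on-policy/off-policy data mix):

import torch
import torch.nn.functional as F

def on_policy_distill_step(student, teacher, prompt_ids, optimizer):
    # 1) The student generates its own continuations (self-generated data).
    with torch.no_grad():
        seqs = student.generate(prompt_ids, max_new_tokens=64, do_sample=True)

    # 2) Score the same sequences under both models.
    student_logp = F.log_softmax(student(seqs).logits, dim=-1)
    with torch.no_grad():
        teacher_logp = F.log_softmax(teacher(seqs).logits, dim=-1)

    # 3) Reverse KL(student || teacher) on the student's own tokens pushes
    #    it toward outputs the teacher also rates as likely.
    loss = (student_logp.exp() * (student_logp - teacher_logp)).sum(-1).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()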
- Compact Language Models via Pruning and Knowledge Distillation (arXiv:2407.14679)
- LLM Pruning and Distillation in Practice: The Minitron Approach (arXiv:2408.11796)
- DistiLLM: Towards Streamlined Distillation for Large Language Models (arXiv:2402.03898)
- Relational Knowledge Distillation (arXiv:1904.05068)
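Relational knowledge distillation transfers the structure among examples rather than per-example outputs: pairwise distances (or angles) in the teacher's embedding space become regression targets for the student. A sketch of the distance-wise RKD loss for embedding batches of shape (batch, dim):

import torch
import torch.nn.functional as F

def normalized_pairwise_distances(embeddings):
    # All pairwise Euclidean distances in the batch, normalized by their
    # mean so teacher and student scales are directly comparable.
    d = torch.cdist(embeddings, embeddings, p=2)
    return d / d[d > 0].mean()

def rkd_distance_loss(student_emb, teacher_emb):
    # Huber loss between the two normalized distance matrices.
    return F.smooth_l1_loss(normalized_pairwise_distances(student_emb),
                            normalized_pairwise_distances(teacher_emb))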
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes (arXiv:2305.02301)
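Distilling step-by-step prompts the teacher for chain-of-thought rationales, then trains a small seq2seq student with a multi-task objective: predict the label, and separately generate the rationale; only the label task is used at inference. A sketch of the combined loss, assuming a T5-style student with pre-tokenized batches for each task (the weighting and batch field names are placeholders):

def step_by_step_loss(student, label_batch, rationale_batch, rationale_weight=1.0):
    # Task 1: input -> label, the task the deployed student actually serves.
    label_loss = student(input_ids=label_batch["input_ids"],
                         labels=label_batch["labels"]).loss
    # Task 2: input -> teacher rationale, an auxiliary signal that injects
    # the teacher's reasoning without requiring rationales at test time.
    rationale_loss = student(input_ids=rationale_batch["input_ids"],
                             labels=rationale_batch["labels"]).loss
    return label_loss + rationale_weight * rationale_loss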