Collections

Discover the best community collections!

Collections including paper arxiv:2307.09288

Language Models - Essential Research Papers

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 108
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 18
LLaMA: Open and Efficient Foundation Language Models

Paper • 2302.13971 • Published Feb 27, 2023 • 20
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 248

Milestone Papers in LLM

Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 248

Qwen/Qwen3-8B

Text Generation • 8B • Updated Jul 26, 2025 • 4.16M • 838
Qwen/Qwen3-4B

Text Generation • 4B • Updated Jul 26, 2025 • 3.85M • 512
Qwen/Qwen3-0.6B

Text Generation • 0.8B • Updated Jul 26, 2025 • 8.32M • 939
google/gemma-3-4b-it

Image-Text-to-Text • 4B • Updated Mar 21, 2025 • 772k • 1.07k

Source Papers of LLM Giants

Qwen Technical Report

Paper • 2309.16609 • Published Sep 28, 2023 • 37
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models

Paper • 2311.07919 • Published Nov 14, 2023 • 10
Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15, 2024 • 167
Qwen2-Audio Technical Report

Paper • 2407.10759 • Published Jul 15, 2024 • 63

AnyCoder 🏆

Running • 3.04k • Generate code with AI
Qwen2.5 Coder Artifacts 🐢

Running • Featured • 274 • Generate code from natural language prompts
QwQ-32B-Preview 🔍

Running • Featured • 923 • QwQ-32B-Preview
Open LLM Leaderboard 🏆

Running on CPU Upgrade • 13.8k • Track, rank and evaluate open LLMs and chatbots

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 144
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 248
The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 117

Lost in the Middle: How Language Models Use Long Contexts

Paper • 2307.03172 • Published Jul 6, 2023 • 43
Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 151
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 248

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Paper • 2211.04325 • Published Oct 26, 2022 • 1
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 25
On the Opportunities and Risks of Foundation Models

Paper • 2108.07258 • Published Aug 16, 2021 • 2
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Paper • 2204.07705 • Published Apr 16, 2022 • 2

Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 56
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 248
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data

Paper • 2309.11235 • Published Sep 20, 2023 • 15
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 433

New Tools For Oct 2024

black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27, 2025 • 656k • 12.1k
openai/whisper-large-v3-turbo

Automatic Speech Recognition • 0.8B • Updated Oct 4, 2024 • 2.99M • 2.75k
meta-llama/Llama-3.2-11B-Vision-Instruct

Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 104k • 1.55k
deepseek-ai/DeepSeek-V2.5

Text Generation • 236B • Updated Dec 11, 2024 • 3.99k • 732
