- Attention Is All You Need
  Paper • 1706.03762 • Published • 108
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 18
- LLaMA: Open and Efficient Foundation Language Models
  Paper • 2302.13971 • Published • 20
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 248
Collections including paper arxiv:2307.09288
- Qwen Technical Report
  Paper • 2309.16609 • Published • 37
- Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
  Paper • 2311.07919 • Published • 10
- Qwen2 Technical Report
  Paper • 2407.10671 • Published • 167
- Qwen2-Audio Technical Report
  Paper • 2407.10759 • Published • 63

- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 144
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 24
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 248
- The Llama 3 Herd of Models
  Paper • 2407.21783 • Published • 117

- Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning
  Paper • 2211.04325 • Published • 1
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 25
- On the Opportunities and Risks of Foundation Models
  Paper • 2108.07258 • Published • 2
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
  Paper • 2204.07705 • Published • 2

- Mistral 7B
  Paper • 2310.06825 • Published • 56
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 248
- OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
  Paper • 2309.11235 • Published • 15
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 433

- black-forest-labs/FLUX.1-dev
  Text-to-Image • Updated • 656k • 12.1k
- openai/whisper-large-v3-turbo
  Automatic Speech Recognition • 0.8B • Updated • 2.99M • 2.75k
- meta-llama/Llama-3.2-11B-Vision-Instruct
  Image-Text-to-Text • 11B • Updated • 104k • 1.55k
- deepseek-ai/DeepSeek-V2.5
  Text Generation • 236B • Updated • 3.99k • 732