- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 376
- Qwen2.5-Coder Technical Report
  Paper • 2409.12186 • Published • 152
- Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
  Paper • 2409.12122 • Published • 4
- Qwen2.5-VL Technical Report
  Paper • 2502.13923 • Published • 212
Collections
Discover the best community collections!
Collections including paper arxiv:2307.09288
- Self-Play Preference Optimization for Language Model Alignment
  Paper • 2405.00675 • Published • 28
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
  Paper • 2205.14135 • Published • 15
- Attention Is All You Need
  Paper • 1706.03762 • Published • 108
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 9
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 627
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 259
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 248
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 260
- stabilityai/stable-diffusion-3-medium
  Text-to-Image • Updated • 6.98k • 4.89k
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 248
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 259
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 260