Qwen/Qwen1.5-MoE-A2.7B

Text Generation
Transformers
Safetensors
English
qwen2_moe
pretrained
Mixture of Experts
conversational
Community (9)

Update README.md

#9 opened 3 months ago by cherry0328

Adaptive-K Routing: 32% compute savings for Qwen-MoE

#8 opened 3 months ago by Gabrobals (1 comment)

What is the evaluation setting to get the benchmark result like GSM8K?

#7 opened about 1 year ago by ljb121002 (2 comments)