Qwen/Qwen1.5-MoE-A2.7B

Text Generation

Mixture of Experts

Update README.md

#9 opened 3 months ago by

Adaptive-K Routing: 32% compute savings for Qwen-MoE

#8 opened 3 months ago by

What is the evaluation setting to get the benchmark result like GSM8K?

#7 opened about 1 year ago by