Update README.md
#9 opened 3 months ago by cherry0328
Adaptive-K Routing: 32% compute savings for Qwen-MoE
1
#8 opened 3 months ago by Gabrobals
What is the evaluation setting to get the benchmark result like GSM8K?
2
#7 opened about 1 year ago by ljb121002