AMD Ryzen AI Max+ 395 Strix Halo
Quantized models with Windows ROCm llama-bench results
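All figures come from llama.cpp's llama-bench tool: pp512 is prompt-processing throughput over a 512-token prompt and tg128 is text-generation throughput over 128 tokens, both in tokens per second, reported as mean ± standard deviation over repeated runs. A minimal sketch of an invocation that produces these two metrics, assuming a ROCm/HIP build of llama.cpp and an illustrative local model path:

```bash
# Benchmark one GGUF file; -p 512 -n 128 match the pp512/tg128 columns
# (they are also llama-bench's defaults) and -ngl 99 offloads all layers
# to the GPU. The model path is illustrative -- substitute any quant
# from the table below.
llama-bench -m ./models/OLMo-2-0425-1B-Instruct-UD-Q6_K_XL.gguf -p 512 -n 128 -ngl 99
```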
| Model | Task | Params | Quant | pp512 (t/s) | tg128 (t/s) |
|---|---|---|---|---|---|
| — | Text Generation | 1B | UD-Q6_K_XL | 4168.69 ± 937.92 | 143.90 ± 7.15 |
| unsloth/OLMo-2-0425-1B-Instruct-GGUF | Text Generation | 1B | UD-Q6_K_XL | 4563.71 ± 55.46 | 135.19 ± 0.38 |
| unsloth/LFM2-8B-A1B-GGUF | Text Generation | 8B | UD-Q6_K_XL | 2352.90 ± 40.45 | 111.49 ± 2.38 |
| ggml-org/SmolVLM2-2.2B-Instruct-GGUF | — | 2B | Q8_0 | 2062.47 ± 316.51 | 97.45 ± 4.51 |
| ggml-org/gpt-oss-20b-GGUF | — | 21B | MXFP4 | 752.08 ± 13.02 | 66.05 ± 3.15 |
| bartowski/Nanbeige_Nanbeige4-3B-Thinking-2511-GGUF | Text Generation | 4B | IQ4_NL | 1442.34 ± 110.00 | 84.12 ± 8.04 |
| bartowski/XiaomiMiMo_MiMo-VL-7B-RL-2508-GGUF | Image-Text-to-Text | 8B | IQ4_NL | 704.39 ± 25.41 | 42.12 ± 5.50 |
| ggml-org/Kimi-VL-A3B-Thinking-2506-GGUF | — | 16B | Q8_0 | 852.78 ± 40.09 | 45.57 ± 0.81 |
| unsloth/Nemotron-3-Nano-30B-A3B-GGUF | Text Generation | 32B | UD-Q6_K_XL | 493.86 ± 2.23 | 40.13 ± 0.47 |
| unsloth/Seed-Coder-8B-Reasoning-GGUF | Text Generation | 8B | UD-Q6_K_XL | 834.67 ± 4.36 | 28.63 ± 0.30 |
| unsloth/rnj-1-instruct-GGUF | — | 8B | UD-Q6_K_XL | 677.64 ± 10.98 | 26.79 ± 0.71 |
| Intel/MiniMax-M2-REAP-172B-A10B-gguf-q2ks-mixed-AutoRound | — | 173B | Q2_K_S | 182.88 ± 3.84 | 26.21 ± 0.44 |
| bartowski/ServiceNow-AI_Apriel-1.6-15b-Thinker-GGUF | Image-Text-to-Text | 14B | IQ4_NL | 338.01 ± 5.23 | 25.25 ± 0.39 |
| unsloth/GLM-4.6V-Flash-GGUF | Image-Text-to-Text | 9B | UD-Q6_K_XL | 639.56 ± 2.57 | 23.76 ± 0.04 |
| unsloth/Apertus-8B-Instruct-2509-GGUF | Text Generation | 8B | UD-Q6_K_XL | 546.45 ± 33.03 | 19.44 ± 0.24 |
| unsloth/Phi-4-reasoning-plus-GGUF | Text Generation | 15B | UD-Q6_K_XL | 417.69 ± 13.85 | 15.35 ± 0.32 |
| unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF | Any-to-Any | 108B | UD-IQ3_XXS | 138.94 ± 2.66 | 15.34 ± 1.97 |
| bartowski/mistralai_Devstral-Small-2-24B-Instruct-2512-GGUF | Text Generation | 24B | IQ4_NL | 179.34 ± 0.99 | 14.57 ± 0.12 |
| unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF | — | 24B | UD-Q6_K_XL | 245.79 ± 5.41 | 9.80 ± 0.03 |
| unsloth/Olmo-3.1-32B-Think-GGUF | — | 32B | UD-Q6_K_XL | 179.13 ± 5.12 | 7.04 ± 0.17 |
| unsloth/Seed-OSS-36B-Instruct-GGUF | Text Generation | 36B | UD-Q6_K_XL | 166.54 ± 4.16 | 6.33 ± 0.02 |
| unsloth/Kimi-Dev-72B-GGUF | — | 73B | UD-IQ3_XXS | 68.09 ± 0.25 | 5.11 ± 0.32 |