# Llama 3.3 8B Instruct GGUF

Quantized GGUF versions of shb777/Llama-3.3-8B-Instruct-128K

Includes Unsloth's fixes for the full 128K context length and the chat template.

| Quantization | File | Use Case |
|---|---|---|
| Q8_0 | llama-3.3-8b-instruct-q8_0.gguf | Highest quality, largest size |
| Q6_K | llama-3.3-8b-instruct-q6_k.gguf | Balanced quality/size |
| Q4_K_M | llama-3.3-8b-instruct-q4_k_m.gguf | Smaller size, good quality |
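The table above can drive a simple file picker: take the highest-quality quantization that fits your memory budget. A minimal sketch; the per-file sizes are rough estimates for an 8B model (they are not published in this card), so check the actual file sizes on the repo before relying on them:

```python
# Quantizations from the table, ordered best-quality-first.
# Sizes in GiB are ASSUMED approximations, not values from the card.
QUANTS = [
    ("llama-3.3-8b-instruct-q8_0.gguf", 8.5, "Highest quality, largest size"),
    ("llama-3.3-8b-instruct-q6_k.gguf", 6.6, "Balanced quality/size"),
    ("llama-3.3-8b-instruct-q4_k_m.gguf", 4.9, "Smaller size, good quality"),
]

def pick_quant(budget_gib: float) -> str:
    """Return the highest-quality GGUF file that fits the memory budget."""
    for filename, approx_size_gib, _use_case in QUANTS:
        if approx_size_gib <= budget_gib:
            return filename
    raise ValueError(f"No quantization fits in {budget_gib:.1f} GiB")

print(pick_quant(12.0))  # -> llama-3.3-8b-instruct-q8_0.gguf
print(pick_quant(5.0))   # -> llama-3.3-8b-instruct-q4_k_m.gguf
```

Keep some headroom in the budget: the KV cache grows with context length, so the file size alone understates total memory use.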
Model size: 8B params
Architecture: llama
