Active filters: 4-bit
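The listing below can also be pulled programmatically with the `huggingface_hub` client. A minimal sketch, assuming the Hub's 4-bit precision filter corresponds to the `4-bit` model tag (the tag name is an assumption, not confirmed by this page):

```python
from huggingface_hub import HfApi

api = HfApi()

# Assumption: the Hub's "4-bit" precision filter maps to the "4-bit" model tag.
models = api.list_models(filter="4-bit", sort="downloads", direction=-1, limit=30)

for m in models:
    # pipeline_tag is the task shown in the listing (e.g. text-generation, translation)
    print(m.id, m.pipeline_tag, m.downloads, m.likes)
```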
| Model | Task | Params | Downloads | Likes |
|---|---|---|---|---|
| Intel/GLM-4.7-int4-mixed-AutoRound | Text Generation | 2B | 89 | 14 |
| tencent/HY-MT1.5-1.8B-GPTQ-Int4 | Translation | 2B | 291 | 9 |
| QuantTrio/MiniMax-M2.1-AWQ | Text Generation | 229B | 1.17k | 5 |
| tencent/HY-MT1.5-7B-GPTQ-Int4 | Translation | 8B | 226 | 5 |
| | Text Generation | 358B | 10.6k | 12 |
| QuantTrio/GLM-4.7-GPTQ-Int4-Int8Mix | Text Generation | 390B | 126 | 4 |
| mlx-community/MiniMax-M2.1-4bit | Text Generation | 229B | 866 | 4 |
| Disty0/Qwen-Image-Edit-2511-SDNQ-uint4-svd-r32 | Image-to-Image | | 281 | 6 |
| TheBloke/CapybaraHermes-2.5-Mistral-7B-GPTQ | | 7B | 192 | 61 |
| unsloth/Phi-3-mini-4k-instruct-bnb-4bit | Text Generation | 4B | 35k | 42 |
| lmstudio-community/Qwen2.5-Coder-7B-Instruct-MLX-4bit | Text Generation | 1B | 1.94k | 3 |
| ICEPVP8977/Uncensored_Qwen2.5_Coder_7B_4_bit_quantized_Seaftensors | | 8B | 54 | 3 |
| lmstudio-community/Devstral-Small-2507-MLX-4bit | Text Generation | 24B | 26.4k | 5 |
| Intel/Qwen3-Next-80B-A3B-Instruct-int4-mixed-AutoRound | Text Generation | | 18.9k | 23 |
| nota-ai/Qwen3-30B-A3B-NotaMoEQuant-Int4 | Text Generation | 0.6B | 136 | 4 |
| nota-ai/GLM-4.5-Air-NotaMoeQuant-Int4 | Text Generation | 1B | 63 | 2 |
| nightmedia/Qwen3-4B-Agent-F32-dwq4-mlx | Text Generation | 0.8B | 212 | 2 |
| | Text-to-Speech | 0.5B | 34 | 2 |
| TheBloke/WizardLM-33B-V1-0-Uncensored-SuperHOT-8K-GPTQ | Text Generation | 33B | 39 | 93 |
| MaziyarPanahi/TheTop-5x7B-Instruct-S5-v0.1-GGUF | Text Generation | 7B | 32 | 3 |
| MaziyarPanahi/gemma-7b-GGUF | Text Generation | 9B | 1.35k | 12 |
| CohereLabs/c4ai-command-r-v01-4bit | Text Generation | 35B | 33 | 176 |
| Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4 | Text Generation | 14B | 1.19k | 49 |
| SweatyCrayfish/llama-3-8b-quantized | Text Generation | 8B | 32 | 12 |
| solidrust/Llama-3-8B-Lexi-Uncensored-AWQ | Text Generation | 8B | 99.5k | 4 |
| MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF | Text Generation | 7B | 169k | 129 |
| Intel/Qwen2-0.5B-Instuct-int4-inc | Text Generation | 0.6B | 4 | 1 |
| Intel/Qwen2-1.5B-Instuct-int4-inc | Text Generation | 2B | 4 | 3 |
| MaziyarPanahi/Mistral-Nemo-Instruct-2407-GGUF | Text Generation | 12B | 165k | 50 |
| hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | Text Generation | 8B | 177k | 82 |
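Checkpoints in this listing that ship transformers-format 4-bit weights can be loaded directly. A minimal sketch using the AWQ INT4 Llama 3.1 8B entry, assuming a CUDA GPU and the `autoawq` kernels installed alongside `transformers` (hardware and package setup are assumptions, not stated on this page):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# One of the 4-bit checkpoints listed above; AWQ weights need the autoawq kernels installed.
model_id = "hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # activations run in fp16 alongside the int4 weights
    device_map="auto",          # place layers on the available GPU(s)
)

messages = [{"role": "user", "content": "Summarize 4-bit quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```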