🇮🇳 Qwen3-4B Hindi Instruct v2: a Hindi LLM that runs on your own machine
Most strong Hindi-capable models are either huge or cloud-only. I wanted one that's small enough to run locally but actually follows instructions in Hindi, so I fine-tuned Qwen3-4B on 10K Hindi instruction pairs and shipped it with a full GGUF quant ladder.
→ Fine-tune (16-bit): huggingface.co/pankajpandey-dev/Qwen3-4B-Hindi-Instruct-v2
→ GGUF (Q4/Q5/Q8): huggingface.co/pankajpandey-dev/Qwen3-4B-Hindi-Instruct-v2-GGUF
Runs in Ollama, llama.cpp, and LM Studio. The Q4_K_M is just 2.5 GB, so it fits comfortably on a laptop, CPU or GPU.
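For anyone wondering where a figure like 2.5 GB comes from, here is a rough back-of-the-envelope sketch. It assumes about 4.0 billion parameters (read off the "4B" in the model name) and approximate effective bits-per-weight values for each quant type. Those bit widths and the helper name `estimate_gguf_size_gb` are my illustrative assumptions, not numbers measured from the files in the repo, and real GGUF files also carry metadata and some tensors at other precisions.

```python
# Approximate effective bits per weight for common GGUF quant types.
# These are assumed round figures for illustration, not values measured
# from this model's actual files.
APPROX_BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "F16": 16.0,
}


def estimate_gguf_size_gb(n_params: float, quant: str) -> float:
    """Rough weight-only file size in GB (10**9 bytes) for a quant type."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return n_params * bits / 8 / 1e9


if __name__ == "__main__":
    n_params = 4.0e9  # assumption: taken from the "4B" in the model name
    for quant in APPROX_BITS_PER_WEIGHT:
        size = estimate_gguf_size_gb(n_params, quant)
        print(f"{quant}: ~{size:.1f} GB")
```

Under these assumptions the Q4_K_M estimate lands a little under 2.5 GB, in line with the size quoted above, while the 16-bit weights come out near 8 GB. That gap is the practical reason to pick a quant for laptop use.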
Part of my Hindi LLM Series: building openly licensed Indic models for local and edge use. More coming (Gemma next). Feedback welcome.
#Hindi #IndicNLP #GGUF #LocalLLM #Qwen