# Qwen2.5-7B-Instruct Abliterated (GGUF)
An abliterated (uncensored) version of Qwen/Qwen2.5-7B-Instruct in GGUF format, ready for local inference with llama.cpp, Ollama, or LM Studio.
Abliteration removes the model's trained refusal behavior while preserving its core capabilities, which is useful for research, creative writing, and scenarios that require unrestricted model output.
## Quick Start

### With Ollama

```shell
ollama run hf.co/richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF
```
### With llama.cpp

```shell
# Download the Q4_K_M quantization (recommended balance of quality and speed)
huggingface-cli download richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF \
  --include "*Q4_K_M*" --local-dir ./models

# Run inference
./llama-cli -m ./models/*Q4_K_M*.gguf \
  -p "You are a helpful assistant." \
  --chat-template chatml -ngl 99
```
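The `--chat-template chatml` flag tells llama.cpp to wrap messages in ChatML, the template Qwen2.5-Instruct was trained with. If you drive the model through a lower-level completion API instead, you can construct the prompt yourself; a minimal sketch (the `chatml_prompt` helper is illustrative, not part of any library):

```python
def chatml_prompt(system: str, user: str) -> str:
    """Wrap a system and user message in ChatML, leaving the
    assistant turn open so the model continues from there."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt("You are a helpful assistant.", "Hello!"))
```

Getting this template wrong (or omitting it) is the most common cause of degraded or rambling output from instruct-tuned GGUF models.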
### With LM Studio

Search for `richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF` in the model browser, or download the GGUF file manually and import it.
## Available Quantizations
| Quantization | Use Case |
|---|---|
| Q2_K | Minimum RAM, lower quality |
| Q4_K_M | Recommended — good balance of quality and speed |
| Q5_K_M | Higher quality, more RAM |
| Q6_K | Near-original quality |
| Q8_0 | Maximum quality, most RAM |
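To estimate how much disk space (and, roughly, RAM) each quantization needs, multiply the parameter count by the average bits per weight. A back-of-the-envelope sketch; the bits-per-weight figures below are approximate averages for llama.cpp quant types, and actual file sizes vary by a few percent:

```python
# Approximate average bits per weight for common llama.cpp quant types.
# These are rough figures, not exact specifications.
APPROX_BPW = {
    "Q2_K": 2.6,
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
}

def est_size_gb(n_params: float, quant: str) -> float:
    """Estimated GGUF file size in GB: params x bits-per-weight / 8."""
    return n_params * APPROX_BPW[quant] / 8 / 1e9

# Qwen2.5-7B has roughly 7.6B parameters.
for q in APPROX_BPW:
    print(f"{q:8s} ~{est_size_gb(7.6e9, q):.1f} GB")
```

Add a gigabyte or two on top of the file size for the KV cache and runtime overhead when budgeting RAM.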
## What is Abliteration?
Abliteration is a technique that identifies and removes the "refusal direction" in a model's residual stream. Unlike fine-tuning, it surgically modifies the model's behavior without retraining, preserving the original model's knowledge and capabilities.
For more details, see the original research: *Refusal in Language Models Is Mediated by a Single Direction*.
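The core idea can be illustrated in a few lines of NumPy. This is a simplified toy sketch, not the actual procedure used to produce this model: the activations here are synthetic, whereas in practice the refusal direction is extracted from the model's residual stream over contrastive prompt sets.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # toy hidden size

# Toy residual-stream activations: "harmful" prompts are shifted
# along one dimension relative to "harmless" prompts.
harmful = rng.normal(size=(100, d)) + 3.0 * np.eye(d)[0]
harmless = rng.normal(size=(100, d))

# 1. Refusal direction: normalized difference of mean activations.
r = harmful.mean(axis=0) - harmless.mean(axis=0)
r /= np.linalg.norm(r)

# 2. Ablation: project the direction out of a weight matrix that
#    writes to the residual stream, W' = W - r r^T W.
W = rng.normal(size=(d, d))
W_abl = W - np.outer(r, r) @ W

# The ablated weights can no longer write anything along r.
print(np.linalg.norm(r @ W_abl))  # ~0
```

Because this is an orthogonal projection rather than gradient-based training, the rest of the weight matrix, and hence the model's other capabilities, is left essentially untouched.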
## Intended Use
This model is intended for:
- Research on model alignment and safety
- Creative writing without artificial restrictions
- Education on how language model censorship works
- Local inference where you control the deployment context
## Limitations
- Abliterated models will comply with requests the base model would refuse
- Use responsibly — the model has no safety guardrails
- Output quality is close to the base Qwen2.5-7B-Instruct, though abliteration can cause minor degradation on some tasks
## Other Models by richardyoung
- Abliterated/Uncensored models: Qwen2.5-7B | Qwen3-14B | DeepSeek-R1-32B | Qwen3-8B
- MLX quantizations (Apple Silicon): Kimi-K2 series | olmOCR MLX
- OCR & Vision: olmOCR GGUF
- Healthcare/Medical: Synthea 575K patients dataset | CardioEmbed
- Research: LLM Instruction-Following Evaluation (arxiv:2510.18892)