Qwen2.5-7B-Instruct Abliterated (GGUF)

An abliterated (uncensored) version of Qwen/Qwen2.5-7B-Instruct in GGUF format, ready for local inference with llama.cpp, Ollama, or LM Studio.

Abliteration removes the model's trained refusal behavior while preserving its core capabilities, which is useful for research, creative writing, and scenarios where you need unrestricted model output.

Quick Start

With Ollama

ollama run hf.co/richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF

With llama.cpp

# Download the Q4_K_M quantization (recommended balance of quality/speed)
huggingface-cli download richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF \
    --include "*Q4_K_M*" --local-dir ./models

# Run inference
./llama-cli -m ./models/*Q4_K_M*.gguf \
    -p "You are a helpful assistant." \
    --chat-template chatml -ngl 99

With LM Studio

Search for richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF in the model browser, or download manually and import.

Available Quantizations

Quantization    Use Case
Q2_K            Minimum RAM, lower quality
Q4_K_M          Recommended: good balance of quality and speed
Q5_K_M          Higher quality, more RAM
Q6_K            Near-original quality
Q8_0            Maximum quality, most RAM
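You can estimate the download size of each quantization from the parameter count and the average bits per weight. A minimal sketch, where the bits-per-weight figures are rough approximations for llama.cpp K-quants (not exact values, since different tensors are quantized at different precisions):

```python
# Approximate average bits per weight for common llama.cpp quantizations.
# These are rough figures for illustration, not exact llama.cpp constants.
APPROX_BPW = {
    "Q2_K": 3.35,
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.69,
    "Q6_K": 6.59,
    "Q8_0": 8.50,
}

def estimate_gb(n_params: float, quant: str) -> float:
    """Estimated GGUF file size in GB: parameters * bits-per-weight / 8 bits."""
    return n_params * APPROX_BPW[quant] / 8 / 1e9

# A 7.6B-parameter model at each quantization level:
for quant in APPROX_BPW:
    print(f"{quant:8s} ~{estimate_gb(7.6e9, quant):.1f} GB")
```

Plan for some headroom on top of the file size: the KV cache and compute buffers need additional RAM at inference time, growing with context length.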

What is Abliteration?

Abliteration is a technique that identifies and removes the "refusal direction" in a model's residual stream. Unlike fine-tuning, it surgically modifies the model's behavior without retraining, preserving the original model's knowledge and capabilities.

For more details, see the original research: Refusal in Language Models Is Mediated by a Single Direction (Arditi et al., 2024)

Intended Use

This model is intended for:

  • Research on model alignment and safety
  • Creative writing without artificial restrictions
  • Education on how language model censorship works
  • Local inference where you control the deployment context

Limitations

  • Abliterated models will comply with requests the base model would refuse
  • Use responsibly — the model has no safety guardrails
  • Output quality is close to the base Qwen2.5-7B-Instruct, though abliteration can introduce minor capability regressions

Model Details

  • Base model: Qwen/Qwen2.5-7B
  • Architecture: qwen2
  • Parameters: 8B
  • Format: GGUF