Qwen2.5-7B-Instruct Abliterated (GGUF)

An abliterated (uncensored) version of Qwen/Qwen2.5-7B-Instruct in GGUF format, ready for local inference with llama.cpp, Ollama, or LM Studio.

Abliteration removes the model's trained refusal behavior while preserving its core capabilities, which is useful for research, creative writing, and scenarios where you need unrestricted model output.

Quick Start

With Ollama

ollama run hf.co/richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF

With llama.cpp

# Download the Q4_K_M quantization (recommended balance of quality/speed)
huggingface-cli download richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF \
    --include "*Q4_K_M*" --local-dir ./models

# Run inference
./llama-cli -m ./models/*Q4_K_M*.gguf \
    -p "You are a helpful assistant." \
    --chat-template chatml -ngl 99

With LM Studio

Search for richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF in the model browser, or download manually and import.

Available Quantizations

Quantization    Use Case
Q2_K            Minimum RAM, lower quality
Q4_K_M          Recommended: good balance of quality and speed
Q5_K_M          Higher quality, more RAM
Q6_K            Near-original quality
Q8_0            Maximum quality, most RAM
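You can estimate the download size of each quantization from the parameter count and the average bits per weight. A minimal sketch, where the bits-per-weight figures are rough approximations for llama.cpp K-quants (not exact values, since different tensors are quantized at different precisions):

```python
# Approximate average bits per weight for common llama.cpp quantizations.
# These are rough figures for illustration, not exact llama.cpp constants.
APPROX_BPW = {
    "Q2_K": 3.35,
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.69,
    "Q6_K": 6.59,
    "Q8_0": 8.50,
}

def estimate_gb(n_params: float, quant: str) -> float:
    """Estimated GGUF file size in GB: parameters * bits-per-weight / 8 bits."""
    return n_params * APPROX_BPW[quant] / 8 / 1e9

# A 7.6B-parameter model at each quantization level:
for quant in APPROX_BPW:
    print(f"{quant:8s} ~{estimate_gb(7.6e9, quant):.1f} GB")
```

Plan for some headroom on top of the file size: the KV cache and compute buffers need additional RAM at inference time, growing with context length.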

What is Abliteration?

Abliteration is a technique that identifies and removes the "refusal direction" in a model's residual stream. Unlike fine-tuning, it surgically modifies the model's behavior without retraining, preserving the original model's knowledge and capabilities.

For more details, see the original research: Refusal in Language Models Is Mediated by a Single Direction (Arditi et al., 2024)

Intended Use

This model is intended for:

  • Research on model alignment and safety
  • Creative writing without artificial restrictions
  • Education on how language model censorship works
  • Local inference where you control the deployment context

Limitations

  • Abliterated models will comply with requests the base model would refuse
  • Use responsibly — the model has no safety guardrails
  • Output quality is close to the base Qwen2.5-7B-Instruct, though abliteration can introduce minor capability regressions

Model Details

  • Base model: Qwen/Qwen2.5-7B
  • Architecture: qwen2
  • Parameters: 8B
  • Format: GGUF