Instructions for using mahmoudalrefaey/llama3-codeweaver-lora with libraries and local apps.

Libraries

How to use mahmoudalrefaey/llama3-codeweaver-lora with Transformers:

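# Use a pipeline as a high-level helper
# (this repository holds a LoRA adapter, so the `peft` package must be installed
# for Transformers to load the base model and attach the adapter)
from transformers import pipeline

pipe = pipeline("text-generation", model="mahmoudalrefaey/llama3-codeweaver-lora")

# Load model directly
# (AutoModelForCausalLM keeps the language-modeling head needed for generation)
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("mahmoudalrefaey/llama3-codeweaver-lora", dtype="auto")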

PEFT
How to use mahmoudalrefaey/llama3-codeweaver-lora with PEFT:
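```
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the LoRA adapter from this repository
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = PeftModel.from_pretrained(base_model, "mahmoudalrefaey/llama3-codeweaver-lora")
```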
Local Apps

vLLM

How to use mahmoudalrefaey/llama3-codeweaver-lora with vLLM:

Install from pip and serve model

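# Install vLLM from pip:
pip install vllm
# This repository is a LoRA adapter, so serve the base model and attach the adapter.
# "codeweaver" is an arbitrary alias chosen here; recent vLLM releases accept a Hub
# repo ID as the adapter path, while older ones may need a local directory:
vllm serve "meta-llama/Meta-Llama-3-8B" \
	--enable-lora \
	--lora-modules codeweaver=mahmoudalrefaey/llama3-codeweaver-lora
# Call the server using curl (OpenAI-compatible API), selecting the adapter by its alias:
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "codeweaver",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'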

Use Docker

docker model run hf.co/mahmoudalrefaey/llama3-codeweaver-lora

SGLang

How to use mahmoudalrefaey/llama3-codeweaver-lora with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "mahmoudalrefaey/llama3-codeweaver-lora" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mahmoudalrefaey/llama3-codeweaver-lora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "mahmoudalrefaey/llama3-codeweaver-lora" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mahmoudalrefaey/llama3-codeweaver-lora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
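
The curl calls above all target an OpenAI-compatible `/v1/completions` route, so they can be mirrored from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the base URL and model name are parameters because they depend on how the server was started.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for an OpenAI-compatible /v1/completions route."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(base_url, model, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server listening on port 30000, a call would look like `complete("http://localhost:30000", model_name, prompt)`, where `model_name` is whatever name the server registered for the model.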

Docker Model Runner
How to use mahmoudalrefaey/llama3-codeweaver-lora with Docker Model Runner:
```
docker model run hf.co/mahmoudalrefaey/llama3-codeweaver-lora
```

Model Card for llama3-codeweaver-lora

Model Details

Model name: llama3-codeweaver-lora
Developed by: mahmoudalrefaey
Funded by: None (personal project)
Finetuned from: meta-llama/Meta-Llama-3-8B
License: Meta Llama 3 Community License

This repository contains a LoRA adapter for LLaMA-3 8B, fine-tuned with QLoRA on the CoNaLa mined-curated dataset for code generation tasks.
The adapter was trained on a Google Colab T4 GPU (16 GB) using fp16 mixed precision.

Uses

Direct Use

Code generation assistance, such as turning natural-language instructions into Python snippets.
Educational use for learning about LLM fine-tuning with LoRA adapters.

Downstream Use

Can be further fine-tuned on specialized coding datasets (e.g. SQL, JS).
Integration into coding assistants and research projects.

Out-of-Scope Use

Not intended for production-critical code security auditing.
Not guaranteed to generate safe or fully optimized code.
Should not be used where code execution safety is critical unless the generated code runs in a sandbox.
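
As an illustration of the caution above, generated snippets can be screened statically before anything is executed. This is a minimal sketch, not a sandbox: the function name `static_precheck` and the deny-list are hypothetical, the list is far from exhaustive, and passing the check does not make a snippet safe to run.

```python
import ast

# Hypothetical deny-list for illustration only; it is not exhaustive.
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__", "open"}


def static_precheck(snippet):
    """Return reasons to reject `snippet`; an empty list means nothing was flagged."""
    try:
        tree = ast.parse(snippet)
    except SyntaxError as error:
        return [f"syntax error: {error.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("import statement")
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BLOCKED_CALLS
        ):
            problems.append(f"call to {node.func.id}()")
    return problems
```

A check like this only filters obviously unwanted output early. Isolation at the process or container level is still needed wherever execution safety matters.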

Training Details

Training Data

Dataset: CoNaLa mined-curated
Dataset size used: ~7,000 samples

Training Procedure

Method: QLoRA fine-tuning with 4-bit quantization
Precision: fp16 mixed precision
Hardware: Google Colab T4 (16GB GPU)
Batch size: 2, with gradient accumulation to an effective batch size of 4
Epochs: 3
Training time: ~1h 30m
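
To see why 4-bit quantization matters on a 16 GB T4, the weight memory can be estimated with simple arithmetic. This is a rough sketch: the parameter count of about 8.03 billion is an assumption for the base model, and the estimate covers weights only, ignoring quantization constants, activations, LoRA parameters, and optimizer state.

```python
# Assumption: the base model has roughly 8.03 billion parameters.
NUM_PARAMS = 8.03e9
GIB = 1024 ** 3


def weight_memory_gib(num_params, bits_per_param):
    """Memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / GIB


fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
four_bit_gib = weight_memory_gib(NUM_PARAMS, 4)

print(f"fp16 weights:  {fp16_gib:.1f} GiB")      # about 15.0 GiB
print(f"4-bit weights: {four_bit_gib:.1f} GiB")  # about 3.7 GiB
```

Under these assumptions, fp16 weights alone would nearly fill the GPU, while 4-bit weights leave room for activations and the adapter's training state.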

Evaluation

Testing Data

Held-out validation split (10% of dataset)
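
The figures reported in this card imply an approximate number of optimizer steps. The sketch below works through that arithmetic, assuming the 10% validation split is taken out of the ~7,000 samples and that training ran on a single GPU. The card does not state either point, so the results are estimates.

```python
import math

TOTAL_SAMPLES = 7_000        # "~7,000 samples" (approximate)
VALIDATION_FRACTION = 0.10   # held-out validation split
PER_DEVICE_BATCH = 2
EFFECTIVE_BATCH = 4
EPOCHS = 3

# Assumption: the validation split comes out of the ~7,000 samples.
validation_samples = round(TOTAL_SAMPLES * VALIDATION_FRACTION)
train_samples = TOTAL_SAMPLES - validation_samples

# Gradient accumulation steps implied by the two batch sizes on one GPU.
accumulation_steps = EFFECTIVE_BATCH // PER_DEVICE_BATCH

steps_per_epoch = math.ceil(train_samples / EFFECTIVE_BATCH)
total_steps = steps_per_epoch * EPOCHS

print(train_samples, validation_samples)  # 6300 700
print(accumulation_steps)                 # 2
print(steps_per_epoch, total_steps)       # 1575 4725
```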

Metrics

Validation loss: decreased steadily across epochs
Qualitative evaluation: Python snippets were generated from validation prompts
Example outputs matched the reference solutions for common coding tasks
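
The card does not say how a match with a reference solution was judged. One hypothetical way to make the comparison repeatable is to compare parsed syntax trees, which ignores whitespace and comments. The function names below are illustrative and not part of the original evaluation.

```python
import ast


def normalized(snippet):
    """Canonical form of a snippet: its AST dump, which ignores whitespace and comments."""
    return ast.dump(ast.parse(snippet))


def matches_reference(generated, reference):
    """True when both snippets parse to the same AST; False if either fails to parse."""
    try:
        return normalized(generated) == normalized(reference)
    except SyntaxError:
        return False


def exact_match_rate(pairs):
    """Fraction of (generated, reference) pairs that match."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(matches_reference(g, r) for g, r in pairs) / len(pairs)
```

This measure is strict: two snippets that behave identically but are written differently, such as a loop and a comprehension, count as a mismatch.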

Example Prompt & Output

Prompt:
### Instruction:
Write code to convert integer num to list

### Code:

Generated:
[int(x) for x in str(num)]
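
The prompt layout above can be produced by a small helper. This is a sketch under assumptions: the trailing newline after `### Code:` and the choice to cut the completion at the next `###` marker are guesses, and `build_prompt` and `extract_code` are hypothetical names.

```python
# Assumption: the prompt ends with a newline after "### Code:".
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Code:\n"


def build_prompt(instruction):
    """Wrap a natural-language instruction in the prompt layout shown above."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_code(completion):
    """Keep the text before the next '###' marker, in case the model keeps generating."""
    return completion.split("###", 1)[0].strip()


prompt = build_prompt("Write code to convert integer num to list")
snippet = extract_code("[int(x) for x in str(num)]\n\n### Instruction:\n...")

# eval is acceptable here only because `snippet` is the fixed example from this card.
digits = eval(snippet, {"num": 123})
print(digits)  # [1, 2, 3]
```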

Environmental Impact

Hardware: NVIDIA T4 (16 GB VRAM)
Cloud Provider: Google Colab
Compute Region: Unknown
Training Duration: ~1.5 hours

Citation

@misc{llama3-codeweaver-lora,
  author = {Mahmoud Alrefaey},
  title = {llama3-codeweaver-lora: A QLoRA fine-tuned LLaMA-3 model for code generation},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/mahmoudalrefaey/llama3-codeweaver-lora}},
}
