Model Card for llama3-codeweaver-lora
Model Details
- Model name: llama3-codeweaver-lora
- Developed by: mahmoudalrefaey
- Funded by: None (personal project)
- Finetuned from: meta-llama/Meta-Llama-3-8B
- License: LLaMA 3 license
This is a LLaMA-3 8B model fine-tuned with QLoRA on the CoNaLa mined-curated dataset for code generation tasks.
The adapter was trained on a Google Colab T4 GPU (16 GB) in fp16 mixed precision.
Uses
Direct Use
- Intended for code generation assistant tasks such as transforming natural language instructions into Python snippets.
- Educational use for learning about LLM fine-tuning with LoRA adapters.
Downstream Use
- Can be further fine-tuned on specialized coding datasets (e.g. SQL, JS).
- Integration into coding assistants and research projects.
Out-of-Scope Use
- Not intended for production-critical code security auditing.
- Not guaranteed to generate safe or fully optimized code.
- Should not be used in environments where code execution safety is critical without sandboxing.
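Because generated code is not guaranteed to be safe, a cheap static pre-check can reject obviously unsuitable snippets before they ever reach a sandbox. The sketch below is an illustration only and not part of the model or its training code: the function name `precheck_snippet` and the blocklist are hypothetical, and passing the check does not make a snippet safe to execute.

```python
import ast

# Hypothetical blocklist for illustration; it is not a substitute for a real sandbox.
BLOCKED_NAMES = {"eval", "exec", "compile", "open", "__import__"}


def precheck_snippet(code: str) -> list[str]:
    """Return reasons to reject `code`; an empty list means no issue was found."""
    try:
        tree = ast.parse(code)
    except SyntaxError as err:
        return [f"syntax error: {err.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("import statement")
        elif isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            problems.append(f"use of blocked name: {node.id}")
    return problems
```

A snippet that passes this filter should still be executed only inside an isolated environment, since a name-based blocklist is easy to bypass.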
Training Details
Training Data
- Dataset: CoNaLa mined-curated
- Dataset size used: ~7,000 samples
Training Procedure
- Method: QLoRA fine-tuning with 4-bit quantization
- Precision: fp16 mixed precision
- Hardware: Google Colab T4 (16GB GPU)
- Batch size: 2 per step (effective batch size 4 with gradient accumulation)
- Epochs: 3
- Training time: ~1h 30m
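The procedure above might be configured roughly as follows. This is a sketch under stated assumptions, not the author's training script: the batch sizes, epoch count, 4-bit quantization, and fp16 precision come from the list above, while the quantization type, LoRA rank, alpha, dropout, and target modules are not given in this card and are placeholder values. The heavy imports are kept inside the function so the arithmetic helper can be used without a GPU stack installed.

```python
from dataclasses import dataclass


@dataclass
class TrainingSetup:
    per_device_batch_size: int = 2  # from the card
    effective_batch_size: int = 4   # from the card
    epochs: int = 3                 # from the card

    @property
    def accumulation_steps(self) -> int:
        # Single-GPU assumption: effective batch = per-step batch * accumulation steps.
        return self.effective_batch_size // self.per_device_batch_size


def make_qlora_configs():
    """Build 4-bit quantization and LoRA configs.

    Requires torch, transformers, peft, and bitsandbytes to be installed.
    """
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # 4-bit quantization, as stated in the card
        bnb_4bit_quant_type="nf4",             # assumption: not stated in the card
        bnb_4bit_compute_dtype=torch.float16,  # matches the fp16 precision in the card
    )
    lora_config = LoraConfig(
        r=16,                                  # assumption: rank not stated in the card
        lora_alpha=32,                         # assumption
        lora_dropout=0.05,                     # assumption
        target_modules=["q_proj", "v_proj"],   # assumption
        task_type="CAUSAL_LM",
    )
    return bnb_config, lora_config
```

With a per-step batch of 2 and an effective batch of 4 on a single GPU, `TrainingSetup().accumulation_steps` works out to 2.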
Evaluation
Testing Data
- Held-out validation split (10% of dataset)
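A 10% hold-out of this kind can be produced with a seeded shuffle. The card does not say how the split was actually made, so the helper name `split_train_validation`, the shuffle, and the seed below are assumptions chosen for illustration; the 7,000 figure is the approximate dataset size quoted under Training Data.

```python
import random


def split_train_validation(samples, validation_fraction=0.1, seed=42):
    """Shuffle a copy of `samples` and hold out a fraction for validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Stand-in indices for roughly 7,000 samples.
train, validation = split_train_validation(range(7000))
print(len(train), len(validation))  # prints "6300 700"
```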
Metrics
- Validation loss decreased steadily across epochs
- Qualitative evaluation: generated Python snippets from validation prompts
- Example outputs matched reference solutions for common coding tasks
Example Prompt & Output
Prompt:
### Instruction:
Write code to convert integer num to list
### Code:
Generated:
[int(x) for x in str(num)]
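One way to reproduce an exchange like this is to load the base model, attach the adapter with PEFT, and prompt it in the same layout. The sketch below is an assumption-laden illustration rather than the author's inference code: the exact whitespace of the prompt template is inferred from the example above, the function names are hypothetical, `device_map="auto"` requires the `accelerate` package, and access to `meta-llama/Meta-Llama-3-8B` requires accepting its license on Hugging Face.

```python
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n### Code:\n"


def build_prompt(instruction: str) -> str:
    """Format an instruction in the layout of the example prompt in the card."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def generate_code(instruction: str, max_new_tokens: int = 64) -> str:
    """Load base model plus adapter and generate a completion (downloads weights)."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base_id = "meta-llama/Meta-Llama-3-8B"
    adapter_id = "mahmoudalrefaey/llama3-codeweaver-lora"

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(base.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# The generated snippet from the card can be checked without loading the model.
num = 123
digits = [int(x) for x in str(num)]
print(digits)  # prints "[1, 2, 3]"
```

Note that the snippet yields a list of digits, which is how the model interpreted "convert integer num to list".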
Environmental Impact
- Hardware: NVIDIA T4 (16 GB VRAM)
- Cloud Provider: Google Colab
- Compute Region: Unknown
- Training Duration: ~1.5 hours
Citation
@misc{llama3-codeweaver-lora,
  author = {Mahmoud Alrefaey},
  title = {llama3-codeweaver-lora: A QLoRA fine-tuned LLaMA-3 model for code generation},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/mahmoudalrefaey/llama3-codeweaver-lora}},
}