How to use ctu-aic/Llama-3.1-8B_cp-cs with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="ctu-aic/Llama-3.1-8B_cp-cs")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("ctu-aic/Llama-3.1-8B_cp-cs")
model = AutoModelForCausalLM.from_pretrained("ctu-aic/Llama-3.1-8B_cp-cs")
How to use ctu-aic/Llama-3.1-8B_cp-cs with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ctu-aic/Llama-3.1-8B_cp-cs"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ctu-aic/Llama-3.1-8B_cp-cs",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
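The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_payload` and `complete` are hypothetical (not part of vLLM or of this model card), and it assumes the server started with `vllm serve` is listening on `localhost:8000` and returns the usual OpenAI-compatible completion response.

```python
import json
import urllib.request

MODEL_ID = "ctu-aic/Llama-3.1-8B_cp-cs"


def build_completion_payload(prompt, max_tokens=512, temperature=0.5):
    """Return the same JSON body that the curl example sends."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url="http://localhost:8000", **kwargs):
    """POST the payload to /v1/completions and return the generated text.

    Hypothetical helper: it needs a running server, and nothing is sent
    until the function is called.
    """
    body = json.dumps(build_completion_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # Assumption: the response follows the OpenAI completions format,
    # where the generated text is in choices[0].text.
    return result["choices"][0]["text"]


# Example call (requires the server to be running):
# print(complete("Once upon a time,", max_tokens=64))
```

Passing `base_url="http://localhost:30000"` should target a server listening on port 30000 instead, since only the host and port change between the curl examples.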
How to use ctu-aic/Llama-3.1-8B_cp-cs with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "ctu-aic/Llama-3.1-8B_cp-cs" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ctu-aic/Llama-3.1-8B_cp-cs",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "ctu-aic/Llama-3.1-8B_cp-cs" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ctu-aic/Llama-3.1-8B_cp-cs",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use ctu-aic/Llama-3.1-8B_cp-cs with Docker Model Runner:
docker model run hf.co/ctu-aic/Llama-3.1-8B_cp-cs
Llama 3.1 8B continually pretrained on the Czech subset of FineWeb2. More information in the thesis: TBA. (The notation in the thesis is B->CP_(cs).)
This model is a Czech-adapted version of Meta's Llama 3.1 8B, developed as part of a master's thesis. It is intended solely for academic and research purposes.
Researchers and practitioners using this model must ensure appropriate ethical oversight and conduct rigorous evaluations before any further deployment or fine-tuning.
@mastersthesis{mlynar2025llmadapt,
author = {Tomáš Mlynář},
title = {Compute-constrained LLM adaptation to Czech language},
school = {Czech Technical University in Prague},
year = {2025},
type = {Master's thesis},
month = {6},
note = {Supervisor: Ing. Herbert Ullrich},
url = {http://hdl.handle.net/10467/123587}
}
Base model
meta-llama/Llama-3.1-8B