Instructions to use Salesforce/xgen-7b-8k-base with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Salesforce/xgen-7b-8k-base with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

# XGen ships a custom tiktoken-based tokenizer (pip install tiktoken),
# so loading it requires trust_remote_code=True.
pipe = pipeline("text-generation", model="Salesforce/xgen-7b-8k-base", trust_remote_code=True)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base")

Local Apps

vLLM

How to use Salesforce/xgen-7b-8k-base with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Salesforce/xgen-7b-8k-base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Salesforce/xgen-7b-8k-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
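
The same request can be sent from Python instead of curl. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `localhost:8000` and returns the usual OpenAI-style completions response (`choices[0].text`). The helper names `build_completion_request` and `complete` are made up for this illustration.

```python
import json
import urllib.request


def build_completion_request(
    prompt,
    base_url="http://localhost:8000",  # assumption: default vLLM port from the command above
    model="Salesforce/xgen-7b-8k-base",
    max_tokens=512,
    temperature=0.5,
):
    """Build a POST request equivalent to the curl call above (hypothetical helper)."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text (requires a running server)."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: OpenAI-compatible response shape with a "choices" list.
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the continuation as a string. Passing `base_url="http://localhost:30000"` would point the same helper at the SGLang server described further down.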

Use Docker

docker model run hf.co/Salesforce/xgen-7b-8k-base

SGLang

How to use Salesforce/xgen-7b-8k-base with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Salesforce/xgen-7b-8k-base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Salesforce/xgen-7b-8k-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Salesforce/xgen-7b-8k-base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Salesforce/xgen-7b-8k-base",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Salesforce/xgen-7b-8k-base with Docker Model Runner:
```
docker model run hf.co/Salesforce/xgen-7b-8k-base
```

fine tune the model for custom dataset

#24

by NajiAboo - opened Jul 12, 2023

Discussion

NajiAboo

Jul 12, 2023

I am trying to fine-tune the model on a custom dataset. Any help on this is highly appreciated.
Thanks in advance.

alvunr

Jul 25, 2023

help on this would be very useful. Thanks!

thebadsektor

Aug 5, 2023

following this.

obscureagent

Sep 13, 2023

Hopefully we get a resource for a straightforward fine-tuning process for XGen-7B, like the ones available for Llama.

ybelkada

Oct 12, 2023

Hi everyone

As XGen has the same architecture as Llama, you can easily extend this gist: https://gist.github.com/younesbelkada/9f7f75c94bdc1981c8ca5cc937d4a4da to use XGen and run it on your custom dataset.
Since it is a 7B model, you can also fine-tune it on a free-tier Google Colab instance, for example by following this tutorial: https://colab.research.google.com/drive/1DNenc5BpdqaS10prtklYyIe9qW_7gUnb but running it on XGen instead.
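
For the "custom dataset" part, one possible preparation step is sketched below. It is not taken from the gist or the Colab notebook: it assumes the training script reads a single `text` column, and the `### Human:` / `### Assistant:` template, the `instruction` / `response` field names, and the helper names are all illustrative choices that should be adapted to your data.

```python
import json
from pathlib import Path


def format_example(instruction, response):
    """Join one prompt/answer pair into a single training string.

    The template is an assumption for illustration; use whatever
    format your training script expects.
    """
    return f"### Human: {instruction}\n### Assistant: {response}"


def write_jsonl(records, path):
    """Write records as JSON Lines, one {"text": ...} object per line."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            text = format_example(record["instruction"], record["response"])
            handle.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")
    return path


# Hypothetical sample data, only to show the expected input shape.
examples = [
    {"instruction": "What is XGen?", "response": "A 7B language model from Salesforce."},
    {"instruction": "How long is its context?", "response": "Up to 8K tokens."},
]
```

Calling `write_jsonl(examples, "train.jsonl")` produces a file that can be loaded with `datasets.load_dataset("json", data_files="train.jsonl")` and passed to a trainer in place of the dataset used in the linked examples.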

thebadsektor

Oct 13, 2023

Thanks.
