Instructions to use google/gemma-2-9b-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use google/gemma-2-9b-it with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="google/gemma-2-9b-it", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-9b-it", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use google/gemma-2-9b-it with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "google/gemma-2-9b-it"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/gemma-2-9b-it",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
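
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are hypothetical, and the default URL assumes the vLLM server started above is listening on port 8000.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="google/gemma-2-9b-it"):
    """Build the same POST request as the curl call (nothing is sent yet)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible servers place the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With a server running, `ask("What is the capital of France?")` returns the reply text. Passing `base_url="http://localhost:30000"` targets a server on port 30000 instead.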

Use Docker

docker model run hf.co/google/gemma-2-9b-it

SGLang

How to use google/gemma-2-9b-it with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "google/gemma-2-9b-it" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/gemma-2-9b-it",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "google/gemma-2-9b-it" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/gemma-2-9b-it",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use google/gemma-2-9b-it with Docker Model Runner:
```
docker model run hf.co/google/gemma-2-9b-it
```

Request: Access for '/gemma-2-9b-it' model

#54

by rajeevhuggingface87 - opened Dec 26, 2024

Discussion

rajeevhuggingface87

Dec 26, 2024

Sir, I requested access to this model 2-3 days ago, but the request is still pending. Please grant me access to this model.

lkv

Google org Dec 26, 2024

Hi @rajeevhuggingface87 ,

For now, to get instant approval, could you please visit the Kaggle page for Gemma and request access to the model weights through Hugging Face? Kindly find this Kaggle page reference link.

Thank you.

rajeevhuggingface87

Dec 26, 2024

Sir, I want to do my practical work through Google Colab, not Kaggle. So please give me access so that I can use these models through Google Colab.

Kerassy

Dec 26, 2024

@rajeevhuggingface87

I had a similar problem with access to Google Gemma models in Colab, despite having been granted access. I solved it by creating a new HF token with 'write' permissions, and using it to replace the existing Colab secret I had stored.

After granting the notebook permission to access the HF token secret, all variants of the model started downloading to Colab as normal. Strangely, it didn't work with a newly created HF token with 'read only' permissions 🤔 or is that intentional?
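
For readers following the same steps, here is a minimal sketch of reading the token from a Colab secret and logging in. It assumes the secret is named `HF_TOKEN` and that `huggingface_hub` is installed; the helper names `load_hf_token` and `login_to_hub` are hypothetical. Outside Colab the sketch falls back to an environment variable of the same name.

```python
import os


def load_hf_token(secret_name="HF_TOKEN"):
    """Return the token from Colab secrets when available, else from the environment."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        return os.environ.get(secret_name)
    # Raises an error if the notebook has not been granted access to the secret.
    return userdata.get(secret_name)


def login_to_hub(token):
    """Register the token with huggingface_hub so later downloads are authenticated."""
    from huggingface_hub import login  # imported lazily; requires huggingface_hub

    login(token=token)
```

After `login_to_hub(load_hf_token())`, calls such as `AutoTokenizer.from_pretrained("google/gemma-2-9b-it")` use the stored token. The token can also be passed explicitly with the `token=` argument of `from_pretrained`.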

rajeevhuggingface87

Dec 26, 2024

@Kerassy

I have generated another token in Hugging Face with 'Write' access and provided it in both Google Colab and a Kaggle Notebook. But an error message is coming in both:
"Your request to access model google/gemma-2-9b-it is awaiting a review from the repo authors".

Madam "Lavanya KV" has provided me a link: https://www.kaggle.com/models/google/gemma?postConsentAction=download. At this link it is clearly mentioned that "You've consented to the license for Gemma" in the Kaggle environment, but I am still unable to access this model in the Kaggle environment too. I think Google's approval on Hugging Face is a must. Please note that I have access to a GPU accelerator in both Google Colab and Kaggle. There are no resource issues.

rajeevhuggingface87

Dec 30, 2024

@lkv , Good morning madam! What is the status?

lkv

Google org Dec 31, 2024

Hi @rajeevhuggingface87 ,
As an alternative, you can also request access to Gemma-2 models on Kaggle. Kindly try it and let us know if you have any concerns.

Thank you.

rajeevhuggingface87

Dec 31, 2024

@lkv , Madam, I tried to access Gemma-2 on Kaggle too, but it is also asking for Google's approval on Hugging Face. I mentioned the same concerns in my reply to @Kerassy .
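
To tell a pending access request apart from a token problem, one option is to request a small file from the repository and inspect the HTTP status. The sketch below uses only the standard library. The helper names are hypothetical, and the status mapping is an assumption about how the Hub responds for gated repositories (401 for a missing or invalid token, 403 when the token is valid but access has not been granted), so treat the result as a hint rather than a definitive diagnosis.

```python
import urllib.error
import urllib.request


def describe_status(status):
    """Map an HTTP status code to a likely access state (assumed mapping)."""
    if status == 200:
        return "access granted"
    if status == 401:
        return "token missing or invalid"
    if status == 403:
        return "access not granted (request may still be pending)"
    return f"unexpected status {status}"


def check_gated_access(repo_id, token, filename="config.json"):
    """Try to fetch one small file from the repo and report the outcome."""
    url = f"https://huggingface.co/{repo_id}/resolve/main/{filename}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    try:
        with urllib.request.urlopen(request) as response:
            return describe_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still informative.
        return describe_status(err.code)
```

For example, `check_gated_access("google/gemma-2-9b-it", token)` would be expected to report that access is not granted while the "awaiting a review" message quoted earlier in this thread still applies.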

lkv

Google org Jan 2, 2025

•

edited Jan 2, 2025

Hi @rajeevhuggingface87 ,

Could you please refer to this similar issue for more details?

Thank you

73d

May 9, 2025

•

edited May 12, 2025

after first logging into docker at the command line...

docker

I then try to run:
export HUGS_CACHE=~/.cache/hugs
mkdir -p "$HUGS_CACHE"
docker run -it --rm \
    --gpus all \
    --shm-size=16GB \
    -v "$HUGS_CACHE:/tmp" \
    -p 8080:80 \
    'hfhugs/nvidia-google-gemma-2-9b-it:0.2.0'

I did the docker login with no problem...
I already consented to the agreement with Kaggle...
but I still get:
"repository does not exist or may require 'docker login'"
