Fix KeyError Bug to support in SGLang

#13

by jonahbernard - opened Nov 13, 2025

base: refs/heads/main ← from: refs/pr/13

+10 -8

Fix KeyError Bug to support in SGLang (commit 0e5dc22d)

jonahbernard

Nov 13, 2025 (edited Nov 13, 2025)

I'm trying to support this model in SGLang. This PR needs to be approved before another PR can be merged into SGLang (https://github.com/sgl-project/sglang/pull/13166).

Arguments that were never passed in kwargs are being popped, which raises a KeyError.
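The failure mode described here can be reproduced in isolation. The sketch below is not the Eagle2.5 image processor code; `preprocess_strict` and `preprocess_tolerant` are hypothetical stand-ins, and `resample` and `size` are just example keys. It only shows how `dict.pop` behaves with and without a default.

```python
# Hypothetical stand-ins for a preprocessing helper; not the actual Eagle2.5 code.

def preprocess_strict(**kwargs):
    # pop() without a default raises KeyError when the caller omitted the key.
    resample = kwargs.pop("resample")
    return resample, kwargs


def preprocess_tolerant(**kwargs):
    # pop() with a default returns None instead of raising.
    resample = kwargs.pop("resample", None)
    return resample, kwargs


try:
    preprocess_strict(size=448)
    strict_failed = False
except KeyError:
    strict_failed = True

tolerant_result = preprocess_tolerant(size=448)
print(strict_failed)    # True
print(tolerant_result)  # (None, {'size': 448})
```

Passing a default to `pop` is one common way to tolerate a missing key; whether `None` is a sensible fallback depends on what the rest of the function expects.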

The _further_process_kwargs function now uses 'interpolation' instead of 'resample'.
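One way to cope with a key that may arrive under either name is to look for both. This is a minimal sketch under assumptions: `pick_interpolation` is a hypothetical helper, not a function from transformers or from this PR, and the string values are placeholders for whatever resampling objects the real code passes around.

```python
def pick_interpolation(kwargs):
    """Return the resampling setting from whichever key is present (hypothetical helper)."""
    # Remove both keys so neither leaks into later calls, and tolerate either being absent.
    value = kwargs.pop("interpolation", None)
    legacy = kwargs.pop("resample", None)
    # Prefer the newer name and fall back to the older one.
    return value if value is not None else legacy


new_style = {"interpolation": "bicubic", "size": 448}  # placeholder values
old_style = {"resample": "bicubic", "size": 448}

print(pick_interpolation(new_style), new_style)
print(pick_interpolation(old_style), old_style)
```

After either call the dictionary holds only the unrelated keys, so code further down does not need to know which name was used.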

cg1177

Nov 15, 2025

We encountered a similar issue when enabling Eagle support for vLLM. Is this primarily a compatibility problem stemming from Hugging Face?

make backwards compatible after transformers 4.56.0 upgrade (commit 933df48a)

jonahbernard

Nov 15, 2025

Yeah, it is.

The Eagle2.5 GitHub repo requires transformers==4.51.0, but SGLang requires transformers==4.57.1. transformers 4.56.0 and later include a commit that breaks image_processing_eagle2_5_vl_fast.py at the lines I am fixing in this PR. This is the commit: https://github.com/huggingface/transformers/commit/f690a2a1e09e8a8c7b04cc050ef24838c609060b.
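To see which side of that boundary a given pin falls on, a version string can be compared against the 4.56.0 threshold mentioned above. This is a rough, standard-library-only sketch; `parse_version` and `is_affected` are hypothetical names, and the parser deliberately ignores pre-release suffixes.

```python
import re


def parse_version(text):
    """Parse the leading 'major.minor[.patch]' of a version string into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_affected(transformers_version):
    # The 4.56.0 threshold is the one stated in this thread.
    return parse_version(transformers_version) >= (4, 56, 0)


print(is_affected("4.51.0"))  # False: the Eagle2.5 repo pin
print(is_affected("4.57.1"))  # True: the SGLang pin
```

In a real environment the installed version string could be obtained with `importlib.metadata.version("transformers")` and passed to `is_affected`.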

I updated my PR to be backwards-compatible, so you don't have to update the Eagle2.5 GitHub repo, and SGLang and vLLM can use a newer version of HF Transformers.

cg1177

Nov 16, 2025

I see. Thanks for your support, @jonahbernard!

Zhiding changed pull request status to merged Nov 29, 2025
