Qwen3-8B-speculator.eagle3

Model Overview

Verifier: Qwen/Qwen3-8B
Speculative Decoding Algorithm: EAGLE-3
Model Architecture: Eagle3Speculator
Release Date: 07/27/2025
Version: 1.0
Model Developers: RedHat

This is a speculator model designed for use with Qwen/Qwen3-8B, based on the EAGLE-3 speculative decoding algorithm. It was trained using the speculators library on a combination of the Aeala/ShareGPT_Vicuna_unfiltered and the HuggingFaceH4/ultrachat_200k datasets. The model was trained with thinking turned disabled. This model should be used with the Qwen/Qwen3-8B chat template, specifically through the /chat/completions endpoint.

Use with vLLM

vllm serve Qwen/Qwen3-8B \
 -tp 1 \
 --speculative-config '{
 "model": "RedHatAI/Qwen3-8B-speculator.eagle3",
 "num_speculative_tokens": 3,
 "method": "eagle3"
 }'