Speculative Sampling Benchmark

Eagle3

1. Qwen3 Series Models

| Model | Method | GSM8K throughput (tokens/s) | GSM8K accept length | Alpaca throughput (tokens/s) | Alpaca accept length | HumanEval throughput (tokens/s) | HumanEval accept length | MT-bench throughput (tokens/s) | MT-bench accept length | Mean throughput (tokens/s) | Mean accept length |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen3-1.7B | Vanilla | 376.42 | 1 | 378.86 | 1 | 378.38 | 1 | 390.53 | 1 | 381.05 | 1 |
| Qwen3-1.7B | Eagle3 | 616.9 | 2.13 | 653.29 | 2.19 | 680.1 | 2.2 | 621.44 | 2.17 | 642.93 | 2.17 |
| Qwen3-4B | Vanilla | 229.05 | 1 | 235.29 | 1 | 234.66 | 1 | 234.04 | 1 | 233.26 | 1 |
| Qwen3-4B | Eagle3 | 389.35 | 2.07 | 395.97 | 2.1 | 377.84 | 2.08 | 384.6 | 2.07 | 386.94 | 2.08 |
| Qwen3-8B | Vanilla | 149.63 | 1 | 149.93 | 1 | 153.85 | 1 | 153.81 | 1 | 151.81 | 1 |
| Qwen3-8B | Eagle3 | 257.32 | 2 | 266.69 | 2.02 | 244.89 | 1.97 | 258.2 | 1.97 | 257.52 | 1.99 |
| Qwen3-14B | Vanilla | 92.97 | 1 | 92.66 | 1 | 92.94 | 1 | 94.46 | 1 | 93.26 | 1 |
| Qwen3-14B | Eagle3 | 153.72 | 1.87 | 140.46 | 1.78 | 144.68 | 1.76 | 142.45 | 1.74 | 145.33 | 1.79 |
| Qwen3-32B | Vanilla | 43.39 | 1 | 43.38 | 1 | 43.19 | 1 | 43.3 | 1 | 43.32 | 1 |
| Qwen3-32B | Eagle3 | 80.43 | 2.01 | 72.49 | 1.9 | 71.57 | 1.86 | 74.1 | 1.86 | 74.1 | 1.91 |
| Qwen3-30B-A3B | Vanilla | 311.84 | 1 | 320.43 | 1 | 325.77 | 1 | 325.42 | 1 | 320.87 | 1 |
| Qwen3-30B-A3B | Eagle3 | 453.97 | 2.1 | 432.45 | 2.04 | 428.81 | 2.02 | 437.06 | 2.01 | 438.07 | 2.04 |
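
The Mean throughput column can be turned into a speedup figure by dividing the Eagle3 value by the Vanilla value for the same model. The following is a minimal sketch of that arithmetic; the numbers are copied from the Qwen3 table above, while the names `MEAN_THROUGHPUT`, `speedup`, and `speedup_table` are illustrative assumptions and not part of any benchmark tooling.

```python
# Mean throughput (tokens/s) copied from the Qwen3 table: (Vanilla, Eagle3).
# The dictionary and function names are hypothetical, chosen for illustration.
MEAN_THROUGHPUT = {
    "Qwen3-1.7B": (381.05, 642.93),
    "Qwen3-4B": (233.26, 386.94),
    "Qwen3-8B": (151.81, 257.52),
    "Qwen3-14B": (93.26, 145.33),
    "Qwen3-32B": (43.32, 74.1),
    "Qwen3-30B-A3B": (320.87, 438.07),
}


def speedup(vanilla: float, eagle3: float) -> float:
    """Return the ratio of Eagle3 throughput to vanilla throughput."""
    return eagle3 / vanilla


def speedup_table(data: dict) -> dict:
    """Map each model name to its throughput speedup, rounded to 2 decimals."""
    return {model: round(speedup(v, e), 2) for model, (v, e) in data.items()}


if __name__ == "__main__":
    for model, ratio in speedup_table(MEAN_THROUGHPUT).items():
        print(f"{model}: {ratio:.2f}x")
```

The same calculation applies to any per-dataset throughput pair in the tables on this page. Note that the speedup is generally smaller than the accept length, since the draft model and verification add overhead per step.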

2. VLM Models

2.1 Qwen3-VL Series Models

| Model | Method | GSM8K throughput (tokens/s) | GSM8K accept length | Alpaca throughput (tokens/s) | Alpaca accept length | HumanEval throughput (tokens/s) | HumanEval accept length | MT-bench throughput (tokens/s) | MT-bench accept length | MATH-500 throughput (tokens/s) | MATH-500 accept length | MMMU throughput (tokens/s) | MMMU accept length | MMStar throughput (tokens/s) | MMStar accept length | Mean throughput (tokens/s) | Mean accept length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen3-VL-2B-Instruct | Vanilla | 348.55 | 1 | 350.9 | 1 | 346.07 | 1 | 346.31 | 1 | 82.96 | 1 | 83.27 | 1 | 81.63 | 1 | 234.24 | 1 |
| Qwen3-VL-2B-Instruct | Eagle3 | 511.52 | 2.11 | 560.55 | 2.26 | 826.01 | 3.39 | 555.22 | 2.29 | 163.09 | 2.57 | 154.18 | 2.55 | 139.73 | 2.31 | 415.76 | 2.5 |
| Qwen3-VL-4B-Instruct | Vanilla | 212.87 | 1 | 213.24 | 1 | 211.69 | 1 | 212.1 | 1 | 67.96 | 1 | 65.88 | 1 | 67.75 | 1 | 150.21 | 1 |
| Qwen3-VL-4B-Instruct | Eagle3 | 415.29 | 2.57 | 372.89 | 2.26 | 459.37 | 2.82 | 382.33 | 2.34 | 141.87 | 2.72 | 104.44 | 2.05 | 107.07 | 2.1 | 283.32 | 2.41 |
| Qwen3-VL-30B-A3B-Instruct | Vanilla | 179.94 | 1 | 184.6 | 1 | 168.68 | 1 | 180.57 | 1 | 31.08 | 1 | 31.51 | 1 | 30.93 | 1 | 115.33 | 1 |
| Qwen3-VL-30B-A3B-Instruct | Eagle3 | 281.93 | 2.82 | 241.42 | 2.13 | 223.05 | 2.57 | 240.47 | 2.19 | 75.31 | 2.79 | 48.47 | 1.78 | 52.57 | 1.94 | 166.17 | 2.32 |

2.2 HunyuanOCR Model

| Model | Method | OmniDocBench throughput (tokens/s) | OmniDocBench accept length |
|---|---|---|---|
| Hunyuan-OCR | Vanilla | 70.12 | 1 |
| Hunyuan-OCR | Eagle3 | 108.1 | 2.08 |

3. Audio Models

3.1 Qwen2-Audio Model

| Model | Method | LibriSpeech throughput (tokens/s) | LibriSpeech accept length |
|---|---|---|---|
| Qwen2-Audio | Vanilla | 78.76 | 1 |
| Qwen2-Audio | Eagle3 | 146.66 | 3.51 |

3.2 Fun-CosyVoice3 Model

| Model | Method | LibriTTS throughput (tokens/s) | LibriTTS accept length |
|---|---|---|---|
| Fun-CosyVoice3 | Vanilla | - | 1 |
| Fun-CosyVoice3 | Eagle3 | - | 1.96 |