# Quantization Benchmarks

## Hunyuan-Instruct

Evaluation results for the Hunyuan-Instruct models with `BF16`, `FP8-Static`, `INT4-GPTQ`, and `INT4-AWQ` on `OlympiadBench`, `AIME 2024`, `DROP`, and `GPQA-Diamond`:

```{eval-rst}
.. table::
   :align: center
   :name: table-Hunyuan-Instruct-performance

   +-----------------------+--------------+---------------+-----------+---------+--------------+
   | Model                 | Quantization | OlympiadBench | AIME 2024 | DROP    | GPQA-Diamond |
   +=======================+==============+===============+===========+=========+==============+
   | Hunyuan-A13B-Instruct | BF16         | 82.70         | 87.30     | 91.10   | 71.20        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | FP8-Static   | 83.00         | 86.70     | 91.10   | -            |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-GPTQ    | 82.70         | 86.70     | 91.10   | -            |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-AWQ     | 82.60         | 85.60     | 91.00   | -            |
   +-----------------------+--------------+---------------+-----------+---------+--------------+
   | Hunyuan-7B-Instruct   | BF16         | 76.50         | 81.10     | 85.90   | 60.10        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | FP8-Static   | 76.60         | 80.90     | 86.00   | 60.10        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-GPTQ    | 76.20         | 81.00     | 85.70   | 60.00        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-AWQ     | 76.40         | 80.90     | 85.90   | 60.10        |
   +-----------------------+--------------+---------------+-----------+---------+--------------+
   | Hunyuan-4B-Instruct   | BF16         | 73.10         | 78.30     | 78.20   | 61.10        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | FP8-Static   | 73.10         | 76.60     | 78.30   | 60.20        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-GPTQ    | 72.90         | -         | 78.10   | 58.10        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-AWQ     | 72.80         | -         | 78.20   | -            |
   +-----------------------+--------------+---------------+-----------+---------+--------------+
   | Hunyuan-1.8B-Instruct | BF16         | 63.40         | 56.70     | 76.70   | 47.20        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | FP8-Static   | 62.50         | 55.20     | 75.10   | 47.70        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-GPTQ    | 60.90         | -         | 73.00   | 44.40        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-AWQ     | 61.70         | -         | 71.70   | 43.60        |
   +-----------------------+--------------+---------------+-----------+---------+--------------+
   | Hunyuan-0.5B-Instruct | BF16         | 29.60         | 17.20     | 52.80   | 23.30        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | FP8-Static   | 29.60         | 17.20     | 51.60   | 22.50        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-GPTQ    | 26.80         | -         | 50.90   | 23.30        |
   +                       +--------------+---------------+-----------+---------+--------------+
   |                       | INT4-AWQ     | 26.30         | -         | 48.90   | 23.30        |
   +-----------------------+--------------+---------------+-----------+---------+--------------+

```

## Qwen3

Evaluation results for the Qwen3 series models with `BF16`, `FP8-Static`, `FP8-Dynamic`, `INT8-Dynamic`, `INT4-GPTQ`, and `INT4-AWQ` on `CEVAL`, `MMLU`, `GSM8K`, and `HUMANEVAL`:

```{eval-rst}
.. table::
   :align: center
   :name: table-qwen3-performance

   +-------------------+--------------+---------+---------+---------+-----------+
   | Model             | Quantization | CEVAL   | MMLU    | GSM8K   | HUMANEVAL |
   +===================+==============+=========+=========+=========+===========+
   | Qwen3-0.6B        | BF16         | 45.84   | 47.21   | 42.99   | 19.51     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Static   | 45.99   | 46.87   | 38.06   | 18.90     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 45.99   | 46.93   | 38.29   | 20.73     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT8-Dynamic | 45.17   | 46.95   | 41.17   | 21.34     |
   +-------------------+--------------+---------+---------+---------+-----------+
   | Qwen3-1.7B        | BF16         | 60.33   | 59.77   | 68.69   | 40.85     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Static   | 61.07   | 59.39   | 68.01   | 38.41     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 60.77   | 59.88   | 67.10   | 34.76     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT8-Dynamic | 60.25   | 59.80   | 68.54   | 41.46     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-GPTQ    | 57.50   | 56.93   | -       | -         |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-AWQ     | 59.06   | 56.86   | -       | -         |
   +-------------------+--------------+---------+---------+---------+-----------+
   | Qwen3-4B          | BF16         | 72.66   | 69.99   | 85.37   | 72.56     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Static   | 72.14   | 69.93   | 83.70   | 73.17     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 70.80   | 70.08   | 83.40   | 69.51     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT8-Dynamic | 72.21   | 69.47   | 85.75   | 66.46     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-GPTQ    | 70.06   | 68.59   | 81.65   | -         |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-AWQ     | 70.36   | 67.62   | 80.59   | -         |
   +-------------------+--------------+---------+---------+---------+-----------+
   | Qwen3-8B          | BF16         | 79.27   | 74.78   | 87.79   | 63.41     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Static   | 78.23   | 74.79   | 86.96   | 62.20     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 78.45   | 74.75   | 87.64   | 62.80     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT8-Dynamic | 78.01   | 74.84   | 86.96   | 67.07     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-GPTQ    | 77.19   | 73.26   | 86.43   | 62.20     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-AWQ     | 76.15   | 73.59   | 86.96   | 63.41     |
   +-------------------+--------------+---------+---------+---------+-----------+
   | Qwen3-14B         | BF16         | 83.06   | 78.90   | 88.40   | 55.49     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Static   | 82.62   | 78.57   | 89.46   | 57.32     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 82.24   | 78.92   | 88.32   | 52.44     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT8-Dynamic | 81.87   | 78.13   | 86.28   | 56.10     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-GPTQ    | 81.05   | 78.02   | 87.34   | 57.93     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-AWQ     | 82.02   | 77.68   | 84.23   | 61.59     |
   +-------------------+--------------+---------+---------+---------+-----------+
   | Qwen3-30B-A3B     | BF16         | 83.66   | 79.36   | 89.99   | 31.71     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Static   | 83.95   | 79.47   | 89.01   | 31.10     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 84.10   | 79.40   | 89.16   | 32.93     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT8-Dynamic | 83.36   | 79.48   | 89.16   | 34.15     |
   +-------------------+--------------+---------+---------+---------+-----------+
   | Qwen3-32B         | BF16         | 86.55   | 82.00   | 74.53   | 37.80     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Static   | 86.92   | 81.78   | 70.20   | 39.63     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 86.55   | 81.89   | 70.43   | 38.41     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-GPTQ    | 86.18   | 81.01   | -       | 43.29     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-AWQ     | 86.18   | 81.54   | -       | 36.59     |
   +-------------------+--------------+---------+---------+---------+-----------+
   | Qwen3-235B-A22B   | BF16         | 89.60   | 86.28   | 85.29   | 27.44     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Static   | 89.67   | 86.19   | 86.96   | 27.44     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 89.67   | 86.18   | 85.22   | 28.05     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT8-Dynamic | 88.93   | 86.20   | 86.20   | 23.78     |
   +-------------------+--------------+---------+---------+---------+-----------+
   | QwQ-32B           | BF16         | 85.74   | 82.03   | 73.31   | 42.68     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Static   | 85.44   | 81.91   | 75.36   | 42.68     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 85.07   | 81.93   | 75.66   | 42.07     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT8-Dynamic | 86.40   | 81.97   | 74.37   | 45.73     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-GPTQ    | 84.03   | 81.26   | 68.23   | 45.73     |
   +                   +--------------+---------+---------+---------+-----------+
   |                   | INT4-AWQ     | 83.58   | 81.01   | 68.69   | 43.29     |
   +-------------------+--------------+---------+---------+---------+-----------+

```

## Qwen2.5VL

Evaluation results for the Qwen2.5VL series models with `BF16`, `FP8-Static`, `FP8-Dynamic`, `FP8-Static-ViT`, `FP8-Dynamic-ViT`, `INT4-GPTQ`, and `INT4-AWQ` on `MMMU_VAL`, `DocVQA_VAL`, and `ChartQA_TEST`:

```{eval-rst}
.. table::
   :align: center
   :name: table-qwen2.5vl-performance

   +-------------------+------------------+----------+------------+--------------+
   | Model             | Quantization     | MMMU_VAL | DocVQA_VAL | ChartQA_TEST |
   +===================+==================+==========+============+==============+
   | Qwen2.5VL-3B      | BF16             | 47.11    | 78.57      | 80.32        |
   +                   +------------------+----------+------------+--------------+
   |                   | FP8-Static       | 47.33    | 79.34      | 79.68        | 
   +                   +------------------+----------+------------+--------------+
   |                   | FP8-Dynamic      | 47.00    | 78.92      | 79.60        | 
   +                   +------------------+----------+------------+--------------+
   |                   | FP8-Static-ViT   | 45.56    | 79.36      | 80.16        | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT8-Dynamic-ViT | 46.67    | 79.26      | 79.84        | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT4-GPTQ        | 46.56    | 77.20      | 78.96        | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT4-AWQ         | 45.78    | -          | 79.60        | 
   +-------------------+------------------+----------+------------+--------------+
   | Qwen2.5VL-7B      | BF16             | 45.44    | 89.71      | 84.64        |
   |                   +------------------+----------+------------+--------------+
   |                   | FP8-Static       | 47.00    | 89.83      | 85.92        | 
   +                   +------------------+----------+------------+--------------+
   |                   | FP8-Dynamic      | 47.22    | 89.80      | 88.64        | 
   +                   +------------------+----------+------------+--------------+
   |                   | FP8-Static-ViT   | 47.00    | 89.85      | 86.88        | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT8-Dynamic-ViT | 46.44    | 89.68      | 88.72        | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT4-GPTQ        | 46.67    | 90.45      | -            | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT4-AWQ         | 45.67    | 89.28      | -            | 
   +-------------------+------------------+----------+------------+--------------+
   | Qwen2.5VL-32B     | BF16             | 57.00    | 90.03      | -            |
   |                   +------------------+----------+------------+--------------+
   |                   | FP8-Static       | 57.00    | 89.88      | -            | 
   +                   +------------------+----------+------------+--------------+
   |                   | FP8-Dynamic      | 56.44    | 89.88      | -            | 
   +                   +------------------+----------+------------+--------------+
   |                   | FP8-Static-ViT   | 56.33    | 89.92      | -            | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT8-Dynamic-ViT | 57.22    | 89.88      | -            | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT4-GPTQ        | 55.22    | 89.80      | -            | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT4-AWQ         | 55.22    | 90.30      | -            | 
   +-------------------+------------------+----------+------------+--------------+
   | Qwen2.5VL-72B     | BF16             | 58.78    | 94.39      | 85.60        |
   |                   +------------------+----------+------------+--------------+
   |                   | FP8-Static       | 57.89    | 94.41      | 85.84        | 
   +                   +------------------+----------+------------+--------------+
   |                   | FP8-Dynamic      | 58.67    | 94.38      | 85.60        | 
   +                   +------------------+----------+------------+--------------+
   |                   | FP8-Static-ViT   | 57.44    | 94.48      | 85.84        | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT8-Dynamic-ViT | 58.22    | 94.47      | 86.00        | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT4-GPTQ        | 57.56    | 94.46      | 86.48        | 
   +                   +------------------+----------+------------+--------------+
   |                   | INT4-AWQ         | 58.78    | 94.19      | 87.28        | 
   +-------------------+------------------+----------+------------+--------------+
```


## DeepSeek-R1-0528

Evaluation results for the DeepSeek-R1-0528 model with `FP8-Block-Wise` and `W4A8-FP8` on `GPQA Diamond`, `AIME 2024`, `SimpleQA`, and `LiveCodeBench`:

```{eval-rst}
.. table::
   :align: center
   :name: table-DeepSeek-R1-0528-performance

   +-----------------------+----------------+--------------+-----------+----------+---------------+
   | Model                 | Quantization   | GPQA Diamond | AIME 2024 | SimpleQA | LiveCodeBench |
   +=======================+================+==============+===========+==========+===============+
   | DeepSeek-R1-0528      | FP8-Block-Wise |    78.28     |   88.67   |   27.80  |     77.1      |
   +                       +----------------+--------------+-----------+----------+---------------+
   |                       | W4A8-FP8       |    77.37     |   88.67   |   26.83  |     78.86     |
   +-----------------------+----------------+--------------+-----------+----------+---------------+
```

## Seed-OSS-36B-Instruct

Evaluation results for the Seed-OSS-36B-Instruct model with `BF16`, `FP8-Static`, and `FP8-Dynamic` on `CEVAL`, `MMLU`, `GSM8K` (strict and flexible match), and `HUMANEVAL`:

```{eval-rst}
.. table::
   :align: center
   :name: table-seed-oss-36b-performance

   +-------------------------+----------------+---------+--------+----------------+------------------+-------------+
   | Model                   | Quantization   | CEVAL   | MMLU   | GSM8K-strict   | GSM8K-flexible   | HUMANEVAL   |
   +=========================+================+=========+========+================+==================+=============+
   | Seed-OSS-36B-Instruct   | BF16           | 88.19   | 82.97  | 70.36          | 97.12            | 87.20       |
   +                         +----------------+---------+--------+----------------+------------------+-------------+
   |                         | FP8-Static     | 87.82   | 82.79  | 74.75          | 96.51            | 86.59       |
   +                         +----------------+---------+--------+----------------+------------------+-------------+
   |                         | FP8-Dynamic    | 87.82   | 82.64  | 74.15          | 96.89            | 87.20       |
   +-------------------------+----------------+---------+--------+----------------+------------------+-------------+

```

These results were measured with [lm-eval](https://github.com/EleutherAI/lm-evaluation-harness). Note that `--gen_kwargs max_gen_toks` must be set so that long reasoning output is not truncated.
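As a rough illustration of the note above, the following sketch assembles an lm-eval command line that passes `max_gen_toks` through `--gen_kwargs`. The `vllm` backend, the model path, the `gsm8k` task, and the value `8192` are placeholders chosen for illustration; the settings actually used for the table are not stated here.

```python
import shlex


def build_lm_eval_command(model_path, tasks, max_gen_toks=8192):
    """Assemble an lm-eval CLI invocation as an argument list.

    The backend ("vllm") and the default token budget are assumptions
    for illustration, not the settings used to produce the table.
    """
    return [
        "lm_eval",
        "--model", "vllm",
        "--model_args", f"pretrained={model_path}",
        "--tasks", ",".join(tasks),
        # Raise the generation budget so long reasoning is not cut off.
        "--gen_kwargs", f"max_gen_toks={max_gen_toks}",
    ]


# Hypothetical model path; replace with the quantized checkpoint location.
cmd = build_lm_eval_command("/path/to/Seed-OSS-36B-Instruct-FP8", ["gsm8k"])
print(shlex.join(cmd))
```

Building the command as a list keeps each flag and value separate, so it can be passed to `subprocess.run` directly without shell quoting issues.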


## GLM-4.6

Evaluation results for the GLM-4.6 model with `BF16`, `FP8-Static`, and `FP8-Dynamic` on `CEVAL`, `GSM8K`, and `HUMANEVAL`:

```{eval-rst}
.. table::
   :align: center
   :name: glm-4.6-performance

   +-------------------+--------------+---------+---------+-----------+
   | Model             | Quantization | CEVAL   | GSM8K   | HUMANEVAL |
   +===================+==============+=========+=========+===========+
   | GLM-4.6           | BF16         | 82.6    | 93.71   | 73.78     |
   +                   +--------------+---------+---------+-----------+
   |                   | FP8-Static   | 83.14   | 93.86   | 66.46     |
   +                   +--------------+---------+---------+-----------+
   |                   | FP8-Dynamic  | 82.91   | 93.71   | 63.41     |
   +-------------------+--------------+---------+---------+-----------+

```
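When reading a table like the one above, it can help to look at each quantized score relative to the BF16 baseline. The sketch below does this for the GLM-4.6 rows; the scores are copied from the table, while the helper name `delta_vs_baseline` is made up for illustration.

```python
# Scores copied from the GLM-4.6 table above.
scores = {
    "BF16": {"CEVAL": 82.6, "GSM8K": 93.71, "HUMANEVAL": 73.78},
    "FP8-Static": {"CEVAL": 83.14, "GSM8K": 93.86, "HUMANEVAL": 66.46},
    "FP8-Dynamic": {"CEVAL": 82.91, "GSM8K": 93.71, "HUMANEVAL": 63.41},
}


def delta_vs_baseline(scores, baseline="BF16"):
    """Return per-benchmark score differences against the baseline row."""
    base = scores[baseline]
    return {
        quant: {bench: round(vals[bench] - base[bench], 2) for bench in base}
        for quant, vals in scores.items()
        if quant != baseline
    }


deltas = delta_vs_baseline(scores)
for quant, row in deltas.items():
    print(quant, row)
```

A negative value means the quantized model scored below BF16 on that benchmark; a positive value means it scored above.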


## Other Models

Evaluation results for other models with `BF16`, `FP8-Static`, `FP8-Dynamic`, `INT4-GPTQ`, and `INT4-AWQ` on `CEVAL`, `MMLU`, and `GSM8K`:

```{eval-rst}
.. table::
   :align: center
   :name: table-other-performance

   +-------------------------------+--------------+---------+---------+---------+
   | Model                         | Quantization | CEVAL   | MMLU    | GSM8K   |
   +===============================+==============+=========+=========+=========+
   | Qwen2.5-1.5B-Instruct         | BF16         | 67.01   | 60.05   | 54.28   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Static   | 66.27   | 60.23   | -       |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Dynamic  | 66.79   | 60.08   | 51.71   |
   +-------------------------------+--------------+---------+---------+---------+
   | Qwen2.5-7B-Instruct           | BF16         | 81.20   | 74.55   | 79.98   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Static   | 81.13   | 74.03   | 79.30   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Dynamic  | 80.31   | 74.07   | 79.00   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-GPTQ    | 79.05   | 73.05   | 74.75   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-AWQ     | 79.35   | 73.22   | 79.38   |
   +-------------------------------+--------------+---------+---------+---------+
   | Qwen2.5-32B-Instruct          | BF16         | 87.30   | 83.21   | 81.73   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Static   | 87.59   | 83.08   | 81.58   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Dynamic  | 87.30   | 83.04   | 81.58   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-GPTQ    | 86.70   | 82.45   | 82.03   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-AWQ     | 87.00   | 82.64   | -       |
   +-------------------------------+--------------+---------+---------+---------+
   | DeepSeek-R1-Distill-Qwen-1.5B | BF16         | 37.22   | 36.63   | 67.02   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Static   | 35.44   | 37.41   | -       |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Dynamic  | 35.96   | 36.12   | 64.75   |
   +-------------------------------+--------------+---------+---------+---------+
   | DeepSeek-R1-Distill-Qwen-7B   | BF16         | 53.49   | 53.80   | 75.74   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Static   | 53.57   | 54.17   | 76.19   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Dynamic  | 52.97   | 54.13   | 74.15   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-GPTQ    | 51.86   | 52.44   | 75.89   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-AWQ     | 53.49   | 53.70   | -       |
   +-------------------------------+--------------+---------+---------+---------+
   | DeepSeek-R1-Distill-Qwen-14B  | BF16         | 77.71   | 74.28   | 85.67   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Static   | 77.56   | 74.66   | 86.73   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Dynamic  | 76.82   | 74.63   | 87.11   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-GPTQ    | 74.29   | 72.37   | 84.61   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-AWQ     | 74.81   | 73.00   | 86.05   |
   +-------------------------------+--------------+---------+---------+---------+
   | DeepSeek-R1-Distill-Qwen-32B  | BF16         | 84.18   | 80.89   | 87.41   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Static   | 83.43   | 80.90   | 87.57   |
   +                               +--------------+---------+---------+---------+
   |                               | FP8-Dynamic  | 83.73   | 81.10   | 86.43   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-GPTQ    | 84.10   | 79.80   | 86.73   |
   +                               +--------------+---------+---------+---------+
   |                               | INT4-AWQ     | 82.84   | 80.15   | 87.19   |
   +-------------------------------+--------------+---------+---------+---------+

```


## INT4-GPTAQ

Evaluation results for INT4-GPTAQ on `GSM8K`, `HUMANEVAL`, and `GPQA Diamond`:

```{eval-rst}
.. table::
   :align: center
   :name: table-INT4-GPTAQ-performance

   +-----------+--------------+-------+-----------+--------------+
   | Model     | Quantization | GSM8K | HUMANEVAL | GPQA Diamond |
   +===========+==============+=======+===========+==============+
   | Qwen3-4B  | BF16         | 85.37 | 72.56     | 37.88        |
   +           +--------------+-------+-----------+--------------+
   |           | INT4-GPTQ    | 81.65 | 61.59     | 35.35        |
   +           +--------------+-------+-----------+--------------+
   |           | INT4-GPTAQ   | 82.56 | 64.02     | 39.39        |
   +-----------+--------------+-------+-----------+--------------+
   | Qwen3-8B  | BF16         | 87.79 | 63.41     | 32.32        |
   +           +--------------+-------+-----------+--------------+
   |           | INT4-GPTQ    | 86.43 | 62.20     | 34.85        |
   +           +--------------+-------+-----------+--------------+
   |           | INT4-GPTAQ   | 86.66 | 64.02     | 33.33        |
   +-----------+--------------+-------+-----------+--------------+
   | Qwen3-32B | BF16         | 74.53 | 37.80     | 40.40        |
   +           +--------------+-------+-----------+--------------+
   |           | INT4-GPTQ    | 65.58 | 43.29     | 40.40        |
   +           +--------------+-------+-----------+--------------+
   |           | INT4-GPTAQ   | 69.52 | 37.20     | -            |
   +-----------+--------------+-------+-----------+--------------+
```


## NVFP4

Evaluation results for NVFP4 on `GSM8K`, `MMLU`, and `GPQA Diamond`:

```{eval-rst}
.. table::
   :align: center
   :name: table-NVFP4-performance

   +-----------------+--------------+-------+-------+--------------+
   | Model           | Quantization | GSM8K | MMLU  | GPQA Diamond |
   +=================+==============+=======+=======+==============+
   | Qwen3-32B       | BF16         | 67.06 | 81.72 | 54.04        |
   +                 +--------------+-------+-------+--------------+
   |                 | NVFP4        | 69.87 | 80.74 | 56.06        |
   +-----------------+--------------+-------+-------+--------------+
   | Qwen3-235B-A22B | BF16         | 96.63 | 62.73 | 60.60        |
   +                 +--------------+-------+-------+--------------+
   |                 | NVFP4        | 96.17 | 62.09 | 60.10        |
   +-----------------+--------------+-------+-------+--------------+
```


## Qwen3VL

Evaluation results for the Qwen3VL series models with `BF16`, `FP8-Static`, and `FP8-Dynamic` on `MMMU_VAL`, `DocVQA_VAL`, and `ChartQA_TEST`:

```{eval-rst}
.. table::
   :align: center
   :name: table-qwen3vl-performance

   +---------------------------+------------------+----------+------------+--------------+
   | Model                     | Quantization     | MMMU_VAL | DocVQA_VAL | ChartQA_TEST |
   +===========================+==================+==========+============+==============+
   | Qwen3-VL-32B-Instruct     | BF16             | 60.11    | 96.08      | 94.64        |
   +                           +------------------+----------+------------+--------------+
   |                           | FP8-Static       | 61.22    | 96.00      | 94.64        | 
   +                           +------------------+----------+------------+--------------+
   |                           | FP8-Dynamic      | 60.78    | 96.19      | 94.72        | 
   +---------------------------+------------------+----------+------------+--------------+
   | Qwen3-VL-30B-A3B-Instruct | BF16             | 50.44    | 95.28      | 95.36        |
   +                           +------------------+----------+------------+--------------+
   |                           | FP8-Dynamic      | 50.67    | 95.25      | 95.20        | 
   +---------------------------+------------------+----------+------------+--------------+
```

FP8-Dynamic here uses block-wise quantization. Launch command: `python3 tools/fp8_quant_blockwise.py --block_size --input_path --output_path`
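To make the block-wise idea concrete, here is a minimal pure-Python sketch of computing one scale per square tile of a weight matrix. It does not reproduce `tools/fp8_quant_blockwise.py`: the function names, the 2x2 tile size, and the toy weights are assumptions for illustration, and the actual rounding to FP8 is omitted. The constant 448 is the largest finite value in the FP8 E4M3 format.

```python
FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def blockwise_scales(weight, block_size):
    """Return one scale per (block_size x block_size) tile of a 2-D list."""
    rows, cols = len(weight), len(weight[0])
    scales = []
    for r0 in range(0, rows, block_size):
        row_scales = []
        for c0 in range(0, cols, block_size):
            absmax = max(
                abs(weight[r][c])
                for r in range(r0, min(r0 + block_size, rows))
                for c in range(c0, min(c0 + block_size, cols))
            )
            # Fall back to 1.0 for all-zero tiles to avoid dividing by zero.
            row_scales.append(absmax / FP8_E4M3_MAX if absmax > 0 else 1.0)
        scales.append(row_scales)
    return scales


def scale_to_fp8_range(weight, scales, block_size):
    """Divide each element by its tile's scale so it fits in [-448, 448]."""
    return [
        [w / scales[r // block_size][c // block_size] for c, w in enumerate(row)]
        for r, row in enumerate(weight)
    ]


# Toy 4x4 weight matrix split into four 2x2 tiles (hypothetical values).
weight = [
    [0.5, -1.0, 8.0, 2.0],
    [0.25, 0.75, -4.0, 1.0],
    [3.0, 1.5, 0.1, -0.2],
    [-6.0, 2.0, 0.05, 0.4],
]
scales = blockwise_scales(weight, block_size=2)
scaled = scale_to_fp8_range(weight, scales, block_size=2)
print(scales)
```

Because each tile gets its own scale, a large value in one tile (8.0 in the top-right tile here) does not reduce the resolution available to tiles with small values.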


## Qwen3-Omni

**Qwen3-Omni Text -> Text Benchmark**

Evaluation results for the Qwen3-Omni model with `BF16`, `FP8-Static`, and `FP8-Dynamic` on `aime25`, `gpqa_diamond`, and `mmlu_redux`:

```{eval-rst}
.. table::
   :align: center
   :name: table-qwen3-omni-performance

   +-----------------------------+----------------+----------+--------------+------------+
   | Model                       | Quantization   | aime25   | gpqa_diamond | mmlu_redux |
   +=============================+================+==========+==============+============+
   | Qwen3-Omni-30B-A3B-Instruct | BF16           | 73.32    | 56.77        | 88.09      |
   +                             +----------------+----------+--------------+------------+
   |                             | FP8-Static     | 71.33    | 56.57        | 87.91      | 
   +                             +----------------+----------+--------------+------------+
   |                             | FP8-Dynamic    | 73.33    | 55.15        | 88.07      | 
   +-----------------------------+----------------+----------+--------------+------------+
```