Adding New Models and Algorithms

Adding a New Model

Taking Qwen as an example: every newly defined model inherits from BaseModel and is registered with the @SlimModelFactory.register decorator, as shown below:

@SlimModelFactory.register
class Qwen(BaseModel):
    def __init__(
        self,
        model=None,
        model_path=None,
        deploy_backend="vllm",
    ):
        super().__init__(
            model=model,
            model_path=model_path,
            deploy_backend=deploy_backend,
        )

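SlimModelFactory automatically maintains a global list of registered models. To create a model, pass the name of its class (for example, "Qwen") to SlimModelFactory.create: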

self.slim_model = SlimModelFactory.create(
    "Qwen", model=self.model, deploy_backend=deploy_backend
)
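
The register-then-create-by-name flow described above can be illustrated with a small, self-contained sketch. This is not the SlimModelFactory source: ToyModelFactory, ToyBaseModel, MyModel, and the model path are hypothetical stand-ins chosen for illustration, and the real factory may store or validate entries differently.

```python
# Hypothetical stand-in for SlimModelFactory; the real implementation may differ.
class ToyModelFactory:
    registry = {}  # maps class name -> class object

    @classmethod
    def register(cls, model_cls):
        # Used as a class decorator: record the class under its own name
        # and return it unchanged so the definition still works normally.
        cls.registry[model_cls.__name__] = model_cls
        return model_cls

    @classmethod
    def create(cls, name, **kwargs):
        # Look the class up by name and forward the keyword arguments.
        if name not in cls.registry:
            raise ValueError(
                f"Unknown model {name!r}; registered: {sorted(cls.registry)}"
            )
        return cls.registry[name](**kwargs)


# Hypothetical stand-in for BaseModel with the same constructor arguments
# as the Qwen example.
class ToyBaseModel:
    def __init__(self, model=None, model_path=None, deploy_backend="vllm"):
        self.model = model
        self.model_path = model_path
        self.deploy_backend = deploy_backend


@ToyModelFactory.register
class MyModel(ToyBaseModel):
    pass


# The path below is a placeholder, not a real checkpoint.
slim_model = ToyModelFactory.create(
    "MyModel", model_path="/tmp/my-model", deploy_backend="vllm"
)
print(type(slim_model).__name__, slim_model.deploy_backend)
```

Because the decorator keys the registry on the class name, the string passed to create must match the class name exactly.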

Adding a New Compression Algorithm

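Quantization currently supports strategies such as FP8, INT8, and INT4. Compression algorithms are registered uniformly through CompressorFactory; for example, PTQ is registered as follows: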

@CompressorFactory.register
class PTQ:
    def __init__(self, model, slim_config=None):
        self.quant_model = model
        ...

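CompressorFactory automatically maintains a global list of registered compression algorithms. To create a compressor, pass the name of its class to CompressorFactory.create; for example, PTQ is created as follows: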

slim_config = {
    "global_config": global_config,
    "compress_config": compress_config,
}
self.compressor = CompressorFactory.create(
    "PTQ", self.slim_model, slim_config=slim_config
)
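
As a rough sketch of how a new algorithm might plug into this pattern, the example below registers a custom compressor and shows how the slim_config dictionary could reach its constructor. ToyCompressorFactory, MyCompressor, and the contents of both config entries are assumptions made for illustration; only the (model, slim_config=None) constructor shape and the two dictionary keys come from the text above.

```python
# Hypothetical stand-in for CompressorFactory; the real implementation may differ.
class ToyCompressorFactory:
    _registry = {}  # maps class name -> compressor class

    @classmethod
    def register(cls, compressor_cls):
        name = compressor_cls.__name__
        # Guard against two algorithms silently sharing one name.
        if name in cls._registry:
            raise KeyError(f"Compressor {name!r} is already registered")
        cls._registry[name] = compressor_cls
        return compressor_cls

    @classmethod
    def create(cls, name, model, slim_config=None):
        # Same call shape as the PTQ example: name, model, then slim_config.
        return cls._registry[name](model, slim_config=slim_config)


@ToyCompressorFactory.register
class MyCompressor:
    def __init__(self, model, slim_config=None):
        self.model = model
        slim_config = slim_config or {}
        # Split the combined config into its two documented parts.
        self.global_config = slim_config.get("global_config")
        self.compress_config = slim_config.get("compress_config")


# Placeholder values: the real config objects are not described in this section.
slim_config = {
    "global_config": {"save_path": "./output"},
    "compress_config": {"name": "MyCompressor"},
}
model = object()  # stands in for a model created by the model factory
compressor = ToyCompressorFactory.create(
    "MyCompressor", model, slim_config=slim_config
)
print(type(compressor).__name__, compressor.compress_config)
```

Keeping the constructor signature identical across compressors is what lets a single create call build any registered algorithm.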