Comments (6)
It needs some major refactoring on the hqq lib side to support serialization with transformers. This is because it's not possible to store the meta-data directly as a safetensor. Since hqq supports different backends and various settings, that meta-data becomes cumbersome. Currently, I am trying to simplify things to make the refactoring easier. For the moment, you can use the approach shared above.
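For context, a safetensors file can only store tensors plus a flat string-to-string metadata header, which is why nested HQQ meta-data (backend, bit-width, group size, and so on) cannot be dumped into it directly. A minimal sketch of that constraint; the meta-data keys below are illustrative, not the actual HQQ layout:

import torch
from safetensors.torch import save_file

# Quantized weights are tensors, so they fit the safetensors format just fine
tensors = {"W_q": torch.zeros(128, 128, dtype=torch.uint8)}

# Illustrative HQQ-style meta-data (keys made up for this example)
meta = {"nbits": 4, "group_size": 64, "axis": 1, "backend": "pytorch"}

# safetensors only accepts a flat Dict[str, str] as metadata, so anything richer
# has to be flattened and stringified by hand before it can ride along in the file
save_file(tensors, "layer.safetensors", metadata={k: str(v) for k, v in meta.items()})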
Got it. Thanks.
Hi @mxjmtxrm, this is indeed the case. You cannot serialize the model when using the transformers integration because the serialization logic was not compatible with the transformers code. However, you can use the hqq library directly if you want to save and reload a quantized model: https://github.com/mobiusml/hqq cc @mobicham
@mxjmtxrm save/load was initially included in the pull request, but we decided to remove it because the logic was too different from the rest of the transformers models. For the moment, you can do something like this:
from hqq.models.hf.base import AutoHQQHFModel

# Save the quantized model
AutoHQQHFModel.save_quantized(model, save_path)

# Load it back
model = AutoHQQHFModel.from_quantized(save_path)
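For reference, a rough end-to-end sketch of that workaround, assuming the model was quantized on the fly through the transformers integration with HqqConfig (the model id and quantization settings below are placeholders):

from transformers import AutoModelForCausalLM, HqqConfig
from hqq.models.hf.base import AutoHQQHFModel

model_id = "meta-llama/Llama-2-7b-hf"             # placeholder model
quant_config = HqqConfig(nbits=4, group_size=64)  # example settings

# Quantize while loading via the transformers integration
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="cuda",
    quantization_config=quant_config,
)

# model.save_pretrained() is not supported for HQQ, so save/reload with the hqq lib instead
AutoHQQHFModel.save_quantized(model, "llama2-7b-hqq-4bit")
model = AutoHQQHFModel.from_quantized("llama2-7b-hqq-4bit")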
I'm wondering why is_serializable is True for AWQ/EETQ and the other quantizers but not for HQQ. Can it be set to True directly in the HQQ quantizer?
AWQ quantizer:
@property
def is_serializable(self):
    # AWQ through auto-awq has been always serializable, except if the model is fused.
    if self.quantization_config.do_fuse:
        logger.warning("You cannot save an AWQ model that uses fused modules!")
        return False
    if self.quantization_config.version == AWQLinearVersion.EXLLAMA:
        logger.warning("You cannot save an AWQ model that uses Exllama backend!")
        return False
    return True
We set it to True when it is indeed possible to save the model with the transformers logic. EETQ and AWQ models can indeed be serialized, but this is not the case for HQQ, hence we set it to False.
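So the HQQ quantizer presumably just mirrors the AWQ pattern above with a hard False until the hqq-side refactoring lands; roughly along these lines (a sketch, not the exact transformers source):

@property
def is_serializable(self):
    # HQQ meta-data (backend and per-layer settings) cannot be stored as safetensors yet,
    # so saving through the transformers logic is disabled for now
    logger.warning("You cannot save an HQQ-quantized model with transformers yet; use the hqq library instead.")
    return False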
Related Issues (20)
- Off-by-one error in strided perplexity calculation
- RuntimeError: unique_by_key: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
- Autotokenizer."from_pretrained" read wrong config file. not "tokenizer_config.json", but "config.json"
- ViTLayer.forward() needs to be in "eager" mode when `output_attentions=True`
- Fix for hardcoded `final_labels` to enable loss calculation in PaliGemma
- Sentence Transformers Gets Stuck loading
- Paligemma causal attention still not causal ?
- Add Nomic Embed Code to Transformers
- loss calculation for PaliGemmaForConditionalGeneration potentially not cast to correct device
- Trainer should throw a warning if max_sequence_length < number of tokens in dataset sample record.
- Missing "config.json" when loading Llama-2-7b-chat-hf
- OSError due to huggingface-hub FutureWarning about resume_download
- Is there a source code installation method available? For example: from Test_transformers import AutoModelForCausalLM, AutoTokenizer
- Mistral compile not working on T4 GPU (torch 2.3 + cu121)
- model.generate() able to accept past_key_values=None
- RuntimeError: unable to open file when calling from_pretrained on multiple processes after upgrading hugginface_hub to 0.23.1
- Models with Phi3Config fail due to missing attention_bias
- Arm-based (M1, M2, M3) Mac Install Issues for `pip install transformers` (<4.23.0)
- `SeamlessM4Tv2ConformerEncoder` does not behaves as expected if gradient checkpointing is enabled
- Llama tokenizer inconsistency for the newline character for convert_tokens_to_ids