Hi! I'm not sure if it's already been asked, but how difficult would it be to impl

[Question] Support for custom estimators and custom transformers about hummingbird HOT 7 CLOSED

mbignotti commented on May 24, 2024

from hummingbird.

Comments (7)

interesaaat commented on May 24, 2024 1

I don't think you will need to write the model twice, only the inference part (which looks quite easy). For your next model you could just write the fit method in NumPy and the predict method in PyTorch, so that you don't have to replicate any work. Keep us posted!
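
One way to read this suggestion, as a minimal sketch: keep the fitting logic in NumPy and write only the scoring step with torch operations. The names `fit_pca_numpy` and `spe_torch` are hypothetical, and the PCA-by-SVD fit is an assumption standing in for whatever the real estimator does.

```python
import numpy as np
import torch


def fit_pca_numpy(X: np.ndarray, n_components: int):
    """Fit step in NumPy: return the mean and the top principal axes."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def spe_torch(x: torch.Tensor, mean: torch.Tensor, components: torch.Tensor) -> torch.Tensor:
    """Predict step in torch: per-row norm of the reconstruction residual."""
    centered = x - mean
    x_hat = centered @ components.t() @ components + mean
    return torch.sqrt(torch.sum((x - x_hat) ** 2, dim=1))


# Hypothetical toy data, only to show the two halves working together.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
mean, components = fit_pca_numpy(X, n_components=2)
scores = spe_torch(
    torch.from_numpy(X),
    torch.from_numpy(mean),
    torch.from_numpy(components),
)
```

Only `spe_torch` and the two fitted arrays are needed at inference time, so the NumPy half never has to ship.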

interesaaat commented on May 24, 2024

Ciao Marco, adding a custom op shouldn't be too hard. Unfortunately, at the moment we don't provide a specific API for this, but I can tell you how to do it (we love contributions 😄).

So the first thing you need to do is add the class of your custom op to the supported ops.

Then you need to write a converter that takes your operator as input and returns a PyTorch version of the model. To do this, you first need to register the converter. You can use this as an example, replacing "SklearnMLPClassifier" with "Sklearn_your_custom_op_class_name".

Then you need to provide the actual converter. Given that your implementation pretty much uses a bunch of np functions, it should be straightforward to implement. You can look at other converter implementations to get an idea of how to do it. For example here.

Let me know if this works for you.
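
The three steps above follow a name-keyed registry pattern. The toy sketch below only illustrates that shape: `SUPPORTED_OPS`, `CONVERTERS`, `register_converter`, and `convert_op` are hypothetical stand-ins defined locally, not Hummingbird's actual internals, and `MyDetector` is a placeholder for the custom estimator. In Hummingbird the converter would return a PyTorch model rather than a plain function.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the library's internal tables.
SUPPORTED_OPS: Dict[type, str] = {}
CONVERTERS: Dict[str, Callable] = {}


def register_converter(name: str, fn: Callable) -> None:
    """Store a converter function under the operator's registered name."""
    CONVERTERS[name] = fn


class MyDetector:
    """Placeholder for a fitted custom estimator."""

    def __init__(self, threshold_: float):
        self.threshold_ = threshold_


# Step 1: add the class to the supported ops, keyed as "Sklearn" + class name.
SUPPORTED_OPS[MyDetector] = "Sklearn" + MyDetector.__name__


# Step 3: the converter reads fitted attributes and returns an inference-only version.
def convert_my_detector(op: MyDetector) -> Callable[[float], bool]:
    threshold = op.threshold_
    return lambda score: score > threshold


# Step 2: register the converter under the same name.
register_converter("SklearnMyDetector", convert_my_detector)


def convert_op(op):
    """Look up the converter by the op's registered name and apply it."""
    return CONVERTERS[SUPPORTED_OPS[type(op)]](op)


is_anomaly = convert_op(MyDetector(threshold_=2.5))
```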

mbignotti commented on May 24, 2024

Hi @interesaaat!
Thank you for your reply!
So, if I understand correctly, the idea is that you take the parameters and other relevant attributes (e.g. classes_) from a fitted sklearn estimator and pass them to a corresponding nn.Module that implements the same logic.

However, I'm wondering if, in this case, it would be easier to simply create a new nn.Module class (instead of inheriting from sklearn.base.BaseEstimator) that internally uses a Hummingbird-converted model.
I'm not 100% sure how I would write it, but what I mean is something like this (ignoring the fact that inverse_transform is not supported):

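import numpy as np
import torch
from hummingbird.ml import convert
from sklearn.decomposition import PCA


class PCADetector(torch.nn.Module):

    def __init__(self, n_components):
        super().__init__()
        self.n_components = n_components

    def fit(self, X: np.ndarray):
        # Fit with scikit-learn, then convert the fitted PCA to a PyTorch-backed model.
        model = PCA(n_components=self.n_components)
        model.fit(X)
        self.estimator_ = convert(model, backend="pytorch", test_input=X)

    def forward(self, x):
        # Reconstruct x from its projection (inverse_transform is not supported, as noted above).
        x_hat = self.estimator_.inverse_transform(self.estimator_.transform(x))
        residuals = x - x_hat
        # Per-row norm of the reconstruction residual; still computed with NumPy.
        spe = np.sqrt(np.sum(residuals**2, axis=1))
        return spe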

To give a little bit of context, I'll try to explain what I would like to do and why.

The final goal is to be able to deploy these models without having to deal with Python packages. The big problem with sklearn, and with Python in general for machine learning, is that it's very difficult to deploy custom models when you are not allowed to use Docker in production (our case). Custom models might be defined in a project-related repo, and the only way to ship them is to bundle them together with the source code. But this is something we want to avoid, as it might raise other dependency issues.

Another approach is to compile or convert the model to something like ONNX or TVM. However, ONNX and TVM support is very limited for custom models that don't use deep learning frameworks. That's why I'm trying to understand if Hummingbird could help me.

However, I'm not sure if a composition approach, like the one above, can be adapted to work with subsequent conversions to ONNX or similar.
On the other hand, following the official approach you described for registering custom operators in Hummingbird might be more robust.

What do you think?

Thanks again,
Marco.

interesaaat commented on May 24, 2024

Yeah, the approach above won't work because, even if you wrap the model as a PyTorch module, the internal code still uses NumPy, so you would still need that dependency plus Python. Hummingbird should be able to help in your use case because, as long as you provide your model implementation as tensor operations, you can export it with TorchScript or ONNX without Python or any other dependencies.
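
As a minimal sketch of what "tensor operations only" could look like for the detector discussed in this thread: the module below takes the fitted mean and principal axes, computes the score with torch ops, and round-trips through TorchScript. `PCADetectorModule` is a hypothetical name, the toy attribute values are made up, and the formulas assume a PCA fitted without whitening.

```python
import io

import torch


class PCADetectorModule(torch.nn.Module):
    """Inference-only score built from fitted PCA attributes."""

    def __init__(self, mean, components):
        super().__init__()
        # Buffers are saved with the module but are not trainable parameters.
        self.register_buffer("mean", torch.as_tensor(mean, dtype=torch.float32))
        self.register_buffer("components", torch.as_tensor(components, dtype=torch.float32))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = (x - self.mean) @ self.components.t()  # same role as transform
        x_hat = scores @ self.components + self.mean    # same role as inverse_transform
        return torch.sqrt(torch.sum((x - x_hat) ** 2, dim=1))


# Toy fitted attributes; in practice these would come from pca.mean_ and pca.components_.
module = PCADetectorModule(mean=[0.0, 0.0], components=[[1.0, 0.0]])

# Compile to TorchScript, serialize to bytes, and load it back.
scripted = torch.jit.script(module)
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
reloaded = torch.jit.load(buffer)

spe = reloaded(torch.tensor([[3.0, 4.0]]))
```

The serialized module carries its own buffers, so the loading side needs neither the class definition nor NumPy.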

mbignotti commented on May 24, 2024

The only thing I don't like is having to write the same model twice. But at this point I guess this is the only way to go (I've investigated every solution I could find), because the only alternative I see is to write the code directly in a compiled language.
I'll try to implement it and let you know if that works.

interesaaat commented on May 24, 2024

Closing for now. We can reopen if needed.

mbignotti commented on May 24, 2024

Unfortunately I haven't had the time to work on it. I'll update you as soon as I can.
Thanks!
