Comments (7)

interesaaat commented on May 24, 2024

I don't think you will need to write the model twice, only the inference part (which looks quite easy). For your next model you could just write the fit method in numpy and the predict method in pytorch, so that you don't have to replicate any work. Keep us posted!
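
A minimal sketch of that split, assuming a PCA-based detector like the one discussed in this thread: `fit` uses only NumPy, while `predict` uses only tensor operations. The class name `PCADetectorSketch` and its attributes are hypothetical and are not part of Hummingbird.

```python
import numpy as np
import torch


class PCADetectorSketch:
    """Hypothetical detector: NumPy for training, PyTorch for inference."""

    def __init__(self, n_components: int):
        self.n_components = n_components

    def fit(self, X: np.ndarray) -> "PCADetectorSketch":
        # Training side, NumPy only: center the data and keep the top
        # right-singular vectors as principal components.
        self.mean_ = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self

    def predict(self, x: torch.Tensor) -> torch.Tensor:
        # Inference side, tensor operations only.
        mean = torch.as_tensor(self.mean_, dtype=x.dtype)
        comps = torch.as_tensor(self.components_, dtype=x.dtype)
        centered = x - mean
        x_hat = centered @ comps.T @ comps  # project, then reconstruct
        return torch.sqrt(torch.sum((centered - x_hat) ** 2, dim=1))
```

Only the `predict` body would need to be carried over to a tensor-only module for export; the training code stays in NumPy.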

from hummingbird.

interesaaat commented on May 24, 2024

Ciao Marco, adding a custom op shouldn't be too hard. Unfortunately, at the moment we don't provide a specific API for this, but I can tell you how to do it (we love contributions 😄).

So, first you need to add the class of your custom op to the supported ops.

Then you need to write a converter that takes your operator as input and returns a pytorch version of the model. To do this, you first need to register a converter. You can use this as an example, where instead of "SklearnMLPClassifier" you should put "Sklearn_your_custom_op_class_name".

Then you need to provide the actual converter. Given that your implementation pretty much uses a bunch of np functions, it should be straightforward to implement. You can look at other converter implementations to get an idea of how to do it. For example here.

Let me know if this works for you.
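
The converter step can be pictured roughly as follows. This is a plain-PyTorch sketch and deliberately does not use Hummingbird's registration API; `SPEModule` and `convert_pca_detector` are hypothetical names, and the fitted operator is assumed to expose `mean_` and `components_` the way sklearn's PCA does.

```python
from types import SimpleNamespace

import numpy as np
import torch


class SPEModule(torch.nn.Module):
    """Tensor-only PCA reconstruction-error detector (hypothetical)."""

    def __init__(self, mean: np.ndarray, components: np.ndarray):
        super().__init__()
        # Buffers are stored with the module but are not trainable.
        self.register_buffer("mean", torch.from_numpy(mean).float())
        self.register_buffer("components", torch.from_numpy(components).float())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        centered = x - self.mean
        scores = centered @ self.components.T  # counterpart of transform
        x_hat = scores @ self.components       # counterpart of inverse_transform, without the mean
        return torch.sqrt(torch.sum((centered - x_hat) ** 2, dim=1))


def convert_pca_detector(fitted) -> SPEModule:
    # Read the learned attributes off the fitted operator; NumPy is not
    # needed again once the module has been built.
    return SPEModule(np.asarray(fitted.mean_), np.asarray(fitted.components_))


# Stand-in for a fitted estimator (illustrative values only).
fitted = SimpleNamespace(mean_=np.zeros(3), components_=np.eye(3)[:2])
module = convert_pca_detector(fitted)
```

With these illustrative values the module keeps the first two coordinates, so the score of a sample is the magnitude of its third coordinate.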

mbignotti commented on May 24, 2024

Hi @interesaaat!
Thank you for your reply!
So, if I understand correctly, the idea is to take the parameters and other relevant attributes (e.g. classes_) from a fitted sklearn estimator and pass them to a corresponding nn.Module that implements the same logic.

However, I'm wondering whether, in this case, it would be easier to simply create a new nn.Module class (instead of inheriting from sklearn.base.BaseEstimator) that internally uses a Hummingbird-converted model.
I'm not 100% sure how I would write it, but what I mean is something like this (ignoring the fact that inverse_transform is not supported):

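import numpy as np
import torch
from sklearn.decomposition import PCA
from hummingbird.ml import convert

class PCADetector(torch.nn.Module):

    def __init__(self, n_components):
        super().__init__()
        self.n_components = n_components

    def fit(self, X: np.ndarray):
        # Fit a regular sklearn PCA, then convert it with Hummingbird.
        model = PCA(n_components=self.n_components)
        model.fit(X)
        self.estimator_ = convert(model, backend="pytorch", test_input=X)

    def forward(self, x):
        # Reconstruct x from its principal components (inverse_transform is assumed here).
        x_hat = self.estimator_.inverse_transform(self.estimator_.transform(x))
        residuals = x - x_hat
        # Per-sample reconstruction error (norm of the residuals).
        spe = np.sqrt(np.sum(residuals**2, axis=1))
        return spe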

To give a little bit of context, I'll try to explain what I would like to do.

The final goal is to be able to deploy these models without having to deal with Python packages. The big problem with sklearn, and with Python in general for machine learning, is that it's very difficult to deploy custom models when you are not allowed to use docker in production (our case). Custom models might be defined in a project-related repo, and the only way to ship them is to bundle them together with the source code. But this is something we want to avoid, as it might raise other dependency issues.

Another approach is to compile or convert the model to something like onnx or tvm. However, onnx and tvm support is very limited for custom models that are not using deep learning frameworks. That's why I'm trying to understand if Hummingbird could help me.

However, I'm not sure whether a composition approach like the one above can be adapted to work with subsequent conversions to onnx or similar.
On the other hand, following the approach you described for registering custom operators in Hummingbird might be more robust.

What do you think?

Thanks again,
Marco.

interesaaat commented on May 24, 2024

Yeah, the approach above won't work: even if you wrap the model as a pytorch module, the internal code still uses numpy, so you will still need that dependency plus python. Hummingbird should be able to help in your use case because, as long as you provide your model implementation as tensor operations, you can export it with TorchScript or ONNX without python or any other dependencies.
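
As a rough illustration of the TorchScript route, here is a sketch that uses plain PyTorch rather than Hummingbird's own export path. `TinySPE` is a hypothetical stand-in for a model whose inference is written purely as tensor operations, and the component matrix holds illustrative values only.

```python
import io

import torch


class TinySPE(torch.nn.Module):
    """Hypothetical tensor-only module standing in for a converted model."""

    def __init__(self, components: torch.Tensor):
        super().__init__()
        self.register_buffer("components", components)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Project onto the components, reconstruct, and score the residual.
        x_hat = x @ self.components.t() @ self.components
        return torch.sqrt(torch.sum((x - x_hat) ** 2, dim=1))


module = TinySPE(torch.eye(3)[:2])

# Compile to TorchScript; the saved artifact carries its own code and buffers.
scripted = torch.jit.script(module)

# Serialize to an in-memory buffer here; a file path works the same way.
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)

# Loading does not require the TinySPE class definition.
restored = torch.jit.load(buffer)
```

A module that called numpy inside `forward` could not be scripted this way, which is why the inference logic has to be expressed as tensor operations first.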

mbignotti commented on May 24, 2024

The only thing I don't like is having to write the same model twice. But, at this point, I guess this is the only way to go (I've investigated all the possible solutions I could find), because the only alternative I see is to write the code directly in a compiled language.
I'll try to implement it and let you know if it works.

interesaaat commented on May 24, 2024

Closing for now. We can reopen if needed.

mbignotti commented on May 24, 2024

Unfortunately I haven't had the time to work on it. I'll update you as soon as I can.
Thanks!
