hungabunga's Introduction

Hunga-Bunga

Brute Force all scikit-learn models and all scikit-learn parameters with fit predict.


Let's brute force all sklearn models with all of sklearn's parameters! Ahhh Hunga Bunga!!
from hunga_bunga import HungaBungaClassifier, HungaBungaRegressor
And then simply:

clf = HungaBungaClassifier()
clf.fit(x, y)
clf.predict(x)

What?

Yes.

No! Really! What?

Many believe that most of the work of supervised (non-deep) machine learning lies in feature engineering, whereas the model-selection process is just running through all the models or just taking xgboost.

So here is an automation for that.

HOW IT WORKS

Runs through all sklearn models (both classification and regression), with all possible hyperparameters, and ranks them using cross-validation.
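
The idea can be sketched with plain scikit-learn. This is only an illustration of "try every model with a parameter grid and rank by cross-validation", not hunga_bunga's actual code: the three candidate models, their grids, and the `rank_models` helper are assumptions chosen to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Illustrative candidates and grids; these are NOT hunga_bunga's actual ranges.
CANDIDATES = [
    (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
    (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4, None]}),
]


def rank_models(x, y, candidates=CANDIDATES, cv=5):
    """Grid-search each candidate; return (name, best CV score, best params), best first."""
    results = []
    for model, grid in candidates:
        search = GridSearchCV(model, grid, cv=cv, scoring="accuracy")
        search.fit(x, y)
        results.append((type(model).__name__, search.best_score_, search.best_params_))
    return sorted(results, key=lambda r: r[1], reverse=True)


x, y = load_iris(return_X_y=True)
ranking = rank_models(x, y)
for name, score, params in ranking:
    print(f"{name:<24} {score:.3f} {params}")
print(f"The winner is: {ranking[0][0]} with score {ranking[0][1]:.3f}.")
```

The cost grows with the number of models times the size of each grid times the number of folds, which is why a full run over every sklearn model can take very long.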

MODELS

Runs all the models available in sklearn for supervised learning. The categories are:

  • Generalized Linear Models
  • Kernel Ridge
  • Support Vector Machines
  • Nearest Neighbors
  • Gaussian Processes
  • Naive Bayes
  • Trees
  • Neural Networks
  • Ensemble methods

Note: Some models were dropped (nearly none of them) and some crash or raise exceptions from time to time. It takes REALLY long to test this, so clearing exceptions took me a while.

Installation

pip install hunga-bunga

Dependencies


- Python (>= 2.7)
- NumPy (>= 1.11.0)
- SciPy (>= 0.17.0)
- joblib (>= 0.11)
- scikit-learn (>= 0.20.0)
- tabulate (>= 0.8.2)
- tqdm (>= 4.28.1)

Option I (Recommended): brain = False

Use it like any other sklearn model:

clf = HungaBungaClassifier()
clf.fit(x, y)
clf.predict(x)

And import from here

from hunga_bunga import HungaBungaClassifier, HungaBungaRegressor

Option II: brain = True

Use it like any other sklearn model:

clf = HungaBungaClassifier(brain=True)
clf.fit(x, y)

The output looks like this:

Model                          accuracy    Time/clf (s)
---------------------------  ----------  --------------
SGDClassifier                     0.967           0.001
LogisticRegression                0.940           0.001
Perceptron                        0.900           0.001
PassiveAggressiveClassifier       0.967           0.001
MLPClassifier                     0.827           0.018
KMeans                            0.580           0.010
KNeighborsClassifier              0.960           0.000
NearestCentroid                   0.933           0.000
RadiusNeighborsClassifier         0.927           0.000
SVC                               0.960           0.000
NuSVC                             0.980           0.001
LinearSVC                         0.940           0.005
RandomForestClassifier            0.980           0.015
DecisionTreeClassifier            0.960           0.000
ExtraTreesClassifier              0.993           0.002

The winner is: ExtraTreesClassifier with score 0.993.

hungabunga's People

Contributors

ypeleg

hungabunga's Issues

parameter ranges?

Hey! How did you determine the parameter ranges for all the models?
And have you thought about the implications of specifying them by parameter name rather than by model (i.e. alpha might mean very different things for different models)?
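
To make the concern concrete, here is a small sketch contrasting a grid keyed by parameter name with grids keyed by model. All ranges and the `expand` / `grid_for` helpers are hypothetical and are not taken from hunga_bunga; only the meaning of `alpha` in each estimator follows scikit-learn's documentation.

```python
from itertools import product

# Hypothetical ranges for illustration only; not the values hunga_bunga uses.
# Keyed by parameter name: every model with an `alpha` gets the same values.
SHARED_BY_NAME = {"alpha": [1e-4, 1e-2, 1.0]}

# Keyed by model: the same name can get a range suited to its meaning.
PER_MODEL = {
    # Ridge: alpha is the L2 regularization strength.
    "Ridge": {"alpha": [0.1, 1.0, 10.0]},
    # MultinomialNB: alpha is the additive (Laplace/Lidstone) smoothing term.
    "MultinomialNB": {"alpha": [0.01, 0.1, 1.0]},
    # MLPClassifier: alpha is an L2 penalty, typically much smaller.
    "MLPClassifier": {"alpha": [1e-5, 1e-4, 1e-3], "hidden_layer_sizes": [(50,), (100,)]},
}


def expand(grid):
    """Turn {'param': [values]} into a list of concrete keyword dicts."""
    names = sorted(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


def grid_for(model_name, per_model=PER_MODEL, shared=SHARED_BY_NAME):
    """Prefer a model-specific grid; fall back to the shared, name-keyed one."""
    return expand(per_model.get(model_name, shared))


for name in ("Ridge", "MLPClassifier", "SGDClassifier"):
    print(name, len(grid_for(name)), "combinations")
```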

Documentation issue

HOW IT WORKS

BUT DOES IT WORK THO

JK this was funny and cool though. Thanks for the laugh :)

Hanging up on NuSVC

I'm not sure if this is a real issue or just the computer I'm running on being too slow, but every time I try to run example.py, it hangs with the loading bar at 67%.
I added brain=True for more verbose output and saw that it hangs after displaying "NuSVC".
For comparison, it takes me 30s-1m to get to that point, and I tried letting it run overnight and it still didn't get any further... I killed it after maybe 24 hours because I was tired of having 7 cores out of 8 permanently maxed out.

Module not found error

I'm getting this error:
"
4 from multiprocessing import cpu_count
5 from sklearn.base import BaseEstimator
----> 6 from regression import HungaBungaRegressor
7 from classification import HungaBungaClassifier
8
ImportError: No module named 'regression'
"
Can anyone please help with this issue?

No module named 'regression'

Hi,
I am getting the error below while initializing. Please advise on a solution.

FYI - All libraries are updated to the latest version.


ModuleNotFoundError Traceback (most recent call last)
in ()
----> 1 from hunga_bunga import HungaBungaClassifier, HungaBungaRegressor, HungaBungaZeroKnowledge

~\Anaconda3\lib\site-packages\hunga_bunga\__init__.py in <module>()
4 from multiprocessing import cpu_count
5 from sklearn.base import BaseEstimator
----> 6 from regression import HungaBungaRegressor
7 from classification import HungaBungaClassifier
8

ModuleNotFoundError: No module named 'regression'
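
A likely explanation, going only by the traceback: `__init__.py` uses `from regression import ...`, which Python 2 resolved as an implicit relative import but Python 3 treats as an absolute import of a top-level module. The sketch below reproduces that behaviour with two throwaway packages; the package and module names (`demo_implicit`, `demo_explicit`, `demo_regression`) are made up for the demonstration and have nothing to do with the real package layout.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_package(root: Path, name: str, import_line: str) -> None:
    """Create a tiny package whose __init__.py imports from a sibling module."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "demo_regression.py").write_text("class DemoRegressor:\n    pass\n")
    (pkg / "__init__.py").write_text(import_line + "\n")


def try_import(name: str) -> str:
    """Import the package and report 'ok' or the exception class name."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return "ok"
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        return type(exc).__name__


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sys.path.insert(0, str(root))
    try:
        # Python 2 style: the sibling module is looked up as a top-level module.
        build_package(root, "demo_implicit", "from demo_regression import DemoRegressor")
        # Python 3 style: the leading dot makes the import relative to the package.
        build_package(root, "demo_explicit", "from .demo_regression import DemoRegressor")
        implicit_result = try_import("demo_implicit")
        explicit_result = try_import("demo_explicit")
    finally:
        sys.path.remove(str(root))

print("implicit:", implicit_result)
print("explicit:", explicit_result)
```

If this is indeed the cause, the fix would be on the package side (explicit relative imports such as `from .regression import HungaBungaRegressor`), or running under a Python 2 interpreter where the implicit form still resolves.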

How to setup?

I want to run your sample code, but it shows some errors. I guess something is wrong with the setup, but you don't provide a setup.py. So how do we set up this package?

Hanging up on ExtraTreesRegressor

Everything works smoothly until the output gets stuck at 95% on the ExtraTreesRegressor. It takes 15-20 minutes to reach 95%, but then it just hangs.

sklearn.exceptions.NotFittedError: This SGDClassifier instance is not fitted yet

Hello

This is the code I'm trying to run:

    X_train, y_train = prepare_data_for_ml(X_train, y_train)
    y_train, y_test = prepare_data_for_ml(X_test, y_test)

    clf = HungaBungaClassifier()
    clf.fit(X_train, y_train)

    clf.predict(X_test)

That's the error:

100%|██████████| 15/15 [00:00<00:00, 43.02it/s]
Traceback (most recent call last):
  File "/Users/yonatab/PycharmProjects/VisualFitnessUtils/activate_ml/activate_ml_on_joints.py", line 62, in <module>
    main()
  File "/Users/yonatab/PycharmProjects/VisualFitnessUtils/activate_ml/activate_ml_on_joints.py", line 43, in main
    clf.predict(X_test)
  File "/Users/yonatab/opt/anaconda3/envs/CondaEnv/lib/python3.7/site-packages/hunga_bunga/classification.py", line 202, in predict
    return self.model.predict(x)
  File "/Users/yonatab/opt/anaconda3/envs/CondaEnv/lib/python3.7/site-packages/sklearn/linear_model/base.py", line 289, in predict
    scores = self.decision_function(X)
  File "/Users/yonatab/opt/anaconda3/envs/CondaEnv/lib/python3.7/site-packages/sklearn/linear_model/base.py", line 263, in decision_function
    "yet" % {'name': type(self).__name__})
sklearn.exceptions.NotFittedError: This SGDClassifier instance is not fitted yet

Am I missing something?

Thanks
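
One thing that stands out in the snippet above, independent of hunga_bunga: the second `prepare_data_for_ml` call unpacks into `y_train, y_test` rather than `X_test, y_test`. Whether that is what leads to the `NotFittedError` is not confirmed here, but it does leave the training features and labels with different lengths. The sketch below shows the effect with toy data; `prepare_data_for_ml` is a hypothetical stand-in, since the reporter's helper is not shown.

```python
def prepare_data_for_ml(features, labels):
    """Hypothetical stand-in for the reporter's helper; returns cleaned (X, y)."""
    return [[float(v) for v in row] for row in features], [int(v) for v in labels]


raw_X_train, raw_y_train = [[1, 2], [3, 4], [5, 6]], [0, 1, 0]
raw_X_test, raw_y_test = [[7, 8]], [1]

# As in the report: the second call unpacks into (y_train, y_test),
# so y_train is overwritten with the processed test features.
X_train, y_train = prepare_data_for_ml(raw_X_train, raw_y_train)
y_train_reported, y_test_reported = prepare_data_for_ml(raw_X_test, raw_y_test)
mismatch = len(X_train) != len(y_train_reported)

# Probably intended: unpack the test split into (X_test, y_test).
X_test, y_test = prepare_data_for_ml(raw_X_test, raw_y_test)
consistent = len(X_train) == len(y_train) and len(X_test) == len(y_test)

print("lengths mismatch as reported:", mismatch)
print("lengths consistent after fix:", consistent)
```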

get all inf?

ExtraTreesClassifier

Model                          accuracy    Time/clf (s)
---------------------------  ----------  --------------
SGDClassifier                      -inf             inf
LogisticRegression                 -inf             inf
Perceptron                         -inf             inf
PassiveAggressiveClassifier        -inf             inf
MLPClassifier                      -inf             inf
KMeans                             -inf             inf
KNeighborsClassifier               -inf             inf
NearestCentroid                    -inf             inf
RadiusNeighborsClassifier          -inf             inf
SVC                                -inf             inf
NuSVC                              -inf             inf
LinearSVC                          -inf             inf
RandomForestClassifier             -inf             inf
DecisionTreeClassifier             -inf             inf
ExtraTreesClassifier               -inf             inf

The winner is: SGDClassifier with score -inf.

I used the same data with just RandomForestClassifier and got a [bad but] working classifier...

Error in initial import

Hi,
I downloaded and put the package in site-packages.
When importing, I get the error:


    from regression import HungaBungaRegressor
ModuleNotFoundError: No module named 'regression'
