
Comments (4)

pplonski commented on May 24, 2024

Hi @yiannis-gkoufas,

I understand that you were able to train ML models with AutoML and that the problem occurs only when computing predictions. Could you please share the code you use to compute the predictions?

from mljar-supervised.

yiannis-gkoufas commented on May 24, 2024

Hi @pplonski!

I use the same constructor for AutoML and pass a dataframe.

automl = AutoML(results_path=str(model_directory),
                mode="Compete",
                total_time_limit=600 * 600,
                golden_features=True,
                features_selection=True,
                ml_task="binary_classification")

Could it be an issue with the ensemble model?


pplonski commented on May 24, 2024

Thank you for the response, @yiannis-gkoufas. It looks like a bug in computing predictions for the Stacked Ensemble. Could you share the full code and data needed to reproduce the issue?


yiannis-gkoufas commented on May 24, 2024

This code:

from sklearn.model_selection import train_test_split
from supervised import AutoML
import pandas as pd

if __name__ == '__main__':
    df = pd.read_csv(
        "https://raw.githubusercontent.com/pplonski/datasets-for-start/master/adult/data.csv",
        skipinitialspace=True,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        df[df.columns[:-1]], df["income"], test_size=0.25
    )

    automl = AutoML(results_path="./model",
                    mode="Compete",
                    total_time_limit=600 * 600,
                    golden_features=True,
                    features_selection=True,
                    ml_task="binary_classification")
    automl.fit(X_train, y_train)

    predictions = automl.predict(X_test)
    print(predictions)

reproduced the issue for me, because the stacked ensemble is identified as the best model.
It takes a while to run, of course. The message I got:

Traceback (most recent call last):
  File "/Users/prezi/Code/mljar_issue/mljar_issue/main.py", line 23, in <module>
    predictions = automl.predict(X_test)
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/supervised/automl.py", line 451, in predict
    return self._predict(X)
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/supervised/base_automl.py", line 1503, in _predict
    predictions = self._base_predict(X)
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/supervised/base_automl.py", line 1465, in _base_predict
    predictions = model.predict(X, X_stacked)
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/supervised/ensemble.py", line 434, in predict
    y_predicted_from_model = model.predict(X_stacked)
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/supervised/model_framework.py", line 448, in predict
    y_p = learner.predict(X_data)
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/supervised/algorithms/sklearn.py", line 66, in predict
    return self.model.predict_proba(X)[:, 1]
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/sklearn/ensemble/_forest.py", line 947, in predict_proba
    X = self._validate_X_predict(X)
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/sklearn/ensemble/_forest.py", line 641, in _validate_X_predict
    X = self._validate_data(
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/sklearn/base.py", line 608, in _validate_data
    self._check_feature_names(X, reset=reset)
  File "/Users/prezi/Library/Caches/pypoetry/virtualenvs/mljar-issue-kQcsGfQC-py3.10/lib/python3.10/site-packages/sklearn/base.py", line 535, in _check_feature_names
    raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names unseen at fit time:
- 100_NearestNeighbors_prediction
- 101_NearestNeighbors_prediction
- 102_Xgboost_BoostOnErrors_prediction
- 102_Xgboost_prediction
- 103_Xgboost_prediction
- ...
Feature names seen at fit time, yet now missing:
- 100_NearestNeighbors_prediction_0_for_<=50K_1_for_>50K
- 101_NearestNeighbors_prediction_0_for_<=50K_1_for_>50K
- 102_Xgboost_BoostOnErrors_prediction_0_for_<=50K_1_for_>50K
- 102_Xgboost_prediction_0_for_<=50K_1_for_>50K
- 103_Xgboost_prediction_0_for_<=50K_1_for_>50K
- ...
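
Looking at the two lists in the error, the names seen at fit time appear to be the predict-time names plus a label-mapping suffix. Below is a minimal sketch that checks this on the five names visible in the traceback (the rest are elided, so this says nothing about them). `diff_feature_names` and `SUFFIX` are hypothetical helper names I made up for illustration; they are not part of mljar-supervised or scikit-learn.

```python
def diff_feature_names(fit_names, predict_names):
    """Return (unseen, missing).

    unseen:  names present at predict time but not at fit time
    missing: names present at fit time but not at predict time
    """
    fit_set, predict_set = set(fit_names), set(predict_names)
    unseen = sorted(predict_set - fit_set)
    missing = sorted(fit_set - predict_set)
    return unseen, missing


# Suffix copied from the "seen at fit time, yet now missing" names above.
SUFFIX = "_0_for_<=50K_1_for_>50K"

# Names copied from the "unseen at fit time" part of the ValueError.
predict_names = [
    "100_NearestNeighbors_prediction",
    "101_NearestNeighbors_prediction",
    "102_Xgboost_BoostOnErrors_prediction",
    "102_Xgboost_prediction",
    "103_Xgboost_prediction",
]
# The corresponding fit-time names, as listed in the error message.
fit_names = [name + SUFFIX for name in predict_names]

unseen, missing = diff_feature_names(fit_names, predict_names)

# Check whether every missing name is an unseen name plus the suffix.
suffix_only = sorted(name + SUFFIX for name in unseen) == missing
print(suffix_only)
```

If the check holds for the full column lists as well, the stacked prediction columns would seem to be named differently during fit and during predict, which would match the guess that the problem is in the Stacked Ensemble.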
