
auto_viml's Introduction

Join our elite team of contributors!

👋 Welcome to the AutoViML Fan Club Page!
We just hit 3300 stars collectively for all AutoViML libraries on GitHub!!

AutoViML creates innovative Open Source libraries to make data scientists' and machine learning engineers' lives easier and more productive!


Our innovative libraries so far:

  • ๐Ÿค AutoViz Automatically Visualizes any dataset, any size with a single line of code. Now with Bokeh and Holoviews it can make your charts and dashboards interactive!
  • ๐Ÿค Auto_ViML Automatically builds multiple ML models with a single line of code. Uses scikit-learn, XGBoost and CatBoost.
  • ๐Ÿค Auto_TS Automatically builds ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Now updated with DASK to handle millions of rows.
  • ๐Ÿค Featurewiz Uses advanced feature engineering strategies and select the best features from your data set fast with a single line of code. Now updated with DASK to handle millions of rows.
  • ๐Ÿค Deep_AutoViML Builds tensorflow keras models and pipelines for any data set, any size with text, image and tabular data, with a single line of code.
  • ๐Ÿค lazytransform Automatically transform all categorical, date-time, NLP variables to numeric in a single line of code, for any data, set any size.
  • ๐Ÿค pandas_dq Automatically find and fix data quality issues in your dataset with a single line of code, for pandas.

Feb-2024: Added "Auto Encoders" for automatic feature extraction to the featurewiz library #feature-extraction

On Feb 8, 2024, we released a major update to our popular "featurewiz" library that can transform your input into a latent space of dimension latent_dim. This lower-dimensional representation (similar to PCA) lets you extract the best patterns in your data for the toughest imbalanced and multi-class problems. Try it and let us know!
See: how to use autoencoders in featurewiz
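
A minimal sketch of what this could look like in code (the auto_encoders argument and its "VAE" value are assumptions based on this announcement, not verified API; the toy data is only for illustration):

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from featurewiz import FeatureWiz

# Toy imbalanced classification problem
X, y = make_classification(n_samples=1000, n_features=30, weights=[0.9],
                           random_state=0)
X = pd.DataFrame(X, columns=[f"f{i}" for i in range(30)])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Hypothetical: ask FeatureWiz to compress the inputs into a latent space
# (similar in spirit to PCA) before extracting features.
fwiz = FeatureWiz(auto_encoders="VAE", verbose=1)
X_train_enc = fwiz.fit_transform(X_train, y_train)  # latent-space features
X_test_enc = fwiz.transform(X_test)                 # same mapping on test
```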

April-2023: Released a major new Python library "pandas_dq" #data_quality #dataengineering

On April 2, 2023, we released a major new Python library called "pandas_dq" that will automatically find and fix data quality issues in your train and test dataframes in a single line of code, for any dataset of any size.
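
A minimal sketch of that single line (the train.csv file is a placeholder; dq_report and Fix_DQ are the library's documented entry points, but check the pandas_dq docs for exact arguments):

```python
import pandas as pd
from pandas_dq import dq_report, Fix_DQ

df = pd.read_csv("train.csv")        # placeholder input file

dq_report(df)                        # print a data-quality report

fdq = Fix_DQ()                       # scikit-learn style transformer
train_fixed = fdq.fit_transform(df)  # learn and apply fixes on train
# test_fixed = fdq.transform(test_df)  # apply the same fixes to test
```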

April-2022: Released a major new Python library "lazytransform" #featureengineering #featureselection

On April 3, 2022, we released a major new Python library called "lazytransform" that will automatically transform all categorical, date-time and NLP variables to numeric in a single line of code, for any dataset of any size.
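
A minimal sketch, assuming LazyTransformer picks its encoders automatically (the toy data and the encoders='auto' argument are illustrative):

```python
import pandas as pd
from lazytransform import LazyTransformer

# Toy data mixing categorical, date-time and numeric columns
X_train = pd.DataFrame({
    "city": ["NY", "SF", "NY", "LA"],
    "signup": pd.to_datetime(["2022-01-01", "2022-02-01",
                              "2022-03-01", "2022-04-01"]),
    "spend": [10.0, 20.0, 15.0, 30.0],
})
y_train = pd.Series([0, 1, 0, 1], name="churn")

lazy = LazyTransformer(encoders="auto")          # choose encoders automatically
X_t, y_t = lazy.fit_transform(X_train, y_train)  # everything becomes numeric
```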

Jan-2022: Major upgrade to featurewiz: you can now perform feature selection through fit and transform #MLOps #featureselection

As of version 0.0.90, featurewiz has a scikit-learn compatible feature selection transformer called FeatureWiz. You can use it to perform fit and transform as follows, and you get back a scikit-learn transformer object that you can add to other MLOps data pipelines to select the top variables from your dataset.
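
A short sketch of that pattern, assuming X_train, y_train and X_test already exist (the corr_limit value is just an example):

```python
from featurewiz import FeatureWiz

fwiz = FeatureWiz(corr_limit=0.70, verbose=2)             # sklearn-compatible transformer
X_train_selected = fwiz.fit_transform(X_train, y_train)   # fit on train only
X_test_selected = fwiz.transform(X_test)                  # reuse the fitted selector
print(fwiz.features)                                      # the selected top variables
```

Because FeatureWiz follows the scikit-learn fit/transform protocol, it can be dropped into a Pipeline next to other steps.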

Dec-23-2021 Update: AutoViz now does Wordclouds! #autoviz #wordcloud

AutoViz can now create wordclouds automatically for the NLP variables in your data: it detects NLP variables on its own and builds a wordcloud for each one, with no extra settings needed (see the sketch below).
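
A minimal sketch (the tweets.csv file and the Sentiment column are hypothetical):

```python
from autoviz.AutoViz_Class import AutoViz_Class

AV = AutoViz_Class()
# If tweets.csv contains a free-text column, AutoViz detects it as an
# NLP variable and renders a wordcloud for it automatically.
dft = AV.AutoViz("tweets.csv", depVar="Sentiment")
```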

Dec 21, 2021: AutoViz now runs on Docker containers as part of MLOps pipelines. Check out Orchest.io

We are excited to announce that AutoViz and Deep_AutoViML are now available as containerized applications on Docker. This means you can use a fantastic tool like orchest.io to build MLOps pipelines visually. Here are two sample pipelines we have created:

  • AutoViz pipeline: https://lnkd.in/g5uC-z66
  • Deep_AutoViML pipeline: https://lnkd.in/gdnWTqCG

You can find more examples and a wonderful video on orchest's web site.

Dec-17-2021: AutoViz now uses HoloViews to display dashboards with Bokeh and save them as Dynamic HTML for web serving #HTML #Bokeh #Holoviews

Now you can use AutoViz to create interactive Bokeh charts and dashboards (see below), either in Jupyter Notebooks or in the browser, by setting chart_format as follows (a usage sketch follows the list):

  • chart_format='bokeh': interactive Bokeh dashboards are plotted in Jupyter Notebooks.
  • chart_format='server': dashboards pop up for each kind of chart in your web browser.
  • chart_format='html': interactive Bokeh charts are silently saved as Dynamic HTML files under the AutoViz_Plots directory.
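
For example, a minimal call might look like this (train.csv and its target column are placeholders):

```python
from autoviz.AutoViz_Class import AutoViz_Class

AV = AutoViz_Class()
# chart_format may be 'bokeh', 'server' or 'html' as described above;
# 'html' saves interactive charts under the AutoViz_Plots directory.
dft = AV.AutoViz("train.csv", depVar="target", chart_format="html")
```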

Languages and Tools:

Docker · Git · Python · scikit-learn



Connect with us on Linkedin:

Ram Seshadri

auto_viml's People

Contributors

autoviml · emekaborisama · morenoh149 · rsesha · zhiningliu1998


auto_viml's Issues

UnboundLocalError: local variable 'sp' referenced before assignment

Hi,

First of all, thank you for your work! It's really amazing.
But I'm having an issue with my attempt at the TReNDS assignment on Kaggle; I get the error below.
It seems to me that SciPy is not recognized or something like that, even though I put `import scipy as sp` in my notebook and even in Auto_ViML.py.

Can you help? Thanks again!

Best regards.

############## D A T A S E T A N A L Y S I S #######################
Training Set Shape = (5434, 1411)
Training Set Memory Usage = 58.54 MB
Test Set Shape = (5877, 1405)
Test Set Memory Usage = 63.04 MB
Multi_Label Target: ['age', 'domain1_var1', 'domain1_var2', 'domain2_var1', 'domain2_var2']
############## C L A S S I F Y I N G V A R I A B L E S ####################
Classifying variables in data set...
Number of Numeric Columns = 1404
Number of Integer-Categorical Columns = 0
Number of String-Categorical Columns = 0
Number of Factor-Categorical Columns = 0
Number of String-Boolean Columns = 0
Number of Numeric-Boolean Columns = 0
Number of Discrete String Columns = 0
Number of NLP String Columns = 0
Number of Date Time Columns = 0
Number of ID Columns = 1
Number of Columns to Delete = 1
1406 Predictors classified...
This does not include the Target column(s)
2 variables removed since they were ID or low-information variables
Number of GPUs = 2
GPU available
############# D A T A P R E P A R A T I O N #############
No Missing Values in train data set
Test data has no missing values. Continuing...
Completed Scaling of Train and Test Data using MinMaxScaler(copy=True, feature_range=(0, 1)) ...
Regression problem: hyperparameters are being optimized for mae
############## F E A T U R E S E L E C T I O N ####################
Removing highly correlated features among 1404 variables using pearson correlation...
Number of variables removed due to high correlation = 407
List of variables removed: ['SCN(99)_vs_SCN(69)', 'SCN(45)_vs_SCN(69)', 'SMN(11)_vs_SCN(69)', 'SMN(27)_vs_SCN(69)', 'SMN(66)_vs_SCN(69)', 'VSN(20)_vs_SCN(69)', 'VSN(8)_vs_SCN(69)', 'CON(88)_vs_SCN(69)', 'CON(67)_vs_SCN(69)', 'CON(38)_vs_SCN(69)', 'CON(83)_vs_SCN(69)', 'CBN(4)_vs_SCN(69)', 'CBN(7)_vs_SCN(69)', 'SCN(45)_vs_SCN(53)', 'ADN(21)_vs_SCN(53)', 'ADN(56)_vs_SCN(53)', 'SMN(27)_vs_SCN(53)', 'SMN(54)_vs_SCN(53)', 'SMN(66)_vs_SCN(53)', 'VSN(20)_vs_SCN(53)', 'VSN(8)_vs_SCN(53)', 'CON(88)_vs_SCN(53)', 'CON(67)_vs_SCN(53)', 'CON(38)_vs_SCN(53)', 'CBN(4)_vs_SCN(53)', 'CBN(7)_vs_SCN(53)', 'SCN(45)_vs_SCN(98)', 'SMN(27)_vs_SCN(98)', 'VSN(8)_vs_SCN(98)', 'ADN(21)_vs_SCN(99)', 'ADN(56)_vs_SCN(99)', 'SMN(3)_vs_SCN(99)', 'SMN(2)_vs_SCN(99)', 'SMN(11)_vs_SCN(99)', 'SMN(27)_vs_SCN(99)', 'SMN(72)_vs_SCN(99)', 'VSN(16)_vs_SCN(99)', 'VSN(5)_vs_SCN(99)', 'VSN(62)_vs_SCN(99)', 'VSN(15)_vs_SCN(99)', 'VSN(12)_vs_SCN(99)', 'VSN(93)_vs_SCN(99)', 'VSN(20)_vs_SCN(99)', 'VSN(8)_vs_SCN(99)', 'VSN(77)_vs_SCN(99)', 'CON(68)_vs_SCN(99)', 'CON(33)_vs_SCN(99)', 'CON(43)_vs_SCN(99)', 'CON(70)_vs_SCN(99)', 'CON(61)_vs_SCN(99)', 'CON(55)_vs_SCN(99)', 'CON(63)_vs_SCN(99)', 'CON(79)_vs_SCN(99)', 'CON(84)_vs_SCN(99)', 'CON(96)_vs_SCN(99)', 'CON(88)_vs_SCN(99)', 'CON(48)_vs_SCN(99)', 'CON(81)_vs_SCN(99)', 'CON(37)_vs_SCN(99)', 'CON(67)_vs_SCN(99)', 'CON(38)_vs_SCN(99)', 'CON(83)_vs_SCN(99)', 'DMN(32)_vs_SCN(99)', 'DMN(23)_vs_SCN(99)', 'DMN(71)_vs_SCN(99)', 'CBN(13)_vs_SCN(99)', 'CBN(18)_vs_SCN(99)', 'CBN(4)_vs_SCN(99)', 'CBN(7)_vs_SCN(99)', 'ADN(21)_vs_SCN(45)', 'ADN(56)_vs_SCN(45)', 'SMN(3)_vs_SCN(45)', 'SMN(9)_vs_SCN(45)', 'SMN(2)_vs_SCN(45)', 'SMN(11)_vs_SCN(45)', 'SMN(27)_vs_SCN(45)', 'SMN(54)_vs_SCN(45)', 'SMN(66)_vs_SCN(45)', 'VSN(16)_vs_SCN(45)', 'VSN(93)_vs_SCN(45)', 'VSN(20)_vs_SCN(45)', 'VSN(8)_vs_SCN(45)', 'CON(63)_vs_SCN(45)', 'CON(84)_vs_SCN(45)', 'CON(88)_vs_SCN(45)', 'CON(67)_vs_SCN(45)', 'CON(38)_vs_SCN(45)', 'CON(83)_vs_SCN(45)', 'DMN(40)_vs_SCN(45)', 'DMN(71)_vs_SCN(45)', 'CBN(13)_vs_SCN(45)', 'CBN(18)_vs_SCN(45)', 'CBN(4)_vs_SCN(45)', 'CBN(7)_vs_SCN(45)', 'SMN(11)_vs_ADN(21)', 'SMN(80)_vs_ADN(21)', 'VSN(62)_vs_ADN(21)', 'VSN(93)_vs_ADN(21)', 'VSN(20)_vs_ADN(21)', 'VSN(8)_vs_ADN(21)', 'VSN(77)_vs_ADN(21)', 'CON(33)_vs_ADN(21)', 'CON(84)_vs_ADN(21)', 'CON(37)_vs_ADN(21)', 'DMN(51)_vs_ADN(21)', 'DMN(94)_vs_ADN(21)', 'CBN(13)_vs_ADN(21)', 'CBN(18)_vs_ADN(21)', 'CBN(4)_vs_ADN(21)', 'CBN(7)_vs_ADN(21)', 'SMN(11)_vs_ADN(56)', 'SMN(54)_vs_ADN(56)', 'SMN(80)_vs_ADN(56)', 'VSN(16)_vs_ADN(56)', 'VSN(5)_vs_ADN(56)', 'VSN(62)_vs_ADN(56)', 'VSN(15)_vs_ADN(56)', 'VSN(12)_vs_ADN(56)', 'VSN(93)_vs_ADN(56)', 'VSN(20)_vs_ADN(56)', 'VSN(8)_vs_ADN(56)', 'VSN(77)_vs_ADN(56)', 'CON(33)_vs_ADN(56)', 'CON(70)_vs_ADN(56)', 'CON(61)_vs_ADN(56)', 'CON(55)_vs_ADN(56)', 'CON(63)_vs_ADN(56)', 'CON(84)_vs_ADN(56)', 'CON(48)_vs_ADN(56)', 'CON(37)_vs_ADN(56)', 'CON(67)_vs_ADN(56)', 'CON(38)_vs_ADN(56)', 'CON(83)_vs_ADN(56)', 'DMN(32)_vs_ADN(56)', 'DMN(40)_vs_ADN(56)', 'DMN(23)_vs_ADN(56)', 'DMN(71)_vs_ADN(56)', 'DMN(51)_vs_ADN(56)', 'CBN(13)_vs_ADN(56)', 'CBN(4)_vs_ADN(56)', 'CBN(7)_vs_ADN(56)', 'SMN(2)_vs_SMN(3)', 'SMN(11)_vs_SMN(3)', 'SMN(80)_vs_SMN(3)', 'VSN(16)_vs_SMN(3)', 'VSN(5)_vs_SMN(3)', 'VSN(15)_vs_SMN(3)', 'VSN(12)_vs_SMN(3)', 'VSN(93)_vs_SMN(3)', 'VSN(20)_vs_SMN(3)', 'VSN(8)_vs_SMN(3)', 'VSN(77)_vs_SMN(3)', 'CON(33)_vs_SMN(3)', 'CON(63)_vs_SMN(3)', 'CON(84)_vs_SMN(3)', 'CON(88)_vs_SMN(3)', 'CON(37)_vs_SMN(3)', 'DMN(51)_vs_SMN(3)', 'DMN(94)_vs_SMN(3)', 'CBN(13)_vs_SMN(3)', 'CBN(18)_vs_SMN(3)', 'CBN(4)_vs_SMN(3)', 
'CBN(7)_vs_SMN(3)', 'SMN(2)_vs_SMN(9)', 'SMN(80)_vs_SMN(9)', 'CON(33)_vs_SMN(9)', 'CON(61)_vs_SMN(9)', 'CON(63)_vs_SMN(9)', 'CON(84)_vs_SMN(9)', 'CON(67)_vs_SMN(9)', 'DMN(23)_vs_SMN(9)', 'DMN(51)_vs_SMN(9)', 'DMN(94)_vs_SMN(9)', 'SMN(11)_vs_SMN(2)', 'SMN(80)_vs_SMN(2)', 'VSN(62)_vs_SMN(2)', 'CON(33)_vs_SMN(2)', 'CON(61)_vs_SMN(2)', 'CON(63)_vs_SMN(2)', 'CON(84)_vs_SMN(2)', 'CON(88)_vs_SMN(2)', 'CON(37)_vs_SMN(2)', 'CON(67)_vs_SMN(2)', 'CON(38)_vs_SMN(2)', 'DMN(23)_vs_SMN(2)', 'DMN(71)_vs_SMN(2)', 'CBN(4)_vs_SMN(2)', 'CBN(7)_vs_SMN(2)', 'SMN(27)_vs_SMN(11)', 'SMN(54)_vs_SMN(11)', 'SMN(66)_vs_SMN(11)', 'SMN(80)_vs_SMN(11)', 'VSN(16)_vs_SMN(11)', 'VSN(5)_vs_SMN(11)', 'VSN(62)_vs_SMN(11)', 'VSN(12)_vs_SMN(11)', 'VSN(93)_vs_SMN(11)', 'VSN(20)_vs_SMN(11)', 'VSN(8)_vs_SMN(11)', 'VSN(77)_vs_SMN(11)', 'CON(68)_vs_SMN(11)', 'CON(43)_vs_SMN(11)', 'CON(70)_vs_SMN(11)', 'CON(61)_vs_SMN(11)', 'CON(55)_vs_SMN(11)', 'CON(63)_vs_SMN(11)', 'CON(79)_vs_SMN(11)', 'CON(84)_vs_SMN(11)', 'CON(96)_vs_SMN(11)', 'CON(88)_vs_SMN(11)', 'CON(48)_vs_SMN(11)', 'CON(37)_vs_SMN(11)', 'CON(38)_vs_SMN(11)', 'CON(83)_vs_SMN(11)', 'DMN(32)_vs_SMN(11)', 'DMN(40)_vs_SMN(11)', 'DMN(23)_vs_SMN(11)', 'DMN(17)_vs_SMN(11)', 'DMN(51)_vs_SMN(11)', 'DMN(94)_vs_SMN(11)', 'CBN(13)_vs_SMN(11)', 'CBN(18)_vs_SMN(11)', 'CBN(4)_vs_SMN(11)', 'CBN(7)_vs_SMN(11)', 'VSN(16)_vs_SMN(27)', 'VSN(5)_vs_SMN(27)', 'VSN(62)_vs_SMN(27)', 'VSN(15)_vs_SMN(27)', 'VSN(12)_vs_SMN(27)', 'VSN(93)_vs_SMN(27)', 'VSN(20)_vs_SMN(27)', 'VSN(8)_vs_SMN(27)', 'VSN(77)_vs_SMN(27)', 'CON(33)_vs_SMN(27)', 'CON(88)_vs_SMN(27)', 'CON(67)_vs_SMN(27)', 'CON(38)_vs_SMN(27)', 'DMN(71)_vs_SMN(27)', 'CBN(13)_vs_SMN(27)', 'CBN(18)_vs_SMN(27)', 'CBN(4)_vs_SMN(27)', 'CBN(7)_vs_SMN(27)', 'VSN(12)_vs_SMN(54)', 'VSN(20)_vs_SMN(54)', 'VSN(8)_vs_SMN(54)', 'CON(81)_vs_SMN(54)', 'DMN(94)_vs_SMN(54)', 'VSN(62)_vs_SMN(66)', 'VSN(20)_vs_SMN(66)', 'VSN(8)_vs_SMN(66)', 'CON(63)_vs_SMN(66)', 'DMN(94)_vs_SMN(66)', 'CBN(13)_vs_SMN(66)', 'CBN(7)_vs_SMN(66)', 'VSN(20)_vs_SMN(80)', 'VSN(8)_vs_SMN(80)', 'DMN(94)_vs_SMN(80)', 'CBN(7)_vs_SMN(80)', 'VSN(20)_vs_SMN(72)', 'VSN(8)_vs_SMN(72)', 'DMN(94)_vs_SMN(72)', 'CBN(7)_vs_SMN(72)', 'VSN(8)_vs_VSN(16)', 'CON(63)_vs_VSN(16)', 'CON(48)_vs_VSN(16)', 'CON(83)_vs_VSN(16)', 'DMN(94)_vs_VSN(16)', 'VSN(15)_vs_VSN(5)', 'VSN(8)_vs_VSN(5)', 'VSN(20)_vs_VSN(62)', 'VSN(8)_vs_VSN(62)', 'CON(68)_vs_VSN(62)', 'CON(79)_vs_VSN(62)', 'CON(96)_vs_VSN(62)', 'CON(81)_vs_VSN(62)', 'CON(37)_vs_VSN(62)', 'DMN(40)_vs_VSN(62)', 'DMN(23)_vs_VSN(62)', 'DMN(17)_vs_VSN(62)', 'DMN(94)_vs_VSN(62)', 'CBN(18)_vs_VSN(62)', 'CBN(7)_vs_VSN(62)', 'VSN(12)_vs_VSN(15)', 'VSN(20)_vs_VSN(15)', 'DMN(94)_vs_VSN(15)', 'VSN(8)_vs_VSN(12)', 'CON(43)_vs_VSN(12)', 'DMN(94)_vs_VSN(12)', 'CBN(13)_vs_VSN(12)', 'VSN(20)_vs_VSN(93)', 'VSN(8)_vs_VSN(93)', 'CON(83)_vs_VSN(93)', 'DMN(94)_vs_VSN(93)', 'CBN(7)_vs_VSN(93)', 'VSN(8)_vs_VSN(20)', 'VSN(77)_vs_VSN(20)', 'CON(68)_vs_VSN(20)', 'CON(33)_vs_VSN(20)', 'CON(43)_vs_VSN(20)', 'CON(70)_vs_VSN(20)', 'CON(61)_vs_VSN(20)', 'CON(55)_vs_VSN(20)', 'CON(63)_vs_VSN(20)', 'CON(79)_vs_VSN(20)', 'CON(84)_vs_VSN(20)', 'CON(96)_vs_VSN(20)', 'CON(88)_vs_VSN(20)', 'CON(48)_vs_VSN(20)', 'CON(81)_vs_VSN(20)', 'CON(37)_vs_VSN(20)', 'CON(67)_vs_VSN(20)', 'CON(38)_vs_VSN(20)', 'CON(83)_vs_VSN(20)', 'DMN(32)_vs_VSN(20)', 'DMN(40)_vs_VSN(20)', 'DMN(23)_vs_VSN(20)', 'DMN(71)_vs_VSN(20)', 'DMN(17)_vs_VSN(20)', 'DMN(51)_vs_VSN(20)', 'DMN(94)_vs_VSN(20)', 'CBN(13)_vs_VSN(20)', 'CBN(4)_vs_VSN(20)', 'CBN(7)_vs_VSN(20)', 'VSN(77)_vs_VSN(8)', 'CON(68)_vs_VSN(8)', 'CON(33)_vs_VSN(8)', 
'CON(43)_vs_VSN(8)', 'CON(70)_vs_VSN(8)', 'CON(61)_vs_VSN(8)', 'CON(55)_vs_VSN(8)', 'CON(63)_vs_VSN(8)', 'CON(79)_vs_VSN(8)', 'CON(84)_vs_VSN(8)', 'CON(96)_vs_VSN(8)', 'CON(88)_vs_VSN(8)', 'CON(48)_vs_VSN(8)', 'CON(81)_vs_VSN(8)', 'CON(37)_vs_VSN(8)', 'CON(67)_vs_VSN(8)', 'CON(38)_vs_VSN(8)', 'CON(83)_vs_VSN(8)', 'DMN(32)_vs_VSN(8)', 'DMN(40)_vs_VSN(8)', 'DMN(23)_vs_VSN(8)', 'DMN(71)_vs_VSN(8)', 'DMN(17)_vs_VSN(8)', 'DMN(51)_vs_VSN(8)', 'DMN(94)_vs_VSN(8)', 'CBN(13)_vs_VSN(8)', 'CBN(18)_vs_VSN(8)', 'CBN(4)_vs_VSN(8)', 'CBN(7)_vs_VSN(8)', 'DMN(94)_vs_VSN(77)', 'CBN(7)_vs_VSN(77)', 'DMN(94)_vs_CON(33)', 'CBN(7)_vs_CON(33)', 'CON(61)_vs_CON(43)', 'CON(63)_vs_CON(43)', 'CON(38)_vs_CON(43)', 'CBN(13)_vs_CON(43)', 'CBN(7)_vs_CON(70)', 'CON(96)_vs_CON(61)', 'DMN(94)_vs_CON(61)', 'DMN(94)_vs_CON(55)', 'CBN(7)_vs_CON(55)', 'CON(96)_vs_CON(63)', 'DMN(40)_vs_CON(63)', 'DMN(17)_vs_CON(63)', 'DMN(94)_vs_CON(63)', 'CBN(7)_vs_CON(63)', 'CON(38)_vs_CON(79)', 'DMN(94)_vs_CON(84)', 'CBN(7)_vs_CON(84)', 'CON(38)_vs_CON(96)', 'CBN(13)_vs_CON(96)', 'DMN(17)_vs_CON(88)', 'DMN(94)_vs_CON(88)', 'CBN(7)_vs_CON(88)', 'CON(38)_vs_CON(81)', 'DMN(94)_vs_CON(37)', 'CBN(7)_vs_CON(37)', 'DMN(94)_vs_CON(67)', 'DMN(40)_vs_CON(38)', 'DMN(17)_vs_CON(38)', 'DMN(94)_vs_CON(38)', 'CBN(7)_vs_CON(38)', 'DMN(71)_vs_CON(83)', 'CBN(13)_vs_CON(83)', 'CBN(18)_vs_CON(83)', 'CBN(7)_vs_CON(83)', 'DMN(17)_vs_DMN(32)', 'DMN(94)_vs_DMN(32)', 'DMN(17)_vs_DMN(40)', 'CBN(7)_vs_DMN(23)', 'CBN(7)_vs_DMN(71)', 'DMN(51)_vs_DMN(17)', 'DMN(94)_vs_DMN(17)', 'CBN(13)_vs_DMN(17)', 'DMN(94)_vs_DMN(51)', 'CBN(7)_vs_DMN(51)', 'CBN(13)_vs_DMN(94)', 'CBN(18)_vs_DMN(94)', 'CBN(7)_vs_DMN(94)', 'CBN(4)_vs_CBN(13)', 'CBN(7)_vs_CBN(13)', 'CBN(4)_vs_CBN(18)', 'CBN(7)_vs_CBN(4)']

############# PROCESSING T A R G E T = age ##########################
No categorical feature reduction done. All 0 Categorical vars selected
############## F E A T U R E S E L E C T I O N ####################
Removing highly correlated features among 997 variables using pearson correlation...
Number of variables removed due to high correlation = 176
List of variables removed: ['ADN(21)_vs_SCN(69)', 'ADN(56)_vs_SCN(69)', 'SMN(3)_vs_SCN(69)', 'SMN(9)_vs_SCN(69)', 'SMN(2)_vs_SCN(69)', 'VSN(16)_vs_SCN(69)', 'VSN(5)_vs_SCN(69)', 'VSN(15)_vs_SCN(69)', 'VSN(12)_vs_SCN(69)', 'CBN(18)_vs_SCN(69)', 'SMN(2)_vs_SCN(53)', 'SMN(11)_vs_SCN(53)', 'VSN(16)_vs_SCN(53)', 'VSN(5)_vs_SCN(53)', 'VSN(15)_vs_SCN(53)', 'VSN(12)_vs_SCN(53)', 'CBN(13)_vs_SCN(53)', 'CBN(18)_vs_SCN(53)', 'ADN(21)_vs_SCN(98)', 'ADN(56)_vs_SCN(98)', 'SMN(3)_vs_SCN(98)', 'SMN(2)_vs_SCN(98)', 'SMN(11)_vs_SCN(98)', 'SMN(54)_vs_SCN(98)', 'SMN(54)_vs_SCN(99)', 'SMN(66)_vs_SCN(98)', 'SMN(66)_vs_SCN(99)', 'VSN(16)_vs_SCN(98)', 'VSN(15)_vs_SCN(98)', 'VSN(12)_vs_SCN(98)', 'VSN(20)_vs_SCN(98)', 'CBN(13)_vs_SCN(98)', 'CBN(18)_vs_SCN(98)', 'CBN(7)_vs_SCN(98)', 'SMN(9)_vs_SCN(99)', 'VSN(15)_vs_SCN(45)', 'VSN(12)_vs_SCN(45)', 'VSN(16)_vs_ADN(21)', 'VSN(5)_vs_ADN(21)', 'VSN(15)_vs_ADN(21)', 'VSN(12)_vs_ADN(21)', 'SMN(27)_vs_SMN(3)', 'CON(67)_vs_SMN(11)', 'DMN(71)_vs_SMN(3)', 'DMN(71)_vs_SMN(11)', 'VSN(62)_vs_SMN(9)', 'CON(38)_vs_SMN(9)', 'DMN(71)_vs_SMN(9)', 'CBN(4)_vs_SMN(9)', 'CBN(7)_vs_SMN(9)', 'VSN(93)_vs_SMN(2)', 'VSN(20)_vs_SMN(2)', 'VSN(8)_vs_SMN(2)', 'VSN(77)_vs_SMN(2)', 'CBN(13)_vs_SMN(2)', 'VSN(15)_vs_SMN(11)', 'VSN(16)_vs_SMN(54)', 'VSN(5)_vs_SMN(54)', 'VSN(15)_vs_SMN(54)', 'VSN(5)_vs_SMN(66)', 'VSN(15)_vs_SMN(66)', 'VSN(12)_vs_SMN(66)', 'VSN(16)_vs_SMN(80)', 'VSN(5)_vs_SMN(80)', 'VSN(15)_vs_SMN(80)', 'VSN(12)_vs_SMN(80)', 'VSN(16)_vs_SMN(72)', 'VSN(5)_vs_SMN(72)', 'VSN(15)_vs_SMN(72)', 'VSN(12)_vs_SMN(72)', 'VSN(5)_vs_VSN(16)', 'VSN(62)_vs_VSN(16)', 'VSN(15)_vs_VSN(16)', 'VSN(12)_vs_VSN(16)', 'VSN(93)_vs_VSN(16)', 'VSN(20)_vs_VSN(16)', 'VSN(77)_vs_VSN(16)', 'CON(33)_vs_VSN(16)', 'CON(43)_vs_VSN(16)', 'CON(61)_vs_VSN(16)', 'CON(55)_vs_VSN(16)', 'CON(79)_vs_VSN(16)', 'CON(84)_vs_VSN(16)', 'CON(88)_vs_VSN(16)', 'CON(81)_vs_VSN(16)', 'CON(37)_vs_VSN(16)', 'CON(67)_vs_VSN(16)', 'CON(38)_vs_VSN(16)', 'DMN(71)_vs_VSN(16)', 'DMN(17)_vs_VSN(16)', 'DMN(51)_vs_VSN(16)', 'CBN(13)_vs_VSN(16)', 'CBN(4)_vs_VSN(16)', 'CBN(7)_vs_VSN(16)', 'VSN(62)_vs_VSN(5)', 'VSN(93)_vs_VSN(5)', 'VSN(77)_vs_VSN(5)', 'CON(68)_vs_VSN(5)', 'CON(33)_vs_VSN(5)', 'CON(43)_vs_VSN(5)', 'CON(61)_vs_VSN(5)', 'CON(55)_vs_VSN(5)', 'CON(79)_vs_VSN(5)', 'CON(84)_vs_VSN(5)', 'CON(88)_vs_VSN(5)', 'CON(81)_vs_VSN(5)', 'CON(37)_vs_VSN(5)', 'CON(67)_vs_VSN(5)', 'CON(38)_vs_VSN(5)', 'CON(83)_vs_VSN(5)', 'DMN(71)_vs_VSN(5)', 'DMN(17)_vs_VSN(5)', 'DMN(51)_vs_VSN(5)', 'DMN(94)_vs_VSN(5)', 'CBN(13)_vs_VSN(5)', 'CBN(4)_vs_VSN(5)', 'CBN(7)_vs_VSN(5)', 'VSN(15)_vs_VSN(62)', 'VSN(12)_vs_VSN(62)', 'VSN(93)_vs_VSN(15)', 'VSN(77)_vs_VSN(15)', 'CON(68)_vs_VSN(15)', 'CON(33)_vs_VSN(15)', 'CON(43)_vs_VSN(15)', 'CON(70)_vs_VSN(15)', 'CON(61)_vs_VSN(15)', 'CON(55)_vs_VSN(15)', 'CON(63)_vs_VSN(15)', 'CON(79)_vs_VSN(15)', 'CON(84)_vs_VSN(15)', 'CON(96)_vs_VSN(15)', 'CON(88)_vs_VSN(15)', 'CON(48)_vs_VSN(15)', 'CON(81)_vs_VSN(15)', 'CON(37)_vs_VSN(15)', 'CON(67)_vs_VSN(15)', 'CON(38)_vs_VSN(15)', 'CON(83)_vs_VSN(15)', 'DMN(32)_vs_VSN(15)', 'DMN(40)_vs_VSN(15)', 'DMN(23)_vs_VSN(15)', 'DMN(71)_vs_VSN(15)', 'DMN(17)_vs_VSN(15)', 'DMN(51)_vs_VSN(15)', 'CBN(13)_vs_VSN(15)', 'CBN(18)_vs_VSN(15)', 'CBN(4)_vs_VSN(15)', 'CBN(7)_vs_VSN(15)', 'VSN(93)_vs_VSN(12)', 'VSN(77)_vs_VSN(12)', 'CON(68)_vs_VSN(12)', 'CON(33)_vs_VSN(12)', 'CON(70)_vs_VSN(12)', 'CON(61)_vs_VSN(12)', 'CON(55)_vs_VSN(12)', 'CON(63)_vs_VSN(12)', 'CON(79)_vs_VSN(12)', 'CON(84)_vs_VSN(12)', 'CON(96)_vs_VSN(12)', 'CON(88)_vs_VSN(12)', 'CON(48)_vs_VSN(12)', 'CON(81)_vs_VSN(12)', 'CON(37)_vs_VSN(12)', 
'CON(67)_vs_VSN(12)', 'CON(38)_vs_VSN(12)', 'CON(83)_vs_VSN(12)', 'DMN(32)_vs_VSN(12)', 'DMN(40)_vs_VSN(12)', 'DMN(23)_vs_VSN(12)', 'DMN(71)_vs_VSN(12)', 'DMN(17)_vs_VSN(12)', 'DMN(51)_vs_VSN(12)', 'CBN(18)_vs_VSN(12)', 'CBN(4)_vs_VSN(12)', 'CBN(7)_vs_VSN(12)', 'CBN(18)_vs_VSN(20)']
Adding 0 categorical variables to reduced numeric variables of 821
############## F E A T U R E S E L E C T I O N ####################
Current number of predictors = 821
Finding Important Features using Boosted Trees algorithm...
using 821 variables...
[11:18:07] WARNING: C:/Jenkins/workspace/xgboost-win64_release_0.90/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
using 657 variables...
[11:18:09] WARNING: C:/Jenkins/workspace/xgboost-win64_release_0.90/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
using 493 variables...
[11:18:10] WARNING: C:/Jenkins/workspace/xgboost-win64_release_0.90/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
using 329 variables...
[11:18:12] WARNING: C:/Jenkins/workspace/xgboost-win64_release_0.90/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
using 165 variables...
[11:18:12] WARNING: C:/Jenkins/workspace/xgboost-win64_release_0.90/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
using 1 variables...
[11:18:13] WARNING: C:/Jenkins/workspace/xgboost-win64_release_0.90/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
Found 73 important features
Starting Feature Engineering now...
No Entropy Binning specified or there are no numeric vars in data set to Bin
############### M O D E L B U I L D I N G ####################
Rows in Train data set = 4347
Features in Train data set = 73
Rows in held-out data set = 1087

UnboundLocalError Traceback (most recent call last)
in
----> 1 model_age_trends, features_trends, trainm_trends, testm_trends = Auto_ViML(df, ['age','domain1_var1','domain1_var2','domain2_var1','domain2_var2'], test_df, verbose=2, scoring_parameter='r2')

C:\ProgramData\Anaconda3\envs\virtEnv\lib\site-packages\autoviml\Auto_ViML.py in Auto_ViML(train, target, test, sample_submission, hyper_param, feature_reduction, scoring_parameter, Boosting_Flag, KMeans_Featurizer, Add_Poly, Stacking_Flag, Binning_Flag, Imbalanced_Flag, verbose)
1109 'alpha': np.logspace(-5,3),
1110 },
-> 1111 "XGBoost": {
1112 'learning_rate': sp.stats.uniform(scale=1),
1113 'gamma': sp.stats.randint(0, 32),

UnboundLocalError: local variable 'sp' referenced before assignment

How are you handling preprocessing steps during prediction?

Hi @rsesha

Currently the Auto_ViML function returns the best model (XGB), the features (array), train metrics and test metrics. But how do you suggest handling the preprocessing on the prediction dataset?

For example, suppose you apply LabelEncoding to a column inside the Auto_ViML function during training.

KeyError: "['index'] not in index" while saving results

Hello Team, I ran a simple regression model with the following:

model, features, trainm, testm = Auto_ViML(
    train2.reset_index(),
    "Total_Effort",
    x_test,
    "",
    hyper_param="GS",
    feature_reduction=True,
    scoring_parameter="weighted-f1",
    KMeans_Featurizer=True,
    Boosting_Flag=False,
    Binning_Flag=False,
    Add_Poly=False,
    Stacking_Flag=False,
    Imbalanced_Flag=False,
    verbose=2,
)

The error stack:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-41-58032da7fcff> in <module>
     13     Stacking_Flag=False,
     14     Imbalanced_Flag=False,
---> 15     verbose=2,
     16 )

~\Anaconda3\lib\site-packages\autoviml\Auto_ViML.py in Auto_ViML(train, target, test, sample_submission, hyper_param, feature_reduction, scoring_parameter, Boosting_Flag, KMeans_Featurizer, Add_Poly, Stacking_Flag, Binning_Flag, Imbalanced_Flag, verbose)
   2587         #############################################################################################
   2588         if isinstance(sample_submission, str):
-> 2589             sample_submission = testm[id_cols+[each_target+'_predictions']]
   2590         try:
   2591             write_file_to_folder(sample_submission, each_target, each_target+'_'+modeltype+'_'+'submission.csv')

~\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2999             if is_iterator(key):
   3000                 key = list(key)
-> 3001             indexer = self.loc._convert_to_indexer(key, axis=1, raise_missing=True)
   3002 
   3003         # take() does not accept boolean indexers

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _convert_to_indexer(self, obj, axis, is_setter, raise_missing)
   1283                 # When setting, missing keys are not allowed, even with .loc:
   1284                 kwargs = {"raise_missing": True if is_setter else raise_missing}
-> 1285                 return self._get_listlike_indexer(obj, axis, **kwargs)[1]
   1286         else:
   1287             try:

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
   1090 
   1091         self._validate_read_indexer(
-> 1092             keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
   1093         )
   1094         return keyarr, indexer

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1183             if not (self.name == "loc" and not raise_missing):
   1184                 not_found = list(set(key) - set(ax))
-> 1185                 raise KeyError("{} not in index".format(not_found))
   1186 
   1187             # we skip the warning on Categorical/Interval

KeyError: "['index'] not in index"

Environment:

  • Windows 10
  • Python Version: 3.7.6
  • AutoViz_Class version: 0.0.68

I didn't get time to look inside the module, but it looks like this has something to do with a file lock, as the run saves the file Total_Effort_Regression_test_modified.csv, which is locked by the process.

Problem with Auto_NLP: TypeError np.matrix is not supported

Hello,

I am trying to use the Auto_NLP module of AutoViML, and I am facing an issue I don't know how to solve. Here are the files I am using to reproduce the problem.

  • requirements.txt:
pandas==1.5.2
autoviml==0.1.710
lightgbm==3.3.4
lxml==4.9.2 
  • train.csv: a dataset of tweets with sentiment labels that I got from an online paper about Auto_NLP. To speed up the test, you can keep only the first 500 lines of the file

  • test_autonlp.ipynb: Jupyter notebook content, reproduced next:

import pandas as pd
from sklearn.model_selection import train_test_split
from autoviml.Auto_NLP import Auto_NLP

data=pd.read_csv('train.csv')

train,test = train_test_split(data, test_size=0.2)

input_feature, target = "SentimentText", "Sentiment"

train["Sentiment"] = pd.to_numeric(train["Sentiment"])
test["Sentiment"] = pd.to_numeric(test["Sentiment"])

train_x, test_x, final, predicted=Auto_NLP(input_feature,train,test,target,score_type="balanced_accuracy",modeltype="classification",verbose=0,build_model=True)

I am getting the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 13
      9 input_feature, target = "SentimentText", "Sentiment"
     11 train["Sentiment"] = pd.to_numeric(train["Sentiment"])
---> 13 train_x, test_x, final, predicted=Auto_NLP(input_feature,train,test,target,score_type="balanced_accuracy",modeltype="classification",verbose=0,build_model=True)

File ~/.pyenv/versions/3.10.9/envs/testautonlp/lib/python3.10/site-packages/autoviml/Auto_NLP.py:1588, in Auto_NLP(nlp_column, train, test, target, score_type, modeltype, top_num_features, verbose, build_model)
   1582     calib_pipe = Pipeline([
   1583         ('tfidfvectorizer', best_vect),
   1584         ('convert_dense', FunctionTransformer(lambda x: x.todense(), accept_sparse=True)),
   1585         ('best_features', best_sel),
   1586          ])
   1587     best_model = CalibratedClassifierCV(best_estimator,cv=3, method='isotonic')
-> 1588     best_model.fit(calib_pipe.transform(X_train), y_train)
   1589     y_pred = best_model.predict(calib_pipe.transform(X_test))
   1590 else:

File ~/.pyenv/versions/3.10.9/envs/testautonlp/lib/python3.10/site-packages/sklearn/pipeline.py:659, in Pipeline.transform(self, X)
    657 Xt = X
    658 for _, _, transform in self._iter():
--> 659     Xt = transform.transform(Xt)
    660 return Xt

File ~/.pyenv/versions/3.10.9/envs/testautonlp/lib/python3.10/site-packages/sklearn/utils/_set_output.py:142, in _wrap_method_output.<locals>.wrapped(self, X, *args, **kwargs)
...
    743 xp, is_array_api = get_namespace(array)
    745 # store reference to original array to check if copy is needed when
    746 # function returns

TypeError: np.matrix is not supported. Please convert to a numpy array with np.asarray. For more information see: https://numpy.org/doc/stable/reference/generated/numpy.matrix.html

Attachment: [train.csv](https://github.com/AutoViML/Auto_ViML/files/10404796/train.csv)

I am using Ubuntu with pyenv/virtualenv. I have tried Python 3.10 and 3.9, and also anaconda3-2022.10, but the error persists. Could it be something wrong in my environment setup? Or is it a bug?

How to let AutoViML know which column is ID, and also identify sub-class?

Maybe I should split this issue into two?
Thanks to the contributors and previous issues, I finally made the program run to the end, but it seems that AutoViML didn't identify the column that marks different samples. I used 'Identifier' for that column; should I change it to another column name, or is there another method to pass this info to AutoViML?

Second question: I have two main classes, negative and positive. For positive, I want to split it into more classes based on how strong the positive feature is. For now, I assign 0 to negative, 1 to the basic level of positive, and +0.2 for each level higher, up to 2.
And AutoViML took them as 7 totally independent groups.
What should I do to let AutoViML know that the values >1 form one large class on a par with the '0' group, and that 1.2, 1.4, 1.6, ... are sub-groups? Another classification model?
BTW, if I solve the 1st problem, i.e. AutoViML identifies different samples/IDs properly, then 'shuffling the data set' will treat one sample as a whole for shuffling, right?

How should I stack the data of different samples into one single dataframe?

Hello! My background is biology, so I'm a beginner in this field and somewhat confused about putting the whole training dataset into a single dataframe variable. I'll first give a general introduction of what we want to do, then show two ways to stack the data based on my guesses; please tell me which is the right way, and please correct anything I misunderstand.

In our projects, we have accumulated long-term records of many samples, and we hypothesize that several parameters can reflect the chance of an 'event' of interest. Most likely, the parameters in the 2 months before the 'event' are useful for indicating it, with the ~20 days before carrying a higher weight. But we don't know the best model to fit these parameters, so I think AutoML is the most suitable approach to find out, right?
OK, now to the data: Table 1 is an example of how our data look. I shortened the total time length and set the 'prediction period' to 5 days so the tables won't be too long; this applies to all example tables in this issue.
Table 1 Example of data from one sample

| Date | parameter1 | parameter2 | … | parameter8 | EventDetection |
|----------|----|----|----|----|-----|
| 20221001 | xx | xx | xx | xx | no |
| 20221002 | xx | xx | xx | xx | no |
| 20221003 | xx | xx | xx | xx | no |
| 20221004 | xx | xx | xx | xx | no |
| 20221005 | xx | xx | xx | xx | no |
| 20221006 | xx | xx | xx | xx | no |
| 20221007 | xx | xx | xx | xx | yes |
| 20221008 | xx | xx | xx | xx | no |
| 20221009 | xx | xx | xx | xx | no |
| 20221010 | xx | xx | xx | xx | no |
| 20221011 | xx | xx | xx | xx | yes |
| 20221012 | xx | xx | xx | xx | yes |
| 20221013 | xx | xx | xx | xx | yes |
| 20221014 | xx | xx | xx | xx | no |

I have data from many samples, so how should I put them together to let AutoViML know that these 10 entries are from sample 1 and the next 10 are from sample 2? Should I just concatenate them and add a column labeling which sample they come from, as in Table 2? Or should I split the records from each sample by the prediction period, as in Table 3 (1_1: sample 1 from day 1 to day 5; 1_2: sample 1 from day 2 to day 6; 1_3: sample 1 from day 3 to day 7; ...; sample 100 from day 2 to day 6; sample 100 from day 3 to day 7; ...)?

Table 2 Data from samples in tandem

| Sample# | Date | parameters | EventDetection |
|---------|----------|------------|----------------|
| 1 | 20221001 | xx | no |
| 1 | 20221002 | xx | no |
| 1 | 20221003 | xx | no |
| 1 | 20221004 | xx | no |
| 1 | 20221005 | xx | no |
| 1 | 20221006 | xx | no |
| 1 | 20221007 | xx | yes |
| 1 | 20221008 | xx | no |
| 1 | 20221009 | xx | no |
| 1 | 20221010 | xx | no |
| 1 | 20221011 | xx | yes |
| 1 | 20221012 | xx | yes |
| 1 | 20221013 | xx | yes |
| 1 | 20221014 | xx | no |
| 2 | 20221001 | xx | no |
| 2 | 20221002 | xx | no |
| 2 | 20221003 | xx | yes |
| 2 | 20221004 | xx | no |
| 2 | 20221005 | xx | no |
| 2 | 20221006 | xx | no |
| 2 | 20221007 | xx | no |
| 2 | 20221008 | xx | no |
| 2 | 20221009 | xx | no |
| 2 | 20221010 | xx | no |
| 2 | 20221011 | xx | yes |
| 2 | 20221012 | xx | no |
| 2 | 20221013 | xx | yes |
| 2 | 20221014 | xx | no |
| 3 | 20221001 | xx | no |
| 3 | 20221002 | xx | no |
| 3 | 20221003 | xx | yes |
| 3 | 20221004 | xx | yes |
| 3 | 20221005 | xx | yes |
| 3 | 20221006 | xx | no |
| 3 | 20221007 | xx | no |
| 3 | 20221008 | xx | no |
| 3 | 20221009 | xx | yes |
| 3 | 20221010 | xx | no |
| 3 | 20221011 | xx | no |
| 3 | 20221012 | xx | no |
| 3 | 20221013 | xx | yes |
| 3 | 20221014 | xx | no |
| … | … | … | … |

Table 3 Split one experiment sample's records into periods as independent samples for training

| Sample# | Date | parameters | EventDetection |
|---------|----------|------------|----------------|
| 1_1 | 20221001 | xx | no |
| 1_1 | 20221002 | xx | no |
| 1_1 | 20221003 | xx | no |
| 1_1 | 20221004 | xx | no |
| 1_1 | 20221005 | xx | no |
| 1_2 | 20221002 | xx | no |
| 1_2 | 20221003 | xx | no |
| 1_2 | 20221004 | xx | no |
| 1_2 | 20221005 | xx | no |
| 1_2 | 20221006 | xx | no |
| 1_3 | 20221003 | xx | no |
| 1_3 | 20221004 | xx | no |
| 1_3 | 20221005 | xx | no |
| 1_3 | 20221006 | xx | no |
| 1_3 | 20221007 | xx | yes |
| 1_4 | 20221004 | xx | no |
| 1_4 | 20221005 | xx | no |
| 1_4 | 20221006 | xx | no |
| 1_4 | 20221007 | xx | yes |
| 1_4 | 20221008 | xx | no |
| … | … | … | … |
| 1_9 | 20221009 | xx | no |
| 1_9 | 20221010 | xx | no |
| 1_9 | 20221011 | xx | yes |
| 1_9 | 20221012 | xx | yes |
| 1_9 | 20221013 | xx | yes |
| … | … | … | … |
| 3_10 | 20221010 | xx | no |
| 3_10 | 20221011 | xx | no |
| 3_10 | 20221012 | xx | no |
| 3_10 | 20221013 | xx | yes |
| 3_10 | 20221014 | xx | no |

Any help would be appreciated, thanks in advance!

Running Auto_ViML with hyper_param='HO' throws an exception

I'm testing Auto_ViML with hyperopt on the Titanic dataset, running it like this:

```python
model, features, trainm, testm = Auto_ViML(
        train, target, test,
        verbose=0,
        hyper_param='HO',
)
```

This throws the following exception:

...
############### M O D E L B U I L D I N G B E G I N S ####################
Rows in Train data set = 640
Features in Train data set = 10
Rows in held-out data set = 161
Finding Best Model and Hyper Parameters for Target: Survived...
Baseline Accuracy Needed for Model = 62.17%
CPU Count = 8 in this device
Using Linear Model, Estimated Training time = 0.02 mins
Error: Not able to print validation metrics. Continuing...
Actual training time (in seconds): 0
########### S I N G L E M O D E L R E S U L T S #################

UnboundLocalError Traceback (most recent call last)
/tmp/core/run_auto-viml.py in
42 train, target, test,
43 verbose=0,
---> 44 hyper_param='HO',
45 )
46

/usr/local/lib/python3.7/site-packages/autoviml/Auto_ViML.py in Auto_ViML(train, target, test, sample_submission, hyper_param, feature_reduction, scoring_parameter, Boosting_Flag, KMeans_Featurizer, Add_Poly, Stacking_Flag, Binning_Flag, Imbalanced_Flag, verbose)
1725 ############## This is for Classification Only !! ########################
1726 if scoring_parameter in ['logloss','neg_log_loss','log_loss','log-loss','']:
-> 1727 print('{}-fold Cross Validation {} = {}'.format(n_splits, 'logloss', best_score))
1728 elif scoring_parameter in ['accuracy','balanced-accuracy','balanced_accuracy','roc_auc','roc-auc',
1729 'f1','precision','recall','average-precision','average_precision',

UnboundLocalError: local variable 'best_score' referenced before assignment


Specifying the evaluation metric doesn't fix the issue.

I'm using `autoviml==0.1.651`.

[bug] Auto_ViML doesn't work on Google Colab

Hi,
AutoViML looks very interesting, but when I tried to install and run it on Google Colab I encountered issues.
A fails at import; B fails at installation.

A)

!pip install autoviml --upgrade --ignore-installed
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting autoviml
  Downloading autoviml-0.1.710-py3-none-any.whl (133 kB)
Collecting xgboost>=1.1.1
  Downloading xgboost-1.7.4-py3-none-manylinux2014_x86_64.whl (193.6 MB)
Collecting jupyter
  Downloading jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
Collecting vaderSentiment
  Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl (125 kB)
Collecting imbalanced-learn>=0.7
  Downloading imbalanced_learn-0.10.1-py3-none-any.whl (226 kB)
Collecting regex
  Downloading regex-2022.10.31-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (772 kB)
Collecting matplotlib
  Downloading matplotlib-3.7.1-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (9.2 MB)
Collecting xlrd
  Downloading xlrd-2.0.1-py2.py3-none-any.whl (96 kB)
Collecting nltk
  Downloading nltk-3.8.1-py3-none-any.whl (1.5 MB)
Collecting scikit-learn>=0.23.1
  Downloading scikit_learn-1.2.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.8 MB)
Collecting beautifulsoup4
  Downloading beautifulsoup4-4.11.2-py3-none-any.whl (129 kB)
Collecting seaborn
  Downloading seaborn-0.12.2-py3-none-any.whl (293 kB)
Collecting catboost
  Downloading catboost-1.1.1-cp38-none-manylinux1_x86_64.whl (76.6 MB)
Collecting shap>=0.36.0
  Downloading shap-0.41.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (575 kB)
Collecting emoji
  Downloading emoji-2.2.0.tar.gz (240 kB)
  Preparing metadata (setup.py) ... done
Collecting imbalanced-ensemble>=0.1.7
  Downloading imbalanced_ensemble-0.2.0-py2.py3-none-any.whl (746 kB)
Collecting textblob
  Downloading textblob-0.17.1-py2.py3-none-any.whl (636 kB)
Collecting pandas
  Downloading pandas-1.5.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)
Collecting ipython
  Downloading ipython-8.11.0-py3-none-any.whl (793 kB)
Collecting joblib>=0.11
  Downloading joblib-1.2.0-py3-none-any.whl (297 kB)
Collecting numpy>=1.16.0
  Downloading numpy-1.24.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
Collecting scipy>=1.9.1
  Downloading scipy-1.10.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.5 MB)
Collecting tqdm>=4.50.2
  Downloading tqdm-4.65.0-py3-none-any.whl (77 kB)
Collecting threadpoolctl>=2.0.0
  Downloading threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
Collecting fonttools>=4.22.0
  Downloading fonttools-4.38.0-py3-none-any.whl (965 kB)
Collecting pillow>=6.2.0
  Downloading Pillow-9.4.0-cp38-cp38-manylinux_2_28_x86_64.whl (3.4 MB)
Collecting pyparsing>=2.3.1
  Downloading pyparsing-3.0.9-py3-none-any.whl (98 kB)
Collecting contourpy>=1.0.1
  Downloading contourpy-1.0.7-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (300 kB)
Collecting kiwisolver>=1.0.1
  Downloading kiwisolver-1.4.4-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.2 MB)
Collecting importlib-resources>=3.2.0
  Downloading importlib_resources-5.12.0-py3-none-any.whl (36 kB)
Collecting cycler>=0.10
  Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)
Collecting python-dateutil>=2.7
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting packaging>=20.0
  Downloading packaging-23.0-py3-none-any.whl (42 kB)
Collecting pytz>=2020.1
  Downloading pytz-2022.7.1-py2.py3-none-any.whl (499 kB)
Collecting slicer==0.0.7
  Downloading slicer-0.0.7-py3-none-any.whl (14 kB)
Collecting cloudpickle
  Downloading cloudpickle-2.2.1-py3-none-any.whl (25 kB)
Collecting numba
  Downloading numba-0.56.4-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.5 MB)
Collecting soupsieve>1.2
  Downloading soupsieve-2.4-py3-none-any.whl (37 kB)
Collecting six
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting plotly
  Downloading plotly-5.13.1-py2.py3-none-any.whl (15.2 MB)
Collecting graphviz
  Downloading graphviz-0.20.1-py3-none-any.whl (47 kB)
Collecting jedi>=0.16
  Downloading jedi-0.18.2-py2.py3-none-any.whl (1.6 MB)
Collecting decorator
  Downloading decorator-5.1.1-py3-none-any.whl (9.1 kB)
Collecting pexpect>4.3
  Downloading pexpect-4.8.0-py2.py3-none-any.whl (59 kB)
Collecting matplotlib-inline
  Downloading matplotlib_inline-0.1.6-py3-none-any.whl (9.4 kB)
Collecting pickleshare
  Downloading pickleshare-0.7.5-py2.py3-none-any.whl (6.9 kB)
Collecting pygments>=2.4.0
  Downloading Pygments-2.14.0-py3-none-any.whl (1.1 MB)
Collecting prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30
  Downloading prompt_toolkit-3.0.38-py3-none-any.whl (385 kB)
Collecting stack-data
  Downloading stack_data-0.6.2-py3-none-any.whl (24 kB)
Collecting backcall
  Downloading backcall-0.2.0-py2.py3-none-any.whl (11 kB)
Collecting traitlets>=5
  Downloading traitlets-5.9.0-py3-none-any.whl (117 kB)
Collecting qtconsole
  Downloading qtconsole-5.4.0-py3-none-any.whl (121 kB)
Collecting ipykernel
  Downloading ipykernel-6.21.2-py3-none-any.whl (149 kB)
Collecting jupyter-console
  Downloading jupyter_console-6.6.2-py3-none-any.whl (24 kB)
Collecting ipywidgets
  Downloading ipywidgets-8.0.4-py3-none-any.whl (137 kB)
Collecting nbconvert
  Downloading nbconvert-7.2.9-py3-none-any.whl (274 kB)
Collecting notebook
  Downloading notebook-6.5.2-py3-none-any.whl (439 kB)
Collecting click
  Downloading click-8.1.3-py3-none-any.whl (96 kB)
Collecting requests
  Downloading requests-2.28.2-py3-none-any.whl (62 kB)
Collecting zipp>=3.1.0
  Downloading zipp-3.15.0-py3-none-any.whl (6.8 kB)
Collecting parso<0.9.0,>=0.8.0
  Downloading parso-0.8.3-py2.py3-none-any.whl (100 kB)
Collecting ptyprocess>=0.5
  Downloading ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)
Collecting wcwidth
  Downloading wcwidth-0.2.6-py2.py3-none-any.whl (29 kB)
Collecting nest-asyncio
  Downloading nest_asyncio-1.5.6-py3-none-any.whl (5.2 kB)
Collecting pyzmq>=20
  Downloading pyzmq-25.0.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)
Collecting jupyter-client>=6.1.12
  Downloading jupyter_client-8.0.3-py3-none-any.whl (102 kB)
Collecting tornado>=6.1
  Downloading tornado-6.2-cp37-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (423 kB)
Collecting psutil
  Downloading psutil-5.9.4-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (280 kB)
Collecting comm>=0.1.1
  Downloading comm-0.1.2-py3-none-any.whl (6.5 kB)
Collecting debugpy>=1.6.5
  Downloading debugpy-1.6.6-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
Collecting jupyter-core!=5.0.*,>=4.12
  Downloading jupyter_core-5.2.0-py3-none-any.whl (94 kB)
Collecting widgetsnbextension~=4.0
  Downloading widgetsnbextension-4.0.5-py3-none-any.whl (2.0 MB)
Collecting jupyterlab-widgets~=3.0
  Downloading jupyterlab_widgets-3.0.5-py3-none-any.whl (384 kB)
Collecting jinja2>=3.0
  Downloading Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting markupsafe>=2.0
  Downloading MarkupSafe-2.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Collecting nbformat>=5.1
  Downloading nbformat-5.7.3-py3-none-any.whl (78 kB)
Collecting pandocfilters>=1.4.1
  Downloading pandocfilters-1.5.0-py2.py3-none-any.whl (8.7 kB)
Collecting importlib-metadata>=3.6
  Downloading importlib_metadata-6.0.0-py3-none-any.whl (21 kB)
Collecting mistune<3,>=2.0.3
  Downloading mistune-2.0.5-py2.py3-none-any.whl (24 kB)
Collecting nbclient>=0.5.0
  Downloading nbclient-0.7.2-py3-none-any.whl (71 kB)
Collecting bleach
  Downloading bleach-6.0.0-py3-none-any.whl (162 kB)
Collecting jupyterlab-pygments
  Downloading jupyterlab_pygments-0.2.2-py2.py3-none-any.whl (21 kB)
Collecting defusedxml
  Downloading defusedxml-0.7.1-py2.py3-none-any.whl (25 kB)
Collecting tinycss2
  Downloading tinycss2-1.2.1-py3-none-any.whl (21 kB)
Collecting prometheus-client
  Downloading prometheus_client-0.16.0-py3-none-any.whl (122 kB)
Collecting ipython-genutils
  Downloading ipython_genutils-0.2.0-py2.py3-none-any.whl (26 kB)
Collecting terminado>=0.8.3
  Downloading terminado-0.17.1-py3-none-any.whl (17 kB)
Collecting nbclassic>=0.4.7
  Downloading nbclassic-0.5.2-py3-none-any.whl (10.0 MB)
Collecting Send2Trash>=1.8.0
  Downloading Send2Trash-1.8.0-py3-none-any.whl (18 kB)
Collecting argon2-cffi
  Downloading argon2_cffi-21.3.0-py3-none-any.whl (14 kB)
Collecting setuptools
  Downloading setuptools-67.4.0-py3-none-any.whl (1.1 MB)
Collecting numpy>=1.16.0
  Downloading numpy-1.23.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
Collecting llvmlite<0.40,>=0.39.0dev0
  Downloading llvmlite-0.39.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.6 MB)
Collecting tenacity>=6.2.0
  Downloading tenacity-8.2.2-py3-none-any.whl (24 kB)
Collecting qtpy>=2.0.1
  Downloading QtPy-2.3.0-py3-none-any.whl (83 kB)
Collecting urllib3<1.27,>=1.21.1
  Downloading urllib3-1.26.14-py2.py3-none-any.whl (140 kB)
Collecting certifi>=2017.4.17
  Downloading certifi-2022.12.7-py3-none-any.whl (155 kB)
Collecting idna<4,>=2.5
  Downloading idna-3.4-py3-none-any.whl (61 kB)
Collecting charset-normalizer<4,>=2
  Downloading charset_normalizer-3.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (195 kB)
Collecting executing>=1.2.0
  Downloading executing-1.2.0-py2.py3-none-any.whl (24 kB)
Collecting asttokens>=2.1.0
  Downloading asttokens-2.2.1-py2.py3-none-any.whl (26 kB)
Collecting pure-eval
  Downloading pure_eval-0.2.2-py3-none-any.whl (11 kB)
Collecting platformdirs>=2.5
  Downloading platformdirs-3.1.0-py3-none-any.whl (14 kB)
Collecting notebook-shim>=0.1.0
  Downloading notebook_shim-0.2.2-py3-none-any.whl (13 kB)
Collecting jupyter-server>=1.8
  Downloading jupyter_server-2.3.0-py3-none-any.whl (365 kB)
Collecting fastjsonschema
  Downloading fastjsonschema-2.16.3-py3-none-any.whl (23 kB)
Collecting jsonschema>=2.6
  Downloading jsonschema-4.17.3-py3-none-any.whl (90 kB)
Collecting argon2-cffi-bindings
  Downloading argon2_cffi_bindings-21.2.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (86 kB)
Collecting webencodings
  Downloading webencodings-0.5.1-py2.py3-none-any.whl (11 kB)
Collecting attrs>=17.4.0
  Downloading attrs-22.2.0-py3-none-any.whl (60 kB)
Collecting pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0
  Downloading pyrsistent-0.19.3-py3-none-any.whl (57 kB)
Collecting pkgutil-resolve-name>=1.3.10
  Downloading pkgutil_resolve_name-1.3.10-py3-none-any.whl (4.7 kB)
Collecting anyio>=3.1.0
  Downloading anyio-3.6.2-py3-none-any.whl (80 kB)
Collecting websocket-client
  Downloading websocket_client-1.5.1-py3-none-any.whl (55 kB)
Collecting jupyter-server-terminals
  Downloading jupyter_server_terminals-0.4.4-py3-none-any.whl (13 kB)
Collecting jupyter-events>=0.4.0
  Downloading jupyter_events-0.6.3-py3-none-any.whl (18 kB)
Collecting cffi>=1.0.1
  Downloading cffi-1.15.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (442 kB)
Collecting sniffio>=1.1
  Downloading sniffio-1.3.0-py3-none-any.whl (10 kB)
Collecting pycparser
  Downloading pycparser-2.21-py2.py3-none-any.whl (118 kB)
Collecting rfc3339-validator
  Downloading rfc3339_validator-0.1.4-py2.py3-none-any.whl (3.5 kB)
Collecting pyyaml>=5.3
  Downloading PyYAML-6.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (701 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 701.2/701.2 KB 53.7 MB/s eta 0:00:00
Collecting rfc3986-validator>=0.1.1
  Downloading rfc3986_validator-0.1.1-py2.py3-none-any.whl (4.2 kB)
Collecting python-json-logger>=2.0.4
  Downloading python_json_logger-2.0.7-py3-none-any.whl (8.1 kB)
Collecting jsonpointer>1.13
  Downloading jsonpointer-2.3-py2.py3-none-any.whl (7.8 kB)
Collecting uri-template
  Downloading uri_template-1.2.0-py3-none-any.whl (10 kB)
Collecting isoduration
  Downloading isoduration-20.11.0-py3-none-any.whl (11 kB)
Collecting webcolors>=1.11
  Downloading webcolors-1.12-py3-none-any.whl (9.9 kB)
Collecting fqdn
  Downloading fqdn-1.5.1-py3-none-any.whl (9.1 kB)
Collecting arrow>=0.15.0
  Downloading arrow-1.2.3-py3-none-any.whl (66 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 66.4/66.4 KB 8.2 MB/s eta 0:00:00
Building wheels for collected packages: emoji
  Building wheel for emoji (setup.py) ... done
  Created wheel for emoji: filename=emoji-2.2.0-py3-none-any.whl size=234926 sha256=fb3ac821b11c4d1450f979a7fbf650cdaf49230baf22dd3605d7e5b1680ba51b
  Stored in directory: /root/.cache/pip/wheels/86/62/9e/a6b27a681abcde69970dbc0326ff51955f3beac72f15696984
Successfully built emoji
Installing collected packages: webencodings, wcwidth, Send2Trash, pytz, pure-eval, ptyprocess, pickleshare, mistune, ipython-genutils, fastjsonschema, executing, charset-normalizer, backcall, zipp, xlrd, widgetsnbextension, websocket-client, webcolors, urllib3, uri-template, traitlets, tqdm, tornado, tinycss2, threadpoolctl, tenacity, soupsieve, sniffio, slicer, six, setuptools, rfc3986-validator, regex, pyzmq, pyyaml, python-json-logger, pyrsistent, pyparsing, pygments, pycparser, psutil, prompt-toolkit, prometheus-client, platformdirs, pkgutil-resolve-name, pillow, pexpect, parso, pandocfilters, packaging, numpy, nest-asyncio, markupsafe, llvmlite, kiwisolver, jupyterlab-widgets, jupyterlab-pygments, jsonpointer, joblib, idna, graphviz, fqdn, fonttools, emoji, defusedxml, decorator, debugpy, cycler, cloudpickle, click, certifi, attrs, terminado, scipy, rfc3339-validator, requests, qtpy, python-dateutil, plotly, nltk, matplotlib-inline, jupyter-core, jinja2, jedi, importlib-resources, importlib-metadata, contourpy, comm, cffi, bleach, beautifulsoup4, asttokens, anyio, xgboost, vaderSentiment, textblob, stack-data, scikit-learn, pandas, numba, matplotlib, jupyter-server-terminals, jupyter-client, jsonschema, arrow, argon2-cffi-bindings, shap, seaborn, nbformat, isoduration, ipython, imbalanced-learn, catboost, argon2-cffi, nbclient, ipykernel, imbalanced-ensemble, qtconsole, nbconvert, jupyter-events, jupyter-console, ipywidgets, jupyter-server, notebook-shim, nbclassic, notebook, jupyter, autoviml
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
moviepy 0.2.3.5 requires decorator<5.0,>=4.0.2, but you have decorator 5.1.1 which is incompatible.
google-colab 1.0.0 requires ipykernel~=5.3.4, but you have ipykernel 6.21.2 which is incompatible.
google-colab 1.0.0 requires ipython~=7.9.0, but you have ipython 8.11.0 which is incompatible.
google-colab 1.0.0 requires notebook~=6.3.0, but you have notebook 6.5.2 which is incompatible.
cvxpy 1.2.3 requires setuptools<=64.0.2, but you have setuptools 67.4.0 which is incompatible.
Successfully installed Send2Trash-1.8.0 anyio-3.6.2 argon2-cffi-21.3.0 argon2-cffi-bindings-21.2.0 arrow-1.2.3 asttokens-2.2.1 attrs-22.2.0 autoviml-0.1.710 backcall-0.2.0 beautifulsoup4-4.11.2 bleach-6.0.0 catboost-1.1.1 certifi-2022.12.7 cffi-1.15.1 charset-normalizer-3.0.1 click-8.1.3 cloudpickle-2.2.1 comm-0.1.2 contourpy-1.0.7 cycler-0.11.0 debugpy-1.6.6 decorator-5.1.1 defusedxml-0.7.1 emoji-2.2.0 executing-1.2.0 fastjsonschema-2.16.3 fonttools-4.38.0 fqdn-1.5.1 graphviz-0.20.1 idna-3.4 imbalanced-ensemble-0.2.0 imbalanced-learn-0.10.1 importlib-metadata-6.0.0 importlib-resources-5.12.0 ipykernel-6.21.2 ipython-8.11.0 ipython-genutils-0.2.0 ipywidgets-8.0.4 isoduration-20.11.0 jedi-0.18.2 jinja2-3.1.2 joblib-1.2.0 jsonpointer-2.3 jsonschema-4.17.3 jupyter-1.0.0 jupyter-client-8.0.3 jupyter-console-6.6.2 jupyter-core-5.2.0 jupyter-events-0.6.3 jupyter-server-2.3.0 jupyter-server-terminals-0.4.4 jupyterlab-pygments-0.2.2 jupyterlab-widgets-3.0.5 kiwisolver-1.4.4 llvmlite-0.39.1 markupsafe-2.1.2 matplotlib-3.7.1 matplotlib-inline-0.1.6 mistune-2.0.5 nbclassic-0.5.2 nbclient-0.7.2 nbconvert-7.2.9 nbformat-5.7.3 nest-asyncio-1.5.6 nltk-3.8.1 notebook-6.5.2 notebook-shim-0.2.2 numba-0.56.4 numpy-1.23.5 packaging-23.0 pandas-1.5.3 pandocfilters-1.5.0 parso-0.8.3 pexpect-4.8.0 pickleshare-0.7.5 pillow-9.4.0 pkgutil-resolve-name-1.3.10 platformdirs-3.1.0 plotly-5.13.1 prometheus-client-0.16.0 prompt-toolkit-3.0.38 psutil-5.9.4 ptyprocess-0.7.0 pure-eval-0.2.2 pycparser-2.21 pygments-2.14.0 pyparsing-3.0.9 pyrsistent-0.19.3 python-dateutil-2.8.2 python-json-logger-2.0.7 pytz-2022.7.1 pyyaml-6.0 pyzmq-25.0.0 qtconsole-5.4.0 qtpy-2.3.0 regex-2022.10.31 requests-2.28.2 rfc3339-validator-0.1.4 rfc3986-validator-0.1.1 scikit-learn-1.2.1 scipy-1.10.1 seaborn-0.12.2 setuptools-67.4.0 shap-0.41.0 six-1.16.0 slicer-0.0.7 sniffio-1.3.0 soupsieve-2.4 stack-data-0.6.2 tenacity-8.2.2 terminado-0.17.1 textblob-0.17.1 threadpoolctl-3.1.0 tinycss2-1.2.1 tornado-6.2 tqdm-4.65.0 traitlets-5.9.0 uri-template-1.2.0 urllib3-1.26.14 vaderSentiment-3.3.2 wcwidth-0.2.6 webcolors-1.12 webencodings-0.5.1 websocket-client-1.5.1 widgetsnbextension-4.0.5 xgboost-1.7.4 xlrd-2.0.1 zipp-3.15.0

Result:

from autoviml.Auto_ViML import Auto_ViML

Imported Auto_ViML version: 0.1.710. Call using:
             m, feats, trainm, testm = Auto_ViML(train, target, test,
                            sample_submission='',
                            scoring_parameter='', KMeans_Featurizer=False,
                            hyper_param='RS',feature_reduction=True,
                             Boosting_Flag='CatBoost', Binning_Flag=False,
                            Add_Poly=0, Stacking_Flag=False,Imbalanced_Flag=False,
                            verbose=1)
            

Imported Auto_NLP version: 0.1.01.. Call using:
     train_nlp, test_nlp, nlp_pipeline, predictions = Auto_NLP(
                nlp_column, train, test, target, score_type='balanced_accuracy',
                modeltype='Classification',top_num_features=200, verbose=0,
                build_model=True)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module compiled against API version 0x10 but this version of numpy is 0xf
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-3-addf35cd01ca> in <module>
----> 1 from autoviml.Auto_ViML import Auto_ViML

8 frames
/usr/local/lib/python3.8/dist-packages/autoviml/Auto_ViML.py in <module>
     16 import warnings
     17 warnings.filterwarnings("ignore")
---> 18 from sklearn.exceptions import DataConversionWarning
     19 warnings.filterwarnings(action='ignore', category=DataConversionWarning)
     20 with warnings.catch_warnings():

/usr/local/lib/python3.8/dist-packages/sklearn/__init__.py in <module>
     80     from . import _distributor_init  # noqa: F401
     81     from . import __check_build  # noqa: F401
---> 82     from .base import clone
     83     from .utils._show_versions import show_versions
     84 

/usr/local/lib/python3.8/dist-packages/sklearn/base.py in <module>
     15 from . import __version__
     16 from ._config import get_config
---> 17 from .utils import _IS_32BIT
     18 from .utils._set_output import _SetOutputMixin
     19 from .utils._tags import (

/usr/local/lib/python3.8/dist-packages/sklearn/utils/__init__.py in <module>
     23 from .deprecation import deprecated
     24 from .discovery import all_estimators
---> 25 from .fixes import parse_version, threadpool_info
     26 from ._estimator_html_repr import estimator_html_repr
     27 from .validation import (

/usr/local/lib/python3.8/dist-packages/sklearn/utils/fixes.py in <module>
     17 import numpy as np
     18 import scipy
---> 19 import scipy.stats
     20 import threadpoolctl
     21 

/usr/local/lib/python3.8/dist-packages/scipy/stats/__init__.py in <module>
    483 from ._warnings_errors import (ConstantInputWarning, NearConstantInputWarning,
    484                                DegenerateDataWarning, FitError)
--> 485 from ._stats_py import *
    486 from ._variation import variation
    487 from .distributions import *

/usr/local/lib/python3.8/dist-packages/scipy/stats/_stats_py.py in <module>
     35 from numpy import array, asarray, ma
     36 from numpy.lib import NumpyVersion
---> 37 from numpy.testing import suppress_warnings
     38 
     39 from scipy.spatial.distance import cdist

/usr/local/lib/python3.8/dist-packages/numpy/testing/__init__.py in <module>
      8 from unittest import TestCase
      9 
---> 10 from ._private.utils import *
     11 from ._private.utils import (_assert_valid_refcount, _gen_alignment_data)
     12 from ._private import extbuild, decorators as dec

/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py in <module>
     21 from numpy.core import(
     22      intp, float32, empty, arange, array_repr, ndarray, isnat, array)
---> 23 import numpy.linalg.lapack_lite
     24 
     25 from io import StringIO

ImportError: numpy.core.multiarray failed to import

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------


B)

!pip install git+https://github.com/AutoViML/Auto_ViML.git
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting git+https://github.com/AutoViML/Auto_ViML.git
  Cloning https://github.com/AutoViML/Auto_ViML.git to /tmp/pip-req-build-5pj491sm
  Running command git clone --filter=blob:none --quiet https://github.com/AutoViML/Auto_ViML.git /tmp/pip-req-build-5pj491sm
  Resolved https://github.com/AutoViML/Auto_ViML.git to commit 64eca10c83f668bdc00e15c2bb4ab5496f6f224f
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Error saving output files to disk

I found this error when saving output files to disk. Maybe a sanitization function for the target name would help; see the sketch after the log below.
Thanks,
Daniel

Saving predictions to .\Avg_Quadrat_Yield(t/ha)\Avg_Quadrat_Yield(t/ha)_Regression_test_modified.csv
    Error: Not able to save test modified file. Skipping...
    Saving predictions to .\Avg_Quadrat_Yield(t/ha)\Avg_Quadrat_Yield(t/ha)_Regression_submission.csv
    Error: Not able to save submission file. Skipping...
    Saving predictions to .\Avg_Quadrat_Yield(t/ha)\Avg_Quadrat_Yield(t/ha)_Regression_train_modified.csv
    Error: Not able to save train modified file. Skipping...
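
The target name Avg_Quadrat_Yield(t/ha) contains a forward slash, which is a path separator, so every constructed file path is invalid. One possible shape for such a sanitizer (a sketch only; sanitize_target_name is a hypothetical helper, not part of Auto_ViML):

    import re

    def sanitize_target_name(name):
        # Replace path separators and other characters that are invalid
        # in Windows/POSIX file names with underscores.
        return re.sub(r'[\\/:*?"<>|]', '_', name)

    print(sanitize_target_name('Avg_Quadrat_Yield(t/ha)'))
    # -> Avg_Quadrat_Yield(t_ha)

Applying it to the target before building the output directory and file names would make the saves succeed.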

Saving Model Weights

Does anyone know which models are trained by Auto_ViML, and how to save the weights of the final trained model?

For scikit-learn above 1.0, the model's 'estimator' attribute didn't work, while 'base_estimator' worked.

Well, maybe this is an issue that I should post in the scikit-learn project?
I installed scikit-learn 1.1.
My modification to the code, and the resulting errors, were attached as screenshots (omitted here).

For scikit-learn 1.3, the result is exactly the same.
By the way, my numpy is 1.26; I failed dozens of times to build a wheel for the ~1.19 version suggested in requirements.txt, but I doubt that affects the sklearn CalibratedClassifierCV object's estimator attribute.
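
Since the attribute name differs across scikit-learn versions, a version-tolerant accessor avoids hard-coding either name. A minimal sketch (get_inner_estimator is a hypothetical helper, not Auto_ViML or scikit-learn API):

    def get_inner_estimator(model):
        # scikit-learn renamed `base_estimator` to `estimator` on several
        # meta-estimators (e.g. CalibratedClassifierCV) around release 1.2,
        # so probe both names instead of hard-coding one.
        for attr in ('estimator', 'base_estimator'):
            if hasattr(model, attr):
                return getattr(model, attr)
        raise AttributeError('no wrapped estimator attribute found')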

ModuleNotFoundError: No module named 'lightgbm'

I've installed from pip. I'm using Python 3.9.

When attempting:

train_x, test_x, final, predicted = Auto_NLP(input_feature, train, test, target, score_type="balanced_accuracy", top_num_features=100, modeltype="Classification", verbose=2, build_model=True)

I got:

[nltk_data]    | 
[nltk_data]  Done downloading collection popular

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Input In [35], in <cell line: 1>()
----> 1 train_x, test_x, final, predicted= Auto_NLP(input_feature, train, test,target,score_type="balanced_accuracy",top_num_features=100,modeltype="Classification", verbose=2, build_model=True)

File /usr/local/lib/python3.9/site-packages/autoviml/Auto_NLP.py:1334, in Auto_NLP(nlp_column, train, test, target, score_type, modeltype, top_num_features, verbose, build_model)
   1332 nltk.download("popular")
   1333 calibrator_flag = False
-> 1334 from lightgbm import LGBMClassifier, LGBMRegressor
   1335 seed = 99
   1336 train = copy.deepcopy(train)

ModuleNotFoundError: No module named 'lightgbm'

A missing dependency?
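
It looks that way: the traceback shows Auto_NLP importing lightgbm at runtime, but the pip install did not pull it in. Installing it manually works around the error:

    !pip install lightgbm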

Error printing set of missing columns

print(' Missing columns = %s' %set(list(train))-set(flat_list))

could be fixed by
print(' Missing columns = %s' %list(set(list(train))-set(flat_list)))
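
The root cause is operator precedence: % formatting binds tighter than -, so Python formats the string first and then tries to subtract a set from the resulting str. A minimal reproduction:

    train_cols = {'a', 'b', 'c'}
    flat_list = {'a', 'b'}

    # Parsed as (' Missing columns = %s' % train_cols) - flat_list,
    # i.e. str - set, hence the TypeError:
    # print(' Missing columns = %s' % train_cols - flat_list)

    # Parenthesizing the set difference fixes it:
    print(' Missing columns = %s' % (train_cols - flat_list))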

log:

TypeError                                 Traceback (most recent call last)

<ipython-input-25-4b7137719194> in <module>
      1 model, features, trainm, testm = Auto_ViML(
      2     fp.data,
----> 3     target="featureName",test="")
      4 

c:\foo\venv2\lib\site-packages\autoviml\Auto_ViML.py in Auto_ViML(train, target, test, sample_submission, hyper_param, feature_reduction, scoring_parameter, Boosting_Flag, KMeans_Featurizer, Add_Poly, Stacking_Flag, Binning_Flag, Imbalanced_Flag, verbose)
    622     count = 0
    623     #################    CLASSIFY  COLUMNS   HERE    ######################
--> 624     var_df = classify_columns(orig_train[orig_preds], verbose)
    625     #####       Classify Columns   ################
    626     id_cols = var_df['id_vars']

c:\foo\venv2\lib\site-packages\autoviml\Auto_ViML.py in classify_columns(df_preds, verbose)
   3201         ls = sum_all_cols.values()
   3202         flat_list = [item for sublist in ls for item in sublist]
-> 3203         print('    Missing columns = %s' %(set(list(train))-set(flat_list)))
   3204     return sum_all_cols
   3205 #################################################################################

TypeError: unsupported operand type(s) for -: 'str' and 'set'


TypeError when performing Auto_NLP for a regression problem

Hi AutoViML community,

Thank you for providing this amazing package.

I am trying my hand at a regression problem and ended up with a TypeError. Please note that the same can be replicated in a Kaggle kernel. The code and error are shared below for your reference.

Thanks -

nlp_column = 'Product_Information'
target = 'Product_Price'
train_nlp, test_nlp, nlp_transformer, preds = Auto_NLP(
                nlp_column, train, test, target, score_type='neg_mean_squared_error',
                modeltype='Regression',top_num_features=50, verbose=2,
                build_model=True)

error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-29-1ec6fbec18bb> in <module>
      4                 nlp_column, train, test, target, score_type='neg_mean_squared_error',
      5                 modeltype='Regression',top_num_features=50, verbose=2,
----> 6                 build_model=True)

/opt/conda/lib/python3.7/site-packages/autoviml/Auto_NLP.py in Auto_NLP(nlp_column, train, test, target, score_type, modeltype, top_num_features, verbose, build_model)
   1197         gs = RandomizedSearchCV(nlp_model,params, n_iter=10, cv=scv,
   1198                                 scoring=score_type, random_state=seed)
-> 1199         gs.fit(X_train_dtm,y_train)
   1200         y_pred = gs.predict(X_test_dtm)
   1201         ##### Print the model results on Cross Validation data set (held out)

/opt/conda/lib/python3.7/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
    737             refit_start_time = time.time()
    738             if y is not None:
--> 739                 self.best_estimator_.fit(X, y, **fit_params)
    740             else:
    741                 self.best_estimator_.fit(X, **fit_params)

/opt/conda/lib/python3.7/site-packages/sklearn/linear_model/_least_angle.py in fit(self, X, y, Xy)
    955             returns an instance of self.
    956         """
--> 957         X, y = check_X_y(X, y, y_numeric=True, multi_output=True)
    958 
    959         alpha = getattr(self, 'alpha', 0.)

/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
    753                     ensure_min_features=ensure_min_features,
    754                     warn_on_dtype=warn_on_dtype,
--> 755                     estimator=estimator)
    756     if multi_output:
    757         y = check_array(y, 'csr', force_all_finite=True, ensure_2d=False,

/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    509                                       dtype=dtype, copy=copy,
    510                                       force_all_finite=force_all_finite,
--> 511                                       accept_large_sparse=accept_large_sparse)
    512     else:
    513         # If np.array(..) gives ComplexWarning, then we convert the warning

/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in _ensure_sparse_format(spmatrix, accept_sparse, dtype, copy, force_all_finite, accept_large_sparse)
    304 
    305     if accept_sparse is False:
--> 306         raise TypeError('A sparse matrix was passed, but dense '
    307                         'data is required. Use X.toarray() to '
    308                         'convert to a dense numpy array.')

TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
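
As the message says, the estimator picked by the random search (a least-angle regression model, per the traceback) rejects sparse input, so the document-term matrix has to be densified before fitting. A standalone sketch of the workaround with toy data, memory permitting:

    import numpy as np
    from scipy import sparse
    from sklearn.linear_model import LassoLars

    # Toy stand-in for the sparse document-term matrix built by Auto_NLP.
    X_sparse = sparse.random(100, 20, density=0.1, format='csr', random_state=0)
    y = np.random.default_rng(0).normal(size=100)

    # LassoLars requires dense input: convert with .toarray() before fitting.
    model = LassoLars(alpha=0.1)
    model.fit(X_sparse.toarray(), y)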

Getting ValueError when running the notebook with XGBoost on the Titanic dataset

Hi,

Thanks for sharing your work!
I just tested the Titanic dataset downloaded from https://www.kaggle.com/c/titanic/data with XGBoost as below:
m, feats, trainm, testm = Auto_ViML(train, target, test, sample_submission, scoring_parameter=scoring_parameter, hyper_param='GS',feature_reduction=True, Boosting_Flag=True,Binning_Flag=False, Add_Poly=0, Stacking_Flag=False, Imbalanced_Flag=False, verbose=1)

Once I ran the above code, I found the error below:
ValueError: DataFrame.dtypes for data must be int, float or bool. Did not expect the data types in fields Name

It seems the same error occurs with Boosting_Flag=None. The console log just prior to the error is below; a workaround sketch follows the log.

Train (Size: 891,12) has Single_Label with target: ['Survived']
"
################### Binary-Class ##################### "
Shuffling the data set before training
Class -> Counts -> Percent
1: 342 -> 38.4%
0: 549 -> 61.6%
Selecting 2-Class Classifier...
Using GridSearchCV for Hyper Parameter tuning...
Target Survived is already numeric. No transformation done.
Top columns in Train with missing values: ['Cabin', 'Age', 'Embarked']
and their missing value totals: [687, 177, 2]
Classifying variables in data set...
Number of Numeric Columns = 2
Number of Integer-Categorical Columns = 3
Number of String-Categorical Columns = 1
Number of Factor-Categorical Columns = 0
Number of String-Boolean Columns = 1
Number of Numeric-Boolean Columns = 0
Number of Discrete String Columns = 2
Number of NLP String Columns = 0
Number of Date Time Columns = 0
Number of ID Columns = 2
Number of Columns to Delete = 0
11 Predictors classified...
This does not include the Target column(s)
2 variables removed since they were some ID or low-information variables
Completed Label Encoding, Missing Value Imputing and Scaling of data without errors.
No Missing values in Train
Test data has no missing values
Number of numeric variables = 5
No variables were removed since no highly correlated variables found in data

Data Ready for Modeling with Target variable = Survived
Starting Selection among 11 predictors...
Number of numeric variables = 5
No variables were removed since no highly correlated variables found in data
Adding 6 categorical variables to reduced numeric variables of 5
Selected No. of variables = 11
Finding Important Features...
in 11 variables
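
A workaround on the caller's side (a sketch, not a fix inside the library): drop the free-text columns that trip XGBoost's dtype check before handing the frames to Auto_ViML. Column names follow the Kaggle Titanic schema:

    import pandas as pd

    train = pd.read_csv('train.csv')
    test = pd.read_csv('test.csv')

    # 'Name' is the field named in the ValueError; similar free-text
    # columns could be dropped the same way if they also trip the check.
    for df in (train, test):
        df.drop(columns=['Name'], inplace=True)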

Memory issue while running multi-label classification

Hi Ram,

I am trying multi-label classification with the following parameters and encountering a "Kernel tried to allocate more memory than available" error while running in Kaggle with 16 GB of memory. The predictors in train are the title and abstract of each book. Is there any way to overcome this memory issue?

model, features, trainm, testm = Auto_ViML(
    train=train,
    target=['Computer_Science', 'Physics', 'Mathematics', 'Statistics',
            'Quantitative_Biology', 'Quantitative_Finance'],
    test=test,
    sample_submission=sample_submission,
    hyper_param="GS",
    feature_reduction=False,
    scoring_parameter="f1",
    KMeans_Featurizer=False,
    Boosting_Flag=True,
    Binning_Flag=True,
    Add_Poly=False,
    Stacking_Flag=False,
    Imbalanced_Flag=False,
    verbose=2,
)


Thanks,
Venkatesh

TypeError unexpected keyword argument 'base_estimator'

Hello
I have a problem with the use of imbalanced-ensemble. In version 0.2.0 the parameter base_estimator was renamed to estimator.
This leads to the following error:

TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 m, feats, trainm, testm = Auto_ViML(train, "purchase", "", "",
      2                     scoring_parameter="balanced_accuracy",
      3                     hyper_param='RS', feature_reduction=True,
      4                     Boosting_Flag=True, Binning_Flag=False,
      5                     Add_Poly=0, Stacking_Flag=False,
      6                     Imbalanced_Flag=True,
      7                     verbose=1)

File /opt/python/envs/minimal/lib/python3.8/site-packages/autoviml/Auto_ViML.py:1982, in Auto_ViML(train, target, test, sample_submission, hyper_param, feature_reduction, scoring_parameter, Boosting_Flag, KMeans_Featurizer, Add_Poly, Stacking_Flag, Binning_Flag, Imbalanced_Flag, GPU_flag, verbose)
   1980 if Imbalanced_Flag:
   1981     rf = RandomForestClassifier(n_estimators=100, random_state=99)
-> 1982     xgbm = SelfPacedEnsembleClassifier(base_estimator=rf, n_jobs=-1, soft_resample_flag=True)
   1983     hyper_param = 'Imb'
   1984     model_name = 'SPE'

File /opt/python/envs/minimal/lib/python3.8/site-packages/imbens/utils/_validation.py:604, in _deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
    597     warnings.warn(
    598         f"Pass {', '.join(args_msg)} as keyword args. From version 0.9 "
    599         f"passing these as positional arguments will "
    600         f"result in an error",
    601         FutureWarning,
    602     )
    603 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
--> 604 return f(**kwargs)

TypeError: __init__() got an unexpected keyword argument 'base_estimator'
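
A sketch of the fixed call with the renamed keyword, assuming imbalanced-ensemble >= 0.2.0 (imported as imbens, as in the traceback above):

    from sklearn.ensemble import RandomForestClassifier
    from imbens.ensemble import SelfPacedEnsembleClassifier

    rf = RandomForestClassifier(n_estimators=100, random_state=99)

    # imbalanced-ensemble 0.2.0 renamed `base_estimator` to `estimator`
    xgbm = SelfPacedEnsembleClassifier(estimator=rf, n_jobs=-1,
                                       soft_resample_flag=True)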

Best regards and many thanks, s

here the used version:

from autoviml.Auto_ViML import Auto_ViML

Imported Auto_ViML version: 0.1.713. Call using:
             m, feats, trainm, testm = Auto_ViML(train, target, test,
                            sample_submission='',
                            scoring_parameter='', KMeans_Featurizer=False,
                            hyper_param='RS',feature_reduction=True,
                             Boosting_Flag='CatBoost', Binning_Flag=False,
                            Add_Poly=0, Stacking_Flag=False,Imbalanced_Flag=False,
                            GPU_flag=False, verbose=1)
            

Imported Auto_NLP version: 0.1.01.. Call using:
     train_nlp, test_nlp, nlp_pipeline, predictions = Auto_NLP(
                nlp_column, train, test, target, score_type='balanced_accuracy',
                modeltype='Classification',top_num_features=200, verbose=0,
                build_model=True)

UnboundLocalError: local variable 'missing_cols' referenced before assignment

I'm getting the following error:

Filling missing values with "missing" placeholder and adding a column for missing_flags

UnboundLocalError                         Traceback (most recent call last)
in
     12     Stacking_Flag=False,
     13     Imbalanced_Flag=True,
---> 14     verbose=2,
     15 )

/sas_cambrian/Projects/hamilton_attrib_276597/rxb427/python3_env/lib/python3.6/site-packages/autoviml/Auto_ViML.py in Auto_ViML(train, target, test, sample_submission, hyper_param, feature_reduction, scoring_parameter, Boosting_Flag, KMeans_Featurizer, Add_Poly, Stacking_Flag, Binning_Flag, Imbalanced_Flag, verbose)
    754             preds.append(new_missing_col)
    755             missing_flag_cols.append(new_missing_col)
--> 756         elif f in missing_cols:
    757             #### YOu have to do nothing for missing column yet. Leave them as is for Iterative Imputer later ##############
    758             continue

UnboundLocalError: local variable 'missing_cols' referenced before assignment

I didn't have time to dig too deep into the code, but it appears missing_cols gets defined inside an if statement that checks whether a test dataset exists before it runs. So there is a path that reaches the line-756 check without a test dataset ever being provided.

So either the definition of missing_cols needs to move, or the logic that lets the `f in missing_cols` check run needs fixing. You could also just remove that check, since it does nothing right now; see the sketch below.
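
A minimal sketch of a defensive fix (hypothetical names, not the upstream patch): bind missing_cols before the branch that may skip its assignment, so the membership test always has something to check.

    def split_preds(preds, missing_cols=None):
        # Default missing_cols so `f in missing_cols` can never hit an
        # unbound name, even when no test dataset was provided.
        if missing_cols is None:
            missing_cols = []
        return [f for f in preds if f not in missing_cols]

    print(split_preds(['age', 'cabin']))             # no test dataset
    print(split_preds(['age', 'cabin'], ['cabin']))  # with test dataset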

Training model errors out without context/stack trace

Training regular model first time is Erroring: Check if your Input is correct...
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-32-d59141079a76> in <module>
     12     Stacking_Flag=False,
     13     Imbalanced_Flag=False,
---> 14     verbose=0,
     15 )

TypeError: 'NoneType' object is not iterable

I DO think there is probably something wrong with the data I'm feeding in, or with how I'm feeding it in. However, the fact that only the error is printed, with no context about where or what is erroring out, is why I'm raising this as an issue. It's near impossible to figure out what is wrong from such sparse information.

In short, you should probably surface the error from the model training here, together with the stack trace.
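
A sketch of what surfacing could look like (a hypothetical wrapper, not the library's actual code): keep the friendly message, but print the real stack trace and re-raise so callers can see the original exception.

    import traceback

    def train_safely(fit_fn, *args):
        try:
            return fit_fn(*args)
        except Exception:
            print('Erroring: Check if your Input is correct...')
            traceback.print_exc()   # show where and what actually failed
            raise                   # preserve the original exception

    # A deliberately failing "trainer" reproducing the reported error:
    try:
        train_safely(lambda: iter(None))   # TypeError: 'NoneType' object is not iterable
    except TypeError:
        pass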

Getting UnboundLocalError when setting Boosting_Flag=True

Getting an error when trying to use XGBoost.

File "C:\ProgramData\Anaconda3\lib\site-packages\autoviml\Auto_ViML.py", line 1242, in Auto_ViML
print('%d-fold Cross Validation %s = %0.1f%%' %(n_splits,scoring_parameter, best_score*100))

UnboundLocalError: local variable 'best_score' referenced before assignment

How to load the CatBoost model for predicting new data

I used CatBoost to train the model and saved the trained model using m.save_model('Catboost.dump').

However, I failed to load the saved model for predicting new data unless I trained the model again. I used m.load_model('Catboost.dump'), but got the error: name 'm' is not defined.

The question is how to load such a model for predicting new data without retraining it each time.

Thanks!
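
load_model is an instance method, so in a fresh session you first create a new (unfitted) model object and load the weights into it; no retraining is needed. A sketch, assuming the file was saved with m.save_model('Catboost.dump') and that new_data.csv (a hypothetical file) carries the same engineered features the model was trained on:

    import pandas as pd
    from catboost import CatBoostClassifier  # use CatBoostRegressor for regression

    m = CatBoostClassifier()          # fresh, unfitted object
    m.load_model('Catboost.dump')     # restore the trained model

    new_data = pd.read_csv('new_data.csv')
    preds = m.predict(new_data)

Note that Auto_ViML's own preprocessing (label encoding, imputation, scaling) is not inside this object, so new data must be transformed the same way it was at training time.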

Sample Weight Support for Regression Problems

First, I want to say thank you for the very interesting-looking library. I've tried it briefly and gotten very strong performance.

I wanted to ask whether it would be possible to add sample weight support for regression problems. This is typically done in scikit-learn estimators by simply passing a sample_weight parameter after X and y. For example, LinearRegression, XGBoost, and CatBoost all support the same API, so I'm hopeful this is a fairly straightforward addition.

Under the hood it's typically just multiplying each row's loss by its sample weight, in order to give certain observations more influence than others. This can be very helpful for problems where you have sensor data from sensors of varying quality, or simply for downweighting older observations.
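
For reference, this is the scikit-learn convention being requested (a standalone sketch, not current Auto_ViML API):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

    # Downweight the first half (e.g. older observations) to one fifth
    # of the weight of the newer half.
    w = np.concatenate([np.full(100, 0.2), np.full(100, 1.0)])

    model = LinearRegression().fit(X, y, sample_weight=w)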

Running Auto_ViML with the plain Python interpreter throws an IPython exception

I'm testing Auto_ViML as one of the AutoML containers in my pipeline. However, running the Auto_ViML function from a script (or importing it from the standard Python REPL) throws the following exception:

    from autoviml.Auto_ViML import Auto_ViML
  File "/usr/local/lib/python3.7/site-packages/autoviml/Auto_ViML.py", line 29, in <module>
    get_ipython().magic(u'matplotlib inline')
NameError: name 'get_ipython' is not defined

This can be fixed by either using a Jupyter notebook (which I imagine is the only thing tested so far) or running under ipython instead of python. That is fine for toy examples, but in production systems python is the default executable.

It would make sense to make the IPython magic not fail; see the sketch below.

I am using autoviml==0.1.651
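
A sketch of a guard that keeps the notebook behaviour but degrades gracefully under the plain interpreter (run_line_magic is the non-deprecated spelling of the same magic call):

    try:
        get_ipython().run_line_magic('matplotlib', 'inline')
    except NameError:
        pass  # plain Python interpreter: no IPython available, skip the magic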

Integration with sklearn-evaluation

I think adding an integration/tutorial and the right documentation for the framework could go a long way.

Usually, when using Auto_ViML, you get the final model and then check its performance and results.
An integration with sklearn-evaluation would give users an easier mechanism for that second part.

How do you use the saved model?

How do you use the saved model? How do I get the transformations applied during training, so I can reproduce them for predictions later on?

SHAP Plots

I am getting a "Could not plot SHAP values since SHAP is not installed or could not import SHAP in this machine" message when creating an XGBoost model. Which version of SHAP is compatible with AutoViML?

UnboundLocalError: local variable 'cols' referenced before assignment

I found the error from version 0.1.701 in Auto_ViML

UnboundLocalError: local variable 'cols' referenced before assignment

After this change, y_pred is a Series from then on; you need y_pred.values.

if len(cols) == 5:
    print(' Calculating weighted average ensemble of %d classifiers' %len(new_cols))

Should the condition be checking new_cols instead?

Groups in cross validation?

I would like to use the groups parameter in cross validation, as in sklearn. Is it possible to do so? (See the sketch below for the mechanism I mean.)

Otherwise, thank you for your package; this is very cool work, especially the visualization part.
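
For reference, this is the scikit-learn mechanism being asked about (a standalone sketch; Auto_ViML's own API is not shown here):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GroupKFold, cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 4))
    y = rng.integers(0, 2, size=120)
    groups = np.repeat(np.arange(12), 10)   # e.g. 12 subjects, 10 rows each

    # GroupKFold keeps all rows of a group in the same fold (no leakage).
    scores = cross_val_score(LogisticRegression(), X, y,
                             cv=GroupKFold(n_splits=4), groups=groups)
    print(scores)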

Is it possible to generate sklearn code

Is it possible that, after the call to the Auto_ViML function, we could generate sklearn Python code that produces the same model Auto_ViML produced? Or at least the architecture of the model, so we can reproduce it in sklearn?

Thank you.

How to reuse the trained model?

I find that the Auto_ViML main function is great when you have both the train and test datasets at the same time. This is good for Kaggle, but not for real-world operations, where inference happens after the model has been trained.

I see that the output of the main function is a trained model plus the train and test datasets with the required features (this is not even quite true, by the way: testm and trainm don't have the same output columns).

However, the trained model is not a pipeline but a plain model (LogisticRegression in a vanilla run on the Titanic dataset).

Would it be possible to actually export a pipeline that can perform inference on a dataset with the same features as the original training one?
