
pycaret's Introduction


An open-source, low-code machine learning library in Python

🎉🎉🎉 PyCaret 3.3 is now available. 🎉🎉🎉

pip install --upgrade pycaret

Docs • Tutorials • Blog • LinkedIn • YouTube • Slack


Welcome to PyCaret

PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that speeds up the experiment cycle exponentially and makes you more productive.

In comparison with the other open-source machine learning libraries, PyCaret is an alternate low-code library that can be used to replace hundreds of lines of code with only a few. This makes experiments exponentially fast and efficient. PyCaret is essentially a Python wrapper around several machine learning libraries and frameworks, such as scikit-learn, XGBoost, LightGBM, CatBoost, Optuna, Hyperopt, Ray, and a few more.

The design and simplicity of PyCaret are inspired by the emerging role of citizen data scientists, a term first used by Gartner. Citizen data scientists are power users who can perform both simple and moderately sophisticated analytical tasks that would previously have required more technical expertise. PyCaret was inspired by the caret library in the R programming language.

🚀 Installation

🌐 Option 1: Install via PyPi

PyCaret is tested and supported on 64-bit systems with:

  • Python 3.9, 3.10 and 3.11
  • Ubuntu 16.04 or later
  • Windows 7 or later

You can install PyCaret with Python's pip package manager:

# install pycaret
pip install pycaret

PyCaret's default installation will not install all the optional dependencies automatically. Depending on the use case, you may be interested in one or more extras:

# install analysis extras
pip install pycaret[analysis]

# models extras
pip install pycaret[models]

# install tuner extras
pip install pycaret[tuner]

# install mlops extras
pip install pycaret[mlops]

# install parallel extras
pip install pycaret[parallel]

# install test extras
pip install pycaret[test]


# install multiple extras together
pip install pycaret[analysis,models]

Check out all optional dependencies. If you want to install everything including all the optional dependencies:

# install full version
pip install pycaret[full]

📄 Option 2: Build from Source

Install the development version of the library directly from source. The API may be unstable, and it is not recommended for production use.

pip install git+https://github.com/pycaret/pycaret.git@master --upgrade

📦 Option 3: Docker

Docker uses containers to create virtual environments that keep a PyCaret installation separate from the rest of the system. The PyCaret Docker image comes pre-installed with a Jupyter Notebook, and it can share resources with its host machine (access directories, use the GPU, connect to the Internet, etc.). The PyCaret Docker images are always tested for the latest major releases.

# default version
docker run -p 8888:8888 pycaret/slim

# full version
docker run -p 8888:8888 pycaret/full

๐Ÿƒโ€โ™‚๏ธ Quickstart

1. Functional API

# Classification Functional API Example

# loading sample dataset
from pycaret.datasets import get_data
data = get_data('juice')

# init setup
from pycaret.classification import *
s = setup(data, target = 'Purchase', session_id = 123)

# model training and selection
best = compare_models()

# evaluate trained model
evaluate_model(best)

# predict on hold-out/test set
pred_holdout = predict_model(best)

# predict on new data
new_data = data.copy().drop('Purchase', axis = 1)
predictions = predict_model(best, data = new_data)

# save model
save_model(best, 'best_pipeline')

2. OOP API

# Classification OOP API Example

# loading sample dataset
from pycaret.datasets import get_data
data = get_data('juice')

# init setup
from pycaret.classification import ClassificationExperiment
s = ClassificationExperiment()
s.setup(data, target = 'Purchase', session_id = 123)

# model training and selection
best = s.compare_models()

# evaluate trained model
s.evaluate_model(best)

# predict on hold-out/test set
pred_holdout = s.predict_model(best)

# predict on new data
new_data = data.copy().drop('Purchase', axis = 1)
predictions = s.predict_model(best, data = new_data)

# save model
s.save_model(best, 'best_pipeline')

📍 Modules

  • Classification: Functional API / OOP API
  • Regression: Functional API / OOP API
  • Time Series: Functional API / OOP API
  • Clustering: Functional API / OOP API
  • Anomaly Detection: Functional API / OOP API
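
Every module follows the same setup, train, and use pattern shown in the Quickstart; only the import changes. A minimal clustering sketch ('jewellery' is assumed to be one of PyCaret's bundled sample datasets):

# clustering follows the same pattern as classification
from pycaret.datasets import get_data
from pycaret.clustering import setup, create_model, assign_model

data = get_data('jewellery')
s = setup(data, session_id = 123)   # unsupervised: no target column
kmeans = create_model('kmeans')     # train a k-means model
labeled = assign_model(kmeans)      # original data plus cluster labels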

👥 Who should use PyCaret?

PyCaret is an open-source library that anybody can use. In our view, the ideal target audience of PyCaret is:

  • Experienced Data Scientists who want to increase productivity.
  • Citizen Data Scientists who prefer a low code machine learning solution.
  • Data Science Professionals who want to build rapid prototypes.
  • Data Science and Machine Learning students and enthusiasts.

🎮 Training on GPUs

To train models on the GPU, simply pass use_gpu = True in the setup function. The API itself does not change, although in some cases additional libraries have to be installed. The following models can be trained on GPUs (see the sketch after this list):

  • Extreme Gradient Boosting
  • CatBoost
  • Light Gradient Boosting Machine (requires GPU installation)
  • Logistic Regression, Ridge Classifier, Random Forest, K Neighbors Classifier, K Neighbors Regressor, Support Vector Machine, Linear Regression, Ridge Regression, Lasso Regression (require cuML >= 0.15)
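
The GPU path is identical to the CPU path apart from the one flag mentioned above; a minimal sketch reusing the Quickstart dataset:

# train on GPU by passing use_gpu = True in setup
from pycaret.datasets import get_data
from pycaret.classification import setup, compare_models

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123, use_gpu = True)
best = compare_models()   # GPU-capable estimators now train on the GPU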

🖥️ PyCaret Intel sklearnex support

You can apply Intel optimizations for machine learning algorithms to speed up your workflow. To train models with Intel optimizations, use the sklearnex engine. The API itself does not change; however, Intel sklearnex must be installed first:

pip install scikit-learn-intelex
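
A minimal sketch, assuming PyCaret 3's per-model engine parameter: after installing scikit-learn-intelex, request the sklearnex engine for a supported estimator such as logistic regression.

# use the Intel-optimized sklearnex engine for logistic regression
from pycaret.datasets import get_data
from pycaret.classification import setup, create_model

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123)
lr = create_model('lr', engine = 'sklearnex')   # engine parameter assumed (PyCaret 3)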

🤝 Contributors

📝 License

PyCaret is completely free and open-source and licensed under the MIT license.

ℹ️ More Information

Important links:

  • ⭐ Tutorials: Tutorials developed and maintained by core developers
  • 📋 Example Notebooks: Example notebooks created by the community
  • 📙 Blog: Official blog by the creator of PyCaret
  • 📚 Documentation: API docs
  • 📺 Videos: Video resources
  • ✈️ Cheat sheet: Community cheat sheet
  • 📢 Discussions: Community discussion board on GitHub
  • 🛠️ Release Notes: Release notes


pycaret's Issues

int32 issue

Hi,
I tried to use the package for a classification problem. When I passed a dataframe whose target column had dtype 'int32' to setup and ran the code, it showed no dtype for the dataframe. When I changed the dtype to int64, it worked. Can you fix this issue?
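
Until this is fixed, a workaround consistent with the report is to cast the target to int64 before calling setup(); the file and column names below are hypothetical:

# workaround sketch: cast the int32 target to int64 before setup()
import pandas as pd

df = pd.read_csv('data.csv')                 # hypothetical input
df['target'] = df['target'].astype('int64')  # int32 -> int64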

AttributeError

Hi - I really enjoy the package! However, when executing the script on a server, I get the following error:

Traceback (most recent call last):
File "RF_pycaret.py", line 15, in
model1 = create_model('rf')
File "/home/../.conda/envs/pycaret/lib/python3.8/site-packages/pycaret/regression.py", line 1691, in create_model
display_id = display_.display_id
AttributeError: 'NoneType' object has no attribute 'display_id'

Any clue what's going on?
Thanks for your help!

Problem with numba

Hi everyone!
When I try to run a script that uses the pycaret classification module, this happens:

Traceback (most recent call last):
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\numba\core\typeconv\typeconv.py", line 4, in
from numba.core.typeconv import _typeconv
ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:/Users/tamara.ciric/PycharmProjects/intelliq/model/risk/services/ml/pycaret_train.py", line 76, in
train(date=datetime(2019, 1, 1), sources=config.DATA_SOURCES)
File "C:/Users/tamara.ciric/PycharmProjects/intelliq/model/risk/services/ml/pycaret_train.py", line 43, in train
exp_clf = setup(X, target=config.Y)
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\pycaret\classification.py", line 880, in setup
from pycaret import preprocess
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\pycaret\preprocess.py", line 26, in
from pyod.models.knn import KNN
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\pyod_init_.py", line 4, in
from . import utils
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\pyod\utils_init_.py", line 11, in
from .stat_models import pairwise_distances_no_broadcast
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\pyod\utils\stat_models.py", line 11, in
from numba import njit
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\numba_init_.py", line 20, in
from numba.misc.special import (
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\numba\misc\special.py", line 3, in
from numba.core.typing.typeof import typeof
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\numba\core\typing_init_.py", line 1, in
from .context import BaseContext, Context
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\numba\core\typing\context.py", line 11, in
from numba.core.typeconv import Conversion, rules
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\numba\core\typeconv\rules.py", line 2, in
from .typeconv import TypeManager, TypeCastingRules
File "C:\ProgramData\Anaconda3_all\envs\intelliq_risk\lib\site-packages\numba\core\typeconv\typeconv.py", line 17, in
raise ImportError(msg % (url, reportme, str(e), sys.executable))
ImportError: Numba could not be imported.
If you are seeing this message and are undertaking Numba development work, you may need to re-run:

python setup.py build_ext --inplace

(Also, please check the development set up guide http://numba.pydata.org/numba-doc/latest/developer/contributing.html.)

If you are not working on Numba development:

Please report the error message and traceback, along with a minimal reproducer
at: https://github.com/numba/numba/issues/new

If more help is needed please feel free to speak to the Numba core developers
directly at: https://gitter.im/numba/numba

Thanks in advance for your help in improving Numba!

The original error was: 'DLL load failed: The specified module could not be found.'

If possible please include the following in your error report:

sys.executable: C:\ProgramData\Anaconda3_all\envs\intelliq_risk\python.exe

Tree visualization

Hi,

Thank you for a nice package.

Is there a way to visualize the tree output in PyCaret, like plot_tree or export_graphviz?
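
PyCaret does not document a built-in tree plot here, but since create_model('dt') returns a fitted scikit-learn estimator, sklearn's own plot_tree can be used as a workaround; a minimal sketch:

# visualize a trained decision tree with sklearn's plot_tree
import matplotlib.pyplot as plt
from sklearn.tree import plot_tree
from pycaret.datasets import get_data
from pycaret.classification import setup, create_model

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123)
dt = create_model('dt')                      # fitted DecisionTreeClassifier

plt.figure(figsize = (20, 10))
plot_tree(dt, filled = True, max_depth = 3)  # truncate the plot for readability
plt.show()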

Creating separate Models

Please add a dictionary of alternative model names for creating separate models. PyCaret doesn't recognize 'Extra Trees Classifier'.
[screenshot of the error]

It would be helpful to get the names from this table:
[screenshot of the model name table]

Vocab Size showing zero while using NLP (pycaret.nlp)

Can you please provide a solution for this? I have tried many datasets, but the vocab size shows zero and create_model('lda') does not work, showing the error "cannot compute LDA over an empty collection (no terms)".

I attached a screenshot as well; you can see it below:
[screenshot]

interpret_model() problem with XGBoost regression

I was trying interpret_model with XGBoost for a regression model and got this error:

[screenshot: shap_xgboost]

I searched for it, and it seems to be a known issue with shap 0.32.1 (shap/shap#887). I updated to shap 0.35.0 and the problem was fixed. Maybe you should consider not using the shap 0.32.1 version.

Problem installing pycaret

Hi everyone,

I had some issues pip-installing pycaret. Some of my other libraries use the same underlying packages as pycaret, but require more recent versions.

Here are the error messages I get:

ERROR: prettierplot 0.1.1 has requirement scikit-learn>=0.22.1, but you'll have scikit-learn 0.22 which is incompatible.
ERROR: mlmachine 0.1.3 has requirement catboost>=0.22, but you'll have catboost 0.20.2 which is incompatible.
ERROR: mlmachine 0.1.3 has requirement scikit-learn>=0.22.1, but you'll have scikit-learn 0.22 which is incompatible.
ERROR: mlmachine 0.1.3 has requirement shap>=0.35.0, but you'll have shap 0.32.1 which is incompatible.
ERROR: mlmachine 0.1.3 has requirement xgboost>=1.0.2, but you'll have xgboost 0.90 which is incompatible.

Would it be possible to update your requirements file?

Thank you

Bayesian Hyperparameter Optimization

Hi,
I was wondering if we could have a Bayesian hyperparameter optimization technique instead of random grid search. This would speed up tuning and allow us to search a much larger grid scientifically. We could have this enhancement along with the ability to add a custom grid for tuning.

Thanks
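
Later PyCaret releases added alternative search backends to tune_model(); a minimal sketch, assuming PyCaret >= 2.2 with scikit-optimize installed (pip install scikit-optimize):

# Bayesian hyperparameter search instead of random grid search
from pycaret.datasets import get_data
from pycaret.classification import setup, create_model, tune_model

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123)
dt = create_model('dt')
tuned = tune_model(dt, search_library = 'scikit-optimize', search_algorithm = 'bayesian')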

IPython exceptions for Python

When using Python directly, IPython is not supported, so the display function doesn't work. Create an exception rule for this in the future.

xgboost not working

First of all, I love this package. It is great and makes training, testing, and tuning so much simpler.

I am unable to compare or create an xgboost classification or an xgboost regression model. With the same dataset, all of the other models work except xgboost. For classification, I get the error:
"attempt to get argmax of an empty sequence"

For regression, I get the error:
"Found array with 0 feature(s) (shape=(4035, 0)) while a minimum of 1 is required."

Please advise.

Top Models from Compare Models

Hi,

I was wondering if there is a way to select the top 'n' models from compare_models. If we could have this feature, the whole training process could be automated. It would also help users on platforms like KNIME/Power BI that do not support HTML output.

Thanks,
Riaz
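
Later releases added exactly this through the n_select argument; a minimal sketch, assuming PyCaret >= 2.1:

# compare_models returns a list when n_select > 1
from pycaret.datasets import get_data
from pycaret.classification import setup, compare_models

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123)
top3 = compare_models(n_select = 3)   # list of the three best models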

Change Fold Strategy

In pycaret, the number of k-folds can be changed by passing fold as an argument. However, the fold strategy is fixed as a random split (i.e. KFold) and cannot be changed by the user. For general problems, there are many cases where you want to use GroupKFold or TimeSeriesSplit.

So I propose allowing the user to change the fold strategy by passing an instance that inherits from _BaseKFold: https://github.com/scikit-learn/scikit-learn/blob/95d4f0841d57e8b5f6b2a570312e9d832e69debc/sklearn/model_selection/_split.py#L269

It would be a fairly widespread change, but I think it would make the project even better.

Best Regards.
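
Later PyCaret versions added this through setup's fold_strategy (any scikit-learn compatible splitter) and fold_groups parameters; a minimal sketch, assuming PyCaret >= 2.2, with a hypothetical 'group_id' column:

# cross-validate with GroupKFold instead of the default KFold
from sklearn.model_selection import GroupKFold
from pycaret.datasets import get_data
from pycaret.classification import setup, compare_models

data = get_data('juice')
data['group_id'] = data.index % 5   # hypothetical grouping for the demo
s = setup(data, target = 'Purchase', session_id = 123,
          fold_strategy = GroupKFold(n_splits = 5),  # custom splitter instance
          fold_groups = 'group_id')                  # column holding group labels
best = compare_models()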

Dataframe constructor not properly called

While running a simple linear regression model on Kaggle using pycaret, I get the error "DataFrame constructor not properly called!" when running setup. The dataset contains only two columns.

df = pd.read_csv("/kaggle/input/salary-data-simple-linear-regression/Salary_Data.csv")
setup_data1 = setup(data = df, target = 'Salary', session_id=123)

Logs:

ValueError Traceback (most recent call last)
in
----> 1 setup_data1 = setup(data = df, target = 'Salary', session_id=123)

/opt/conda/lib/python3.6/site-packages/pycaret/regression.py in setup(data, target, train_size, sampling, sample_estimator, categorical_features, categorical_imputation, ordinal_features, high_cardinality_features, high_cardinality_method, numeric_features, numeric_imputation, date_features, ignore_features, normalize, normalize_method, transformation, transformation_method, handle_unknown_categorical, unknown_categorical_method, pca, pca_method, pca_components, ignore_low_variance, combine_rare_levels, rare_level_threshold, bin_numeric_features, remove_outliers, outliers_threshold, remove_multicollinearity, multicollinearity_threshold, create_clusters, cluster_iter, polynomial_features, polynomial_degree, trigonometry_features, polynomial_threshold, group_features, group_names, feature_selection, feature_selection_threshold, feature_interaction, feature_ratio, interaction_threshold, transform_target, transform_target_method, session_id, silent, profile)
955 target_transformation = transform_target, #new
956 target_transformation_method = transform_target_method_pass, #new
--> 957 random_state = seed)
958
959 progress.value += 1

/opt/conda/lib/python3.6/site-packages/pycaret/preprocess.py in Preprocess_Path_One(train_data, target_variable, ml_usecase, test_data, categorical_features, numerical_features, time_features, features_todrop, display_types, imputation_type, numeric_imputation_strategy, categorical_imputation_strategy, apply_zero_nearZero_variance, club_rare_levels, rara_level_threshold_percentage, apply_untrained_levels_treatment, untrained_levels_treatment_method, apply_ordinal_encoding, ordinal_columns_and_categories, apply_cardinality_reduction, cardinal_method, cardinal_features, apply_binning, features_to_binn, apply_grouping, group_name, features_to_group_ListofList, apply_polynomial_trigonometry_features, max_polynomial, trigonometry_calculations, top_poly_trig_features_to_select_percentage, scale_data, scaling_method, Power_transform_data, Power_transform_method, target_transformation, target_transformation_method, remove_outliers, outlier_contamination_percentage, outlier_methods, apply_feature_selection, feature_selection_top_features_percentage, remove_multicollinearity, maximum_correlation_between_features, remove_perfect_collinearity, apply_feature_interactions, feature_interactions_to_apply, feature_interactions_top_features_to_select_percentage, cluster_entire_data, range_of_clusters_to_try, apply_pca, pca_method, pca_variance_retained_or_number_of_components, random_state)
2538 return(pipe.fit_transform(train_data),pipe.transform(test_data))
2539 else:
-> 2540 return(pipe.fit_transform(train_data))
2541
2542

/opt/conda/lib/python3.6/site-packages/sklearn/pipeline.py in fit_transform(self, X, y, **fit_params)
381 """
382 last_step = self._final_estimator
--> 383 Xt, fit_params = self._fit(X, y, **fit_params)
384 with _print_elapsed_time('Pipeline',
385 self._log_message(len(self.steps) - 1)):

/opt/conda/lib/python3.6/site-packages/sklearn/pipeline.py in _fit(self, X, y, **fit_params)
311 message_clsname='Pipeline',
312 message=self._log_message(step_idx),
--> 313 **fit_params_steps[name])
314 # Replace the transformer of the step with the fitted
315 # transformer. This is necessary when loading the transformer

/opt/conda/lib/python3.6/site-packages/joblib/memory.py in call(self, *args, **kwargs)
353
354 def call(self, *args, **kwargs):
--> 355 return self.func(*args, **kwargs)
356
357 def call_and_shelve(self, *args, **kwargs):

/opt/conda/lib/python3.6/site-packages/sklearn/pipeline.py in _fit_transform_one(transformer, X, y, weight, message_clsname, message, **fit_params)
724 with _print_elapsed_time(message_clsname, message):
725 if hasattr(transformer, 'fit_transform'):
--> 726 res = transformer.fit_transform(X, y, **fit_params)
727 else:
728 res = transformer.fit(X, y, **fit_params).transform(X)

/opt/conda/lib/python3.6/site-packages/pycaret/preprocess.py in fit_transform(self, dataset, y)
1957 def fit_transform(self,dataset,y=None):
1958 data = dataset.copy()
-> 1959 corr = pd.DataFrame(np.corrcoef(data.drop(self.target,axis=1).T))
1960 corr.columns = data.drop(self.target,axis=1).columns
1961 corr.index = data.drop(self.target,axis=1).columns

/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
483 )
484 else:
--> 485 raise ValueError("DataFrame constructor not properly called!")
486
487 NDFrame.__init__(self, mgr, fastpath=True)

ValueError: DataFrame constructor not properly called!

Python 64-bit only

The website and Git documentation should mention that PyCaret supports 64-bit systems only.

Lack of info in the create_model function documentation

In the create_model() function, after the model has been evaluated with k-fold cross-validation and is ready to be used to make predictions on unseen data, one question remained (I've read the function documentation and didn't find an answer): is the model trained on the entire data set?

If yes, it would be great to have this piece of information stated in the function docs.

How to get the values of the compare_models table

When we use the create_model/compare_models/tune_model functions, a table containing a lot of information is returned. Is there any function for getting a specific row/column/value of that table?
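
Later PyCaret versions expose pull(), which returns the most recently displayed score grid as a pandas DataFrame; a minimal sketch, assuming PyCaret >= 2.0:

# fetch the compare_models table as a DataFrame
from pycaret.datasets import get_data
from pycaret.classification import setup, compare_models, pull

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123)
best = compare_models()
grid = pull()                  # the comparison table as a DataFrame
print(grid.iloc[0])            # row for the best model
print(grid['Accuracy'].max())  # a single value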

Normalization is not working

Hi, I just noticed that the example at https://pycaret.org/normalization doesn't normalize all numerical columns in the dataset as shown in the example. Instead, it only normalizes the target class if its data type is numeric.

# Importing dataset
from pycaret.datasets import get_data
pokemon = get_data('pokemon')

# Importing module and initializing setup
from pycaret.classification import *
clf1 = setup(data = pokemon, target = 'Legendary', normalize = True)

Cross validation with/without shuffling rows

Hi. Great package. Thank you.

Is there a way to select whether we want rows to be shuffled or not during CV? What is the default behavior? Not shuffling is critical for time series analysis.

Thanks
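
In later PyCaret versions this is controlled in setup(): fold_shuffle toggles shuffling in the CV splitter and defaults to False; a minimal sketch, assuming PyCaret >= 2.2 (recent versions also accept fold_strategy = 'timeseries'):

# keep row order intact during cross-validation
from pycaret.datasets import get_data
from pycaret.classification import setup

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123, fold_shuffle = False)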

Not able to fill missing values

Hello,

According to the Missing Value Imputation section, I am trying to fill in the missing values, but the syntax is not mentioned. I am searching for the syntax to impute the missing values as described on the page.

I tried the following code, but it does not work.

import pycaret
from pycaret.datasets import get_data
hepatitis = get_data('hepatitis')
# Importing module and initializing setup
from pycaret.classification import *
clf1 = setup(data = hepatitis, target = 'Class',numeric_imputation='mean',categorical_imputation = 'mode')

The above code did not work and the missing values still exist in the dataset. Kindly provide the syntax for this.

Thanks

Specify "n_jobs" Parameters

Currently n_jobs=-1 is set in compare_models, tune_model, and so on. But I think it would be more useful if the user could specify it when the calculation is done on a shared server. I would appreciate it if you could consider this.
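
Later PyCaret versions let you set this globally in setup() via n_jobs (default -1, i.e. all cores); a minimal sketch limiting training to 4 cores:

# restrict PyCaret to 4 cores on a shared server
from pycaret.datasets import get_data
from pycaret.classification import setup, compare_models

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123, n_jobs = 4)
best = compare_models()   # downstream training respects n_jobs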

Custom Tuning Grid

Hi,

As I understand it, the current tune_model() uses randomized grid search for tuning a model. However, how can I provide my own tuning grid for selected hyperparameters? I don't think PyCaret currently offers this flexibility. If we could have this feature, it would be very instrumental.

Thanks
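
Later releases added a custom_grid argument to tune_model(); a minimal sketch, assuming PyCaret >= 2.0 (parameter names follow the underlying scikit-learn estimator):

# tune a decision tree over a user-defined grid
from pycaret.datasets import get_data
from pycaret.classification import setup, create_model, tune_model

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123)
dt = create_model('dt')
tuned = tune_model(dt, custom_grid = {
    'max_depth': [2, 4, 6, 8],
    'min_samples_leaf': [1, 5, 10],
})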

Clustering create_model not working

The create_model function for KMeans and other clustering models seems not to be working. I get the following error:

SystemExit: (Value Error): Estimator Not Available. Please see docstring for list of available estimators.

Even the n_clusters argument is not available.

Setup worked fine.

I am using Colab for processing.

Merging Stack_Model and Stacknet

Hi,

stack_model() and create_stacknet() are both wonderful features, something I haven't seen provided by any other package. However, intrinsically I feel the only difference between them is the number of layers. If we had just one function with options for choosing the number of layers and the estimators at each layer, we would not need two functions here.

Thanks.

Confidence interval for performance metrics

Is there any way to compute a 95% CI for all performance metrics (especially on the hold-out test set), since a single value doesn't really help that much in evaluating model robustness? Many thanks for the great work you have done and keep on doing.
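
PyCaret does not report confidence intervals itself, but a bootstrap over the hold-out predictions gives one; a minimal sketch (the 'prediction_label' column name assumes PyCaret 3; it is 'Label' in 2.x):

# bootstrap a 95% CI for hold-out accuracy
import numpy as np
from sklearn.metrics import accuracy_score
from pycaret.datasets import get_data
from pycaret.classification import setup, create_model, predict_model

data = get_data('juice')
s = setup(data, target = 'Purchase', session_id = 123)
model = create_model('lr')
pred = predict_model(model)                   # scores the hold-out set

y_true = pred['Purchase'].to_numpy()
y_pred = pred['prediction_label'].to_numpy()  # 'Label' in PyCaret 2.x

rng = np.random.default_rng(123)
scores = []
for _ in range(1000):
    idx = rng.integers(0, len(y_true), len(y_true))   # resample with replacement
    scores.append(accuracy_score(y_true[idx], y_pred[idx]))
lo, hi = np.percentile(scores, [2.5, 97.5])
print(f"Accuracy 95% CI: [{lo:.3f}, {hi:.3f}]")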

ValueError: need at most 63 handles, got a sequence of length 65

When I try to run tune_model() on a Windows server, I receive the following error:

Exception in thread QueueManagerThread: Traceback (most recent call last): File "d:\anaconda3\envs\pycaret\lib\threading.py", line 916, in _bootstrap_inner self.run() File "d:\anaconda3\envs\pycaret\lib\threading.py", line 864, in run self._target(*self._args, **self._kwargs) File "d:\anaconda3\envs\pycaret\lib\site-packages\joblib\externals\loky\process_executor.py", line 615, in _queue_management_worker ready = wait(readers + worker_sentinels) File "d:\anaconda3\envs\pycaret\lib\multiprocessing\connection.py", line 859, in wait ready_handles = _exhaustive_wait(waithandle_to_obj.keys(), timeout) File "d:\anaconda3\envs\pycaret\lib\multiprocessing\connection.py", line 791, in _exhaustive_wait res = _winapi.WaitForMultipleObjects(L, False, timeout) ValueError: need at most 63 handles, got a sequence of length

I tried to run it on two different Windows servers using different Anaconda environments with Python 3.6.10 and 3.7.7 and got the same error.

Estimator Error

I tried regression data and the CatBoost regressor was found to be the best performing model, but after tuning the model I tried to evaluate it to view the hyperparameters and got the error below:
SystemExit: (Estimator Error): CatBoost estimator is not compatible with plot_model function, try using Catboost with interpret_model instead.

Couldn't find any tests

Hello,

First of all, thanks a lot for this comprehensive library.

I wanted to add a cross-platform continuous integration workflow PR but couldn't locate the tests for the implemented functions. Where can I find them?

If there aren't any, you should definitely consider adding unit/integration tests; otherwise it will be extremely hard to identify/localize bugs and errors.

Best regards

Meta model default in stacking

The default meta-model is logistic/linear regression for classification/regression. Though it might work in certain cases, in the places where I have used it, especially with restack = True, the model performance only improves when I choose one of the better performing models as the meta-model. Since most users would begin with compare_models(), if there were a way to have stacking default to the best model from compare_models, it would be more convenient.

Some of the required packages are missing in Conda and Conda-Forge.

conda install -f -y -q -c conda-forge --file requirements.txt

PackagesNotFoundError: The following packages are not available from current channels:

  • cufflinks==0.17.0
  • kmodes==0.10.1
  • datefinder==0.7.0
  • yellowbrick==1.0.1
  • datetime==4.3

Alternatives would be appreciated. Thanks in advance

Parallelizing compare_models

Hi team,

Wonderful work with pycaret! I'm wondering if you might be open to a PR that parallelizes compare_models(), such that one core takes on one model class? Yes, this might change some of the UI elements, but it might also speed up training. I have some experience using Dask and am happy to lend some time to make this happen, though I might also need some guidance through the codebase at some point.

Memory-related crash

Dear All,

Thanks for the great library. I'm facing problems when trying to use pandas to import my CSV, which has 130 rows and 110 columns. It consumes all my RAM (64 GB, MacBook Pro 16") and crashes. Any ideas? Up till now, I was able to use it only with CSV files with a limited number of columns.

Nikos
