
dstools's Introduction

Handy tools for data scientists


Available functions

metrics.calculate_relationship Determine whether y is positively related, negatively related, or unrelated to x

metrics.cosine_similarity Cosine similarity between two vectors

metrics.jarccard_index Calculate the Jaccard index (also known as Jaccard similarity).

metrics.ks_score Calculate the Kolmogorov-Smirnov score

metrics.lift_table Create lift table given cutoff point or number of bins

metrics.psi Calculate PSI given two arrays.
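To make the metrics above concrete, here is a minimal plain-Python sketch of four of them. It is not the dstools implementation: the function names (with the `_sketch` suffix), their signatures, the binning scheme, and the `eps` floor are all assumptions made for illustration, and the real functions may use different conventions.

```python
import math
from bisect import bisect_right


def cosine_similarity_sketch(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def jaccard_index_sketch(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    # Assumption: two empty sets are given a similarity of 0.0.
    return len(a & b) / len(union) if union else 0.0


def ks_score_sketch(y_true, y_score):
    """Largest gap between the score CDFs of the 1-labelled and 0-labelled rows."""
    pos = sorted(s for t, s in zip(y_true, y_score) if t == 1)
    neg = sorted(s for t, s in zip(y_true, y_score) if t == 0)
    # Evaluate both empirical CDFs at every distinct score (handles ties).
    return max(
        abs(bisect_right(pos, t) / len(pos) - bisect_right(neg, t) / len(neg))
        for t in set(y_score)
    )


def psi_sketch(expected, actual, bins=10, eps=1e-6):
    """Population stability index over equal-width bins of `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Values outside [lo, hi] are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each share at eps so the logarithm stays finite.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

In this sketch, a PSI of zero means the two samples fall into the bins in identical proportions, and a KS score of 1.0 means the scores separate the two classes perfectly.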

sklearn_extension.BaseEstimator Base class for all estimators in scikit-learn

sklearn_extension.Binning Base class for all Binning functionalities,

sklearn_extension.BorderlineSMOTE Over-sampling using Borderline SMOTE.

sklearn_extension.ChiSquareBinning No documentation found.

sklearn_extension.ClassifierMixin Mixin class for all classifiers in scikit-learn.

sklearn_extension.ConditionalWrapper A conditional wrapper that makes a Scikit-Learn transformer work on only part of the data

sklearn_extension.CorrelationRemover No documentation found.

sklearn_extension.EntropyBinning No documentation found.

sklearn_extension.EqualFrequencyBinning No documentation found.

sklearn_extension.EqualWidthBinning No documentation found.

sklearn_extension.IQROutlierRemover Removing outliers based on IQR,

sklearn_extension.IVBinning No documentation found.

sklearn_extension.IncrementalLogisticRegression Incremental Logistic Regression

sklearn_extension.Inspect A step that can be plugged into the pipeline to inspect the

sklearn_extension.KMeansSMOTE Apply a KMeans clustering before to over-sample using SMOTE.

sklearn_extension.KSBinning No documentation found.

sklearn_extension.NormDistOutlierRemover Removing outliers assuming data is independent and follows a normal distribution

sklearn_extension.NotFittedError Exception class to raise if estimator is used before fitting.

sklearn_extension.OrdinalEncoder Similar to Scikit-Learn OrdinalEncoder but allows for arbitrary ordering in the columns,

sklearn_extension.Pipeline A drop-in replacement for the Scikit-Learn Pipeline object that supports

sklearn_extension.QuantileOutlierRemover Removing outliers based on skewness threshold

sklearn_extension.RandomOverSampler Class to perform random over-sampling.

sklearn_extension.SMOTE Class to perform over-sampling using SMOTE.

sklearn_extension.SVMSMOTE Over-sampling using SVM-SMOTE.

sklearn_extension.SparsityRemover No documentation found.

sklearn_extension.StepwiseLogisticRegression Stepwise Logistic Regression

sklearn_extension.TreeBinner No documentation found.

sklearn_extension.WoeEncoder No documentation found.

sklearn_extension.equal_frequency_binning Shortcut for equal frequency binning on a Pandas.Series, returns

sklearn_extension.equal_width_binning Shortcut for equal width binning on a Pandas.Series, returns

sklearn_extension.iv Compute the IV stats for each feature and return a list of WoE values.

sklearn_extension.return_frame A class decorator for Scikit-Learn transformers

sklearn_extension.sort_columns_logistic Sort columns according to wald_chi2

sklearn_extension.sort_columns_tree Sort columns according to feature importance in tree method

sklearn_extension.woe Return a series mapping each feature value to its WoE stats
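As a rough illustration of the weight-of-evidence (WoE) and information value (IV) statistics mentioned above, the sketch below computes both for one categorical feature against a binary target. It is not the dstools code: the function name `woe_iv_sketch`, the dictionary return type, the sign convention (log of the share of 0-labels over the share of 1-labels), and the additive smoothing constant `eps` are assumptions made for this example.

```python
import math
from collections import Counter


def woe_iv_sketch(feature, target, eps=0.5):
    """Return ({level: woe}, iv) for a categorical feature and a 0/1 target."""
    good = Counter(x for x, y in zip(feature, target) if y == 0)
    bad = Counter(x for x, y in zip(feature, target) if y == 1)
    levels = sorted(set(feature))

    # Additive smoothing keeps the logarithm finite when a level has
    # no rows of one class.
    good_total = sum(good[v] + eps for v in levels)
    bad_total = sum(bad[v] + eps for v in levels)

    woe, iv = {}, 0.0
    for v in levels:
        g = (good[v] + eps) / good_total  # share of 0-labels in this level
        b = (bad[v] + eps) / bad_total    # share of 1-labels in this level
        woe[v] = math.log(g / b)
        iv += (g - b) * woe[v]
    return woe, iv
```

Under this convention, a level with a positive WoE holds relatively more 0-labelled rows, and an IV of zero means the feature does not separate the classes at all.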

utils.capture_output Capture stdout and stderr as string.

utils.check_same_length A decorator that checks all the arguments to be the same length

utils.create_multilevel_index Create two-level multilevel index from given index names.

utils.find_duplicates Find duplicate elements in an iterable

utils.flatten_list Flatten a nested list regardless of the depth.

utils.get_stats Return a pstats.Stats object from a statement.

utils.groupby groupby(iterable, key=None) -> make an iterator that returns consecutive

utils.is_scalar_nan Tests if x is NaN

utils.iter_date Iterate over days

utils.limit_precision Limit the precision of a float number

utils.maybe_mkdir Create the directory if it does not exist.

utils.ngram Generate n-grams from an iterable.

utils.plot_distribution Show the plot for the specified distribution

utils.print_source_code Print the source code of an object.

utils.print_stats Print out the profiling detail from the statement sorted by *keys

utils.read_csv Read multiple CSV files and concatenate them row-wise

utils.read_excel Read multiple Excel files and concatenate them row-wise

utils.read_multiple_files No documentation found.

utils.read_sheets Read all the sheets in an Excel file and concatenate them row-wise

utils.return_default A decorator that checks the first argument and, if it meets the criteria, simply returns the default_value

utils.set_default A decorator that checks the first argument and, if it meets the criteria, replaces it with default_value

utils.timeit A decorator that times the function and logs the information.

utils.today Return the date of today as a string.

utils.weighted_sum No documentation found.

utils.write_dict_to_excel Save a dictionary to an Excel file with each key being the sheet name
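For a feel of what a few of the simpler utilities describe, here is a small standard-library sketch of list flattening, n-gram generation, and duplicate detection. The names ending in `_sketch`, their parameters, and their return types are hypothetical; the actual dstools functions may differ in signature and behaviour.

```python
from collections import Counter


def flatten_list_sketch(nested):
    """Yield the leaves of an arbitrarily nested list, left to right."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten_list_sketch(item)
        else:
            yield item


def ngram_sketch(iterable, n=2):
    """Return all runs of n consecutive items as tuples."""
    items = list(iterable)
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def find_duplicates_sketch(iterable):
    """Return the items that occur more than once, in first-seen order."""
    counts = Counter(iterable)
    return [item for item, count in counts.items() if count > 1]
```

A short usage example under the same assumptions:

```python
print(list(flatten_list_sketch([1, [2, [3, [4]]], 5])))
print(ngram_sketch("abcd", 2))
print(find_duplicates_sketch([1, 2, 2, 3, 3, 3]))
```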
