data-analysis's Introduction

Read me

Repo directory:

  • Projects are split by folders

Model frameworks

Others

Techniques

Bayesian Regression

Causal Inference

Applied datasets

Installation

The analyses in this repo were built in Python 3.

Virtual environment setup

Some projects have their own requirements or environment. To install the general setup, run:

python3 -m venv dataAnalysisEnv
source dataAnalysisEnv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt

Markdown from Notebooks

jupyter nbconvert notebook.ipynb --to markdown

This conversion is automated via GitHub Actions.
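
To run the same conversion locally over many notebooks, a small Python wrapper around the command above can help. This is a minimal sketch, not the repo's actual automation: the helper names `build_nbconvert_command`, `find_notebooks`, and `convert_all` are hypothetical, and it assumes `jupyter` is on the PATH of the active environment.

```python
from pathlib import Path
import subprocess


def build_nbconvert_command(notebook: Path) -> list[str]:
    """Return the nbconvert command line for one notebook."""
    return ["jupyter", "nbconvert", str(notebook), "--to", "markdown"]


def find_notebooks(root: Path) -> list[Path]:
    """List notebooks under root, skipping Jupyter checkpoint copies."""
    return sorted(
        p for p in root.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )


def convert_all(root: Path) -> None:
    """Convert every notebook under root; needs jupyter on the PATH."""
    for notebook in find_notebooks(root):
        # check=True raises CalledProcessError if a conversion fails
        subprocess.run(build_nbconvert_command(notebook), check=True)
```

Calling `convert_all(Path("."))` from the repo root would write a `.md` file next to each notebook, which is nbconvert's default output location.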

Standard library

A custom library is installed as a development library so that it can keep being developed.

VSCode

Use the settings.json file in the repo

Future areas

The aim is for future work to be done on separate branches and merged into master when finished.

Tools/areas to explore

Datasets to explore

Tasks

  • Build project template repo
  • Publish interpret-ml piece
  • NBA
    • Player position classification model
    • Bayesian sequential team rating
    • Player VAE - how are players related
      • College stats to NBA VAE
  • M5/M4 forecasting
    • Walmart demand forecasting
    • with LightGBM
    • Greykite
  • PCA via embedding layer
  • NN to predict tempo from song, generate dummy dataset
  • Word embeddings plot with hiplot
    • Plot with PCA first and compare with hiplot
  • Compare linear regression MC dropout to theoretical results
  • Optimal car charging schedule
  • MediaPipe - 3D audio
    • Face distance javascript web app with react
  • Covid UK plot against time on a map
  • Autoencoder using transfer learning?
    • what do we use for the decoder?
  • Fit a sinusoid to noisy data
    • Fourier
    • Gradient descent
    • MCMC
    • Variational inference
  • Double dip loss trajectories
  • Fitting NNs to common functions (exp etc.), deep vs wide, number of parameters for given error
  • Fit a NN to seasonal data with fourier series components
  • DoubleML on heart data to find CATE
  • GitHub Action to publish ipynbs to markdown
  • Hierarchical models
    • Mixed effects model - is it the same as a fixed effects model (lin/log regression) with one hot encoding for the categorical variables + a fixed effect?
    • Hierarchical Bayesian models - for when we have categorical features with shared effects over other features
    • Fit with MCMC
    • Similarities to ridge regression - only some coefficients are regularised
    • Generate data and fit each model
    • Ref
  • Binomial regression = logistic regression
  • Linear regression = logistic regression, relationship to Linear Thompson Sampling
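
As a possible starting point for the "Fit a sinusoid to noisy data" task above, here is a minimal sketch of the Fourier route followed by a linear least-squares step. It is an illustration under assumptions, not an implementation from this repo: the function name `fit_sinusoid` and the synthetic data parameters are hypothetical, the samples are assumed to be evenly spaced, and the frequency estimate is only as fine as the FFT bin width.

```python
import numpy as np


def fit_sinusoid(t, y):
    """Fit y ~ offset + amplitude * sin(2*pi*freq*t + phase).

    Assumes t is evenly spaced. The frequency is taken from the largest
    FFT peak, so it is only accurate to the FFT bin width.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    dt = t[1] - t[0]

    # Step 1: coarse frequency estimate from the spectrum. The mean is
    # removed and bin 0 is skipped so the offset does not dominate.
    spectrum = np.abs(np.fft.rfft(y - y.mean()))
    freqs = np.fft.rfftfreq(len(y), d=dt)
    freq = freqs[np.argmax(spectrum[1:]) + 1]

    # Step 2: with freq fixed the model is linear in (a, b, c):
    # y = a*sin(w*t) + b*cos(w*t) + c
    w = 2 * np.pi * freq
    design = np.column_stack([np.sin(w * t), np.cos(w * t), np.ones_like(t)])
    (a, b, c), *_ = np.linalg.lstsq(design, y, rcond=None)

    # Convert (a, b) to amplitude and phase of a single sine term.
    amplitude = np.hypot(a, b)
    phase = np.arctan2(b, a)
    return freq, amplitude, phase, c


# Synthetic example (all values are made up for illustration).
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 500, endpoint=False)
y = 0.5 + 2.0 * np.sin(2 * np.pi * 1.5 * t + 0.3) + rng.normal(0, 0.3, t.size)
freq, amplitude, phase, offset = fit_sinusoid(t, y)
```

The gradient descent, MCMC, and variational inference variants listed in the task could reuse the same synthetic data and compare their estimates against this baseline.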

data-analysis's People

Contributors

stanton119
