pymlviz's Introduction

Interactive Exploration and Visualization of Algorithms for Machine Learning and Data Science

Mission

The rapid rise in popularity of data science and machine learning poses new challenges for the educational system. Traditional learning paradigms such as ex-cathedra teaching and textbooks not only fail to accommodate individual student learning behavior, but also cannot support free exploration of algorithms. While many algorithms in machine learning can be grouped conceptually, there is a plethora of variants and specific implementations. When following a fixed curriculum, it is often difficult to grasp the subtleties, weaknesses and failure modes of the algorithms taught. Links to corresponding methodological alternatives may be missing or introduced at a much later point.

We believe that free exploration and interactivity, experienced by working through concrete examples, are at the heart of a satisfactory learning process. Among a growing number of online resources, we specifically aim to contribute to the development of interactive hands-on material geared towards a holistic understanding of algorithmic, implementation and application aspects.

Goals

We found that much of the existing online material tends to focus on specific aspects of understanding machine learning algorithms. Typically, content is divided by method. For example, there are many online tutorials on gradient descent or sampling algorithms, but the methods are rarely contrasted directly in an easy-to-explore fashion. Depending on the source of the material, we generally encounter a focus on either how to implement, how to understand the equations or how to apply what we have just learned to a specific context. In this project we attempt to present a layered view of content in which the reader is free to explore content at any of these levels. In detail, we have formulated the following goals:

  • Three different layers: visual, free-to-explore examples; explanations of the underlying equations and theory; and an implementation layer.
  • Easy-to-use interactive widgets with full control of parameters and the ability to execute algorithms step by step to visualize and understand intermediate algorithmic states.
  • The ability to select and contrast algorithms and their variants on the same example, highlighting the nuanced failure modes of particular algorithmic choices.
  • Commented, accessible, open-source, Python-only code, in contrast to e.g. commonly used JavaScript implementations. We believe this eases the learning process, as Python is the current go-to programming language for machine learning.
  • Material that is accessible and executable online, with the option to download it in a self-contained, executable form.
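
The step-by-step goal above can be illustrated with a Python generator that yields every intermediate state, so that a widget could advance the algorithm one step at a time. This is only a minimal sketch under our own assumptions: the function names `target_density` and `rejection_sampler_steps` and the triangular target are hypothetical illustrations, not code taken from the notebooks.

```python
import random


def target_density(x):
    """Unnormalized triangular density on [0, 1], peaking at x = 0.5 (illustrative choice)."""
    return 1.0 - abs(2.0 * x - 1.0)


def rejection_sampler_steps(n_steps, seed=0):
    """Yield one state dict per proposal so a caller can inspect each step."""
    rng = random.Random(seed)
    accepted = []
    for step in range(n_steps):
        x = rng.random()  # proposal drawn from Uniform(0, 1)
        u = rng.random()  # uniform height under the envelope (the density never exceeds 1)
        is_accepted = u < target_density(x)
        if is_accepted:
            accepted.append(x)
        yield {
            "step": step,
            "proposal": x,
            "height": u,
            "accepted": is_accepted,
            "n_accepted": len(accepted),
        }


# Collect all intermediate states; an interactive front end could instead
# call next() on the generator once per button click.
states = list(rejection_sampler_steps(1000))
acceptance_rate = states[-1]["n_accepted"] / len(states)
```

Because the triangle covers half of the unit square, the acceptance rate should come out close to 0.5.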

Technical Details

While the technical details are open to evolution over time, we are currently pursuing an approach using the following methods:

  • Python as the only programming language.
  • This website/repository to aggregate the content.
  • Executable Jupyter notebooks (http://jupyter.org) to explore the underlying algorithm code and code for visualizations with the help of Binder (https://mybinder.org).
  • Interactive widgets with parameter sliders, algorithm choices and data choices. We currently employ Bloomberg's bqplot (https://github.com/bloomberg/bqplot).
  • Downloadable Docker containers (https://www.docker.com) for local execution.

MyBinder:

Binder

Click the launch binder button above or follow this URL to open this repository in a pre-built environment:

https://mybinder.org/v2/gh/PyMLVizard/PyMLViz/master?filepath=Index.ipynb
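
The URL above appears to follow the pattern `https://mybinder.org/v2/gh/<org>/<repo>/<ref>?filepath=<notebook>`. As a small sketch, assuming that pattern holds, such links can be built for other notebooks in the repository. The helper name `binder_url` is hypothetical and not part of any Binder tooling.

```python
from urllib.parse import quote, urlencode


def binder_url(org, repo, ref="master", filepath=None):
    """Build a mybinder.org launch URL for a GitHub repository (assumed URL pattern)."""
    base = "https://mybinder.org/v2/gh/{}/{}/{}".format(
        quote(org, safe=""), quote(repo, safe=""), quote(ref, safe="")
    )
    if filepath:
        # The notebook to open is passed as the "filepath" query parameter.
        return base + "?" + urlencode({"filepath": filepath})
    return base


url = binder_url("PyMLVizard", "PyMLViz", "master", "Index.ipynb")
print(url)
```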

Note: Chrome or Firefox is recommended for using the notebooks.

Direct Links to Contents/Notebooks

Linear regression

  1. Linear Regression

Sampling methods

  1. Introduction. Inversion Sampling
  2. Rejection sampling
  3. Importance sampling
  4. Markov chain Monte Carlo (MCMC) sampling: Metropolis-Hastings algorithm
  5. Gibbs Sampling
  6. Slice sampling
  7. Hamiltonian Monte Carlo (HMC) sampling
  8. PyStan
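
As a taste of the simplest method in this list, inversion sampling draws a uniform number and maps it through the inverse cumulative distribution function of the target. The following is a minimal sketch written for this overview, not code from the notebooks. The exponential target and the name `sample_exponential` are illustrative assumptions.

```python
import math
import random


def sample_exponential(rate, rng):
    """Draw one Exponential(rate) sample by inverting the CDF F(x) = 1 - exp(-rate * x)."""
    u = rng.random()  # u lies in [0, 1), so 1 - u lies in (0, 1] and the log is defined
    return -math.log(1.0 - u) / rate


rng = random.Random(42)
rate = 2.0
samples = [sample_exponential(rate, rng) for _ in range(20000)]

# The sample mean should be close to the theoretical mean 1 / rate.
sample_mean = sum(samples) / len(samples)
```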

Gradient descent methods

  1. Introduction. Gradient and stochastic gradient descent
  2. Variants. Momentum, Nesterov, Adagrad, RMSProp and Adam
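
To give a flavor of how such variants can be contrasted on the same example, the sketch below runs plain gradient descent and classical (heavy-ball) momentum on one ill-conditioned quadratic. It is an illustration under our own assumptions, not the notebooks' implementation. The loss function, step size, momentum coefficient and function names are all hypothetical choices.

```python
def loss(p):
    """Ill-conditioned quadratic bowl: much steeper in y than in x."""
    x, y = p
    return 0.5 * (x * x + 25.0 * y * y)


def grad(p):
    """Gradient of the quadratic loss above."""
    x, y = p
    return (x, 25.0 * y)


def gradient_descent(start, lr, n_steps):
    """Plain gradient descent; returns the full path of visited points."""
    p = start
    path = [p]
    for _ in range(n_steps):
        g = grad(p)
        p = (p[0] - lr * g[0], p[1] - lr * g[1])
        path.append(p)
    return path


def momentum_descent(start, lr, beta, n_steps):
    """Heavy-ball momentum: the velocity accumulates a decaying sum of past gradients."""
    p = start
    v = (0.0, 0.0)
    path = [p]
    for _ in range(n_steps):
        g = grad(p)
        v = (beta * v[0] - lr * g[0], beta * v[1] - lr * g[1])
        p = (p[0] + v[0], p[1] + v[1])
        path.append(p)
    return path


start = (1.0, 1.0)
gd_path = gradient_descent(start, lr=0.03, n_steps=200)
mom_path = momentum_descent(start, lr=0.03, beta=0.9, n_steps=200)

gd_loss = loss(gd_path[-1])
mom_loss = loss(mom_path[-1])
```

With these settings, plain gradient descent is held back by the shallow x direction, while momentum ends at a lower loss after the same number of steps. Other step sizes or momentum coefficients can reverse that ranking, which is the kind of behavior the interactive notebooks are meant to let readers explore.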

Contributing

We are open to and grateful for contributions of any kind.

pymlviz's People

Contributors

wolfstam, mozzhorin, mrtnmndt
