noggin's Introduction

noggin

Noggin is a simple Python tool for ‘live’ logging and plotting measurements during experiments. Although Noggin can be used in a general context, it is designed around the train/test and batch/epoch paradigm for training a machine learning model.

Noggin’s primary features are its abilities to:

  • Log batch-level and epoch-level measurements by name
  • Seamlessly update a ‘live’ plot of your measurements, embedded within a Jupyter notebook
  • Organize your measurements into a data set of arrays with labeled axes, via xarray
  • Save and load your measurements & live-plot session: resume your experiment later without a hitch

You can read more about Noggin here

noggin's People

Contributors

davidmascharka, dependabot[bot], rsokl

Forkers

lgtm-migrator

noggin's Issues

Is noggin coming to conda?

I like to use conda rather than pip to keep all of my packages in one place. Will noggin be coming to conda via conda install at any point?

Make metrics saveable/loadable as x-arrays

Live metrics are already handled as ordered dictionaries of numpy arrays; this is nearly exactly the data format needed to form an xarray of the metrics.

This would permit users to seamlessly access their data as N-dimensional arrays with labeled axes.

x-axis values

Iteration number can be pretty unwieldy. It would be nice to have an option to label the x-axis by iteration number, epoch, etc.
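As a rough illustration of the relabeling requested here, iteration counts can be converted to fractional epochs when every epoch has the same number of batches. The function name `iterations_to_epochs` and its `batches_per_epoch` parameter are hypothetical and are not part of noggin.

```python
def iterations_to_epochs(iterations, batches_per_epoch):
    """Convert iteration counts to fractional epoch values.

    Assumes every epoch contains exactly `batches_per_epoch` batches.
    """
    if batches_per_epoch <= 0:
        raise ValueError("batches_per_epoch must be positive")
    return [it / batches_per_epoch for it in iterations]


# Iterations 50..200 with 100 batches per epoch
x_epochs = iterations_to_epochs([50, 100, 150, 200], batches_per_epoch=100)
print(x_epochs)  # [0.5, 1.0, 1.5, 2.0]
```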

Create gif of liveplot in action

The README needs a brief gif that shows liveplot in action. It should show at least two metrics (e.g. loss and accuracy) being plotted with both batch and epoch-level statistics.

Limit data rate for plotting

Currently liveplot will plot all available data regardless of how much data that is. This can lead to large computational costs, making plotting a bottleneck.

We should establish a heuristic for limiting the amount of data being plotted. Ideally this would involve estimating the computational cost of each "draw" during live plotting, and how this scales with the amount of data available.

We would also want to estimate the maximum visually-resolvable density of data. That is, if I am drawing 10,000 points on a typically-sized plot, does drawing every 10th point look just the same as drawing every point?

With these two pieces of analysis, we should be able to arrive at a sensible default for limiting the number of points that we draw in a given call. We could potentially plot sliding-window averages to coarsen the plot.
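The two coarsening strategies mentioned in this issue (drawing every k-th point, and plotting sliding-window averages) might look like the sketch below. The function names `decimate` and `sliding_mean` and the `max_points` budget are assumptions made for illustration. They are not existing liveplot functions, and the right default budget would still have to come from the analysis described above.

```python
def decimate(values, max_points):
    """Keep at most `max_points` evenly strided samples from `values`."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    # Ceiling division gives the smallest stride that respects the budget
    stride = max(1, -(-len(values) // max_points))
    return values[::stride]


def sliding_mean(values, window):
    """Return the mean of every length-`window` run of consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return []
    running = sum(values[:window])
    means = [running / window]
    for i in range(window, len(values)):
        # Slide the window one step: add the new value, drop the oldest
        running += values[i] - values[i - window]
        means.append(running / window)
    return means


points = list(range(10_000))
print(len(decimate(points, max_points=1_000)))  # 1000
print(sliding_mean([1, 2, 3, 4, 5], window=2))  # [1.5, 2.5, 3.5, 4.5]
```

Strided decimation is cheap but can hide short spikes. The sliding mean smooths noise but costs one pass over the data.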

Add support for alternate plotting backends

Abstract away the specific plotting backend (i.e. matplotlib) from LivePlot. Thus the current version of LivePlot would become MatplotlibLivePlot, and would retain the matplotlib-specific functionality. Otherwise LivePlot will serve as an abstract base class that handles all of the metric logging, saving, refresh logic, etc.

Ultimately, it would be nice to support bokeh and toyplot as backends.
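A minimal sketch of the proposed split, assuming the base class owns metric logging and each backend implements a single drawing hook. Only the names `LivePlot` and `MatplotlibLivePlot` come from this issue. The `log` and `_draw` methods and the toy `TextLivePlot` backend are hypothetical stand-ins, so the sketch runs without any plotting library installed.

```python
from abc import ABC, abstractmethod
from collections import OrderedDict


class LivePlot(ABC):
    """Backend-agnostic base: owns metric logging; subclasses own drawing."""

    def __init__(self):
        # metric name -> list of recorded values, in insertion order
        self._metrics = OrderedDict()

    def log(self, name, value):
        """Record one measurement under `name` (hypothetical method)."""
        self._metrics.setdefault(name, []).append(value)

    def plot(self):
        """Hand every logged metric to the backend-specific drawing hook."""
        for name, values in self._metrics.items():
            self._draw(name, values)

    @abstractmethod
    def _draw(self, name, values):
        """Render one metric with a specific plotting library."""


class TextLivePlot(LivePlot):
    """Toy backend that 'draws' text; a stand-in for MatplotlibLivePlot."""

    def __init__(self):
        super().__init__()
        self.rendered = []

    def _draw(self, name, values):
        self.rendered.append(f"{name}: {values}")


plotter = TextLivePlot()
plotter.log("loss", 0.9)
plotter.log("loss", 0.5)
plotter.plot()
print(plotter.rendered)  # ['loss: [0.9, 0.5]']
```

Under this layout, a bokeh or toyplot backend would only need to supply its own `_draw`.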

Plotting in server mode

Add the ability to serve logged data to a plotter. This would permit people to manage a live plot in a separate notebook, or in multiple notebooks.

This is an ambitious enhancement that has the potential for a large payoff. I would like to carefully consider the best means for serving/listening to data in a simple but robust way. I'd like to get input from others about how to move forward with this (@davidmascharka, @ptran516, @arjunmajum)

fix indentation

    # record training epoch
    if i%10 == 0 and i > 0:
        plotter.plot_train_epoch()

       # cue test-evaluation of model
       for x in np.linspace(0, 10, 5):
           x += (np.random.rand(1) - 0.5)*5
           test_metrics = {"accuracy": x**2}
           plotter.set_test_batch(test_metrics, batch_size=1)
       plotter.plot_test_epoch()
plotter.plot()  # ensures final data gets plotted
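One possible corrected layout is sketched below, assuming the test-evaluation block is meant to run inside the `if` branch, right after the training epoch is recorded. The `RecordingPlotter` stub, the enclosing `for i in range(21)` loop, and the use of the standard `random` module in place of NumPy are assumptions added so the sketch runs on its own. Only the plotter method names are taken from the snippet above.

```python
import random


class RecordingPlotter:
    """Hypothetical stub that records method calls instead of drawing."""

    def __init__(self):
        self.calls = []

    def plot_train_epoch(self):
        self.calls.append("plot_train_epoch")

    def set_test_batch(self, metrics, batch_size):
        self.calls.append("set_test_batch")

    def plot_test_epoch(self):
        self.calls.append("plot_test_epoch")

    def plot(self):
        self.calls.append("plot")


plotter = RecordingPlotter()

for i in range(21):  # assumed enclosing training loop
    # record training epoch
    if i % 10 == 0 and i > 0:
        plotter.plot_train_epoch()

        # cue test-evaluation of model
        for x in (0.0, 2.5, 5.0, 7.5, 10.0):  # stands in for np.linspace(0, 10, 5)
            x += (random.random() - 0.5) * 5
            test_metrics = {"accuracy": x**2}
            plotter.set_test_batch(test_metrics, batch_size=1)
        plotter.plot_test_epoch()
plotter.plot()  # ensures final data gets plotted
```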

recreate_plot should take a figsize argument

It would be lovely to be able to pass a figsize to recreate_plot so as not to end up with a minuscule plot. When I work out of interactive mode (e.g. when I work in emacs), I'd like to be able to simply construct the plot at the size I want via an interface like:

plotter, fig, ax = recreate_plot(train_metrics=train, test_metrics=test, figsize=(8, 12))

rather than:

plotter, fig, ax = recreate_plot(train_metrics=train, test_metrics=test)
fig.set_size_inches(8, 12)
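Until such an argument exists, the two-step workaround can be folded into a small helper. This is only a sketch: `recreate_plot_with_figsize` is a hypothetical name, and it assumes the wrapped function returns `(plotter, fig, ax)` as shown above. `FakeFigure` and `fake_recreate_plot` are stand-ins so the example runs without matplotlib. In real use you would pass the actual `recreate_plot`.

```python
def recreate_plot_with_figsize(recreate, figsize=None, **kwargs):
    """Call `recreate(**kwargs)` and resize the returned figure if requested."""
    plotter, fig, ax = recreate(**kwargs)
    if figsize is not None:
        # Same call as the manual workaround: fig.set_size_inches(w, h)
        fig.set_size_inches(*figsize)
    return plotter, fig, ax


class FakeFigure:
    """Stand-in for a matplotlib Figure (assumption, for demonstration only)."""

    def __init__(self):
        self.size = None

    def set_size_inches(self, width, height):
        self.size = (width, height)


def fake_recreate_plot(train_metrics=None, test_metrics=None):
    """Stand-in for recreate_plot; returns (plotter, fig, ax) placeholders."""
    return "plotter", FakeFigure(), "ax"


plotter, fig, ax = recreate_plot_with_figsize(
    fake_recreate_plot, figsize=(8, 12), train_metrics={}, test_metrics={}
)
print(fig.size)  # (8, 12)
```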
