
pythia's Introduction

Pythia

Machine Learning experiment management platform.

Manage your ML experiments to have all your results in one place. Pythia tries to give you an overview even if you have experiments running in different places.

To do so, your experiments consume the API of a Pythia instance. A Python library is currently in the making.

Please note that Pythia is not yet ready to use. The API lacks some routes, the frontend is nowhere near finished, and there are no charts or anything else beyond a plain listing of experiments and models.

However, I'm always keen to hear your feedback, so create an issue or drop me a comment on Twitter: @andinfinity_GER.

This work was inspired by FBLearner Flow and Pastalog. Here's a relevant post on Reddit with lots of alternatives.

Status

I'm currently working towards the first release. The charts for metrics over time (ANN convergence, for example) aren't done yet, the socket communication isn't quite stable (some features are still missing), and not all changes to the data are propagated to the frontend.

Installation

Fetch the latest release as a zip file or fetch the code via git clone. cd into the directory.

Dependencies

Fetch all the dependencies via npm install.

Running

With a running MongoDB instance just issue node . in the root directory of the project. Pythia is now available via browser or via API.

License

The MIT License (MIT)

Copyright (c) 2016 Christian Schulze

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

pythia's People

Contributors

christiansch

pythia's Issues

Metric viz

  • Singular data points: cards with a plain number, like kibana
  • Temporal data points: graph

UI improvements / tables for better overview

We should use tables in the experiment and model listings to provide a better overview.

Possible columns:

  • experiment
    • model (name, description, hyperparameter)
      • metric (name, value if singular data point or "max value, min value, avg value" if series)
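
The metric column described above could be filled along these lines. This is only a minimal sketch: the helper name `summarize_metric` and the exact cell text are assumptions made for illustration, not part of Pythia.

```python
from statistics import mean
from typing import Sequence, Union

Number = Union[int, float]


def summarize_metric(name: str, values: Sequence[Number]) -> str:
    """Return the table cell text for one metric (hypothetical helper)."""
    if not values:
        # Nothing has been measured yet for this metric.
        return f"{name}: n/a"
    if len(values) == 1:
        # Singular data point: show the plain value.
        return f"{name}: {values[0]}"
    # Series: show max, min and average instead of every point.
    return (
        f"{name}: max {max(values)}, min {min(values)}, "
        f"avg {mean(values):.3f}"
    )
```

A singular data point keeps the cell short, while a series is reduced to three numbers so that the table stays readable.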

hyperparameter optimization model

Maybe I can incorporate this somehow … not sure yet. Would be useful to have an overview of the chosen hyperparams after the search has finished or something.

[feature] best performing model should be highlighted

If we have a number of models which share some metrics, we should highlight the best and worst performing models. The maximum and the minimum value of a metric should each be marked with a color. The colors should not be red and green, however: errors ought to be minimized and scores maximized, and we have no way to tell which of the two a metric is (other than trying to be smart and parsing some scores and errors, which isn't really a good approach). Maybe blue and orange.

  • add a legend above the models (displayed only if we have more than one model) whose entries are hyperlinked to the respective values (max and min value via #max-val or #min-val)
  • detect max and min of a value
    • mark row with color and underline the value
    • create an anchor to allow jumping to the row from the legend
  • introduce help button that displays a short explanation of the difference between error and score.
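
The "detect max and min of a value" step might look roughly like the sketch below. It assumes each model is available as a mapping from metric name to a single value; the function name `find_extremes` and that data layout are hypothetical. It deliberately does not decide which of the two is "best", in line with the reasoning above.

```python
from typing import Dict, Optional, Tuple


def find_extremes(
    models: Dict[str, Dict[str, float]], metric: str
) -> Optional[Tuple[str, str]]:
    """Return (model with max value, model with min value) for a metric.

    Returns None when fewer than two models report the metric, since
    highlighting only makes sense when there is something to compare.
    """
    reported = {
        name: metrics[metric]
        for name, metrics in models.items()
        if metric in metrics
    }
    if len(reported) < 2:
        return None
    max_model = max(reported, key=reported.get)
    min_model = min(reported, key=reported.get)
    return max_model, min_model
```

The two returned names would then be the rows that receive the color, the underline and the #max-val / #min-val anchors.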

API v1 spec

  • CRUD /experiment: the experiment
    • POST /experiment: create new experiment
    • GET /experiment: get all experiments
    • GET /experiment/:id: get experiment data (all of it)
    • PUT /experiment/:id: change experiment data (only the meta data about the experiment)
    • DELETE /experiment/:id: remove experiment from database
  • CRUD /experiment/:id/model: the different models that ought to be compared for the experiment
    • POST /experiment/:id/model: create a new model
    • GET /experiment/:id/model: get all models
    • GET /experiment/:id/model/:id: get model data
    • PUT /experiment/:id/model/:id: change model data (only meta data about the model)
    • DELETE /experiment/:id/model/:id: remove model
  • CRUD /experiment/:id/model/:id/measurements
    • POST /experiment/:id/model/:id/measurements: create a new measurement point
    • GET /experiment/:id/model/:id/measurements: get all measurement points (sorted by name)
    • GET /experiment/:id/model/:id/measurements/forname/:name: get all metrics with a name (for example f1_micro or totalLoss)
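
For a client, the routes listed above could be assembled with a few small helpers. This is a sketch only: the function names are made up for illustration, and the host, port and any URL prefix of a Pythia instance are not given here, so they would have to be prepended by the caller.

```python
from typing import Optional
from urllib.parse import quote


def experiment_path(exp_id: Optional[str] = None) -> str:
    """Build /experiment or /experiment/:id."""
    base = "/experiment"
    return base if exp_id is None else f"{base}/{quote(exp_id, safe='')}"


def model_path(exp_id: str, model_id: Optional[str] = None) -> str:
    """Build /experiment/:id/model or /experiment/:id/model/:id."""
    base = f"{experiment_path(exp_id)}/model"
    return base if model_id is None else f"{base}/{quote(model_id, safe='')}"


def measurements_path(
    exp_id: str, model_id: str, name: Optional[str] = None
) -> str:
    """Build the measurements route, optionally filtered by metric name."""
    base = f"{model_path(exp_id, model_id)}/measurements"
    return base if name is None else f"{base}/forname/{quote(name, safe='')}"
```

For example, `measurements_path("e1", "m1", "f1_micro")` yields the route for all `f1_micro` measurements of model `m1` in experiment `e1`.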

tags? tags!

I guess tags would be cool if one runs a lot of projects/experiments.

Chartjs and vue.js do not like each other

Because we have to generate a canvas element for each metric with more than one data point, the elements are created dynamically. Chartjs can't really find these elements, hence the chart is only displayed once another point of measurement is added later on.

[feature] tensorboard/jupyter notebook integration

  • it was proposed to me to include tensorboard somehow
    • either try to read the files created by SummaryWriter (not sure if and how this works)
    • or as plain uri
  • jupyter notebook integration would be splendid
    • either as html directly via nbconvert
    • or as plain uri

Thus, I hereby propose introducing a hypertext resource attribute for models.

models are not updated

New models are added to the table, but measurements and other updates are not propagated properly. After reloading, the information is displayed correctly.

Pythia crashes on add_measurement with invalid input

/Users/x/Documents/Development/Pythia/node_modules/mongoose/lib/types/embedded.js:178
    throw new Error(msg);
    ^

Error: Unable to invalidate a subdocument that has not been added to an array.
    at EmbeddedDocument.invalidate (/Users/x/Documents/Development/Pythia/node_modules/mongoose/lib/types/embedded.js:178:11)
    at EmbeddedDocument.Document.set (/Users/x/Documents/Development/Pythia/node_modules/mongoose/lib/document.js:687:10)
    at EmbeddedDocument.Document.set (/Users/x/Documents/Development/Pythia/node_modules/mongoose/lib/document.js:540:18)
    at EmbeddedDocument.Document (/Users/x/Documents/Development/Pythia/node_modules/mongoose/lib/document.js:66:10)
    at new EmbeddedDocument (/Users/x/Documents/Development/Pythia/node_modules/mongoose/lib/types/embedded.js:30:12)
    at EmbeddedDocument (/Users/x/Documents/Development/Pythia/node_modules/mongoose/lib/schema/documentarray.js:26:17)
    at Array.create (/Users/x/Documents/Development/Pythia/node_modules/mongoose/lib/types/documentarray.js:204:12)
    at Query.<anonymous> (/Users/x/Documents/Development/Pythia/api/v1/experiment/index.js:475:54)
    at /Users/x/Documents/Development/Pythia/node_modules/mongoose/node_modules/kareem/index.js:177:19
    at /Users/x/Documents/Development/Pythia/node_modules/mongoose/node_modules/kareem/index.js:109:16
    at _combinedTickCallback (internal/process/next_tick.js:73:7)
    at process._tickCallback (internal/process/next_tick.js:104:9)
2017-03-24T08:52:03.220+0100 I -        [conn1] end connection 127.0.0.1:56966 (1 connection now open)

caused by:

m1.add_measurement('avg_smape', m)

with m:

{ name: 'avg_smape',
  value:
   [ 'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)',
     'Figure(720x432)' ],
  epoch: '0',
  step: '0' }
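
Until the server rejects such input gracefully, a client-side guard could catch it before the request is sent. The sketch below assumes that a measurement value is meant to be a finite number or a list of finite numbers; the actual schema is not stated here, and `is_valid_measurement_value` is a hypothetical helper, not part of the Python library.

```python
import math
from numbers import Real
from typing import Any


def is_valid_measurement_value(value: Any) -> bool:
    """Accept a finite real number or a non-empty list/tuple of them."""

    def is_number(x: Any) -> bool:
        # bool is a subclass of int, so it is excluded explicitly.
        return (
            isinstance(x, Real)
            and not isinstance(x, bool)
            and math.isfinite(x)
        )

    if isinstance(value, (list, tuple)):
        return len(value) > 0 and all(is_number(v) for v in value)
    return is_number(value)
```

With this check, the list of 'Figure(720x432)' strings shown above would be rejected on the client instead of reaching the server.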

change database

MongoDB disappoints throughout when it comes to security. We should look for alternatives. I'm open to suggestions.

Missing tests

  • PUT /experiment/:id
  • PUT /experiment/:id/model/:id
