
evalpy's Introduction

evalpy

A lightweight framework for experiment logging and automatic visualization

Evalpy is aimed at researchers who want a fast and efficient framework to log their experiment configurations alongside the results. It provides an interface both to log the parameters and metrics of any single run and to record the progression during a run in a time-series manner.

The second part of evalpy is the GUI, which ships with the package. To start the GUI, activate the environment in which you installed evalpy in a console and execute the following:

evalpy run

Quickstart

The intended usage involves the following steps:

  • Declaring the project root, a file path
  • Declaring the project name, the name of the project directory
  • Starting a run with an experiment name
  • Within the run, logging parameters and metrics once, and using step logging to record the run's progression

A minimal usage outline is as follows:

import evalpy


evalpy.set_project('my_first_project_path', 'my_project_folder_name')
with evalpy.start_run('experiment_name'):
    for log_step_stuff in model_training():  # model_training is a placeholder for your training loop
        evalpy.log_run_step(log_step_stuff, step_forward=True)
    evalpy.log_run_entries(model_parameters_and_metrics)  # both methods expect a dict as input

evalpy's People

Contributors: davidrother

Forkers: jakobwe

evalpy's Issues

Allow native storing of numpy scalars

Right now, datatypes such as np.float64 are stored in the database as BLOBs, since the type check for regular Python floats fails for them.

Example code:

import evalpy
import numpy as np

entries = {
    "normal_float": 1.0,  # this will be stored in the database as a float
    "numpy_float": np.float64(1.0),  # this will be stored in the database as a BLOB
}

evalpy.set_project("random_test_path", 'my_project_folder_name')
with evalpy.start_run('experiment_name'):
    evalpy.log_run_entries(entries)

This is a problem since in the vast majority of cases numpy floats are used interchangeably with normal python floats, to the point where an end user might not realize that what they are working with is not a native type. These values are then stored as BLOBs, which makes the UI crash when trying to plot them.

It would be nice if these types were converted automatically so they can also be used for plotting in the UI.
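A possible fix, sketched here outside of evalpy's actual code (to_native is a hypothetical helper, not part of the library), is to unwrap numpy scalars with .item() before the type check:

```python
import numpy as np


def to_native(value):
    """Unwrap numpy scalar types to native Python scalars; pass other values through."""
    if isinstance(value, np.generic):
        return value.item()  # e.g. np.float64 -> float, np.int64 -> int
    return value


entries = {
    "normal_float": 1.0,
    "numpy_float": np.float64(1.0),
}
# Applied just before writing to the database, both entries end up as native floats.
converted = {key: to_native(value) for key, value in entries.items()}
```

Since np.generic is the common base of all numpy scalar types, this one check covers floats, ints, and bools alike.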

Allow intermediate committing of logs either manually or automatically

As it stands, evalpy keeps all data to be logged in RAM until the current run is finished.
This can lead to out-of-memory errors on long runs that log a lot of data.
As a user, I would prefer data to be committed regularly to the on-disk database, so it does not accumulate in RAM. This would also preserve logged data in case of system crashes.

Since committing to permanent storage may cause a delay, committing could be implemented in a separate Python thread (slow disk IO should not hold the GIL), which could enable continuous committing with little to no performance impact.
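A minimal sketch of such a background committer, using only the standard library and a made-up steps table (AsyncCommitter and its schema are assumptions, not evalpy's actual internals):

```python
import queue
import sqlite3
import threading


class AsyncCommitter:
    """Hypothetical background committer: log() only enqueues, a worker thread writes to disk."""

    _SENTINEL = object()

    def __init__(self, db_path):
        self.rows_written = 0
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._worker, args=(db_path,), daemon=True)
        self._thread.start()

    def log(self, step, value):
        self._queue.put((step, value))  # returns immediately; disk IO happens in the worker

    def close(self):
        self._queue.put(self._SENTINEL)
        self._thread.join()  # wait until everything queued so far is on disk

    def _worker(self, db_path):
        # The connection lives entirely in the worker thread, which sidesteps
        # sqlite3's default check_same_thread restriction.
        connection = sqlite3.connect(db_path)
        connection.execute("CREATE TABLE IF NOT EXISTS steps (step INTEGER, value REAL)")
        while True:
            item = self._queue.get()
            if item is self._SENTINEL:
                break
            connection.execute("INSERT INTO steps VALUES (?, ?)", item)
            connection.commit()  # per-item commit; batching would cut overhead further
            self.rows_written += 1
        connection.close()


committer = AsyncCommitter(":memory:")
for step in range(3):
    committer.log(step, step * 0.5)
committer.close()
```

Committing per item keeps at most one pending row in RAM; a real implementation would likely batch commits on a size or time threshold to amortize fsync costs.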

UI crashes when selecting blobs/strings for plotting

This happens on commit be03404 on the dev branch. Return of #6?
Typechecking for plots seems to be incomplete.

Running the following code and using the freshly generated database:

import evalpy
import numpy as np
from pathlib import Path

entries = {
    "normal_value": 1.0,  # this is plottable
    "nonplottable": Path("not", "plottable"),
    "me_neither": [3, 2, 1]
}

evalpy.set_project("random_test_path", 'my_project_folder_name')
with evalpy.start_run('experiment_name'):
    evalpy.log_run_entries(entries)

Selecting either nonplottable or me_neither on a plotting axis then results in a crash:

evalpy run --path /home/jakob/git/jakob-weimar-thesis-algorithms/TRPO/random_test_path/my_project_folder_name/
QFont::setPointSize: Point size <= 0 (-1), must be greater than 0
Traceback (most recent call last):
  File "/home/jakob/git/evalpy/evalpy/visualization/window.py", line 197, in compute_plot
    self.compute_experiment_plot()
  File "/home/jakob/git/evalpy/evalpy/visualization/window.py", line 216, in compute_experiment_plot
    self._prepare_plot_experiment(x_values, y_values, x_axis, y_axis)
  File "/home/jakob/git/evalpy/evalpy/visualization/window.py", line 325, in _prepare_plot_experiment
    scatter.setData(transformed_x, transformed_y)
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/pyqtgraph/graphicsItems/ScatterPlotItem.py", line 308, in setData
    self.addPoints(*args, **kargs)
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/pyqtgraph/graphicsItems/ScatterPlotItem.py", line 388, in addPoints
    newData['x'] = kargs['x']
ValueError: could not convert string to float: b'\x80\x03]q\x00(K\x03K\x02K\x01e.'
Aborted (core dumped)

A similar error occurs when using log_run_step instead of log_run_entries; the two are likely related.
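One way to complete the type check, sketched here with a hypothetical helper rather than evalpy's real code, is to verify that every value in a column is a real number before offering it for plotting:

```python
import numbers


def is_plottable(values):
    """True only if every entry is a real number; str, bytes, and lists are rejected."""
    return all(isinstance(v, numbers.Real) and not isinstance(v, bool) for v in values)


print(is_plottable([1.0, 2, 3.5]))        # numeric column: plottable
print(is_plottable([b"\x80\x03]q\x00"]))  # pickled BLOB, as in the traceback: not plottable
```

Checking against numbers.Real rather than float also keeps numpy scalars plottable once they are stored natively, since they register as Real.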

Allow project path as command-line argument

As a user I would like to have a command-line argument to specify from which directory evalpy run should load its database.

example:
evalpy run --directory ~/git/project/evalpy_results/
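The requested flag could be wired up with argparse; this is a hypothetical sketch (evalpy's real CLI entry point and flag names may differ):

```python
import argparse


def parse_args(argv):
    # Hypothetical CLI sketch of the requested "evalpy run --directory ..." form.
    parser = argparse.ArgumentParser(prog="evalpy")
    subparsers = parser.add_subparsers(dest="command")
    run = subparsers.add_parser("run", help="start the GUI")
    run.add_argument("--directory", default=".",
                     help="directory containing the evalpy database")
    return parser.parse_args(argv)


args = parse_args(["run", "--directory", "~/git/project/evalpy_results/"])
```

Defaulting --directory to the current working directory would preserve today's behavior when the flag is omitted.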

Saving metrics in unsupported formats crashes evalpy after run completion

Similar to #2.
Again, one possible solution is to fail early, so that the offending line of code appears in the stack trace and runs don't need to complete before errors surface.

Code to reproduce:

import evalpy
import torch
import time

important_metric = torch.tensor([0.0])

evalpy.set_project("random_test_path", 'my_project_folder_name')
with evalpy.start_run('experiment_name1'):
    print("doing some stuff")
    evalpy.log_run_step({
        "average_reward_training": important_metric
    }, step_forward=True)

    time.sleep(1)
    print("doing some other stuff")
    time.sleep(1)

Console output:

/home/jakob/miniconda3/envs/jw-masterthesis/bin/python /home/jakob/git/jakob-weimar-thesis-algorithms/TRPO/source/examples/test.py
doing some stuff
doing some other stuff
Traceback (most recent call last):
  File "/home/jakob/git/jakob-weimar-thesis-algorithms/TRPO/source/examples/test.py", line 16, in <module>
    time.sleep(1)
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/project/client.py", line 29, in __exit__
    self.end_run()
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/project/client.py", line 78, in end_run
    sql_utilities.add_row_to_run_table(self.db_connection, self.active_run_id, entry)
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/project/sql_utilities.py", line 63, in add_row_to_run_table
    _add_data_row_to_table(cursor, params, run_id)
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/project/sql_utilities.py", line 123, in _add_data_row_to_table
    cursor.execute(sql, param_values)
sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.

Process finished with exit code 1
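A fail-early check along these lines might look like the following sketch; validate_entries and SUPPORTED_TYPES are hypothetical names, and the listed types are the ones sqlite3 binds natively (int, float, str, bytes, None):

```python
# Hypothetical fail-early validation, called at log time rather than at end_run.
SUPPORTED_TYPES = (int, float, str, bytes, type(None))


def validate_entries(entries):
    """Raise at the call site instead of at run teardown, so the traceback names the offender."""
    for key, value in entries.items():
        if not isinstance(value, SUPPORTED_TYPES):
            raise TypeError(
                f"entry '{key}' has unsupported type {type(value).__name__}; "
                "convert it (e.g. tensor.item()) before logging"
            )


validate_entries({"normal_metric": 1.0})  # passes silently
```

With this check in log_run_step and log_run_entries, the torch tensor above would raise immediately at the log_run_step call instead of after the run completed.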

Having lists as entries crashes the program after finishing a run

Saving an entry that is a list will crash evalpy when it tries to finish a run and save everything into the SQL database. This can be frustrating, as a run can complete successfully only to crash at the last second. In my opinion, evalpy should crash early whenever it cannot store an entry.
On top of that, it would be nice for my use case if lists could be stored in some form (filtering over those lists isn't needed in my case).

Code to reproduce:

import evalpy

entries = {
    "other_entry": "some_entry",
    "list_entry": ["this","is","a","test"]
}

evalpy.set_project("random_test_path", 'my_project_folder_name')
with evalpy.start_run('experiment_name'):
    evalpy.log_run_entries(entries)

Console output:

param_values ['tfd60aff6_f025_4843_bf43_2058cc3fc6c2', 'experiment_name', 'dsafdsf', ['this', 'is', 'a', 'test']]
Traceback (most recent call last):
  File "/home/jakob/git/jakob-weimar-thesis-algorithms/TRPO/source/examples/test.py", line 10, in <module>
    evalpy.log_run_entries(entries)
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/project/client.py", line 29, in __exit__
    self.end_run()
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/project/client.py", line 76, in end_run
    sql_utilities.add_row_to_main_table(self.db_connection, final_entry_dict)
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/project/sql_utilities.py", line 57, in add_row_to_main_table
    _add_data_row_to_table(cursor, params, 'params')
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/project/sql_utilities.py", line 125, in _add_data_row_to_table
    cursor.execute(sql, param_values)
sqlite3.InterfaceError: Error binding parameter 3 - probably unsupported type.
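One possible way to store lists "in some form" is JSON encoding; this sketch (encode_entry is a hypothetical helper, not part of evalpy) serializes lists and dicts to a string before they reach sqlite:

```python
import json


def encode_entry(value):
    """Hypothetical helper: serialize lists and dicts to a JSON string sqlite stores as TEXT."""
    if isinstance(value, (list, dict)):
        return json.dumps(value)
    return value


encoded = encode_entry(["this", "is", "a", "test"])
# json.loads(encoded) round-trips back to the original list
```

This matches the stated use case: the list is preserved and retrievable, at the cost of not being filterable with SQL comparisons.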

Crash when deselecting experiment and then plotting

The program crashes when I try to plot without having selected an experiment.
Steps to reproduce:

  1. Plot anything
  2. Deselect all experiments
  3. Click "Compute Plot"

Video: https://www.youtube.com/watch?v=YEzSGagLHMM
Database file: https://mega.nz/#!pGYR3KqD!xxrlohnpg6BPJ1Wr37ww_gr1wLphC6Y6cOLWUFp8pM0

bash output:

evalpy run
QFont::setPointSize: Point size <= 0 (-1), must be greater than 0
Empty filename passed to function
Traceback (most recent call last):
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/visualization/window.py", line 192, in compute_plot
    self.compute_experiment_plot()
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/visualization/window.py", line 205, in compute_experiment_plot
    values = self.backend.get_filtered_column_values_experiments(experiment_names, filters, [x_axis, y_axis])
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/visualization/backend.py", line 29, in get_filtered_column_values_experiments
    return self.client.filtered_column_values_experiment(experiment_names, sql_filters, columns)
  File "/home/jakob/miniconda3/envs/jw-masterthesis/lib/python3.6/site-packages/evalpy/project/client.py", line 164, in filtered_column_values_experiment
    sql_utilities.SQLOperator.EQUALS, experiment_names[0], False,
IndexError: list index out of range
Aborted (core dumped)
