michaelgao8 / classification-metrics

:bar_chart: Utility for automated model metric figures, interactive Jupyter Notebook slides, and pdf reports
Ideally, there would be bootstrapping to generate confidence bands around the ROC curve. I initially took inspiration from here.
This requires looking into how to bootstrap at certain thresholds, which may be nontrivial. Because the original is written with R's data.table package, it can be difficult to infer the process they used.
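As a starting point, a percentile bootstrap over resampled (label, score) pairs, evaluated on a fixed FPR grid, might look like the sketch below. The function names, grid size, and interpolation approach are my own assumptions, not taken from the original R code.

```python
import numpy as np

def roc_tpr_at(y_true, scores, fpr_grid):
    """TPR evaluated at a fixed grid of FPR values, via interpolation."""
    order = np.argsort(-scores)           # highest risk first
    y = y_true[order]
    tps = np.cumsum(y)                    # true positives as threshold drops
    fps = np.cumsum(1 - y)                # false positives as threshold drops
    tpr = tps / max(tps[-1], 1)
    fpr = fps / max(fps[-1], 1)
    return np.interp(fpr_grid, fpr, tpr)

def bootstrap_roc_band(y_true, scores, n_boot=200, alpha=0.05, seed=0):
    """Pointwise percentile-bootstrap band for the ROC curve."""
    rng = np.random.default_rng(seed)
    fpr_grid = np.linspace(0, 1, 101)
    curves = np.empty((n_boot, fpr_grid.size))
    n = len(y_true)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)       # resample (label, score) pairs
        curves[b] = roc_tpr_at(y_true[idx], scores[idx], fpr_grid)
    lower = np.quantile(curves, alpha / 2, axis=0)
    upper = np.quantile(curves, 1 - alpha / 2, axis=0)
    return fpr_grid, lower, upper
```

Bootstrapping pairs sidesteps the question of resampling at fixed thresholds, at the cost of a pointwise (not simultaneous) band.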
This requires changing the plot_dict object as well as modifying the plotting code in the template.
Right now, if a user wants to write the output of the papermill call to a directory, I'm not sure what happens if the directory does not exist. The same concern applies to the -o flag in the generate_plot_data.py module.
The PPV by decile plot can be difficult to interpret because the x-axis label is unintuitive. The deciles of risk should start near 100 and go down to 0, whereas right now they are percentiles of risk (descending).
This should be reversed, or at least made clear.
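A hypothetical helper illustrating the suggested labeling, with decile labels running from 100 down to 10; the function name and return conventions are assumptions, not code from the tool.

```python
import numpy as np

def ppv_by_decile(y_true, scores):
    """PPV within each decile of predicted risk, highest-risk decile first.

    Returns decile labels (upper percentile of each bin, 100 down to 10)
    and the observed PPV (fraction of positives) in that bin.
    """
    order = np.argsort(-np.asarray(scores))   # highest risk first
    y = np.asarray(y_true)[order]
    bins = np.array_split(y, 10)              # ten roughly equal-sized bins
    labels = np.arange(100, 0, -10)           # 100, 90, ..., 10
    ppv = np.array([b.mean() if b.size else np.nan for b in bins])
    return labels, ppv
```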
Currently, the resulting .ipynb file has no title slide, which is annoying. However, it can be difficult to inject a markdown cell without opening the notebook.
Under the hood, all Jupyter notebooks are represented as a JSON structure, meaning that it is feasible to read in the JSON and insert a cell of the form
{
  "cells": [
    {
      "cell_type": "markdown"
    }
  ]
}
or something like that. This should be done after the papermill call, but before any pdf conversions.
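For instance, a minimal sketch using only the standard library; the slideshow metadata shown is the standard RISE cell metadata, but the function name and title text are placeholders.

```python
import json

def add_title_slide(nb_path, title):
    """Insert a markdown title cell at the top of a notebook, in place.

    The cell is marked as a RISE slide via its slideshow metadata.
    """
    with open(nb_path) as f:
        nb = json.load(f)
    cell = {
        "cell_type": "markdown",
        "metadata": {"slideshow": {"slide_type": "slide"}},
        "source": [f"# {title}"],
    }
    nb["cells"].insert(0, cell)
    with open(nb_path, "w") as f:
        json.dump(nb, f, indent=1)
```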
Similar to this tool: michaelgao.shinyapps.io/threshold_tuning, it would be nice to incorporate ipywidgets into the RISE-driven Jupyter notebooks. A proof of concept would be a slider that changes the output of a confusion matrix as well as other metrics (sensitivity, specificity, PPV, NPV, etc.).
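As a starting point, the metric computation could live in a plain function that a slider then drives. The function below is a hypothetical sketch; the commented-out lines show how ipywidgets.interact might wire into it inside a notebook.

```python
import numpy as np

def threshold_metrics(y_true, scores, threshold=0.5):
    """Confusion-matrix metrics at a given probability threshold."""
    y_true = np.asarray(y_true)
    pred = (np.asarray(scores) >= threshold).astype(int)
    tp = int(np.sum((pred == 1) & (y_true == 1)))
    fp = int(np.sum((pred == 1) & (y_true == 0)))
    tn = int(np.sum((pred == 0) & (y_true == 0)))
    fn = int(np.sum((pred == 0) & (y_true == 1)))

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# In a notebook cell, this could be driven by a slider, e.g.:
# from ipywidgets import interact, FloatSlider
# interact(lambda t: threshold_metrics(y, p, t),
#          t=FloatSlider(min=0, max=1, step=0.01, value=0.5))
```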
Currently, the cutpoints variable in generate_plot_data.py returns the cutpoints used to calculate the precision. However, the x-axis of the associated plot does not use it. For the sake of consistency, it would be useful to either plot against cutpoints directly, or to remove the np.linspace call and replace the plot call in Metrics Template.ipynb.
Currently, the process to use this tool involves:
1. Running generate_plot_data.py
2. Using papermill to inject parameters and execute the new Jupyter notebook
However, some of these steps can be consolidated and called directly from python.
For example, generate_plot_data.py currently takes in command-line parameters that can be inspected by running generate_plot_data.py -h. Since it writes intermediate data to a file, and this file is then specified again in the papermill call, papermill can potentially be called directly from python instead.
Flags for things like saving figures to a directory, converting to pdf, etc. can all be handled within a python script by adding the appropriate argparse arguments.
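A rough sketch of that consolidation, combining argparse with papermill's execute_notebook API; the flag names, file paths, and parameter names here are all hypothetical placeholders.

```python
import argparse

def build_parser():
    """CLI for a consolidated generate-and-execute entry point."""
    p = argparse.ArgumentParser(description="Generate a model metrics report")
    p.add_argument("predictions", help="file of labels and predicted risks")
    p.add_argument("-o", "--output-dir", default="output")
    p.add_argument("--to-pdf", action="store_true",
                   help="also convert the executed notebook to pdf")
    return p

def main(argv=None):
    args = build_parser().parse_args(argv)
    import papermill as pm  # lazy import: only needed when actually running
    out_nb = f"{args.output_dir}/metrics_report.ipynb"
    pm.execute_notebook(
        "Metrics Template.ipynb",                       # input template
        out_nb,                                         # executed copy
        parameters={"predictions_path": args.predictions},
    )
    # --to-pdf handling (nbconvert) would go here once pandoc/tex work
    return out_nb
```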
Currently, the conversion to a pdf file would normally work as follows:
jupyter nbconvert --to pdf <original_notebook.ipynb>
However, this functionality does not currently work because of a missing pandoc and tex installation.
To work on this, it may be useful to shell into the container, try to install the packages as referenced here (note that the docker container uses apt as its package manager), and then test the functionality.
Ultimately, this should then be incorporated in the docker container itself. To try this on a fresh image, rebuild the container and then try to convert a file to pdf.
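As a sketch, the packages could be baked into the image itself; the apt package names below follow nbconvert's documented Debian installation, but should be verified inside the container first.

```dockerfile
FROM python:3.7

# TeX toolchain for `jupyter nbconvert --to pdf`; package names follow
# nbconvert's recommended Debian/apt installation
RUN apt-get update && apt-get install -y --no-install-recommends \
        pandoc \
        texlive-xetex \
        texlive-fonts-recommended \
    && rm -rf /var/lib/apt/lists/*
```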
Before each plot, as a sub-slide, it will be useful to describe what the metric is and how it is used. The primary audience will be people who are interested in the model metrics, but may not know exactly how to interpret the results.
After discussing with Suresh, it may be useful to think about what an "ideal" (perfect) classifier would look like, and what the worst classifier is. This helps orient people spatially.
For example, an ROC curve slide might start with an explanation of what the ROC curve is, then show that the perfect classifier is the triangle formed by the left and top edges of the plot together with the diagonal, while the worst classifier lies along that diagonal.
Right now, the Dockerfile pulls from python:3.7, which is quite large. There is probably a way to pull from a smaller base image (e.g. python:3.7-slim) while still ensuring that we can install pandoc, etc.
Not critical, but would be a nice improvement
Using sklearn's built-in datasets and models, it may be useful to build a Jupyter notebook that trains a model on some data and then uses those model results to test the tool. This can also act as an example for newcomers to the tool.
Will involve changing the README to include the example and how to use it.
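A minimal sketch of such an example; the dataset, model, and output CSV layout are all assumptions here, since the input format that generate_plot_data.py expects isn't shown in these issues.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Train a toy model whose held-out labels and predicted risks can feed
# the tool; the dataset and model choice are arbitrary illustrations.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
risks = model.predict_proba(X_te)[:, 1]

# Persist labels and risks as a csv (column names assumed)
np.savetxt("example_predictions.csv",
           np.column_stack([y_te, risks]),
           delimiter=",", header="label,risk", comments="")
```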
Similar to #4, but for the calibration curve. Can be merged in a single PR if desired.