pycm's Issues

Make pycm available to download with conda

I want to be able to add pycm as a dependency with conda.

Currently it seems to be available with pip, but I would like to install pycm with the following command:

conda install pycm

Thanks !

Parameter recommender

Recommend the most relevant parameters, depending on whether the dataset is imbalanced and whether the classification is binary.

Initial Update

The bot created this issue to inform you that pyup.io has been set up on this repo.
Once you have closed it, the bot will open pull requests for updates as soon as they are available.

Add method to rename classes

It would be useful if there was a method to rename the classes.

In many workflows it is common to encode class names as integers. However, when working with final results it is most useful to convert these integers back into the string class names.

It would be useful to do this after the construction of the ConfusionMatrix, because it is more efficient to remap the labels on the matrix rows and columns than it is to rename each item in the actual and predicted vectors (which can be quite large).

Something like ConfusionMatrix.rename(mapping) or ConfusionMatrix.relabel(mapping) might be a good method signature where mapping is a dictionary of old names to new names. It would also be useful if this method "did the right thing" when multiple old names mapped to a single new name (i.e. sum the counts in the corresponding matrix cells).
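
As a rough illustration of the merging behaviour described above, here is a minimal sketch that works on a plain nested dictionary of counts (`{actual: {predicted: count}}`). The function name `relabel_matrix` and the nested-dict layout are assumptions made for this example; they are not part of pycm's API.

```python
from collections import defaultdict


def relabel_matrix(matrix, mapping):
    """Return a new confusion matrix with classes renamed (hypothetical helper).

    matrix:  {actual: {predicted: count}}
    mapping: {old_name: new_name}
    Classes missing from `mapping` keep their name. Old names that map to
    the same new name have their counts summed.
    """
    merged = defaultdict(lambda: defaultdict(int))
    for actual, row in matrix.items():
        new_actual = mapping.get(actual, actual)
        for predicted, count in row.items():
            new_predicted = mapping.get(predicted, predicted)
            merged[new_actual][new_predicted] += count
    # Convert back to plain dicts so the result compares and prints cleanly.
    return {actual: dict(row) for actual, row in merged.items()}


counts = {
    0: {0: 5, 1: 1, 2: 0},
    1: {0: 2, 1: 7, 2: 1},
    2: {0: 0, 1: 3, 2: 4},
}
# Classes 1 and 2 both become "dog", so their rows and columns are merged.
renamed = relabel_matrix(counts, {0: "cat", 1: "dog", 2: "dog"})
```

Because only the matrix cells are touched, the cost depends on the number of classes rather than on the length of the actual and predicted vectors.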

Add AUC/AUNU/AUNP

Ballabio, D., Grisoni, F. and Todeschini, R. (2018). Multivariate comparison of classification performance measures. Chemometrics and Intelligent Laboratory Systems, 174, pp.33-44.

Add Overall MCC

Gorodkin J (2004) Comparing two K-category assignments by a K-category correlation coefficient. Computational Biology and Chemistry 28: 367–374.
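
For reference, the multiclass correlation coefficient from this paper can be computed directly from the confusion matrix. The sketch below assumes a square list-of-lists matrix with rows as actual classes and columns as predicted classes. The function name `overall_mcc` is hypothetical, and returning 0.0 when the denominator vanishes is a convention chosen for this example.

```python
import math


def overall_mcc(matrix):
    """Multiclass MCC (Gorodkin's R_K) for a square matrix, rows = actual."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(k))
    actual = [sum(row) for row in matrix]  # row sums: true count per class
    predicted = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # column sums
    numerator = correct * total - sum(a * p for a, p in zip(actual, predicted))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in predicted))
        * (total ** 2 - sum(a * a for a in actual))
    )
    # Assumption: report 0.0 when the coefficient is undefined.
    return numerator / denominator if denominator else 0.0


score = overall_mcc([[5, 1, 0], [2, 7, 1], [0, 3, 4]])
```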

statement of need in README *and* paper.md

One last issue @sepandhaghighi

You define confusion matrices in your README.md and paper.md, but you don't state who the main users of the library will be, i.e., your 'target audience' as the JOSS guidelines state.

Please add some language specifying which users you think will find pycm most useful, e.g. pycm is the swiss-army knife of confusion matrices, targeted mainly at data scientists that need a broad array of metrics for predictive models

Make sure to add this to both the README.md and the paper.md.
Thanks! That's the last "major-minor" revision :)

Add dlnd/slnd

Ballabio, D., Grisoni, F. and Todeschini, R. (2018). Multivariate comparison of classification performance measures. Chemometrics and Intelligent Laboratory Systems, 174, pp.33-44.

Names of matrix and normalized_matrix methods are confusing.

I would have expected matrix and normalized matrix to return the confusion matrix as a numpy array or a pandas DataFrame. However instead it prints a string representation.

It might be a clearer design if these functions were renamed to print_matrix / print_normalized_matrix, and matrix / normalized_matrix were added to return a value that a user can work with.
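
A minimal sketch of the "return a value" half of this proposal, assuming the counts are held as a nested dictionary (`{actual: {predicted: count}}`). The helper names `matrix_as_rows` and `normalized_rows` are made up for illustration and are not pycm functions; the returned list of rows could be passed straight to `numpy.array` or a pandas `DataFrame` by the caller.

```python
def matrix_as_rows(matrix, classes=None):
    """Return the counts as a list of rows, ordered by `classes`."""
    if classes is None:
        classes = sorted(matrix)
    return [[matrix[a].get(p, 0) for p in classes] for a in classes]


def normalized_rows(matrix, classes=None):
    """Return each row divided by its sum (all-zero rows stay at 0.0)."""
    rows = matrix_as_rows(matrix, classes)
    return [
        [count / sum(row) if sum(row) else 0.0 for count in row]
        for row in rows
    ]


counts = {"a": {"a": 3, "b": 1}, "b": {"a": 0, "b": 4}}
table = matrix_as_rows(counts)
normalized = normalized_rows(counts)
```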

Support numpy arrays

NumPy arrays are used almost everywhere, so users should not have to convert them. They are also faster and more memory-efficient than lists, which matters for large inputs. At the moment the library supports only lists:

if not isinstance(actual_vector,list) or not isinstance(predict_vector,list):
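
One low-effort way to relax that check is to coerce array-like inputs at the boundary. The sketch below is an assumption about how this could look, not the library's code: `to_label_list` is a hypothetical helper, and the standard-library `array.array` stands in for `numpy.ndarray` here because both expose a `tolist()` method.

```python
from array import array


def to_label_list(vector):
    """Coerce a list, tuple, or object with .tolist() to a plain list."""
    if isinstance(vector, list):
        return vector
    if isinstance(vector, tuple):
        return list(vector)
    tolist = getattr(vector, "tolist", None)  # numpy.ndarray also has tolist()
    if callable(tolist):
        return tolist()
    raise TypeError("expected a list, tuple, or array-like with tolist()")


actual_vector = to_label_list(array("i", [2, 0, 2, 1]))
predict_vector = to_label_list((0, 0, 2, 1))
```

This only removes the need for the caller to convert; the data is still copied into a list, so it does not address the memory concern for very large inputs.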

example usages

Hi again @sepandhaghighi, one more minor issue:

the examples in the README are great for demonstrating functionality, but I think examples demonstrating actual use cases would be helpful for people.

Again, there are many metrics included in the package; I would say pick three-five metrics that you think will be the ones most frequently used by all users, and then provide some more real-world examples of using pycm with those metrics.
E.g., using it in conjunction with scikit-learn to compare two or three different classifiers.

This would be a place that a simple visualization would really be helpful and make the concepts more intuitive.
I have often adapted this code from the scikit-learn docs to my own needs when I have to visualize a confusion matrix:
http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html#sphx-glr-auto-examples-model-selection-plot-confusion-matrix-py
If you do not want to add scikit-learn as a dependency, you might just add a data sub-package with outputs from some models you have trained. The same goes for matplotlib or whatever you use to visualize: for now you could just include the plots as static .png files in your library.

Maybe there was a particular problem you were facing that inspired you to develop the package and you have true data + predictions from those models that you could include as example datasets?

I really think adding these sorts of examples will help you recruit users to whom it's not immediately obvious why they shouldn't just use the basic confusion_matrix functionality in scikit-learn.

document API

Hi @sepandhaghighi I'm reviewing pycm for JOSS.
First, you all have done a great job and the library is well put-together. I'm sorry I haven't started the review sooner. I think the library as-is is very close to "accept". There are just a couple of minor revisions I think might be helpful.

Of those minor revisions, the one I would most like to see is a thorough documentation of the API.

At the bare minimum, the user should be able to type help(ConfusionMatrix) (or ConfusionMatrix? in IPython) and get some sort of useful help. As you probably know, the way to fix this would be to add a docstring to the ConfusionMatrix class. This docstring should state what the acceptable types for y_true and y_pred are--are only lists of ints and np.ndarrays of ints valid? Including briefer versions of the examples from the README.md (without the entire output of the print(cm) statement) would probably also be helpful.

I think it would also be helpful to state what data types are acceptable in plain English on your README.md. A potential user will want to know if pycm can accept their data in its present state or if they're going to have to convert it somehow to the right format.

I will need to finish this review tomorrow but I wanted to get it started for you as soon as possible. I hope my review will be helpful and we can quickly get your library into JOSS so more people can appreciate it.

DOI in paper.md

Hey @sepandhaghighi thanks for working on the issues.

One very minor one: I think the JOSS guidelines ask for you to put any relevant DOIs in the paper.md file.
See the example paper.md:
https://raw.githubusercontent.com/arfon/fidgit/master/paper/paper.md

In a somewhat meta fashion, Fidgit is publishing itself to figshare with DOI 'https://doi.org/10.6084/m9.figshare.828487' [@figshare_archive].

As far as I can tell, your library DOI is not in your paper.md currently.
Please add that.
