point-annotator's Introduction

point-annotator

This package provides tools for annotating data with labels, based on the Mann-Whitney U test and the hypergeometric test. Currently, the examples show fast annotation of gene expression data with cell types based on marker genes.

Installation

The package is available on PyPI and can be installed with pip:

pip install point-annotator

Build the documentation

Building the documentation requires the Sphinx Python package.

cd docs
make html

After the build, the documentation is available in the _build directory.

point-annotator's People

Contributors

primozgodec

Forkers

primozgodec

point-annotator's Issues

Questions & Comments

I've tried using this library and I have a few general comments that would, IMO, make this package much simpler to use. To be clear, I was following the notebooks.

  • You should add checks at the beginning of the script to make sure objects are of the correct type, so it fails right away. For example, I passed in a sparse matrix instead of a dataframe, and it ran for some time before it failed; even then, the message was quite cryptic.
  • It's pretty strange you require the user to pass in a pandas dataframe, but then I thought about it a little and it makes sense since you need a way to match up the marker genes and the genes in the count matrix. Still, that caught me a little off guard.
  • It seems strange to me that the annotations in the markers table need to have the name "Cell Type"; after all, the library is called "point-annotator" and not "cell annotator". I think the most elegant solution would be to allow a parameter annotation_col, which would let me quickly set and change which column I want to use for annotations.
  • At first, I was completely confused by the num_genes parameter in assign_annotations. What does that have to do with anything? Reading the docstring did not help me out at all. Upon closer inspection, I realized that this is important for the DE test (I guess?). What would be the correct thing to do here if you don't know how many genes the organism has?
  • You might want to add a function that does plotting with the annotations into the library itself. It's tedious to write code for plotting, and this is something most people would want to do (I guess).

Also, the notebooks are a little confusing: there is a lot of nontrivial code for plotting confusion matrices and the like. Other than that, it's really cool 😄
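On the num_genes question above, here is a minimal sketch of why the total gene count can matter, assuming num_genes acts as the population size in a hypergeometric test (the thread does not confirm this). The function name hypergeom_sf and all numbers are made up for illustration; this is not the library's implementation.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    Hypothetical helper, for illustration only:
      population -- total number of genes (assumed role of num_genes)
      successes  -- number of marker genes for one cell type
      draws      -- number of genes selected as expressed in a cell
      k          -- observed overlap between the two sets
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Same overlap (5 of 50 markers among 200 expressed genes),
# evaluated against two different assumed genome sizes.
p_large = hypergeom_sf(5, population=20_000, successes=50, draws=200)
p_small = hypergeom_sf(5, population=2_000, successes=50, draws=200)

print(f"population=20000: p = {p_large:.3g}")
print(f"population= 2000: p = {p_small:.3g}")
```

Under this assumption, the same overlap looks far more surprising against a large gene population than against a small one, so a wrong gene count would shift every p-value.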
