
cellbench's People

Contributors

hpages, jwokaty, nturaga, petehaitch, qgouil, shians

cellbench's Issues

First pass over exported functions

For each exported function:

  • Check for adequate documentation
  • Check examples are runnable
  • Check arguments are consistent with the rest of the package API
  • Consider how the function should be categorised
  • Consider whether it should be made private

For each private function:

  • Consider whether it would be useful to expose
  • Add a "." prefix to indicate that it is private
  • Add any missing imports

Compiling full pipelines and running each pipeline more efficiently

Currently CellBench stores every result for each step before moving on to the next step. It's possible to reduce the memory burden by running each pipeline through to its final result before starting the next one, which reduces the amount of data that needs to be kept in memory at once.

The difficulty is in computing each required intermediate result only once and releasing it when it is no longer needed. This would follow a depth-first traversal of the pipeline tree, and it's not obvious how to implement it elegantly.
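One way to picture the depth-first idea is the base R sketch below. It is only an illustration under stated assumptions: `run_depth_first` and the layout of `steps` (a list of named lists of functions) are hypothetical and are not part of the CellBench API.

```r
# Hypothetical sketch, not CellBench API: depth-first evaluation of every
# combination of methods. `steps` is a list of named lists of functions.
run_depth_first <- function(input, steps, path = character(0)) {
    if (length(steps) == 0) {
        # Leaf: a full pipeline has been applied, keep only the final result.
        return(stats::setNames(list(input), paste(path, collapse = " > ")))
    }
    results <- list()
    for (name in names(steps[[1]])) {
        # Each intermediate is computed once and shared by every pipeline
        # that continues from it.
        intermediate <- steps[[1]][[name]](input)
        results <- c(
            results,
            run_depth_first(intermediate, steps[-1], c(path, name))
        )
        # `intermediate` is overwritten on the next iteration, so at most one
        # intermediate per step is referenced at any time.
    }
    results
}

steps <- list(
    list(add1 = function(x) x + 1, add2 = function(x) x + 2),
    list(double = function(x) x * 2, square = function(x) x^2)
)
res <- run_depth_first(1, steps)
```

Here `res` holds one final result per pipeline, named by the path of methods taken. Actual memory release still depends on R's garbage collector, and this sketch ignores error handling and parallel execution.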

Add propagated task_stop object

At the moment task_error objects are propagated for the purposes of recording errors in method calls. A class task_stop could be implemented with similar functionality but based on custom stop conditions.

For example, task_stop could be emitted when a filtering step removes too many cells. Another use case: if only the top 5 or 10 stepwise results by some metric should be passed into the next method, a filtering function could replace all non-passing results with a task_stop that records the reason for stopping.

The key requirement is that task_stop propagates in the same way as task_error: propagation is automatic, and individual methods do not need to know how to handle it.
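A minimal sketch of the proposal, assuming an S3 class and a wrapper that handles propagation, might look like the following. The names `task_stop`, `is_task_stop`, `propagate_stop` and `filter_positive` are hypothetical, and CellBench's existing task_error implementation is not reproduced here.

```r
# Hypothetical constructor for the proposed class.
task_stop <- function(reason) {
    structure(list(reason = reason), class = "task_stop")
}

is_task_stop <- function(x) inherits(x, "task_stop")

# Wrap a method so that a task_stop input is passed through untouched.
# The wrapped method itself never sees the task_stop.
propagate_stop <- function(method) {
    function(x, ...) {
        if (is_task_stop(x)) x else method(x, ...)
    }
}

# Example stop condition: too few values remain after filtering.
filter_positive <- function(x, min_n = 3) {
    kept <- x[x > 0]
    if (length(kept) < min_n) {
        return(task_stop(
            sprintf("only %d values left after filtering", length(kept))
        ))
    }
    kept
}

safe_mean <- propagate_stop(mean)

ok <- safe_mean(filter_positive(c(1, 2, 3, -1)))
stopped <- safe_mean(filter_positive(c(1, -2, -3, -1)))
```

In this sketch `ok` is an ordinary numeric result, while `stopped` is still the task_stop produced by the filter, with its reason intact. In a real implementation the wrapping would presumably happen where methods are applied, so that method authors never write it themselves.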

Add metric unpack helper

When applying a metric as a method, the results are a list and the metric type is in a column. The desired behaviour would be a vector column named after the metric, with the values unlisted.
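The requested behaviour could be sketched roughly as below. This assumes a data frame with a list column of scalar results and a single metric name per table; `unpack_metric` and the column names `result` and `metric` are hypothetical and may not match CellBench's actual output.

```r
# Hypothetical helper, assuming one scalar metric value per row and a single
# metric name in the table.
unpack_metric <- function(df, result_col = "result", metric_col = "metric") {
    values <- df[[result_col]]
    stopifnot(is.list(values), all(lengths(values) == 1))

    metric_name <- unique(as.character(df[[metric_col]]))
    stopifnot(length(metric_name) == 1)

    # Drop the list column and the metric column, then add a plain vector
    # column named after the metric.
    out <- df[, setdiff(names(df), c(result_col, metric_col)), drop = FALSE]
    out[[metric_name]] <- unlist(values, use.names = FALSE)
    out
}

benchmark <- data.frame(
    data = c("d1", "d2"),
    method = c("m1", "m1"),
    metric = c("ari", "ari"),
    stringsAsFactors = FALSE
)
benchmark$result <- list(0.8, 0.6)

unpacked <- unpack_metric(benchmark)
```

After the call, `unpacked` has the columns `data`, `method` and `ari`, with `ari` stored as a numeric vector rather than a list.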

Progress reporting

Progress reporting would be useful when applying pipelines. It could be done at the method step checkpoints and/or at the BiocParallel level.
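As a rough illustration of reporting at the step level, the sketch below uses the text progress bar from base R's utils package. The wrapper `apply_with_progress` is hypothetical and runs serially; it says nothing about how reporting would work under BiocParallel.

```r
# Hypothetical wrapper: apply one method to a non-empty list of inputs,
# updating a text progress bar after each task.
apply_with_progress <- function(inputs, method) {
    pb <- utils::txtProgressBar(min = 0, max = length(inputs), style = 3)
    on.exit(close(pb))

    results <- vector("list", length(inputs))
    for (i in seq_along(inputs)) {
        # Assign via a one-element list so a NULL result keeps its slot.
        results[i] <- list(method(inputs[[i]]))
        utils::setTxtProgressBar(pb, i)
    }
    names(results) <- names(inputs)
    results
}

out <- apply_with_progress(list(a = 1:3, b = 4:6), sum)
```

The bar advances once per task, so the same pattern could be applied once per method step with a message naming the step.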

Document conceptual framework

Once categories have been defined in #4, each category should have a shared conceptual framework to be explained in a vignette.

vignette ‘Introduction’ not found

vignette("Introduction", package = "CellBench")
Warning message:
vignette ‘Introduction’ not found

Where can I find more detailed documentation? The vignette does not seem to be among the .rmd files of the related ‘CellBench_data’ project either.

Automatic code checks

I think it's worth running goodpractice::goodpractice(".") and BiocCheck::BiocCheck(".") (I won't post the output here since it's rather long).
Some of the warnings are spurious/premature since the package is still in development (e.g. lack of unit tests) but others are worth fixing now (e.g., use seq_len() or seq_along() instead of 1:N).
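The reason the checks flag `1:N` shows up when `N` is zero, as in this small base R illustration (the variable names are arbitrary):

```r
n <- 0

# 1:n counts down when n is 0, giving c(1, 0), so the loop body runs twice.
bad <- 0
for (i in 1:n) bad <- bad + 1

# seq_len(n) is empty when n is 0, so the loop body never runs.
good <- 0
for (i in seq_len(n)) good <- good + 1

# seq_along() gives the same protection when iterating over an object.
empty_index <- seq_along(character(0))
```

The same failure mode appears with `1:length(x)` on an empty `x`, which is where `seq_along(x)` is the safer choice.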

Overall, I take these more as indicators than rules, but you'll ultimately need to satisfy BiocCheck for submission to Bioconductor.
