Comments (7)

mckinsel commented on September 13, 2024

Just to get the ball rolling, here are a couple quick comments. @paoloczi, any thoughts?

  1. Exposing the explicit container types is a little weird in Python. There's no real notion of, say, a UintFloatPairListList in the language. So instead of sticking C++ vectors into Python, it's better from a usability perspective to work with plain Python lists, do the PyObject casts, and fail if it doesn't work. Similarly, Python doesn't have C++ pairs, but it does have tuples of length 2.

  2. The methods with incremented names like findSimilarPairs0, findSimilarPairs1, findSimilarPairs2 are a little tough to figure out.

  3. Most of the methods that do any heavy lifting don't return anything. You would expect something like createCellGraph to return a CellGraph or findSimilarPairs to return some similar pairs. Keeping everything within an opaque object means you have to externally manage a catalog of cell set names, gene set names, similar pairs names, cell graph names, and cluster graph names.

  4. There are some common tasks that are hard to do. For example, when I've used this I wanted to iterate over all the cells and get each cell's cluster assignment from a particular clustering. Ideally this could be something like

[(cell, exp_mat.get_cluster_assignment(cell, "my_clustering")) for cell in exp_mat.cells]

But it's not clear how to do that with this API.
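One way a query like this could be layered on top of a name-based object is a thin wrapper that returns values instead of storing them under names. The following is only a sketch under stated assumptions: `LowLevelMatrix`, `cell_names`, `cluster_of`, and `ExpMat` are hypothetical stand-ins made up for illustration, not functions of this library.

```python
# Sketch only: LowLevelMatrix is a hypothetical stand-in for an opaque,
# name-based object. None of these names come from ExpressionMatrix2.
class LowLevelMatrix:
    def __init__(self):
        self._cells = ["cell0", "cell1", "cell2"]
        # clustering name -> {cell name -> cluster id}
        self._clusterings = {
            "my_clustering": {"cell0": 0, "cell1": 0, "cell2": 1},
        }

    def cell_names(self):
        return list(self._cells)

    def cluster_of(self, clustering_name, cell_name):
        return self._clusterings[clustering_name][cell_name]


class ExpMat:
    """Thin wrapper that exposes cells as an iterable and returns results."""

    def __init__(self, low_level):
        self._low = low_level

    @property
    def cells(self):
        return self._low.cell_names()

    def get_cluster_assignment(self, cell, clustering):
        return self._low.cluster_of(clustering, cell)


exp_mat = ExpMat(LowLevelMatrix())
assignments = [
    (cell, exp_mat.get_cluster_assignment(cell, "my_clustering"))
    for cell in exp_mat.cells
]
```

With a wrapper of this shape, the caller never has to track which names were registered inside the object; the clustering name is the only string that crosses the boundary.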

from expressionmatrix2.

paoloczi commented on September 13, 2024

freeman-lab commented on September 13, 2024

Great discussion! Chatted a bit with @mckinsel offline, here are some thoughts...

It seems like there's potential value in a friendly and functional Pythonic interface, with clean inputs and outputs for core operations, designed for Python end users. To that end I like all of @mckinsel's API suggestions, and might have a few others. But I also totally get @paoloczi's point that we don't want to completely refactor the logic of the C++ library! And we want to keep it easily usable from C++.

So, I think somewhere we need some extra, probably messy glue code, to translate from the C++ semantics to something more Pythonic. We could do that in a separate codebase, so it doesn't muck up what's here. In other words, this repo just exposes a clean C++ API. The Python translation code becomes its own little project, as would an R translation. The only downside is the time to write and maintain that glue code, but it could be worthwhile if we really want to enable Python users.
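As a rough illustration of what such glue code might look like on the input side, here is a minimal sketch that accepts a plain Python list of 2-tuples and fails early if the shape is wrong, before anything is handed to C++. The function name and the reading of each pair as (cell id, similarity) are assumptions made for this example only.

```python
# Sketch only: the function name and the (cell id, similarity) interpretation
# of each pair are assumptions made for illustration.
from numbers import Integral, Real


def to_uint_float_pairs(pairs):
    """Validate a plain list of 2-tuples and normalize it to (int, float)."""
    checked = []
    for index, item in enumerate(pairs):
        if not isinstance(item, tuple) or len(item) != 2:
            raise TypeError(f"item {index} is not a tuple of length 2: {item!r}")
        cell_id, similarity = item
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(cell_id, bool) or not isinstance(cell_id, Integral):
            raise TypeError(f"item {index}: expected an int, got {cell_id!r}")
        if cell_id < 0:
            raise ValueError(f"item {index}: expected a non-negative int, got {cell_id}")
        if isinstance(similarity, bool) or not isinstance(similarity, Real):
            raise TypeError(f"item {index}: expected a real number, got {similarity!r}")
        checked.append((int(cell_id), float(similarity)))
    return checked
```

Keeping checks like this in the Python layer means the C++ side can stay strict about its types while Python users still pass ordinary lists and tuples.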

The only caveat is whether the memory management underlying the ExpressionMatrix object makes certain semantics in Python strictly impossible; this is the "hard to make them accessible" part of @paoloczi's response :) But @mckinsel suggested he can play with a small test case to see what's possible there, which seems super useful to me!

paoloczi commented on September 13, 2024

paoloczi commented on September 13, 2024

Some thoughts regarding a more natural Python API, triggered in part by work in this repo https://github.com/chanzuckerberg/pyEM2 and in part by some discussions today:

  • We could replace Boost.Python with pybind11. If we do this (and include the pybind11/stl.h header), std::vector gets automatically converted to Python list and std::pair gets automatically converted to Python tuple with two elements. This would eliminate the vast majority of the new types exposed to Python. In addition, we could also use std::map, which gets automatically converted to Python dictionary, in places where that would be more natural.
  • Once converted to pybind11, we could essentially merge PythonModule.cpp with the code in the pyEM2 repo.
  • Regarding naming conventions, Python prefers underscores, rather than capitalization, for function names (but always capitalization for class names). We could expose each function in two ways: using both the C++ capitalized name and the name transformed using underscores. This would allow both styles of Python code, and ensure that using the same name between C++ and Python would continue to work.
  • Use of keyword arguments (instead of types such as ExpressionMatrixCreationParameters) requires more thinking.
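The dual-naming idea in the list above could be prototyped on the Python side without touching the bindings. The sketch below assumes a class decorator that adds an underscore-style alias for every public capitalized method; `camel_to_snake`, `add_snake_case_aliases`, and the `Matrix` class are hypothetical names for illustration, and whether this applies cleanly to a bound C++ class would need to be checked.

```python
import re


def camel_to_snake(name):
    """Convert a capitalized name such as findSimilarPairs0 to find_similar_pairs0."""
    # Insert "_" wherever a lowercase letter or digit is followed by an uppercase letter.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def add_snake_case_aliases(cls):
    """Class decorator: expose each public method under an underscore-style name too."""
    for name, attr in list(vars(cls).items()):
        if name.startswith("_") or not callable(attr):
            continue
        alias = camel_to_snake(name)
        # Never overwrite an attribute that already exists.
        if alias != name and not hasattr(cls, alias):
            setattr(cls, alias, attr)
    return cls


@add_snake_case_aliases
class Matrix:  # hypothetical stand-in for a class exposed from C++
    def cellCount(self):
        return 3


m = Matrix()
both_styles = (m.cellCount(), m.cell_count())
```

Both spellings resolve to the same function object, so code written in either style keeps working and the C++ names remain searchable from Python.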

paoloczi commented on September 13, 2024

Regarding keyword arguments, there seems to be consensus that they are a good thing to do.
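For illustration, a keyword-argument front end could build the parameters object internally, so callers never construct it by hand. This is a sketch under assumptions: `CreationParameters` is a hypothetical stand-in for a type such as ExpressionMatrixCreationParameters, and its field names and default values are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class CreationParameters:
    # Hypothetical stand-in for a parameters struct; the field names and
    # defaults below are illustrative, not taken from the library.
    gene_capacity: int = 1 << 18
    cell_capacity: int = 1 << 24


def create_matrix(directory_name, **kwargs):
    """Build the parameters object from keyword arguments."""
    # Unknown keywords raise TypeError, which catches misspelled options early.
    params = CreationParameters(**kwargs)
    # A real wrapper would pass directory_name and params to the C++ constructor here.
    return directory_name, params


name, params = create_matrix("data", cell_capacity=1000)
```

Omitted keywords fall back to the defaults declared on the parameters type, so the common case stays a one-line call.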

paoloczi commented on September 13, 2024

The pyEM2 repository fills some of these needs. There are good reasons to keep both a "lean and mean" lower level API like the one provided by ExpressionMatrix2 and a higher level, more convenient and "Pythonic" API like the one provided by pyEM2. To avoid confusing users, we concluded that it is best to keep the two repositories and their documentation separate.
