
hiwenet's Introduction

Histogram-weighted Networks (hiwenet)


Histogram-weighted Networks for Feature Extraction and Advanced Analysis in Neuroscience

Network-level analysis of various features, especially when it can be individualized for a single subject, is proving to be a valuable tool in many applications. The ability to extract networks for a given subject on its own enables feature extraction conducive to predictive modeling, unlike group-wise networks, which can only be used for descriptive and explanatory purposes. This package extracts single-subject (individualized, or intrinsic) networks from node-wise data by computing edge weights based on the histogram distance between the distributions of values within each pair of nodes. A node could be an ROI, a patch, a cube, or any other unit of relevance in your application. This approach takes advantage of the full distribution of values available within each node, relative to the simpler use of averages (or other summary statistics) to compare two nodes/ROIs within a given subject.
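To make the idea concrete, here is a minimal sketch of how one edge weight could be computed from the value distributions of two nodes. This is not the package's actual implementation: the function name and the choice of an L1 histogram distance are illustrative (hiwenet itself supports many metrics, via medpy).

```python
import numpy as np

def edge_weight(values_a, values_b, num_bins=25):
    """Illustrative sketch: weight an edge by the distance between the
    histograms of values within two nodes (e.g. two ROIs of one subject)."""
    pooled = np.concatenate([values_a, values_b])
    # common bin edges over the pooled range, so the two histograms are comparable
    edges = np.histogram_bin_edges(pooled, bins=num_bins)
    hist_a = np.histogram(values_a, bins=edges)[0] / len(values_a)
    hist_b = np.histogram(values_b, bins=edges)[0] / len(values_b)
    # one possible histogram distance: Manhattan (L1), for illustration only
    return float(np.sum(np.abs(hist_a - hist_b)))
```

Identical distributions yield a weight of zero; the more the two distributions differ, the larger the edge weight.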

A rough scheme of the computation is illustrated below.

Installation

pip install -U hiwenet

Documentation

Docs: http://hiwenet.readthedocs.io

hiwenet's People

Contributors

raamana


hiwenet's Issues

New feature: Ability to plugin a user-defined metric

To enable users to compute a metric, similarity, or another function of their choice:

  • optional input of a callable (taking two distributions, returning a number)
  • callable must be able to handle two numpy arrays (distributions) of different length
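A sketch of what such a user-supplied callable might look like, per the requirements above. The metric itself and the `weight_method=` usage are hypothetical; only the callable's signature (two distributions in, one number out) comes from this issue.

```python
import numpy as np

def quantile_gap(dist_one, dist_two):
    """Hypothetical user-defined metric: accepts two 1-D arrays of
    possibly different lengths and returns a single number."""
    # comparing quantiles sidesteps the length mismatch between the arrays
    probs = np.linspace(0.0, 1.0, 11)
    return float(np.max(np.abs(np.quantile(dist_one, probs) -
                               np.quantile(dist_two, probs))))

# hypothetical usage, assuming the proposed plugin parameter is named
# weight_method and accepts a callable:
# edge_weights = hiwenet.extract(features, groups, weight_method=quantile_gap)
```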

[JOSS Review] Please indicate clearly that the software is not Python 3 compatible

The software only works on Python 2. Given the current size of the project (small) and the fact that Python 2 has been discontinued, I would strongly suggest moving forward and start supporting Python 3 now.

In my opinion, no backwards compatibility is necessary. So it is up to the authors to keep it or dismiss it.

If making this software Python 3 compatible is not in the roadmap, both points should be made very clear in the top README file of the project.

It is true that the authors are using the appropriate classifiers for PyPI, but those classifiers are not taken into account by pip during installation. Long story short: installation works in a Python 3 environment without even a warning.
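One standard way to make pip itself refuse installation on Python 3 (classifiers are informational only, as noted above) is the `python_requires` argument to `setup()`; an illustrative fragment, assuming a setuptools-based setup.py:

```python
# setup.py fragment (illustrative): pip >= 9.0 reads python_requires and
# refuses to install the package on unsupported interpreter versions.
from setuptools import setup

setup(
    name='hiwenet',
    python_requires='>=2.7, <3',  # declare Python-2-only support explicitly
    # ... other arguments unchanged ...
)
```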

(ref openjournals/joss-reviews#380)

[JOSS Review] Paper.md

I am now checking the last two checkpoints of the review list:

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

The statement of need basically replicates the summary in the README file. I think that, for JOSS, a more explicit statement should be made. The target audience is not mentioned in the Summary (of course it can be inferred from the summary and the paper, but I think JOSS wants it to be explicit, not implicit).

Some references from the preprint should be extracted and placed in the references section of Paper.md. At least a reference about the earliest structural connectivity analyses that looked at the cortical thickness (He 2007) and references to the historical methods to extract these networks (there is a table about this in the preprint). It is important to place this contribution in the full scientific scope.

Adding those references will also make it easier to describe the unique contributions of this software.

In my opinion, it is also important to mention that this is a reimplementation of original code in Matlab. First, I want to applaud the authors' transparency in noting this point and their effort in providing a Python version, which opens it up to a wider audience. Second, I would advise including the original Matlab code in the repo, as part of the documentation; that way the authors ensure the transparency of the reimplementation. Third, this point is a bit disturbing to me, and I'm not sure of my opinion on this: should the Matlab code be the one submitted to JOSS?

Additionally (and this goes beyond this review; it is more for the preprint), I'd advise repeating all the analyses done in the paper with this software (if possible), to check that this reimplementation behaves exactly like the Matlab implementation.

(ref. openjournals/joss-reviews#380)

[JOSS review] Discussion and Clarifications

From http://joss.theoj.org/about#author_guidelines:

What are your requirements for submission?

  • Your software should be open source as per the OSI definition
  • Your software should have an obvious research application
  • You should be a major contributor to the software you are submitting
  • Should be a significant contribution to the available open source software that either enables some new research challenges to be addressed or makes addressing research challenges significantly better (e.g., faster, easier, simpler)
  • Should be feature complete (no half-baked solutions)
  • Minor, 'utility' packages are not acceptable

JOSS publishes articles about research software. This definition includes software that: solves complex modeling problems in a scientific context (physics, mathematics, biology, medicine, social science, neuroscience, engineering); supports the functioning of research instruments or the execution of research experiments; extracts knowledge from large data sets; offers a mathematical library, or similar.
  1. The point "Should be feature complete (no half-baked solutions)" could be ticked off, but some details (such as the fact that no automated generation of documentation is provided, #7) point to the opposite. The tests are probably a bit light as well (I will post a separate issue with more concrete ideas). These issues can be addressed during the review process.

  2. But I'm mostly concerned about the last point: Minor, 'utility' packages are not acceptable. The core of the proposed package is the extract function which computes the histograms for each ROI in the parcellation and then uses the histogram distance function from medpy to calculate the histogram-weighted network. The code is completed by some eight supporting functions that basically wrap methods from numpy and medpy to address particular operations. I would like to hear from Pradeep the aspects along which this package goes beyond the concept of 'utility'.

    This is a very valuable piece of code and a great contribution to network analysis, please do not get me wrong. But I would find it much more useful (and accessible to a wider audience) if it were integrated into other neuroimaging packages, like MNE, nilearn, nistats, medpy (which is a dependency) or even nibabel. Was a PR proposed to any of these projects and rejected in the first place? These packages always credit contributors and reference the original methods, so I don't see the downside.

    In particular, nilearn currently misses the calculation of histogram-weighted networks and formal utilities for signal extraction from surfaces (it can be done, but with some manual massaging). Adding these two features (which is basically what hiwenet does currently) to nilearn would be extremely useful. And hiwenet networks could then be calculated on volume-based data as well, at no extra cost.

(ref. openjournals/joss-reviews#380)

Provide a sklearn-compatible object

Use docs and templates from

and offer the following sorts of object to other packages like nilearn, MNE and the related:

from hiwenet import HistogramDist, FancyFamilyOfDistances, AnotherFancyFamilyOfSimilarities

This should ideally work like a drop-in replacement for LedoitWolf from sklearn.covariance

from sklearn.covariance import LedoitWolf
from hiwenet import HistogramDist
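A minimal sketch of what `HistogramDist` could look like as a scikit-learn-compatible estimator. Only the class name comes from this issue; the parameter names, internals, and the choice of an L1 histogram distance are assumptions.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class HistogramDist(BaseEstimator, TransformerMixin):
    """Illustrative sketch: computes a pairwise histogram-distance matrix
    over the columns (nodes) of X, following sklearn's fit/transform API
    so it can slot in where LedoitWolf-style objects are expected."""

    def __init__(self, num_bins=25):
        self.num_bins = num_bins

    def fit(self, X, y=None):
        X = np.asarray(X)
        n_nodes = X.shape[1]
        # common bin edges across all nodes, so histograms are comparable
        edges = np.histogram_bin_edges(X, bins=self.num_bins)
        hists = np.stack([np.histogram(X[:, j], bins=edges)[0]
                          for j in range(n_nodes)]).astype(float)
        hists /= hists.sum(axis=1, keepdims=True)
        # L1 distance between every pair of node histograms (for illustration)
        self.distance_matrix_ = np.abs(
            hists[:, None, :] - hists[None, :, :]).sum(axis=-1)
        return self

    def transform(self, X):
        return self.distance_matrix_
```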

[JOSS Review] Software testing

The main missing point here is a "smoke test" (just checking that the software worked, allowing any size of error at the output). How it would work:

  1. Generate an example features file. Store it in your repo so Travis fetches it.
  2. Generate (locally) the outputs from that specific features file. Add them to the repo.
  3. Run the command line on that file (use coverage if you want to make sure that run counts toward the coverage measure, allowing you to get rid of some tests; I go deeper on this below).
  4. Check that the output of 3) is the same you got in 2).

This does not check that the method worked correctly, it only checks that it did not break down (no smoke came out the box).

But you happen to have an alternative implementation, which makes a great testing oracle. It lets you 1) ensure your implementation matches the original; and 2) be more certain about the correctness of your software. To do so, replace the outputs in step 2) of the previous list with those generated by your Matlab implementation. In step 4) you will probably want to modify your test to accept a certain threshold of numerical tolerance. That way, you wouldn't need to repeat all the experiments of your preprint to provide some evidence that this implementation is equivalent.
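The regression check in step 4) could be sketched as follows. The file name and function name are hypothetical; `np.allclose` provides the numerical tolerance mentioned above.

```python
import numpy as np

def check_against_reference(current_output, reference_path,
                            rel_tol=1e-5, abs_tol=1e-8):
    """Smoke/regression check: compare a freshly computed edge-weight
    matrix against a stored reference (e.g. the Matlab output),
    within a numerical tolerance."""
    reference = np.loadtxt(reference_path)
    return bool(np.allclose(current_output, reference,
                            rtol=rel_tol, atol=abs_tol))
```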

The rest of the tests just check the conformity of inputs and outputs, which is fine:

def test_dimensions():
    ew = hiwenet(features, groups)
    assert len(ew) == num_groups
    assert ew.shape[0] == num_groups and ew.shape[1] == num_groups

def test_too_few_groups():
    features, groups, group_ids, num_groups = make_features(100, 1)
    with raises(ValueError):
        ew = hiwenet(features, groups)

def test_too_few_values():
    features, groups, group_ids, num_groups = make_features(10, 500)
    with raises(ValueError):
        ew = hiwenet(features[:num_groups-1], groups)

def test_invalid_trim_perc():
    with raises(ValueError):
        ew = hiwenet(features, groups, trim_percentile=-1)
    with raises(ValueError):
        ew = hiwenet(features, groups, trim_percentile=101)

def test_invalid_weight_method():
    with raises(NotImplementedError):
        ew = hiwenet(features, groups, weight_method='dkjz.jscn')
    with raises(NotImplementedError):
        ew = hiwenet(features, groups, weight_method='somerandomnamenoonewoulduse')

def test_trim_not_too_few_values():
    with raises(ValueError):
        ew = hiwenet([0], [1], trim_outliers=False)

def test_trim_false_too_few_to_calc_range():
    with raises(ValueError):
        ew = hiwenet([1], groups, trim_outliers=False)

def test_not_np_arrays():
    with raises(ValueError):
        ew = hiwenet(list(), groups, trim_percentile=101)
    with raises(ValueError):
        ew = hiwenet(features, list(), trim_percentile=101)

def test_invalid_nbins():
    with raises(ValueError):
        ew = hiwenet(features, groups, num_bins=np.NaN)
    with raises(ValueError):
        ew = hiwenet(features, groups, num_bins=np.Inf)
    with raises(ValueError):
        ew = hiwenet(features, groups, num_bins=2)

def test_return_nx_graph():
    nxG = hiwenet(features, groups, return_networkx_graph=True)
    assert isinstance(nxG, nx.Graph)
    assert nxG.number_of_nodes() == num_groups
    assert nxG.number_of_edges() == num_links

def test_extreme_skewed():
    # Not yet sure what to test for here!!
    ew = hiwenet(10 + np.zeros(dimensionality), groups)

Others test argparse, which already takes care of the correctness and interpretation of the command line; I think these tests do not exercise the code much:

# CLI tests
def test_CLI_run():
    "function to hit the CLI lines."
    # first word is the script name (ignored)
    cur_dir = os.path.dirname(os.path.abspath(__file__))
    featrs_path = os.path.abspath(os.path.join(cur_dir, '..', 'examples', 'features_1000.txt'))
    groups_path = os.path.abspath(os.path.join(cur_dir, '..', 'examples', 'groups_1000.txt'))
    sys.argv = shlex.split('hiwenet -f {} -g {} -n 25'.format(featrs_path, groups_path))
    CLI()

def test_CLI_nonexisting_paths():
    "invalid paths"
    cur_dir = os.path.dirname(os.path.abspath(__file__))
    featrs_path = os.path.abspath(os.path.join(cur_dir, '..', 'examples', 'features_1000.txt'))
    groups_path = 'NONEXISTING_groups_1000.txt'
    sys.argv = shlex.split('hiwenet -f {} -g {} -n 25'.format(featrs_path, groups_path))
    with raises(IOError):
        CLI()
    featrs_path = 'NONEXISTING_features_1000.txt'
    groups_path = os.path.abspath(os.path.join(cur_dir, '..', 'examples', 'groups_1000.txt'))
    sys.argv = shlex.split('hiwenet -f {} -g {} -n 25'.format(featrs_path, groups_path))
    with raises(IOError):
        CLI()

def test_CLI_invalid_args():
    "invalid args"
    featrs_path = 'NONEXISTING_features_1000.txt'
    # args aaa and invalid_arg_name don't exist
    sys.argv = shlex.split('hiwenet --aaa {0} -f {0} -g {0}'.format(featrs_path))
    with raises(SystemExit):
        CLI()
    sys.argv = shlex.split('hiwenet --invalid_arg_name {0} -f {0} -g {0}'.format(featrs_path))
    with raises(SystemExit):
        CLI()

def test_CLI_too_few_args():
    "testing too few args"
    sys.argv = ['hiwenet ']
    with raises(SystemExit):
        CLI()
    sys.argv = ['hiwenet -f check']
    with raises(SystemExit):
        CLI()
    sys.argv = ['hiwenet -g check']
    with raises(SystemExit):
        CLI()

(ref. openjournals/joss-reviews#380)
