
hiwenet's Introduction

Histogram-weighted Networks (hiwenet)


Histogram-weighted Networks for Feature Extraction and Advanced Analysis in Neuroscience

Network-level analysis of various features, especially when it can be individualized for a single subject, is proving to be a valuable tool in many applications. The ability to extract networks for a given subject on its own enables feature extraction conducive to predictive modeling, unlike group-wise networks, which can only be used for descriptive and explanatory purposes. This package extracts single-subject (individualized, or intrinsic) networks from node-wise data by computing edge weights based on the histogram distance between the distributions of values within each pair of nodes. A node could be an ROI, a patch, a cube, or any other unit of relevance in your application. This approach takes advantage of the full distribution of values available within each node, relative to the simpler use of averages (or other summary statistics) to compare two nodes/ROIs within a given subject.
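To make the idea concrete, here is a minimal sketch of how one edge weight could be computed from the value distributions of two nodes. This is not the package's actual implementation: the function name and the choice of an L1 histogram distance are illustrative (hiwenet itself supports many metrics, via medpy).

```python
import numpy as np

def edge_weight(values_a, values_b, num_bins=25):
    """Illustrative sketch: weight an edge by the distance between the
    histograms of values within two nodes (e.g. two ROIs of one subject)."""
    pooled = np.concatenate([values_a, values_b])
    # common bin edges over the pooled range, so the two histograms are comparable
    edges = np.histogram_bin_edges(pooled, bins=num_bins)
    hist_a = np.histogram(values_a, bins=edges)[0] / len(values_a)
    hist_b = np.histogram(values_b, bins=edges)[0] / len(values_b)
    # one possible histogram distance: Manhattan (L1), for illustration only
    return float(np.sum(np.abs(hist_a - hist_b)))
```

Identical distributions yield a weight of zero; the more the two distributions differ, the larger the edge weight.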

A rough scheme of the computation is illustrated below.

Installation

pip install -U hiwenet

Documentation

Docs: http://hiwenet.readthedocs.io

hiwenet's People

Contributors

raamana


hiwenet's Issues

New feature: Ability to plugin a user-defined metric

To enable users to compute a metric, similarity, or another function of their choice:

  • optional input of a callable (taking two distributions, returning a number)
  • callable must be able to handle two numpy arrays (distributions) of different length
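A sketch of what such a user-supplied callable might look like, per the requirements above. The metric itself and the `weight_method=` usage are hypothetical; only the callable's signature (two distributions in, one number out) comes from this issue.

```python
import numpy as np

def quantile_gap(dist_one, dist_two):
    """Hypothetical user-defined metric: accepts two 1-D arrays of
    possibly different lengths and returns a single number."""
    # comparing quantiles sidesteps the length mismatch between the arrays
    probs = np.linspace(0.0, 1.0, 11)
    return float(np.max(np.abs(np.quantile(dist_one, probs) -
                               np.quantile(dist_two, probs))))

# hypothetical usage, assuming the proposed plugin parameter is named
# weight_method and accepts a callable:
# edge_weights = hiwenet.extract(features, groups, weight_method=quantile_gap)
```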

[JOSS Review] Please indicate clearly that the software is not Python 3 compatible

The software only works on Python 2. Given the current size of the project (small) and the fact that Python 2 has been discontinued, I would strongly suggest moving forward and start supporting Python 3 now.

In my opinion, no backwards compatibility is necessary. So it is up to the authors to keep it or dismiss it.

If making this software Python 3 compatible is not in the roadmap, both points should be made very clear in the top README file of the project.

It is true that the authors are using the appropriate classifiers for PyPI, but those classifiers are not taken into account by pip during installation. Long story short: installation works in a Python 3 environment without even a warning.
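One standard way to make pip itself refuse installation on Python 3 (classifiers are informational only, as noted above) is the `python_requires` argument to `setup()`; an illustrative fragment, assuming a setuptools-based setup.py:

```python
# setup.py fragment (illustrative): pip >= 9.0 reads python_requires and
# refuses to install the package on unsupported interpreter versions.
from setuptools import setup

setup(
    name='hiwenet',
    python_requires='>=2.7, <3',  # declare Python-2-only support explicitly
    # ... other arguments unchanged ...
)
```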

(ref openjournals/joss-reviews#380)

[JOSS Review] Paper.md

I am now checking the last two checkpoints of the review list:

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

The statement of need basically replicates the summary in the README file. I think that, for JOSS, a more explicit statement should be made. The target audience is not mentioned in the Summary (of course it can be inferred from the summary and the paper, but I think JOSS wants it to be explicit, not implicit).

Some references from the preprint should be extracted and placed in the references section of Paper.md. At least a reference about the earliest structural connectivity analyses that looked at the cortical thickness (He 2007) and references to the historical methods to extract these networks (there is a table about this in the preprint). It is important to place this contribution in the full scientific scope.

Adding those references will also make it easier to describe the unique contributions of this software.

In my opinion, it is also important to mention that this is a reimplementation of original code in Matlab. First, I want to applaud the authors' transparency in noting this point and their effort in providing a Python version, which opens it up to a wider audience. Second, I would advise including the original Matlab code in the repo, as part of the documentation; that way the authors ensure the transparency of the reimplementation. Third, this point is a bit disturbing to me, and I'm not sure of my opinion on this: should the Matlab code be the one submitted to JOSS?

Additionally (and this goes beyond this review; it is more for the preprint), I'd advise repeating all the analyses done in the paper with this software (if possible), to check that this reimplementation behaves exactly like the Matlab implementation.

(ref. openjournals/joss-reviews#380)

[JOSS review] Discussion and Clarifications

From http://joss.theoj.org/about#author_guidelines:

What are your requirements for submission?

  • Your software should be open source as per the OSI definition
  • Your software should have an obvious research application
  • You should be a major contributor to the software you are submitting
  • Should be a significant contribution to the available open source software that either enables some new research challenges to be addressed or makes addressing research challenges significantly better (e.g., faster, easier, simpler)
  • Should be feature complete (no half-baked solutions)
  • Minor, 'utility' packages are not acceptable

JOSS publishes articles about research software. This definition includes software that: solves complex modeling problems in a scientific context (physics, mathematics, biology, medicine, social science, neuroscience, engineering); supports the functioning of research instruments or the execution of research experiments; extracts knowledge from large data sets; offers a mathematical library, or similar.
  1. The point "Should be feature complete (no half-baked solutions)" could be ticked off, but some details (such as the fact that no automated generation of documentation is provided, #7) point to the opposite. The tests are probably a bit light as well (I will post a separate issue with more concrete ideas). These issues can be addressed during the review process.

  2. But I'm mostly concerned about the last point: Minor, 'utility' packages are not acceptable. The core of the proposed package is the extract function which computes the histograms for each ROI in the parcellation and then uses the histogram distance function from medpy to calculate the histogram-weighted network. The code is completed by some eight supporting functions that basically wrap methods from numpy and medpy to address particular operations. I would like to hear from Pradeep the aspects along which this package goes beyond the concept of 'utility'.

    This is a very valuable piece of code and a great contribution to network analysis, please do not get me wrong. But I would find it much more useful (and accessible to a wider audience) if it were integrated into other neuroimaging packages, like MNE, nilearn, nistats, medpy (which is a dependency) or even nibabel. Was a PR proposed to any of these projects and rejected in the first place? These packages always credit contributors and reference the original methods, so I don't see the downside.

    In particular, nilearn currently misses the calculation of histogram-weighted networks and formal utilities for signal extraction from surfaces (it can be done, but with some manual massaging). Adding these two features (which is basically what hiwenet does currently) to nilearn would be extremely useful. And hiwenet networks could then be calculated on volume-based data as well, at no extra cost.

(ref. openjournals/joss-reviews#380)

Provide a sklearn-compatible object

Use docs and templates from

and offer the following sorts of object to other packages like nilearn, MNE and the related:

from hiwenet import HistogramDist, FancyFamilyOfDistances, AnotherFancyFamilyOfSimilarities

This should ideally work like a drop-in replacement for LedoitWolf from sklearn.covariance

from sklearn.covariance import LedoitWolf
from hiwenet import HistogramDist
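A minimal sketch of what `HistogramDist` could look like as a scikit-learn-compatible estimator. Only the class name comes from this issue; the parameter names, internals, and the choice of an L1 histogram distance are assumptions.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class HistogramDist(BaseEstimator, TransformerMixin):
    """Illustrative sketch: computes a pairwise histogram-distance matrix
    over the columns (nodes) of X, following sklearn's fit/transform API
    so it can slot in where LedoitWolf-style objects are expected."""

    def __init__(self, num_bins=25):
        self.num_bins = num_bins

    def fit(self, X, y=None):
        X = np.asarray(X)
        n_nodes = X.shape[1]
        # common bin edges across all nodes, so histograms are comparable
        edges = np.histogram_bin_edges(X, bins=self.num_bins)
        hists = np.stack([np.histogram(X[:, j], bins=edges)[0]
                          for j in range(n_nodes)]).astype(float)
        hists /= hists.sum(axis=1, keepdims=True)
        # L1 distance between every pair of node histograms (for illustration)
        self.distance_matrix_ = np.abs(
            hists[:, None, :] - hists[None, :, :]).sum(axis=-1)
        return self

    def transform(self, X):
        return self.distance_matrix_
```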

[JOSS Review] Software testing

The main missing point here is a "smoke test" (just checking that the software worked, allowing any size of error at the output). How it would work:

  1. Generate an example features file. Store it in your repo so Travis fetches it.
  2. Generate (locally) the outputs from that specific features file. Add them to the repo.
  3. Run the command line on that file (use coverage if you want to make sure that run counts toward the coverage measure, allowing you to get rid of some tests; I go deeper on this below).
  4. Check that the output of 3) is the same you got in 2).

This does not check that the method worked correctly, it only checks that it did not break down (no smoke came out the box).

But you happen to have an alternative implementation, which makes a great testing oracle. It lets you 1) ensure your implementation matches the original; and 2) be more certain about the correctness of your software. To do so, replace the outputs in step 2) of the previous list with those generated by your Matlab implementation. In step 4) you will probably want to modify your test to accept a certain threshold of numerical tolerance. That way, you wouldn't need to repeat all the experiments of your preprint to provide some evidence that this implementation is equivalent.
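The regression check in step 4) could be sketched as follows. The file name and function name are hypothetical; `np.allclose` provides the numerical tolerance mentioned above.

```python
import numpy as np

def check_against_reference(current_output, reference_path,
                            rel_tol=1e-5, abs_tol=1e-8):
    """Smoke/regression check: compare a freshly computed edge-weight
    matrix against a stored reference (e.g. the Matlab output),
    within a numerical tolerance."""
    reference = np.loadtxt(reference_path)
    return bool(np.allclose(current_output, reference,
                            rtol=rel_tol, atol=abs_tol))
```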

The rest of the tests just check the conformity of inputs and outputs, which is fine:

def test_dimensions():
    ew = hiwenet(features, groups)
    assert len(ew) == num_groups
    assert ew.shape[0] == num_groups and ew.shape[1] == num_groups

def test_too_few_groups():
    features, groups, group_ids, num_groups = make_features(100, 1)
    with raises(ValueError):
        ew = hiwenet(features, groups)

def test_too_few_values():
    features, groups, group_ids, num_groups = make_features(10, 500)
    with raises(ValueError):
        ew = hiwenet(features[:num_groups-1], groups)

def test_invalid_trim_perc():
    with raises(ValueError):
        ew = hiwenet(features, groups, trim_percentile=-1)
    with raises(ValueError):
        ew = hiwenet(features, groups, trim_percentile=101)

def test_invalid_weight_method():
    with raises(NotImplementedError):
        ew = hiwenet(features, groups, weight_method='dkjz.jscn')
    with raises(NotImplementedError):
        ew = hiwenet(features, groups, weight_method='somerandomnamenoonewoulduse')

def test_trim_not_too_few_values():
    with raises(ValueError):
        ew = hiwenet([0], [1], trim_outliers=False)

def test_trim_false_too_few_to_calc_range():
    with raises(ValueError):
        ew = hiwenet([1], groups, trim_outliers=False)

def test_not_np_arrays():
    with raises(ValueError):
        ew = hiwenet(list(), groups, trim_percentile=101)
    with raises(ValueError):
        ew = hiwenet(features, list(), trim_percentile=101)

def test_invalid_nbins():
    with raises(ValueError):
        ew = hiwenet(features, groups, num_bins=np.NaN)
    with raises(ValueError):
        ew = hiwenet(features, groups, num_bins=np.Inf)
    with raises(ValueError):
        ew = hiwenet(features, groups, num_bins=2)

def test_return_nx_graph():
    nxG = hiwenet(features, groups, return_networkx_graph=True)
    assert isinstance(nxG, nx.Graph)
    assert nxG.number_of_nodes() == num_groups
    assert nxG.number_of_edges() == num_links

def test_extreme_skewed():
    # Not yet sure what to test for here!!
    ew = hiwenet(10 + np.zeros(dimensionality), groups)

Others test argparse, which already takes care of the correctness and interpretation of the command line; I think these tests do not exercise the code much:

# CLI tests
def test_CLI_run():
    "function to hit the CLI lines."
    # first word is the script name (ignored)
    cur_dir = os.path.dirname(os.path.abspath(__file__))
    featrs_path = os.path.abspath(os.path.join(cur_dir, '..', 'examples', 'features_1000.txt'))
    groups_path = os.path.abspath(os.path.join(cur_dir, '..', 'examples', 'groups_1000.txt'))
    sys.argv = shlex.split('hiwenet -f {} -g {} -n 25'.format(featrs_path, groups_path))
    CLI()

def test_CLI_nonexisting_paths():
    "invalid paths"
    cur_dir = os.path.dirname(os.path.abspath(__file__))
    featrs_path = os.path.abspath(os.path.join(cur_dir, '..', 'examples', 'features_1000.txt'))
    groups_path = 'NONEXISTING_groups_1000.txt'
    sys.argv = shlex.split('hiwenet -f {} -g {} -n 25'.format(featrs_path, groups_path))
    with raises(IOError):
        CLI()
    featrs_path = 'NONEXISTING_features_1000.txt'
    groups_path = os.path.abspath(os.path.join(cur_dir, '..', 'examples', 'groups_1000.txt'))
    sys.argv = shlex.split('hiwenet -f {} -g {} -n 25'.format(featrs_path, groups_path))
    with raises(IOError):
        CLI()

def test_CLI_invalid_args():
    "invalid args"
    featrs_path = 'NONEXISTING_features_1000.txt'
    # args aaa and invalid_arg_name don't exist
    sys.argv = shlex.split('hiwenet --aaa {0} -f {0} -g {0}'.format(featrs_path))
    with raises(SystemExit):
        CLI()
    sys.argv = shlex.split('hiwenet --invalid_arg_name {0} -f {0} -g {0}'.format(featrs_path))
    with raises(SystemExit):
        CLI()

def test_CLI_too_few_args():
    "testing too few args"
    sys.argv = ['hiwenet ']
    with raises(SystemExit):
        CLI()
    sys.argv = ['hiwenet -f check']
    with raises(SystemExit):
        CLI()
    sys.argv = ['hiwenet -g check']
    with raises(SystemExit):
        CLI()

(ref. openjournals/joss-reviews#380)
