pyneurovault's Introduction

pyneurovault

Python wrapper for the NeuroVault API

Currently supports:

  • downloading all image and collections data into tables
  • counting unique cognitive atlas contrasts
  • downloading all resampled images
  • decoding with neurosynth terms
  • basic querying of results

Installation

pip install git+https://github.com/NeuroVault/pyneurovault

pyneurovault's People

Contributors

chrisgorgo, vsoch

pyneurovault's Issues

Simplify package dependencies

Right now, this package depends on nipype and nilearn.

nipype is a large dependency to pull in for a single utility function; it seems better to copy that function into the package.
The nilearn dependency is only needed for an example. It seems odd to install it along with pyneurovault, since the example isn't accessible.

error in get_images_with_collections()

all_images = get_images_with_collections()

Extracting NeuroVault collections meta data...
http://neurovault.org/api/collections/?limit=100&format=json
Found 408 results.
Retrieving http://neurovault.org/api/collections/?format=json&limit=100&offset=100
Retrieving http://neurovault.org/api/collections/?format=json&limit=100&offset=200
Retrieving http://neurovault.org/api/collections/?format=json&limit=100&offset=300
Retrieving http://neurovault.org/api/collections/?format=json&limit=100&offset=400
Extracting NeuroVault images meta data...
http://neurovault.org/api/images/?limit=1000&format=json
Found 7578 results.
Retrieving http://neurovault.org/api/images/?format=json&limit=1000&offset=1000
Retrieving http://neurovault.org/api/images/?format=json&limit=1000&offset=2000
Retrieving http://neurovault.org/api/images/?format=json&limit=1000&offset=3000
Retrieving http://neurovault.org/api/images/?format=json&limit=1000&offset=4000
Retrieving http://neurovault.org/api/images/?format=json&limit=1000&offset=5000
Retrieving http://neurovault.org/api/images/?format=json&limit=1000&offset=6000
Retrieving http://neurovault.org/api/images/?format=json&limit=1000&offset=7000
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-80d2a85866da> in <module>()
----> 1 all_images = get_images_with_collections()

/Users/filo/anaconda/lib/python2.7/site-packages/pyneurovault/api.pyc in get_images_with_collections(collection_pks)
     94     collections_df = get_collections(pks=collection_pks)
     95     images_df = get_images(collection_pks=collection_pks)
---> 96     combined_df = pandas.merge(images_df, collections_df, how='left', on='collection_id',suffixes=('_image', '_collection'))
     97     return combined_df
     98 

/Users/filo/anaconda/lib/python2.7/site-packages/pandas/tools/merge.pyc in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy)
     36                          right_index=right_index, sort=sort, suffixes=suffixes,
     37                          copy=copy)
---> 38     return op.get_result()
     39 if __debug__:
     40     merge.__doc__ = _merge_doc % '\nleft : DataFrame'

/Users/filo/anaconda/lib/python2.7/site-packages/pandas/tools/merge.pyc in get_result(self)
    184 
    185     def get_result(self):
--> 186         join_index, left_indexer, right_indexer = self._get_join_info()
    187 
    188         ldata, rdata = self.left._data, self.right._data

/Users/filo/anaconda/lib/python2.7/site-packages/pandas/tools/merge.pyc in _get_join_info(self)
    271              right_indexer) = _get_join_indexers(self.left_join_keys,
    272                                                  self.right_join_keys,
--> 273                                                  sort=self.sort, how=self.how)
    274 
    275             if self.right_index:

/Users/filo/anaconda/lib/python2.7/site-packages/pandas/tools/merge.pyc in _get_join_indexers(left_keys, right_keys, sort, how)
    459 
    460     # get left & right join labels and num. of levels at each location
--> 461     llab, rlab, shape = map(list, zip( * map(fkeys, left_keys, right_keys)))
    462 
    463     # get flat i8 keys from label lists

/Users/filo/anaconda/lib/python2.7/site-packages/pandas/tools/merge.pyc in _factorize_keys(lk, rk, sort)
    621     rizer = klass(max(len(lk), len(rk)))
    622 
--> 623     llab = rizer.factorize(lk)
    624     rlab = rizer.factorize(rk)
    625 

pandas/hashtable.pyx in pandas.hashtable.Int64Factorizer.factorize (pandas/hashtable.c:15733)()

ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

API to allow uploads

Would it be possible to expose an API to allow uploads, assuming a valid key pair and all that?

Add filtering capabilities to collection fetching

According to API docs (http://neurovault.org/api-docs),

Returns a json file containing a list of dictionaries with information corresponding to each collection stored in NeuroVault. Results can be filtered by specifying the name, DOI or owner of the collection.

Parameters: name, DOI, owner

example: neurovault.org/api/collections/?DOI=10.1016/j.neurobiolaging.2012.11.002

This filtering should also be available through pyneurovault.
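As a minimal sketch of what such filtering could look like, the helper below builds a filtered collections URL from the three documented parameters (name, DOI, owner). The function name build_collections_url and its keyword arguments are hypothetical and not part of pyneurovault; only the base URL and parameter names come from the text above.

```python
from urllib.parse import urlencode

# Base URL as it appears in the issue text.
API_BASE = "http://neurovault.org/api"


def build_collections_url(name=None, doi=None, owner=None):
    """Hypothetical helper: build a collections URL with optional filters."""
    filters = {"name": name, "DOI": doi, "owner": owner}
    # Keep only the filters the caller actually supplied.
    params = {key: value for key, value in filters.items() if value is not None}
    params["format"] = "json"
    # urlencode percent-escapes reserved characters such as the "/" in a DOI.
    return "%s/collections/?%s" % (API_BASE, urlencode(params))


url = build_collections_url(doi="10.1016/j.neurobiolaging.2012.11.002")
print(url)
```

Leaving every filter as None yields the plain collections listing, so an existing fetch function could accept these arguments without changing its default behaviour.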

Non-ascii characters prevent export

Bug: both data frames contain non-ASCII characters that fail to save to any kind of text file. We need to figure out how to handle them, and add an export function.
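One possible direction, sketched here for Python 3 with only the standard library, is to open the output file with an explicit UTF-8 encoding. The helper name export_rows is hypothetical, and plain dictionaries stand in for data frame rows; this is an illustration of the idea, not the package's export function.

```python
import csv
import io
import os
import tempfile


def export_rows(rows, columns, path):
    """Hypothetical helper: write dict rows to a tab-separated UTF-8 file."""
    # An explicit encoding avoids falling back to a platform default codec
    # that may not cover every character in the metadata.
    with io.open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns, delimiter="\t")
        writer.writeheader()
        for row in rows:
            writer.writerow(row)


# Demo: write one row containing a non-ASCII character, then read it back.
demo_path = os.path.join(tempfile.mkdtemp(), "collections.tsv")
export_rows(
    [{"collection_id": 457, "name": "caf\u00e9 study"}],
    ["collection_id", "name"],
    demo_path,
)
with io.open(demo_path, encoding="utf-8") as handle:
    exported = handle.read()
```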

Retrieving collections by pk returns images, not collections

https://github.com/NeuroVault/pyneurovault/blob/master/pyneurovault/api.py#L99
def get_collections(self,pks=None):
calls
get_json_df("collections",pks,limit=1).

https://github.com/NeuroVault/pyneurovault/blob/master/pyneurovault/utils.py#L75
def get_json_df(data_type, pks=None, limit=1000):
calls
tmp = get_url("http://neurovault.org/api" "/images/%s/?format=json" % pk)
when pk is not None.

I believe "images" in that URL should be replaced by the variable data_type.
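A minimal sketch of the proposed change, assuming the URL layout shown above. The helper name build_item_url is hypothetical; in the package the same substitution would happen inside get_json_df.

```python
# Base URL as it appears in the issue text.
API_BASE = "http://neurovault.org/api"


def build_item_url(data_type, pk):
    """Hypothetical helper: URL for a single record of the given type."""
    # data_type is interpolated instead of the hard-coded "images" segment.
    return "%s/%s/%s/?format=json" % (API_BASE, data_type, pk)


collection_url = build_item_url("collections", 457)
image_url = build_item_url("images", 457)
```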

Tag NeuroVault images with Cognitive Atlas

Larger goal: we should be able to use the contrast_definition to propose a good hypothesis for what contrast_definition_cogatlas is, and then figure out a clever way to get it checked by the authors of the study (e.g., have a function that does the tagging and then emails the authors of the paper to ask, "hey, this is what you meant, right?").

** Don't worry, I will not do anything without discussion first! We must honor user contributions and not spam. Just putting thoughts here.

data table export

  • add export of combined images and collections
  • "row numbers" are confusing; the first column should be the collection_id (this is likely because it is not set as the index in the tables).

Use programmatic paths in examples

Right now, examples don't work because they use hard-coded paths. Build paths programmatically (in the worst case, relative to os.getcwd()) so that the examples work out of the box.
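As a rough illustration, a small helper can resolve example paths against a base directory and fall back to the current working directory. The name example_path and its base_dir argument are hypothetical.

```python
import os


def example_path(*parts, base_dir=None):
    """Hypothetical helper: join path parts under base_dir (default: cwd)."""
    # Fall back to the current working directory when no base is given.
    base = base_dir if base_dir is not None else os.getcwd()
    return os.path.join(base, *parts)


# Resolved against wherever the example is run from.
resampled_dir = example_path("resampled")
```

An example script could pass os.path.dirname(os.path.abspath(__file__)) as base_dir to resolve paths relative to its own location instead.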

Mysterious characters not parsing from json

When using the API with urllib2 to read the json into a data structure, there seem to be some invalid characters that lead to an error:

nv = api.NeuroVault()
Extracting NeuroVault images meta data...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vsochat/.local/lib/python2.6/site-packages/pyneurovault-0.1.0-py2.6.egg/pyneurovault/api.py", line 55, in __init__
    self.images = self.get_images()
  File "/home/vsochat/.local/lib/python2.6/site-packages/pyneurovault-0.1.0-py2.6.egg/pyneurovault/api.py", line 73, in get_images
    images = DataJson("http://neurovault.org/api/images/?format=json")
  File "/home/vsochat/.local/lib/python2.6/site-packages/pyneurovault-0.1.0-py2.6.egg/pyneurovault/utils.py", line 46, in __init__
    self.json = self.get_json()
  File "/home/vsochat/.local/lib/python2.6/site-packages/pyneurovault-0.1.0-py2.6.egg/pyneurovault/utils.py", line 55, in get_json
    return urllib2.urlopen(self.url).read().encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 21126: ordinal not in range(128)

This needs to be addressed! We don't have much control over what users copy, paste, or write in their text boxes, so it is probably best to deal with it here. We need a solution that is reasonably fast and handles such characters.
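The traceback shows the response body being passed through .encode("utf-8"). In Python 2, calling encode on a byte string first decodes it implicitly as ASCII, which would explain the UnicodeDecodeError. A minimal sketch of the opposite direction, decoding the bytes before parsing, is shown below in Python 3 syntax. It assumes the API serves UTF-8, and the helper name parse_json_bytes is hypothetical.

```python
import json


def parse_json_bytes(raw_bytes):
    """Hypothetical helper: decode a UTF-8 response body and parse it as JSON."""
    # Bytes from the network are decoded (bytes -> text), not encoded.
    text = raw_bytes.decode("utf-8")
    return json.loads(text)


# 0xe3 is the lead byte of many three-byte UTF-8 sequences, e.g. U+3042.
body = b'{"name": "\xe3\x81\x82"}'
record = parse_json_bytes(body)
```

If some responses turn out not to be valid UTF-8, passing errors="replace" to decode would trade exactness for robustness.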

cannot import 'images_from_collections'

I am getting this import error. Is it because I am using Python 3.10?

ImportError: cannot import name 'images_from_collections' from 'pyneurovault.api' (/Users/saurabh.ranjan/opt/anaconda3/envs/brain/lib/python3.10/site-packages/pyneurovault/api.py)

Downloading images from a specific collection does not work

nv.download_and_resample("/tmp/", "/Applications/fsl/data/standard/MNI152_T1_2mm_brain.nii.gz", collection_ids=[457])


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-5dd535d694c7> in <module>()
----> 1 nv.download_and_resample("/tmp/", "/Applications/fsl/data/standard/MNI152_T1_2mm_brain.nii.gz", collection_ids=[457])

/Users/filo/anaconda/lib/python2.7/site-packages/pyneurovault/api.pyc in download_and_resample(self, dest_dir, target, collection_ids, image_ids)
    162     resampled_path = os.path.join(dest_dir, "resampled")
    163     mkdir_p(resampled_path)
--> 164     combined_df = self.get_images_with_collections_df()
    165     # If the user has specified specific images
    166     if image_ids:

/Users/filo/anaconda/lib/python2.7/site-packages/pyneurovault/api.pyc in get_images_with_collections_df(self)
    117   def get_images_with_collections_df(self):
    118     """Downloads metadata about images/statistical maps stored in NeuroVault and enriches it with metadata of the corresponding collections. The result is returned as a pandas DataFrame"""
--> 119     collections_df = self.get_collections_df()
    120     images_df = self.get_images_df()
    121     combined_df = pd.merge(images_df, collections_df, how='left', on='collection_id',suffixes=('_image', '_collection'))

/Users/filo/anaconda/lib/python2.7/site-packages/pyneurovault/api.pyc in get_collections_df(self)
    109   def get_collections_df(self):
    110     """Return just collections data frame"""
--> 111     return self.collections.data
    112 
    113   def get_images_df(self):

/Users/filo/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in __getattr__(self, name)
   1976                 return self[name]
   1977             raise AttributeError("'%s' object has no attribute '%s'" %
-> 1978                                  (type(self).__name__, name))
   1979 
   1980     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'data'

Lazy read of collections and images

Currently, api.NeuroVault fetches collections and images on object construction. This is very slow (~1 minute), and may not be relevant to the user.

    def __init__(self):
        self.collections = self.get_collections()      
        self.images = self.get_images()                
        print self

I suggest only downloading these data when needed. If you really want the current behavior to remain available, I suggest toggling it with a constructor flag (perhaps preload=True or on_demand=False).
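A minimal sketch of this suggestion follows. The class LazyNeuroVault is hypothetical and not the actual api.NeuroVault; the fetch callables passed to the constructor stand in for the real download methods so that the example runs without network access.

```python
class LazyNeuroVault(object):
    """Hypothetical sketch of on-demand loading; not the pyneurovault class."""

    def __init__(self, fetch_collections, fetch_images, preload=False):
        # The fetch callables stand in for the real download methods.
        self._fetch_collections = fetch_collections
        self._fetch_images = fetch_images
        self._collections = None
        self._images = None
        if preload:
            # Opt in to the current behaviour: download everything up front.
            self._collections = fetch_collections()
            self._images = fetch_images()

    @property
    def collections(self):
        # Download on first access, then reuse the cached result.
        if self._collections is None:
            self._collections = self._fetch_collections()
        return self._collections

    @property
    def images(self):
        if self._images is None:
            self._images = self._fetch_images()
        return self._images


calls = []


def fake_collections():
    calls.append("collections")
    return ["collection metadata"]


def fake_images():
    calls.append("images")
    return ["image metadata"]


nv = LazyNeuroVault(fake_collections, fake_images)
```

With preload left at its default, constructing the object triggers no downloads; each table is fetched once, on first access.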

Download image metadata has key error

  images = api.get_images(collection_pks=collections.collection_id.tolist())

  KeyError                                  Traceback (most recent call last)
  <ipython-input-19-8a09080e4037> in <module>()
  ---> 94     images['collection'] = images['collection'].apply(lambda x: int(x.split("/")[-2]))
