gbif / pygbif
GBIF Python client
Home Page: https://pygbif.readthedocs.io/en/latest/
License: MIT License
from http://lists.gbif.org/pipermail/api-users/2016-June/000344.html
A new repatriation filter allows searching for records whose publishing country differs from the country in which the record was made. For example, the following link shows the repatriated data of Costa Rica: http://api.gbif.org/v1/occurrence/search?COUNTRY=CR&REPATRIATED=true
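A minimal sketch of building that query URL in Python with only the standard library. The helper name `repatriated_search_url` is hypothetical; the endpoint and the parameter names and casing are copied from the link above.

```python
from urllib.parse import urlencode

# Endpoint taken from the example link in the mailing-list post.
BASE_URL = "http://api.gbif.org/v1/occurrence/search"


def repatriated_search_url(country, repatriated=True):
    """Build an occurrence search URL for repatriated records (hypothetical helper)."""
    params = {"COUNTRY": country, "REPATRIATED": str(repatriated).lower()}
    return BASE_URL + "?" + urlencode(params)


print(repatriated_search_url("CR"))
```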
Add a method to return the taxon URL (as in the old and deprecated python-gbif client library, https://github.com/matagus/python-gbif).
See http://lists.gbif.org/pipermail/api-users/2016-June/000344.html. These facets make it possible to get quick counts for any occurrence search.
From https://github.com/sckott/pygbif/blob/master/demos/pygbif-intro.ipynb
The example
from pygbif import occurrences
occurrences.get(key = 252408386)
throws an exception:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/pygbif/occurrences/get.py", line 17, in get
    out = gbif_GET(url, {}, **kwargs)
  File "/Library/Python/2.7/site-packages/pygbif/gbifutils.py", line 19, in gbif_GET
    out.raise_for_status()
  File "/Library/Python/2.7/site-packages/requests/models.py", line 862, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://api.gbif.org/v1/occurrence/252408386
I would like to trigger a download based on a query structured as follows (just an example):
basisOfRecords in ['HUMAN_OBSERVATION', 'LITERATURE']
AND
country
AND
year >= 1000
AND
year <= 2019
AND
hasCoordinate = TRUE
If I try something like this:
test_download = occurrences.download(['basisOfRecord = OBSERVATION',
                                      'basisOfRecord = LITERATURE',
                                      'basisOfRecord = PRESERVED_SPECIMEN',
                                      'basisOfRecord = MATERIAL_SAMPLE',
                                      'basisOfRecord = UNKNOWN',
                                      'basisOfRecord = HUMAN_OBSERVATION',
                                      'country = BE',
                                      'year >= 1000',
                                      'year <= 2019',
                                      'hasCoordinate = TRUE'],
                                     pred_type = 'and')
I get a valid but empty occurrence.txt file, because an observation cannot have multiple values of basisOfRecord. This is clearly a query with multiple levels of predicates: an OR within the basisOfRecord values and a general AND across all query keys.
Via the rgbif R package I can do it easily. Below is an example with taxon keys and countries in vectors, where values are comma-separated:
rgbif::occ_download(
  paste0("taxonKey = ", paste(taxon_keys, collapse = ",")),
  paste0("country = ", paste(countries, collapse = ",")),
  paste0("hasCoordinate = TRUE")
)
Unfortunately, I cannot pass multiple values in this way to pygbif. I am quite new to pygbif, so I am probably missing something. However, I didn't find any example in the documentation tackling such situations.
Python version:
3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 18:50:55) [MSC v.1915 64 bit (AMD64)]
pygbif version:
> print(pygbif.__version__):
0.3.0
Any help is welcome. Thanks.
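For reference, the nested query described in this issue can be sketched as a plain predicate dictionary in the JSON shape that the GBIF occurrence download API accepts: an outer "and" containing an inner "or" over the basisOfRecord values. The helper names `equals` and `build_predicate` are hypothetical, and how (or whether) pygbif 0.3.0 can submit such a payload is not shown here.

```python
import json


def equals(key, value):
    """One 'equals' predicate (hypothetical helper)."""
    return {"type": "equals", "key": key, "value": value}


def build_predicate(basis_of_record, country, year_min, year_max):
    """Outer AND over all keys, inner OR over the basisOfRecord values."""
    return {
        "type": "and",
        "predicates": [
            {"type": "or",
             "predicates": [equals("BASIS_OF_RECORD", b) for b in basis_of_record]},
            equals("COUNTRY", country),
            {"type": "greaterThanOrEquals", "key": "YEAR", "value": str(year_min)},
            {"type": "lessThanOrEquals", "key": "YEAR", "value": str(year_max)},
            equals("HAS_COORDINATE", "true"),
        ],
    }


predicate = build_predicate(["HUMAN_OBSERVATION", "LITERATURE"], "BE", 1000, 2019)
print(json.dumps(predicate, indent=2))
```

The inner "or" is what the flat list with pred_type = 'and' cannot express.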
======================================================================
FAIL: test the addition of another predicate combiner
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/sckott/github/sac/pygbif/test/test-occurrences-download_request.py", line 43, in test_alternative_main_type
'send_notification': 'true'})
AssertionError: {'creator': 'name', 'notification_address': ['em[94 chars] []}} != {'created': 2018, 'creator': 'name', 'notificati[94 chars]rue'}
- {'created': 2019,
? ^
+ {'created': 2018,
? ^
'creator': 'name',
'notification_address': ['email'],
'predicate': {'predicates': [], 'type': 'or'},
'send_notification': 'true'}
======================================================================
FAIL: test the creation of the predicate class
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/sckott/github/sac/pygbif/test/test-occurrences-download_request.py", line 31, in test_gbif_creation
'send_notification': 'true'})
AssertionError: {'creator': 'name', 'notification_address': ['em[95 chars] []}} != {'created': 2018, 'creator': 'name', 'notificati[95 chars]rue'}
- {'created': 2019,
? ^
+ {'created': 2018,
? ^
'creator': 'name',
'notification_address': ['email'],
'predicate': {'predicates': [], 'type': 'and'},
'send_notification': 'true'}
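The diffs above suggest that the expected dictionaries hard-code the year 2018 while the code under test reports the current year. A minimal sketch of a test that computes the expected year at run time instead; `build_payload` is a hypothetical stand-in for the real class under test, not pygbif's implementation.

```python
import datetime
import unittest


def build_payload(creator, email, main_type="and"):
    """Hypothetical stand-in for the download request object under test."""
    return {
        "created": datetime.date.today().year,
        "creator": creator,
        "notification_address": [email],
        "predicate": {"predicates": [], "type": main_type},
        "send_notification": "true",
    }


class TestPayload(unittest.TestCase):
    def test_created_uses_current_year(self):
        # Compute the year when the test runs rather than hard-coding it.
        expected = {
            "created": datetime.date.today().year,
            "creator": "name",
            "notification_address": ["email"],
            "predicate": {"predicates": [], "type": "or"},
            "send_notification": "true",
        }
        self.assertEqual(build_payload("name", "email", "or"), expected)
```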
Do Pythonistas think we need something like https://github.com/ropensci/finch for Python, or not so much?
For requests arguments, list the acceptable set in the docs and allow only those to be passed to the request call.
For GBIF arguments (e.g., faceting parameters), use them, then pop them out of kwargs before kwargs are passed to the requests call, as in rgbif.
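A rough sketch of that split, assuming two illustrative allow-lists. The contents of `REQUESTS_ARGS` and `GBIF_ARGS` and the helper name `split_kwargs` are assumptions for illustration, not what pygbif ships.

```python
# Illustrative allow-lists; the real sets would be documented in pygbif.
REQUESTS_ARGS = {"timeout", "headers", "proxies", "verify", "cert", "auth"}
GBIF_ARGS = {"facet", "facetMincount", "facetMultiselect"}


def split_kwargs(kwargs):
    """Separate GBIF query arguments from arguments meant for the requests call."""
    unknown = set(kwargs) - REQUESTS_ARGS - GBIF_ARGS
    if unknown:
        raise TypeError("unsupported arguments: %s" % ", ".join(sorted(unknown)))
    gbif_args = {k: v for k, v in kwargs.items() if k in GBIF_ARGS}
    request_args = {k: v for k, v in kwargs.items() if k in REQUESTS_ARGS}
    return gbif_args, request_args


gbif_args, request_args = split_kwargs({"facet": "country", "timeout": 10})
```

Rejecting unknown names early keeps typos from being silently forwarded to the HTTP layer.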
For now, getting rid of shapely: it is too heavy. Look for a lighter-weight dependency.
Python 3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
pygbif 0.3.0
When querying GBIF using name_suggest, the name Lantanophaga pusillidactyla is not found. Apparently, it has been written in the wrong gender form. However, the GBIF website, given the same query, returns the results for Lantanophaga pusillidactylus.
Is it possible to include this functionality in pygbif?
working on this in rgbif now ropensci/rgbif#362
At https://github.com/sckott/pygbif/blob/master/pygbif/occurrences/download.py#L245-L253 we say that the "in" predicate does not work; this should work now.
see also https://discourse.gbif.org/t/downloading-and-citing-occurrence-data-for-multiple-taxa/1152
Should be doable, as there is no C code, nor any Python 2- or 3-specific code that I can think of.
E.g., from shapely:
https://github.com/Toblerity/Shapely/blob/master/setup.py#L162-L174
then use like https://github.com/Toblerity/Shapely/blob/master/setup.py#L188
A user wants to query a range of months overlapping two different years. It's clear that it's not really possible to do month 11 to month 2, so we probably have to use year-month dates. It doesn't seem possible with year-month dates though. We've tried eventDate with YYYY-MM,YYYY-MM, e.g., 2010-10,2011-02, but that doesn't work. For example:
from pygbif import occurrences
x = occurrences.search(2489603, eventDate="2010-11,2011-12")
dates = [ z['eventDate'] for z in x['results'] ]
the dates are all in 2011
In [31]: min(dates)
Out[31]: '2011-01-02T00:00:00'
In [32]: max(dates)
Out[32]: '2011-04-30T00:00:00'
There are about 5,500 records for this request, and paging through to the end, I believe all dates are in 2011.
@MattBlissett @timrobertson100 any ideas on date range searches?
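Until the server-side behaviour is clarified, one client-side workaround is to filter the returned records by their year and month fields. This is only a sketch: it assumes each record carries integer `year` and `month` keys, and the helper names are hypothetical.

```python
def in_year_month_range(record, start, end):
    """True if the record's (year, month) lies within [start, end], inclusive."""
    year, month = record.get("year"), record.get("month")
    if year is None or month is None:
        return False  # records without a usable date are dropped
    return start <= (year, month) <= end


def filter_by_year_month(results, start, end):
    return [r for r in results if in_year_month_range(r, start, end)]


# Made-up sample records, only to show the behaviour.
sample = [
    {"year": 2010, "month": 10},
    {"year": 2010, "month": 12},
    {"year": 2011, "month": 2},
    {"year": 2011, "month": 4},
    {"month": 1},
]
kept = filter_by_year_month(sample, (2010, 11), (2011, 2))
```

Tuples compare element by element, so (2010, 12) falls between (2010, 11) and (2011, 2) without any date parsing.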
We have a limit of 12,000 characters for a download query; this restriction is imposed by the workflow engine that we use to process downloads.
Hi,
I was considering trying to find a bit of time this year to contribute to pygbif, and in my opinion making the test suite more exhaustive would be a good way to start!
Currently, it seems the pygbif test suite consumes the "real" GBIF API, which is IMHO suboptimal (fragility, some tests are difficult to implement, speed considerations, ...). I was therefore considering (in another project) creating a mock object (or mock server) that could be consumed by pygbif and other similar projects.
If such a project existed, would you be interested in using it to improve the pygbif test suite?
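To illustrate the idea, here is a small sketch using the standard library's unittest.mock in place of the network. `fetch_occurrence` is a hypothetical client function that takes the HTTP getter as a parameter; it is not how pygbif is structured today.

```python
from unittest import mock


def fetch_occurrence(key, http_get):
    """Hypothetical client function; http_get is any callable shaped like requests.get."""
    response = http_get("http://api.gbif.org/v1/occurrence/%s" % key)
    response.raise_for_status()
    return response.json()


# A fake response object that returns canned JSON instead of calling the API.
fake_response = mock.Mock()
fake_response.json.return_value = {"key": 252408386}
fake_get = mock.Mock(return_value=fake_response)

result = fetch_occurrence(252408386, fake_get)
```

A test can then assert on `result` and on how `fake_get` was called, without depending on the live API.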
opened an issue http://dev.gbif.org/issues/browse/POR-3125
Hello, just wanna report a possible typo in the library:
Version:
Python 3.6.2
pygbif 0.2.0
How to reproduce:
from pygbif import registry
registry.installations(data='dataset', uuid='1cbabffe-9073-4007-ba1e-40ebcda6e302')
Traceback:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "foo/bar/python3.6/site-packages/pygbif/registry/installations.py", line 41, in installations
'identifierType': identifierTyp}
NameError: name 'identifierTyp' is not defined
Thank you so much.
Work is on the clitool branch locally.
Not sure whether I'll do this yet, but I have started some work.
Should we support the maps API?
If so, definitely wait for /v2 to come out.
Hi, I'm trying to download occurrence data and run into an exception. Here is my code:
from pygbif import occurrences as occ
key = 2473421
download_info = occ.download(
    user=my_user_name,
    pwd=my_pwd,
    email=my_email,
    queries=["taxonKey = {}".format(key)]
)
if len(download_info) > 0:
    occ.download_get(key=download_info[0])
And the error:
  File "/home/leguilln/workspace/python3.7-env/lib/python3.7/site-packages/pygbif/occurrences/download.py", line 384, in download_get
    raise Exception('download "%s" not of status SUCCEEDED' % key)
Exception: download "0014929-191105090559680" not of status SUCCEEDED
Thank you for your help.
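The error message indicates that the download had not reached status SUCCEEDED when download_get was called, which is expected right after submitting a request. A sketch of a polling loop under that assumption: `wait_for_download` and its `get_status` callable are hypothetical, and the status names other than SUCCEEDED are assumed here.

```python
import time


def wait_for_download(key, get_status, interval=30, max_tries=60, sleep=time.sleep):
    """Poll get_status(key) until it returns 'SUCCEEDED'.

    Returns True on success, False if max_tries is exhausted.
    """
    for _ in range(max_tries):
        status = get_status(key)
        if status == "SUCCEEDED":
            return True
        if status in ("FAILED", "KILLED", "CANCELLED"):  # assumed terminal states
            raise RuntimeError('download "%s" ended with status %s' % (key, status))
        sleep(interval)
    return False
```

With pygbif, `get_status` could be something like `lambda k: occ.download_meta(k)["status"]`, assuming download_meta returns the download metadata as a dictionary with a status field; download_get would then be called only after the function returns True.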
I have the following code to extract coordinates from a map:
from pygbif import maps
x = maps.map(familiaKey = 4690, year = 2000, bin = "hex", country='CO',srs='EPSG:4326',
hexPerTile = 500, style = "classic-noborder.poly",format = ".mvt")
x.response
x.path
x.img # None
import mapbox_vector_tile
result = mapbox_vector_tile.decode(x.response.content)
Next, I inspect the data:
result['occurrence']['features'][0]
# Output
{'geometry': {'coordinates': [[[2425, 2312],
[2423, 2307],
[2417, 2307],
[2415, 2312],
[2417, 2316],
[2423, 2316],
[2425, 2312]]],
'type': 'Polygon'},
'id': 0,
'properties': {'total': 150},
'type': 3}
My problem is with the coordinates, because I need them in this format:
{'geometry': {'coordinates': [[[-75.3, 4.35],
[-73.2423, 4.2307],
[-75.2417, 5.2307],
[-75.2415, 5.2312],
.........
How can I make the transformation? I need the coordinates in this form because I will create a GeoDataFrame with GeoPandas.
This is the link to my notebook in Google Colab:
https://colab.research.google.com/drive/1noZlE6fFq39Qnhml2mTJk5BI2ksYRITA
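One possible approach is a linear rescaling from tile-local integer coordinates to degrees. This sketch rests on several assumptions that should be checked: the tile extent is 4096, the decoded y values increase upwards from the tile's bottom edge, and the geographic bounds of the requested tile are known. The example assumes an EPSG:4326 tile covering the western hemisphere, (-180, -90, 0, 90). The helper names are hypothetical.

```python
def tile_to_lonlat(x, y, bounds, extent=4096):
    """Rescale tile-local (x, y) to (lon, lat) in degrees.

    bounds is (west, south, east, north) of the tile; extent is the tile's
    coordinate range (assumed 4096); y is assumed to increase upwards.
    """
    west, south, east, north = bounds
    lon = west + (x / extent) * (east - west)
    lat = south + (y / extent) * (north - south)
    return lon, lat


def ring_to_lonlat(ring, bounds, extent=4096):
    """Convert one polygon ring of [x, y] pairs."""
    return [list(tile_to_lonlat(x, y, bounds, extent)) for x, y in ring]


# Assumed bounds of the requested tile.
WEST_HEMISPHERE = (-180.0, -90.0, 0.0, 90.0)
print(tile_to_lonlat(2048, 2048, WEST_HEMISPHERE))
```

The converted rings can then be handed to a geometry constructor when building the GeoDataFrame.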
occ.download_get("0000099-140929101555934")
#> NoResultException: content-type did not = 'application/octet-stream; qs=0.5'
Python 3.7
pygbif 0.3.0
In my application, I send a lot of requests to get info from GBIF backbone taxonomy, sometimes with the same parameters. Ultimately, I plan to download the full taxonomic backbone, but in the meantime, I think it would be great to implement some caching mechanism, for instance using requests-cache.
Anyway, great work on this library. Thanks.
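As a stand-in for requests-cache, the idea can be sketched with in-memory memoization from the standard library. This is a different and simpler technique: nothing is persisted and entries never expire. `fake_lookup` is a hypothetical placeholder for a real backbone lookup.

```python
from functools import lru_cache

calls = []  # records how often the underlying lookup actually runs


def fake_lookup(name):
    """Hypothetical placeholder for a network call to the GBIF backbone."""
    calls.append(name)
    return {"canonicalName": name}


# Wrap the lookup so repeated calls with the same argument reuse the result.
cached_lookup = lru_cache(maxsize=1024)(fake_lookup)

first = cached_lookup("Oryctes rhinoceros")
second = cached_lookup("Oryctes rhinoceros")
```

Note that lru_cache requires hashable arguments and returns the same object on a cache hit, so callers should not mutate the returned dictionary.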
When I search for Oryctes rhinoceros within a bounding box for Guam, I get an occurrence record for the Philippines. Using python 2.7/pygbif 0.2.0.
from pygbif import occurrences as occ
occ.search(scientificName='Oryctes rhinoceros',
geometry='POLYGON((144.61 13.22, 144.61 13.66, 144.96 13.66, 144.96 13.22, 144.61 13.22))',
limit=1)
Returns:
{u'count': 3,
u'endOfRecords': False,
u'facets': [],
u'limit': 1,
u'offset': 0,
u'results': [{u'acceptedScientificName': u'Oryctes rhinoceros (Linnaeus, 1758)',
u'acceptedTaxonKey': 4995642,
u'basisOfRecord': u'HUMAN_OBSERVATION',
u'catalogNumber': u'4864787',
u'class': u'Insecta',
u'classKey': 216,
u'collectionCode': u'Observations',
u'coordinateUncertaintyInMeters': 88056.0,
u'country': u'Philippines',
u'countryCode': u'PH',
u'crawlId': 141,
u'datasetKey': u'50c9509d-22c7-4a22-a47d-8c48425ef4a7',
u'datasetName': u'iNaturalist research-grade observations',
u'dateIdentified': u'2016-12-30T15:01:30',
u'day': 23,
u'decimalLatitude': 13.4756,
u'decimalLongitude': 120.851388,
u'eventDate': u'2016-12-23T04:00:00',
...
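While the cause is investigated, a client-side sanity check can drop records whose coordinates fall outside the requested box. This is a sketch only: `in_bbox` is a hypothetical helper, and it tests the reported point without taking coordinateUncertaintyInMeters into account.

```python
def in_bbox(record, min_lon, min_lat, max_lon, max_lat):
    """True if the record's decimal coordinates lie inside the bounding box."""
    lon = record.get("decimalLongitude")
    lat = record.get("decimalLatitude")
    if lon is None or lat is None:
        return False
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


# Bounding box of the Guam polygon used in the search above.
GUAM = (144.61, 13.22, 144.96, 13.66)

# Coordinates of the record returned above.
record = {"decimalLatitude": 13.4756, "decimalLongitude": 120.851388}
inside = in_bbox(record, *GUAM)
```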
see https://www.python.org/dev/peps/pep-0484/
@stijnvanhoey @peterdesmet any complaints if we use type hints? I think it means Python 3.5 or greater.
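For a sense of what this would look like, here is a small function annotated in the PEP 484 style that works on Python 3.5 and later. `build_search_params` is a hypothetical example, not an existing pygbif function.

```python
from typing import Any, Dict, Optional


def build_search_params(taxon_key: Optional[int] = None,
                        limit: int = 300,
                        offset: int = 0) -> Dict[str, Any]:
    """Collect query parameters, leaving out those that are None."""
    params = {"taxonKey": taxon_key, "limit": limit, "offset": offset}
    return {k: v for k, v in params.items() if v is not None}


params = build_search_params(taxon_key=2489603, limit=10)
```

Only function annotations are used here; variable annotations would require Python 3.6.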
subgenusKey
repatriated
see also #216
phylumKey
kingdomKey
classKey
orderKey
familyKey
genusKey
establishmentMeans