Power Analytics and Visualization for Climate Science - Spatial Data Infrastructure
Home Page: https://pavics-sdi.readthedocs.io
The next version of pywps will support keywords in process metadata (alongside the abstract, etc.):
geopython/pywps#306
Once this lands, we can use keywords to search for processes in the UI.
Note that keywords are not part of the WPS 1.0 standard, only 2.0, so we should not expect all WPS service providers to include them.
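Once keywords are exposed, the UI-side search could be a simple filter. A minimal sketch, assuming the UI holds parsed GetCapabilities records as plain dicts (the field names here are hypothetical, not the actual frontend model):

```python
def search_processes(processes, keyword):
    """Return the processes whose keyword list matches `keyword`,
    case-insensitively.

    `processes` is a list of dicts such as the UI might build after
    parsing GetCapabilities. Processes from WPS 1.0 providers may lack
    keywords entirely, so we default to an empty list instead of failing.
    """
    kw = keyword.lower()
    return [p for p in processes
            if kw in (k.lower() for k in p.get("keywords", []))]
```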
Add enough ensemble members to showcase distributed parallel task scheduling.
Add river network to geoserver so it can be shown on the map.
We should have a service that takes a netCDF file and converts part of it to XLS or CSV.
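The CSV half of such a service is simple once the data is in memory. A sketch in which the time series is assumed to have already been read from the netCDF file (e.g. with netCDF4 or xarray; the reading step is omitted and is an assumption of this sketch):

```python
import csv
import io


def table_to_csv(times, values, varname):
    """Serialize one time series extracted from a netCDF variable to CSV.

    `times` and `values` are plain Python lists, as they would look after
    reading a slice of the file (reading not shown).
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["time", varname])  # header row: time axis + variable name
    writer.writerows(zip(times, values))
    return buf.getvalue()
```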
A priori, and unless someone finds this a really bad idea, I'd fill 80% of the "data" disk space with source data. When user space goes over 90%, we buy new disks.
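The 90% trigger could be watched with a small check rather than by hand. A sketch using only the standard library (the path and threshold are placeholders for whatever mount point holds the "data" volume):

```python
import shutil


def disk_alert(path, threshold=0.90):
    """Return True when the filesystem holding `path` is fuller than
    `threshold` -- the 90% trigger proposed above for ordering new disks."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total > threshold
```

A cron job could call this daily and e-mail the admins when it returns True.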
I was thinking it would be good to add a few more basemap choices. OpenStreetMap is free and has several different layers (roads, standard, humanitarian). Is the process for adding WMS services complicated? Looking into this.
Write and test a workflow analyzing the CLIMEX ensemble.
ncWMS is too slow; it would be nice to test other options that can serve WMS layers from netCDF files faster.
At the moment the UI relies on process input identifiers to determine its behavior. For example, it infers that a process requires a map interaction when the (typename, featureid) pair is found among the input parameters; if an input's name matches that pattern, clicking on the map feeds information into the form.
I suggest we find a more robust way to tell the UI that an input requires a map interaction. This will likely require refactoring the processes, but I think it's worth it. One possibility is to replace the typename and featureid fields with a single "polygon" field with mimetype "application/xml+gml". When the user selects a region in the UI, the input is fed the GML description of the polygon obtained from GeoServer (or its URL; this is not yet clear to me).
This would also let us implement other types of map interaction, such as a bounding-box selector, using the same mechanism.
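The current heuristic and the proposed mimetype-based rule can be contrasted in a few lines. A sketch in which the input descriptions are plain dicts (the field names are assumptions, not the actual frontend model):

```python
def needs_map_interaction(inputs):
    """Proposed rule: an input asks for a map interaction when its mime
    type says so, instead of matching magic identifier names."""
    return any(i.get("mimetype") == "application/xml+gml" for i in inputs)


def needs_map_interaction_legacy(inputs):
    """Current heuristic: look for the (typename, featureid) pair among
    the input identifiers."""
    names = {i["identifier"] for i in inputs}
    return {"typename", "featureid"} <= names
```

The mimetype-based rule survives renaming the input, which is the point of the refactoring.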
Write a WPS process that combines the Solr response from the ESGF search with the Solr response from the PAVICS search.
One outstanding issue is the availability of OPeNDAP/WMS links in ESGF. Since these protocols are not always available, should we fall back to the fileserver as much as possible, and in which cases? Do we allow a mix of OPeNDAP and fileserver links as inputs to WPS? (This is already possible in some processes that use the opendap_or_download function; see https://github.com/Ouranosinc/flyingpigeon/blob/pavics/flyingpigeon/handler_common.py.)
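The merge itself could be a straightforward de-duplication over the two docs lists. A sketch assuming Solr-style response dicts keyed by id (real ESGF and PAVICS schemas differ in detail, so treat the field names as assumptions):

```python
def merge_solr_responses(esgf, pavics):
    """Merge the `docs` lists of two Solr-style JSON responses, dropping
    duplicate entries by their `id` field, with ESGF entries taking
    precedence when both catalogs return the same record."""
    seen = set()
    docs = []
    for doc in esgf["response"]["docs"] + pavics["response"]["docs"]:
        if doc["id"] not in seen:
            seen.add(doc["id"])
            docs.append(doc)
    return {"response": {"numFound": len(docs), "docs": docs}}
```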
Seeing as the version of GeoServer bundled in PAVICS (v2.9) is around 2.5 years old, it would be good to update it to a more recent release.
Improvements over those 2.5 years include better performance for reading/writing/displaying shapefiles, PostgreSQL/PostGIS geometries and NetCDF files, as well as better management tools and UI. @tlogan2000 and I suggest updating to at least 2.13+, since read/write/display performance for GeoPackages (a newer and better OGC standard than shapefiles/GML) is on par with SQL databases.
Félix Gagnon-Grenier [3:09 PM]
ok. For the record, I think that in a phase 2 of PAVICS (if there is a phase 2) it would be essential to rethink the session-management and user-authentication system.
See the layers available at https://outarde.crim.ca/
Here is a branch: https://github.com/huard/flyingpigeon/tree/kddm_bc
What I haven't checked so far is whether it installs correctly, i.e. whether all dependencies are listed.
I've realized that within the PAVICS-frontend interface, I cannot close panes by clicking the check-box in their upper-right corner. Only clicking the option wheel at the bottom left of the screen lets me open and close panes.
I have tried the same actions on other servers, with the same results. I've also tried some VMs I had on hand (Ubuntu 16.04, openSUSE Tumbleweed, Arch), again with the same results. Everything seems to work fine on Windows, so it may be OS-related.
Following the dev setup instructions at https://github.com/Ouranosinc/pavics-sdi/blob/master/docs/source/dev/contributing.rst, the FlyingPigeon 'make start' command does not result in a functioning local WPS.
The 'make status' output (see below) after attempting to start indicates an nginx problem:
Supervisor status ...
bin/supervisorctl status
flyingpigeon RUNNING pid 6865, uptime 0:00:32
nginx FATAL Exited too quickly (process log may have details)
Use hydroclimatic indicators from DEH.
Test whether it is possible to import WPS processes into GIS software (ArcGIS, QGIS)
Users need to be made aware (and accept) permissions / credits / terms of use for the various datasets on the platform.
Permissions/access to individual datasets can be managed in Magpie; however, even some 'public' datasets can have conditions of use (e.g. the need to acknowledge/cite them in any publications, not-for-profit use only, etc.).
We need to create a mechanism for users to consult and agree to these kinds of conditions for all of the data on the platform.
We also need to establish an official/standard way to update this each time a new dataset is added.
The ESGF search only supports logical AND between different facets, and multiple entries for the same facet (e.g. multiple variables) are combined as logical OR (unless they are != restrictions, in which case they revert to logical AND).
Once we have the result from the Solr search, we would need to go through the JSON structure and remove entries that do not satisfy our logical AND. This also requires additional input syntax; perhaps constraints=model:MRCC5,variable+:pr,variable+:tas could be implemented?
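A post-filter over the Solr docs could restore the missing AND. A sketch of the proposed syntax, where the "+" suffix marks values that must all be present; the doc structure is a simplified assumption, and for brevity this sketch treats every constraint as required:

```python
def parse_constraints(s):
    """Parse e.g. "model:MRCC5,variable+:pr,variable+:tas" into a mapping
    of facet -> set of required values. The trailing "+" convention comes
    from the proposal above; it is not an ESGF feature."""
    required = {}
    for item in s.split(","):
        facet, value = item.split(":", 1)
        required.setdefault(facet.rstrip("+"), set()).add(value)
    return required


def post_filter(docs, required):
    """Keep only docs whose (possibly multi-valued) facets contain every
    required value, restoring the logical AND that ESGF search lacks."""
    def ok(doc):
        for facet, values in required.items():
            have = doc.get(facet, [])
            if isinstance(have, str):  # single-valued facets come as strings
                have = [have]
            if not values <= set(have):
                return False
        return True
    return [d for d in docs if ok(d)]
```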
In the past day, I've transferred the following datasets:
And I am currently transferring:
Once the last one is in, I'll see about running the crawler to update the data.
Topic: The UI form to enter input parameters for processes
When an input defines "allowed_values", create a drop-down list to select the input value.
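A sketch of the widget-selection rule, with the WPS input description modeled as a plain dict (the field names are assumptions, not the actual frontend model):

```python
def widget_for_input(desc):
    """Pick a form widget from a WPS input description.

    An input that defines `allowed_values` becomes a drop-down list;
    everything else falls back to a free-text field.
    """
    if desc.get("allowed_values"):
        return {"widget": "dropdown", "options": list(desc["allowed_values"])}
    return {"widget": "text"}
```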
The issue is that searching the catalog for (e.g.) a given simulation from a given model also returns files derived from that dataset (e.g. various subsets, or other operations that do not alter the variable facets).
The easy solution would be to tag all derived data with a single facet (e.g. derived=true). This should live in the derived NetCDF files' global attributes, so it means either modifying all processes (not practical) or adding the operation to our publish mechanism (note that the wps_outputs directory was recently removed from the catalog crawler, so those files are not searchable until they are published).
The more complicated solution (but more descriptive in terms of metadata) would be to have the appropriate facets change as operations are applied, e.g. domain=Quebec after a subset. This would also require changing many processes, or a stricter publishing process.
Redirect pavics.ouranos.ca to boreas.ouranos.ca
When launching a workflow on outarde, I got a 401 Unauthorized error even though a user was logged in. Permissions look correctly set in Magpie for the user group: thredds write/read on the top-level node and all WPS access activated.
0 0%: Error: Not authorized to access this resource.
After adding permission for anonymous on all WPS execute actions (especially malleefowl!), the process no longer got blocked by permissions.
Are there still show-stoppers to getting there?
To reproduce on pavics.ouranos.ca:
Search Datasets.
The first weird thing is that the Variable drop-down menu now displays poids, tasmax and tasmin. It's not clear why these variables would show up. There are 10 results whose names suggest they store tasmin and tasmax; I would expect tas only.
For all of these datasets, the interface displays:
Variable: valeur des poids utilises pour faire la statistique
Selecting a dataset and "Add selection" raises an error (Failed to fetch).
If I then select tasmax in addition to tas, the dashboard says "Found 8700 total files in 5 results" but displays no search result.
If I click on X to remove tasmax, the drop-down menus for all the search facets die.
I logged on to a few PAVICS servers at CRIM and Ouranos and noticed that many of the features are not operational today.
0:04:05 10%: Subsetting-proc10-data2: ERROR: Process error: wps_subset_WFS.py._handler Line 79
Traceback (most recent call last):
  File "/opt/birdhouse/src/flyingpigeon/flyingpigeon/processes/wps_subset_WFS.py", line 76, in _handler
    result = wfs_common(request, response, mode='subsetter')
  File "/opt/birdhouse/src/flyingpigeon/flyingpigeon/handler_common.py", line 76, in wfs_common
    raise Exception(msg)
Exception: Failed to fetch features.
  geoserver: http://boreas.ouranos.ca:8087/geoserver/wfs
  typename: ADMINBOUNDARIES:canada_admin_boundaries
  features: [u'canada_admin_boundaries.5']
  Unknown namespace [ADMINBOUNDARIES] code=NoApplicableCode locator=None
Path: https://pavics.ouranos.ca/geoserver/wms?request=GetCapabilities
javax.xml.transform.TransformerException: java.lang.RuntimeException: Can't obtain Envelope of Layer-Groups: Error occurred trying to write out metadata for layer group: Duree_Enneigment_ObsIMS_CMIP5_RCP45
https://github.com/Ouranosinc/pavics-sdi/blame/master/docs/source/dev/installation.rst#L30
I'm following along trying to build PAVICS via docker-compose. The issue arises when pointing to the birdhouse folder, which is not included in pavics-sdi. Is this suggesting that birdhouse should already be in place, with the birds preinstalled via git clone?
Should there be a command somewhere to build the docker-compose image at some point before cloning pavics-sdi? Very confused.
I caught an issue when setting SLD styles in GeoServer. It seems that the SLD output straight from QGIS adds a styling option that GeoServer doesn't like, leading to the following behaviour:
The grey with black lines is the default style, while the blue lines represent the SLD style I've set manually (though I might change this for aesthetic reasons).
See EarthCube projects: https://www.earthcube.org/groups
I suggest we identify and connect with projects that are relevant to our activities.
Then try to open the XML status link, like this one: https://pavics.ouranos.ca/wpsoutputs/flyingpigeon/bef66456-5532-11e8-a6aa-0242ac12000c.xml
I got a 404.
I've noticed something else strange today, specific to pavics.ouranos.ca.
Logging in with my credentials for the server/GeoServer consistently fails:
However, after waiting 2-3 minutes following a failed login attempt, I am automatically signed in as @moulab88:
Is this normal behaviour?
Could we set ourselves a task / a reminder to rethink the deployment method and the management of environment variables? It's really error-prone at the moment.
Having an uncommitted config file that is read by docker-compose (I don't know whether that's possible), or from which the VMs' environment variables are declared when the VMs are created, would solve the problems mentioned above, in addition to working toward the possibility of open-sourcing the docker-compose setup.
Something that came up in a discussion between @dbyrns and me is that many of the questions/issues here that reference servers and deployment criteria are viewable by the general public. This isn't a great security practice.
As it stands, we have a few issues (e.g. #1, #34, #35) where specific servers at Ouranos or CRIM are identified. Would it be better to discuss the deployment/server issues that arise in the protected "PAVICS" repo?
In the meantime, I've looked into the possibility of migrating issues to another repo, and GitHub explicitly does not support this natively. The workarounds (e.g. a Google-made tool and a Chrome extension) introduce more serious security concerns, so we may need to migrate the issues manually if we choose to do this.
Thoughts?
https://github.com/ioos/compliance-checker
Add the CF-check to our procedure for adding new files to PAVICS once things have settled.
Some datasets lack the project facet and may not be fully compliant with CF conventions.
I think this will cause problems down the line.
I suggest we scan the files on the THREDDS server using https://github.com/ioos/compliance-checker, report the issues found, and start cleaning metadata where it can easily be done. For more complex problems, I suggest removing the files from THREDDS.
The checks would ideally cover both the netCDF metadata and the file name (see pyessv).
- [ ] Select a scanner library
- [ ] Create a process to scan the library and report the issues in an HTML document
- [ ] Automate the process and host the HTML output
- [ ] Put in place scripts that help with metadata modifications and keep a trace of what was done
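For the scan step, one option is to shell out to the ioos/compliance-checker CLI. The helper below only builds the command line; the flag names should be confirmed against the installed version, and the file and report names are placeholders:

```python
def cc_command(ncfile, report, checks=("cf",)):
    """Build an ioos/compliance-checker invocation for one netCDF file,
    writing an HTML report. The exact flag set is an assumption and
    should be checked against the installed compliance-checker version."""
    cmd = ["compliance-checker"]
    for check in checks:
        cmd += ["--test", check]
    cmd += ["--format", "html", "--output", report, ncfile]
    return cmd
```

The scanning process could then run each command with subprocess.run(cmd, check=True) and publish the collected reports.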
The ocgis outputs in many processes are written to (e.g.) /opt/birdhouse/var/lib/pywps/ocgis_output/ocgis_output.nc, which creates conflicts when multiple processes run at the same time.
Everything written to disk within a process (i.e. intermediate files) needs to go in a tempdir (set via ocgis.env.DIR_OUTPUT); otherwise two similar processes running at the same time will try to write to the same file.
This problem currently exists in (at least) the following processes:
Ouranos Pub Indicators
All subsets/averagers (within handler_common.py)
nc_merge
analogs?
The intermediate files then need to be deleted before leaving the process. (There could be a buildup of files from processes that crash; a future step would be to add a cronjob within the container to delete old tempdirs.)
Add material in docs/source/dev/geoserver.rst
This server could not prove that it is pavics.ouranos.ca; its security certificate is from boreas.ouranos.ca. This may be caused by a misconfiguration or an attacker intercepting your connection.
Reasonable? It provides some CDO tools and a remapping process that is on our list of deliverables to CANARIE. The other option is to add a regridding process in FP.
Please advise.
Trying to understand why tests pass locally but fail when run remotely.
Using a THREDDS URL as input for a process run locally, the error log shows:
[2018-05-02 09:35:46,284] [ERROR] line=275 module=Process Process error: wps_subset_countries.py._handler Line 98 [Errno 2] No such file or directory
Line 98 is a call to:
import os


def rename_complexinputs(complexinputs):
    """
    TODO: this method is just a dirty workaround to rename input files
    according to the URL name.
    """
    resources = []
    for inpt in complexinputs:
        # Rename alongside the original file instead of into the current
        # working directory: os.rename() to a bare filename is a likely
        # cause of the [Errno 2] above when the CWD is not the directory
        # holding inpt.file.
        new_name = os.path.join(os.path.dirname(inpt.file),
                                inpt.url.split('/')[-1])
        os.rename(inpt.file, new_name)
        resources.append(os.path.abspath(new_name))
    return resources
Pull the regridding process from birdhouse/flyingpigeon into PAVICS. Connect service to CANARIE registry.
As discussed at the workshop, it may be worthwhile to compare and push general updates/changes to the Birdhouse documentation so we can streamline PAVICS-specific documentation and reduce redundancy. I'll be looking into this.