pavics-sdi's People

Contributors

aulemahal, bstdenis, chaamc, davidcaron, fmigneault, fmigneault-crim, huard, matprov, mishaschwartz, pre-commit-ci[bot], tlogan2000, tlvu, zeitsperre

pavics-sdi's Issues

Add keywords to processes

The next version of pywps will support keywords in process metadata (alongside abstract, etc.)
geopython/pywps#306

When this is done, we can use keywords to search for processes in the UI.
Note that this is not in the WPS 1.0 standard, only in 2.0, so we should not expect all WPS service providers to include keywords.
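
As a rough illustration of how the UI could search once keywords are available, here is a minimal sketch in pure Python. The metadata structure and field names below are assumptions for illustration, not the actual pywps or UI data model; providers without keywords (e.g. WPS 1.0 services) simply never match.

```python
# Hypothetical process metadata, as the UI might hold it after parsing
# capabilities (field names are assumptions, not the real pywps schema).
PROCESSES = [
    {"identifier": "subset_countries", "keywords": ["subset", "polygon"]},
    {"identifier": "nc_merge", "keywords": ["netcdf", "merge"]},
    {"identifier": "spatial_analog", "keywords": []},  # WPS 1.0 provider: no keywords
]

def search_processes(processes, keyword):
    """Return identifiers of processes whose keyword list contains `keyword`.

    Processes without keywords never match, which is the expected behaviour
    for WPS 1.0 providers that cannot expose keywords.
    """
    keyword = keyword.lower()
    return [p["identifier"] for p in processes
            if keyword in (k.lower() for k in p.get("keywords", []))]

print(search_processes(PROCESSES, "subset"))  # ['subset_countries']
```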

File aggregation issues

The catalog includes an aggregation of two apparently identical files:
(screenshot)

Also, the aggregate name does not seem to match the individual file names. How does this work?

Make useful netCDF datasets available on PAVICS

  • Select datasets most likely to be used in the coming year (Blaise decides unilaterally)
  • Copy them to THREDDS and index them in catalog
  • Set appropriate permissions
  • Ask permissions to make restricted datasets public (David)
  • Explain process in docs

A priori, and unless someone finds this a really bad idea, I'd fill 80% of the "data" disk space with source data. When user space usage goes over 90%, we buy new disks.

Add OSM basemaps to PAVICS

I was thinking it would be good to add a few more choices for basemaps. OpenStreetMap is free and has many different layers (roads, standard, humanitarian). Is the process complicated for adding WMS services? Looking into this.

Mechanism to inform UI of map interaction

At the moment the UI relies on process input identifiers to determine its behavior. For example, it knows the process requires a map interaction if (typename, featureid) are found in the input parameters, so if an input's name matches some pattern, clicking on the map will feed information into the form.

I suggest we find a more robust way to tell the UI that an input requires a map interaction. This will likely require a refactoring of the processes, but I think it's worth it. One possibility is to replace the typename and featureid fields with a single "polygon" field, with mime type "application/gml+xml". When the user selects a region in the UI, the input is fed the GML description of the polygon obtained from GeoServer (or its URL; this is not yet clear to me).

This would allow us to implement other types of map interaction, such as a bounding box selector, using the same mechanism.
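
To make the proposal concrete, here is a minimal sketch of the idea: the UI decides whether an input needs a map interaction from its declared format rather than from its identifier. The descriptor structure and field names are hypothetical illustrations, not the actual DescribeProcess parsing.

```python
# Hypothetical input descriptors, as the UI might build them from a
# DescribeProcess response (field names are assumptions).
GML = "application/gml+xml"

def needs_map_interaction(wps_input):
    """Decide from the declared formats, not from the identifier's name."""
    return GML in wps_input.get("supported_formats", [])

old_style = {"identifier": "typename", "supported_formats": ["text/plain"]}
new_style = {"identifier": "polygon", "supported_formats": [GML]}

print(needs_map_interaction(old_style), needs_map_interaction(new_style))  # False True
```

The advantage of keying on the mime type is that any future input (bounding box, point selection) only needs to declare the right format to opt in, with no naming convention to maintain.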

Combine PAVICS search with ESGF search

Write a WPS that combines the Solr response from the ESGF search with the Solr response from the PAVICS search.

One outstanding issue is the availability of OPeNDAP/WMS links in ESGF. Since these protocols are not always available, should we fall back to the fileserver as much as possible, and in which cases? Do we allow a mix of OPeNDAP and fileserver links as inputs to a WPS? (This is already possible in some processes that use the opendap_or_download function; see https://github.com/Ouranosinc/flyingpigeon/blob/pavics/flyingpigeon/handler_common.py.)

GeoServer should be updated to 2.13+

Seeing as the version of GeoServer bundled in PAVICS (v2.9) is around 2.5 years old, it would be good to update it to a more recent release.

Some of the improvements over the past 2.5 years include better performance for reading/writing/displaying shapefiles, PostgreSQL/PostGIS geometries, and NetCDF, as well as better management and UI. @tlogan2000 and I suggest updating to at least 2.13, as read/write/display performance for GeoPackages (a newer/better OGC standard than shapefiles/GML) is on par with SQL databases.

Rethink user permission system

Félix Gagnon-Grenier [3:09 PM]
OK. For the record, I believe that in a phase 2 of PAVICS (if there is a phase 2), it would be essential to rethink the session management and user authentication system.

Can't close PAVICS-frontend panes via check-box

I realize that within the PAVICS-frontend interface, I haven't been able to close panes by clicking the check-box in their upper-right corner. Only clicking the option-wheel in the bottom left of the screen allows me to open and close panes.

I have tried performing the same actions on other servers, with the same results. I've also attempted this using some VMs I had on hand (Ubuntu 16.04, openSUSE Tumbleweed, Arch), again with the same results. Everything seems to work fine on Windows, so it may be OS-related.

screenshot from 2018-03-15 13-37-59

Unable to start local FlyingPigeon following instructions for VM dev environment

Following the dev setup instructions at https://github.com/Ouranosinc/pavics-sdi/blob/master/docs/source/dev/contributing.rst, the FlyingPigeon make start command does not result in a functioning local WPS.

The make status output (see below) after attempting to start indicates an nginx problem:

Supervisor status ...
bin/supervisorctl status
flyingpigeon RUNNING pid 6865, uptime 0:00:32
nginx FATAL Exited too quickly (process log may have details)

Error message multiplication

To reproduce:

  • Go to pluvier.crim.ca
  • Go into the workspace
  • Click any dataset
  • Click Visualize
  • An error about ncWMS2 appears
  • Click on the map multiple times

(screenshot)

Create terms of use / permissions mechanism for data sets hosted on PAVICS

Users need to be made aware of (and accept) the permissions / credits / terms of use for the various datasets on the platform.

Permissions/access to individual datasets can be managed in Magpie; however, even some 'public' datasets can have conditions of use (e.g. the need to acknowledge/cite in any publications, not-for-profit use only, etc.).

We need to create a mechanism for users to consult and agree to these types of conditions for the ensemble of data on the platform.

We also need to establish an official / standard way to update this each time a new dataset is added.

Allow search for a set of available variables (logical AND)

The ESGF search only supports logical AND between different facets, and multiple entries for the same facet (e.g. multiple variables) are combined as a logical OR (unless they carry a != restriction, in which case it reverts to a logical AND).

Once we have the result from the Solr search, we would need to go through that JSON structure and remove entries that do not satisfy our logical AND. This also requires additional syntax in the inputs; perhaps constraints=model:MRCC5,variable+:pr,variable+:tas could be implemented?
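
The post-filtering step could be sketched as follows. The document structure below is a simplified stand-in for the real Solr response, kept only to show the logical-AND filter itself:

```python
# Simplified stand-in for Solr result documents: each dataset lists the
# variables it provides under a "variable" facet.
def filter_all_variables(docs, required):
    """Keep only datasets that provide ALL requested variables (logical AND)."""
    required = set(required)
    return [d for d in docs if required.issubset(d.get("variable", []))]

docs = [
    {"id": "ds1", "variable": ["pr", "tas"]},
    {"id": "ds2", "variable": ["pr"]},
    {"id": "ds3", "variable": ["tas", "pr", "tasmax"]},
]
print([d["id"] for d in filter_all_variables(docs, ["pr", "tas"])])  # ['ds1', 'ds3']
```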

Populating climate data on Boreas

In the past day, I've transferred over the following datasets:

  • CORDEX: AWIx, CCCMAx, DMIx, MGOx, MOHCx, SMHIx, ULgx, UQAMx
  • NCEP: NARR, Reanalysis2

And currently transferring over:

  • CMIP5: CCCMA

Once the last one is in, I'll see about running the crawler to update data.

Distinguish between original data and data on which operations have been made

The issue is that searching the catalog for (e.g.) a given simulation for a given model also returns files derived from that dataset (e.g. various subsets, or other operations that do not alter variable facets).

The easy solution would be to tag all derived data using a single facet (e.g. derived=true). This should be in the derived netCDF files' global attributes, and so it involves either modifying all processes (not practical) or adding this operation to our publish mechanism (note that the wps_outputs directory was recently removed from the catalog crawler, so those files are not searchable until they are published).

The more complicated solution (but more descriptive in terms of metadata) would be to have the appropriate facets change as operations are made, e.g. domain=Quebec after a subset. This would also require changing many processes, or a stricter publishing process.
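
The facet logic of the easy solution could be sketched as below. Attribute names here are assumptions; a real implementation would write these as netCDF global attributes at publish time (e.g. with netCDF4), whereas this sketch only manipulates a plain dict:

```python
# Sketch of the publish-time tagging step: stamp derived outputs with a single
# facet so the crawler can distinguish them from source data.
# Attribute names ("derived", "derived_operation") are assumptions.
def tag_derived(global_attrs, operation=None):
    """Return a copy of a file's global attributes marked as derived data."""
    attrs = dict(global_attrs)
    attrs["derived"] = "true"
    if operation:  # optionally record what was done, e.g. "subset:Quebec"
        attrs["derived_operation"] = operation
    return attrs

attrs = tag_derived({"project": "CMIP5", "model": "MRCC5"}, operation="subset:Quebec")
print(attrs["derived"], attrs["derived_operation"])  # true subset:Quebec
```

Centralizing this in the publish mechanism (rather than in each process) keeps the tagging consistent and avoids touching every process handler.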

Cookies are missing when calling phoenix from koa server

When launching a workflow on outarde, I got a 401 Unauthorized error even though a user was logged in. Permissions look correctly set in Magpie for the user group: THREDDS write/read on the top-level node and all WPS access activated.

0 0%: Error: Not authorized to access this resource.

After adding permissions for anonymous on all WPS execute actions (especially malleefowl!), the process no longer got blocked by permissions.

Mismatch between search criteria and data shown

To reproduce on pavics.ouranos.ca
Search Datasets

  • Project: ClimEx
  • Frequency: day
  • Variable: tas

The first weird thing is that the Variable drop-down menu now displays: poids, tasmax and tasmin. It's not clear why these variables would show up. There are 10 results, whose names suggest they store tasmin and tasmax. I would expect tas only.

For all of these datasets, the interface displays:
Variable: valeur des poids utilises pour faire la statistique (i.e. "value of the weights used to compute the statistic")

Selecting a dataset and clicking "Add selection" raises an error (Failed to fetch).

If I then select tasmax in addition to tas, the dashboard says "Found 8700 total files in 5 results" but displays no search results.

If I click on X to remove tasmax, the drop-down menus for all the search facets die.

Frontend Features non-functional

I logged onto a few PAVICS servers at CRIM and Ouranos and noticed that many of the features are not operational today.

  • Facets can be searched on some servers but return no results; other servers report no facets found.
  • When visualizing nc files associated with existing projects (e.g. on Boreas), the temporal slider is non-functional.
  • Workflows that are already set up (Demo Projects) and point to existing data seem to run fine until they request data from the server or rely on a process from flyingpigeon:

0:04:05 10%: Subsetting-proc10-data2: ERROR: Process error: wps_subset_WFS.py._handler Line 79
Traceback (most recent call last):
  File "/opt/birdhouse/src/flyingpigeon/flyingpigeon/processes/wps_subset_WFS.py", line 76, in _handler
    result = wfs_common(request, response, mode='subsetter')
  File "/opt/birdhouse/src/flyingpigeon/flyingpigeon/handler_common.py", line 76, in wfs_common
    raise Exception(msg)
Exception: Failed to fetch features.
geoserver: http://boreas.ouranos.ca:8087/geoserver/wfs
typename: ADMINBOUNDARIES:canada_admin_boundaries
features: [u'canada_admin_boundaries.5']
Unknown namespace [ADMINBOUNDARIES] (code=NoApplicableCode, locator=None)

GeoServer on Boreas currently returns an error on GetCapabilities()

Path: https://pavics.ouranos.ca/geoserver/wms?request=GetCapabilities

javax.xml.transform.TransformerException: java.lang.RuntimeException: Can't obtain Envelope of Layer-Groups: Error occurred trying to write out metadata for layer group: Duree_Enneigment_ObsIMS_CMIP5_RCP45

Vector SLD Style enforcement seems dependent on zoom level

I caught an issue when setting SLD styles in GeoServer. It seems that the direct SLD output from QGIS adds a styling option that GeoServer doesn't like, leading to the following behaviour:

screenshot from 2018-05-14 15-44-13
screenshot from 2018-05-14 15-44-23

The grey with black lines is the default style, while the blue lines represent the SLD style I've set manually (though I might change this for aesthetic reasons).

Cannot access xml status of flyingpigeon

Login as another user?

I've noticed something else strange today, specific to pavics.ouranos.ca.

Logging in with my credentials for the server/geoserver consistently fails:
screenshot from 2018-04-03 11-19-57

However, if I wait 2-3 minutes after a failed login attempt, I am automatically signed in as @moulab88:
screenshot from 2018-04-03 11-20-56

Is this normal behaviour?

Environment variables for deployment

Could we set ourselves a task / reminder to rethink the deployment method and the management of environment variables? It is really error-prone at the moment.

  • HOSTNAME must be entered manually on the command line for each bird
  • the config lives in the PAVICS git repository, but we cannot really use git to deploy new docker-compose configs, because the manual process that substitutes the bird names changes the content of every file (so we have to reset and re-run the hostname-substitution process)
  • committing environment-specific variables is generally not exactly good practice. Besides the problems mentioned above, it complicates the use of git (we have to use a private repository).

Having an uncommitted config file that is read by docker-compose (I don't know whether that is possible), or from which the VMs' environment variables are declared when the VMs are created, would solve the problems mentioned above, and would also be a step toward being able to open-source the docker-compose setup.

Server deployment issues should not be publicly accessible

Something that came up in a discussion between @dbyrns and me is that many of the questions/issues here that reference servers and deployment criteria are viewable by the general public. This isn't a great security practice.

As it stands, we have a few issues (e.g. #1, #34, #35) where specific servers at Ouranos or CRIM are identified. Would it be better to discuss the deployment/server issues that arise in the protected "PAVICS" repo?

In the meantime, I've looked into the possibility of migrating issues to another repo, and GitHub explicitly does not allow this natively. The available workarounds (e.g. a Google-made tool and a Chrome extension) introduce greater security concerns, so we may need to migrate the issues manually if we choose to do this.

Thoughts?

Review metadata of files on THREDDS server

Some datasets do not have the project facet and may not be fully compliant with CF conventions.
I think this will cause problems down the line.
I suggest we scan the files on the THREDDS server using https://github.com/ioos/compliance-checker, report the issues found, and start cleaning metadata where it can easily be done. For more complex problems, I suggest removing the files from THREDDS.

The checks would ideally cover both the netCDF metadata and the file name (see pyessv).

- [ ] Select scanner library
- [ ] Create process to scan the library and report the issues in an HTML document
- [ ] Automate the process and host the HTML output
- [ ] Put in place scripts that help with metadata modifications and keep a trace of what was done
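
As a sketch of the file-name half of the check, the snippet below validates names against a controlled pattern. The regex is purely illustrative; a real check would load the vocabulary from pyessv or the project's data reference syntax rather than hard-coding it:

```python
import re

# Illustrative pattern only: variable_frequency_model_experiment.nc
# (a real deployment would derive this from pyessv / the project DRS).
FILENAME_RE = re.compile(
    r"^(?P<variable>[a-z]+)_(?P<frequency>day|mon|yr)_"
    r"(?P<model>[A-Za-z0-9-]+)_(?P<experiment>[a-z0-9]+)\.nc$"
)

def check_filename(name):
    """Return the parsed facets as a dict, or None if the name is non-compliant."""
    m = FILENAME_RE.match(name)
    return m.groupdict() if m else None

print(check_filename("tas_day_MRCC5_rcp85.nc"))
print(check_filename("output.nc"))  # None
```

Running such a check over the THREDDS tree and collecting the None results would give the list of files needing cleanup for the HTML report.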

ocgis calls in processes write to the same file

The ocgis outputs in many processes are written to (e.g.) /opt/birdhouse/var/lib/pywps/ocgis_output/ocgis_output.nc, which creates conflicts when multiple processes run at the same time.

Everything that is written to disk within a process (i.e. intermediate files) needs to go in a tempdir (set via ocgis.env.DIR_OUTPUT); otherwise, two similar processes running at the same time will try to write to the same file.

This problem is currently (at least) in the following processes:

  • Ouranos Pub Indicators
  • All subsets/averager (within handler_common.py)
  • nc_merge
  • analogs?

The intermediate files then need to be deleted before leaving the process. (There could be a buildup of files from processes that crash; a future step would be to add a cron job within the container to delete old tempdirs.)
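
The per-process tempdir pattern could be sketched as follows, in pure Python. In the real processes the private directory would be assigned to ocgis.env.DIR_OUTPUT (not used here, to keep the sketch self-contained); the fake_process below is a stand-in for an actual handler:

```python
import os
import shutil
import tempfile

def run_with_private_output_dir(work):
    """Give each process run its own temporary output directory and clean it up.

    In real handlers, out_dir would be assigned to ocgis.env.DIR_OUTPUT before
    any ocgis call; the finally block prevents buildup from crashed runs.
    """
    out_dir = tempfile.mkdtemp(prefix="ocgis_output_")
    try:
        return work(out_dir)
    finally:
        shutil.rmtree(out_dir, ignore_errors=True)

def fake_process(out_dir):
    # Stand-in for a handler: writes the usual fixed-name output file,
    # but inside its own private directory.
    path = os.path.join(out_dir, "ocgis_output.nc")
    with open(path, "w") as f:
        f.write("netcdf placeholder")
    return path

# Two "concurrent" runs no longer collide on the same path.
p1 = run_with_private_output_dir(fake_process)
p2 = run_with_private_output_dir(fake_process)
print(p1 != p2)  # True
```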

Security certificate for pavics.ouranos.ca

This server could not prove that it is pavics.ouranos.ca; its security certificate is from boreas.ouranos.ca. This may be caused by a misconfiguration or an attacker intercepting your connection.

Add hummingbird to PAVICS

Reasonable? Hummingbird provides some CDO tools and a remapping process that is on our list of deliverables to CANARIE. The other option is to add a regridding process in FP.
Please advise.

Process errors in flyingpigeon

Trying to understand why tests pass locally but fail when run remotely.

Using a thredds URL as input for a process run locally, the error log shows:

[2018-05-02 09:35:46,284] [ERROR] line=275 module=Process Process error: wps_subset_countries.py._handler Line 98 [Errno 2] No such file or directory

Line 98 is a call to

import os

def rename_complexinputs(complexinputs):
    """
    TODO: this method is just a dirty workaround to rename input files according to the url name.
    """
    resources = []
    for inpt in complexinputs:
        # Build the new path next to the original file: renaming to a bare
        # filename moves the file into the current working directory, which
        # may not be writable or may not exist (a likely cause of the
        # [Errno 2] above).
        new_name = os.path.join(os.path.dirname(inpt.file), inpt.url.split('/')[-1])
        os.rename(inpt.file, new_name)
        resources.append(os.path.abspath(new_name))
    return resources

Synchronization with Birdhouse documentation

As discussed at the workshop, it may be worthwhile to compare and push general updates/changes to the Birdhouse documentation so we can streamline PAVICS-specific documentation and reduce redundancy. I'll be looking into this.
