fair-data / fairdatapoint
Provide machine-readable descriptions of your data assets
Home Page: https://www.research-software.nl/software/fairdatapoint
License: Apache License 2.0
I tried to run the fdp example.com 80 --db='http://dbpedia.org/sparql' command described at https://github.com/NLeSC/fairdatapoint#deploy-without-docker, but got an error about the graphviz executable fdp.
What did work was
fdp-run localhost 8888 --db='http://dbpedia.org/sparql'
Can the README be updated?
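For anyone hitting the same error: fdp is also the name of a Graphviz layout program, so the shell may be resolving that executable instead of the FAIR Data Point one. A minimal sketch (not part of the project; the helper name locate is made up for illustration) for checking what each command name resolves to on PATH:

```python
import shutil


def locate(command):
    """Return the full path the shell would run for `command`, or None."""
    return shutil.which(command)


# Compare the two command names mentioned in this issue.
for name in ("fdp", "fdp-run"):
    print(name, "->", locate(name) or "not found on PATH")
```

If fdp resolves to a Graphviz install directory, that would be consistent with the error above.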
Check if the path to the swagger.json config was modified.
Hi, we are trying to publicly deploy the current version of the FDP API using Docker.
I am using an nginx-proxy in Docker that usually routes most services without issues: https://github.com/nginx-proxy/nginx-proxy
Note that the previous docker-compose was working a few months ago (at the time there was no grlc search API)
The grlc search worked without issues, it is accessible at http://search.fdp.semanticscience.org
But the FDP server is not accessible:
My first guess is that it is related to the routing configuration. I deployed on HTTP and made sure not to redirect to HTTPS, which was causing issues in the previous deployment a few months ago.
search.fdp.semanticscience.org and fdp.semanticscience.org are defined using the same DNS configuration
Since the fdpsearch is accessible it seems to be exclusively related to the FDP server routing configuration (not to the global nginx-proxy)
Here is my docker-compose.prod.yml:
version: '3'
services:
  # See nginx-proxy docs: https://github.com/nginx-proxy/nginx-proxy/wiki/List-of-Supported-Environment-Variables
  fdp:
    command: gunicorn -b 0.0.0.0:80 "fdp.fdp:create_app('fdp.semanticscience.org', 80, 'http://virtuoso:8890/sparql')"
    environment:
      - VIRTUAL_HOST=fdp.semanticscience.org
      - VIRTUAL_PORT=80
      - HTTPS_METHOD=noredirect
  fdpsearch:
    environment:
      - GRLC_SERVER_NAME=search.fdp.semanticscience.org
      - VIRTUAL_HOST=search.fdp.semanticscience.org
      - VIRTUAL_PORT=8088
      - HTTPS_METHOD=noredirect
  virtuoso:
    volumes:
      - /data/fdp-python/virtuoso:/data
And the docker-compose.yml, almost unchanged from yours:
version: '3'
services:
  fdp:
    image: "nlesc/fairdatapoint"
    restart: unless-stopped
    depends_on:
      - virtuoso
  fdpsearch:
    image: "nlesc/fairdatapoint-search"
    restart: unless-stopped
    # ports:
    #   - "8088:8088"
    environment:
      - GRLC_SPARQL_ENDPOINT=http://virtuoso:8890/sparql
      - DEBUG=false
    depends_on:
      - virtuoso
  virtuoso:
    image: "tenforce/virtuoso"
    restart: unless-stopped
    # ports:
    #   - "8890:8890"
    #   - "1111:1111"
    environment:
      - SPARQL_UPDATE=true
    # volumes:
    #   - ./data/virtuoso:/data
And I am starting the container with a simple up:
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d --force-recreate
Do you have an idea why the FDP API is not properly exposed? As we can see, the grlc Search API is properly exposed with the same routing configuration, so maybe the FDP API requires some specific proxy configuration.
Nginx-proxy documentation at https://github.com/nginx-proxy/nginx-proxy/wiki/List-of-Supported-Environment-Variables
Note that I can easily enable https using letsencrypt, but I was thinking to use http first for debugging (but maybe the FDP should be deployed as HTTPS?)
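One quick sanity check, offered as a sketch rather than a diagnosis: confirm that the port gunicorn binds to is the same port nginx-proxy is told to forward to via VIRTUAL_PORT. The helper names below are made up for illustration; the command and environment values are copied from the compose override above.

```python
import re


def gunicorn_bind_port(command):
    """Return the port from a gunicorn '-b host:port' option, or None if absent."""
    match = re.search(r"(?:-b|--bind)[ =]\S*:(\d+)", command)
    return int(match.group(1)) if match else None


def env_to_dict(environment):
    """Turn a compose-style list of 'KEY=value' strings into a dict."""
    return dict(item.split("=", 1) for item in environment)


def ports_agree(command, environment):
    """True when VIRTUAL_PORT is set and equals the gunicorn bind port."""
    virtual_port = env_to_dict(environment).get("VIRTUAL_PORT")
    return virtual_port is not None and int(virtual_port) == gunicorn_bind_port(command)


# Values taken from the fdp service in docker-compose.prod.yml.
command = (
    "gunicorn -b 0.0.0.0:80 "
    "\"fdp.fdp:create_app('fdp.semanticscience.org', 80, 'http://virtuoso:8890/sparql')\""
)
environment = [
    "VIRTUAL_HOST=fdp.semanticscience.org",
    "VIRTUAL_PORT=80",
    "HTTPS_METHOD=noredirect",
]

print("bind port matches VIRTUAL_PORT:", ports_agree(command, environment))
```

For the configuration shown here the two ports agree, so a port mismatch does not look like the cause; the check only rules that one possibility out.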
Here are the docker-compose logs after starting the docker-compose and querying once the Search API (resolving) and once the FDP API (not resolving):
fdp_1 | [2020-12-10 14:34:13 +0000] [1] [INFO] Starting gunicorn 20.0.4
fdp_1 | [2020-12-10 14:34:13 +0000] [1] [INFO] Listening at: http://0.0.0.0:80 (1)
fdp_1 | [2020-12-10 14:34:13 +0000] [1] [INFO] Using worker: sync
fdp_1 | [2020-12-10 14:34:13 +0000] [7] [INFO] Booting worker with pid: 7
fdp_1 | /usr/local/lib/python3.8/site-packages/SPARQLWrapper/Wrapper.py:880: UserWarning: keepalive support not available, so the execution of this method has no effect
fdp_1 | warnings.warn("keepalive support not available, so the execution of this method has no effect")
fdp_1 | this operation accepts multiple content types, using text/turtle
fdp_1 | this operation accepts multiple content types, using text/turtle
fdp_1 | this operation accepts multiple content types, using text/turtle
virtuoso_1 | Converting environment variables to ini file
virtuoso_1 | Finished converting environment variables to ini file
virtuoso_1 |
virtuoso_1 | Thu Dec 10 2020
virtuoso_1 | 14:34:12 OpenLink Virtuoso Universal Server
virtuoso_1 | 14:34:12 Version 07.20.3229-pthreads for Linux as of Aug 21 2019
virtuoso_1 | 14:34:12 uses parts of OpenSSL, PCRE, Html Tidy
virtuoso_1 | 14:34:12 Database version 3126
virtuoso_1 | 14:34:12 SQL Optimizer enabled (max 1000 layouts)
virtuoso_1 | 14:34:14 Compiler unit is timed at 0.000164 msec
virtuoso_1 | 14:34:14 Roll forward started
virtuoso_1 | 14:34:14 Roll forward complete
virtuoso_1 | 14:34:14 Checkpoint started
virtuoso_1 | 14:34:15 Checkpoint finished, log reused
virtuoso_1 | 14:34:17 HTTP/WebDAV server online at 8890
virtuoso_1 | 14:34:17 Server online at 1111 (pid 1)
fdpsearch_1 | Mapping UID/GID
fdpsearch_1 | nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
fdpsearch_1 | nginx: configuration file /etc/nginx/nginx.conf test is successful
fdpsearch_1 | Restarting nginx: nginx.
fdpsearch_1 | [2020-12-10 14:34:26 +0000] [71] [INFO] Starting gunicorn 19.6.0
fdpsearch_1 | [2020-12-10 14:34:26 +0000] [71] [INFO] Listening at: http://0.0.0.0:8088 (71)
fdpsearch_1 | [2020-12-10 14:34:26 +0000] [71] [INFO] Using worker: sync
fdpsearch_1 | [2020-12-10 14:34:26 +0000] [105] [INFO] Booting worker with pid: 105
fdpsearch_1 | [2020-12-10 14:34:26 +0000] [106] [INFO] Booting worker with pid: 106
fdpsearch_1 | [2020-12-10 14:34:26 +0000] [107] [INFO] Booting worker with pid: 107
docker version
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
docker-compose version 1.25.0, build 0a186604
docker-py version: 4.1.0
CPython version: 3.7.4
OpenSSL version: OpenSSL 1.1.0l 10 Sep 2019
On CentOS 7
ODEX-FAIRDataPoint -> fairdatapoint
also relates to #28
The Swagger UI is enabled, but not installed and not documented.
When starting fdp I got the following warning:
The swagger_ui directory could not be found.
Please install connexion with extra install: pip install connexion[swagger-ui]
or provide the path to your local installation by passing swagger_path=<your path>
After I ran pip install connexion[swagger-ui], I could access http://localhost/ui/, so I could test the API using web forms.
Could connexion[swagger-ui] be added to the dependencies, and can it be documented why/how the Swagger UI can be accessed?
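A rough sketch of what the request amounts to, under stated assumptions: the requirement string below is the one from the warning message, and the check assumes the swagger-ui extra provides a module importable as swagger_ui_bundle. The names INSTALL_REQUIRES and swagger_ui_available are illustrative and not taken from the project's setup.py.

```python
import importlib.util

# Requirement string as given in the warning; where it would go in the
# project's packaging metadata is an assumption.
INSTALL_REQUIRES = ["connexion[swagger-ui]"]


def swagger_ui_available():
    """Report whether the Swagger UI bundle can be imported.

    Assumes the extra installs a module named swagger_ui_bundle.
    """
    return importlib.util.find_spec("swagger_ui_bundle") is not None


if not swagger_ui_available():
    print("Swagger UI not installed; try: pip install", INSTALL_REQUIRES[0])
```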
We've tried both Python 2 and Python 3. The example (Pet Store) endpoints are shown instead of the FDP endpoints.
It is currently not possible to search for datasets that meet a certain criterion (e.g. of a given type). This means the client of the FDP needs to download the full catalog and browse through it. The FDP would be more useful if it provided such functionality.
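To make the request concrete, here is a sketch of the kind of query such a search feature could run against the SPARQL endpoint behind the FDP. It is an illustration only: the function name, the choice of dcat:theme as the criterion, and the example URI are assumptions, not part of the FDP API.

```python
DCAT = "http://www.w3.org/ns/dcat#"

# Characters that are not allowed inside a SPARQL IRI reference.
_FORBIDDEN = set('<>"{}|^`\\ ')


def build_dataset_query(theme_uri, limit=100):
    """Build a SPARQL query selecting datasets that have the given theme."""
    if any(ch in _FORBIDDEN for ch in theme_uri):
        raise ValueError("theme_uri contains characters not allowed in an IRI")
    return (
        f"PREFIX dcat: <{DCAT}>\n"
        "SELECT ?dataset WHERE {\n"
        "  ?dataset a dcat:Dataset ;\n"
        f"           dcat:theme <{theme_uri}> .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


# Hypothetical theme URI, used only to show the shape of the query.
print(build_dataset_query("http://example.org/theme/genomics"))
```

The client would then receive only the matching dataset URIs instead of downloading the full catalog.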
There are several compatibility issues when using python3.
See the code https://github.com/NLeSC/ODEX-FAIRDataPoint/blob/5d7007ae7d22a0cf85aa95cc686028794cf79345/Dockerfile#L22
Build the image and push to nlesc-fairdatapoint.
Reason: We don't want to grant Docker Hub (almost) 'admin' privileges over all NLeSC repos.
The SHACL validation report requires users to understand the SHACL language, which is unfortunate.
For example, users just need to know whether a required term is missing, but the report gives too much detail and never states explicitly that a term is missing; it only reports a minCount=1 constraint.
A wrapper around the validation report is needed to give a simple, human-friendly message.
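A minimal sketch of such a wrapper, assuming the validation results have already been extracted from the report into plain dictionaries. The dictionary layout and the function name are hypothetical; the keys mirror the SHACL result properties sh:focusNode, sh:resultPath, sh:sourceConstraintComponent and sh:resultMessage.

```python
SH = "http://www.w3.org/ns/shacl#"


def summarize(results):
    """Turn extracted SHACL validation results into short messages."""
    messages = []
    for result in results:
        component = result.get("sourceConstraintComponent", "")
        path = result.get("resultPath", "unknown property")
        node = result.get("focusNode", "unknown node")
        if component == SH + "MinCountConstraintComponent":
            # A violated minCount constraint means a required term is absent.
            messages.append(f"Required term {path} is missing on {node}.")
        else:
            # Fall back to whatever detail the report offers.
            detail = result.get("resultMessage") or component or "no detail given"
            messages.append(f"Problem with {path} on {node}: {detail}")
    return messages


# Hypothetical extracted result for a dataset without a title.
example = [{
    "focusNode": "http://example.org/dataset/1",
    "resultPath": "http://purl.org/dc/terms/title",
    "sourceConstraintComponent": SH + "MinCountConstraintComponent",
}]
for line in summarize(example):
    print(line)
```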
We are using bottle to start the fdp app, even though the app is built with flask:
https://github.com/NLeSC/fairdatapoint/blob/550a001d3a70cd43c41e8536f4c45a9e2456af7b/bin/fdp-run#L17
I think it would make more sense to use flask.run and get rid of the bottle dependency.
With the current specification of the FDP, it looks like one data set that is indexed on two different FDPs could end up having different identifiers (and thus could incorrectly be assumed to be two separate data sets).
E.g: I could deposit my-dataset on FDP1 as:
http://fdp1.org/fdp/dataset/my-dataset-X
and on FDP2 as:
http://fdp2.org/fdp/dataset/my-dataset-Y
This should perhaps be adjusted in the specification (not necessarily in this repo).
Add dcmitype:Dataset in addition to dcat:Dataset.
It takes time for (3rd party) plugins to be updated, which affects our FDP.
For the moment (2020-06-09), we have to use rdflib 4.2.2 and its plugins.
v0.6.0
@prefix ex: <http://example.com/ns#> is used in the fdp/schema/*.shacl files.
This URI should be replaced when there is a real website to archive FDP schemas.
Change http://purl.org/dc/terms/version to http://purl.org/dc/terms/hasVersion