
fairdatapoint's People

Contributors

abelsiqueira, anandgavai, arnikz, c-martinez, codacy-badger, cunlianggeng, elboyran, felipez, jspaaks, larsmans, rajaram5, vemonet

fairdatapoint's Issues

Issue deploying the FDP API with Nginx proxy on Docker

Hi, we are trying to deploy the current version of the FDP API publicly using Docker.

I am using nginx-proxy in Docker, which usually routes most services without issues: https://github.com/nginx-proxy/nginx-proxy

Note that the previous docker-compose setup was working a few months ago (at the time there was no grlc search API).

The grlc search works without issues; it is accessible at http://search.fdp.semanticscience.org

But the FDP server is not accessible:

My first guess is that it is related to the routing configuration. I deployed over HTTP and made sure not to redirect to HTTPS, which was causing issues in the previous deployment a few months ago.

search.fdp.semanticscience.org and fdp.semanticscience.org are defined using the same DNS configuration

Since fdpsearch is accessible, the problem seems to be specific to the FDP server's routing configuration (not to the global nginx-proxy).

Here is my docker-compose.prod.yml:

version: '3'
services:
  # See nginx-proxy docs: https://github.com/nginx-proxy/nginx-proxy/wiki/List-of-Supported-Environment-Variables
  fdp:
    command: gunicorn -b 0.0.0.0:80 "fdp.fdp:create_app('fdp.semanticscience.org', 80, 'http://virtuoso:8890/sparql')"
    environment:
      - VIRTUAL_HOST=fdp.semanticscience.org
      - VIRTUAL_PORT=80
      - HTTPS_METHOD=noredirect

  fdpsearch:
    environment:
      - GRLC_SERVER_NAME=search.fdp.semanticscience.org
      - VIRTUAL_HOST=search.fdp.semanticscience.org
      - VIRTUAL_PORT=8088
      - HTTPS_METHOD=noredirect
  
  virtuoso:
    volumes:
      - /data/fdp-python/virtuoso:/data

And here is the docker-compose.yml, almost unchanged from yours:

  • We removed all exposed ports for security, since deploying behind nginx-proxy does not require the ports to be exposed
  • We use an absolute path for persistent data
  • We use docker-compose file version 3 (not 3.8, which is not supported on our server; I could look into upgrading if that could be the issue)

version: '3'
services:
  fdp:
    image: "nlesc/fairdatapoint"
    restart: unless-stopped
    depends_on:
      - virtuoso
  fdpsearch:
    image: "nlesc/fairdatapoint-search"
    restart: unless-stopped
    # ports:
    #   - "8088:8088"
    environment:
      - GRLC_SPARQL_ENDPOINT=http://virtuoso:8890/sparql
      - DEBUG=false
    depends_on:
      - virtuoso
  virtuoso:
    image: "tenforce/virtuoso"
    restart: unless-stopped
    # ports:
    #   - "8890:8890"
    #   - "1111:1111"
    environment:
      - SPARQL_UPDATE=true
    # volumes:
      # - ./data/virtuoso:/data

I am starting the containers with a simple up:

docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d --force-recreate

Do you have an idea why the FDP API is not properly exposed? As we can see, the grlc Search API is properly exposed with the same routing configuration, so maybe the FDP API requires some specific proxy configuration.
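One way to tell a proxy routing problem apart from a DNS problem is to send a request straight to the proxy's address with the Host header set by hand, since nginx-proxy selects the upstream container from that header. The following is a minimal sketch using only the Python standard library. The helper name `build_probe` is hypothetical, and the proxy address `http://127.0.0.1:80/` used in the note below is an assumption for a proxy published on the local machine.

```python
import urllib.request


def build_probe(proxy_url, virtual_host):
    """Build a GET request for proxy_url that carries an explicit Host header.

    The request goes to proxy_url, but the proxy sees virtual_host as the
    requested site, so each VIRTUAL_HOST can be tested without relying on DNS.
    """
    request = urllib.request.Request(proxy_url, method="GET")
    request.add_header("Host", virtual_host)
    return request


def run_probe(request, timeout=10):
    """Send the request and return the HTTP status code.

    Error responses (4xx/5xx) are returned as status codes instead of being
    raised, because the status itself is what we want to inspect.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

Running `run_probe(build_probe("http://127.0.0.1:80/", "fdp.semanticscience.org"))` on the Docker host, and again with `search.fdp.semanticscience.org`, should show whether the proxy answers differently for the two hosts. A 503 would suggest the proxy has no route for that host, while a 502 would suggest the route exists but the container behind it does not respond.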

Nginx-proxy documentation at https://github.com/nginx-proxy/nginx-proxy/wiki/List-of-Supported-Environment-Variables

Note that I can easily enable HTTPS using Let's Encrypt, but I wanted to use HTTP first for debugging (or should the FDP be deployed over HTTPS?).

Docker logs

Here are the docker-compose logs after starting the stack and querying the Search API once (resolving) and the FDP API once (not resolving):

fdp_1        | [2020-12-10 14:34:13 +0000] [1] [INFO] Starting gunicorn 20.0.4
fdp_1        | [2020-12-10 14:34:13 +0000] [1] [INFO] Listening at: http://0.0.0.0:80 (1)
fdp_1        | [2020-12-10 14:34:13 +0000] [1] [INFO] Using worker: sync
fdp_1        | [2020-12-10 14:34:13 +0000] [7] [INFO] Booting worker with pid: 7
fdp_1        | /usr/local/lib/python3.8/site-packages/SPARQLWrapper/Wrapper.py:880: UserWarning: keepalive support not available, so the execution of this method has no effect
fdp_1        |   warnings.warn("keepalive support not available, so the execution of this method has no effect")
fdp_1        | this operation accepts multiple content types, using text/turtle
fdp_1        | this operation accepts multiple content types, using text/turtle
fdp_1        | this operation accepts multiple content types, using text/turtle
virtuoso_1   | Converting environment variables to ini file
virtuoso_1   | Finished converting environment variables to ini file
virtuoso_1   | 
virtuoso_1   | 		Thu Dec 10 2020
virtuoso_1   | 14:34:12 OpenLink Virtuoso Universal Server
virtuoso_1   | 14:34:12 Version 07.20.3229-pthreads for Linux as of Aug 21 2019
virtuoso_1   | 14:34:12 uses parts of OpenSSL, PCRE, Html Tidy
virtuoso_1   | 14:34:12 Database version 3126
virtuoso_1   | 14:34:12 SQL Optimizer enabled (max 1000 layouts)
virtuoso_1   | 14:34:14 Compiler unit is timed at 0.000164 msec
virtuoso_1   | 14:34:14 Roll forward started
virtuoso_1   | 14:34:14 Roll forward complete
virtuoso_1   | 14:34:14 Checkpoint started
virtuoso_1   | 14:34:15 Checkpoint finished, log reused
virtuoso_1   | 14:34:17 HTTP/WebDAV server online at 8890
virtuoso_1   | 14:34:17 Server online at 1111 (pid 1)
fdpsearch_1  | Mapping UID/GID
fdpsearch_1  | nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
fdpsearch_1  | nginx: configuration file /etc/nginx/nginx.conf test is successful
fdpsearch_1  | Restarting nginx: nginx.
fdpsearch_1  | [2020-12-10 14:34:26 +0000] [71] [INFO] Starting gunicorn 19.6.0
fdpsearch_1  | [2020-12-10 14:34:26 +0000] [71] [INFO] Listening at: http://0.0.0.0:8088 (71)
fdpsearch_1  | [2020-12-10 14:34:26 +0000] [71] [INFO] Using worker: sync
fdpsearch_1  | [2020-12-10 14:34:26 +0000] [105] [INFO] Booting worker with pid: 105
fdpsearch_1  | [2020-12-10 14:34:26 +0000] [106] [INFO] Booting worker with pid: 106
fdpsearch_1  | [2020-12-10 14:34:26 +0000] [107] [INFO] Booting worker with pid: 107

Environment

  • docker version
    Client: Docker Engine - Community
    Version: 19.03.8
    API version: 1.40

  • docker-compose version 1.25.0, build 0a186604
    docker-py version: 4.1.0
    CPython version: 3.7.4
    OpenSSL version: OpenSSL 1.1.0l 10 Sep 2019

On CentOS 7

swagger ui

The Swagger UI is enabled, but not installed and not documented.

When starting fdp I got the following warning:

The swagger_ui directory could not be found.
    Please install connexion with extra install: pip install connexion[swagger-ui]
    or provide the path to your local installation by passing swagger_path=<your path>

After I ran pip install connexion[swagger-ui], I could access http://localhost/ui/ and test the API using web forms.

Could connexion[swagger-ui] be added to the dependencies, and can it be documented why/how the Swagger UI can be accessed?
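Until the dependency is added, a startup check along these lines could make the situation clearer to users. This is only a sketch: it assumes that the `connexion[swagger-ui]` extra provides an importable module named `swagger_ui_bundle`, and both function names are hypothetical.

```python
import importlib.util


def swagger_ui_available():
    """Return True if the swagger_ui_bundle module can be found.

    Assumption: installing connexion[swagger-ui] makes this module importable.
    """
    return importlib.util.find_spec("swagger_ui_bundle") is not None


def swagger_ui_hint():
    """Return a short message telling the user how to reach the Swagger UI."""
    if swagger_ui_available():
        return "Swagger UI is installed; it should be served under /ui/."
    return "Swagger UI is missing; run: pip install connexion[swagger-ui]"
```

Printing `swagger_ui_hint()` when the server starts would replace the generic warning with an instruction that matches how the FDP is actually accessed.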

FDP broken API

We've tried both Python 2 and Python 3. In both cases the example (Pet Store) endpoints are shown instead of the FDP endpoints.

Search functionality

It is currently not possible to search for datasets that meet a certain criterion (e.g. of a given type). This means a client of the FDP needs to download the full catalog and browse through it. The FDP would be more useful if it provided such functionality.
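As an illustration of the kind of query such a feature would need, here is a minimal sketch that builds a SPARQL query for resources of a given type. It assumes the FDP metadata is available in a SPARQL store and that "type" means `rdf:type`; the function name `build_type_query` and the variable `?resource` are hypothetical and not part of the FDP API.

```python
def build_type_query(type_iri, limit=100):
    """Return a SPARQL query listing resources whose rdf:type is type_iri."""
    # Reject characters that are not allowed inside an IRI reference,
    # so that the value cannot break out of the <...> brackets.
    if any(ch in type_iri for ch in '<>"{}|^`\\ '):
        raise ValueError("not a valid IRI: %r" % type_iri)
    return (
        "SELECT DISTINCT ?resource WHERE {\n"
        "  ?resource a <%s> .\n"
        "}\n"
        "LIMIT %d" % (type_iri, int(limit))
    )
```

For example, `build_type_query("http://www.w3.org/ns/dcat#Dataset", 10)` yields a query that could be sent to the SPARQL endpoint, so the client no longer has to walk the whole catalog.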

Add conformance testing

  • Test to make sure that our implementation complies with the FDP spec
  • Test to check whether our implementation is compatible with the Java FDP

Make SHACL validation report human-friendly

The SHACL validation report requires users to understand the SHACL language, which is too bad!

For example, users just need to know whether a required term is missing, but the report gives too much detail and never says so explicitly; it only reports a minCount=1 constraint.

A wrapper on the validation report is necessary to give a simple and human-friendly message.
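A minimal sketch of such a wrapper is shown below. It assumes the validation results have already been extracted from the report into plain dictionaries whose keys mirror the SHACL properties `sh:focusNode`, `sh:resultPath` and `sh:sourceConstraintComponent`; that input format, the message templates and the function name `summarize` are all hypothetical.

```python
SH = "http://www.w3.org/ns/shacl#"

# Plain-language templates keyed by SHACL constraint component.
# Assumption: a violated minCount means minCount=1, i.e. a missing term.
TEMPLATES = {
    SH + "MinCountConstraintComponent": "required term <{path}> is missing",
    SH + "MaxCountConstraintComponent": "term <{path}> occurs too many times",
    SH + "DatatypeConstraintComponent": "term <{path}> has a value of the wrong datatype",
}

# Used for any constraint component without a dedicated template.
FALLBACK = "term <{path}> violates a constraint"


def summarize(results):
    """Turn validation results into one short sentence per problem."""
    messages = []
    for result in results:
        template = TEMPLATES.get(result.get("sourceConstraintComponent"), FALLBACK)
        problem = template.format(path=result.get("resultPath", "unknown"))
        node = result.get("focusNode", "unknown resource")
        messages.append("{}: {}".format(node, problem))
    return messages
```

With this shape, a missing title would be reported as a single sentence naming the resource and the missing term, instead of a block of SHACL triples.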

Unique identification of datasets across data points

With the current specification of the FDP, it looks like one dataset that is indexed on two different FDPs could end up having different identifiers (and thus could incorrectly be assumed to be two separate datasets).

E.g., I could deposit my-dataset on FDP1 as:
http://fdp1.org/fdp/dataset/my-dataset-X

and on FDP2 as:
http://fdp2.org/fdp/dataset/my-dataset-Y

This should perhaps be addressed in the specification (not necessarily in this repo).

Add query feature

  • check query solutions
  • discuss and decide query solution
  • implement query feature

Upgrade rdflib to v5.0.0

It takes time for (3rd party) plugins to be updated, which affects our FDP.
For the moment (2020-06-09), we have to use rdflib 4.2.2 and its plugins.

  • check it again in December, 2020

Update CI

  • Travis CI -> GitHub Actions?
  • update URL (badge) in README.md

Incorrect DC term(s)

Change http://purl.org/dc/terms/version to http://purl.org/dc/terms/hasVersion
