
w3id.org's Issues

w3id.org https configuration

I checked the HTTPS configuration of the w3id.org server using the Qualys SSL Labs server test: https://www.ssllabs.com/ssltest/analyze.html?d=w3id.org

I'm seeing grade B, which I think is sub-optimal for a piece of web infrastructure as important as w3id.org.

I believe this should be improved by:

  1. adding !kRSA to the SSLCipherSuite configuration option (to get forward secrecy);

  2. disabling the now almost obsolete TLS 1.0 and 1.1 by restricting SSLProtocol to just "TLSv1.2"; a configuration sketch follows below.
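For illustration, the relevant mod_ssl directives might look roughly like this (a sketch only; I don't know the server's actual cipher string, so the HIGH:!aNULL base below is just a placeholder):

SSLProtocol -all +TLSv1.2
# Exclude static-RSA key exchange so that every remaining suite
# provides forward secrecy:
SSLCipherSuite HIGH:!aNULL:!kRSA
SSLHonorCipherOrder on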

Best regards

Jan Dvorak
leader, CERIF & CRIS Architecture Task Group, euroCRIS
https://github.com/perma-id/w3id.org/tree/master/cerif

Submit w3id.org to HSTS preload list (configuration changes needed)

Dear all,

HSTS preload lists let browsers skip the initial plain-HTTP request and encrypt the very first connection. This has significant security benefits, in particular preventing man-in-the-middle attacks that intercept that first request.

It seems that w3id.org does not currently meet the requirements for submission to the preload list used by several major browsers:

https://hstspreload.org/?domain=w3id.org

So, in my opinion, basically everyone who uses http://w3id.org to refer to their resources could potentially be targeted, and users of these URIs could be easy victims on a malicious public Wi-Fi network, etc.

Edit: here is a screenshot showing the preload check failing.

Edit:
So when someone requests http://w3id.org/fraunhofer/lighthouse-projects/evolopro/cirp.ttl and has never visited https://w3id.org before, this first request goes out as plain HTTP (tried and tested with Wireshark).
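For reference, satisfying the preload requirements on the Apache side would look roughly like this (a sketch; it assumes mod_rewrite and mod_headers are enabled, and the max-age of one year is the minimum the preload list accepts):

# In the port-80 virtual host: redirect all plain-HTTP requests to HTTPS.
RewriteEngine On
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# In the port-443 virtual host: send a preload-eligible HSTS header.
Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"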

Kind regards,
Andreas

CORS

How is the w3id.org server configured to support CORS?
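(For context: responses from w3id.org carry Access-Control-Allow-Origin: *, as can be seen in the curl output quoted elsewhere in these issues. If that is done with mod_headers, the configuration would be something like the following sketch; the actual server config may differ:

# Illustrative only; the real w3id.org setup is not documented here.
Header always set Access-Control-Allow-Origin "*"

Individual .htaccess files could in principle extend this, if AllowOverride permits it.)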

Unable to find valid certification path

I am using w3id.org as a means of creating persistent, de-referenceable namespaces for a set of ontologies. Recently, I have encountered the following error when attempting to load the ontologies in Protege via the w3id URL:
sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

This is the first time I've encountered such an issue (and I have not recently updated Protege); the ontologies loaded successfully via the w3id URL prior to this. They also still load successfully via their current URLs on GitHub.

Any suggestions to fix this?

Change default branch from `master` to `main`.

  • This isn't a strictly necessary change but it would match what modern tooling uses.
  • There are lots of redirect maintainers with various skill levels. Will this cause mass chaos? Or will everyone easily adapt?

What does the perfect Pull Request look like?

Hi @dgarijo and @davidlehn,

I just sent a Pull Request and was wondering what its ideal content would be, to make your life easy.

For example, I put in a link to the Readme, so you can check that I am actually "allowed" to make that change. The Readme already says that the project name should be in the title of the Pull Request (as far as I understood).

I found that it's possible to create a template for Pull Requests, maybe that's something for you, too? https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository

Best,
Robert

Conflicts between directories /un and /UN

On case-insensitive file systems, such as the macOS default, the two directories /un and /UN are considered the same and cause a conflict. The local branch cannot be synced; it is stuck in an endless loop:

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   UN/.htaccess

no changes added to commit (use "git add" and/or "git commit -a")
$ git stash
Saved working directory and index state WIP on master: 9f48a6e
$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   un/.htaccess

no changes added to commit (use "git add" and/or "git commit -a")
$ git stash
Saved working directory and index state WIP on master: 9f48a6e
$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   UN/.htaccess

no changes added to commit (use "git add" and/or "git commit -a")

Monitoring? (HTTPS currently down!)

Do you have some kind of remote monitoring in place?

HTTPS seems to be down at the moment, trying from both a local and a remote machine:

curl https://w3id.org -I
curl: (7) Failed to connect to w3id.org port 443: No route to host

The JSON-LD Playground also says: Error loading context URL: https://w3id.org/plp/v1

HTTP works, though:

curl http://w3id.org -I 
HTTP/1.1 200 OK
Server: BaseHTTP/0.3 Python/2.6.6
Content-type: text/html
Date: Wed, 13 May 2015 07:56:06 GMT
Age: 0
Via: 1.1 varnish
Connection: close

Redirecting paths with urlencoded URIs fails (AllowEncodedSlashes server config?)

Hi all,

I have set up a server that offers a reconciliation service for SKOS concept schemes. For this, it offers API endpoints that include the ConceptScheme's URI, URL-encoded, in the request path. (For example, see https://c111-064.cloud.gwdg.de/reconc/mpilhlt/https%3A%2F%2Fw3id.org%2Fmpilhlt%2Fworktime_role or https://c111-064.cloud.gwdg.de/reconc/mpilhlt/https%3A%2F%2Fw3id.org%2Fmpilhlt%2Fworktime_role/_preview/r2.)

Now I would like to put this service behind a w3id.org redirection, too. My .htaccess thus has a line like this:

RewriteRule   ^reconcile/(.*)   https://c111-064.cloud.gwdg.de/reconc/mpilhlt/$1   [R=307,NE,L]

This does not seem to work, however. curl fails as soon as the path includes a URL-encoded forward slash:

> curl -I https://w3id.org/mpilhlt/reconcile/https%3A
HTTP/1.1 307 Temporary Redirect
Date: Fri, 05 Aug 2022 22:31:37 GMT
Server: Apache/2.4.29 (Ubuntu)
Access-Control-Allow-Origin: *
Location: https://c111-064.cloud.gwdg.de/reconc/mpilhlt/https%3A
Content-Type: text/html; charset=iso-8859-1

> curl -I https://w3id.org/mpilhlt/reconcile/https%3A%2F
HTTP/1.1 404 Not Found
Date: Fri, 05 Aug 2022 22:31:49 GMT
Server: Apache/2.4.29 (Ubuntu)
Access-Control-Allow-Origin: *
Content-Type: text/html; charset=iso-8859-1

I have tried with and without the NE (NoEscape) flag in the redirection rule, but this did not help. The required and the observed behaviour (including the "404" on encountering an encoded forward slash) correspond quite closely to what the docs describe for the AllowEncodedSlashes config option. So I assume that setting this to On, or better (because more secure) NoDecode, would help my case. However, this is not something that can be set in an .htaccess context; it needs to be set in the main or virtual-host config. (I tried to set it in my .htaccess file, which led to "500" instead of "404" errors, and I have since removed the line again.)
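For reference, the change being requested is a one-liner in the server or virtual-host configuration, roughly like this (a sketch; the vhost details are illustrative):

<VirtualHost *:443>
    ServerName w3id.org
    # Accept %2F in request paths but keep it encoded, so that a
    # RewriteRule carrying the NE flag can pass it through unchanged:
    AllowEncodedSlashes NoDecode
</VirtualHost>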

Do you have a policy on enabling things such as this one? Would you be ready to enable it? (Or is it even already enabled and I have been doing other things wrong?)

Thanks in advance - also, thanks more generally for all your efforts in maintaining this fantastic service!!!

Andreas

PS. I think I will push for the service to allow passing the vocab URI in a query parameter instead of path components anyway. In cursory testing I got the following result, which seems to indicate that this may in fact work, so it's not a showstopper. Still, I believe it would be nice to allow encoded path components.

> curl -I https://w3id.org/mpilhlt/reconcile/endpoint?q=https%3A%2F
HTTP/1.1 307 Temporary Redirect
Date: Mon, 08 Aug 2022 12:35:56 GMT
Server: Apache/2.4.29 (Ubuntu)
Access-Control-Allow-Origin: *
Location: https://c111-064.cloud.gwdg.de/reconc/mpilhlt/endpoint?q=https%3A%2F
Content-Type: text/html; charset=iso-8859-1

Update readme file

The guidelines indicate that listing the maintainers in a README file is optional. I think this file should be required, not optional. We should also require the GitHub IDs of the maintainers.

The reason for this request is that we have been receiving several updates to existing w3ids, and when a folder has no README file I have to go through its commit history, which takes more time. The reason for requiring the GitHub ID in addition to the email is that the author of a pull request sometimes has an ID that differs significantly from their name and email.

The web server appears to be configured to generate directory indexes

I have a redirect set up from /people/nxg to http://nxg.me.uk/norman/. It uses ^nxg$ in the .htaccess file, so it should match the former URL but not /people/nxg/ (which I intend, or at least am willing, to have fail). When I test this, however, it appears that /people/nxg is rewritten to /people/nxg/, with a 301 status:

% curl --head https://w3id.org/people/nxg
HTTP/1.1 301 Moved Permanently
Date: Thu, 23 Oct 2014 22:38:05 GMT
Server: Apache/2.4.7 (Ubuntu)
Access-Control-Allow-Origin: *
Location: https://w3id.org/people/nxg/
Content-Type: text/html; charset=iso-8859-1

This appears to happen before .htaccess processing. This would (I think) be explained if the overall Apache httpd.conf has Options Indexes enabled.

Is that the case? If so, is it deliberate? And if it's deliberate, is it desirable? I can't think of a use-case for that in this context.

Looking at the other /people/*/.htaccess files, I can see patterns matching ^$, which may be intended to catch the result of this redirection. If there is no directory /people/foo, then the URL /people/foo is matched as expected in .htaccess.
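One more possibility worth checking: mod_dir issues exactly this kind of 301, appending a trailing slash, for any request matching an existing directory, via its DirectorySlash directive, and it does so regardless of Options Indexes. If that is the cause, a sketch of the change (with the caveat that DirectorySlash Off should be paired with indexes disabled, since otherwise directory contents can be exposed):

<Directory "/path/to/site/people">
    # Hypothetical path; stop mod_dir from 301-redirecting
    # /people/nxg to /people/nxg/:
    DirectorySlash Off
    Options -Indexes
</Directory>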

Travis checking for all README files

I noticed, while trying to create a pull request, that the Travis CI check failed to complete its validation of README links, not due to my own commit, but because another user's link timed out.

Maybe the travis check should focus on the diff of the pull request rather than checking all README files? As the service grows I can imagine that at least one link will be dead at all times.

Service Unavailable https://www.w3.org/2018/credentials/v1

Hi guys, we are testing verifiable credentials with more than 500 users. So far, 300+ users have been able to get a DID and a verifiable credential from our server. This test is basically meant to prove that SSI systems are stable and can be used in production. We started the test on Saturday and planned to continue until this weekend. The use case we picked is authentication and authorization: a user receives a DID and a VC from our server and uses them to authenticate to the service provider.

But today we started to see a problem with issuing credentials: jsonld.InvalidUrl: Dereferencing a URL did not result in a valid JSON-LD object. After debugging, we found that https://www.w3.org/2018/credentials/v1, which was previously working, is not accessible any more. Our users are stuck at this point and we are struggling to fix this issue. If anyone here has any suggestions or solutions, kindly help; or if you can just forward this message to the right people, that would also be helpful. Thank you!

Directory structure reorganization

The top-level directory has grown to contain lots of subdirs. This is a bit of a performance and usability issue on GitHub: it is slow to render, and the README is waaay down there. Perhaps all the served files should move into one subdir like rules/. Thoughts?

a simple tutorial for common redirecting examples?

I would appreciate a simple guide, or a list of .htaccess files, showing examples of how to do some quite common redirections.

w3id.org/myname ---> my-current.domain/my-site-path/my-file.php
w3id.org/myname?a=1&b=4 ---> my-current.domain/my-site-path/my-file.php?a=1&b=4
w3id.org/myname/a_1/b_4 ---> my-current.domain/my-site-path/my-file.php?a=1&b=4
w3id.org/myname/site1 ---> my-current.domain/my-site-path/site1
w3id.org/myname/anothersite ---> my-current.domain/my-site-path/anothersite

... and so on

Perhaps this is a very basic question, but there are no links on w3id.org to any good tutorial for beginners.
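To make this concrete, here is an untested sketch of what a /myname/.htaccess covering those cases might look like (my-current.domain and the paths are the placeholders from the examples above):

RewriteEngine On
# w3id.org/myname and w3id.org/myname?a=1&b=4 (mod_rewrite passes the
# original query string through unchanged when the target has none):
RewriteRule ^$ https://my-current.domain/my-site-path/my-file.php [R=302,L]
# w3id.org/myname/a_1/b_4 -> my-file.php?a=1&b=4
RewriteRule ^a_([^/]+)/b_([^/]+)$ https://my-current.domain/my-site-path/my-file.php?a=$1&b=$2 [R=302,L]
# w3id.org/myname/site1, w3id.org/myname/anothersite, ...
RewriteRule ^(.+)$ https://my-current.domain/my-site-path/$1 [R=302,L]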

Agents don't like protocol switching

Dear all,

My organization uses w3id.org for semantic web projects, and we have reasons to make our ontologies ultimately accessible over plain HTTP (and not HTTPS).

Some agents refuse to follow redirects that imply a protocol change from HTTPS to HTTP, and I guess that's a good thing in general. But it makes classical tools randomly trigger errors:

Let's use ontology https://w3id.org/seas/FeatureOfInterestOntology as an example:

So my question is:

  • could it be possible to have only HTTP redirection for some projects in w3id ?

CNAME to w3id.org possible?

One protection against a single point of failure (however unlikely for this resource) is to own a domain name that points to w3id.org. PURLs for a project could then use a custom domain name if they wanted to protect against this potential failure point.

Is it reasonable to do this for w3id.org?

Improving examples.

Some ideas for helping new folks figure out how to use this service. I've had these ideas rattling around in my head for years, might as well note them here.

@dgarijo started with an ontology example in #1639. (That has been needed for a long time. Thanks!)

  • Long ago I had thought of having two example locations: example/, with a really basic example, more like a template with just a .htaccess and README.md to get started and redirect something to https://example.com/; and a secondary examples/ with a handful of subdirs for various use cases. In this scheme the new example dir would be under something like examples/ontology/.
  • Refer people to appropriate Apache docs (also noted in #1530)
  • Add other examples based on patterns we've seen other people using over the years.
    • Simple redirects.
    • Pattern matching and replacement.
    • All in one file cover subdirs vs multiple directories.
    • CORS compatibility.
    • It hasn't been an issue before, but show how to avoid serving the local README.md if you don't have a wildcard redirect that covers that.
  • Figure out which .htaccess options are commonly needed and document them and why they are there.
  • The root README should point people at the examples.
  • The main homepage should point people at the examples.
  • Figure out if info and maintainer info in .htaccess is ok vs putting in README.md. Recently it was suggested to just use README. I had originally thought it would have been the other way.
  • Add github PR magic info file with link to or an inline checklist for common issues we've seen (no maintainer info, etc).

I'm sure there's more... if anyone has ideas, add a comment.

Internal Server Error 500

Problem:
Whenever I try to access URLs under w3id.org/iqb/, I get an Internal Server Error 500.
I attached a screenshot of this error.

I tried to contact the support email mentioned there, but mails to it could not be delivered.

Last related action:
I made a pull request yesterday that was accepted and merged into master. The pull request changed the .htaccess file inside the iqb folder (pull request).
Is this error related to the rules in that .htaccess?
The redirection worked perfectly before yesterday's pull request.

Edit: I just noticed that there is an extraneous "\" at the end of line 12 of the .htaccess file.

Kind regards,
Huaning Yang

Intermittent 404 errors from previously consistent URL mappings

For many hours this morning (Dec 8, 2021, Pacific Time), we have noticed intermittent downtime of the following URLs, which have previously been reliable for many months. We use these redirects very often as part of our software builds, so the downtime is affecting development.

curl -L https://w3id.org/linkml/types.yaml fails with the message:
curl: (28) Failed to connect to w3id.org port 443: Operation timed out

but one out of every 10 tries generates the proper response:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://linkml.github.io/linkml-model/linkml_model/model/schema/types.yaml">here</a>.</p>
<hr>
<address>Apache/2.4.29 (Ubuntu) Server at w3id.org Port 443</address>
</body></html>

We see similar behavior when trying the URL in the browser directly.

We also tried different organizational redirects like this one: https://w3id.org/sssom and it too is intermittently failing to resolve.

@cmungall

Problems with application/rdf+xml

Dear perma-id team,

again many thanks for providing this web service!

We have performed an update on the w3id.org/bot namespace to provide access to version IRIs (cf. #1995 by @attadanta).

While our test script works fine, the Protege tool still cannot retrieve a version IRI, e.g. w3id.org/bot-0.3.2.

We observed some irregular behaviour, with Protege sending out text/html requests. As we cannot test more deeply on the w3id.org server, we were wondering whether you have any suggestions, perhaps from other projects with a similar naming structure.

Thanks and BR

Georg

404 not found for BCI-Ontology html, turtle, owl representations

@srodriguez142857, please solve these 404 Not Found issues before you present the papers about the BCI-Ontology at SSN 2018 and SIS-IoT 2018

Best,
Maxime

RewriteRule ^$ http://bci.pet.cs.nctu.edu.tw/ontology [R=303,L]

RewriteRule ^$ http://bci.pet.cs.nctu.edu.tw/ontology/BCI_(plain).owl [R=303,L]

RewriteRule ^$ http://bci.pet.cs.nctu.edu.tw/ontology/BCI.ttl [R=303,L]

RewriteRule ^$ http://bci.pet.cs.nctu.edu.tw/ontology [R=303,L]
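As pasted, all four rules share the pattern ^$, so only the first can ever fire. Presumably the intent was to gate the variants on the Accept header, along these lines (a sketch reusing the targets above):

RewriteCond %{HTTP:Accept} application/rdf\+xml
RewriteRule ^$ http://bci.pet.cs.nctu.edu.tw/ontology/BCI_(plain).owl [R=303,L]
RewriteCond %{HTTP:Accept} text/turtle
RewriteRule ^$ http://bci.pet.cs.nctu.edu.tw/ontology/BCI.ttl [R=303,L]
# HTML fallback:
RewriteRule ^$ http://bci.pet.cs.nctu.edu.tw/ontology [R=303,L]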

License of perma-id/w3id.org repository

I would like to set up an organisational permanent-id service based on this one, and was wondering what the license is. While we will have our own redirect rules, there is still some code/content that would be of use (the git update hook, the link checker, and content from the index page).

Avoid using rawgit.com

It has come to our attention (see #394) that the free service https://rawgit.com/ (which re-hosts GitHub files with nicer content-type headers) is shutting down:

RawGit is now in a sunset phase and will soon shut down. It's been a fun five years, but all things must end.

GitHub repositories that served content through RawGit within the last month will continue to be served until at least October of 2019. URLs for other repositories are no longer being served.

If you're currently using RawGit, please stop using it as soon as you can.

https://rawgit.com suggests alternatives like:

Here are the redirects I found in w3id.org that use rawgit.com and should be updated (one example per folder):

Some of these already fail with:

403 Forbidden
RawGit will soon shut down and is no longer serving new repos. Please visit https://rawgit.com for more details.

... so I would recommend that the authors CC'd above update their rules before rawgit.com's October 2019 shutdown date.

Adding list of maintainers into the README?

Dear team,

I might have simply overlooked it: I saw the names of the companies/entities who pledged to maintain w3id.org, but not the GitHub usernames of those of you who are official maintainers of this repo.

Is there a way to find out who is currently allowed to merge Pull Requests? If not, would you be interested in adding this info to the README?

Looking forward to a short discussion if needed -- all the best and thanks for maintaining this! (Obviously this would also give you credit for the important work you do :)

Best,
Robert

Unable to import/open ontology in Protege using w3id.org url

I am using w3id.org to create persistent and de-referenceable namespaces for the ReproduceMe Ontologies.
I encounter an error when directly importing or opening an ontology in Protege via the w3id.org URL:
https://w3id.org/reproduceme.

The error is as follows:
[Fatal Error] :31:28: Open quote is expected for attribute "href" associated with an element type "a".

Encountered " " "" at line 1, column 1.

The repository with the .htaccess file is available here: https://github.com/perma-id/w3id.org/tree/master/reproduceme
I am using Protege 5.5.0.

I tried with curl, and the content negotiation works properly:
curl -sH "accept:application/rdf+xml" -L https://w3id.org/reproduceme

Any suggestions to fix this?

Thanks
Sheeba

Checking for duplicates/existing rewrites

As part of an attempt to automate the checking process, I did some experimentation tonight and came across the following extant rules:

bash-4.3$ for i in $(find . -type d | cut -c 3-); do echo $i; grep $i .htaccess; done
aidan
RewriteRule ^aidan$ http://aidan.droppages.com/ [R=302,L]
dgarijo
fastflo
jennybc
keski
nandana
nxg
RewriteRule ^nxg$ http://nxg.me.uk/norman/ [R=303,L]
ocorcho
paolo
pfig
ssevertson
tailot
tokarenko
RewriteRule ^tokarenko$ http://flavors.me/tokarenko [R=303,L]

We have three directories that duplicate existing rewrite rules. As far as automated checking is concerned, I'd like to prevent someone else adding a directory /people/bsletten if there exists a ^bsletten$ rule in the people/.htaccess file.

I'd like to clean up the three existing rules and then barf in the Travis process if someone is trying this on a PR.

Does this seem like a good strategy? Any issue in doing this?

travis actually check redirect targets from .htaccess?

would there be any interest in having travis actually check the redirect targets from the .htaccess files?

It's not too hard to extract the target URIs from the RewriteRules (gsed is GNU sed):

find . -name .htaccess -print0 | xargs -0 gsed -r -n 's_^[[:space:]]*RewriteRule[[:space:]]+.*?[[:space:]]+((.*?:)?//.*?)[[:space:]]+.*$_\1_p'

(actually shows quite some non-https URIs btw.)

  • Very many of them don't contain a $[0-9] backreference, so they are fully static and can simply be checked
  • many of the rules contain a "static" $1, which just makes the rule shorter, so it is possible to reconstruct the target
  • for those which don't, we could at least check whether the domain resolves and something answers...

Might also be beneficial to have travis submit the URIs and full sites to something like https://web.archive.org ...

rawgit usage

There are a handful of .htaccess uses of https://rawgit.com/.

I think people use this service to deal with the lack of content-type headers on github servers, but I'm not sure. It's possible some uses just copy & paste from other projects that needed that feature.

The files with test URLs that hit rawgit.com are causing Travis to have SSL failures; however, it works on cdn.rawgit.com. The SSL issue may be related to their Cloudflare usage? It works in local tests, so it has something to do with the Travis test environment.

Additionally, according to the rawgit.com homepage, rawgit.com itself is to be used only for development purposes and has a short caching time. I would suggest just changing all the uses to cdn.rawgit.com, but that has permanent caching! I'm unsure whether the handful of users using that host are aware that their redirects would be cached forever. It looks like all targets but one are not using versioned resources, so I suspect not.

The .htaccess owners using rawgit should be made aware of this issue.

is URL rewriting with content negotiation supported?

I wanted to ask whether w3id supports/permits URL rewriting with conditions for content negotiation, for instance using RewriteCond.

My use case is to host my ontology on GitHub Pages and use a w3id URL as a permanent identifier. Further, I want content negotiation on that URL, but unfortunately GitHub Pages doesn't support it. However, if I can do the negotiation at the URL-rewriting stage, I can still achieve what I want.

For example, something like the following.

RewriteCond %{HTTP:Accept} text/turtle
RewriteRule ^foo$ http://proj.github.io/vocab/foo.ttl [R=303,L]

RewriteCond %{HTTP:Accept} application/rdf+xml
RewriteRule ^foo$ http://proj.github.io/vocab/foo.xml [R=303,L]

RewriteRule ^foo$ http://proj.github.io/vocab/foo.html [R=303,L]

P.S. I am not an expert on HTTPD config, so there may be a better way to do this.
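For what it's worth, RewriteCond on %{HTTP:Accept} is a pattern that existing w3id.org entries already use, so it should be supported. One ordering caveat: rules are evaluated top to bottom, so keep the un-conditioned HTML fallback last, as above. Once deployed, the negotiation can be checked with curl, e.g. (hypothetical path):

curl -sIL -H "Accept: text/turtle" https://w3id.org/yourproject/foo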

Move all redirect rule directories to an `ids/` subdirectory.

Problem

  • In hindsight, it would have been better to put all the redirect rules in a subdirectory.
    • Currently, visiting the project page means a long load time and the poor experience of the documentation being a thousand page-downs away.
    • It's difficult to have non-rule project files in the main repo.
  • There are lots of redirect maintainers with various skill levels. Will this cause mass chaos? Or will everyone easily adapt?

Proposal

  • I propose moving all rules, index.html, and other required served files to an ids/ subdirectory.
  • To avoid making operational concerns too complex, this will likely involve a very brief downtime to adjust server-side paths.

Testing instructions

The readme asks you to "test your changes with a local checkout of the site."

What is the easiest way to do that?

I would propose a dockerised local setup to easily spin up a server so that I can curl some requests and see if my .htaccess does what I expect.

I tried setting up Apache with Lando but that didn't really work, likely because I don't know how to write the httpd.conf.

Any suggestions?
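In case it helps as a starting point, a minimal httpd.conf along these lines should be enough to exercise the .htaccess files locally (a sketch; module paths and the checkout location are placeholders, and the real site config may well differ):

ServerName localhost
Listen 8080
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule headers_module modules/mod_headers.so
DocumentRoot "/path/to/checkout/of/w3id.org"
<Directory "/path/to/checkout/of/w3id.org">
    # Let the per-directory .htaccess files take effect:
    AllowOverride All
    Require all granted
</Directory>

After that, something like curl -sI http://localhost:8080/yourdir/ (hypothetical path) shows whether the rules fire as expected.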

How big can we go

I was wondering how many rules you would be willing to accept in an .htaccess file.

The reason I ask is that we want to use w3id.org to redirect to a secondary service, which in turn redirects further. One way to speed the process up would be to put our forwarding rules directly into w3id, say once every year or so. We currently have around 10,000 URLs. I'm guessing this is too many, but it would be good to know for sure.
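One thing that may help frame the answer: RewriteMap, which would be the natural mechanism for bulk lookups, cannot be declared in an .htaccess file (only in server or virtual-host context). But if the 10,000 URLs share a common structure, a single pattern rule can forward all of them, e.g. (hypothetical names):

# One rule forwards every identifier under this prefix to the
# secondary resolver, which handles the per-URL mapping:
RewriteRule ^(.+)$ https://resolver.example.org/id/$1 [R=302,L]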
