Comments (7)

chrdebru commented on June 25, 2024

Here are some suggestions for simple datatype tests:

  • does rml:datatype work
  • does rml:datatypeMap work
  • does rml:datatypeMap work for non-xsd datatypes (these should be accepted, as data validation is a separate process)

The last one assumes http://example.com/base is given as input for the base IRI.

In other words, the datatype map "behaves" like an IRI-generating term map.

RMLTC0021a-CSV.zip

from rml-core.

DylanVanAssche commented on June 25, 2024

The ones covered by shapes are listed in tests.py:

  • RMLTC0004b: Literal as Term Type in Subject Map
  • RMLTC0007h: Named graph which is not an IRI
  • RMLTC0012c: missing Subject Map
  • RMLTC0012d: 2 Subject Maps
  • RMLTC0015b: invalid language tag --> do you have a proposal for the shapes here to improve it? I am not so sure what you suggest above.

These should already raise an error in the engine if it uses the SHACL shapes to validate the mapping.
Currently, we don't have a nice page like https://rml.io/test-cases/ that lists which test cases throw an error.
We should have this in the future, and also add metadata to each test case stating what kind of error is thrown.
Do you have a suggestion for that?

Regarding 19a, that one might be a bit in flux.

Regarding 19b, it contains a data error on purpose. The test cases currently assume 'best effort' in that case.
Maybe we need to be super strict here and throw an error, to be on the same level as other test cases such as 0007h, where a named graph must be an IRI.

Regarding 20b, why is that an error? The RFC for URIs (https://www.rfc-editor.org/rfc/rfc3986#section-4.1) says that it can be resolved if needed. It is a valid IRI.
The question is: should we require engines to remove relative path segments in IRIs, like they do for encoding?

Datatype maps are missing, yes. Feel free to put them in a PR, thanks a lot! If you don't have time, I can do it as well, let me know.


chrdebru commented on June 25, 2024

Aha! Okay, I understood that the test cases (for engines) assume valid mappings. In other words, you now assume that each engine uses the shapes (or something else) to cover all cases. Which is OK, but I misinterpreted that.

19a --> Assuming the base IRI of the mapping is the same as the output is a dangerous assumption. You could leave it for the tests for backward compatibility with previous engines, but I would propose to document it as (either http://www.example.com/base is used as input or the base of the mapping file is assumed to be the base). R2RML explicitly states that the base IRI for the output is passed as a parameter. I prefer David's solution of rml:baseIRI per triples map.

19b --> "best effort" contradicts "generating no file", so you want to be strict. You retain partial results of the same triples map.

20b --> Based on R2RML --> the string is tested for being an absolute IRI and does not mention anything about trying to compute the absolute IRI where possible. RML can propose this, but I have yet to find this mentioned in the spec. RDF does allow one to store information about <http://example.com/base/path/../Danny>, but I'm pretty certain that <http://example.com/base/path/../Danny> and <http://example.com/base/Danny> are two different resources. That will open a whole can of worms.


DylanVanAssche commented on June 25, 2024

> Aha! Okay, I understood that the test cases (for engines) assume valid mappings. In other words, you now assume that each engine uses the shapes (or something else) to cover all cases. Which is OK, but I misinterpreted that.

Well, it is not required, but engines should not crash on invalid mappings. They can make their life easy by using the shapes, or do the validation manually.

> Assuming the base IRI of the mapping is the same as the output is a dangerous assumption. You could leave it for the tests for backward compatibility with previous engines, but I would propose to document it as (either http://www.example.com/base is used as input or the base of the mapping file is assumed to be the base). R2RML explicitly states that the base IRI for the output is passed as a parameter. I prefer David's solution of rml:baseIRI per triples map.

I agree here. I just need a way forward. Dropping the test case seems the best way, since we don't want engines to support this behavior. Do you agree?

> 19b --> "best effort" contradicts with "generating no file" so you want to be strict. You retain partial results of the same triples map.

Agreed! Especially this kind of stuff needs to go: the test cases must always follow the same paradigm. Let's change it to an error.
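The difference between the two paradigms can be illustrated with a minimal sketch. Everything in it is hypothetical and not taken from any RML engine: `MappingError`, `make_triple`, `run_mapping`, and the rule that an empty `id` counts as a data error are assumptions chosen only to show the control flow.

```python
from typing import Iterable, List


class MappingError(Exception):
    """Hypothetical error raised when a record cannot be mapped."""


def make_triple(record: dict) -> str:
    # Assumption for illustration: an empty "id" counts as a data error.
    if not record.get("id"):
        raise MappingError(f"cannot build a subject from {record!r}")
    subject = f'<http://example.com/{record["id"]}>'
    return f'{subject} <http://example.com/name> "{record["name"]}" .'


def run_mapping(records: Iterable[dict], strict: bool) -> List[str]:
    """Map all records; in strict mode the first data error aborts the run."""
    triples = []
    for record in records:
        try:
            triples.append(make_triple(record))
        except MappingError:
            if strict:
                # Strict: propagate the error, so the caller writes no output.
                raise
            # Best effort: skip the faulty record and keep the partial result.
    return triples
```

With one valid and one faulty record, `run_mapping(records, strict=False)` returns the partial result, while `strict=True` raises and leaves nothing to write, which is the behavior a test case expecting an error would check.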

> 20b --> Based on R2RML --> the string is tested for being an absolute IRI and does not mention anything about trying to compute the absolute IRI where possible. RML can propose this, but I have yet to find this mentioned in the spec. RDF does allow one to store information about <http://example.com/base/path/../Danny>, but I'm pretty certain that <http://example.com/base/path/../Danny> and <http://example.com/base/Danny> are two different resources. That will open a whole can of worms.

Exactly! This is a whole can of worms, so we need to pick a side here. These URIs are valid because in the end they are absolute. So we do not resolve them? But if we do not resolve them, we can allow this case, right?


chrdebru commented on June 25, 2024

Agree to drop the case.
Agree to remove the file.

Well, turning http://example.com/base/path/../Danny into http://example.com/base/Danny is called IRI normalization. RDF 1.1 notes that non-normalized IRIs are best avoided, but it does not say that they must be normalized before ingestion. This makes sense, as we can say different things about the two IRIs. The RFC about IRIs states that one way to test IRI equality is by string comparison (character by character), but other approaches may include normalization. That's what I appreciate about R2RML. If an IRI is absolute, then use that one. If not, test whether the base IRI + IRI is absolute. In other words, R2RML enforces the use of absolute IRIs.
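The R2RML-style rule described here can be sketched in a few lines. This is a rough illustration, not an engine implementation: `is_absolute_iri` and `generate_iri` are hypothetical names, and the absoluteness check (a scheme is present and the string contains no whitespace) is a deliberate simplification of real IRI validation.

```python
from urllib.parse import urlsplit


def is_absolute_iri(value: str) -> bool:
    # Simplified assumption: "absolute" means a scheme is present and
    # the string contains no whitespace. Real IRI validation is stricter.
    has_scheme = bool(urlsplit(value).scheme)
    has_whitespace = any(ch.isspace() for ch in value)
    return has_scheme and not has_whitespace


def generate_iri(value: str, base_iri: str) -> str:
    """Use the value if it is absolute, else try base IRI + value."""
    if is_absolute_iri(value):
        return value
    candidate = base_iri + value  # plain prepending, no path resolution
    if is_absolute_iri(candidate):
        return candidate
    raise ValueError(f"cannot generate an absolute IRI from {value!r}")
```

Note that nothing is normalized: `generate_iri("path/../Danny", "http://example.com/base/")` keeps the `..` segment in the result, because the rule only tests for absoluteness.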

Setting rml:normalizeIRIs to true (false by default) could be a solution, albeit an expensive one.
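To make the cost and effect concrete, here is a minimal sketch of what such normalization could involve. It is an assumption about how the proposed rml:normalizeIRIs option might behave, not part of any specification; `remove_dot_segments` and `normalize_iri` are hypothetical helpers that only handle `.` and `..` segments in absolute paths, in the spirit of RFC 3986 section 5.2.4.

```python
from urllib.parse import urlsplit, urlunsplit


def remove_dot_segments(path: str) -> str:
    """Drop "." and resolve ".." segments; assumes an absolute path."""
    segments = path.split("/")
    output = []
    for segment in segments:
        if segment == ".":
            continue
        if segment == "..":
            # Never pop the leading empty segment that marks the root.
            if len(output) > 1:
                output.pop()
            continue
        output.append(segment)
    # A trailing "." or ".." leaves a directory, so keep the final slash.
    if segments[-1] in (".", ".."):
        output.append("")
    return "/".join(output)


def normalize_iri(iri: str) -> str:
    parts = urlsplit(iri)
    return urlunsplit(parts._replace(path=remove_dot_segments(parts.path)))
```

Under string comparison, http://example.com/base/path/../Danny and http://example.com/base/Danny are different IRIs; only after `normalize_iri` do they compare equal. An engine would have to run this on every generated IRI, which is where the expense comes from.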
