rdfunit's Introduction

RDFUnit - RDF Unit Testing Suite

Homepage: http://rdfunit.aksw.org
Documentation: https://github.com/AKSW/RDFUnit/wiki
Slack #rdfunit: https://dbpedia-slack.herokuapp.com/
Mailing list: https://groups.google.com/d/forum/rdfunit (rdfunit [at] googlegroups.com)
Presentations: http://www.slideshare.net/jimkont
Brief Overview: https://github.com/AKSW/RDFUnit/wiki/Overview

RDFUnit is implemented on top of the Test-Driven Data Validation Ontology and is designed to read and produce only RDF that complies with that ontology. The main components that RDFUnit reads are TestCases (manual and automatic), TestSuites, Patterns and TestAutoGenerators. RDFUnit also strictly defines the results of a TestSuite execution, along with different levels of result granularity.

Basic usage

See "RDFUnit from Command Line" (https://github.com/AKSW/RDFUnit/wiki/CLI) or run bin/rdfunit -h for many more options. The simplest invocation is:

$ bin/rdfunit -d <local-or-remote-location-URI>

What RDFUnit will do is:

  1. Get statistics about all properties & classes in the dataset
  2. Get the namespaces out of them and try to dereference all that exist in LOV
  3. Run our Test Generators on the schemas and generate RDFUnit Test cases
  4. Run the RDFUnit test cases on the dataset
  5. Produce a results report in HTML (by default); you can request RDF instead, or several serializations at once, e.g. with -o html,turtle,jsonld
  • By default the results are aggregated with counts; you can request a different level of result detail with -r {status|aggregate|shacl|shacllite}. See here for more details.
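Step 2 above, getting the namespaces out of the properties and classes used in the dataset, can be illustrated with a minimal sketch. This is not RDFUnit's actual code: the class name NamespaceExtractor and the "cut at the last '#' or '/'" heuristic are assumptions made for illustration only.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of RDFUnit: derives candidate namespaces
// from the IRIs of the properties and classes found in a dataset.
public class NamespaceExtractor {

    /** Returns everything up to and including the last '#' or '/' of the IRI. */
    static String namespaceOf(String iri) {
        int cut = Math.max(iri.lastIndexOf('#'), iri.lastIndexOf('/'));
        return cut < 0 ? iri : iri.substring(0, cut + 1);
    }

    /** Collects the distinct namespaces of all given IRIs, sorted for stable output. */
    static Set<String> namespacesOf(List<String> iris) {
        Set<String> namespaces = new TreeSet<>();
        for (String iri : iris) {
            namespaces.add(namespaceOf(iri));
        }
        return namespaces;
    }

    public static void main(String[] args) {
        List<String> used = List.of(
            "http://xmlns.com/foaf/0.1/name",
            "http://xmlns.com/foaf/0.1/Person",
            "http://www.w3.org/2004/02/skos/core#prefLabel");
        // Two distinct namespaces remain: the SKOS core one and the FOAF one.
        System.out.println(namespacesOf(used));
    }
}
```

Each resulting namespace would then be looked up in LOV; that lookup is left out here because it depends on the LOV service.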

You can also run:

$ bin/rdfunit -d <dataset-uri> -s <schema1,schema2,schema3,...>

Here you define your own schemas and RDFUnit picks up from step 3. You can also use prefixes directly (e.g. -s foaf,skos); we can resolve everything that is defined in LOV.

Using Docker

A Dockerfile is provided to create a Docker image of the CLI of RDFUnit.

To create the Docker image:

$ docker build -t rdfunit .

It is meant to execute an rdfunit command and then shut down the container. If the output of rdfunit on stdout is not enough, or you want to include files in the container, you can mount a directory via Docker to hold the output/results or to provide the input files.

Here is an example of usage:

$ docker run --rm -it rdfunit -d https://awesome.url/file -r aggregate

This creates a temporary Docker container which runs the command, prints the results on stdout, then stops and removes itself. For further usage of the CLI, visit https://github.com/AKSW/RDFUnit/wiki/CLI.

Supported Schemas

RDFUnit supports the following types of schemas:

  1. OWL (using the closed-world assumption, CWA): We pick the most common OWL axioms as well as schema.org (see [1],[2] for details).
  2. SHACL: SHACL is almost fully supported, except for a few constructs. The constructs we support can also run directly on SPARQL endpoints.
  3. IBM Resource Shapes: The progress is tracked here, but as soon as SHACL becomes stable we will drop support for RS.
  4. DSP (Dublin Core Set Profiles): The progress is tracked here, but as soon as SHACL becomes stable we will drop support for DSP.

Note that you can mix all of these constraints together and RDFUnit will validate the dataset against all of them.

Acknowledgements

The first version of RDFUnit (formerly known as Databugger) was developed by AKSW as part of the PhD thesis of Dimitris Kontokostas. A lot of additional work on improvements, requirements and refactoring was performed through the EU-funded project ALIGNED. Through the project, many partners provided feedback and contributed code, such as Wolters Kluwer Germany and Semantic Web Company, which are also users of RDFUnit.

There are also many code contributors, as well as people who submitted bug reports or provided constructive feedback.

In addition, RDFUnit used the Java profiler JProfiler for optimizations.

rdfunit's People

Contributors

aklakan, danmichaelo, dcherix, dependabot-preview[bot], dependabot[bot], der-bruemmer, dhiller, gcpdev, jimkont, jlleitschuh, k00ni, kurzum, mgns, neradis, roland-c, sandroacoelho, seebi, skyplabs, tboonx, vehnem, vemonet, vladimiralexiev, white-gecko

rdfunit's Issues

Provide the complete erroneous triple

At the moment we provide subject / predicate; we should also provide the object value when applicable (e.g. we cannot when checking existence).

This should be part of the extended results format.

Preconditions failure

Hi,

I've found a small bug in TestExecutionImpl. It fails at #L207 when the method testingFinished() is called. A value for ExecutionType is required for other methods that use the same builder.

Could you please give me a hand with it?

Thank you in advance

RDFUnit does not build due to javadoc errors

I would really like to try RDFUnit; unfortunately following the build instructions leads to numerous Javadoc build errors and the necessary jars do not get compiled. I recommend using something like Travis CI to ensure that your build actually works on a fresh clone.

HasNoModelException thrown analyzing skos.rdf

We were getting this error and were able to narrow it down to just SKOS, as follows. Any ideas what the issue is here?

fcbr@FCBR-TP:~/repos/RDFUnit/bin$ java -Xmx5g -jar rdfunit-validate-0.7.7-SNAPSHOT-jar-with-dependencies.jar -d http://www.w3.org/TR/skos-reference/skos.rdf
[INFO  RDFUnitUtils] Loaded 517 additional schema declarations from LOV SPARQL Endpoint
[INFO  RDFUnitUtils] Loaded 14 schema declarations from: java.io.FileInputStream@b7bccb4e
[INFO  ValidateUtils] Searching for used schemata in dataset
[INFO  TestGeneratorExecutor] Generating tests for: http://www.w3.org/2004/02/skos/core#
Exception in thread "main" com.hp.hpl.jena.rdf.model.HasNoModelException: 4ad4b639:14f508389b0:-7ff6
    at com.hp.hpl.jena.rdf.model.impl.ResourceImpl.mustHaveModel(ResourceImpl.java:164)
    at com.hp.hpl.jena.rdf.model.impl.ResourceImpl.addProperty(ResourceImpl.java:277)
    at org.aksw.rdfunit.elements.writers.ResultAnnotationWriter.write(ResultAnnotationWriter.java:70)
    at org.aksw.rdfunit.tests.TestCaseAnnotation.serialize(TestCaseAnnotation.java:85)
    at org.aksw.rdfunit.tests.TestCase.serialize(TestCase.java:77)
    at org.aksw.rdfunit.tests.PatternBasedTestCase.serialize(PatternBasedTestCase.java:54)
    at org.aksw.rdfunit.utils.TestUtils.writeTestsToFile(TestUtils.java:291)
    at org.aksw.rdfunit.tests.generators.TestGeneratorExecutor.generateAutoTestsForSchemaSource(TestGeneratorExecutor.java:149)
    at org.aksw.rdfunit.tests.generators.TestGeneratorExecutor.generateTestSuite(TestGeneratorExecutor.java:106)
    at org.aksw.rdfunit.validate.cli.ValidateCLI.main(ValidateCLI.java:113)

SHACL support

This issue tracks SHACL support in RDFUnit

Scopes

  • sh:targetNode
  • sh:targetClass
  • sh:targetSubjectsOf
  • sh:targetObjectsOf

SHACL Core Constraint Components

  • sh:class
  • sh:datatype
  • sh:nodeKind
  • sh:minCount
  • sh:maxCount
  • sh:minExclusive
  • sh:minInclusive
  • sh:maxExclusive
  • sh:maxInclusive
  • sh:minLength
  • sh:maxLength
  • sh:pattern
  • sh:languageIn (does not yet match cases like @en / @en-us, only works for exact matches for now)
  • sh:uniqueLang
  • sh:equals
  • sh:disjoint
  • sh:lessThan
  • sh:lessThanOrEquals
  • sh:not
  • sh:and (partial support in top-level sh:and constraints)
  • sh:or
  • sh:xone
  • sh:node
  • sh:property
  • sh:qualifiedValueShape, sh:qualifiedMinCount, sh:qualifiedMaxCount
  • sh:closed, sh:ignoredProperties
  • sh:hasValue
  • sh:in
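The sh:languageIn limitation noted in the list above (only exact matches, so "en" does not yet cover "en-us") comes down to language-range matching. The following is a minimal sketch of the basic filtering scheme that SPARQL's langMatches uses. The class and method names are hypothetical, and this is not RDFUnit's implementation.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of basic language-range filtering
// (the scheme behind SPARQL langMatches); not RDFUnit code.
public class LanguageInCheck {

    /** True if the language tag matches the range: equal, or the range followed by '-'. */
    static boolean matchesRange(String tag, String range) {
        if (tag == null || tag.isEmpty()) {
            return false; // plain literals without a language tag never match
        }
        if (range.equals("*")) {
            return true; // wildcard range accepts any non-empty tag
        }
        String t = tag.toLowerCase(Locale.ROOT);
        String r = range.toLowerCase(Locale.ROOT);
        return t.equals(r) || t.startsWith(r + "-");
    }

    /** True if the tag matches at least one of the allowed ranges. */
    static boolean languageIn(String tag, List<String> allowedRanges) {
        for (String range : allowedRanges) {
            if (matchesRange(tag, range)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> allowed = List.of("en", "de");
        System.out.println(languageIn("en-US", allowed)); // true: "en" covers "en-US"
        System.out.println(languageIn("fr", allowed));    // false: no range matches
    }
}
```

Comparing lower-cased tags keeps the check case-insensitive, so "en-US" and "en-us" are treated alike.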

SPARQL-based Constraints

  • Almost fully supported
    • shapesGraph prebinding is not supported
    • currentShape prebinding is not supported
    • projection expressions in SPARQL queries are not yet supported, e.g. the path variable in SELECT ?this (<a> AS ?path) WHERE {...} is not yet supported (support is planned). For now, use SELECT ?this ?path WHERE { BIND(<a> AS ?path) ...} instead

SPARQL-based Constraint Components

  • Supported
    • we noticed some strange behavior in complex ASK-based validators and node constraints, which is being investigated. Simple ASK-based validators that contain only a filter clause are well supported.

In fact, all of the supported SHACL Core constraints are defined in this document as SPARQL-based Constraint Components.

Some pre-binding corner cases are not covered yet and are under development (see implementation report (to be submitted soon))

RDFUnit Demo page: undefined prefix https

I want to use the following URI for test data: https://raw.githubusercontent.com/GeoKnow/GeoQuality/master/GeoKnow-SpatialDQ/datacubes/NUTS/NUTS-metric1-6.ttl

It's a Turtle file hosted on GitHub. On your RDFUnit demo page (http://rdfunit.aksw.org/demo/) I selected the Data Selection setting "Direct Input". After clicking Load, the following error appears:

Error: Cannot read dump URI: http://rdfunit.aksw.org/CustomSource#0865b91b4e8bd1c8c885af0827b6587a15451fc6bc743ef8b66e5c32f86f6a80 Reason: [line: 1, col: 1 ] Undefined prefix: http

What am I doing wrong?

RDFUnit extendedResults are not ordered

Hi, after some tests I noticed that if rdfunit doesn't find errors, the output looks like this:

@Prefixes....

rutr:4fb97fee-2acd-11b2-8041-aa98dd5aae63
        a                          rut:TestExecution , prov:Activity ;
        rut:source                 </Users/AndreAga/Documents/Sviluppo/Progetti/UnifiedViews/Datasets/Elezioni.ttl> ;
        rut:testsError             "0"^^xsd:nonNegativeInteger ;
        rut:testsFailed            "0"^^xsd:nonNegativeInteger ;
        rut:testsRun               "1594"^^xsd:nonNegativeInteger ;
        rut:testsSuceedded         "1594"^^xsd:nonNegativeInteger ;
        rut:testsTimeout           "0"^^xsd:nonNegativeInteger ;
        rut:totalIndividualErrors  "0"^^xsd:nonNegativeInteger ;
        prov:endedAtTime           "2015-03-25T21:24:26.962Z"^^xsd:dateTime ;
        prov:startedAtTime         "2015-03-25T21:24:24.399Z"^^xsd:dateTime ;
        prov:used                  ruts:4fb97fef-2acd-11b2-8041-aa98dd5aae63 ;
        prov:wasAssociatedWith     schema: , <http://xmlns.com/foaf/0.1/> ;
        prov:wasStartedBy          <http://localhost/> .

ruts:4fb97fef-2acd-11b2-8041-aa98dd5aae63
        a               rut:TestSuite , prov:Collection ; prov:hadMember .....

But if the tool finds errors, the output looks like this:

@Prefixes...

<http://rdfunit.aksw.org/data/results#861d2bf8-2acd-11b2-80c0-aa98dd5aae63/861d62bc-2acd-11b2-80c0-aa98dd5aae63>
        a                    rut:TestCaseResult , rut:RLOGTestCaseResult , rlog:Entry , rut:ExtendedTestCaseResult , spin:ConstraintViolation ;
        rlog:level           rlog:WARN ;
        rlog:message         "http://schema.org/address is missing proper range" ;
        rlog:resource        <Via Marino Mazzacurati, 90@it> ;
        dcterms:date         "2015-03-26T22:45:15.951Z"^^xsd:dateTime ;
        rut:testCase         rutt:schema-RDFSRANGE-MISS-3f67d88cc38f44bb98e89a21653c7dc0f5227586b7b712f54cfdf78761762f2a ;
        spin:violationPath   schema:address ;
        spin:violationRoot   <Via Marino Mazzacurati, 90@it> ;
        prov:wasGeneratedBy  rutr:861d2bf8-2acd-11b2-80c0-aa98dd5aae63 .

rutr:861d2bf8-2acd-11b2-80c0-aa98dd5aae63
        a                          prov:Activity , rut:TestExecution ;
        rut:source                 </Users/AndreAga/Documents/Sviluppo/Progetti/UnifiedViews/Plugins-QualityAssessment/Q-RDFUnit/target/test-classes/Scuole.ttl> ;
        rut:testsError             "0"^^xsd:nonNegativeInteger ;
        rut:testsFailed            "3"^^xsd:nonNegativeInteger ;
        rut:testsRun               "1594"^^xsd:nonNegativeInteger ;
        rut:testsSuceedded         "1591"^^xsd:nonNegativeInteger ;
        rut:testsTimeout           "0"^^xsd:nonNegativeInteger ;
        rut:totalIndividualErrors  "1232"^^xsd:nonNegativeInteger ;
        prov:endedAtTime           "2015-03-26T22:45:16.822Z"^^xsd:dateTime ;
        prov:startedAtTime         "2015-03-26T22:45:14.576Z"^^xsd:dateTime ;
        prov:used                  ruts:861d634b-2acd-11b2-80c0-aa98dd5aae63 ;
        prov:wasAssociatedWith     <http://xmlns.com/foaf/0.1/> , schema: ;
        prov:wasStartedBy          <http://localhost/> .

<http://rdfunit.aksw.org/data/results#861d2bf8-2acd-11b2-80c0-aa98dd5aae63/861d5bea-2acd-11b2-80c0-aa98dd5aae63>
        a                    rut:ExtendedTestCaseResult , rut:RLOGTestCaseResult , rlog:Entry , spin:ConstraintViolation , rut:TestCaseResult ;
        rlog:level           rlog:WARN ;
        rlog:message         "http://schema.org/address is missing proper range" ;
        rlog:resource        <Piazza Di S. Alessio, 34@it> ;
        dcterms:date         "2015-03-26T22:45:15.95Z"^^xsd:dateTime ;
        rut:testCase         rutt:schema-RDFSRANGE-MISS-3f67d88cc38f44bb98e89a21653c7dc0f5227586b7b712f54cfdf78761762f2a ;
        spin:violationPath   schema:address ;
        spin:violationRoot   <Piazza Di S. Alessio, 34@it> ;
        prov:wasGeneratedBy  rutr:861d2bf8-2acd-11b2-80c0-aa98dd5aae63 .

Is there a way to put the "summary" at the top of the file, right after the prefixes?

Resource Filtering

While writing input/drafts for #49 we encountered issues with writing tests. The problem is that the module rdfunit-resources is required, but this won't work unless Maven has copied certain resources from the /configuration folder to the appropriate package in /rdfunit-resources/src/main/resources.

To run Unit-Tests without the requirement to invoke Maven beforehand, specific resources should be maintained in the rdfunit-resources module directly instead of /configuration.

Handle subproperties

Subproperties are not handled correctly at the moment. There are two cases:

  1. They inherit the domain and range (and other axioms) from their superproperties, so we miss some validation tests.
  2. They return false violations on cardinality constraints.

For (1) we can create additional generators that produce the same tests for their superproperties (we need to think about side effects).

For (2) we need to place property paths in the TYPRODEP, OWLCARDT and OWLCARD patterns to include subproperties in the query evaluation (but this will produce some huge queries).
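For case (1), the domains a subproperty picks up from its superproperties could be collected as in the sketch below. The class SubPropertyDomains and its map-based inputs are assumptions made for illustration. A real generator would work on the schema model rather than on plain string maps.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: walks rdfs:subPropertyOf upwards and unions the
// declared domains, so tests can also be generated for inherited axioms.
public class SubPropertyDomains {

    /**
     * @param superOf property -> its direct superproperties
     * @param domains property -> its directly declared rdfs:domain classes
     * @return declared plus inherited domains of the given property
     */
    static Set<String> effectiveDomains(String property,
                                        Map<String, Set<String>> superOf,
                                        Map<String, Set<String>> domains) {
        Set<String> result = new TreeSet<>();
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(property);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (!seen.add(current)) {
                continue; // already visited; also guards against subPropertyOf cycles
            }
            result.addAll(domains.getOrDefault(current, Set.of()));
            todo.addAll(superOf.getOrDefault(current, Set.of()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> superOf = Map.of("ex:nickname", Set.of("ex:name"));
        Map<String, Set<String>> domains = Map.of("ex:name", Set.of("ex:Person"));
        // ex:nickname declares no domain itself but inherits ex:Person from ex:name.
        System.out.println(effectiveDomains("ex:nickname", superOf, domains)); // prints [ex:Person]
    }
}
```

The same traversal would apply to ranges and other inherited axioms by swapping the second map.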

Build Error at end of run.

When a test run is completed, a build error is thrown.

[INFO SimpleTestExecutorMonitor] Test 2366/2367 returned Errors: 0 / Prevalence: 0. Test: rutt:foaf-RDFSDOMAIN-MISS-ce32ce369944d467574430f8a2c9b638b535fa0981634f7176e007322a630a9d
[INFO SimpleTestExecutorMonitor] Test 2367/2367 returned Errors: 0 / Prevalence: 0. Test: rutt:foaf-RDFSDOMAIN-MISS-892f44da16d3228153e3800c73194f447259e7e01b41f6a7bedf459635fdc446
[ERROR] BUILD ERROR

Wrong inverse functional pattern

The built-in pattern rutp:INVFUNCT in https://github.com/AKSW/RDFUnit/blob/master/rdfunit-core/src/main/resources/org/aksw/rdfunit/patterns.ttl is obviously wrong:

rut:sparqlWherePattern """ {   {
                                        ?a %%P1%% ?resource ;
                                           a %%T1%% .
                                        ?b %%P1%% ?resource ;
                                           a %%T1%% .
                                    } UNION {
                                        ?a %%P1%% ?resource ;
                                            rdfs:subClassOf+ %%T1%% .
                                        ?b %%P1%% ?resource ;
                                            rdfs:subClassOf+ %%T1%% .
                                    }
                                    FILTER (?a != ?b)
                                } """ ;

If you have a look at the second part of the UNION, neither ?a nor ?b is usually a class, and a triple pattern is missing that connects the type of ?a (resp. ?b) with the rdfs:subClassOf triple pattern, i.e. it should look more like

rut:sparqlWherePattern """ {   {
                                        ?a %%P1%% ?resource ;
                                           a %%T1%% .
                                        ?b %%P1%% ?resource ;
                                           a %%T1%% .
                                    } UNION {
                                        ?a %%P1%% ?resource ;
                                           a ?TA .
                                        ?TA rdfs:subClassOf+ %%T1%% .
                                        ?b %%P1%% ?resource ;
                                           a ?TB .
                                        ?TB rdfs:subClassOf+ %%T1%% .
                                    }
                                    FILTER (?a != ?b)
                                } """ ;

By the way, SPARQL property paths allow for a more compact notation, using / and * like

rut:sparqlWherePattern """ {   
                                 ?a %%P1%% ?resource ;
                                 rdf:type/rdfs:subClassOf* %%T1%% .
                                ?b %%P1%% ?resource ;
                                rdf:type/rdfs:subClassOf* %%T1%% .
                                FILTER (?a != ?b)
                                } """ ;
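What the pattern is meant to detect can also be sketched outside SPARQL: two distinct subjects sharing the same value for an inverse functional property. The class below is a hypothetical in-memory illustration, not RDFUnit code, and for brevity it ignores the type restriction (%%T1%%) that the pattern applies.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory check mirroring the intent of an
// inverse-functional pattern: FILTER (?a != ?b) on a shared value.
public class InverseFunctionalCheck {

    /** Minimal triple holder; all three terms are kept as plain strings. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Maps each value of the property to its subjects, keeping only values shared by 2+ subjects. */
    static Map<String, Set<String>> violations(List<Triple> triples, String property) {
        Map<String, Set<String>> subjectsByValue = new TreeMap<>();
        for (Triple t : triples) {
            if (t.predicate.equals(property)) {
                subjectsByValue.computeIfAbsent(t.object, k -> new TreeSet<>()).add(t.subject);
            }
        }
        subjectsByValue.values().removeIf(subjects -> subjects.size() < 2);
        return subjectsByValue;
    }

    public static void main(String[] args) {
        List<Triple> data = List.of(
            new Triple("ex:alice", "foaf:mbox", "mailto:a@example.org"),
            new Triple("ex:bob", "foaf:mbox", "mailto:a@example.org"),
            new Triple("ex:carol", "foaf:mbox", "mailto:c@example.org"));
        // prints {mailto:a@example.org=[ex:alice, ex:bob]}
        System.out.println(violations(data, "foaf:mbox"));
    }
}
```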

Want to see the triple with S,P,O

(Screenshot: error case)
In this screenshot I can see some properties that cause errors, but I'd like to see the offending triples in S, P, O format. For example:
S1 http://dbpedia.org/ontology/influencedBy O1
S2 http://dbpedia.org/ontology/influencedBy O2
S3 http://dbpedia.org/ontology/influencedBy O3
S4 http://dbpedia.org/ontology/influencedBy O4
It would show four triples, because the property 'http://dbpedia.org/ontology/influencedBy' causes four errors in this screenshot.
I cloned the code to my local machine and tried to change it, but the code base is too big and I have no idea where to start. Would you give me some advice?
Thank you

More transparent facilities to specify local alternatives for LOV schema locations for CLI

Currently, when I want to use, for example, a local version of the NIF core ontology (at /opt/nif-core.ttl) instead of the official version, I can declare this using:
bin/rdf-unit -s /opt-nif-core.ttl -d /opt/nif-doc-to-test.ttl

But if I use a file URL, the initialisation fails - a org.aksw.rdfunit.io.reader.RDFDereferenceReader and org.aksw.rdfunit.io.reader.RDFaReader are tried, which both seem only to expect and accept RDF/XML:
bin/rdf-unit -s file:///opt-nif-core.ttl -d /opt/nif-doc-to-test.ttl <- fails

This specific issue should probably be fixed, but even then a user who just wants to re-route a single ontology to a local work-in-progress version would not be able to use the convenient automatic LOV resolution for other vocabularies appearing in the document to validate.

I would propose a different approach that introduces a new command-line switch, --prefix-remap:

bin/rdf-unit --prefix-map 'nif:/opt-nif-core.ttl' -d /opt/nif-doc-to-test.ttl

Proposed semantics: Initialize the prefix <-> schema-uri BiMap as before, but afterwards merge the map provided on the command line into the default BiMap. This allows (as in the example scenario) easy rerouting for test purposes, and it allows specifying additional prefixes for vocabularies that have not made it to LOV yet.

The -s switch could then really just expect prefix names (as many identifiers in the code already suggest), making its semantics clearer to follow by users and less error-prone to implement.
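The proposed merge semantics could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class PrefixRemap, the "prefix:location" argument format taken from the example above, and the use of a plain Map instead of the BiMap mentioned in the proposal (so unique schema locations are not enforced here).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the proposed switch's semantics: command-line
// entries override or extend the default prefix -> schema location map.
public class PrefixRemap {

    /** Splits "prefix:location" at the first colon. */
    static Map.Entry<String, String> parse(String arg) {
        int colon = arg.indexOf(':');
        if (colon <= 0 || colon == arg.length() - 1) {
            throw new IllegalArgumentException("Expected prefix:location but got: " + arg);
        }
        return Map.entry(arg.substring(0, colon), arg.substring(colon + 1));
    }

    /** Copies the defaults, then lets each command-line entry replace or add a prefix. */
    static Map<String, String> merge(Map<String, String> defaults, String... remapArgs) {
        Map<String, String> merged = new LinkedHashMap<>(defaults);
        for (String arg : remapArgs) {
            Map.Entry<String, String> entry = parse(arg);
            merged.put(entry.getKey(), entry.getValue());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("foaf", "http://xmlns.com/foaf/0.1/");
        defaults.put("nif", "http://example.org/official-nif-core"); // placeholder default location
        Map<String, String> merged = merge(defaults, "nif:/opt-nif-core.ttl");
        System.out.println(merged.get("nif"));  // prints /opt-nif-core.ttl
        System.out.println(merged.get("foaf")); // prints http://xmlns.com/foaf/0.1/
    }
}
```

Splitting at the first colon keeps locations that themselves contain colons (such as http:// URLs) intact.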

@jimkont What do you think?

owl:complementOf Autogenerator fails on anonymous class declarations for the complement

One example of problematic schema specification from the current NIF Core ontology:

nif:lang
    a owl:ObjectProperty ;
    # omissions for brevity
    rdfs:domain [ rdf:type owl:Class ; owl:complementOf nif:Context ] .

When the autogenerator tries to generate the ?DESCRIPTION, this fails with the Jena SPARQL engine (probably because it cannot come up with a str(...) value for a bound blank node in ?T1):

    rut:sparqlGenerator """ SELECT DISTINCT ?T1 ?T2 ?DESCRIPTION WHERE {
                             ?T1 owl:complementOf ?T2 . 
                             BIND ( concat(str(?T1)," is owl:complementOf with ", str(?T2)) AS ?DESCRIPTION)} """ ;
    rut:basedOnPattern rutp:OWLDISJC ;

Resulting in the error message:
[ERROR TestAutoGenerator] No ?DESCRIPTION variable found in AutoGenerator: http://rdfunit.aksw.org/data/generators#OWLCOMPL

CLI startup needs too long

I assume that the CLI needs so much startup time because it loads a lot of data from lov.okfn.org.

This is annoying in dev mode, i.e. when I make many small changes to the ontology schema and then want to rerun the CLI to check the result.

An option to disable caching to file would also be nice, so I don't need to run
"rm data/tests" all the time.

CLI final message

2014-05-05 11:45:17,351 [org.aksw.rdfunit.Main.main()] INFO org.aksw.rdfunit.Main: Tests run: 110, Failed: 0, Timeout: 0, Error: 0. Individual Errors: 0

you could add:
"test written to file____home_sebastian_svn_NLP2RDF_ontologies_nif-core_example.owl.aggregatedTestCaseResult.*"

Check RDF directly from web demo

At the moment the web demo supports only SPARQL endpoints.

Something like a text area where users can paste RDF, or enter a URI to download
(TODO: check file size limit)

Remove the dependency to LOV for every validation

LOV is not very stable any more, which results in failed RDFUnit executions.
Instead, generate and store the vocabulary metadata offline and save it as a Java resource that is updated before each release.

Installing RDFUnit: missing documentation and errors

Installing RDFUnit is tricky. I could not find documentation on what file to run.

I tried ./build-deb-package.sh, which produced git and dch errors at first. I fixed the first error by cloning the repo instead of downloading the tar.gz release. The second error was fixed by installing the devscripts package. However, building the deb failed eventually with the maven error I've copy pasted below (section _DEB_)

Additionally, I tried running the bin scripts. At first, running ./bin/rdfunit did not work (see section _BIN/RDFUNIT_).

Afterwards, running ./bin/rdfunit-dev did work. After running the dev version, running the regular ./bin/rdfunit worked as well.

I guess either I would need a lesson in finding documentation and rtfm, or some extra documentation on how to install rdfunit is needed.

_DEB_

[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing /home/lrd900/Downloads/RDFUnit/rdfunit-validate/target/rdfunit-cli.jar with /home/lrd900/Downloads/RDFUnit/rdfunit-validate/target/rdfunit-validate-0.7.2-SNAPSHOT-shaded.jar
[INFO] Dependency-reduced POM written at: /home/lrd900/Downloads/RDFUnit/rdfunit-validate/dependency-reduced-pom.xml
[INFO] Dependency-reduced POM written at: /home/lrd900/Downloads/RDFUnit/rdfunit-validate/dependency-reduced-pom.xml
[INFO]
[INFO] --- jdeb:1.0:jdeb (default) @ rdfunit-validate ---
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 7.748s
[INFO] Finished at: Fri Sep 12 15:59:08 CEST 2014
[INFO] Final Memory: 26M/345M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.vafer:jdeb:1.0:jdeb (default) on project rdfunit-validate: Unable to parse configuration of mojo org.vafer:jdeb:1.0:jdeb for parameter mapper: Cannot find setter, adder nor field in org.vafer.jdeb.maven.Mapper for 'serialization' -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginConfigurationException

_BIN/RDFUNIT_
[ERROR] Failed to execute goal on project rdfunit-validate: Could not resolve dependencies for project org.aksw.rdfunit:rdfunit-validate:jar:0.7.2-SNAPSHOT: The following artifacts could not be resolved: org.aksw.rdfunit:rdfunit-core:jar:0.7.2-SNAPSHOT, org.aksw.rdfunit:rdfunit-resources:jar:0.7.2-SNAPSHOT: Could not find artifact org.aksw.rdfunit:rdfunit-core:jar:0.7.2-SNAPSHOT in maven.aksw.internal (http://maven.aksw.org/archiva/repository/internal) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException

CLI Connection Refused

Hi, I'm trying to execute RDFUnit via the CLI with this command:

bin/rdfunit -d /...../Dataset.rdf -o ttl

I get this error:

java.lang.RuntimeException: HttpException: org.apache.http.conn.HttpHostConnectException: Connection to http://helium.okfnlabs.org:3030 refused: Unexpected error making the query: org.apache.http.conn.HttpHostConnectException: Connection to http://helium.okfnlabs.org:3030 refused
    at org.aksw.jena_sparql_api.cache.core.QueryExecutionCacheEx.doCacheResultSet(QueryExecutionCacheEx.java:98)
    at org.aksw.jena_sparql_api.cache.core.QueryExecutionCacheEx.execSelect(QueryExecutionCacheEx.java:210)
    at org.aksw.jena_sparql_api.pagination.core.ResultSetPaginated.prefetch(ResultSetPaginated.java:97)
    at org.aksw.jena_sparql_api.pagination.core.ResultSetPaginated.prefetch(ResultSetPaginated.java:49)
    at org.aksw.commons.collections.PrefetchIterator.preparePrefetch(PrefetchIterator.java:35)
    at org.aksw.commons.collections.PrefetchIterator.getCurrent(PrefetchIterator.java:51)
    at org.aksw.commons.collections.PrefetchIterator.hasNext(PrefetchIterator.java:58)
    at org.aksw.jena_sparql_api.pagination.core.QueryExecutionIterated.execSelect(QueryExecutionIterated.java:92)
    at org.aksw.rdfunit.Utils.RDFUnitUtils.fillSchemaServiceFromLOV(RDFUnitUtils.java:138)
    at org.aksw.rdfunit.validate.cli.ValidateCLI.main(ValidateCLI.java:61)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
    at java.lang.Thread.run(Thread.java:745)
Caused by: HttpException: org.apache.http.conn.HttpHostConnectException: Connection to http://helium.okfnlabs.org:3030 refused: Unexpected error making the query: org.apache.http.conn.HttpHostConnectException: Connection to http://helium.okfnlabs.org:3030 refused
    at com.hp.hpl.jena.sparql.engine.http.HttpQuery.rewrap(HttpQuery.java:417)
    at com.hp.hpl.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:358)
    at com.hp.hpl.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:295)
    at com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execResultSetInner(QueryEngineHTTP.java:346)
    at com.hp.hpl.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:338)
    at org.aksw.jena_sparql_api.core.QueryExecutionDecoratorBase.execSelect(QueryExecutionDecoratorBase.java:75)
    at org.aksw.jena_sparql_api.delay.core.QueryExecutionDelay.execSelect(QueryExecutionDelay.java:33)
    at org.aksw.jena_sparql_api.cache.core.QueryExecutionCacheEx.doCacheResultSet(QueryExecutionCacheEx.java:76)
    ... 15 more
Caused by: org.apache.http.conn.HttpHostConnectException: Connection to http://helium.okfnlabs.org:3030 refused
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:643)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:479)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
    at org.apache.http.impl.client.DecompressingHttpClient.execute(DecompressingHttpClient.java:137)
    at org.apache.http.impl.client.DecompressingHttpClient.execute(DecompressingHttpClient.java:118)
    at org.apache.jena.riot.web.HttpOp.exec(HttpOp.java:1108)
    at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:385)
    at org.apache.jena.riot.web.HttpOp.execHttpGet(HttpOp.java:447)
    at com.hp.hpl.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:346)
    ... 21 more
Caused by: java.net.ConnectException: Operation timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
    ... 31 more
[ERROR RDFUnitUtils] Encountered error when reading schema information from LOV, schema prefixes & auto schema discovery might not work as expected
java.lang.RuntimeException: Underlying result set not avaliable - probably a query failed.
    at org.aksw.jena_sparql_api.pagination.core.QueryExecutionIterated.execSelect(QueryExecutionIterated.java:107)
    at org.aksw.rdfunit.Utils.RDFUnitUtils.fillSchemaServiceFromLOV(RDFUnitUtils.java:138)
    at org.aksw.rdfunit.validate.cli.ValidateCLI.main(ValidateCLI.java:61)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
    at java.lang.Thread.run(Thread.java:745)
[INFO  RDFUnitUtils] Loaded 0 schema declarations from LOV SPARQL Endpoint
[INFO  RDFUnitUtils] Loaded 12 schema declarations from: java.io.FileInputStream@650ebab7
[INFO  ValidateUtils] Searching for used schemata in dataset
[WARN  DatasetStatistics] Undefined namespace in LOV or schemaDecl.csv: http://www.w3.org/1999/02/22-rdf-syntax-ns#
[WARN  DatasetStatistics] Undefined namespace in LOV or schemaDecl.csv: http://linkeddata.comune.fi.it:8080/resource/data/
[WARN  DatasetStatistics] Undefined namespace in LOV or schemaDecl.csv: http://www.w3.org/2000/01/rdf-schema#
[INFO  SimpleTestExecutorMonitor] Testing /Users/AndreAga/Documents/Sviluppo/Progetti/UnifiedViews/Datasets/FirenzeSinistri.rdf
[INFO  SimpleTestExecutorMonitor] Tests run: 0, Failed: 0, Timeout: 0, Error: 0. Individual Errors: 0
[INFO  ValidateCLI] Results stored in: ../data/results/_Users_AndreAga_Documents_Sviluppo_Progetti_UnifiedViews_Datasets_FirenzeSinistri.rdf.aggregatedTestCaseResult.*

JUnit Integration

We currently use RDFUnit mainly to test whether our transformations produce RDF that complies with an underlying ontology. The current solution feeds single RDF files (effectively Jena Models) produced by an external resource to RDFUnit one by one, using a parameterized JUnit test. Because the run takes a long time (~10 minutes at the moment), it is usually executed by the CI system, where further reporting takes place (diagrams, statistics, etc.).

So far the process is working, but while integrating this into our dev-pipeline we spotted the following issues:

  • JUnit counts tests at the input-model level, i.e. the individual, auto-generated RDFUnit test cases are not represented, even though every input file leads to several RDFUnit test cases
  • this lack of granularity makes analysing issues harder because the reporting scope and context are usually too big, especially when there are more than a couple of errors per input
  • the entire setup is cumbersome and requires quite a bit of boilerplate

Solution: Integrate RDFUnit with JUnit
One solution would be a specialized JUnit runner that can be configured with the essential inputs (ontology, local controlled vocabularies, input models) and that reports the RDFUnit test cases to JUnit so that this information is not hidden. It could look like this:

@RunWith(RDFUnitJUnitRunner.class)
@Ontology(uri="http://reference.to.ontology")
public class SomeRdfTest {

    @ControlledVocabulary
    public Model controlledVocabularies() {
        Model cvModel = ...
        ...
        return cvModel;
    }

    @Input
    public List<Model> inputModels() {
        List<Model> modelsToVerify = ...
        ...
        return modelsToVerify; 
    }

    @After
    public void result(Model validationModel) {
        // do additional things on the validation results
    }       
}

Most notably, there is no @Test method, because most of the tests are generated dynamically by the runner. For the given ontology, a number of RDFUnit test cases are created per input, executed, and reported back to JUnit. Furthermore, a Model containing local controlled vocabularies can be provided, the validation model can be re-injected after the test has run, etc.
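As a rough illustration of how such a runner might discover its configuration, the sketch below declares two of the proposed annotations and reads them via reflection. Everything here is hypothetical: `Ontology`, `Input`, `RunnerSketch` and its methods are illustrative names taken from or invented for this proposal, not an existing RDFUnit API, and plain strings stand in for Jena Models to keep the example dependency-free.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical annotation from the proposal: names the ontology to test against.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Ontology {
    String uri();
}

// Hypothetical annotation from the proposal: marks the method that supplies input models.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Input {
}

final class RunnerSketch {

    /** Reads the ontology URI declared on the test class. */
    static String ontologyUri(Class<?> testClass) {
        Ontology ontology = testClass.getAnnotation(Ontology.class);
        if (ontology == null) {
            throw new IllegalArgumentException(testClass.getName() + " lacks @Ontology");
        }
        return ontology.uri();
    }

    /** Collects the names of all public methods annotated with @Input. */
    static List<String> inputMethodNames(Class<?> testClass) {
        List<String> names = new ArrayList<>();
        for (Method method : testClass.getMethods()) {
            if (method.isAnnotationPresent(Input.class)) {
                names.add(method.getName());
            }
        }
        return names;
    }
}

@Ontology(uri = "http://reference.to.ontology")
class SomeRdfTestSketch {

    @Input
    public List<String> inputModels() {
        // Strings stand in for Jena Models in this sketch.
        return Arrays.asList("model-1", "model-2");
    }
}
```

A real runner would additionally invoke the annotated methods, generate the RDFUnit test cases per input, and report each one to JUnit individually; that part is deliberately left out here.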

Not sure how RDFUnit is/should be used generally but I think this could play out nicely esp. for Unit Tests.

pc:profile testing problem

I have RDF/XML data. Everything validates fine except pc:profile, for which I get this result:

[Screenshot of the validation result]

This is the data:

 <pc:contractingAuthority>
         <gr:BusinessEntity rdf:about="http://linked.opendata.cz/resource/ted.europa.eu/business-entity/c64aa269-bc65-4159-b82e-8691a5535f25">
            <gr:legalName>Compania de Apa Oltenia SA</gr:legalName>
            <s:address>
               <s:PostalAddress rdf:about="http://linked.opendata.cz/resource/ted.europa.eu/postal-address/1a4179b0-7153-4b16-84e2-9886919352cd">
                  <s:streetAddress>Str. Brestei nr. 133</s:streetAddress>
                  <s:addressLocality>Craiova</s:addressLocality>
                  <s:postalCode>200177</s:postalCode>
                  <s:addressCountry>RO</s:addressCountry>
               </s:PostalAddress>
            </s:address>
            <pc:profile rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">www.e-licitatie.ro</pc:profile>
            <foaf:page rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">www.apaoltenia.ro</foaf:page>
            <pc:mainActivity rdf:resource="http://purl.org/procurement/public-contracts-activities#Water"/>
         </gr:BusinessEntity>
      </pc:contractingAuthority>

I believe there is no problem with the literal in the pc:profile element.

The ontology says this:

https://code.google.com/p/public-contracts-ontology/source/browse/public-contracts.ttl

pc:profile a owl:InverseFunctionalProperty, owl:DatatypeProperty ;
rdfs:subPropertyOf foaf:homepage ;
rdfs:label "Profilová stránka zadavatele"@cs, "Profile web site of contracting authority"@en ;
rdfs:comment "Vlastnost pro internetovou adresu profilové stránky zadavatele veřejných zakázek (URL). Kardinalita 0..1"@cs ;
rdfs:comment "Property for the internet address of the profile of contracting authority (URL). Cardinality 0..1"@en ;
rdfs:domain gr:BusinessEntity ;
rdfs:range xsd:anyURI ;
rdfs:isDefinedBy <http://purl.org/procurement/public-contracts> .

Thank you for your consideration of this issue.

Test Coverage Metrics

We have the rdfunit-junit integration up and running, which gives us a good overview of failing test cases, especially in conjunction with an IDE and/or CI server. If a test is "red", we can trust that something broke.

However, the problem with "green" tests is that we do not know why they are green: the data may be valid according to the test case, or there may be no data to validate at all. The latter would reduce the significance of that test (at least in the given context).

Furthermore, we are missing metrics on how much of the input data is actually covered by the test cases. The TestCoverageEvaluator looks usable for this, though it needs some elaboration: it is currently not clear what input it expects.

Request:

  • Can we figure out on a per-test-case basis whether there is data to be tested (before/after the test is run)? We could use JUnit's "test skipped" notification to give an overview of how many tests are not testing anything.
  • Could the API of TestCoverageEvaluator be elaborated?
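The first request could be sketched roughly as follows, assuming that for each test case we can obtain two counts: how many resources the test case applies to, and how many of them violate it. Both counts and all names (`CoverageSketch`, `Outcome`, `classify`) are hypothetical and not part of RDFUnit; in a JUnit 4 runner the SKIPPED outcome could presumably be mapped to `RunNotifier.fireTestIgnored`.

```java
final class CoverageSketch {

    /** Possible outcomes for one test case on one input. */
    enum Outcome { PASSED, FAILED, SKIPPED }

    /**
     * Distinguishes a meaningful pass from a vacuous one.
     *
     * @param candidates number of resources the test case applies to (assumed to be available)
     * @param violations number of those resources that violate the test case
     */
    static Outcome classify(long candidates, long violations) {
        if (violations > 0) {
            return Outcome.FAILED;
        }
        if (candidates == 0) {
            // Nothing matched the test case, so "green" would carry no information.
            return Outcome.SKIPPED;
        }
        return Outcome.PASSED;
    }

    public static void main(String[] args) {
        System.out.println(classify(0, 0));  // no data to validate
        System.out.println(classify(12, 0)); // data present and valid
        System.out.println(classify(12, 3)); // data present, some violations
    }
}
```

Reporting SKIPPED separately keeps vacuous passes out of the "green" count without turning them into failures.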

Compile error

When testing the build there is an error:
java.lang.NoClassDefFoundError: org/hamcrest/SelfDescribing
Caused by: java.lang.ClassNotFoundException: org.hamcrest.SelfDescribing

Results :

Tests in error:
PrefixNSServiceTest.initializationError » NoClassDefFound org/hamcrest/SelfDes...

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0

[INFO] ------------------------------------------------------------------------
[ERROR] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] There are test failures.

Please refer to /opt/rdfunit/rdfunit-commons/target/surefire-reports for the individual test results.
[INFO] ------------------------------------------------------------------------
[INFO] For more information, run Maven with the -e switch
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8 seconds
[INFO] Finished at: Sat Jan 23 13:26:09 CET 2016
[INFO] Final Memory: 86M/1722M
[INFO] ------------------------------------------------------------------------
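A NoClassDefFoundError for org/hamcrest/SelfDescribing typically means that JUnit 4 is on the test classpath but the Hamcrest core library it depends on is not. A quick way to confirm this is a small probe like the one below, run with the same classpath as the tests. It is only a diagnostic sketch; `ClasspathCheck` and `isLoadable` are illustrative names.

```java
final class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isLoadable(String className) {
        try {
            // 'false' avoids running static initializers; we only care about presence.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.hamcrest.SelfDescribing";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

If the probe reports that the class is not loadable, checking the resolved test dependencies of the failing module (here rdfunit-commons) for a missing or excluded Hamcrest artifact would be the next step.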
