carml-service's Introduction

Zazuko CARML Service

CARML is an implementation of the RML mapping specification, with extensions to process streams. It can be used to convert non-RDF data like XML, JSON or CSV to RDF.

This project creates a web service around the CARML RML engine, which makes it possible to use CARML as a mapping engine from non-Java/JVM projects. Via the HTTP API, one can POST mappings and sources to the service and get the resulting triples back.

At Zazuko, we use the service to scale RDF conversion of millions of XML files by integrating the carml service in our linked data pipelining framework barnard59. The step implementing this service can be found here.

If you are looking for a command-line tool, you might want to check out carml-jar.

Flavors

This project provides two flavors: a standalone Meecrowave bundle and a drop-in WAR for Tomcat.

Building

To build this project you need a standard Maven setup:

mvn clean package

This generates both the Meecrowave bundle and the drop-in WAR.

Results are available in war/target/war-1.0.0-SNAPSHOT.war and service/target/meecrowave-meecrowave-distribution.zip.

The WAR should be copied into the Tomcat webapps directory. The zip distribution contains a Meecrowave instance that can be started with bin/meecrowave.sh run.

The WAR has a test endpoint at service/test; the Meecrowave instance has its test endpoint at /test.

Service

The service at / (Meecrowave) or /service/ (WAR) expects a multipart/form-data POST with the following fields:

  • mapping: a Turtle-based RML mapping file
  • source: the source file; the supported formats are XML, CSV and JSON, indicated by the content type

Headers

  • The service supports content negotiation through the Accept header to determine the result format. If no Accept header is provided, it returns text/turtle.

curl example

To process a mapping from the command line the following curl command can be used:

curl -F mapping=@mapping.ttl -F source=@source.xml -H "Accept: text/turtle" http://localhost:8080/

Where:

  • mapping.ttl is a valid RML mapping file
  • source.xml is an XML file that is described by mapping.ttl
  • text/turtle is the requested output format
  • http://localhost:8080 is the URI where the service is listening
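The same request can be assembled from a script. The following is a minimal sketch using only the Python standard library; the function name build_carml_request, the file names inside the parts and the text/turtle content type on the mapping part are assumptions made for illustration, not something the service documents. It only builds the request, so it can be inspected before anything is sent.

```python
import uuid
import urllib.request


def build_carml_request(url, mapping, source,
                        source_type="application/xml", accept="text/turtle"):
    """Build (but do not send) a multipart/form-data POST for the service.

    mapping and source are bytes. source_type tells the service which
    format the source is in (XML, CSV or JSON).
    """
    boundary = uuid.uuid4().hex
    parts = [
        # (field name, file name, content type, payload)
        # File names and the mapping content type are assumptions.
        ("mapping", "mapping.ttl", "text/turtle", mapping),
        ("source", "source", source_type, source),
    ]
    body = b""
    for name, filename, content_type, payload in parts:
        body += (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        body += payload + b"\r\n"
    # Closing boundary marks the end of the multipart body.
    body += f"--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Accept": accept,
        },
    )
```

Passing the returned object to urllib.request.urlopen sends it; for a JSON or CSV source, source_type would be set to the matching media type.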

Results

Either an RDF file in the requested format is returned with a 200 OK status code, or an error report following Problem Details for HTTP APIs (RFC 7807) is returned with a 400 status code.
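A client usually wants to turn the error report into something short enough to log. The sketch below assumes the 400 body is a JSON object with the title and detail members defined by RFC 7807; the helper name describe_problem is hypothetical and not part of the service or of any client library.

```python
import json


def describe_problem(body):
    """Summarize a Problem Details (RFC 7807) JSON body as one line."""
    problem = json.loads(body)
    title = problem.get("title", "unknown error")
    # 'detail' may hold a multi-line Java stack trace; keep its first line only.
    detail = (problem.get("detail") or "").splitlines()
    return f"{title} ({detail[0]})" if detail else title
```

Keeping only the first line of detail is a design choice for readable logs; the full body is still worth storing when debugging.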

Notes on the stream extension

The RML spec supports file-based sources by default, and CARML extends this to use streams. This service expects a logical source that declares a stream named 'stdin'.

Example:

PREFIX rml: <http://semweb.mmlab.be/ns/rml#>
PREFIX carml: <http://carml.taxonic.com/carml/>
PREFIX rr: <http://www.w3.org/ns/r2rml#>
PREFIX ql: <http://semweb.mmlab.be/ns/ql#>

<#person>
	a rr:TriplesMap;
	rml:logicalSource [
		rml:source [
			a carml:Stream;
			carml:streamName "stdin"
		];
		rml:referenceFormulation ql:JSONPath;
		rml:iterator "$.characters[*]"
	].

If you are using the XRM plugin, set the mapping output to carml and use stdin instead of file names. The plugin will produce this mapping for you.

carml-service's People

Contributors

cristianvasquez, ktk, ludovicm67, mchlrch, semanticfire, tpluscode

carml-service's Issues

Premature end of file

Version 0.0.3 does not work reliably, at least in some of the settings we have. In GitLab pipelines it very quickly fails with this error:

info:   CARML_SERVICE: http://carml-service:8080/ {"iri":"http://example.org/pipeline/toFile","timestamp":"2023-08-11T14:07:58.037Z"}
info: created new Step {"iri":"http://example.org/pipeline/parseN3","timestamp":"2023-08-11T14:07:58.041Z"}
Error: server error: [400] Bad Request ({"title":"Error executing XPath expression.","detail":"io.carml.logicalsourceresolver.LogicalSourceResolverException: Error executing XPath expression.\n\tat io.carml.logicalsourceresolver.XPathResolver.xpathPathFlux(XPathResolver.java:154)\n\tat io.carml.logicalsourceresolver.XPathResolver.lambda$getXpathResultFlux$2(XPathResolver.java:120)\n\tat reactor.core.publisher.FluxCreate.subscribe(FluxCreate.java:95)\n\tat reactor.core.publisher.Flux.subscribe(Flux.java:8660)\n\tat reactor.core.publisher.FluxFlatMap.trySubscribeScalarMap(FluxFlatMap.java:200)\n\tat reactor.core.publisher.FluxFlatMap.subscribeOrReturn(FluxFlatMap.java:93)\n\tat reactor.core.publisher.Flux.subscribe(Flux.java:8646)\n\tat reactor.core.publisher.FluxFlatMap$FlatMapMain.onNext(FluxFlatMap.java:426)\n\tat reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:335)\n\tat reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:294)\n\tat reactor.core.publisher.FluxFlatMap$FlatMapMain.onSubscribe(FluxFlatMap.java:371)\n\tat reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:201)\n\tat reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:83)\n\tat reactor.core.publisher.Flux.subscribe(Flux.java:8660)\n\tat reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber.onComplete(FluxConcatArray.java:258)\n\tat reactor.core.publisher.FluxConcatArray.subscribe(FluxConcatArray.java:78)\n\tat reactor.core.publisher.Mono.subscribe(Mono.java:4444)\n\tat reactor.core.publisher.Mono.block(Mono.java:1733)\n\tat io.carml.engine.rdf.RdfRmlMapper.toModel(RdfRmlMapper.java:368)\n\tat io.carml.engine.rdf.RdfRmlMapper.mapToModel(RdfRmlMapper.java:351)\n\tat com.zazuko.service.carml.CarmlEndpoint.doConversion(CarmlEndpoint.java:68)\n\tat java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.base/java.lang.reflect.Method.invoke(Method.java:566)\n\tat org.apache.webbeans.intercept.AbstractInvocationContext.directProceed(AbstractInvocationContext.java:113)\n\tat org.apache.webbeans.intercept.AbstractInvocationContext.proceed(AbstractInvocationContext.java:106)\n\tat org.apache.webbeans.intercept.InterceptorInvocationContext.proceed(InterceptorInvocationContext.java:78)\n\tat org.apache.meecrowave.cxf.JAXRSFieldInjectionInterceptor.lazyInjectContexts(JAXRSFieldInjectionInterceptor.java:64)\n\tat java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.base/java.lang.reflect.Method.invoke(Method.java:566)\n\tat org.apache.webbeans.component.InterceptorBean.intercept(InterceptorBean.java:136)\n\tat org.apache.webbeans.intercept.InterceptorInvocationContext.proceed(InterceptorInvocationContext.java:65)\n\tat org.apache.webbeans.intercept.DefaultInterceptorHandler.invoke(DefaultInterceptorHandler.java:139)\n\tat com.zazuko.service.carml.CarmlEndpoint$$OwbInterceptProxy0.doConversion(com/zazuko/service/carml/CarmlEndpoint.java)\n\tat com.zazuko.service.carml.CarmlEndpoint$$OwbNormalScopeProxy0.doConversion(com/zazuko/service/carml/CarmlEndpoint.java)\n\tat java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat 
java.base/java.lang.reflect.Method.invoke(Method.java:566)\n\tat org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179)\n\tat org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96)\n\tat org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201)\n\tat org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104)\n\tat org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59)\n\tat org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96)\n\tat org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)\n\tat org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)\n\tat org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265)\n\tat org.apache.cxf.transport.servlet.ServletController.invokeDestination(ServletController.java:234)\n\tat org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:208)\n\tat org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:160)\n\tat org.apache.cxf.transport.servlet.CXFNonSpringServlet.invoke(CXFNonSpringServlet.java:225)\n\tat org.apache.cxf.transport.servlet.AbstractHTTPServlet.handleRequest(AbstractHTTPServlet.java:298)\n\tat org.apache.cxf.transport.servlet.AbstractHTTPServlet.doPost(AbstractHTTPServlet.java:217)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:681)\n\tat org.apache.cxf.transport.servlet.AbstractHTTPServlet.service(AbstractHTTPServlet.java:273)\n\tat org.apache.meecrowave.cxf.CxfCdiAutoSetup$1.doFilter(CxfCdiAutoSetup.java:122)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)\n\tat 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:197)\n\tat org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)\n\tat org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:540)\n\tat org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135)\n\tat org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)\n\tat org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:687)\n\tat org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)\n\tat org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:357)\n\tat org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:382)\n\tat org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)\n\tat org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:895)\n\tat org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1732)\n\tat org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)\n\tat org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)\n\tat org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)\n\tat org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\n\tSuppressed: java.lang.Exception: #block terminated with an error\n\t\tat reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:139)\n\t\tat reactor.core.publisher.Mono.block(Mono.java:1734)\n\t\t... 
61 more\nCaused by: javax.xml.xpath.XPathException: org.xml.sax.SAXException: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]\nMessage: Premature end of file.\njavax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]\nMessage: Premature end of file.\n\tat jlibs.xml.sax.dog.XMLDog.sniff(XMLDog.java:192)\n\tat io.carml.logicalsourceresolver.XPathResolver.xpathPathFlux(XPathResolver.java:152)\n\t... 78 more\nCaused by: org.xml.sax.SAXException: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]\nMessage: Premature end of file.\njavax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]\nMessage: Premature end of file.\n\tat io.carml.logicalsourceresolver.PausableStaxXmlReader.run(PausableStaxXmlReader.java:152)\n\tat io.carml.logicalsourceresolver.PausableStaxXmlReader.start(PausableStaxXmlReader.java:103)\n\tat io.carml.logicalsourceresolver.PausableStaxXmlReader.parse(PausableStaxXmlReader.java:75)\n\tat jlibs.xml.sax.dog.XMLDog.sniff(XMLDog.java:189)\n\t... 79 more\nCaused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]\nMessage: Premature end of file.\n\tat java.xml/com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652)\n\tat io.carml.logicalsourceresolver.PausableStaxXmlReader.run(PausableStaxXmlReader.java:150)\n\t... 82 more\n"})
    at Client.map (/builds/pipelines/fso-ech_0071-ld/node_modules/barnard59-carml-service/lib/Client.js:39:13)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /builds/pipelines/fso-ech_0071-ld/node_modules/barnard59-carml-service/transform.js:17:26

So the error seems to be "Premature end of file". This can be "fixed" by retrying often enough, but at least on the ldbar GitLab it fails more often than not. I wonder if it is related to the amount of memory/CPU assigned, that is, whether it fails less on beefy machines and more on restricted ones.

The log above is taken from this run; I had to run it many times to get one done without errors.

I propose to update to the latest carml release and, if necessary, add additional debugging support so that we can narrow down how and why the issue happens.
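Until the cause is known, the "try often enough" workaround could be automated on the client side. This is only a sketch of that workaround, not a fix and not how barnard59-carml-service behaves: call_with_retries and its parameters are hypothetical, and retrying on 400 is an assumption taken from the status in the log above (a 400 caused by a genuinely broken mapping would be retried too, pointlessly).

```python
import time
import urllib.error


def call_with_retries(send, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out.

    Only HTTP 400 responses are retried, because that is the status seen
    in the failing runs; any other error is raised immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except urllib.error.HTTPError as err:
            if err.code != 400 or attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Here send is any zero-argument callable that performs the POST, for example a lambda wrapping urllib.request.urlopen.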

proposal: accept unnamed streams

Currently, the service expects a carml:streamName "stdin" like this:

	rml:logicalSource [
		rml:source [
			a carml:Stream;
			carml:streamName "stdin"
		];
		rml:referenceFormulation ql:JSONPath;
		rml:iterator "$.characters[*]"
	];

If the name is not specified, the service returns 400.

However, CARML does not make this mandatory. Also, carml-jar expects stream sources without a streamName when processing through stdin.

It would be nice if the service accepted mappings without streamName (and 'stdin' for backward compatibility).

Motivation: to simplify CARML mappings, documentation, and XRM mappings. There would also be no mismatch between carml-jar mappings and the ones that the service expects.
