
Drafter & Drafter Client

This repo contains both the Drafter server and a version-locked Clojure client for that server.

There are two READMEs depending on what you want; choose your path.

CI in this project should build and test Drafter first, then use that Drafter to build and test the client.

License

Licensed under the Eclipse Public License 2.0

Copyright © TPXimpact Data Ltd 2023

If you modify or host Drafter, you are not permitted to use Swirrl, TPXimpact, Drafter or PublishMyData names or logos without explicit permission.

Usage expectation

Although Drafter is now released under an open licence, we expect that, at least for now, it is probably not that useful to anyone who isn't also using other components of the hosted PublishMyData service. This may change over time.


Drafter's Issues

Use SPARQL Service Description vocab in Drafter's state graph

We should use the SPARQL Service Description vocab to describe graphs, endpoints etc. in drafter.

In particular the following vocab terms are relevant, and could be used to remodel basically all of drafter's state graph:

  • sd:endpoint
  • sd:GraphCollection
  • sd:availableGraphs
  • sd:namedGraph
  • etc...

Additionally this spec covers how to discover SPARQL endpoints over REST etc., which might also be a good thing for us to support in the future and expose on PMD.
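
For illustration, a state-graph description using these terms might look like the following Turtle sketch (the URIs are placeholders, not our actual naming scheme):

@prefix sd: <http://www.w3.org/ns/sparql-service-description#> .

<http://example.org/sparql/live> a sd:Service ;
    sd:endpoint <http://example.org/sparql/live> ;
    sd:availableGraphs [
        a sd:GraphCollection ;
        sd:namedGraph [ sd:name <http://example.org/graph/some-dataset> ]
    ] .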

drafter displays ring error page with a NullPointerException when you send it a GIF

java.lang.NullPointerException

media.clj:67        pantomime.media/fn
media.clj:49        pantomime.media/fn[fn]
io.clj:341          grafter.rdf.io/mimetype->rdf-format
jobs.clj:121        drafter.rdf.draft-management.jobs/append-data-to-graph-from-file-job
drafts_api.clj:51   drafter.routes.drafts-api/draft-api-routes[fn]
core.clj:99         compojure.core/make-route[fn]
core.clj:45         compojure.core/if-route[fn]
core.clj:30         compojure.core/if-method[fn]
core.clj:112        compojure.core/routing[fn]
core.clj:2515       clojure.core/some
core.clj:112        compojure.core/routing
RestFn.java:139     clojure.lang.RestFn.applyTo
core.clj:626        clojure.core/apply
core.clj:117        compojure.core/routes[fn]
core.clj:112        (trace truncated)

updating a draft with cat.gif causes the job status endpoint to 500

Stacktrace below. Was expecting a 200 with a JSON body containing the error.

com.fasterxml.jackson.core.JsonGenerationException
Cannot JSON encode object of class: class java.lang.NullPointerException: java.lang.NullPointerException
generate.clj:147    cheshire.generate/generate
generate.clj:119    cheshire.generate/generate
core.clj:32 cheshire.core/generate-string
core.clj:19 cheshire.core/generate-string
format_response.clj:118 ring.middleware.format-response/wrap-format-response[fn]
multipart_params.clj:118    ring.middleware.multipart-params/wrap-multipart-params[fn]
validation.clj:155  noir.validation/wrap-noir-validation[fn]
cookies.clj:72  noir.cookies/noir-cookies[fn]
cookies.clj:156 ring.middleware.cookies/wrap-cookies[fn]
session.clj:148 noir.session/noir-flash[fn]
flash.clj:35    ring.middleware.flash/wrap-flash[fn]
session.clj:99  noir.session/noir-session[fn]
session.clj:98  ring.middleware.session/wrap-session[fn]
Var.java:379    clojure.lang.Var.invoke
resource.clj:25 ring.middleware.resource/wrap-resource[fn]
file_info.clj:69    ring.middleware.file-info/wrap-file-info[fn]
reload.clj:22   ring.middleware.reload/wrap-reload[fn]
stacktrace.clj:23   ring.middleware.stacktrace/wrap-stacktrace-log[fn]
stacktrace.clj:86   ring.middleware.stacktrace/wrap-stacktrace-web[fn]
jetty.clj:20    ring.adapter.jetty/proxy-handler[fn]
(Unknown Source)    ring.adapter.jetty.proxy$org.eclipse.jetty.server.handler.AbstractHandler$ff19274a.handle
HandlerWrapper.java:116 org.eclipse.jetty.server.handler.HandlerWrapper.handle
Server.java:369 org.eclipse.jetty.server.Server.handle
AbstractHttpConnection.java:486 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest
AbstractHttpConnection.java:933 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete
AbstractHttpConnection.java:995 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete
HttpParser.java:644 org.eclipse.jetty.http.HttpParser.parseNext
HttpParser.java:235 org.eclipse.jetty.http.HttpParser.parseAvailable
AsyncHttpConnection.java:82 org.eclipse.jetty.server.AsyncHttpConnection.handle
SelectChannelEndPoint.java:668  org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle
SelectChannelEndPoint.java:52   org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run
QueuedThreadPool.java:608   org.eclipse.jetty.util.thread.QueuedThreadPool.runJob
QueuedThreadPool.java:543   org.eclipse.jetty.util.thread.QueuedThreadPool$3.run
Thread.java:744 java.lang.Thread.run

PUT /draftset/{id}/submit

  • Route name is wrong: it needs to be /submit, not /offer.
  • The spec is wrong: this should be a POST, not a PUT; the code's choice
    of verb is correct. Update the spec to POST.

Test In Memory Drafter Instance

From a kanban ticket by @Robsteranium:

"An in memory Instance of Drafter would be useful for testing pipelines.

This would allow draft based context testing e.g. reads from drafts vs live and integration testing of pipelines"

Draft endpoint performance

Some queries take far longer against the draft endpoint than the live one. I have a few hundred triples in the state graph, and the query below takes <0.1s on live but ~5s on draft (with union-with-live = true):

SELECT DISTINCT ?uri
WHERE {
  GRAPH <http://data.hampshirehub.net/graph/concept-scheme/folders> {
    ?folder <http://publishmydata.com/def/ontology/folder/inTree> ?uri .
    ?folder ?fp ?fo .
  }
}

Perfect draftsets & rewriting

There be dragons when data in the database references a graph for which there is a draft in the draftset. In these circumstances, queries that refer to that graph/data don't work properly.

There are a few known cases of this, and several potential approaches to fixing it. This ticket is for describing the problem, assembling known examples where rewriting isn't perfect, proposing a solution, and then actually fixing it.

... TODO write this ticket up a bit more clearly...

User Database & test data

  • The test user names in test-edn.example (and in the test data) being
    the same as the role names is confusing.
  • Falling back to a default set of users feels dangerous in production.
    I would rather it failed and forced you to create the test-user file,
    to prevent misconfiguration security bloopers.
  • Remove default-users from the clj file.

Drafter blocks requests (locking up) when it has more than 20 concurrent connections to the remote SPARQL service

Drafter's SPARQLRepository connection pool is configured by default for 20 concurrent connections. Once these connections are all in use, new requests for connections block forever, waiting for one to become available.

Apache HttpClient (which Sesame uses) has a parameter, http.connection-manager.timeout, that puts a timeout on acquiring a connection from the pool - so incoming queries/updates to the remote database will wait up to the timeout for a pooled connection before raising an exception.

I'd suggest we set this timeout to something like a second: the odds are that if you have to wait for a connection, Stardog has locked up or is too busy anyway, so we should just 503 the request rather than contribute more load to the problem.

Unfortunately in Sesame 2.7.16 we don't have access to the underlying Apache HttpClient to set these options. It looks like we do in Sesame 2.8.6 and 4.x, so upgrading Drafter to Sesame 2.8.x is a prerequisite for setting this option.
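
For concreteness, a sketch of what this could look like once we're on Sesame 2.8.x (this assumes SPARQLRepository there accepts an externally built HttpClient; setConnectionRequestTimeout is the HttpClient 4.3+ builder equivalent of the legacy http.connection-manager.timeout parameter):

(import '(org.apache.http.client.config RequestConfig)
        '(org.apache.http.impl.client HttpClients)
        '(org.openrdf.repository.sparql SPARQLRepository))

;; Sketch only, not the final implementation.
(defn pooled-repo [query-url]
  (let [config (-> (RequestConfig/custom)
                   ;; wait ~1s for a pooled connection, then fail (-> 503)
                   (.setConnectionRequestTimeout 1000)
                   (.build))
        client (-> (HttpClients/custom)
                   (.setDefaultRequestConfig config)
                   (.setMaxConnTotal 20) ; matches the current pool size
                   (.build))
        repo   (SPARQLRepository. query-url)]
    (.setHttpClient repo client)
    repo))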

Draftset REST API

Implement a RESTful draftset API according to:

http://api.grafter.org/scratch/swagger/#/

  • First class draftset query endpoints
  • Copies implicit on delete/append
  • Draftset JSON objects
  • Publish / Reject / Submit / Claim etc... draftset workflow
  • Deletions via empty draft graphs

Protocol for drafter backends

In order to merge the "sesame native" and "stardog remote" branches we are proposing to define a protocol for the drafter API which can be implemented for different backends. Initially it will be specific to the functionality required by the drafter server; however, drafter client defines various protocols for drafter functionality, so eventually the goal is to merge these into a single drafter.protocols library which both will implement as required.

Drafter is currently tied quite heavily to the sesame repository API so another goal of the modular backend is to move this explicit dependency into the sesame-specific backend(s).

There are two main parts to the API - the various SPARQL endpoints (raw, live, draft, state) and the 'job' functionality for creating, populating and publishing draft graphs.

Job API

Below are the proposed job API functions to add to the protocol.

create-managed-graph

create-managed-graph(uri: URI, metadata: Map[String, String] = Map.empty): Unit
Defines a new managed graph in the state graph. This function already exists in the drafter client.

create-draft-graph

create-draft-graph(live-uri: URI, metadata: Map[String, String] = Map.empty): URI
This function exists in the drafter client; however, the drafter server version expects live-uri to already exist as a managed graph, while the client documentation does not specify this restriction.

migrate-graphs-to-live

def migrate-graphs-to-live(draftGraphs: Seq[URI]): Job
Migrates a collection of draft graphs to live. Note that the implementations of this function in the sesame-native and stardog-remote branches are quite different.

delete-graph

def deleteGraph(graphUri: URI, contentsOnly: Boolean = false): Job
Deletes a managed graph and optionally its entry in the state graph. This is called drop-graph in the drafter client protocols but it is not yet implemented there.

append-data-to-graph-job

append-data-to-graph-job(graphUri: URI, statements: Seq[Statement], metadata: Map[String, String] = Map.empty): Job
Appends the given data and optional metadata to a managed graph. Called append-data! in drafter client.

update-metadata

updateMetadata(graphs: Seq[URI], metadata: Map[String, String]): Unit
Updates the metadata associated with a collection of managed graphs. Called assoc-metadata in drafter client although that only allows a single graph to be specified.

delete-metadata

delete-metadata(graphs: Seq[URI], metaKeys: Seq[String]): Unit
Deletes the named metadata keys associated with a collection of managed graphs. Called dissoc-metadata in drafter client although that only allows a single graph to be specified.
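
Pulled together, the job API above might translate into a Clojure protocol along these lines (a sketch only; names and arities are illustrative, not a final design):

(defprotocol DrafterJobAPI
  (create-managed-graph [backend uri] [backend uri metadata])
  (create-draft-graph [backend live-uri] [backend live-uri metadata])
  (migrate-graphs-to-live [backend draft-graph-uris])
  (delete-graph [backend graph-uri] [backend graph-uri contents-only?])
  (append-data-to-graph-job [backend graph-uri statements metadata])
  (update-metadata [backend graph-uris metadata])
  (delete-metadata [backend graph-uris meta-keys]))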

SPARQL API

The SPARQL API should be separated from the Sesame Repository API.

update-restricted

update-restricted(query: String, restrictions: Set[URI] = Set.empty): Unit
Submit an update statement with a given graph restriction.

Querying

The query API currently makes reference to various concrete sesame repository types to do content negotiation. Initially this dependency could be broken by providing three functions in the query API:

def prepareQuery(query: String, restrictions: Set[URI]): TPrep
def getQueryType(preparedQuery: TPrep): QueryType
def getResultStreamer(query: TPrep, format: ResultFormat): (OutputStream => ())

where QueryType indicates whether the query is an ASK, CONSTRUCT, or SELECT query.
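
In Clojure these three functions might be sketched as a protocol like this (names illustrative):

(defprotocol DrafterQueryAPI
  (prepare-query [backend query restrictions])
  ;; => one of :ask, :construct or :select
  (query-type [backend prepared-query])
  ;; => a function of an OutputStream that writes the serialised results
  (result-streamer [backend prepared-query format]))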

Better query rewriting

We should ideally support rewriting from String -> AST -> rewritten AST -> String, i.e. we should rewrite via the AST and convert back to a string, so the approach is portable and will work against a better database.

The original plan was to fix up sesame's SparqlQueryRenderer to support SPARQL 1.1, but it looks like that's probably not a priority for anyone other than us.

There are a number of options available to us:

  1. Try using Jena ARQ to do the rewriting via its AST, and convert it back into SPARQL again. Initial API investigations indicate this could be possible.
  2. Use SPIN to convert a SPARQL query into RDF, then rewrite the query in RDF, and render it back out again. https://openrdf.atlassian.net/browse/SES-1840

I think 1 is probably more likely to work in the immediate future.
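
A quick feasibility check for option 1 is just a parse/render round-trip through ARQ's AST (a sketch assuming Jena 3.x package names; a real rewriter would transform the Query object between the two steps):

(import '(org.apache.jena.query QueryFactory Syntax))

;; Parse a SPARQL 1.1 string into ARQ's AST and render it back out.
(defn round-trip [query-str]
  (str (QueryFactory/create query-str Syntax/syntaxSPARQL_11)))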

If we stick with sesame (which seems unlikely) we should probably rewrite the code so that we rewrite queries in a documented way. Michael Grove alerted us on the sesame-users list that we're doing this in an unsupported manner (by effectively assuming that prepareQuery returns a SailQuery, not a Query).

See also additional conversations here: https://groups.google.com/forum/#!topic/sesame-users/mqVQk7pivUs

Upgrade to Sesame 2.8.x

We're still using an old sesame version in drafter and are consequently being left behind; we already have to maintain our own fork to get the memory-leak patch I created.

We need to upgrade at least to 2.8.x and possibly to 4.x (though that may be problematic, as 4.x uses Java 8 streams to represent sequences of quads, and Java streams might not map as neatly onto Clojure lazy sequences as old-school Java iterators do).

Also #61 depends on this being done.

GET /draftset/{id}/data

  • Doesn't appear to support quad formats ('Accept header required with
    MIME type of RDF format to return').
  • Vulnerable to a potential SPARQL query injection attack: the graph
    parameter needs to be sanitised.

Tidy state on publish (to ensure deleted graphs aren't referenced)

This isn't a big problem right now, but it might lead to problems later if we ever query the state graph for something like a list of live graphs and expect them all to contain something. So we should fix this to ensure the state graph is actually in sync with the data.

The problem will become apparent in the following workflow:

  1. A user makes a draft d of a live graph l that contains some data.
  2. The user deletes the contents of the draft graph (leaving its state graph entry alone). The intent here is to schedule a deletion into live.
  3. The user makes the draft live
  4. Drafter replaces live graph l with the empty draft d, effectively performing the deletion.

This is all great; however, we are now left with an entry for the managed graph l in the state graph, even though that graph has effectively been deleted. Ideally we should keep the state graph in sync with reality.

So make-live! should check, via an ASK query, whether the graph we just made live has any contents. If it doesn't, and iff the managed graph for l is not referenced by any other open drafts, then its state graph entry should be removed.
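
Spelled out, with <l> standing in for the live graph URI, the emptiness check is just:

ASK { GRAPH <l> { ?s ?p ?o } }

If this returns false, and no other open draft references l's managed graph entry, the entry can be removed.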

Drafter doesn't seem to report DB errors when batching triples into drafts

From the discussion below it sounds like the error occurred and was swallowed somewhere inside sesame/grafter. Probably this line - we should take a look:

https://github.com/Swirrl/drafter/blob/master/src/drafter/backend/sesame/common/draft_management.clj#L10

See chat below:

[13:14]
ricroberts Hi @RickMoynihan
[13:14]
I’ve been running an upload of postcodes to sg-sandbox
[13:14]
and something weird happened
[13:14]
it was happily batching writes:
[13:15]
ricroberts added a Plain Text snippet
2016-02-05 12:37:06,443 INFO write-scheduler :: Writer waiting for tasks
2016-02-05 12:37:06,443 INFO write-scheduler :: Executing job #swirrl_server.async.jobs.Job{:id #uuid "a75e8363-9ea5-4dc5-a230-86df28d8ed2f", :priority :batch-write, :time 1454675826443, :function #<core$partial$fn__4234 clojure.core$partial$fn__4234@586168ef>, :value-p #<core$promise$reify__6363@1b96733b: :pending>}
2016-02-05 12:37:06,452 INFO draft-api :: Adding a batch of triples to repoclojure.lang.LazySeq@5bf48648
2016-02-05 12:37:08,545 INFO middleware 1401082658 :: REQUEST /status/finished-jobs/a75e8363-9ea5-4dc5-a230-86df28d8ed2f /; q=0.5, application/xml {}
2016-02-05 12:37:08,545 INFO middleware 1401082658 :: RESPONSE 404 took 0ms
2016-02-05 12:37:08,549 INFO middleware 1177801622 :: REQUEST /status/finished-jobs/a75e8363-9ea5-4dc5-a230-86df28d8ed2f /; q=0.5, application/xml {}
2016-02-05 12:37:08,549 INFO middleware 1177801622 :: RESPONSE 404 took 0ms
2016-02-05 12:37:08,755 INFO write-scheduler :: Queueing job: #swirrl_server.async.jobs.Job{:id #uuid "a75e8363-9ea5-4dc5-a230-86df28d8ed2f", :priority :batch-write, :time 1454675828755, :function #<core$partial$fn__4234 clojure.core$partial$fn__4234@61a84658>, :value-p #<core$promise$reify__6363@1b96733b: :pending>}
2016-02-05 12:37:08,756 INFO write-scheduler :: Writer waiting for tasks
2016-02-05 12:37:08,756 INFO write-scheduler :: Executing job #swirrl_server.async.jobs.Job{:id #uuid "a75e8363-9ea5-4dc5-a230-86df28d8ed2f", :priority :batch-write, :time 1454675828755, :function #<core$partial$fn__4234 clojure.core$partial$fn__4234@61a84658>, :value-p #<core$promise$reify__6363@1b96733b: :pending>}
2016-02-05 12:37:08,767 INFO draft-api :: Adding a batch of triples to repoclojure.lang.LazySeq@7ba04909
2016-02-05 12:37:11,048 INFO write-scheduler :: Queueing job: #swirrl_server.async.jobs.Job{:id #uuid "a75e8363-9ea5-4dc5-a230-86df28d8ed2f", :priority :batch-write, :time 1454675831048, :function #<core$partial$fn__4234 clojure.core$partial$fn__4234@4c0d8a28>, :value-p #<core$promise$reify__6363@1b96733b: :pending>}
2016-02-05 12:37:11,049 INFO write-scheduler :: Writer waiting for tasks
2016-02-05 12:37:11,049 INFO write-scheduler :: Executing job #swirrl_server.async.jobs.Job{:id #uuid "a75e8363-9ea5-4dc5-a230-86df28d8ed2f", :priority :batch-write, :time 1454675831048, :function #<core$partial$fn__4234 clojure.core$partial$fn__4234@4c0d8a28>, :value-p #<core$promise$reify__6363@1b96733b: :pending>}
2016-02-05 12:37:11,057 INFO draft-api :: Adding a batch of triples to repoclojure.lang.LazySeq@bc9f7a79
[13:15]
ricroberts but since 12:37 it stopped adding batches
[13:15]
and the checks for whether it finished just return 404
[13:15]
ricroberts added a Plain Text snippet
2016-02-05 12:37:13,239 INFO middleware 2037890086 :: REQUEST /status/finished-jobs/a75e8363-9ea5-4dc5-a230-86df28d8ed2f /; q=0.5, application/xml {}
2016-02-05 12:37:13,239 INFO middleware 2037890086 :: RESPONSE 404 took 0ms
[13:15]
ricroberts There’s no completed log entry
[13:16]
😕
[13:17]
drafter hasn’t restarted
[13:18]
i’ll try restarting it and see what state it got to
[13:18]
(unless you want me to check anything first?) @RickMoynihan
[13:19]
there’s virtually no cpu activity
[13:23]
rickmoynihan Sorry rick was just writing this up: https://openrdf.atlassian.net/browse/SES-2369
[13:23]
ricroberts np
[13:23]
rickmoynihan parsing your woes now...
[13:24]
hmmm weird
[13:24]
are there any exceptions or errors anywhere?
[13:25]
ricroberts no
[13:25]
i restarted drafter / killed the job in pmd
[13:25]
not all the data got thru
[13:26]
rickmoynihan did any go in?
[13:26]
ricroberts about 80% of it
[13:26]
by the look of it
[13:26]
50k postcodes out of 65k
[13:27]
rickmoynihan @leekitching: might have some ideas - I think the batching code has been tweaked a few times
[13:28]
ricroberts it’s running build_449
[13:28]
if that makes a diff!
[13:28]
rickmoynihan ta
[13:29]
will figure out what commit its at
[13:29]
ricroberts it was taking ages (longer than i expected) so i looked at drafter
[13:29]
rickmoynihan https://github.com/Swirrl/drafter/tree/f38195a347e490dc31bdfd8219cb0e48a8f01c76
[13:33]
do we know if stardog barfed somewhere?
[13:35]
ricroberts will look
[13:38]
rickmoynihan one change has been the child-job stuff lee did - not sure how that stuff works
[13:42]
ricroberts after restarting drafter now getting errors on every sparql request
[13:42]
rickmoynihan stardog?
[13:42]
ricroberts yeah
[13:43]
looks like it
[13:43]
ricroberts added a Plain Text snippet
Feb 05, 2016 1:41:08 PM com.complexible.common.protocols.server.rpc.ServerHandler exceptionCaught
SEVERE: exceptionCaughtServerHandler
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.(DirectByteBuffer.java:123)
(snippet truncated: 38 more lines collapsed)
[13:43]
rickmoynihan so do we think stardogs queries started erroring and drafter failed - but failed to fail properly?
[13:43]
ricroberts it was happily loading postcodes then….
[13:44]
ricroberts added a Plain Text snippet
Feb 05, 2016 12:37:20 PM com.complexible.stardog.protocols.server.SPECServerFunction query
SEVERE: There was an error executing query: INSERT DATA
{
...

}
java.lang.RuntimeException: com.complexible.stardog.plan.eval.operator.OperatorException: There was a fatal failure during preparation of 1d33a1e4-0681-44be-881d-9cd9058cb884 org.openrdf.rio.RDFParseException: Unable to allocate 4.1K bytes, direct memory exhausted
at com.complexible.common.iterations.TransformException.wrapException(TransformException.java:66)
at com.complexible.common.iterations.TransformException.hasNext(TransformException.java:90)
at com.complexible.common.iterations.Iterations.each(Iterations.java:339)
at com.complexible.common.iterations.Iterations.consume(Iterations.java:417)
at com.complexible.stardog.plan.eval.QueryEngine.executeUpdate(QueryEngine.java:155)
at com.complexible.stardog.query.DefaultQueryFactory$UpdateQueryImpl.execute(DefaultQueryFactory.java:311)
at com.complexible.stardog.query.DefaultQueryFactory$UpdateQueryImpl.execute(DefaultQueryFactory.java:290)
at com.complexible.stardog.StardogKernel$DelegatingUpdateQuery.execute(StardogKernel.java:3817)
at com.complexible.stardog.StardogKernel$SecuredUpdateQuery.execute(StardogKernel.java:3712)
at com.complexible.stardog.StardogKernel$SecuredUpdateQuery.execute(StardogKernel.java:3699)
at com.complexible.stardog.protocols.server.SPECServerFunction.query(SPECServerFunction.java:511)
at com.complexible.stardog.protocols.server.SPECServerFunction.handleMessage(SPECServerFunction.java:147)
at com.complexible.common.protocols.server.rpc.ServerHandler.handleMessage(ServerHandler.java:247)
at com.complexible.common.protocols.server.rpc.ServerHandler.channelRead(ServerHandler.java:146)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:32)
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:299)
at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.complexible.stardog.plan.eval.operator.OperatorException: There was a fatal failure during preparation of 1d33a1e4-0681-44be-881d-9cd9058cb884 org.openrdf.rio.RDFParseException: Unable to allocate 4.1K bytes, direct memory exhausted
at com.complexible.stardog.plan.eval.operator.impl.UpdateSequenceOperatorImpl.computeNext(UpdateSequenceOperatorImpl.java:88)
at com.complexible.stardog.plan.eval.operator.impl.UpdateSequenceOperatorImpl.computeNext(UpdateSequenceOperatorImpl.java:27)
at com.complexible.common.collect.AbstractSkippingIteration.tryToComputeNext(AbstractSkippingIteration.java:125)
at com.complexible.common.collect.AbstractSkippingIteration.hasNext(AbstractSkippingIteration.java:110)
at com.complexible.stardog.plan.eval.operator.util.AutoCloseOperator.computeNext(AutoCloseOperator.java:92)
at com.complexible.stardog.plan.eval.operator.util.AutoCloseOperator.computeNext(AutoCloseOperator.java:27)
at com.complexible.common.collect.AbstractSkippingIteration.tryToComputeNext(AbstractSkippingIteration.java:125)
at com.complexible.common.collect.AbstractSkippingIteration.hasNext(AbstractSkippingIteration.java:110)
at com.complexible.common.iterations.TransformException.hasNext(TransformException.java:87)
... 19 more
Caused by: com.complexible.stardog.db.DatabaseException: There was a fatal failure during preparation of 1d33a1e4-0681-44be-881d-9cd9058cb884 org.openrdf.rio.RDFParseException: Unable to allocate 4.1K bytes, direct memory exhausted
at com.complexible.stardog.db.DatabaseConnectionImpl.commit(DatabaseConnectionImpl.java:350)
at com.complexible.stardog.db.DelegatingTransactionalConnectableConnection.commit(DelegatingTransactionalConnectableConnection.java:68)
at com.complexible.stardog.plan.eval.operator.impl.UpdateSequenceOperatorImpl.computeNext(UpdateSequenceOperatorImpl.java:71)
... 27 more
Caused by: com.complexible.tx.api.HeuristicRollbackException: There was a fatal failure during preparation of 1d33a1e4-0681-44be-881d-9cd9058cb884 org.openrdf.rio.RDFParseException: Unable to allocate 4.1K bytes, direct memory exhausted
at com.complexible.tx.api.impl.DefaultTransaction.doRollback(DefaultTransaction.java:625)
at com.complexible.tx.api.impl.DefaultTransaction.runPreparePhase(DefaultTransaction.java:446)
at com.complexible.tx.api.impl.DefaultTransaction.commit(DefaultTransaction.java:336)
at com.complexible.stardog.db.DatabaseConnectionImpl.commit(DatabaseConnectionImpl.java:342)
... 29 more
[13:45]
ricroberts (added a bit more to the start of that last error)
[13:45]
rickmoynihan ok
[13:46]
maybe stardog failed to fail properly in that case
[13:46]
ricroberts hrm
[13:46]
but weird that drafter thought it was still going
[13:46]
rickmoynihan or maybe it did fail properly and drafter failed to fail gracefully
[13:46]
yeah thats what I mean
[13:47]
ricroberts :simple_smile:
[13:48]
gonna try it again
[13:48]
btw, this was the first load i did after successfully loading 900M triples via data add (stardog command line)
[13:49]
rickmoynihan https://github.com/Swirrl/drafter/blob/better-errors/src/drafter/backend/sesame/common/draft_management.clj#L10
[13:50]
ricroberts did it again and it worked in <1 min
[13:50]
rickmoynihan ↖️ this line calls into grafter -> sparql-repository - the way errors are handled in that stack is a little trixy
[13:50]
so I wouldn't be surprised if there was a problem there

Vocabulary use in state graph

  • Refactor dcterms:creator to be a mailto: URI rather than a string containing an email address. This will let us add attributes to users in RDF in the future without having to migrate the drafter state graph (example below).
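
For example (placeholder draft URI and address), the change is from a literal to a mailto: URI:

@prefix dcterms: <http://purl.org/dc/terms/> .

# before
<http://example.org/draftgraph/1> dcterms:creator "editor@example.org" .

# after
<http://example.org/draftgraph/1> dcterms:creator <mailto:editor@example.org> .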

Implement Drafter server in terms of the Drafter protocols

It would be beneficial to implement Drafter server in terms of the protocols defined in the Drafter client project to avoid duplication of functionality between the server and client. It would also provide a migration path for adding functionality into the server which is currently emulated in the client e.g. Draftsets.

Drafter Server backend protocols

#47 has implemented multiple configurable 'backends' for the Drafter server which are responsible for implementing the functionality exposed by the HTTP API. There are two main protocols implemented by the backends which are related to the external API: DraftManagement and ApiOperations. ApiOperations methods map closely to the external HTTP API functionality while DraftManagement methods usually deal with low-level graph management details. ApiOperations methods commonly use DraftManagement methods in their implementations.

Drafter server backends also implement some Grafter protocols for querying/updating such as ITripleReadable, ITripleWriteable and ISPARQLable. Again, it is common for the higher-level protocols to use these in their implementations.

Drafter Client protocols

The protocols in the Drafter Client project are defined in an object-oriented style, in that they depend on the context of the receiver. For example the IPublishable/publish! method is defined for a record encapsulating a set of draft graphs to publish. These differ from the server protocols, which take the context as an explicit parameter (the receiver being the backend implementing the operation). Given these differences, below is an approximate mapping between the client protocol methods and the corresponding server protocol methods:

client method                           server method
IDraftCreate/create-draft-graph!        ApiOperations/new-draft-job
IDrafterGraph/live-graph                DraftManagement/get-live-draft-for-graph
IDrafterGraph/append-data               ApiOperations/append-data-to-graph-job
IDrafterGraph/drop-graph!               ApiOperations/delete-graph-job
IDrafterGraph/append-data-from-live!    ApiOperations/copy-from-live-graph-job
IPublishable/publish!                   ApiOperations/migrate-graphs-to-live
IDrafterMetadata/dissoc-metadata        ApiOperations/delete-metadata-job
IDrafterMetadata/assoc-metadata         ApiOperations/update-metadata-job

Implementation

As described above, the main difference between the client and server protocols is that the 'context' (usually the graph or set of graphs to operate on) is the receiver in the client protocols, while in the server protocols the receiver is the backend implementing the operation. The server could therefore be implemented in terms of the client protocols by adding a layer in the HTTP API which constructs the context from the incoming HTTP request. The implementation for each context would then delegate to the current backend to carry out the operation.

Draftsets

The Drafter client describes Draftsets as a first-class concept, while the server does not and deals only with graphs. However many of the backend protocol operations take a collection of draft graphs to operate on, so Draftsets could be managed at a level higher than that of individual backends.

Undelete delete draft graph contents

@RicSwirrl and @asacalow told us to remove replace, and we did. But it turns out that we need to delete graph contents, which was previously implemented as a replace with an empty graph.

This is good though, because deleting by replacing with an empty file was a bit weird; it would be nicer to be explicit, and it needs to use the new batched delete stuff.

So we need to reimplement delete draft graph contents on drafter's REST API.

It should work via a HTTP request to:

DELETE /draft?graph=<draft-graph-uri>

Should only delete the contents of the graph.
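
e.g. (host and draft graph URI are placeholders; the graph parameter should be URL-encoded):

curl -X DELETE 'http://localhost:3001/draft?graph=http%3A%2F%2Fexample.org%2Fgraph%2Fdraft-1'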

Merge with master (for Better Error Objects)

Drafter currently returns HTML dumps of stacktraces in most cases, which is both hard to read as a PMD developer and impossible to parse programmatically.

We should:

  • Not dump HTMLized stacktraces in production
  • Serve error objects as JSON where applicable
  • Provide a better infrastructure for mapping exceptions to specific error objects etc

Live endpoint perf

This query is really slow against Live, but quick against Raw (on hants). Try it here:

http://hantshub-dev.publishmydata.com:3001/live

PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#Concept>

SELECT * {
  SELECT DISTINCT ?uri ?label WHERE {
    GRAPH <http://data.hampshirehub.net/graph/skills-and-qualifications/pupil-attainment-at-key-stage-2-by-location-of-pupil-residence> {
      ?obs qb:dataSet <http://data.hampshirehub.net/data/skills-and-qualifications/pupil-attainment-at-key-stage-2-by-location-of-pupil-residence> .
      ?obs <http://opendatacommunities.org/def/ontology/time/refPeriod> ?uri .
    }
    OPTIONAL { ?uri rdfs:label ?label . }
  }
  ORDER BY ?uri
}


Draftset Metadata

  • Implement PUT /draftset/:id for setting metadata.
  • Save display-name and description on the draftset & return them with every draftset model.

When creating or updating a draft graph we should:

  • Set a created-at timestamp on individual draft graphs.
  • Update the modified timestamp on individual draft graphs in a draftset when data is appended to them.
  • Update the modified timestamp on delete of data.
  • Update the modified timestamp on delete of a graph.

When publishing we should:

  • Leave the created-at time if the graph is already published; if the graph doesn't exist, use the draft graph's created-at time.
  • Set the modified timestamp on live graphs to the draft graph's modified timestamp.
  • Set a published-at timestamp to the publication time (the start of the make-live transaction).

PUT /draftset/:id/data

  • Returns a 400 error: "content type required."
  • Doesn't appear to handle any quad data formats (though code exists for it).
    • It looks in the wrong place for the content type, because the file-loading middleware requires a multipart form submission (see the example request below).
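
For reference, the raw-body request the spec presumably intends looks something like this (host, draftset id and filename are placeholders):

curl -X PUT 'http://localhost:3001/draftset/{id}/data' \
     -H 'Content-Type: application/n-quads' \
     --data-binary @data.nq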

Make cloning faster

The scenario is you have a live graph l with 1 million triples in it, and you want to create a draft of it and append a few more triples into the draft.

This operation takes a really long time. For example, the following code using drafter client takes just 70 seconds to build a draftset of 1m triples. Then, after publishing it live, making a new draft from it and appending just 1 triple takes 24.6 minutes!

I forgot to time the publish step, but it completed in one or two minutes (probably about 70s):

drafter.client> (time (do
                        (def c (->client "http://localhost:3001/"))

                        (def ds (create-draft-set! c))

                        (def ds (add ds one-million))))
"Elapsed time: 70471.931472 msecs"
;; => #'drafter.client/ds
drafter.client> @(publish! ds)
;; => true
drafter.client> (time (do
                        (def c (->client "http://localhost:3001/"))

                        (def ds (create-draft-set! c))

                        (def ds (add ds (take 1 one-million)))))
"Elapsed time: 1477135.562665 msecs"

I haven't definitely pinpointed the root cause in a profiler, but it looks like the problem could be to do with buffer sizes.

To clone and then append data into a draft, Drafter essentially creates a massive lazy sequence which in pseudocode looks like:

(lazy-cat (select-all-live-triples) (new-triples-from-file-upload))

You can see the real code for this here. This sequence is then applied in batches.

I suspect reads and writes to sesame are being interleaved, because the lazy seq of results reads in chunks of only 32 statements at a time. But my profiler seemed to show a lot of time spent calling put! on the blocking queue between reads and writes, which suggests execution is somewhere else altogether, so this might not be the case.

We should profile properly and fix.
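
If the 32-statement read chunks are the culprit, one cheap experiment (a sketch, not the actual fix; write-batch! stands in for whatever appends a batch to the draft) is to re-chunk the sequence into larger write batches:

(defn write-in-batches [write-batch! statements]
  ;; partition-all realises the lazy seq in larger chunks, so reads from
  ;; sesame aren't interleaved with writes 32 statements at a time
  (doseq [batch (partition-all 5000 statements)]
    (write-batch! batch)))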

PUT /draftset/:id/data

  • Route has the wrong verb: it appears to require a POST, not a PUT.
  • Returns a 400 error: "content type required."
  • Doesn't appear to handle any quad data formats.

Add a clone-live-graph operation to the API

I've just realised we don't have a clone-live-graph operation on the API yet - we should add one - though you can achieve the same effect at the moment by adding and then removing a temporary triple.

Should look something like:

PUT /draftset/{id}/copy?graph=http://live-graph.com/

  • Add to swagger spec
  • Implement route

Make drafter logging not suck

Drafter's logging, like the Java ecosystem's, is a bit of a mess.

Basically drafter itself uses clojure.tools.logging with clj-logging-config, because we want to be able to configure our logging regardless of which logger the underlying libraries use (on the JVM some libraries use JUL, others Log4j/Logback etc...). Additionally, some of our libraries/apps use timbre for logging, and for some reason we've started using timbre in drafter too.

The problem with the current setup is that the timbre logs don't get configured by this mechanism, and almost certainly won't make it out into our log file on the server.

I believe it should be possible to wire up anything logged through timbre to actually log through our backend; most of the logging libraries use SLF4J as a facade anyway, so we might be able to use https://github.com/fzakaria/slf4j-timbre too.
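
A sketch of the slf4j-timbre wiring in project.clj (the artifact coordinates are real; the "x.y.z" versions are placeholders to pin to real releases). The idea is to route every legacy API through SLF4J, and SLF4J into timbre, so one timbre config governs all output:

:dependencies [[com.fzakaria/slf4j-timbre "x.y.z"]   ; SLF4J -> timbre
               [org.slf4j/log4j-over-slf4j "x.y.z"]  ; Log4j -> SLF4J
               [org.slf4j/jul-to-slf4j "x.y.z"]      ; JUL -> SLF4J (also needs SLF4JBridgeHandler/install at startup)
               [org.slf4j/jcl-over-slf4j "x.y.z"]]   ; commons-logging -> SLF4J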

An alternative might be just to avoid timbre in drafter altogether.

Delete/Insert with ?g works against Raw but not Live

This update query works against the raw endpoint, but not against the live one:

PREFIX pmd: <http://publishmydata.com/def/dataset#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>

DELETE { 
  GRAPH ?g { ?a pmd:downloadURL ?old_url } 
} INSERT { 
  GRAPH ?g { ?a pmd:downloadURL ?new_url }
} WHERE {
  GRAPH ?g {
    ?a pmd:downloadURL ?old_url ;
       pmd:fileName ?f ;
       .

    BIND(IRI(fn:concat("http://data.hampshirehub.net/downloads/file?id=", ?f)) AS ?new_url)
  }
}

User Database

In order to authenticate users and manage draftset workflows we need to have access to the user records that PMD uses.

This layer need only implement a minimal read-only subset of the user database.

  • Implement a user abstraction behind a protocol that drafter can use
    for common queries against users, e.g. user-exists?, valid-api-key?,
    user-role etc. (sketched below)
  • Implement a Mongo adapter
  • Implement an in-memory database adapter (nice to have for testing etc.)
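
A sketch of the protocol (method names are illustrative, not final):

(defprotocol UserRepository
  "Read-only view of the PMD user database."
  (user-exists? [repo username])
  (valid-api-key? [repo username api-key])
  (user-role [repo username]))

Both the Mongo adapter and an in-memory map-backed adapter would then implement this.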

See ticket #63 also about implementing basic auth security.

Async Jobs Error Handling

Async jobs continue after failing with a java.lang.Error: we currently only catch and handle Exceptions, not Errors.

Errors should also fail the job and deliver an error message (through the async mechanism) to the client who submitted it.
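
Concretely, the fix is to widen the job runner's catch from Exception to Throwable (a sketch; complete-job! is a hypothetical stand-in for whatever delivers the result through the async mechanism):

(defn run-job! [job complete-job!]
  (try
    ((:function job))
    ;; catching Throwable (not Exception) means Errors also fail the job
    (catch Throwable t
      (complete-job! job {:type :error
                          :message (.getMessage t)}))))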

DELETE /draftset/{id}/data parse error

When running e.g.

curl -X DELETE -F "[email protected];type=application/n-triples" --header 'Accept: application/json' 'http://localhost:3001/draftset/e6d1c936-3895-475a-89e3-104c36b6ebc3/data?graph=http://flibble.com/' --basic -u publisher
java.lang.IllegalArgumentException: No matching method found: parse for class org.openrdf.rio.ntriples.NTriplesParser
                   Reflector.java:80 clojure.lang.Reflector.invokeMatchingMethod
                   Reflector.java:28 clojure.lang.Reflector.invokeInstanceMethod
                       sesame.clj:17 drafter.rdf.sesame/parse-stream-statements
                draftsets_api.clj:95 drafter.routes.draftsets-api/read-rdf-file-handler[fn]
                draftsets_api.clj:77 drafter.routes.draftsets-api/rdf-file-part-handler[fn]
                draftsets_api.clj:67 drafter.routes.draftsets-api/restrict-to-draftset-owner[fn]
                draftsets_api.clj:61 drafter.routes.draftsets-api/existing-draftset-handler[fn]
                   middleware.clj:58 drafter.middleware/require-authenticated[fn]
                   middleware.clj:38 buddy.auth.middleware/wrap-authentication[fn]
                   middleware.clj:72 buddy.auth.middleware/wrap-authorization[fn]
                        core.clj:113 compojure.core/make-route[fn]
                        core.clj:103 compojure.core/wrap-route-middleware[fn]
                         core.clj:41 compojure.core/if-route[fn]
                         core.clj:27 compojure.core/if-method[fn]
                        core.clj:127 compojure.core/routing[fn]
                       core.clj:2570 clojure.core/some
                        core.clj:127 compojure.core/routing
                     RestFn.java:139 clojure.lang.RestFn.applyTo
                        core.clj:632 clojure.core/apply
                        core.clj:132 compojure.core/routes[fn]
                        core.clj:127 compojure.core/routing[fn]
                       core.clj:2570 clojure.core/some
                        core.clj:127 compojure.core/routing
                     RestFn.java:139 clojure.lang.RestFn.applyTo
                        core.clj:632 clojure.core/apply
                        core.clj:132 compojure.core/routes[fn]
                        verbs.clj:27 ring.middleware.verbs/wrap-verbs[fn]
                   middleware.clj:32 drafter.middleware/template-error-page[fn]
                   middleware.clj:23 drafter.middleware/log-request[fn]
                   middleware.clj:39 noir.util.middleware/wrap-request-map[fn]
               keyword_params.clj:35 ring.middleware.keyword-params/wrap-keyword-params[fn]
            multipart_params.clj:117 ring.middleware.multipart-params/wrap-multipart-params[fn]
                       params.clj:64 ring.middleware.params/wrap-params[fn]
           absolute_redirects.clj:36 ring.middleware.absolute-redirects/wrap-absolute-redirects[fn]
                 content_type.clj:30 ring.middleware.content-type/wrap-content-type[fn]
                 not_modified.clj:52 ring.middleware.not-modified/wrap-not-modified[fn]
                   middleware.clj:12 hiccup.middleware/wrap-base-url[fn]
               format_params.clj:113 ring.middleware.format-params/wrap-format-params[fn]
               format_params.clj:113 ring.middleware.format-params/wrap-format-params[fn]
             format_response.clj:174 ring.middleware.format-response/wrap-format-response[fn]
                  validation.clj:155 noir.validation/wrap-noir-validation[fn]
                      cookies.clj:72 noir.cookies/noir-cookies[fn]
                     cookies.clj:161 ring.middleware.cookies/wrap-cookies[fn]
                     session.clj:158 noir.session/noir-flash[fn]
                        flash.clj:35 ring.middleware.flash/wrap-flash[fn]
                     session.clj:109 noir.session/noir-session[fn]
                     session.clj:102 ring.middleware.session/wrap-session[fn]
                        Var.java:379 clojure.lang.Var.invoke
                     resource.clj:28 ring.middleware.resource/wrap-resource[fn]
                    file_info.clj:69 ring.middleware.file-info/wrap-file-info[fn]
                       reload.clj:22 ring.middleware.reload/wrap-reload[fn]
                   stacktrace.clj:23 ring.middleware.stacktrace/wrap-stacktrace-log[fn]
                   stacktrace.clj:86 ring.middleware.stacktrace/wrap-stacktrace-web[fn]
                        jetty.clj:24 ring.adapter.jetty/proxy-handler[fn]
                    (Unknown Source) ring.adapter.jetty.proxy$org.eclipse.jetty.server.handler.AbstractHandler$ff19274a.handle
              HandlerWrapper.java:97 org.eclipse.jetty.server.handler.HandlerWrapper.handle
                     Server.java:497 org.eclipse.jetty.server.Server.handle
                HttpChannel.java:310 org.eclipse.jetty.server.HttpChannel.handle
             HttpConnection.java:257 org.eclipse.jetty.server.HttpConnection.onFillable
         AbstractConnection.java:540 org.eclipse.jetty.io.AbstractConnection$2.run
           QueuedThreadPool.java:635 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob
           QueuedThreadPool.java:555 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run
                     Thread.java:745 java.lang.Thread.run
10:50:37,840 ERROR server               :: Unhandled REPL handler exception processing message {:op init-debugger, :print-level 10, :print-length 10, :session d8e6a35c-deb6-448c-81a3-b69dd8135a71, :id 7}
java.net.SocketException: Socket closed
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at clojure.tools.nrepl.transport$bencode$fn__18638.invoke(transport.clj:103)
    at clojure.tools.nrepl.transport.FnTransport.send(transport.clj:28)
    at clojure.tools.nrepl.middleware.pr_values$pr_values$fn$reify__18995.send(pr_values.clj:27)
    at cider.nrepl.middleware.debug$debugger_send.doInvoke(debug.clj:148)
    at clojure.lang.RestFn.invoke(RestFn.java:421)
    at cider.nrepl.middleware.debug$initialize.invoke(debug.clj:325)
    at cider.nrepl.middleware.debug$wrap_debug$fn__21995.invoke(debug.clj:353)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at drafter.repl$eval36185$fn__36186$fn__36188.invoke(form-init6774542758411843841.clj:1)
    at cider.nrepl.middleware.inspect$wrap_inspect$fn__21729.invoke(inspect.clj:108)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at clojure.tools.nrepl.middleware.session$add_stdin$fn__19153.invoke(session.clj:238)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at cider.nrepl.middleware.track_state$wrap_tracker$fn__26326.invoke(track_state.clj:206)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at clojure.tools.nrepl.middleware.load_file$wrap_load_file$fn__19194.invoke(load_file.clj:79)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at cider.nrepl.middleware.undef$wrap_undef$fn__26353.invoke(undef.clj:30)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at cider.nrepl.middleware.out$wrap_out$fn__25465.invoke(out.clj:96)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at cider.nrepl.middleware.pprint$wrap_pprint$fn__25499.invoke(pprint.clj:51)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at clojure.tools.nrepl.middleware.pr_values$pr_values$fn__18992.invoke(pr_values.clj:22)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at cider.nrepl.middleware.test$wrap_test$fn__25774.invoke(test.clj:240)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at cider.nrepl.middleware.stacktrace$wrap_stacktrace$fn__21807.invoke(stacktrace.clj:176)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at clojure.tools.nrepl.middleware.session$session$fn__19138.invoke(session.clj:192)
    at clojure.tools.nrepl.middleware$wrap_conj_descriptor$fn__18809.invoke(middleware.clj:22)
    at clojure.tools.nrepl.server$handle_STAR_.invoke(server.clj:19)
    at clojure.tools.nrepl.server$handle$fn__19209.invoke(server.clj:28)
    at clojure.core$binding_conveyor_fn$fn__4444.invoke(core.clj:1916)
    at clojure.lang.AFn.call(AFn.java:18)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Exploit Stardog Performance

We should undo some of the batching we do on deletes/moves etc. to take advantage of Stardog's enhanced performance on drop/move:

  • drop graph
  • Move/rename graph (this is easy to code, as the old function that did
    it is still in the code base; we just need to check the performance is
    OK or better on big datasets). See the SPARQL below.
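
For reference, the SPARQL 1.1 Update operations we'd lean on (graph URIs are placeholders):

# drop a draft graph outright
DROP SILENT GRAPH <http://example.org/graph/draft-123> ;
# rename/move a graph in one operation
MOVE <http://example.org/graph/draft-123> TO <http://example.org/graph/live-1>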

Upgrade to Jena ARQ 3.0

We're currently locked to a specific snapshot of Jena 3.0 that's hosted in our private repo.

We only use ARQ for query rewriting.

We should probably switch to the finalised 3.0 release of Jena when we find time. It shouldn't be a big task: just update the project.clj and check the tests all still pass.
