eclipse-rdf4j / rdf4j
Eclipse RDF4J: scalable RDF for Java
Home Page: https://rdf4j.org/
License: BSD 3-Clause "New" or "Revised" License
(Migrated from https://openrdf.atlassian.net/browse/SES-2209 )
Federated queries like the following won't return bound values for variable assignments defined in the SERVICE graph pattern.
{code}
SELECT ?s ?bindingS ?now
WHERE {
SERVICE <http://dbpedia.org/sparql> {
?s a ?someType .
BIND (?s as ?bindingS)
BIND (now() as ?now)
}
} LIMIT 1
{code}
We have yet to file a CQ for inclusion of the ElasticSearch library.
We need to remove references to 'Sesame' from the UI, and the logo as well.
(Migrated from SES-2161)
There's a query parse error with the following query:
{code}
insert data {
<urn:alpha> <urn:beta> """\U0001F61F""" .
}
{code}
I am fairly certain this is a valid query; from what I can grok of the spec, that Unicode escape sequence is correct. ARQ also bombs out on this query, though, which leaves me with some doubt.
(Migrated from https://openrdf.atlassian.net/browse/SES-2178 )
As Jeen pointed out in http://stackoverflow.com/questions/28415722/sparql-1-1-entailment-regimes-and-query-with-from-clause , the inferencing store implementations predate the http://www.w3.org/TR/sparql11-entailment/ recommendation.
It would be nice to have this recommendation fulfilled for more complete SPARQL 1.1 support.
We need to stabilize the build - several things fail after merging in the final sync with the old Sesame repo.
The goal would be to provide the old package hierarchy with all classes deprecated and marked as (empty) subclasses of their equivalents.
We are currently focusing on the Sesame 4 code base as the launch point for RDF4J. However, there are several core users who need to stay on Java 7 for a while longer. We should consider bringing over the Sesame 2.9 code base to RDF4J to live alongside the main branch, so we can do parallel releases for those users who wish to stick with Java 7.
We need to migrate open JIRA issues from our old JIRA issue tracker to GitHub. Anybody know any good tools for this?
The current RDF4J Javadoc is massive and quite hard to navigate. We should try to simplify it to make it easier to browse. Things to think of:
(Migrated from https://openrdf.atlassian.net/browse/SES-2191 )
We should provide utility methods to allow checking that a given string is a valid IRI according to RFC3987.
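A minimal sketch of what such a utility could look like; the class and method names are hypothetical, and java.net.URI (which implements RFC 2396/3986 rather than RFC 3987) is used only as a rough first-pass check that a real implementation would replace with proper RFC 3987 character-range validation:
{code}
import java.net.URI;
import java.net.URISyntaxException;

/**
 * Hypothetical utility for IRI validation. A real RFC 3987 implementation
 * would validate the international character ranges itself; java.net.URI
 * is used here only as a rough approximation.
 */
public class IRIUtil {

    public static boolean isValidIRI(String iri) {
        if (iri == null || iri.isEmpty()) {
            return false;
        }
        try {
            // require an absolute IRI, i.e. one with a scheme
            return new URI(iri).isAbsolute();
        }
        catch (URISyntaxException e) {
            return false;
        }
    }
}
{code}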
Either (temporarily) using ci.rdf4j.net, or looking into using an Eclipse-hosted environment
(Migrated from https://openrdf.atlassian.net/browse/SES-2218 )
To guard against server-side out-of-memory (OOM) errors (especially when processing SPARQL update requests), the RDF4J Server should support a configurable size limit for the request payload.
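A minimal sketch of one way to enforce such a limit, as a servlet filter that rejects oversized requests up front; the system property name and the use of a filter are assumptions for illustration, not the actual RDF4J Server configuration:
{code}
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/**
 * Sketch of a configurable payload size limit. The system property name
 * and default (-1 = disabled) are hypothetical.
 */
public class PayloadLimitFilter implements Filter {

    private long maxPayloadSize;

    @Override
    public void init(FilterConfig config) {
        maxPayloadSize = Long.getLong("rdf4j.server.maxPayloadSize", -1L);
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException
    {
        // Content-Length is -1 for chunked requests; those would need a
        // counting stream wrapper instead of this simple check.
        long contentLength = ((HttpServletRequest) req).getContentLengthLong();
        if (maxPayloadSize > 0 && contentLength > maxPayloadSize) {
            // fail fast with 413 instead of risking an OOM later on
            ((HttpServletResponse) resp).sendError(HttpServletResponse.SC_REQUEST_ENTITY_TOO_LARGE,
                    "Request payload exceeds configured limit");
            return;
        }
        chain.doFilter(req, resp);
    }

    @Override
    public void destroy() {
    }
}
{code}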
These all need to be modified to refer to the correct new package naming.
(Migrated from https://openrdf.atlassian.net/browse/SES-2234 )
This query:
{code}
PREFIX ex: <ex:>
ASK WHERE {
?this ex:score ?score .
FILTER (!(?score+5 != 0)) .
}
{code}
produces the following algebra expression:
{code}
Slice ( limit=1 )
Filter
Not
Compare (!=)
MathExpr (+)
Var (name=score)
ValueConstant (value="+5"^^http://www.w3.org/2001/XMLSchema#integer)
ValueConstant (value="0"^^http://www.w3.org/2001/XMLSchema#integer)
StatementPattern
Var (name=this)
Var (name=_const-313ecd0b-uri, value=ex:score, anonymous)
Var (name=score)
{code}
The value constant representing the integer 5 incorrectly has a '+' sign prepended - presumably because the parser incorrectly processes the + math operator as part of the integer value.
Although this causes no problems in normal operation of the SPARQL engine, it is an issue in work by [~pulquero] on a SPIN engine.
(Migrated from https://openrdf.atlassian.net/browse/SES-2175 )
If a Statement has an associated context, it is ignored by the SPARQLConnection#add method. The user needs to provide an explicit context to the add method to make it work.
The cause is inside SPARQLConnection#createInsertDataCommand: this method simply ignores the Statement's associated context.
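A sketch of the kind of fix needed: when serializing statements into the INSERT DATA body, a statement's own context should be wrapped in a GRAPH clause. The method below is a simplified stand-in for the actual private helper, assuming the new org.eclipse.rdf4j package names:
{code}
import org.eclipse.rdf4j.model.Resource;
import org.eclipse.rdf4j.model.Statement;
import org.eclipse.rdf4j.rio.ntriples.NTriplesUtil;

/**
 * Simplified stand-in for the serialization step inside
 * SPARQLConnection#createInsertDataCommand, honouring the statement's
 * own context by wrapping it in a GRAPH clause.
 */
public class InsertDataSketch {

    static void appendStatement(StringBuilder qb, Statement st) {
        Resource context = st.getContext();
        if (context != null) {
            // note: assumes the context is an IRI; a bnode context would
            // need separate handling
            qb.append("GRAPH ").append(NTriplesUtil.toNTriplesString(context)).append(" { ");
        }
        qb.append(NTriplesUtil.toNTriplesString(st.getSubject())).append(' ');
        qb.append(NTriplesUtil.toNTriplesString(st.getPredicate())).append(' ');
        qb.append(NTriplesUtil.toNTriplesString(st.getObject())).append(" .");
        if (context != null) {
            qb.append(" }");
        }
        qb.append('\n');
    }
}
{code}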
(Migrated from https://openrdf.atlassian.net/browse/SES-2226 )
Example taken from http://www.datypic.com/sc/xsd/t-xsd_anyURI.html
new java.net.URI("http://datypic.com#f% rag")
throws "java.net.URISyntaxException: Malformed escape pair at index 20: http://datypic.com#f% rag"
whereas:
XMLDatatypeUtil.isValidValue("http://datypic.com#f% rag", XMLSchema.ANYURI)
returns true.
Looking at the source for isValidValue, there is no case to validate XMLSchema.ANYURI - is this deliberate or simply an omission?
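For illustration, a minimal sketch of what an ANYURI branch could look like if it simply delegated to java.net.URI as the example above suggests; whether xsd:anyURI (which XML Schema defines very leniently) should be validated this strictly is exactly the open question:
{code}
import java.net.URI;
import java.net.URISyntaxException;

/**
 * Sketch of a possible ANYURI validation branch delegating to
 * java.net.URI, matching the behaviour of the example above.
 */
public class AnyURIValidationSketch {

    static boolean isValidAnyURI(String value) {
        try {
            new URI(value.trim());
            return true;
        }
        catch (URISyntaxException e) {
            return false;
        }
    }
}
{code}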
(Migrated from https://openrdf.atlassian.net/browse/SES-2206 )
SPARQL 1.1 constructs cannot be rendered via the SparqlQueryRenderer class.
For example, if I parse this query with QueryParserUtil.parseQuery:
SELECT * WHERE {
?s ?p ?o .
BIND(uri("http://test-graph.com/") AS ?foo) .
}
And then render the ParsedQuery back out again with SPARQLQueryRenderer, it appears to lose the binding clause, returning the following string:
select ?g ?s ?p ?o ?g2
where {
GRAPH ?g {
?s ?p ?o.
}}
Looking at renderTupleExpr in SparqlTupleExprRenderer I can see that the following lines are commented out:
// aRenderer.mProjection = new ArrayList<ProjectionElemList>(mProjection);
// aRenderer.mDistinct = mDistinct;
// aRenderer.mReduced = mReduced;
// aRenderer.mExtensions = new HashMap<String, ValueExpr>(mExtensions);
// aRenderer.mOrdering = new ArrayList<OrderElem>(mOrdering);
// aRenderer.mLimit = mLimit;
// aRenderer.mOffset = mOffset;
With the following commented out in SPARQLQueryRenderer:
// SPARQL does not support this, its an artifact of copy and
// paste from the serql stuff
// aQuery.append(mRenderer.getExtensions().containsKey(aElem.getSourceName())
// ?
// mRenderer.renderValueExpr(mRenderer.getExtensions().get(aElem.getSourceName()))
// : "?"+aElem.getSourceName());
//
// if (!aElem.getSourceName().equals(aElem.getTargetName()) ||
// (mRenderer.getExtensions().containsKey(aElem.getTargetName())
// &&
// !mRenderer.getExtensions().containsKey(aElem.getSourceName())))
// {
// aQuery.append(" as ").append(mRenderer.getExtensions().containsKey(aElem.getTargetName())
// ?
// mRenderer.renderValueExpr(mRenderer.getExtensions().get(aElem.getTargetName()))
// : aElem.getTargetName());
// }
I believe these lines are commented out in error and that they should be commented back in, in order to be able to round-trip queries from SPARQL text into the AST and back out again.
Other SPARQL 1.1 queries that fail include:
SELECT (COUNT (*) as ?c) WHERE { ?s ?p ?o }
is rendered as
select ?c where { ?s ?p ?o }
and the query
SELECT (?p as ?x) WHERE { ?s ?p ?o }
is rendered as
select ?p WHERE { ?s ?p ?o }
(Migrated from https://openrdf.atlassian.net/browse/SES-2189 )
This is simple to reproduce. I installed RDF4J into Tomcat and created a new in-memory repository called "test".
Add the following triples:
<http://example.org/a> <http://example.org/value> 1 .
<http://example.org/b> <http://example.org/value> 2 .
Running this query returns the value "3" as expected.
SELECT (SUM(?value) AS ?total) {
?s <http://example.org/value> ?value
}
Now, create a second in-memory repository called "test2".
Running this query from that repository returns a blank value.
SELECT ?total {
SERVICE <http://localhost:8080/openrdf-sesame/repositories/test> {{
SELECT (SUM(?value) AS ?total) {
?s <http://example.org/value> ?value
}
}}
}
By turning on debug logging, I was able to see the query being sent to "test".
[DEBUG] 2015-02-27 11:38:31,682 [http-bio-8080-exec-7] path info: /test
[DEBUG] 2015-02-27 11:38:31,682 [http-bio-8080-exec-7] repositoryID is 'test'
[DEBUG] 2015-02-27 11:38:31,682 [http-bio-8080-exec-7] queryLn="SPARQL"
[DEBUG] 2015-02-27 11:38:31,682 [http-bio-8080-exec-7] query="PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX sesame: <http://www.openrdf.org/schema/sesame#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX fn: <http://www.w3.org/2005/xpath-functions#> SELECT ?s ?value WHERE { {
SELECT (SUM(?value) AS ?total) {
?s <http://example.org/value> ?value
}
} }"
[DEBUG] 2015-02-27 11:38:31,682 [http-bio-8080-exec-7] infer="true"
Scrolling to the right, you can see that although ?value is included in the projection, ?total from within the aggregation is not. As a workaround, I added an additional inner select to ensure ?total is projected:
SELECT ?total {
SERVICE <http://localhost:8080/openrdf-sesame/repositories/test> {{
SELECT ?total {
{
SELECT (SUM(?value) AS ?total) {
?s <http://example.org/value> ?value
}
}
}
}}
}
We need to make a decision on what version number to use for the initial RDF4J release. There are, roughly, three options: continue the existing Sesame version numbering, move up to a new major version number, or start fresh at 1.0.
Advantage of the first option is that it's a more 'gradual' transition. Potential downside is that it suggests it's not the first Eclipse RDF4J release.
Advantage of the second option is that it is more clear that there may be compatibility problems between the last Sesame release and the first RDF4J release.
Advantage of the last option is that we get to start fresh. Downside is that it's not obvious how this release relates to existing Sesame releases.
No matter what we choose, we will always need to provide accompanying upgrade notes anyway.
We have yet to file a CQ for inclusion of the Solr library.
(Migrated from https://openrdf.atlassian.net/browse/SES-2229 )
The fix for SES-1995 is less than ideal when dealing with a results page coming directly from the query page POSTing a long query (roughly more than 1k characters): it requires a workaround of saving the long query on the server.
However, the query text is actually present in the cookies, along with the other parameters needed to specify the query. These cookies could be copied into a hidden form at page load; the Download link would then perform its request as a form POST, getting around the URL character limit.
(Migrated from https://openrdf.atlassian.net/browse/SES-2248 )
Hi Jeen,
I could reproduce the behaviour for [https://openrdf.atlassian.net/browse/SES-2099] with a much simpler query; you'll see that each result binds ?ct01 to a different bnode.
SELECT * WHERE {
BIND (bnode() as ?ct01)
{ SELECT ?s WHERE {
?s ?p ?o .
}
LIMIT 10
}
}
If I'm not mistaken, the query should be equivalent to this one, which actually works as expected:
SELECT * WHERE {
BIND (bnode() as ?ct01)
?s ?p ?o .
}
LIMIT 10
meaning the algebra should first create a SingletonSet, then extend it with the BIND, and only then do the join, so the variable ?ct01 should be bound to the same bnode for each result of the subquery.
So it seems that evaluating the subquery first (which is indeed required by the recommendation) does not respect the evaluation or join order of the preceding graph patterns.
Possible candidates are RDFException or RDF4JException. I personally prefer the first since it's shorter. OpenRDFException should remain as a deprecated class for backward compatibility.
The SPIN compliance tests severely slow down the build (these tests alone take almost 45 minutes to run on our HIPP), and moreover they are unstable: in several builds the testOrderByQueriesAreInterruptable test intermittently fails.
We should temporarily disable these compliance tests from the normal build process and only (manually) execute them when changes are made to the SPIN modules.
The current Maven configuration still relies on old Sesame project settings for syncing with Sonatype OSS (and from there to Maven Central). This needs to be tweaked/reconfigured according to what Eclipse projects do for Maven artifact deployment.
Basic housecleaning aimed at getting the maven lifecycle to run more smoothly in combination with M2E.
(Migrated from https://openrdf.atlassian.net/browse/SES-2185 )
When uploading a file through the "Add RDF" screen, the (autodetect) option is supposed to determine the correct format and select the right parser. However, this does not work: in the current system, for any format other than RDF/XML, file upload with autodetect results in the error "Content is not allowed in prolog. [line 1, column 1]".
Only after explicitly selecting the correct format from the dropdown does file upload work.
Eclipse recommends that contributors be shown or directed to the following text when attempting to make a contribution:
Before your contribution can be accepted by the project, you need to create and
electronically sign the Eclipse Foundation Contributor License Agreement (CLA) and sign
off on the Eclipse Foundation Certificate of Origin.
For more information, please visit
http://wiki.eclipse.org/Development_Resources/Contributing_via_Git
This can be done for GitHub issues and pull requests by adding a file to the repository named either CONTRIBUTING or CONTRIBUTING.md:
https://help.github.com/articles/setting-guidelines-for-repository-contributors/
(Migrated from https://openrdf.atlassian.net/browse/SES-2194 )
Since RepositoryConnection now extends AutoCloseable, it is a valid candidate for use in try-with-resources.
We should simplify our internal code based on this to reduce the number of finally blocks that we need to maintain.
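For example, a minimal sketch of the new idiom (assuming the new org.eclipse.rdf4j package names):
{code}
import org.eclipse.rdf4j.model.ValueFactory;
import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;

public class TryWithResourcesExample {

    static void addData(Repository repo) {
        // the connection is closed automatically, even if add() throws,
        // replacing the old try/finally { con.close(); } idiom
        try (RepositoryConnection con = repo.getConnection()) {
            ValueFactory vf = con.getValueFactory();
            con.begin();
            con.add(vf.createIRI("http://example.org/a"),
                    vf.createIRI("http://example.org/value"),
                    vf.createLiteral(1));
            con.commit();
        }
    }
}
{code}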
The test evaluates the handling of a limit on a subselect in a larger CONSTRUCT query. The failure is specific to the FederationSail, as the same test succeeds on other store types.
(Migrated from https://openrdf.atlassian.net/browse/SES-2250 )
When using a BIND variable in a pattern join, the result is a cross join of the dataset triples when the BIND expression raises a type error.
This query should expose the behavior on any store that contains blank nodes.
{code}
SELECT *
WHERE {
?s ?p ?o .
FILTER(isBlank(?o))
BIND (iri(?o) as ?s2)
?s2 ?p2 ?o2 .
} LIMIT 10
{code}
The join evaluation should normally conclude that the two multisets are incompatible, since ?s2 is unbound in the join's left argument, so the query should return no results.
Creation of the SDK distro files still uses org.openrdf and sesame in places.
(Migrated from https://openrdf.atlassian.net/browse/SES-2227 )
Investigate creation of a new persistent RDF store using MapDB - possibly as a replacement/alternative for the native store.
(Migrated from https://openrdf.atlassian.net/browse/SES-2168 )
afaict, there's no way for Gradle projects to pull down Sesame artifacts from Maven Central.
I am admittedly still new to Gradle, so I might have overlooked something obvious, but I think the fact that some dependencies are unversioned and others use variables is problematic for Gradle when trying to resolve the dependency.
If you look at [http://repo1.maven.org/maven2/org/openrdf/sesame/sesame-model/2.7.14/sesame-model-2.7.14.pom] you can see that junit has no scope or version, and that sesame-util uses variable placeholders.
Trying to grab that artifact via
{code}
compile ("org.openrdf.sesame:sesame-model:2.7.14")
{code}
will yield:
{code}
Could not resolve org.openrdf.sesame:sesame-model:2.7.14.
Required by:
com.complexible.stardog.openrdf-utils:openrdf:2.2.4
Could not parse POM https://repo1.maven.org/maven2/org/openrdf/sesame/sesame-model/2.7.14/sesame-model-2.7.14.pom
> Unable to resolve version for dependency 'junit:junit:jar'
{code}
I'm not a Maven guru either, but I thought these, while legal, were not recommended.
As an aside, this works fine using Ivy to resolve the exact same dependency, and I'm assuming it works fine with Maven. So I think only Gradle users are affected.
I know Jeen is mucking about with the Maven stuff atm; it would be nice if this could be resolved as well.
The project documentation and website at http://rdf4j.org/ will need to be updated to reflect the changes from Sesame to RDF4J. In particular, we'll need:
(Migrated from https://openrdf.atlassian.net/browse/SES-2162 )
The current SAIL interface assumes it gets passed a TupleExpr (that is, an algebra representation of a query); currently this is handled by SailRepositoryConnection.prepareQuery, which passes the query string to the RDF4J query parser and produces a TupleExpr.
However, some SAIL implementations prefer to do their own parsing and/or prefer not to base their query evaluation on RDF4J's algebra model. To facilitate this, we should pass the query string down at the prepare stage, allowing a SAIL to (optionally) process or wrap the query in such a way that the RDF4J query parser is bypassed and the SAIL implementation can opt to use a completely independent parser and query engine.
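A rough sketch of what such an optional pass-down could look like; the interface and method names are illustrative only, not a committed design:
{code}
import org.eclipse.rdf4j.query.QueryLanguage;

/**
 * Illustrative only: an optional mixin interface a SAIL could implement
 * to receive the raw query string at prepare time, before (or instead of)
 * translation into a TupleExpr.
 */
public interface QueryStringAware {

    /**
     * Offers the original query string to the SAIL. Returning false means
     * "not handled", and the default RDF4J parser and evaluation pipeline
     * is used as before.
     */
    default boolean prepareNativeQuery(QueryLanguage language, String queryString, String baseURI) {
        return false;
    }
}
{code}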
The Javadoc still contains references to 'org.openrdf' and 'sesame' in many places. This needs to be reviewed and edited.
Current integration tests fail because the W3C test case data was not included in the initial contrib. We need to re-integrate this.
Hudson build is currently failing with test failures. We need to get the build stabilized ASAP.
We should run code formatting with rdf4j settings over the entire master branch, so that the code base is consistently well-formatted again.
GitHub released a new feature enabling a template to be created as the basis for new issues and pull requests. This is more visible than the guidelines for contributing, as it is inserted into the comment for each pull request when it is opened, so it may be useful to add support for it.
https://github.com/blog/2111-issue-and-pull-request-templates
Sesame datadirs are by default stored in $APP_DIR/Aduna/OpenRDF Sesame or something along those lines. This needs to be changed to something simpler. A preference is to have a root dir $APP_DIR/RDF4J/ with subdirs for the various RDF4J applications: RDF4J/Server, RDF4J/Workbench, etc.
In addition, we should provide a conversion method that allows users to migrate their existing data to the new dir structure. This should either be a separate script (so that users can choose to run it), or an automated one-time migration, with a preference for the former (an automated procedure can cause problems if the datadirs are sufficiently large).
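A minimal sketch of the one-time migration step, assuming a simple move of the old directory tree; the old and new locations shown are illustrative and would need to be resolved per platform:
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Sketch of a one-time datadir migration. The old and new locations are
 * illustrative; a real script would resolve $APP_DIR per platform and
 * per application.
 */
public class DataDirMigration {

    public static void main(String[] args) throws IOException {
        Path oldDir = Paths.get(System.getProperty("user.home"), "Aduna", "OpenRDF Sesame");
        Path newDir = Paths.get(System.getProperty("user.home"), "RDF4J", "Server");

        if (Files.isDirectory(oldDir) && !Files.exists(newDir)) {
            Files.createDirectories(newDir.getParent());
            // a move (rather than a copy) stays cheap even for large
            // datadirs, as long as both paths are on the same filesystem
            Files.move(oldDir, newDir);
        }
    }
}
{code}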
We need a new logo (and house style) for the rdf4j project, to visually distinguish ourselves from the 'old' Sesame project. This issue can be used to propose and discuss designs.
The Lucene SAILs for Lucene 3 and 4 need to be removed from the code base.
YASQE and its dependency CodeMirror were excluded from the initial code contribution and treated as third-party dependencies. We need to reintegrate this code.
The CQ for YASQE (v 2.7.2) is https://dev.eclipse.org/ipzilla/show_bug.cgi?id=10646 .
The CQ for CodeMirror (v 4.13) is https://dev.eclipse.org/ipzilla/show_bug.cgi?id=10573 .
The current SPARQL endpoint implementation handles update sequences by sending them down to the underlying Repository. Since no transactions are supported at the level of the SPARQL protocol, this effectively means that transaction handling is left to the Repository API.
The Repository API handles SPARQL update sequences by treating each operation in the sequence as a separate update, which conforms to the SPARQL 1.1 Update specification (section 3):
Implementations MUST ensure that the operations of a single request are
executed in a fashion that guarantees the same effects as executing them
sequentially in the order they appear in the request.
In effect, the SPARQL endpoint implementation handles update sequence requests as several transactions. The SPARQL spec, however, also has the following soft requirement (see section 2.2):
SPARQL 1.1 Update requests are sequences of operations. Each request SHOULD
be treated atomically by a SPARQL 1.1 Update service. The term 'atomically'
means that a single request will result in either no effect or a complete
effect, regardless of the number of operations that may be present in the
request.
While the current implementation does not break the spec, it does deviate from this recommended pattern. To change this, we should add a flag to the RDF4J REST protocol that allows our service implementation to distinguish between requests coming from a SPARQL endpoint client, and requests coming from an RDF4J client. In the former case, the service can choose to explicitly start a transaction before executing the sequence, so that the sequence is treated as an atomic update.
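On the server side, this could translate to something like the following sketch when the new flag indicates a SPARQL endpoint client (using the standard Repository API; the flag handling itself is elided):
{code}
import org.eclipse.rdf4j.query.QueryLanguage;
import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;

public class AtomicUpdateSketch {

    /**
     * Executes a (possibly multi-operation) SPARQL update request as a
     * single transaction: either every operation takes effect, or none does.
     */
    static void executeAtomically(Repository repo, String updateRequest) {
        try (RepositoryConnection con = repo.getConnection()) {
            con.begin();
            try {
                con.prepareUpdate(QueryLanguage.SPARQL, updateRequest).execute();
                con.commit();
            }
            catch (RuntimeException e) {
                con.rollback();
                throw e;
            }
        }
    }
}
{code}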