Comments (4)
Please explain what is going on here and why these changes are necessary.
This is a repeat comment from before - I am concerned that changes to central machinery for one particular usage will have consequences for other users that are hard to predict.
Did you consider batching inside the service execution reading from the nextStage(Binding)?
Why is single treated different from a bulk of one?
from jena.
> Did you consider batching inside the service execution reading from the nextStage(Binding)?
Batching based on "push-events" to nextStage(Binding) is extremely complex to implement when the rest of ARQ is "pull" based - i.e. operators are implemented as iterators that pull as many items as needed from their underlying inputs in order to produce the next item.
As an example why I am not blindly batching by n bindings:
It would be easy to add a subclass of QueryIterRepeatApplyBulk that calls an abstract nextStage(Binding[] batch) method.
However, in #1314, for example, I created what is conceptually an IterBatchByService (the actual naming is currently a bit awkward): an iterator over batches targeted at the same service, sourced by the query iterator of the lhs. It aims to create batches of a fixed size, but it also adheres to a read-ahead limit so as to avoid consuming all of the lhs while attempting to fill a batch, or adding items to a batch that are too far ahead.
This is a bit smarter than blindly batching unrelated bindings.
Doing this with a push-based API (that expects immediate return values) would be hell.
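To illustrate the pull-based batching described above, here is a minimal sketch. It is not the code from #1314: the class name BatchByKeyIterator, its parameters, and the exact read-ahead policy (buffer at most readAheadLimit items that belong to other services) are assumptions made for illustration, and plain Java generics stand in for Binding and the service node.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical sketch: pulls from an input iterator and yields batches of items that share one key. */
final class BatchByKeyIterator<T, K> implements Iterator<List<T>> {
    private final Iterator<T> input;            // stands in for the lhs query iterator
    private final Function<T, K> keyOf;         // stands in for "which service does this binding target"
    private final int batchSize;
    private final int readAheadLimit;           // assumed policy: max buffered items of other keys
    private final LinkedList<T> pending = new LinkedList<>();

    BatchByKeyIterator(Iterator<T> input, Function<T, K> keyOf, int batchSize, int readAheadLimit) {
        if (batchSize < 1 || readAheadLimit < 0) {
            throw new IllegalArgumentException("batchSize must be >= 1 and readAheadLimit >= 0");
        }
        this.input = input;
        this.keyOf = keyOf;
        this.batchSize = batchSize;
        this.readAheadLimit = readAheadLimit;
    }

    @Override
    public boolean hasNext() {
        return !pending.isEmpty() || input.hasNext();
    }

    @Override
    public List<T> next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        if (pending.isEmpty()) {
            pending.add(input.next());
        }
        // The oldest unconsumed item decides which key this batch is for.
        K key = keyOf.apply(pending.getFirst());
        List<T> batch = new ArrayList<>(batchSize);

        // Step 1: take matching items that were already read ahead, preserving order.
        Iterator<T> it = pending.iterator();
        while (it.hasNext() && batch.size() < batchSize) {
            T item = it.next();
            if (Objects.equals(key, keyOf.apply(item))) {
                batch.add(item);
                it.remove();
            }
        }

        // Step 2: pull more input only while the batch is not full and the
        // buffer of non-matching items stays within the read-ahead limit.
        while (batch.size() < batchSize && pending.size() < readAheadLimit && input.hasNext()) {
            T item = input.next();
            if (Objects.equals(key, keyOf.apply(item))) {
                batch.add(item);
            } else {
                pending.add(item);
            }
        }
        return batch;
    }
}
```

Because the consumer decides when to call next(), a batch that cannot be filled within the read-ahead limit is simply returned short. With a push-based nextStage(Binding) the same decision would have to be made before knowing what the next inputs are.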
> Why is single treated different from a bulk of one?
Well, the single API is a special case of the bulk one - and the bulk one delegates to it. So it's not really treating it differently but rather having two reasonable levels of abstraction (that's why I let QueryIterRepeatApply extend from QueryIterRepeatApplyBulk) - maybe the latter should simply be called QueryIterFlatMap.
Single is certainly easier to implement - and so far this was how SERVICE was implemented in the first place!
Bulk is more powerful because it hands out the QueryIterator but that may require more effort (and knowledge) to work with.
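The relationship between the two levels can be sketched as follows. This is an illustration only, not the PR's code: BulkStage, SingleStage and execute are hypothetical names, and generic type parameters stand in for Binding and QueryIterator. The bulk level receives the whole input iterator, and the single level implements the bulk method as a lazy flat-map over a per-item nextStage.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Bulk level (hypothetical name): the implementation is handed the whole input iterator. */
abstract class BulkStage<I, O> {
    abstract Iterator<O> execute(Iterator<I> input);
}

/** Single level (hypothetical name): implementors only see one input item at a time. */
abstract class SingleStage<I, O> extends BulkStage<I, O> {
    abstract Iterator<O> nextStage(I item);

    /** The bulk method delegates to nextStage, i.e. it is a lazy flat-map. */
    @Override
    Iterator<O> execute(Iterator<I> input) {
        return new Iterator<O>() {
            private Iterator<O> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Pull the next input item only when the current sub-iterator is exhausted.
                while (!current.hasNext() && input.hasNext()) {
                    current = nextStage(input.next());
                }
                return current.hasNext();
            }

            @Override
            public O next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }
}
```

An implementation that only needs per-item behaviour overrides nextStage. One that wants to batch or reorder overrides execute directly and works with the input iterator itself.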
from jena.
I gave the QueryIterRepeatApply more thought and it turns out the extension system doesn't need a dedicated QueryIterator at all. Only the OpExecutor has to delegate the OpService case to the extension registry rather than directly creating a QueryIterService instance. Handlers in the registry for the bulk case (which receive the input QueryIterator) can choose to wrap that iterator with their own one (which may be repeat-apply or some other) or pass it on to the next registered executor.
This means I can revert QueryIterRepeatApply.
I am also looking into combining the bulk and single registry into one class (I am just not yet sure whether this internally uses the two existing registries or its own set of attributes).
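A rough sketch of the delegation idea, under stated assumptions: BulkServiceHandler, ServiceChain and ServiceHandlerRegistry are hypothetical names, a String stands in for the OpService, and a generic Iterator stands in for QueryIterator. The fallback passed to the registry plays the role of the default SERVICE execution.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical handler: may wrap the input, answer on its own, or defer to the rest of the chain. */
interface BulkServiceHandler<B> {
    Iterator<B> exec(String serviceIri, Iterator<B> input, ServiceChain<B> chain);
}

/** The remainder of the handler chain (hypothetical). */
interface ServiceChain<B> {
    Iterator<B> proceed(String serviceIri, Iterator<B> input);
}

/** Hypothetical registry that the OpService case would delegate to. */
final class ServiceHandlerRegistry<B> {
    private final List<BulkServiceHandler<B>> handlers = new ArrayList<>();
    private final ServiceChain<B> fallback;   // stands in for the default SERVICE execution

    ServiceHandlerRegistry(ServiceChain<B> fallback) {
        this.fallback = fallback;
    }

    ServiceHandlerRegistry<B> add(BulkServiceHandler<B> handler) {
        handlers.add(handler);
        return this;
    }

    /** Entry point: start at the first registered handler. */
    Iterator<B> exec(String serviceIri, Iterator<B> input) {
        return chainFrom(0).proceed(serviceIri, input);
    }

    /** Builds the chain lazily: handler i sees a chain that starts at handler i + 1. */
    private ServiceChain<B> chainFrom(int index) {
        if (index >= handlers.size()) {
            return fallback;
        }
        return (iri, in) -> handlers.get(index).exec(iri, in, chainFrom(index + 1));
    }
}
```

In this shape no dedicated iterator class is required by the registry itself. A handler that wants repeat-apply or batching semantics wraps the input iterator before calling chain.proceed, and handlers from different third parties can coexist as long as each defers for services it does not recognise.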
from jena.
W.r.t. the comment that this is just for my solution: The bigger picture is that I would really love to have an infrastructure/path where it is possible to combine SERVICE plugins of different third-party systems such as those of sparql-anything (this is their OpExecutor*) - and of course those developed in my group. (Whether and when they'd upgrade is of course their choice.)
Back then I reached out with sparql-anything issue 60.
Whenever you think this issue is in a state where it's beneficial to reach out to them for comments, I can do so.
(IMO I should first make a stable proposal for the revised registry though)
* Most work there happens in OpService; yet there is one override for OpBGP which extracts options from triples - similar to what stardog does. This would also be interesting to have as a future extension point but I don't think this has to be a concern for this PR.
from jena.