
Comments (4)

afs avatar afs commented on May 27, 2024

Please explain what is going on here and why these changes are necessary.

This is a repeat comment from before - I am concerned that changes to central machinery for one particular usage will have consequences for other users that are hard to predict.

Did you consider batching inside the service execution reading from the nextStage(Binding)?

Why is single treated differently from a bulk of one?

from jena.

Aklakan avatar Aklakan commented on May 27, 2024

Did you consider batching inside the service execution reading from the nextStage(Binding)?

Batching based on "push-events" to nextStage(Binding) is extremely complex to implement when the rest of ARQ is "pull" based - i.e. operators are implemented as iterators that pull as many items as needed from their underlying inputs in order to produce the next item.

As an example of why I am not blindly batching by n bindings:

It would be easy to add a subclass of QueryIterRepeatApplyBulk that calls an abstract nextStage(Binding[] batch) method.

However, in #1314, for example, I created what is conceptually an IterBatchByService (the actual naming is currently a bit awkward): an iterator over batches targeted at the same service, sourced by the query iterator of the lhs. It aims to create batches of a fixed size, but it also adheres to a read-ahead limit so as to avoid consuming all of the lhs while attempting to fill a batch, or adding items to a batch that are too far ahead.

This is a bit smarter than blindly batching unrelated bindings.
Doing this with a push-based API (that expects immediate return values) would be hell.
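
To make the read-ahead idea concrete, here is a minimal pull-based sketch. It is not the code from #1314: the class name BatchByKeyIterator, the key function and the window parameter are all hypothetical, and plain Java iterators stand in for ARQ's query iterators. Each call to next() pulls only as many input items as it needs, groups items that share the key of the first pending item, and stops scanning once the window is exhausted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical sketch: groups items by key into batches, with a bounded read-ahead. */
final class BatchByKeyIterator<T, K> implements Iterator<List<T>> {
    private final Iterator<T> input;
    private final Function<T, K> keyOf;   // assumption: e.g. maps a binding to its target service
    private final int batchSize;          // maximum number of items per batch
    private final int window;             // maximum number of pending items scanned per batch
    private final List<T> buffer = new ArrayList<>(); // pulled but not yet emitted, in input order

    BatchByKeyIterator(Iterator<T> input, Function<T, K> keyOf, int batchSize, int window) {
        if (batchSize < 1 || window < 1) {
            throw new IllegalArgumentException("batchSize and window must be >= 1");
        }
        this.input = input;
        this.keyOf = keyOf;
        this.batchSize = batchSize;
        this.window = window;
    }

    @Override
    public boolean hasNext() {
        return !buffer.isEmpty() || input.hasNext();
    }

    @Override
    public List<T> next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        List<T> batch = new ArrayList<>();
        K key = null;
        int pos = 0;      // current position in the buffer
        int scanned = 0;  // number of pending items inspected for this batch
        while (batch.size() < batchSize && scanned < window) {
            if (pos == buffer.size()) {
                if (!input.hasNext()) {
                    break;                 // input exhausted
                }
                buffer.add(input.next());  // pull exactly one more item on demand
            }
            T item = buffer.get(pos);
            scanned++;
            if (batch.isEmpty()) {
                key = keyOf.apply(item);   // the first pending item decides the batch key
            }
            if (Objects.equals(key, keyOf.apply(item))) {
                batch.add(item);
                buffer.remove(pos);        // emitted: drop it from the pending buffer
            } else {
                pos++;                     // different key: leave it for a later batch
            }
        }
        return batch;
    }
}
```

With the inputs a1, b1, a2, a3, b2, a4, keyed by their first character, a batch size of 3 and a window of 4, the batches come out as [a1, a2, a3], [b1, b2], [a4]. Shrinking the window to 2 yields smaller batches because the scan gives up sooner.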

Why is single treated differently from a bulk of one?

Well, the single API is a special case of the bulk one, and the bulk one delegates to it. So it's not really treated differently; rather, there are two reasonable levels of abstraction (that's why I let QueryIterRepeatApply extend QueryIterRepeatApplyBulk). Maybe the latter should simply be called QueryIterFlatMap.
Single is certainly easier to implement, and so far this was how SERVICE was implemented in the first place!
Bulk is more powerful because it hands out the QueryIterator, but that may require more effort (and knowledge) to work with.
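
The relationship between the two levels can be sketched roughly as follows. This is an illustration with hypothetical interfaces (BulkStage, SingleStage) over plain Java iterators, not the actual ARQ classes: the bulk level sees the whole input iterator, while the single level only supplies a per-item function that a default bulk implementation applies lazily as a flat map.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical bulk level: receives the whole input iterator and returns the output iterator. */
interface BulkStage<I, O> {
    Iterator<O> apply(Iterator<I> input);
}

/** Hypothetical single level: one input item yields an iterator of results. */
interface SingleStage<I, O> extends BulkStage<I, O> {
    Iterator<O> nextStage(I item);

    /** The bulk view of a single stage is a lazy flat map over the input. */
    @Override
    default Iterator<O> apply(Iterator<I> input) {
        return new Iterator<O>() {
            private Iterator<O> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Pull the next input item only when the current results are used up.
                while (!current.hasNext() && input.hasNext()) {
                    current = nextStage(input.next());
                }
                return current.hasNext();
            }

            @Override
            public O next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }
}
```

A single stage written as a lambda, such as `n -> List.of("x" + n, "y" + n).iterator()`, can then be passed anywhere a BulkStage is expected, whereas a handler that needs to batch or reorder its input would implement BulkStage directly.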


Aklakan avatar Aklakan commented on May 27, 2024

I gave QueryIterRepeatApply more thought, and it turns out the extension system doesn't need a dedicated QueryIterator at all. Only the OpExecutor has to delegate the OpService case to the extension registry rather than directly creating a QueryIterService instance. Handlers in the registry for the bulk case (which receive the input QueryIterator) can choose to wrap that iterator with their own (which may be a repeat-apply or some other one) or pass it on to the next registered executor.

This means I can revert QueryIterRepeatApply.

I am also looking into combining the bulk and single registries into one class (I am just not yet sure whether this internally uses the two existing registries or its own set of attributes).
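
The delegation described above could look roughly like the following chain-of-responsibility sketch. All names here (ChainingExecutor, Chain, ExecutorRegistry) are hypothetical and plain Java iterators stand in for QueryIterator; the registry API proposed in the PR may well differ. Each handler receives the input iterator plus a handle to the rest of the chain, and either produces its own iterator or passes the call on. A fallback plays the role of the default execution.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical handle to the remaining handlers in the chain. */
interface Chain<I, O> {
    Iterator<O> proceed(String serviceIri, Iterator<I> input);
}

/** Hypothetical bulk handler: may wrap the input iterator or pass it on to the next handler. */
interface ChainingExecutor<I, O> {
    Iterator<O> exec(String serviceIri, Iterator<I> input, Chain<I, O> next);
}

/** Hypothetical registry that tries handlers in registration order, then the fallback. */
final class ExecutorRegistry<I, O> {
    private final List<ChainingExecutor<I, O>> handlers = new ArrayList<>();
    private final Chain<I, O> fallback; // assumption: stands in for the default execution

    ExecutorRegistry(Chain<I, O> fallback) {
        this.fallback = fallback;
    }

    ExecutorRegistry<I, O> add(ChainingExecutor<I, O> handler) {
        handlers.add(handler);
        return this;
    }

    /** Entry point that an executor would call for a service operation. */
    Iterator<O> exec(String serviceIri, Iterator<I> input) {
        return chainFrom(0).proceed(serviceIri, input);
    }

    private Chain<I, O> chainFrom(int index) {
        if (index >= handlers.size()) {
            return fallback;
        }
        // The handler at `index` gets a chain that starts at the handler after it.
        return (iri, in) -> handlers.get(index).exec(iri, in, chainFrom(index + 1));
    }
}
```

A handler that only cares about IRIs with a particular prefix would return its own iterator for those and call `next.proceed(iri, input)` for everything else, so plugins from different parties can coexist without knowing about each other.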


Aklakan avatar Aklakan commented on May 27, 2024

W.r.t. the comment that this is just for my solution: the bigger picture is that I would really love to have an infrastructure/path where it is possible to combine SERVICE plugins of different third-party systems, such as those of sparql-anything (this is their OpExecutor*) and, of course, those developed in my group. (Whether and when they'd upgrade is of course their choice.)

Back then I reached out with sparql-anything issue 60.
Whenever you think this issue is in a state where it's beneficial to reach out to them for comments, I can do so.
(IMO I should first make a stable proposal for the revised registry, though.)

* Most work there happens in OpService; yet there is one override for OpBGP which extracts options from triples, similar to what Stardog does. This would also be interesting to have as a future extension point, but I don't think this has to be a concern for this PR.

