Giter Club home page Giter Club logo

Comments (8)

andyjefferson avatar andyjefferson commented on July 19, 2024

You mean BULK FETCH, that was introduced in DN 4.x, generates SQL to attempt to load objects now, whereas in earlier versions it simply ignored the users fetch plan and you had lazy loading always. If you think the SQL could/should be improved then you can kindly go to the code and contribute improvements. Here is the starting point
https://github.com/datanucleus/datanucleus-rdbms/blob/master/src/main/java/org/datanucleus/store/rdbms/query/BulkFetchExistsHelper.java

from datanucleus-rdbms.

andreiIfrim avatar andreiIfrim commented on July 19, 2024

Thank you for your fast response Andy. I will have a look on how this can be done better!

from datanucleus-rdbms.

andyjefferson avatar andyjefferson commented on July 19, 2024

No comment on how this can be improved, nor a test that demonstrates the problem, so closing. Feel free to comment here and it could be reopened, or raise a new issue with some specific improvement and pull request that provides it.

from datanucleus-rdbms.

jonathanvx avatar jonathanvx commented on July 19, 2024

The way to fix this is choosing JOIN over EXISTS (SELECT ..FROM).
As in, use BulkFetchJoinHandler.java instead of BulkFetchExistsHandler.java

from datanucleus-rdbms.

andyjefferson avatar andyjefferson commented on July 19, 2024

Actually not totally correct; the OP is talking about loading objects into memory, NOT the actual SQL invoked. No demonstration of this was presented.

A comparison of different methods for bulk-fetching is available from the authors of EclipseLink, https://java-persistence-performance.blogspot.co.uk/2010/08/batch-fetching-optimizing-object-graph.html, and while JOIN actually comes out best with a simple query, it comes out poorest when the query it is utilised with becomes complex.

Clearly if someone is particularly interested in this feature for their projects they can contribute time to provide the JOIN bulk-fetch implementation, as well as investigate what this issue is actually about.

from datanucleus-rdbms.

jonathanvx avatar jonathanvx commented on July 19, 2024

Unfortunately, I am a MySQL performance engineer and not a developer. Therefore, I am only able to analyse the SQL thats hitting the database and make recommendations on how to fix them. I can tell you that the vast majority of cases would benefit (sometimes greatly) from JOIN over EXISTS. In the event where you have many JOINs, this would be a more complicated issue that may be solved with better indexes, index hints... but it is more of a specialist issue at that point.
Your decision to use EXISTS means that you usually do a full table scan on the main table while using a subquery where the data in that subquery is dependant on data in the main table. This is really the worst sort of performance (at least for MySQL).
Bringing a lot of blocks into memory and then filtering them - instead of using indexes - leads to high IO and high CPU which is what the OP was complaining about. Either way, it is well worth looking at it.

from datanucleus-rdbms.

jonathanvx avatar jonathanvx commented on July 19, 2024

What would be great to know if there is a queryHint or some annotation where we can help datanucleus decide whether to use a JOIN or an EXISTS and then we can see for ourselves if there is a performance improvement.

from datanucleus-rdbms.

andyjefferson avatar andyjefferson commented on July 19, 2024

As my comment above said, JOIN is not yet implemented; needs resource to do it and not a priority for me. So if your project/company needs it then it is there to do, and any comments relating to it should be put against the correct issue ... #171 and the query extension is documented at http://www.datanucleus.org/products/accessplatform_5_0/jdo/query.html#FetchPlan as datanucleus.rdbms.query.multivaluedFetch with values of "exists" and "none" currently.

from datanucleus-rdbms.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.