Comments (8)
You mean BULK FETCH, that was introduced in DN 4.x, generates SQL to attempt to load objects now, whereas in earlier versions it simply ignored the users fetch plan and you had lazy loading always. If you think the SQL could/should be improved then you can kindly go to the code and contribute improvements. Here is the starting point
https://github.com/datanucleus/datanucleus-rdbms/blob/master/src/main/java/org/datanucleus/store/rdbms/query/BulkFetchExistsHelper.java
from datanucleus-rdbms.
Thank you for your fast response Andy. I will have a look on how this can be done better!
from datanucleus-rdbms.
No comment on how this can be improved, nor a test that demonstrates the problem, so closing. Feel free to comment here and it could be reopened, or raise a new issue with some specific improvement and pull request that provides it.
from datanucleus-rdbms.
The way to fix this is choosing JOIN over EXISTS (SELECT ..FROM).
As in, use BulkFetchJoinHandler.java instead of BulkFetchExistsHandler.java
from datanucleus-rdbms.
Actually not totally correct; the OP is talking about loading objects into memory, NOT the actual SQL invoked. No demonstration of this was presented.
A comparison of different methods for bulk-fetching is available from the authors of EclipseLink, https://java-persistence-performance.blogspot.co.uk/2010/08/batch-fetching-optimizing-object-graph.html, and while JOIN actually comes out best with a simple query, it comes out poorest when the query it is utilised with becomes complex.
Clearly if someone is particularly interested in this feature for their projects they can contribute time to provide the JOIN bulk-fetch implementation, as well as investigate what this issue is actually about.
from datanucleus-rdbms.
Unfortunately, I am a MySQL performance engineer and not a developer. Therefore, I am only able to analyse the SQL thats hitting the database and make recommendations on how to fix them. I can tell you that the vast majority of cases would benefit (sometimes greatly) from JOIN over EXISTS. In the event where you have many JOINs, this would be a more complicated issue that may be solved with better indexes, index hints... but it is more of a specialist issue at that point.
Your decision to use EXISTS means that you usually do a full table scan on the main table while using a subquery where the data in that subquery is dependant on data in the main table. This is really the worst sort of performance (at least for MySQL).
Bringing a lot of blocks into memory and then filtering them - instead of using indexes - leads to high IO and high CPU which is what the OP was complaining about. Either way, it is well worth looking at it.
from datanucleus-rdbms.
What would be great to know if there is a queryHint or some annotation where we can help datanucleus decide whether to use a JOIN or an EXISTS and then we can see for ourselves if there is a performance improvement.
from datanucleus-rdbms.
As my comment above said, JOIN is not yet implemented; needs resource to do it and not a priority for me. So if your project/company needs it then it is there to do, and any comments relating to it should be put against the correct issue ... #171 and the query extension is documented at http://www.datanucleus.org/products/accessplatform_5_0/jdo/query.html#FetchPlan as datanucleus.rdbms.query.multivaluedFetch with values of "exists" and "none" currently.
from datanucleus-rdbms.
Related Issues (20)
- Invalid MySQL schema generated when used with Connector/J 8.0.30 HOT 2
- Support TINYBLOB on MySQL/MariaDB without size specifier, as per TINYTEXT
- Search of setter function in ResultClassROF.getObject() is broken HOT 1
- NullPointerException in ResultClassROF.getObject() in 6.0.2 HOT 6
- ArgType not cached for null value in first Result in ResultSet HOT 1
- Wrong identity type generated for type `long` with PostgreSQL
- ResultClassROF can fail to set fields/properties when JDBC driver returns unassignable type HOT 9
- datanucleus.query.jdoql.{varName}.join extension does not work when "varName" is not lower case HOT 4
- Support bulk-fetch on a Collection when field is empty
- BatchUpdateException when reordering elements of 1-m relationship as List<> (join table) - with testcase HOT 3
- Provide InitSQL hikari options HOT 2
- Support ConnectionInitSQL with HikariCP
- keyword "COLUMN_NAME" for dm database HOT 3
- `UpdateStmtAllowTableAliasInSet` is not honored for update statements involving `DelegatedExpression`s HOT 1
- Issue #470 doesn't allow for discriminatorColumnName not being in different case
- Provide registerMbeans HikariCP options
- Unexpected query performance numbers in datanucleus log HOT 2
- JDOQLQuery should throw exception if datastore does not support query canceling
- `SQLStatement#getSQLText` implementations are not thread-safe HOT 1
- Question: strategy for dealing with mysql table limit when eagerly fetching fields of a large object
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from datanucleus-rdbms.