Comments (4)
Hi Ervin,
We implemented a similar feature for our internal benchmarking. The query interface under Connection can take encodedJoin as a parameter, which guides the join order.
std::unique_ptr<QueryResult> Connection::query(
const std::string& query, const std::string& encodedJoin) {
lock_t lck{mtx};
auto preparedStatement = prepareNoLock(query, true /* enumerate all plans */, encodedJoin);
One example of an encoded join can be found under benchmark/queries/ldbc-sf100/join:
-NAME q29
-COMPARE_RESULT 1
-QUERY MATCH (a:Person)-[:knows]->(b:Person) RETURN MIN(a.birthday), MIN(b.birthday)
-ENCODED_JOIN HJ(b._ID){E(b)S(a)}{S(b)}
---- 1
1980-02-01|1980-02-01
S stands for SCAN, E stands for EXTEND and HJ stands for HASH_JOIN. Inside HJ() is the join condition, i.e. the _ID of the table that we are joining on.
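To make the notation easier to read, here is a minimal sketch of a parser that turns an encoded join string into an operator tree. This is not kuzu's code: PlanNode, EncodedJoinParser and toString are hypothetical names, and the grammar is only inferred from the examples in this thread (an operator is NAME(arg), braces hold the children of a join, and an operator written directly after another one is treated as its input).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node for one encoded operator, e.g. HJ(b._ID), E(b), S(a).
struct PlanNode {
    std::string op;   // operator name, e.g. "HJ", "E", "S"
    std::string arg;  // text inside the parentheses, e.g. "b._ID"
    std::vector<std::unique_ptr<PlanNode>> children;
};

// Recursive-descent parser for the grammar assumed above (not kuzu's own parser).
class EncodedJoinParser {
public:
    explicit EncodedJoinParser(std::string text) : text_(std::move(text)) {}

    std::unique_ptr<PlanNode> parse() {
        auto root = parseNode();
        if (pos_ != text_.size()) {
            throw std::invalid_argument("unexpected input at " + std::to_string(pos_));
        }
        return root;
    }

private:
    std::unique_ptr<PlanNode> parseNode() {
        auto node = std::make_unique<PlanNode>();
        while (std::isalpha(static_cast<unsigned char>(peek()))) {
            node->op += text_[pos_++];
        }
        if (node->op.empty()) {
            throw std::invalid_argument("expected operator at " + std::to_string(pos_));
        }
        expect('(');
        while (pos_ < text_.size() && text_[pos_] != ')') {
            node->arg += text_[pos_++];
        }
        expect(')');
        if (peek() == '{') {
            // Braced groups: one child plan per group, e.g. both sides of a join.
            while (peek() == '{') {
                expect('{');
                node->children.push_back(parseNode());
                expect('}');
            }
        } else if (std::isalpha(static_cast<unsigned char>(peek()))) {
            // Chained operator: E(b)S(a) is read as E(b) on top of S(a).
            node->children.push_back(parseNode());
        }
        return node;
    }

    char peek() const { return pos_ < text_.size() ? text_[pos_] : '\0'; }

    void expect(char c) {
        if (peek() != c) {
            throw std::invalid_argument(std::string("expected '") + c + "' at " +
                                        std::to_string(pos_));
        }
        ++pos_;
    }

    std::string text_;
    std::size_t pos_ = 0;
};

// Renders the tree with two spaces of indentation per level.
std::string toString(const PlanNode& node, int depth = 0) {
    assert(depth >= 0);  // guard against a negative indent width
    std::string out(static_cast<std::size_t>(depth) * 2, ' ');
    out += node.op + "(" + node.arg + ")\n";
    for (const auto& child : node.children) {
        out += toString(*child, depth + 1);
    }
    return out;
}
```

Under these assumptions, parsing HJ(b._ID){E(b)S(a)}{S(b)} gives an HJ root with two children: E(b) on top of S(a), and S(b).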
Note that the planner may not always be able to find the encoded join order if the join graph becomes very large. This feature is not very user friendly, but it might solve your problem temporarily.
In the long term, I will expose a configuration option to disable the join order optimizer, so that the join order is picked left to right as written in the query.
from kuzu.
Hello Andy, thanks for your detailed reply.
Besides the join order, I wonder if I can control the physical plan, for example by using a worst-case optimal join instead of a binary hash join, or by changing the aggregation order. I know there is usually no user interface for this kind of request. So which part of the code should I edit if I want to build an execution plan on my own and execute it?
I think a worst-case optimal join (WCOJ) can still be encoded. For example, given the pattern
(a)->(b)->(c), (a)->(c)
One WCOJ plan could be encoded as
I(c._id){E(b)S(a)}{E(c)S(a)}{E(c)S(b)}
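As an illustration of what such a plan computes, here is a minimal sketch of the triangle pattern evaluated with a sorted-list intersection. It is not kuzu's implementation. It assumes that I denotes a multiway intersect on c, that the first braced group is the probe side producing (a, b) pairs, and that the other two groups supply the a->c and b->c adjacency lists. NodeID, AdjList and matchTriangles are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <tuple>
#include <vector>

using NodeID = std::uint32_t;
// adj[u] holds the out-neighbours of u, sorted ascending (assumed input format).
using AdjList = std::vector<std::vector<NodeID>>;

// Finds all (a, b, c) with edges a->b, b->c and a->c by intersecting
// the neighbour lists of a and b instead of joining two edges at a time.
std::vector<std::tuple<NodeID, NodeID, NodeID>> matchTriangles(const AdjList& adj) {
    for (const auto& neighbours : adj) {
        // std::set_intersection requires sorted ranges.
        assert(std::is_sorted(neighbours.begin(), neighbours.end()));
    }
    std::vector<std::tuple<NodeID, NodeID, NodeID>> result;
    std::vector<NodeID> common;
    for (NodeID a = 0; a < adj.size(); ++a) {   // S(a)
        for (NodeID b : adj[a]) {               // E(b): extend a -> b
            common.clear();
            // Intersect on c: neighbours of a (a->c) with neighbours of b (b->c).
            std::set_intersection(adj[a].begin(), adj[a].end(),
                                  adj[b].begin(), adj[b].end(),
                                  std::back_inserter(common));
            for (NodeID c : common) {
                result.emplace_back(a, b, c);
            }
        }
    }
    return result;
}
```

The point of the intersection is that c is produced only when it is adjacent to both a and b, so no intermediate two-edge paths are materialized as they would be in a binary hash join plan.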
For aggregation order, I'm assuming you mean eager/lazy aggregation. I think the most principled way is to write an optimizer rule for it, which means you will need to look at the optimizer module and add your own rules. Alternatively, this could be done during the planning phase, in which case you would change the planner module. Let me know if my understanding of aggregation order is wrong. Happy to chat more in Slack if you want.
Thank you for the reply. I will look into the code then.