cmu-db / optd
CMU-DB's Cascades optimizer framework
Home Page: https://cmu-db.github.io/optd/
License: MIT License
Currently it uses the heuristic optimizer. It seems that to use the cascades optimizer, we might need to introduce a more advanced dummy cost model. It might also be the case that we want to just keep testing on the heuristic optimizer.
See #140 for context.
List of logical->logical rule files to port from datafusion:
Make sub-issues as needed and link them here.
We can (1) collect all table scans from the plan node, (2) retrieve the schemas in the async context, and (3) remove async from schema inference.
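The two-phase approach above can be sketched as follows. This is a hypothetical illustration, not optd's actual types: `PlanNode` stands in for the real plan representation, and phase 2 (the async schema lookup) is only indicated in a comment so the sketch stays dependency-free.

```rust
// Phase 1: walk the plan tree synchronously and gather every scanned table,
// so that schema retrieval can happen in one async place and schema
// inference itself no longer needs to be async.
use std::collections::BTreeSet;

enum PlanNode {
    TableScan { table: String },
    Filter { child: Box<PlanNode> },
    Join { left: Box<PlanNode>, right: Box<PlanNode> },
}

// Collect all table names referenced by scans (sync, no I/O).
fn collect_table_scans(node: &PlanNode, out: &mut BTreeSet<String>) {
    match node {
        PlanNode::TableScan { table } => {
            out.insert(table.clone());
        }
        PlanNode::Filter { child } => collect_table_scans(child, out),
        PlanNode::Join { left, right } => {
            collect_table_scans(left, out);
            collect_table_scans(right, out);
        }
    }
}

fn main() {
    let plan = PlanNode::Filter {
        child: Box::new(PlanNode::Join {
            left: Box::new(PlanNode::TableScan { table: "t1".into() }),
            right: Box::new(PlanNode::TableScan { table: "t2".into() }),
        }),
    };
    let mut tables = BTreeSet::new();
    collect_table_scans(&plan, &mut tables);
    // Phase 2 (not shown): resolve `catalog.schema(t).await` for each table in
    // the async caller, building a map consumed by the now-sync inference.
    assert_eq!(tables.into_iter().collect::<Vec<_>>(), vec!["t1", "t2"]);
    println!("ok");
}
```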
optd-core
We should display the cardinality and cost of the physical plan in `explain`. We can implement this with the `RelNodeMetaMap` introduced in #65, by storing the cost in `RelNodeMeta`. Since the `explain` output is produced in a separate traversal of the plan tree after optimization, and `Pretty` objects are immutable after construction, we should pass the `RelNodeMeta` as a parameter to `dispatch_explain`.
Currently, we do not compute upper/lower bounds for the pruning process; we simply explore the full logical plan space and compute a winner for the physical plan. The pruning logic is already there, and we just need to provide lower/upper bounds of groups to the system.
The logical properties of the logical optimizer include:
I don't think this task is important, because I don't see physical properties being very useful when evaluating Datafusion. Sometime later, we should add physical property support to the optimizer framework.
Remove:
let batches = df.collect().await?;
from `datafusion-optd-cli/src/exec.rs`, because it will internally run `datafusion`'s logical optimizer. We should instead call the other `collect` function after running our own optimizer and converting the `optd` plan into `datafusion`'s (`impl ExecutionPlan`). This will allow us to run our end-to-end optimized query on `datafusion`.
When an expr generated by a heuristic rule already exists in another group, we merge the two groups and mark the old expr as a dead end (mark all rules as having been fired for it). This might be costly.
Here are several alternatives:
It happens in two scenarios:
With a small number of rules, the first scenario is quite rare. This feature might be improved and tested with more rules in the future.
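The merge-on-duplicate behavior described above can be sketched with a minimal memo. This is a hypothetical illustration: expressions are plain strings here, whereas optd's real memo keys on hashed plan node contents, and `merged_into` plays the role of the "reduced group id" link.

```rust
use std::collections::HashMap;

// Minimal memo: expr -> group map plus merge links from dead groups to the
// surviving group.
struct Memo {
    expr_to_group: HashMap<String, usize>,
    merged_into: HashMap<usize, usize>, // old group id -> surviving group id
    next_group: usize,
}

impl Memo {
    fn new() -> Self {
        Memo { expr_to_group: HashMap::new(), merged_into: HashMap::new(), next_group: 0 }
    }

    // Follow merge links to the surviving group id.
    fn reduce(&self, mut g: usize) -> usize {
        while let Some(&to) = self.merged_into.get(&g) {
            g = to;
        }
        g
    }

    // Insert `expr` into `group`; if the expr already exists in another group,
    // the two groups are logically equivalent, so merge them.
    fn insert(&mut self, expr: &str, group: usize) -> usize {
        let group = self.reduce(group);
        match self.expr_to_group.get(expr) {
            Some(&existing) => {
                let existing = self.reduce(existing);
                if existing != group {
                    self.merged_into.insert(group, existing);
                }
                existing
            }
            None => {
                self.expr_to_group.insert(expr.to_string(), group);
                group
            }
        }
    }

    fn new_group(&mut self) -> usize {
        self.next_group += 1;
        self.next_group - 1
    }
}

fn main() {
    let mut memo = Memo::new();
    let g1 = memo.new_group(); // will hold Filter(true) -> child(g2)
    let g2 = memo.new_group();
    memo.insert("Scan(t1)", g2);
    memo.insert("Filter(true, g2)", g1);
    // A rule rewrites g1's filter to just g2's expr: the duplicate insert
    // triggers a merge, and g1 now reduces to g2.
    let surviving = memo.insert("Scan(t1)", g1);
    assert_eq!(surviving, g2);
    assert_eq!(memo.reduce(g1), g2);
    println!("ok");
}
```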
Resources:
This would be a far-in-the-future thing, but just as a note: for filter pushdown, we would have to take this into account when pushing past a projection.
From Chi:
yes we should do that, feel free to create an issue...
this is so-called "column name inference"
assign a name for unnamed columns, and use user-specified names for projections
and for the top-level node, if we have physical properties implemented, it will always be a gather node
unless all plan nodes are singletons
ColumnRef and Schema seem to have a lot of purpose overlap, I think that we can merge their functionality together.
Without the datafusion logical optimizer enabled, the join predicate is stored in the `filter` field instead of the `on` field. This breaks equality-condition joins because we do not handle that case. We need to reconsider the logic of handling join conditions.
> explain select * from t1 join t2 on t1.t1v1 = t2.t2v1;
Join.on: []
Join.filter: Some(
    BinaryExpr(
        BinaryExpr {
            left: Column(
                Column {
                    relation: Some(
                        Bare {
                            table: "t1",
                        },
                    ),
                    name: "t1v1",
                },
            ),
            op: Eq,
            right: Column(
                Column {
                    relation: Some(
                        Bare {
                            table: "t2",
                        },
                    ),
                    name: "t2v1",
                },
            ),
        },
    ),
)
thread 'main' panicked at optd-datafusion-bridge/src/lib.rs:320:14:
called `Result::unwrap()` on an `Err` value: unsupported join filter: Some(BinaryExpr(BinaryExpr { left: Column(Column { relation: Some(Bare { table: "t1" }), name: "t1v1" }), op: Eq, right: Column(Column { relation: Some(Bare { table: "t2" }), name: "t2v1" }) }))
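The panic above happens because the bridge only looks at the `on` field. A sketch of the missing handling is to split the join `filter` into equi-join column pairs (which belong in `on`) plus a residual predicate. The `Expr` shape below is a simplified stand-in, not datafusion's actual expression type.

```rust
// Simplified expression tree: columns, equality, conjunction, literals.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Literal(bool),
}

// Returns (equi-join column pairs, remaining non-equi conjuncts).
fn split_join_filter(expr: &Expr) -> (Vec<(String, String)>, Vec<Expr>) {
    let mut on = Vec::new();
    let mut residual = Vec::new();
    collect(expr, &mut on, &mut residual);
    (on, residual)
}

fn collect(expr: &Expr, on: &mut Vec<(String, String)>, residual: &mut Vec<Expr>) {
    match expr {
        // Conjunctions are split and each side classified independently.
        Expr::And(l, r) => {
            collect(l, on, residual);
            collect(r, on, residual);
        }
        // Column = Column is an equi-join condition.
        Expr::Eq(l, r) => match (l.as_ref(), r.as_ref()) {
            (Expr::Column(a), Expr::Column(b)) => on.push((a.clone(), b.clone())),
            _ => residual.push(expr.clone()),
        },
        other => residual.push(other.clone()),
    }
}

fn main() {
    // t1.t1v1 = t2.t2v1 AND true
    let filter = Expr::And(
        Box::new(Expr::Eq(
            Box::new(Expr::Column("t1.t1v1".into())),
            Box::new(Expr::Column("t2.t2v1".into())),
        )),
        Box::new(Expr::Literal(true)),
    );
    let (on, residual) = split_join_filter(&filter);
    assert_eq!(on, vec![("t1.t1v1".to_string(), "t2.t2v1".to_string())]);
    assert_eq!(residual, vec![Expr::Literal(true)]);
    println!("ok");
}
```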
EXTRACT(year FROM orders.o_orderdate)
and it simply uses the N-Distinct of orders.o_orderdate as the cardinality of the query.
p_name like '%forest'
and just use o_orderdate as o_year, you get exactly 60150 rows.
Currently, we have so many representations of plan nodes:
- `RelNode`: the only thing we should have in the core framework.
- `OptRelNode`: a set of pre-defined plan nodes that works for most database systems; should be moved to another crate.
- `RelRuleNode`: that's the evil thing. When handling rule conversion, we do not have access to the full plan node; the only thing we have is some part of it and the group ids of the children. This is so-called binding in Cascades/Columbia. The `RelRuleNode` also makes it harder to convert from the dynamically-typed `RelNode` to statically-typed `LogicalXXX` structs: `LogicalXXX::from_rel_node` is only implemented on `RelNode`, not on `RelRuleNode`.
Given that the optimizer only supports Cascades, we can ask the user to implement a `placeholder(group_id)` function on `RelNodeTyp`, so that the optimizer can put placeholders into the user's plan node representation and everything can work as usual. With that, we can eliminate `RelRuleNode` from the system and use a unified representation of plan nodes everywhere.
On the user side, they only need to ensure their plan type enum contains a special placeholder variant for optd internal use.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum OptRelNodeTyp {
    Placeholder(GroupId), // every set of plan nodes should have this
    // Plan nodes
    // Developers: update `is_plan_node` function after adding new elements
    Projection,
    ...
}
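With such a variant, a rule can treat every child uniformly and ask whether it is a real plan node or a group reference. The sketch below is hypothetical (simplified types, made-up helper names) and only illustrates how the placeholder variant could be consumed.

```rust
// Stand-in for optd's group id newtype.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct GroupId(usize);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum OptRelNodeTyp {
    Placeholder(GroupId), // optd-internal: a reference to a memo group
    Projection,
    Filter,
}

impl OptRelNodeTyp {
    // A rule can ask any child whether it is a group reference.
    fn placeholder_group(&self) -> Option<GroupId> {
        match self {
            OptRelNodeTyp::Placeholder(g) => Some(*g),
            _ => None,
        }
    }

    // Placeholders are never real plan nodes.
    fn is_plan_node(&self) -> bool {
        matches!(self, OptRelNodeTyp::Projection | OptRelNodeTyp::Filter)
    }
}

fn main() {
    let child = OptRelNodeTyp::Placeholder(GroupId(42));
    assert_eq!(child.placeholder_group(), Some(GroupId(42)));
    assert!(!child.is_plan_node());
    assert!(OptRelNodeTyp::Filter.is_plan_node());
    println!("ok");
}
```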
- Error: ArrowError(NotYetImplemented("Unsupported Interval Expression with value \"115 day (3) seconds\""))
- cannot find best binding for group (Not supporting left semi join)
- Error: Arrow error: Invalid argument error: Invalid comparison operation: Decimal128(15, 2) >= Decimal128(20, 0)
- Join's filter documentation says "Filters applied during join (non-equi conditions)", but there can actually be equal comparisons!
- unsupported plan node: CreateView: Bare { table: "revenue0" }
- cannot find best binding for group (Not supporting left anti join)
- cannot find best binding for group (Not supporting left semi join)
- cannot find best binding for group (Not supporting left semi join)
- cannot find best binding for group (Not supporting left semi and left anti join)
- cannot find best binding for group (Not supporting left anti join)
- OuterReferenceColumn (handled by moving the column to the outer query). Since we do not know how to access columns of the outer query, I think we should enable DF's logical optimizer when dealing with subqueries.
- Error during planning: table 'datafusion.public.xxxx' not found. When using explain_with_logical in the planner tests, make sure to execute the DDLs with execute_with_logical!
- InListExpr: Q12, Q16, Q19 (#77)
- IntervalMonthDayNano: Q1, Q10, Q14 (#78)
- ColumnRefPropertyBulder
- LogOp: Q16

For both testing the optd cascades core and shrinking the search space, we were trying to add heuristic rules in cascades. However, some simple rules like eliminateFilter, which eliminates a filter node when the filter predicate is always true, lead to a situation where we have to add a MergeGroup operation in optd core.
// memo before eliminateFilter
Group(n): Filter(true) -> child(Group(n+1))
Group(n+1): expr
// memo after eliminateFilter
Group(n): Filter(true) -> child(Group(n+1)), placeholder(Group(n+1))
Group(n+1): expr
// The result returned by eliminateFilter means that Group(n) and Group(n+1) are actually logically equivalent. So we add MergeGroup in optd to merge these two groups.
There is no MergeGroup definition in any Cascades or Columbia paper, nor a corresponding implementation. Adding MergeGroup is pretty experimental. Besides, real-world optimizers always have a rewrite/prepare phase before the cascades framework that normalizes expressions, so there is actually no need to apply such heuristic rules inside cascades.
The introduction of MergeGroup leads to problems, such as loops in the operator tree.
Take `Select * from t1 where true` as an example. The initial groups are:
// memo
G1: filter(true) -> child(G2)
G2: Scan (t1)
After eliminateFilter in cascades rule, it becomes:
// memo
G1: filter(true)->child(G2), Scan(t1)
G2: merged to G1, reduceGroupId=G1
This is OK when we execute the query, since execution only cares about the winner. But when we want to traverse the operator tree, we run into loops for expressions like filter(true)->child(G2), because such expressions now point to themselves as a child.
This causes tricky bugs in logic other than execution, for example join order enumeration, tree traversal, and debug printing.
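One way to make traversals survive such self-references is to track the groups on the current path and stop when a group (after following merge links) repeats. The sketch below uses simplified, hypothetical memo shapes, not optd's actual structures.

```rust
use std::collections::{HashMap, HashSet};

// Count expressions reachable from `group`, tolerating merge-induced cycles
// by refusing to revisit a group already on the current path.
fn count_reachable_exprs(
    exprs: &HashMap<usize, Vec<(String, Vec<usize>)>>, // group -> [(expr, child groups)]
    reduce: &HashMap<usize, usize>,                    // merged group -> surviving group
    group: usize,
    visiting: &mut HashSet<usize>,
) -> usize {
    let group = *reduce.get(&group).unwrap_or(&group);
    if !visiting.insert(group) {
        return 0; // group already on this path: a merge-induced cycle, stop
    }
    let mut n = 0;
    for (_expr, children) in exprs.get(&group).into_iter().flatten() {
        n += 1;
        for &c in children {
            n += count_reachable_exprs(exprs, reduce, c, visiting);
        }
    }
    visiting.remove(&group);
    n
}

fn main() {
    // G1: filter(true)->child(G2), Scan(t1); G2 merged into G1.
    let mut exprs = HashMap::new();
    exprs.insert(1, vec![
        ("filter(true)".to_string(), vec![2]),
        ("Scan(t1)".to_string(), vec![]),
    ]);
    let mut reduce = HashMap::new();
    reduce.insert(2, 1); // G2 reduces to G1: filter(true) now points at itself
    let mut visiting = HashSet::new();
    // Without the visited check, this recursion would never terminate.
    assert_eq!(count_reachable_exprs(&exprs, &reduce, 1, &mut visiting), 2);
    println!("ok");
}
```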
In optd, the order in which tasks are fired is carefully designed following the Columbia paper.
We first fire an OptimizeGroup task for the root/top/first group. It traverses all the expressions in the group and fires an OptimizeExpr task for each logical expr and an OptimizeInput task for each physical expr.
OptimizeExpr first calls ExploreGroup tasks for the expr's children and then traverses the rules, firing an ApplyRule task for each rule that matches the expr. Newly generated logical exprs are fired with new OptimizeExpr tasks, since they were not in the group when OptimizeGroup or ExploreGroup was called.
OptimizeInput is responsible for calculating the costs of the expr and updating the winner for the group; it fires OptimizeGroup for a child group when that group does not yet have a winner. Except for the root/top/first group, OptimizeGroup is only ever called on a group that is a child of a physical expression being processed by OptimizeInput.
In summary, the cascades framework expects all groups and their children to be fully explored (all matching rules applied) from the top group down to the leaf groups; costs and winners are then updated from the top group's physical expressions down to the children groups.
If we merge two groups, both groups existed before the merge (rules cannot create new groups themselves; they can only return existing groups). Each of them has either been called with ExploreGroup/OptimizeGroup or not. If a group was explored before, we cannot explore it again (ExploreGroup skips groups based on the group id, and even though groups are merged, the group id remains the same). And OptimizeGroup can only be called for the root group or for child groups of physical expressions in a parent group.
So if the merge destination group was explored before while the merge source group was not, and neither group gets called with OptimizeGroup (because it is either not a child group of any physical expression, or pruned if we add pruning), we will miss the newly added expressions from the merge source group. 🤯🤯🤯 It is a very subtle case and hard to find.
MergeGroup leads to debugging nightmares and tricky problems; we may need to remove it. Besides, if we don't add heuristic rules into the cascades logic, we really don't need it.
In the cascades framework, physical properties have two facets:
Derived properties are the physical properties derived from children. For example, scanning a table whose column 1 is ordered ascending gives the scan node the derived property "column 1 ordered ascending".
Required properties are the physical properties required by the query. For example, select * from t1 order by t1.v1 gives the root node the required property that v1 must be ordered.
When the derived properties cannot satisfy all required properties, an operator created by an enforcer rule is inserted into the operator tree.
Where to insert the operator created by the enforcer rule depends on the calculated costs (e.g., determining whether select v1, v2 from t1 order by v1 becomes Sort->Projection->TableScan or Projection->Sort->TableScan; more complicated cases involve joins, filters, subqueries, ...).
Derived properties are per-expression: multiple expressions in one optd group can have different derived physical properties. (For example, a sort-merge join provides the column in order while a hash join does not, yet they are logically equivalent.)
Required properties are per-group: for a query like select * from t1 order by v1, the root node's group should have the required property (ordering = v1).
A critical issue is that we have to calculate costs to determine where to put the enforcer. For example, a node can either choose children that already have the required properties among their derived properties, in which case the cost is the child expression's cost, or choose children without them and insert an enforcer on top, in which case the cost becomes the enforcer operator's cost plus the child's cost.
However, since children are represented as group-id placeholders in optd, we cannot directly pick the best child (winner) with or without the required physical properties. And since derived properties are per-expression, we might need to traverse all the expressions in the child group.
This change of logic hugely affects the OptimizeGroup task (we need an interface for required physical properties), the OptimizeInput task (we need to calculate different winners for different physical property requirements), and the main memo structure (we need to store physical properties per expr/exprId and derive physical properties when adding a new expr to a group).
CockroachDB's doc (https://github.com/cockroachdb/cockroach/blob/fe36467047fbbf88569fc03cff97fb4d728122f3/pkg/sql/opt/xform/optimizer.go#L319) provides a good example of how to implement physical properties and enforcer rules. It launches two optimizeGroup tasks, one with the required properties and one without; the task without the required properties must add the cost of the enforcer operator, and the costs of the two results are then compared. For the children, it separates the child group by properties, so these plans are different: (select G2="ordering: y" G3) and (select G2 G3).
Columbia's paper defines the winner structure with physical properties. It states:
In Columbia, a winner is also used to store the temporary result of a search. While the costs of physical multi-expressions of a group are being calculated, the cheapest yet found expression is stored as a winner. During the optimization process, the winner is improved and finally the best (cheapest) plan is found. Sometimes, when no physical multi-expression can be found with the required physical property, we store the multi-expression pointer as NULL to indicate no winner for that physical property. Since no winner is also a solution for a search of this sub-problem, this information is memoized and will be useful in the future optimizing process. The following is the definition of data members in a WINNER class in Columbia.
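The Columbia-style winner described above can be sketched as a per-(group, required property) memo entry where an explicit "no winner" is also recorded, so a failed search is never repeated. Types and names below are illustrative, not optd's.

```rust
use std::collections::HashMap;

// A required physical property, e.g. a sort order on some column.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum RequiredProp {
    Any,
    Ordered(String), // e.g. Ordered("v1")
}

#[derive(Debug, Clone, PartialEq)]
struct Winner {
    expr: Option<usize>, // None: proven that no expr satisfies the property
    cost: f64,
}

struct Group {
    winners: HashMap<RequiredProp, Winner>,
}

impl Group {
    // Keep the cheapest winner seen so far for each required property.
    fn update_winner(&mut self, prop: RequiredProp, expr: Option<usize>, cost: f64) {
        let keep_existing = matches!(self.winners.get(&prop), Some(w) if w.cost <= cost);
        if !keep_existing {
            self.winners.insert(prop, Winner { expr, cost });
        }
    }
}

fn main() {
    let mut g = Group { winners: HashMap::new() };
    g.update_winner(RequiredProp::Any, Some(7), 100.0);
    g.update_winner(RequiredProp::Any, Some(9), 80.0); // cheaper: replaces
    g.update_winner(RequiredProp::Any, Some(3), 90.0); // costlier: ignored
    // No expr produces output ordered by v1: memoize the failure too.
    g.update_winner(RequiredProp::Ordered("v1".into()), None, f64::INFINITY);
    assert_eq!(g.winners[&RequiredProp::Any].expr, Some(9));
    assert!(g.winners[&RequiredProp::Ordered("v1".into())].expr.is_none());
    println!("ok");
}
```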
Calcite creates subgroups (RelSubset) within groups: a subgroup contains the expressions that are logically equivalent and have the same physical properties. It then optimizes each subgroup and finds a winner within it. (https://github.com/apache/calcite/blob/4823cb7760913f236e7f0f2cb149325b55a3f124/core/src/main/java/org/apache/calcite/plan/volcano/RelSubset.java#L84)
Generally, my idea is to
We can run regression test on the plan we generate for TPC-H SQLs. There is already a tool for such planner tests: https://github.com/risinglightdb/sqlplannertest-rs
[optd-datafusion-bridge](https://github.com/cmu-db/optd/tree/main/optd-datafusion-bridge)
There is a crate in the `datafusion` tree, `datafusion-substrait`, that claims to convert between `datafusion` and `substrait` representations. However, on the physical side (which is what we need to convert), it seems to have the same problem we would encounter if we sent `impl ExecutionPlan` trait objects around: it doesn't seem to know the type of, or any information about, each node. So it seems we wouldn't want to use it in this case, and should instead convert directly from `optd` nodes. I may be wrong on this---check for yourself here.

Cost usually contains row count, network cost, I/O cost, compute cost, etc. A reasonable abstraction for it in optd is simply a vector of values. The cost model implementor can interpret the vector however they want; inside the optimizer, costs are compared simply by the first element of the vector (the weighted cost).
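A minimal sketch of that cost abstraction, assuming made-up names (`Cost`, `weighted`) rather than optd's actual API:

```rust
// An opaque vector of cost values. Only the first element (the weighted
// cost) drives comparisons; the rest (row count, I/O, compute, ...) is
// interpreted however the cost model implementor wants.
#[derive(Debug, Clone, PartialEq)]
struct Cost(Vec<f64>);

impl Cost {
    fn weighted(&self) -> f64 {
        self.0[0]
    }
}

impl PartialOrd for Cost {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        // The optimizer only ever looks at the weighted cost.
        self.weighted().partial_cmp(&other.weighted())
    }
}

fn main() {
    // [weighted, row_count, io, compute] -- interpretation is model-defined.
    let a = Cost(vec![10.0, 1000.0, 2.0, 8.0]);
    let b = Cost(vec![12.0, 500.0, 9.0, 3.0]);
    assert!(a < b); // lower weighted cost wins, even with a larger row count
    println!("ok");
}
```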
When there are multiple join predicates, they are stored as a LogOp(And):
optd/optd-datafusion-bridge/src/into_optd.rs
Lines 299 to 305 in ad37149
But this case is not handled in the projection pull up rule.
optd/optd-datafusion-repr/src/rules/joins.rs
Lines 330 to 342 in 755db92
Here, it seems we assume `cond` is a `BinOpExpr` (so that its children are always `Expr`s), but for `LogOp` its child is an `ExprList`, whose type is `List`, which is not an expression. That's why the `unwrap()` on line 335 panics.
We can hard-code this scenario, or categorize `OptRelNodeTyp::List` as an expression (but I'm not 100% sure whether `List` is only used to store expressions).
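A sketch of the hard-coded fix: collect the conjuncts of a join condition without assuming the top-level node is a binary op, handling both a binary `And` chain and a variadic `LogOp(And)` whose child plays the `ExprList` role. Node shapes here are simplified stand-ins, not the real `OptRelNode` types.

```rust
// Simplified condition tree with both binary and variadic conjunctions.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    BinOp { op: String, left: Box<Node>, right: Box<Node> },
    LogOp { op: String, children: Vec<Node> }, // child list = the ExprList role
    Column(String),
}

// Collect the individual conjuncts of a join condition. The buggy code
// effectively handled only the BinOp case.
fn conjuncts(cond: &Node) -> Vec<&Node> {
    match cond {
        Node::LogOp { op, children } if op == "And" => {
            children.iter().flat_map(conjuncts).collect()
        }
        Node::BinOp { op, left, right } if op == "And" => {
            let mut v = conjuncts(left);
            v.extend(conjuncts(right));
            v
        }
        other => vec![other],
    }
}

fn main() {
    let eq1 = Node::BinOp {
        op: "Eq".into(),
        left: Box::new(Node::Column("a.x".into())),
        right: Box::new(Node::Column("b.x".into())),
    };
    let eq2 = Node::BinOp {
        op: "Eq".into(),
        left: Box::new(Node::Column("a.y".into())),
        right: Box::new(Node::Column("b.y".into())),
    };
    // Two predicates stored as LogOp(And), as in the issue.
    let cond = Node::LogOp { op: "And".into(), children: vec![eq1.clone(), eq2.clone()] };
    let cs = conjuncts(&cond);
    assert_eq!(cs.len(), 2);
    assert_eq!(cs[0], &eq1);
    println!("ok");
}
```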
We already have an intermediate representation of rule matchers, so it would be easy to implement a Rust macro that converts a rule DSL into the IR. One proposal:
(join inner !a !b !cond) -- defines a matcher on join a and b, for the join commute rule
(join inner (join inner !a !b !condab) !c !condabc) -- defines a matcher on two joins, for the join assoc rule
`!a`, `!b`, `!cond` will be picked up, and the rule can use them to compose the new plan node.
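The two DSL examples above could expand into a matcher IR along these lines. The `Matcher` enum and the hand-written expansions below are hypothetical (the macro itself is elided), meant only to show what the IR for each s-expression might look like.

```rust
// A matcher IR: either match a specific node type with sub-matchers for each
// child/field, or capture whatever sits at this position (the `!name` form).
#[derive(Debug, PartialEq)]
enum Matcher {
    Node { typ: &'static str, children: Vec<Matcher> },
    Capture(&'static str),
}

// Hand expansion of: (join inner !a !b !cond)
fn join_commute_matcher() -> Matcher {
    Matcher::Node {
        typ: "join.inner",
        children: vec![
            Matcher::Capture("a"),
            Matcher::Capture("b"),
            Matcher::Capture("cond"),
        ],
    }
}

// Hand expansion of: (join inner (join inner !a !b !condab) !c !condabc)
fn join_assoc_matcher() -> Matcher {
    Matcher::Node {
        typ: "join.inner",
        children: vec![
            Matcher::Node {
                typ: "join.inner",
                children: vec![
                    Matcher::Capture("a"),
                    Matcher::Capture("b"),
                    Matcher::Capture("condab"),
                ],
            },
            Matcher::Capture("c"),
            Matcher::Capture("condabc"),
        ],
    }
}

fn main() {
    let m = join_assoc_matcher();
    if let Matcher::Node { typ, children } = &m {
        assert_eq!(*typ, "join.inner");
        assert!(matches!(children[0], Matcher::Node { .. })); // nested join
        assert_eq!(children[1], Matcher::Capture("c"));
    } else {
        unreachable!();
    }
    let _ = join_commute_matcher();
    println!("ok");
}
```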
The correct plan node set generally follows Calcite's definition of plan nodes; we might need to adjust them a little to make them work with Datafusion.