| Revision | $Id: //open/dev/farrago/src/org/eigenbase/relopt/volcano/package.html#1 $ |
|---|---|
| Copyright | Copyright (C) 2005-2009 The Eigenbase Project
Copyright (C) 2003-2009 SQLstream, Inc. Copyright (C) 2009-2009 LucidEra, Inc. |
| Author | Julian Hyde, Stephan Zuercher |
A planner (also known as an optimizer) finds the most efficient implementation of a {@link org.eigenbase.rel.RelNode relational expression}.
Interface {@link org.eigenbase.relopt.RelOptPlanner} defines a planner, and class {@link org.eigenbase.relopt.volcano.VolcanoPlanner} is an implementation which uses a dynamic programming technique. It is based upon the Volcano optimizer [1].
Interface {@link org.eigenbase.relopt.RelOptCost} defines a cost model; class {@link
org.eigenbase.relopt.volcano.VolcanoCost} is the implementation for a VolcanoPlanner.
A {@link org.eigenbase.relopt.volcano.RelSet} is a set of equivalent relational expressions.
They are equivalent because they will produce the same result for any set of
input data. It is an equivalence class: two expressions are in the same set if
and only if they are in the same RelSet.
One of the unique features of the optimizer is that expressions can take on a variety of physical traits. Each relational expression has a set of traits. Each trait is described by an implementation of {@link org.eigenbase.relopt.RelTraitDef}. Manifestations of the trait implement {@link org.eigenbase.relopt.RelTrait}. The most common example of a trait is calling convention: the protocol used to receive and transmit data. {@link org.eigenbase.relopt.CallingConventionTraitDef} defines the trait and {@link org.eigenbase.relopt.CallingConvention} enumerates the protocols. Every relational expression has a single calling convention by which it returns its results. Some examples:
names array would
generate the following code:
String[] names;
for (int i = 0; i < names.length; i++) {
String name = names[i];
// ... code for consuming relational expression ...
}
New traits are added to the planner in one of two ways:
AbstractRelNode constructors
if most relational expressions use a single manifestation of the trait.setRoot(RelNode) call.The second trait extension mechanism requires that implementations of {@link
org.eigenbase.rel.AbstractRelNode#clone()} must not assume the type and quantity of traits in
their trait set. In either case, the new RelTraitDef implementation must be
{@link org.eigenbase.relopt.volcano.VolcanoPlanner#addRelTraitDef(org.eigenbase.relopt.RelTraitDef)}
registered with the planner.
A {@link org.eigenbase.relopt.volcano.RelSubset} is a subset of a RelSet
containing expressions which are equivalent and which have the same
CallingConvention. Like RelSet, it is an equivalence class.
org.eigenbase.rel defines {@link
org.eigenbase.rel.RelNode relational expressions}.openjava.ptree
defines an object model for Java expressions....
Sets merge when the result of a rule already exists in another set. This implies that all of the expressions are equivalent. The RelSets are merged, and so are the contained RelSubsets.
Expression registration.
Algorithm
To optimize a relational expression R:
1. Register R.
2. Create rule-calls for all applicable rules.
3. Rank the rule calls by importance.
4. Call the most important rule
5. Repeat.
Importance. A rule-call is important if it is likely to produce better implementation of a relexp on the plan's critical path. Hence (a) it produces a member of an important RelSubset, (b) its children are cheap.
Conversion. Conversions are difficult because we have to work backwards from the goal.
Rule triggering
The rules are:
PushFilterThroughProjectRule. Operands:Filter Project
CombineProjectsRule. Operands:Project Project
A rule can be triggered by a change to any of its operands. Consider the rule to combine two filters into one. It would have operands [Filter [Filter]]. If I register a new Filter, it will trigger the rule in 2 places. Consider:
Project (deptno) [exp 1, subset A]
Filter (gender='F') [exp 2, subset B]
Project (deptno, gender, empno) [exp 3, subset C]
Project (deptno, gender, empno, salary) [exp 4, subset D]
TableScan (emp) [exp 0, subset X]
Apply PushFilterThroughProjectRule to [exp 2, exp 3]:
Project (deptno) [exp 1, subset A]
Project (deptno, gender, empno) [exp 5, subset B]
Filter (gender='F') [exp 6, subset E]
Project (deptno, gender, empno, salary) [exp 4, subset D]
TableScan (emp) [exp 0, subset X]
Two new expressions are created. Expression 5 is in subset B (because it is equivalent to expression 2), and expression 6 is in a new equivalence class, subset E.
The products of a applying a rule can trigger a cascade of rules. Even in this simple system (2 rules and 4 initial expressions), two more rules are triggered:
CombineProjectsRule(exp 1, exp 5),
which creates
Project (deptno) [exp 7, subset A]
Filter (gender='F') [exp 6, subset E]
Project (deptno, gender, empno, salary) [exp 4, subset D]
TableScan (emp) [exp 0, subset X]
PushFilterThroughProjectRule(exp
6, exp 4), which creates
Project (deptno) [exp 1, subset A]
Project (deptno, gender, empno) [exp 5, subset B]
Project (deptno, gender, empno, salary) [exp 8, subset E]
Filter (gender='F') [exp 9, subset F]
TableScan (emp) [exp 0, subset X]
Each rule application adds additional members to existing subsets. The
non-singleton subsets are now A {1, 7}, B {2, 5} and E {6, 8}, and new
combinations are possible. For example, CombineProjectsRule(exp 7,
exp 8) further reduces the depth of the tree to:
Project (deptno) [exp 10, subset A]
Filter (gender='F') [exp 9, subset F]
TableScan (emp) [exp 0, subset X]
Todo: show how rules can cause subsets to merge.
Conclusion: