MySQL 8.0 Server层最新架构详解
(SELECT *
FROM ttt1)
UNION ALL
(SELECT *
FROM
(SELECT *
FROM ttt2) AS a,
(SELECT *
FROM ttt3
UNION ALL SELECT *
FROM ttt4) AS b)
setup_tables:Set up table leaves in the query block based on list of tables.
resolve_placeholder_tables/merge_derived/setup_table_function/setup_materialized_derived:Resolve derived table, view or table function references in query block.
setup_natural_join_row_types:Compute and store the row types of the top-most NATURAL/USING joins.
setup_wild:Expand all '*' in list of expressions with the matching column references.
setup_base_ref_items:Set query_block's base_ref_items.
setup_fields:Check that all given fields exists and fill struct with current data.
setup_conds:Resolve WHERE condition and join conditions.
setup_group:Resolve and set up the GROUP BY list.
m_having_cond->fix_fields:Setup the HAVING clause.
resolve_rollup:Resolve items in SELECT list and ORDER BY list for rollup processing.
resolve_rollup_item:Resolve an item (and its tree) for rollup processing by replacing items matching grouped expressions with Item_rollup_group_items and updating properties (m_nullable, PROP_ROLLUP_FIELD). Also check any GROUPING function for incorrect column.
setup_order:Set up the ORDER BY clause.
resolve_limits:Resolve OFFSET and LIMIT clauses.
Window::setup_windows1:Set up windows after setup_order() and before setup_order_final().
setup_order_final:Do final setup of ORDER BY clause, after the query block is fully resolved.
setup_ftfuncs:Setup full-text functions after resolving HAVING.
resolve_rollup_wfs : Replace group by field references inside window functions with references in the presence of ROLLUP.
remove_redundant_subquery_clause : Permanently remove redundant parts from the query if 1) This is a subquery 2) Not normalizing a view. Removal should take place when a query involving a view is optimized, not when the view is created.
remove_base_options:Remove SELECT_DISTINCT options from a query block if can skip distinct.
resolve_subquery : Resolve predicate involving subquery, perform early unconditional subquery transformations.
Convert subquery predicate into semi-join, or
Mark the subquery for execution using materialization, or
Perform IN->EXISTS transformation, or
Perform more/less ALL/ANY -> MIN/MAX rewrite
Substitute trivial scalar-context subquery with its value
transform_scalar_subqueries_to_join_with_derived:Transform eligible scalar subqueries to derived tables.
flatten_subqueries:Convert semi-join subquery predicates into semi-join join nests. Convert candidate subquery predicates into semi-join join nests. This transformation is performed once in query lifetime and is irreversible.
apply_local_transforms :
delete_unused_merged_columns : If query block contains one or more merged derived tables/views, walk through lists of columns in select lists and remove unused columns.
simplify_joins:Convert all outer joins to inner joins if possible
prune_partitions:Perform partition pruning for a given table and condition.
push_conditions_to_derived_tables:Pushing conditions down to derived tables must be done after validity checks of grouped queries done by apply_local_transforms();
Window::eliminate_unused_objects:Eliminate unused window definitions, redundant sorts etc.
testdb=# SELECT id, data FROM tbl_a WHERE id < 300 ORDER BY data;
sampledb=# CREATE VIEW employees_list
sampledb-# AS SELECT e.id, e.name, d.name AS department
sampledb-# FROM employees AS e, departments AS d WHERE e.department_id = d.id;
sampledb=# SELECT * FROM employees_list;
| -> Inner hash join (no condition) (cost=1.40 rows=1)
-> Table scan on R4 (cost=0.35 rows=1)
-> Hash
-> Inner hash join (no condition) (cost=1.05 rows=1)
-> Table scan on R3 (cost=0.35 rows=1)
-> Hash
-> Inner hash join (no condition) (cost=0.70 rows=1)
-> Table scan on R2 (cost=0.35 rows=1)
-> Hash
-> Table scan on R1 (cost=0.35 rows=1)
| -> Nested loop inner join (cost=0.55..0.55 rows=0)
-> Nested loop inner join (cost=0.50..0.50 rows=0)
-> Table scan on R4 (cost=0.25..0.25 rows=1)
-> Filter: (R4.c1 = R3.c1) (cost=0.35..0.35 rows=0)
-> Table scan on R3 (cost=0.25..0.25 rows=1)
-> Nested loop inner join (cost=0.50..0.50 rows=0)
-> Table scan on R2 (cost=0.25..0.25 rows=1)
-> Filter: (R2.c1 = R1.c1) (cost=0.35..0.35 rows=0)
-> Table scan on R1 (cost=0.25..0.25 rows=1)
Logical transformations
optimize_derived : Optimize the query expression representing a derived table/view.
optimize_cond : Equality/constant propagation.
prune_table_partitions : Partition pruning.
optimize_aggregated_query : COUNT(*), MIN(), MAX() constant substitution in case of implicit grouping.
substitute_gc : ORDER BY optimization, substitute all expressions in the WHERE condition and ORDER/GROUP lists that match generated columns (GC) expressions with GC fields, if any.
Perform cost-based optimization of table order and access path selection.
JOIN::make_join_plan() : Set up join order and initial access paths.
Post-join order optimization
substitute_for_best_equal_field : Create optimal table conditions from the where clause and the join conditions.
make_join_query_block : Inject outer-join guarding conditions.
Adjust data access methods after determining table condition (several times).
optimize_distinct_group_order : Optimize ORDER BY/DISTINCT.
optimize_fts_query : Perform FULLTEXT search before all regular searches.
remove_eq_conds : Removes const and eq items. Returns the new item, or nullptr if no condition.
replace_index_subquery/create_access_paths_for_index_subquery : See if this subquery can be evaluated with subselect_indexsubquery_engine.
setup_join_buffering : Check whether join cache could be used.
Code generation
alloc_qep(tables) : Create QEP_TAB array.
test_skip_sort : Try to optimize away sorting/distinct.
make_join_readinfo : Plan refinement stage: do various setup things for the executor.
make_tmp_tables_info : Setup temporary table usage for grouping and/or sorting.
push_to_engines : Push (parts of) the query execution down to the storage engines if they can provide faster execution of the query, or part of it.
create_access_paths : generated ACCESS_PATH.
先判断是否属于新优化器可以支持的Query语法(CheckSupportedQuery),不支持的直接返回错误ER_HYPERGRAPH_NOT_SUPPORTED_YET。
转化top_join_list变成JoinHypergraph结构。由于Hypergraph是比较独立的算法层面的实现,JoinHypergraph结构用来更好的把数据库的结构包装到Hypergraph的edges和nodes的概念上的。
通过EnumerateAllConnectedPartitions实现论文中的DPhyp算法。
CostingReceiver类包含了过去JOIN planning的主要逻辑,包括根据cost选择相应的访问路径,根据DPhyp生成的子计划进行评估,保留cost最小的子计划。
得到root_path后,接下来处理group/agg/having/sort/limit的。对于Group by操作,目前Hypergraph使用sorting first + streaming aggregation的方式。
Query_expression::m_root_iterator = CreateIteratorFromAccessPath(......)
unique_ptr_destroy_only<RowIterator> CreateIteratorFromAccessPath(
THD *thd, AccessPath *path, JOIN *join, bool eligible_for_batch_mode) {
......
switch (path->type) {
case AccessPath::TABLE_SCAN: {
const auto ¶m = path->table_scan();
iterator = NewIterator<TableScanIterator>(
thd, param.table, path->num_output_rows, examined_rows);
break;
}
case AccessPath::INDEX_SCAN: {
const auto ¶m = path->index_scan();
if (param.reverse) {
iterator = NewIterator<IndexScanIterator<true>>(
thd, param.table, param.idx, param.use_order, path->num_output_rows,
examined_rows);
} else {
iterator = NewIterator<IndexScanIterator<false>>(
thd, param.table, param.idx, param.use_order, path->num_output_rows,
examined_rows);
}
break;
}
case AccessPath::REF: {
......
}
testdb=# EXPLAIN SELECT * FROM tbl_a WHERE id < 300 ORDER BY data;
QUERY PLAN
---------------------------------------------------------------
Sort (cost=182.34..183.09 rows=300 width=8)
Sort Key: data
-> Seq Scan on tbl_a (cost=0.00..170.00 rows=300 width=8)
Filter: (id < 300)
(4 rows)
关于我们
数据库技术图谱
数据库技术图谱由阿里云数据库专家团出品,含7个知识点,17个课程 ,6个体验场景,9个公开课,带你从基础到进阶玩转常见数据库,同时深入实战,学习阿里云使用各数据库的最佳实践!