摘要:
An improved method, system, and computer program product is disclosed that provides a hybrid approach to optimization which allows different subsets of data accessed by a query to be optimized with different access paths and execution approaches. Transformations may be performed to re-write the query, which restructures the query in a way that facilitates the hybrid optimization process. Multiple transformations may be interleaved to produce an efficient re-written query.
摘要:
Techniques are provided for delaying evaluation of expensive expressions in a query. Expensive expressions in the query are established by cost information or by looking up a list of known expensive expressions for a match. After an execution plan is determined by using the early evaluation technique, one or more equivalent execution plans is established. The one or more equivalent execution plans may include both a type of execution plans that delay evaluation of expensive expressions and a type of execution plans that do not. In addition, the one or more equivalent execution plans may include both parallelized and non-parallelized alternatives to the execution plan identified by the early evaluation technique. Finally, based on a set of criteria, which may include comparing cost information among all the equivalent execution plans generated thus far, the best execution plan is chosen for the query.
摘要:
Under a type of query transformation referred to herein as join factorization, the branches of an UNION/UNION ALL query that join a common table are combined to reduce accesses to the common table. The transformation can be expressed as (T1 join T2) union all (T1 join T3)=T1 join (T2 union all T3), where T1, T2 and T3 are three tables. A given query may be rewritten in many alternate ways using join factorization. Evaluating each alternative can be expensive. Therefore, the alternatives are generated and evaluated in a way that minimizes the cost of evaluating the alternatives.
摘要翻译:在这里称为连接因式分解的一种类型的查询变换中,加入公共表的UNION / UNION ALL查询的分支被组合以减少对公共表的访问。 转换可以表示为(T1连接T2)联合全部(T1连接T3)= T1连接(T2联合全T3),其中T1,T2和T3是三个表。 给定的查询可以使用连接因式分解以许多替代方式重写。 评估每个替代品可能是昂贵的。 因此,以最小化评估替代品的成本的方式生成和评估替代方案。
摘要:
Under a type of query transformation referred to herein as join factorization, the branches of an UNION/UNION ALL query that join a common table are combined to reduce accesses to the common table. The transformation can be expressed as (T1 join T2) union all (T1 join T3)=T1 join (T2 union all T3), where T1, T2 and T3 are three tables. A given query may be rewritten in many alternate ways using join factorization. Evaluating each alternative can be expensive. Therefore, the alternatives are generated and evaluated in a way that minimizes the cost of evaluating the alternatives.
摘要翻译:在这里称为连接因式分解的一种类型的查询变换中,加入公共表的UNION / UNION ALL查询的分支被组合以减少对公共表的访问。 转换可以表示为(T1连接T2)联合全部(T1连接T3)= T1连接(T2联合全T3),其中T1,T2和T3是三个表。 给定的查询可以使用连接因式分解以许多替代方式重写。 评估每个替代品可能是昂贵的。 因此,以最小化评估替代品的成本的方式生成和评估替代方案。
摘要:
A method and apparatus for approximating a database statistic, such as the number of distinct values (NDV) is provided. To approximate the NDV for a portion of a table, a synopsis of distinct values is constructed. Each value in the portion is mapped to a domain of values. The mapping function is implemented with a uniform hash function, in one embodiment. If the resultant domain value does not exist in the synopsis, the domain value is added to the synopsis. If the synopsis reaches its capacity, a portion of the domain values are discarded from the synopsis. The statistic is approximated based on the number (N) of domain values in the synopsis and the portion of the domain that is represented in the synopsis relative to the size of the domain.
摘要:
Under a type of query transformation referred to herein as join factorization, the branches of an UNION/UNION ALL query that join a common table are combined to reduce accesses to the common table. The transformation can be expressed as (T1 join T2) union all (T1 join T3)=T1 join (T2 union all T3), where T1, T2 and T3 are three tables. A given query may be rewritten in many alternate ways using join factorization. Evaluating each alternative can be expensive. Therefore, the alternatives are generated and evaluated in a way that minimizes the cost of evaluating the alternatives.
摘要翻译:在这里称为连接因式分解的一种类型的查询变换中,加入公共表的UNION / UNION ALL查询的分支被组合以减少对公共表的访问。 转换可以表示为(T 1连接T 2)联合全部(T 1连接T 3)= T 1连接(T 2并联全部T 3),其中T 1,T 2和T 3是三个表。 给定的查询可以使用连接因式分解以许多替代方式重写。 评估每个替代品可能是昂贵的。 因此,以最小化评估替代品的成本的方式生成和评估替代方案。
摘要:
Under a type of query transformation referred to herein as join factorization, the branches of an UNION/UNION ALL query that join a common table are combined to reduce accesses to the common table. The transformation can be expressed as (T1 join T2) union all (T1 join T3)=T1 join (T2 union all T3), where T1, T2 and T3 are three tables. A given query may be rewritten in many alternate ways using join factorization. Evaluating each alternative can be expensive. Therefore, the alternatives are generated and evaluated in a way that minimizes the cost of evaluating the alternatives.
摘要翻译:在这里称为连接因式分解的一种类型的查询变换中,加入公共表的UNION / UNION ALL查询的分支被组合以减少对公共表的访问。 转换可以表示为(T1连接T2)联合全部(T1连接T3)= T1连接(T2联合全T3),其中T1,T2和T3是三个表。 给定的查询可以使用连接因式分解以许多替代方式重写。 评估每个替代品可能是昂贵的。 因此,以最小化评估替代品的成本的方式生成和评估替代方案。
摘要:
A method and apparatus for merging synopses to determine a database statistic, e.g., a number of distinct values (NDV), is disclosed. The merging can be used to determine an initial database statistic or to perform incremental statistics maintenance. For example, each synopsis can pertain to a different partition, such that merging the synopses generates a global statistic. When performing incremental maintenance, only those synopses whose partitions have changed need to be updated. Each synopsis contains domain values that summarize the statistic. However, the synopses may initially contain domain values that are not compatible with each other. Prior to merging the synopses the domain values in each synopsis is made compatible with the domain values in the other synopses. The adjustment is made such that each synopsis represents the same range of domain values, in one embodiment. After “compatible synopses” are formed, the synopses are merged by taking the union of the compatible synopses.
摘要:
A method and apparatus for approximating a database statistic, such as the number of distinct values (NDV) is provided. To approximate the NDV for a portion of a table, a synopsis of distinct values is constructed. Each value in the portion is mapped to a domain of values. The mapping function is implemented with a uniform hash function, in one embodiment. If the resultant domain value does not exist in the synopsis, the domain value is added to the synopsis. If the synopsis reaches its capacity, a portion of the domain values are discarded from the synopsis. The statistic is approximated based on the number (N) of domain values in the synopsis and the portion of the domain that is represented in the synopsis relative to the size of the domain.
摘要:
A method and apparatus for merging synopses to determine a database statistic, e.g., a number of distinct values (NDV), is disclosed. The merging can be used to determine an initial database statistic or to perform incremental statistics maintenance. For example, each synopsis can pertain to a different partition, such that merging the synopses generates a global statistic. When performing incremental maintenance, only those synopses whose partitions have changed need to be updated. Each synopsis contains domain values that summarize the statistic. However, the synopses may initially contain domain values that are not compatible with each other. Prior to merging the synopses the domain values in each synopsis is made compatible with the domain values in the other synopses. The adjustment is made such that each synopsis represents the same range of domain values, in one embodiment. After “compatible synopses” are formed, the synopses are merged by taking the union of the compatible synopses.