Optimizing database query execution by extending the relational algebra to include non-standard join operators

    公开(公告)号:US11068520B1

    公开(公告)日:2021-07-20

    申请号:US15890277

    申请日:2018-02-06

    IPC分类号: G06F16/00 G06F16/33 G06F16/23

    摘要: A method is executed at a computer system to retrieve data from a database. Upon receiving a database query, a database engine of the computer system parses the query to form an operator tree including a plurality of join operators. For each of the plurality of clauses, the database engine adds to the operator tree a respective node that specifies a mark join operator, a single join operator, an inner join operator, or an outer join operator. Specifically, the database engine adds the mark join operator when the respective clause includes one of a predetermined set of predicate subqueries, and adds the single join operator when the respective clause includes a scalar subquery. The database engine performs one or more optimization passes on the operator tree to form an optimized execution plan, and executes the optimized execution plan to retrieve a result set from the database.

    Elimination of Query Fragment Duplication in Complex Database Queries

    公开(公告)号:US20210019319A1

    公开(公告)日:2021-01-21

    申请号:US17064490

    申请日:2020-10-06

    IPC分类号: G06F16/2453

    摘要: A database engine includes one or more computing devices, each having one or more processors and memory. The memory stores programs configured for execution by the processors. The database engine receives a database query from a client, and parses the database query to build a query operator tree. The query operator tree includes a plurality of query operators. The database engine performs one or more optimization passes on the query operator tree, including a deduplication optimization pass, to form an optimized execution plan. The deduplication optimization pass includes determining that a first query operator is equivalent to a second query operator during a traversal of the query operator tree, and replacing the second query operator with a link to reuse results from the first query operator. The database engine executes the optimized execution plan to retrieve a result set from the database and returns the result set to the client.

    Elimination of query fragment duplication in complex database queries

    公开(公告)号:US10795888B2

    公开(公告)日:2020-10-06

    申请号:US16231302

    申请日:2018-12-21

    IPC分类号: G06F16/2453

    摘要: A database engine receives a database query from a client. The database engine parses the database query to build a query operator tree that includes a plurality of query operators. The database engine performs one or more optimization passes on the query operator tree, including a deduplication optimization pass, to form an optimized execution plan. The deduplication optimization pass includes: creating a list of query operators via a first traversal of the query operator tree, determining a first query operator that is equivalent to a second query operator, based on a hash map, via a second traversal of the query operator tree, and substituting, via a third traversal of the query operator tree, the second query operator with a tree node that links to the first query operator. The database engine executes the optimized execution plan to retrieve a result set from the database, and returns the result set.

    Dynamic rebuilding of query execution trees and reselection of query execution operators

    公开(公告)号:US10795887B1

    公开(公告)日:2020-10-06

    申请号:US15681294

    申请日:2017-08-18

    摘要: A database engine receives a query and parses the query to form a first intermediate query. The engine compiles the first intermediate query to form a first executable plan that includes in-memory operators that execute within memory without swapping to secondary memory. While executing a first in-memory operator in the first executable plan, the engine detects insufficient memory and aborts execution of the first executable plan. The engine optimizes the first intermediate query to form a second intermediate query, and compiles the second intermediate query to form a second executable plan. The second plan includes spooling operators that execute within fixed memory budgets and are configured to swap to the secondary memory when needed. The engine executes the second executable plan, including the spooling operators, to retrieve results from the database that are responsive to the query. The engine then returns the retrieved results.

    Hybrid comparison for unicode text strings consisting primarily of ASCII characters

    公开(公告)号:US10789416B2

    公开(公告)日:2020-09-29

    申请号:US16726737

    申请日:2019-12-24

    摘要: A method compares text strings having Unicode encoding. The method receives a first string S=s1s2 . . . sn and a second string T=t1t2 . . . tm, where s1, s2, . . . , sn and t1, t2, . . . , tm are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S)=S. When S consists of ASCII characters and some accented ASCII characters that are replaceable by ASCII characters, ƒ(S)=g(s1)g(s2) . . . g(sn), where g(si)=si when si is an ASCII character and g(si)=s′i when si is an accented ASCII character that is replaceable by the corresponding ASCII character s′i. The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.

    Dynamic Rebuilding of Query Execution Trees and Reselection of Query Execution Operators

    公开(公告)号:US20220237193A1

    公开(公告)日:2022-07-28

    申请号:US17719346

    申请日:2022-04-12

    IPC分类号: G06F16/2453 G06F16/2455

    摘要: A method dynamically selects query execution operators. A database engine receives a query, parses the query to form a query execution tree, and compiles the tree to form a first executable plan that includes in-memory operators. The database engine executes the first plan, including executing in-memory operators in parallel. While executing a first in-memory operator, insufficient memory is detected. In response, the database engine aborts the execution, and recompiles the query tree in two ways, forming a second executable plan that replaces the first in-memory operator with a first spooling operator. The first spooling operator executes within a fixed volatile memory budget and swaps to non-volatile memory according to the budget. A third executable plan retains the first in-memory operator, but schedules it to run serially. The database engine selects either the second plan or the third plan, and executes the selected plan to return results for the query.

    Computing domain cardinality estimates for optimizing database query execution

    公开(公告)号:US10824625B1

    公开(公告)日:2020-11-03

    申请号:US16236183

    申请日:2018-12-28

    IPC分类号: G06F16/2453 G06N3/08

    摘要: A method implements optimization of database queries by computing domain cardinality estimates. A client sends a database query to a server. The method parses the query to identify data columns. For each of the data columns, the method computes a lower bound and an upper bound of distinct data values using a pre-computed table size. The method also computes a patch factor by applying a pre-computed function to a ratio between a number of distinct data values that appear exactly once in a data sample and a number of distinct data values in the sample. Based on the patch factor, the lower bound, and the upper bound, the method computes an estimate of distinct values for each of the data columns. The method subsequently generates an execution plan for the query according to the computed estimates, executes the execution plan, and returns a result set to the client.

    Dynamic rebuilding of query execution trees and reselection of query execution operators

    公开(公告)号:US11301469B2

    公开(公告)日:2022-04-12

    申请号:US17013439

    申请日:2020-09-04

    摘要: A method dynamically selects query execution operators. A database engine receives a query, parses the query to form a query execution tree, and compiles the tree to form a first executable plan that includes in-memory operators. The database engine executes the first plan, including executing in-memory operators in parallel. While executing a first in-memory operator, insufficient memory is detected. In response, the database engine aborts the execution, and recompiles the query tree in two ways, forming a second executable plan that replaces the first in-memory operator with a first spooling operator. The first spooling operator executes within a fixed volatile memory budget and swaps to non-volatile memory according to the budget. A third executable plan retains the first in-memory operator, but schedules it to run serially. The database engine selects either the second plan or the third plan, and executes the selected plan to return results for the query.

    Hybrid approach to collating unicode text strings consisting primarily of ASCII characters

    公开(公告)号:US10325010B1

    公开(公告)日:2019-06-18

    申请号:US16134919

    申请日:2018-09-18

    摘要: Collating text strings having Unicode encoding includes receiving two text strings S=s1s2 . . . s and T=t1t2 . . . tm. When the two text strings are not identical, there is a smallest positive integer p for which the two text strings differ. The process looks up the characters sp and tp in a predefined lookup table. If either of these characters is missing from the lookup table, the collation of the text strings is determined using the standard Unicode comparison of the text strings spsp+1 . . . sn and tptp+1 . . . tm. Otherwise, the lookup table assigns weights vp and wp for the characters sp and tp. When vp≠wp, these weights define the collation order of the strings S and T. When vp=wp, the collation of S and T is determined recursively using the suffix strings sp+1 . . . sn and tp+1 . . . tm.

    Hybrid comparison for unicode text strings consisting primarily of ASCII characters

    公开(公告)号:US10089281B1

    公开(公告)日:2018-10-02

    申请号:US15719479

    申请日:2017-09-28

    摘要: Comparing text strings with Unicode encoding includes receiving two text strings S1 and S2. The process computes, for the first text string S1, a first weight according to a weight function ƒ that computes an ASCII prefix ƒA(S1), computes a Unicode weight suffix ƒU(S1), and concatenates the weights to form the first weight ƒ(S1)=ƒA(S1)+ƒU(S1). Computing the ASCII prefix for the first string applies bitwise operations to n-byte contiguous blocks of the first string to determine whether each block contains only ASCII characters, and replaces accented Unicode characters with equivalent unaccented ASCII characters when comparison is designated as accent-insensitive. When there is a first block containing a non-replaceable non-ASCII character, the Unicode weight suffix is computed by performing a character-by-character Unicode weight lookup beginning with the first block. The same process is applied to the second string. The text string are compared by comparing their computed weights.