Method and apparatus for generating dynamic and hybrid sparse indices
for workfiles used in SQL queries
    1.
    Granted invention patent (status: expired)

    Publication number: US5758145A

    Publication date: 1998-05-26

    Application number: US393803

    Filing date: 1995-02-24

    IPC classification: G06F17/30

    Abstract: A method, apparatus and article for manufacture for generating static, dynamic and hybrid sparse indices for use with workfiles used by SQL queries in a relational database management system. A workfile and a sparse index structure are temporarily created in the computer during execution of the query by the computer. The workfile stores intermediate relations resulting from execution of a portion of the SQL query, wherein the intermediate relations comprise sorted rows for an inner table referenced in the SQL query. The sparse index structure contains one or more entries indicating at least an approximate location for at least some of the rows in the workfile. As one or more rows from an outer table referenced in the SQL query are retrieved, the entries of the sparse index structure are searched for a closest matching entry for each retrieved row. The sorted workfile is then scanned for a row matching each retrieved row using the closest matching entry from the sparse index structure as a starting position. The sparse index structure is then updated with an entry corresponding to the row from the sorted workfile matching the retrieved row from the outer table.

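The probe-and-update cycle described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the workfile is modeled as a sorted in-memory list of join keys, and the names `SparseIndexedWorkfile`, `probe`, `join`, and `rows_scanned` are hypothetical.

```python
from bisect import bisect_right


class SparseIndexedWorkfile:
    """Sorted workfile plus a sparse index that grows as probes succeed."""

    def __init__(self, inner_keys):
        self.rows = sorted(inner_keys)   # sorted rows of the inner table
        self.index_keys = []             # indexed key values, kept sorted
        self.index_pos = []              # workfile position for each indexed key
        self.rows_scanned = 0            # rows skipped over during scans

    def probe(self, key):
        """Return the position of the first row equal to key, or None."""
        # Closest index entry whose key is <= the probe key.
        i = bisect_right(self.index_keys, key) - 1
        pos = self.index_pos[i] if i >= 0 else 0
        # Scan the sorted workfile forward from that starting position.
        while pos < len(self.rows) and self.rows[pos] < key:
            pos += 1
            self.rows_scanned += 1
        if pos == len(self.rows) or self.rows[pos] != key:
            return None
        # Dynamic update: remember where this key was found.
        if i < 0 or self.index_keys[i] != key:
            self.index_keys.insert(i + 1, key)
            self.index_pos.insert(i + 1, pos)
        return pos


def join(outer_keys, workfile):
    """Pair each outer key with the matching workfile position, if any."""
    return [(k, workfile.probe(k)) for k in outer_keys]
```

In this sketch the index starts empty and gains an entry only when a probe finds a match, so repeated or nearby outer keys start their scan close to the target instead of at the top of the workfile.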

    Executing complex SQL queries using index screening for conjunct or
disjunct index operations
    4.
    Granted invention patent (status: expired)

    Publication number: US6081799A

    Publication date: 2000-06-27

    Application number: US305552

    Filing date: 1999-05-05

    IPC classification: G06F17/30

    Abstract: A method, apparatus, and article of manufacture for an index screening system. A query is executed to access data stored on a data storage device connected to a computer. In particular, while accessing one or more indexes to retrieve row identifiers, index matching predicates in the query are applied to select row identifiers and index screening predicates in the query are applied to eliminate one or more selected row identifiers.

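The distinction between matching and screening predicates can be illustrated with a small Python sketch. It assumes a hypothetical composite index on (dept, city, salary) stored as sorted (key, row identifier) pairs; the function name `index_scan` and the sample data are invented for illustration and are not taken from the patent.

```python
from bisect import bisect_left, bisect_right

# Hypothetical composite index on (dept, city, salary): sorted (key, rid) pairs.
INDEX = sorted([
    (("eng", "austin", 90), 1),
    (("eng", "boston", 120), 2),
    (("eng", "denver", 150), 3),
    (("ops", "austin", 130), 4),
    (("sales", "boston", 200), 5),
])


def index_scan(index, dept, min_salary):
    """Return (matched_rids, screened_rids) for dept = ? AND salary >= ?."""
    # Matching predicate: dept is the leading index column, so it bounds
    # the slice of the index that has to be read at all.
    leading = [key[0] for key, _ in index]
    lo, hi = bisect_left(leading, dept), bisect_right(leading, dept)
    matched = index[lo:hi]
    # Screening predicate: salary is in the index but cannot bound the scan
    # because city is unconstrained; it is still checked on each index entry
    # so that non-qualifying row identifiers are dropped before any data
    # rows are fetched.
    screened = [rid for key, rid in matched if key[2] >= min_salary]
    return [rid for _, rid in matched], screened
```

Here the matching predicate selects three row identifiers and the screening predicate eliminates one of them, so only two data rows would need to be fetched.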

    Query optimization through the use of multi-column statistics to avoid
the problems of column correlation
    5.
    Granted invention patent (status: expired)

    Publication number: US5995957A

    Publication date: 1999-11-30

    Application number: US808521

    Filing date: 1997-02-28

    IPC classification: G06F17/30

    Abstract: The system, method, and program of this invention collects multi-column statistics, by a database management system, to reflect a relationship among multiple columns of a table in a relational database. These statistics are stored in the system catalog, and are used during query optimization to obtain an estimate of the number of qualifying rows when a query has predicates on multiple columns of a table. A multi-column linear quantile statistic is collected by dividing the data of multiple columns into sub-ranges where each sub-range has approximately an even distribution of data, and determining a frequency and cardinality of each sub-range. A multi-column polygonal quantile statistic is collected by dividing the data of multiple columns into sub-spaces where each sub-space contains approximately the same number of tuples, and determining a frequency and cardinality of each sub-space. The system catalog is accessed for the stored multi-column linear quantile statistic for a query having a single range predicate and at least one equal predicate to determine the selectivity value for the predicates of the query. The system catalog is accessed for the stored multi-column polygonal quantile statistic for a query having more than one range predicate. These statistics are used in various ways to determine the selectivity value for the predicates of the query.

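The column-correlation problem that motivates these statistics can be shown with a short Python sketch. It is not the patent's linear or polygonal quantile statistic: as a simpler stand-in, it compares a selectivity estimate built from independent per-column frequencies with one built from a joint (multi-column) frequency count. The table contents and function names are hypothetical.

```python
from collections import Counter

# Hypothetical table with two strongly correlated columns (make, model).
ROWS = (
    [("Honda", "Civic")] * 40
    + [("Honda", "Accord")] * 10
    + [("Ford", "Focus")] * 30
    + [("Ford", "Fiesta")] * 20
)


def independent_selectivity(rows, make, model):
    """Estimate make = ? AND model = ? by multiplying per-column selectivities."""
    n = len(rows)
    sel_make = sum(1 for m, _ in rows if m == make) / n
    sel_model = sum(1 for _, d in rows if d == model) / n
    return sel_make * sel_model


def multi_column_selectivity(rows, make, model):
    """Estimate the same predicate pair from a joint frequency statistic."""
    joint = Counter(rows)  # frequency of each (make, model) combination
    return joint[(make, model)] / len(rows)
```

For the predicates make = 'Honda' AND model = 'Civic', the independence assumption yields half of the true fraction of qualifying rows, because every Civic is already a Honda; the joint statistic does not have that bias.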

    Query optimization through the use of multi-column statistics to avoid the problems of non-indexed column correlation
    6.
    Granted invention patent (status: in force)

    Publication number: US06272487B1

    Publication date: 2001-08-07

    Application number: US09277612

    Filing date: 1999-03-26

    IPC classification: G06F17/30

    Abstract: The system, method, and program of this invention collects multi-column statistics, by a database management system, to reflect a relationship among multiple columns of a table in a relational database. These statistics are stored in the system catalog, and are used during query optimization to obtain an estimate of the number of qualifying rows when a query has predicates on multiple columns of a table. A multi-column linear quantile statistic is collected by dividing the data of multiple columns into sub-ranges where each sub-range has approximately an even distribution of data, and determining a frequency and cardinality of each sub-range. A multi-column polygonal quantile statistic is collected by dividing the data of multiple columns into sub-spaces where each sub-space contains approximately the same number of tuples, and determining a frequency and cardinality of each sub-space. The system catalog is accessed for the stored multi-column linear quantile statistic for a query having a single range predicate and at least one equal predicate to determine the selectivity value for the predicates of the query. The system catalog is accessed for the stored multi-column polygonal quantile statistic for a query having more than one range predicate. These statistics are used in various ways to determine the selectivity value for the predicates of the query.
