Cost-based maintenance of materialized views
    3.
    Granted invention patent
    Status: lapsed

    Publication No.: US6026390A

    Publication date: 2000-02-15

    Application No.: US890194

    Filing date: 1997-07-09

    IPC classification: G06F17/30

    Abstract: A method of incrementally maintaining a first materialized view of data in a database, by means of an additional materialized view, first determines whether a cost in time of incrementally maintaining the first materialized view with the additional materialized view is less than the cost of incrementally maintaining the first materialized view without the additional materialized view. The method creates the additional materialized view only if the cost in time is less therewith. Determining whether the cost of employing an additional materialized view is less includes using an expression directed acyclic graph that corresponds to the first materialized view. Another method of determining whether the cost is less includes pruning an expression directed acyclic graph to produce a single expression tree, and using the single expression tree to determine whether the cost is less. Both the expression directed acyclic graph and the single expression tree contain equivalence nodes. One or more possible materialized views are selected by marking the equivalence nodes, and materializing one or more views corresponding to the marked equivalence nodes. One or more possible materialized views are also selected by determining which of the views, if materialized, would result in a lowest cost of incrementally maintaining the first materialized view. The method is also used to reduce the cost in time of maintaining a first materialized view employed to check an integrity constraint of the database.
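The decision rule in this abstract (materialize an additional view only when doing so lowers the total maintenance cost) can be illustrated with a small sketch. It does not model the expression directed acyclic graph or equivalence nodes from the patent; `choose_additional_views`, `toy_cost`, and the view names are hypothetical, and the cost figures are invented purely for illustration.

```python
from itertools import combinations


def choose_additional_views(candidates, maintenance_cost):
    """Return the subset of candidate views with the lowest maintenance cost.

    `maintenance_cost` is an assumed callback that maps a frozenset of
    additional views to the total cost of maintaining the first view plus
    those additional views. The empty set is the baseline, so a view is
    chosen only if it strictly lowers the cost.
    """
    best_set = frozenset()
    best_cost = maintenance_cost(best_set)
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            views = frozenset(subset)
            cost = maintenance_cost(views)
            if cost < best_cost:
                best_set, best_cost = views, cost
    return best_set, best_cost


def toy_cost(views):
    """Invented cost model: one helper view speeds maintenance, all add upkeep."""
    base = 30 if "dept_totals" in views else 100
    return base + 15 * len(views)


chosen, cost = choose_additional_views(["dept_totals", "emp_by_city"], toy_cost)
print(sorted(chosen), cost)
```

Exhaustive enumeration is exponential in the number of candidates, so this is only a way to make the comparison concrete for a handful of views.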

    System and method for performing joins and self-joins in a database system
    4.
    Granted invention patent
    Status: lapsed

    Publication No.: US5983215A

    Publication date: 1999-11-09

    Application No.: US853108

    Filing date: 1997-05-08

    IPC classification: G06F17/30

    Abstract: A technique for efficiently joining multiple large tables in a database system which utilizes a join index. The technique uses a join index and minimizes the number of input/output operations while maximizing the use of the small main memory through a buffer allocation process based on the join index entries. The technique uses multi-dimensional partitioning and assigns partition identifiers to each buffer which are used to coordinate the resultant output files when the technique is complete. The output is vertically fragmented with one fragment for each input table which further allows the individual processing of each input table. The technique performs self-joins in a very efficient manner by requiring the records of the input table to be read only once.

    Concurrency control in materialized views of a database
    5.
    Granted invention patent
    Status: lapsed

    Publication No.: US06889358B1

    Publication date: 2005-05-03

    Application No.: US09004265

    Filing date: 1998-01-08

    Abstract: In a database, a database manager can generate a view, which, in concept, is a subset of the database, which is placed outside the database for use without disturbing the database, and without disturbance by others using the database. The subset, or view, can be understood as a collection of rows, or tuples, of data copied from the database. With views existing, multiple copies of data within the database now exist: the original in the database, and copies in the views. If one of these is changed, without corresponding changes made in the others, then inconsistencies occur, which cannot be tolerated. Under the invention, when a user seeks a lock on a view, indicating that a change may be imminent, the invention locks a superset of the tuples in the database from which the view is derived. A superset is a set which contains the set of tuples of the view, plus possibly others. Thus, more tuples are locked than strictly necessary. The excess locking is tolerated because other benefits are obtained.

    Refreshing materialized views of a database to maintain consistency with underlying data
    6.
    Granted invention patent
    Status: lapsed

    Publication No.: US06272502B1

    Publication date: 2001-08-07

    Application No.: US09075728

    Filing date: 1998-05-11

    IPC classification: G06F17/30

    Abstract: In a database, a database manager can generate a view, which can be considered as a subset of the database, and which is placed outside the database for use without disturbing the database. However, if the database changes, the views will not reflect those changes, because the views are separate from the database. To solve this problem, a process called “refreshing” keeps the views consistent with the data within the database. But different refreshing approaches are used: some views require immediate refreshing when the database changes, other types can be refreshed at later times, and still other types can be refreshed at different times and intervals. The invention presents a system which keeps data consistent among the views and the database, despite the different times of refreshing undertaken.

    System and method for performing an efficient join operation
    7.
    Granted invention patent
    Status: lapsed

    Publication No.: US5802357A

    Publication date: 1998-09-01

    Application No.: US632958

    Filing date: 1996-04-16

    IPC classification: G06F7/22 G06F17/30

    Abstract: A technique for efficiently joining multiple large tables in a database system with a processor using a small main memory. The technique utilizes a join index and minimizes the number of Input/Output operations while maximizing the use of the small main memory through a buffer allocation process. Three embodiments of the technique are described all of which use the parallel-merge operation. The first technique, slam-join, is for joining two tables and does not require any pre-allocation of buffers to perform the join operation. The second technique, multi-slam-join, is for joining three or more tables and adds the parallel-merge technique to a join technique which partitions memory to be used for an efficient join operation. The third technique, called parallel-join, processes each input table completely independently using the parallel-merge technique. The parallel-merge technique identifies the lowest value from multiple files and orders all the values from lowest to highest. This enables sequential reading of input files saving I/O operations.
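The parallel-merge step described in this abstract (repeatedly pick the lowest value across several sorted inputs) resembles a standard k-way merge. The sketch below is a generic heap-based k-way merge, not the patented procedure; in-memory lists stand in for the input files, and the function name `parallel_merge` is chosen here only to echo the abstract.

```python
import heapq

_END = object()  # sentinel marking an exhausted run


def parallel_merge(sorted_runs):
    """Merge already-sorted runs into one ascending list.

    Each run is read strictly front to back, which mirrors the sequential
    reading of input files mentioned in the abstract.
    """
    iterators = [iter(run) for run in sorted_runs]
    heap = []
    for run_id, it in enumerate(iterators):
        first = next(it, _END)
        if first is not _END:
            heap.append((first, run_id))
    heapq.heapify(heap)

    merged = []
    while heap:
        value, run_id = heapq.heappop(heap)  # lowest value across all runs
        merged.append(value)
        following = next(iterators[run_id], _END)
        if following is not _END:
            heapq.heappush(heap, (following, run_id))
    return merged


print(parallel_merge([[1, 4, 9], [2, 3, 10], [], [5]]))
```

The standard library's `heapq.merge` offers the same behavior lazily; the explicit heap is spelled out here only to show where the lowest value is identified.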

    Cache sensitive search (CSS) tree indexing system and method
    8.
    Granted invention patent
    Status: lapsed

    Publication No.: US06711562B1

    Publication date: 2004-03-23

    Application No.: US09600266

    Filing date: 2002-02-27

    IPC classification: G06F17/30

    Abstract: Cache sensitive search tree (CSS-tree) index structures for providing improved search and lookup performance compared with conventional searching schemes. The CSS-tree index structures include a directory tree structure which is stored in an array (216) and serves as an index for a sorted array of elements. The nodes (215) in the directory tree structure may be of sizes selected to correspond to the cache line size in the computer system utilizing the CSS-tree index structures. Child nodes (213) within the directory tree structure are located by performing arithmetic operations on array offsets. Thus, it is not necessary to store internal child node pointers, thereby reducing memory storage requirements. In addition, the CSS-tree index structures are organized so that traversing each level in the tree yields good data reference locality, and therefore relatively few cache misses. Thus, the CSS-tree index structures consider cache-related parameters such as reference locality and cache behavior, without requiring substantial additional amounts of memory.
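The idea of locating child nodes by offset arithmetic instead of stored pointers can be sketched as follows. This is a simplified, hypothetical directory over a sorted array, not the CSS-tree layout claimed in the patent: each directory level is kept as its own list, `m` (keys per node) is an assumed stand-in for a cache-line-sized node, and the names `build_directory` and `search` are illustrative.

```python
from bisect import bisect_left


def build_directory(sorted_keys, m):
    """Build directory levels bottom-up.

    levels[0] is the sorted data; each higher level stores the largest key
    of every m-key block of the level below. The top level fits in one node.
    """
    levels = [list(sorted_keys)]
    while len(levels[-1]) > m:
        below = levels[-1]
        levels.append(
            [below[min(i + m, len(below)) - 1] for i in range(0, len(below), m)]
        )
    return levels


def search(levels, m, target):
    """Return the index of `target` in the sorted data, or -1 if absent."""
    node = 0  # block number within the current level
    for level in reversed(levels):
        start = node * m  # offset arithmetic replaces a child pointer
        block = level[start:start + m]
        j = bisect_left(block, target)
        if j == len(block):
            return -1  # target exceeds every key in this block
        node = start + j  # block number one level down (element index at level 0)
    return node if levels[0][node] == target else -1


keys = list(range(0, 100, 3))
levels = build_directory(keys, 4)
print(search(levels, 4, 27), search(levels, 4, 28))
```

Because a node's position determines where its children start, no per-node pointers are stored, which is the memory saving the abstract points to.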

    Method of calculating tuples for data cubes
    9.
    Granted invention patent
    Status: lapsed

    Publication No.: US5987467A

    Publication date: 1999-11-16

    Application No.: US911688

    Filing date: 1997-08-15

    IPC classification: G06F17/30

    Abstract: A method and apparatus of calculating data cubes is shown in which a data set is partitioned into memory sized data fragments and cuboid tuples are calculated from the data fragments. A search lattice of the data cube is used as a basis for ordering calculations of lower dimensional cuboids in the data cube. Identification of a minimum number of paths through the lattice that is sufficient to traverse all nodes in the lattice is achieved by iteratively duplicating twice all paths in a lower dimensional space, distributing a new attribute to the first duplicate, moving end points from paths of the second duplicate to a corresponding path in the first duplicate and merging the first and second duplicates.
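To make the term "cuboid tuples" concrete, the sketch below computes every cuboid of a small data cube (one group-by per subset of the dimensions) while reading the data one fragment at a time. It is a naive illustration only: it does not implement the lattice path ordering described in the abstract, it uses a sum as the aggregate by assumption, and `cube_from_fragments` and the column names in the usage example are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cube_from_fragments(fragments, dims, measure):
    """Sum `measure` for every subset of `dims`, one fragment at a time.

    Returns a mapping: group-by attributes (tuple) -> {key tuple: total}.
    Only one fragment needs to be held in memory at any moment.
    """
    cube = defaultdict(lambda: defaultdict(int))
    for fragment in fragments:
        for row in fragment:
            for size in range(len(dims) + 1):
                for group in combinations(dims, size):
                    key = tuple(row[d] for d in group)
                    cube[group][key] += row[measure]
    return cube


rows = [
    {"city": "NY", "year": 1997, "sales": 5},
    {"city": "NY", "year": 1998, "sales": 7},
    {"city": "LA", "year": 1997, "sales": 3},
]
cube = cube_from_fragments([rows[:2], rows[2:]], ("city", "year"), "sales")
print(dict(cube[("city",)]))
```

With d dimensions there are 2^d cuboids, which is why an ordering over the search lattice that lets lower dimensional cuboids reuse earlier work matters in practice.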

    System and method for performing an efficient join operation on large tables with a small main memory
    10.
    Granted invention patent
    Status: lapsed

    Publication No.: US5666525A

    Publication date: 1997-09-09

    Application No.: US531789

    Filing date: 1995-09-21

    Applicant: Kenneth A. Ross

    Inventor: Kenneth A. Ross

    IPC classification: G06F7/22 G06F17/30

    Abstract: A technique for efficiently joining multiple large tables in a database system with a processor using a small main memory. The technique utilizes a join index and minimizes the number of Input/Output operations while maximizing the use of the small main memory through a buffer allocation process. The technique partitions available main memory into buffers and assigns conditions to the buffers to ensure that each buffer will receive a substantially equal amount of data in the join result. The technique then processes each input table separately based on the assigned conditions and sequentially reads and processes each input table. The output is vertically fragmented with one fragment for each input table which further allows the individual processing of each input table. Also described is a method for creating a join index if one is not present.