Merging synopses to determine number of distinct values in large databases
    121.
    Patent application
    Status: In force

    Publication No.: US20080120275A1

    Publication date: 2008-05-22

    Application No.: US11796110

    Application date: 2007-04-25

    IPC classification: G06F17/30

    Abstract: A method and apparatus for merging synopses to determine a database statistic, e.g., a number of distinct values (NDV), is disclosed. The merging can be used to determine an initial database statistic or to perform incremental statistics maintenance. For example, each synopsis can pertain to a different partition, such that merging the synopses generates a global statistic. When performing incremental maintenance, only those synopses whose partitions have changed need to be updated. Each synopsis contains domain values that summarize the statistic. However, the synopses may initially contain domain values that are not compatible with each other. Prior to merging the synopses, the domain values in each synopsis are made compatible with the domain values in the other synopses. In one embodiment, the adjustment is made such that each synopsis represents the same range of domain values. After “compatible synopses” are formed, the synopses are merged by taking the union of the compatible synopses.
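
The merge step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that each synopsis stores 64-bit hashes of column values, that a synopsis which outgrows its capacity halves the hash range it retains, and that NDV is estimated as the number of retained hashes scaled by the number of halvings. The names `Synopsis`, `hash_value`, `merge`, and the `capacity` parameter are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field

HASH_BITS = 64


def hash_value(value) -> int:
    """Map a column value to a stable 64-bit hash (hypothetical helper)."""
    digest = hashlib.sha256(repr(value).encode()).digest()
    return int.from_bytes(digest[:8], "big")


@dataclass
class Synopsis:
    capacity: int = 1024      # assumed maximum number of retained hashes
    level: int = 0            # how many times the retained range was halved
    hashes: set = field(default_factory=set)

    def upper_bound(self) -> int:
        # Only hashes below this bound are represented by the synopsis.
        return 1 << (HASH_BITS - self.level)

    def shrink_to(self, level: int) -> None:
        # Narrow the represented range and drop hashes that fall outside it.
        self.level = level
        bound = self.upper_bound()
        self.hashes = {h for h in self.hashes if h < bound}

    def add(self, value) -> None:
        h = hash_value(value)
        if h < self.upper_bound():
            self.hashes.add(h)
            while len(self.hashes) > self.capacity:
                self.shrink_to(self.level + 1)

    def estimate(self) -> int:
        # Each retained hash stands for 2**level distinct values.
        return len(self.hashes) * (1 << self.level)


def merge(synopses, capacity=1024) -> Synopsis:
    """Make the synopses compatible, then take their union."""
    # Compatible means: every synopsis represents the same hash range,
    # so use the narrowest range found among the inputs.
    level = max(s.level for s in synopses)
    merged = Synopsis(capacity=capacity, level=level)
    bound = merged.upper_bound()
    for s in synopses:
        merged.hashes |= {h for h in s.hashes if h < bound}
    while len(merged.hashes) > merged.capacity:
        merged.shrink_to(merged.level + 1)
    return merged


# One synopsis per partition; only a changed partition needs rebuilding.
part_a, part_b = Synopsis(), Synopsis()
for v in range(100):
    part_a.add(v)
for v in range(50, 150):
    part_b.add(v)
global_ndv = merge([part_a, part_b]).estimate()
```

Because `merge` leaves its inputs untouched, the per-partition synopses can be kept and re-merged after a single partition is refreshed.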

    Query processing in a parallel single cursor model on multi-instance configurations, using hints
    122.
    Patent application
    Status: In force

    Publication No.: US20070038595A1

    Publication date: 2007-02-15

    Application No.: US11202453

    Application date: 2005-08-11

    IPC classification: G06F17/30

    Abstract: A database statement is processed in a multi-server system in a manner that increases the possibility that slave server processes on remote nodes will generate execution plans that are equivalent to the corresponding execution plan generated by the query coordinator process. A set of hints is generated based on the same information on which the master plan is based. The set of hints is sent to remote nodes, where respective remote plans are generated based in part on the set of hints. Use of the hints in generation of the remote plan increases the possibility that the remote plan will be equivalent to the master plan and that the slaves on the other database server will be able to join in parallel processing of the database statement.

    Techniques for pruning a data object during operations that join multiple data objects
    123.
    Granted patent
    Status: In force

    Publication No.: US07020661B1

    Publication date: 2006-03-28

    Application No.: US10193620

    Application date: 2002-07-10

    IPC classification: G06F17/30

    Abstract: Techniques for eliminating one or more portions of a data object from any join step of an operation that joins multiple data objects include determining that an operation joins a first data object and a second data object. The second data object includes multiple portions. Each of multiple data units of the first data object is scanned. Based on data in the data units of the first data object, information is generated. The information indicates a portion of the second data object for exclusion. The indicated portion is excluded from an output of the operation. Only one or more portions of the second data object that are not indicated for exclusion in the information are included in a particular join step involving the second data object. By pruning a large second table, such as a fact table, the computational resources consumed by the joins are substantially reduced.
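
A rough Python sketch of the pruning idea in this abstract follows. It is an illustration under stated assumptions, not the patented method: the first data object is a small list of rows, the second is a dictionary of partitions, and a caller-supplied `partition_of` function maps a join key to a partition id. All function and parameter names are hypothetical.

```python
from collections import defaultdict


def partitions_to_exclude(dim_rows, join_key, partition_of, all_partitions):
    """Scan the first object and report which partitions cannot match."""
    needed = {partition_of(row[join_key]) for row in dim_rows}
    return set(all_partitions) - needed


def pruned_join(dim_rows, fact_partitions, join_key, partition_of):
    """Join dim_rows to the partitioned fact data, skipping excluded partitions."""
    excluded = partitions_to_exclude(
        dim_rows, join_key, partition_of, fact_partitions
    )
    # Index the small side by join key for the probe phase.
    dim_by_key = defaultdict(list)
    for row in dim_rows:
        dim_by_key[row[join_key]].append(row)

    joined, scanned = [], []
    for pid, rows in fact_partitions.items():
        if pid in excluded:
            continue  # pruned: this partition never enters the join step
        scanned.append(pid)
        for fact in rows:
            for dim in dim_by_key.get(fact[join_key], []):
                joined.append({**dim, **fact})
    return joined, scanned


# Hypothetical data: fact rows partitioned by day // 10.
fact_partitions = {
    0: [{"day": 5, "amount": 1}],
    1: [{"day": 15, "amount": 2}],
    2: [{"day": 25, "amount": 3}],
}
dim_rows = [{"day": 5, "name": "a"}, {"day": 25, "name": "b"}]
joined, scanned = pruned_join(dim_rows, fact_partitions, "day", lambda d: d // 10)
```

In this example partition 1 is never read, because no scanned row of the first object maps to it.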

    Method and mechanism for partition pruning
    124.
    Granted patent
    Status: In force

    Publication No.: US06965891B1

    Publication date: 2005-11-15

    Application No.: US09795904

    Application date: 2001-02-27

    IPC classification: G06F17/30

    Abstract: A method and system for performing partition pruning for queries that include a non-single table predicate is disclosed. According to an embodiment of the invention, this type of query is processed by performing a transformation of the query to include additional predicates comprising subqueries. The transformed query includes single table predicates on the partitioning column of the table being queried, based upon join predicates that exist in the original query.
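
The kind of transformation this abstract describes can be sketched in Python with the standard `sqlite3` module. This is only an assumption-laden illustration: SQLite has no partitioned tables, so the example shows that adding a subquery predicate on the would-be partitioning column leaves the result unchanged. It does not show the pruning itself. The table names, column names, and the `add_pruning_predicate` helper are hypothetical.

```python
import sqlite3


def add_pruning_predicate(fact_alias, partition_col, dim_table, dim_col, dim_filter):
    """Build a single-table predicate on the fact table's partitioning column.

    The predicate is derived from the join predicate
    fact.partition_col = dim.dim_col and the filter on the other table.
    """
    return (
        f"{fact_alias}.{partition_col} IN "
        f"(SELECT {dim_col} FROM {dim_table} WHERE {dim_filter})"
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE times (time_id INTEGER, quarter TEXT);
    CREATE TABLE sales (time_id INTEGER, amount INTEGER);
    INSERT INTO times VALUES (1, 'Q1'), (2, 'Q1'), (3, 'Q2');
    INSERT INTO sales VALUES (1, 10), (2, 20), (3, 30), (3, 40);
    """
)

# Original query: the only predicate touching sales is a join predicate.
original = (
    "SELECT s.time_id, s.amount FROM sales s JOIN times t "
    "ON s.time_id = t.time_id WHERE t.quarter = 'Q1'"
)

# Transformed query: same result, plus a predicate on sales.time_id alone.
extra = add_pruning_predicate("s", "time_id", "times", "time_id", "quarter = 'Q1'")
transformed = original + " AND " + extra

rows_original = sorted(conn.execute(original).fetchall())
rows_transformed = sorted(conn.execute(transformed).fetchall())
```

In a system with partitioned tables, the added predicate refers only to the partitioning column of the queried table, which is what would let the subquery result be used to decide which partitions to read.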

    Enabling intra-partition parallelism for partition-based operations
    125.
    Granted patent
    Status: In force

    Publication No.: US06954776B1

    Publication date: 2005-10-11

    Application No.: US09851065

    Application date: 2001-05-07

    IPC classification: G06F9/46 G06F9/50 G06F15/16

    CPC classification: G06F9/5066 G06F17/30445

    Abstract: Techniques are provided for increasing the degree of parallelism without incurring overhead costs associated with inter-nodal communication for performing parallel operations. One aspect of the invention is to distribute the partition-pairs of a parallel partition-wise operation on a pair of objects among the nodes of a database system. The partition-pairs that are distributed to each node are further partitioned to form a new set of partition-pairs. One partition-pair from the new set is assigned to each slave process that is on a given node. In addition, a target object may be partitioned by applying an appropriate hash function to the tuples of the target object. The parallel operation is performed by broadcasting each tuple from a source table only to the group of slave processes that is working on the static partition to which the tuple is mapped.
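
The last two sentences of this abstract can be read as a two-level scheme, sketched below in Python. This is one interpretation under stated assumptions, not the patented design: each static partition of the target is served by a fixed-size group of slave processes, target tuples are spread over the group by a hash, and each source tuple is sent to every slave in the one group that owns its static partition. Slaves are simulated as dictionary slots, and all names are hypothetical.

```python
import zlib
from collections import defaultdict


def stable_hash(row) -> int:
    """Deterministic hash of a tuple (hypothetical helper)."""
    return zlib.crc32(repr(sorted(row.items())).encode())


def parallel_join(source_rows, target_partitions, static_partition_of,
                  slaves_per_partition, key):
    # Sub-partition each static target partition across its slave group.
    target_at = defaultdict(list)
    for pid, rows in target_partitions.items():
        for row in rows:
            slot = stable_hash(row) % slaves_per_partition
            target_at[(pid, slot)].append(row)

    # Broadcast each source tuple only to the group for its static partition.
    source_at = defaultdict(list)
    for row in source_rows:
        pid = static_partition_of(row[key])
        for slot in range(slaves_per_partition):
            source_at[(pid, slot)].append(row)

    # Each simulated slave joins its target share with the tuples it received.
    results = []
    for slave, targets in target_at.items():
        for t in targets:
            for s in source_at.get(slave, []):
                if s[key] == t[key]:
                    results.append((s, t))
    return results


# Hypothetical data: static partitions are keyed by k // 10.
source = [{"k": 1, "s": "a"}, {"k": 2, "s": "b"}, {"k": 11, "s": "c"}]
target = {
    0: [{"k": 1, "t": "x"}, {"k": 2, "t": "y"}, {"k": 1, "t": "z"}],
    1: [{"k": 11, "t": "w"}],
}
pairs = parallel_join(source, target, lambda k: k // 10, 2, "key" if False else "k")
```

Because every target tuple lives on exactly one slave and every source tuple reaches all slaves of the matching group, each matching pair is produced once, and no source tuple is sent to a group working on a different static partition.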

    Automatic prevention of run-away query execution
    126.
    Patent application
    Status: Under examination (published)

    Publication No.: US20050177557A1

    Publication date: 2005-08-11

    Application No.: US10936779

    Application date: 2004-09-07

    IPC classification: G06F7/00 G06F17/00 G06F17/30

    Abstract: A run-away query execution is automatically identified by a background process that periodically looks at each of the currently executing queries and compares the current execution time with the execution time estimated by the optimizer. Each query execution having a negative execution time difference can be automatically identified as a run-away query execution. The query execution plans that result in run-away executions can then be automatically tuned to produce more efficient execution plans.
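
The check performed by the background process can be sketched in Python as follows. The abstract does not define the sign convention for the "execution time difference", so this sketch assumes it is the optimizer estimate minus the elapsed time, which makes a negative value mean the query has overrun its estimate. `RunningQuery` and `find_runaways` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RunningQuery:
    query_id: str
    started_at: float          # seconds, on the same clock as `now`
    estimated_seconds: float   # execution time estimated by the optimizer


def find_runaways(queries, now):
    """Return the ids of queries whose elapsed time exceeds the estimate."""
    flagged = []
    for q in queries:
        elapsed = now - q.started_at
        # Assumed convention: difference = estimate - elapsed.
        difference = q.estimated_seconds - elapsed
        if difference < 0:
            flagged.append(q.query_id)
    return flagged


# One pass of the periodic check over two hypothetical queries.
running = [
    RunningQuery("q1", started_at=0.0, estimated_seconds=10.0),
    RunningQuery("q2", started_at=0.0, estimated_seconds=100.0),
]
runaways = find_runaways(running, now=30.0)
```

A real monitor would call such a function on a timer and pass the flagged plans to a tuning step, which is outside the scope of this sketch.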

    Managing parallel execution of work granules according to their affinity
    130.
    Granted patent
    Status: In force

    Publication No.: US06826753B1

    Publication date: 2004-11-30

    Application No.: US09415031

    Application date: 1999-10-07

    IPC classification: G06F9/00

    Abstract: A method and apparatus are provided for managing work granules being executed in parallel. A task is evenly divided between a number of work granules. The number of work granules falls between a threshold minimum and a threshold maximum. The threshold minimum and maximum may be configured to balance a variety of efficiency factors affected by the number of work granules, including workload skew and overhead incurred in managing a larger number of work granules. Work granules are distributed to processes on nodes according to which of the nodes, if any, may execute the work granule efficiently. A variety of factors may be used to determine where a work granule may be performed efficiently, including whether data accessed during the execution of a work granule may be locally accessed by a node.
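
The three steps in this abstract (choose a granule count within thresholds, divide the task evenly, place granules by affinity) can be sketched in Python. This is a simplified illustration, not the patented method: it assumes that affinity is a single preferred node per granule, and that granules with no affinity go to the least-loaded node. All names and parameters are hypothetical.

```python
def granule_count(task_size, target_granule_size, minimum, maximum):
    """Pick a granule count, clamped between the two thresholds."""
    wanted = -(-task_size // target_granule_size)  # ceiling division
    return max(minimum, min(maximum, wanted))


def split_evenly(total_units, count):
    """Divide a task into `count` granules whose sizes differ by at most one."""
    base, extra = divmod(total_units, count)
    return [base + 1 if i < extra else base for i in range(count)]


def distribute(granules, nodes):
    """Assign (granule_id, preferred_node_or_None) pairs to nodes."""
    load = {node: [] for node in nodes}
    for granule_id, preferred in granules:
        if preferred in load:
            target = preferred  # the node that can access the data locally
        else:
            # No affinity: fall back to the node with the fewest granules.
            target = min(nodes, key=lambda n: len(load[n]))
        load[target].append(granule_id)
    return load


count = granule_count(task_size=1000, target_granule_size=10, minimum=4, maximum=32)
sizes = split_evenly(1000, count)
placement = distribute([("g1", "A"), ("g2", "A"), ("g3", None)], ["A", "B"])
```

The minimum guards against too few granules to balance skew, and the maximum caps the management overhead that the abstract mentions.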
