Determining a height-balanced histogram incrementally
    1.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08392406B1

    Publication date: 2013-03-05

    Application No.: US12190885

    Filing date: 2008-08-13

    IPC classification: G06F7/00

    CPC classification: G06F17/18 G06F17/30501

    Abstract: A table-level histogram is maintained incrementally without requiring rescanning of the entire table when new data values are added to the table. A table has multiple partitions of data values. A histogram for data values of the partitions is generated. When a new partition of data values is added to the table, a histogram for only the new partition is generated. To generate a histogram for the entire table, the histograms for the previously generated and newly added partitions are used without needing to refer to the underlying data. A similar approach is applicable when modifying data values in a partition.
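
The abstract does not say how the per-partition histograms are combined, so the following is only a minimal sketch of one possible approach, not the patented method. It assumes each partition histogram is a pair `(endpoints, rows_per_bucket)`, treats every bucket as if all of its rows sat at the bucket's upper endpoint, and then picks new equal-height boundaries from the cumulative row counts. The function name `merge_height_balanced` and the input layout are hypothetical.

```python
def merge_height_balanced(partition_histograms, num_buckets):
    """Approximate a table-level height-balanced histogram.

    partition_histograms: list of (endpoints, rows_per_bucket) pairs, where
    endpoints is the sorted list of bucket upper bounds for one partition.
    Returns (merged_endpoints, rows_per_merged_bucket).
    Assumption: all rows of a bucket are placed at its upper endpoint.
    """
    # Flatten every partition bucket into a (value, row_count) pair.
    weighted = []
    for endpoints, rows_per_bucket in partition_histograms:
        weighted.extend((endpoint, rows_per_bucket) for endpoint in endpoints)
    weighted.sort()

    total_rows = sum(weight for _, weight in weighted)
    merged = []
    cumulative = 0
    for endpoint, weight in weighted:
        cumulative += weight
        # Emit this endpoint once for every merged-bucket boundary crossed.
        # Integer arithmetic avoids floating-point drift at the last boundary.
        while (len(merged) < num_buckets
               and cumulative * num_buckets >= total_rows * (len(merged) + 1)):
            merged.append(endpoint)
    return merged, total_rows / num_buckets
```

Only the stored histograms are read, never the table rows. A value that dominates the data can appear as the endpoint of several consecutive merged buckets, which is how height-balanced histograms usually signal a popular value.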


    Health monitor
    2.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08161323B2

    Publication date: 2012-04-17

    Application No.: US12252128

    Filing date: 2008-10-15

    IPC classification: G06F11/00

    Abstract: Techniques are provided for proactively and reactively running diagnostic functions. These diagnostic functions help to improve diagnostics of conditions detected in a monitored system and to limit/quarantine the damages caused by the detected conditions. In one embodiment, a health monitor infrastructure is provided that is configured to perform one or more health checks in a monitored system for diagnosing and/or gathering information related to the system. The one or more health checks may be invoked proactively on a scheduled basis, reactively in response to a condition detected in the system, or may even be invoked manually by a user such as a system administrator.


    Approximating a database statistic
    6.
    Type: Granted patent
    Legal status: In force

    Publication No.: US07636731B2

    Publication date: 2009-12-22

    Application No.: US11796102

    Filing date: 2007-04-25

    IPC classification: G06F7/00 G06F17/30 G06F17/00

    Abstract: A method and apparatus for approximating a database statistic, such as the number of distinct values (NDV), is provided. To approximate the NDV for a portion of a table, a synopsis of distinct values is constructed. Each value in the portion is mapped to a domain of values. The mapping function is implemented with a uniform hash function, in one embodiment. If the resultant domain value does not exist in the synopsis, the domain value is added to the synopsis. If the synopsis reaches its capacity, a portion of the domain values are discarded from the synopsis. The statistic is approximated based on the number (N) of domain values in the synopsis and the portion of the domain that is represented in the synopsis relative to the size of the domain.
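
The steps in this abstract can be illustrated with a short sketch. It rests on several assumptions that the abstract does not state: a 64-bit hash taken from SHA-256 stands in for the uniform hash function, the "portion" discarded is always half of the remaining domain (hashes whose next low bit is set), and the discard triggers when the synopsis exceeds its capacity. The class name `Synopsis` and its fields are hypothetical.

```python
import hashlib


class Synopsis:
    """Bounded set of hashed values covering a shrinking slice of the hash domain."""

    def __init__(self, capacity=1024):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.level = 0       # how many times the retained domain was halved
        self.values = set()  # domain (hash) values currently retained

    @staticmethod
    def _hash(value):
        # Assumption: SHA-256 truncated to 64 bits approximates a uniform hash.
        digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def _retained(self, h):
        # Keep only hashes whose `level` lowest bits are all zero.
        return h & ((1 << self.level) - 1) == 0

    def add(self, value):
        h = self._hash(value)
        if not self._retained(h) or h in self.values:
            return
        self.values.add(h)
        # When the synopsis overflows, halve the retained domain and
        # discard every stored hash that falls outside it.
        while len(self.values) > self.capacity and self.level < 64:
            self.level += 1
            self.values = {v for v in self.values if self._retained(v)}

    def estimate(self):
        # N retained values represent 1 / 2**level of the whole domain.
        return len(self.values) * (1 << self.level)
```

While the synopsis has never overflowed, `estimate()` is an exact distinct count (barring hash collisions). After one or more halvings it becomes a scaled approximation whose accuracy depends on the capacity.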


    Merging synopses to determine number of distinct values in large databases
    8.
    Type: Patent application
    Legal status: In force

    Publication No.: US20080120275A1

    Publication date: 2008-05-22

    Application No.: US11796110

    Filing date: 2007-04-25

    IPC classification: G06F17/30

    Abstract: A method and apparatus for merging synopses to determine a database statistic, e.g., a number of distinct values (NDV), is disclosed. The merging can be used to determine an initial database statistic or to perform incremental statistics maintenance. For example, each synopsis can pertain to a different partition, such that merging the synopses generates a global statistic. When performing incremental maintenance, only those synopses whose partitions have changed need to be updated. Each synopsis contains domain values that summarize the statistic. However, the synopses may initially contain domain values that are not compatible with each other. Prior to merging the synopses, the domain values in each synopsis are made compatible with the domain values in the other synopses. The adjustment is made such that each synopsis represents the same range of domain values, in one embodiment. After “compatible synopses” are formed, the synopses are merged by taking the union of the compatible synopses.
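
A minimal sketch of the merge step follows, under assumptions the abstract does not spell out: each synopsis is a pair `(level, hashes)`, where `level` counts how many times the hash domain has been halved and a hash is retained only if its `level` lowest bits are zero. Making synopses "compatible" then means filtering each one down to the highest level among them. The names `merge_synopses` and `estimate_ndv` are hypothetical.

```python
def merge_synopses(synopses):
    """Merge per-partition synopses into one global synopsis.

    synopses: non-empty list of (level, hashes) pairs, where hashes is a
    set of integer domain values retained at that level.
    Returns a (level, hashes) pair for the whole table.
    """
    # The most restrictive synopsis dictates the common domain range.
    level = max(lv for lv, _ in synopses)
    mask = (1 << level) - 1

    merged = set()
    for _, hashes in synopses:
        # Make this synopsis compatible, then take the union.
        merged |= {h for h in hashes if h & mask == 0}
    return level, merged


def estimate_ndv(synopsis):
    """Scale the retained count by the fraction of the domain represented."""
    level, hashes = synopsis
    return len(hashes) * (1 << level)
```

For incremental maintenance, only the synopsis of a changed partition would need to be rebuilt before calling `merge_synopses` again over the stored synopses; the unchanged partitions are not rescanned.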


    Automatic database diagnostic monitor architecture
    9.
    Type: Patent application
    Legal status: In force

    Publication No.: US20050055673A1

    Publication date: 2005-03-10

    Application No.: US10775531

    Filing date: 2004-02-09

    IPC classification: G06F9/44

    CPC classification: G06F17/30306 G06F17/30371

    Abstract: Techniques for self-diagnosing performance problems in a database are provided. The techniques include classifying one or more performance problems in a database system. One or more values for quantifying an impact of the one or more performance problems on the database system are then determined. The quantified values are determined based on the performance of operations in the database system. A performance problem based on the one or more quantified values is then determined. A solution for the performance problem is generated and may be outputted.


    Method for computing near neighbors of a query point in a database
    10.
    Type: Granted patent
    Legal status: Lapsed

    Publication No.: US6148295A

    Publication date: 2000-11-14

    Application No.: US560

    Filing date: 1997-12-30

    IPC classification: G06F17/30 G06K9/62

    Abstract: A method for determining k nearest-neighbors to a query point in a database in which an ordering is defined for a data set P of a database, the ordering being based on l one-dimensional codes $C_1, \ldots, C_l$. A single relation R is created in which R has the attributes of index-id, point-id and value. An entry $(j, i, C_{\epsilon j}(p_i))$ is included in relation R for each data point $p_i \in P$, where index-id equals j, point-id equals i, and value equals $C_{\epsilon j}(p_i)$. A B-tree index is created based on a combination of the index-id attribute and the value attribute. A query point is received and a relation Q is created for the query point having the attributes of index-id and value. One tuple is generated in the relation Q for each j, $j = 1, \ldots, l$, where index-id equals j and value equals $C_{\epsilon j}(q)$. A distance d is selected. The index-id attribute for the relation R of each data point $p_i$ is compared to the index-id attribute for the relation Q of the query point. A candidate data point $p_i$ is selected when the comparison of the relation R of a data point $p_i$ to the index-id attribute for the relation Q of the query point is less than the distance d. Lower bounds are calculated for each cube of the plurality of cubes that represent a minimum distance between any point in a cube and the query point. Lastly, k candidate data points $p_i$ are selected as k nearest-neighbors to the query point.
