Anomaly detection in data perspectives
    1.
    Granted patent
    Anomaly detection in data perspectives (status: lapsed)

    Publication No.: US07065534B2

    Publication date: 2006-06-20

    Application No.: US10874956

    Filing date: 2004-06-23

    IPC classification: G06F7/00 G06F17/00

    Abstract: The present invention leverages curve-fitting techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of on-screen, drill-down, and drill-across data anomalies in pivot tables and/or OLAP cubes. It determines whether data substantially deviates from a predicted value established by a curve-fitting process such as a piecewise linear function applied to the data tube. A threshold value can also be employed to help determine the degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention indicates to a user the type and location of a detected anomaly from a top-level data perspective.
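
    The thresholded-deviation idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes a data tube is a plain list of numbers and uses the simplest possible piecewise linear predictor, the straight line between a point's two neighbors. The names `predict_from_neighbors` and `detect_anomalies` are hypothetical.

```python
from typing import List, Sequence


def predict_from_neighbors(values: Sequence[float], i: int) -> float:
    """Predict values[i] by linear interpolation between its two neighbors.

    Assumption: this stands in for the curve-fitting step; the abstract
    only says "a piece-wise linear function applied to the data tube".
    """
    return (values[i - 1] + values[i + 1]) / 2.0


def detect_anomalies(values: Sequence[float], threshold: float) -> List[int]:
    """Return indices whose deviation from the prediction exceeds threshold.

    Endpoints are skipped because they have only one neighbor.
    """
    anomalies = []
    for i in range(1, len(values) - 1):
        deviation = abs(values[i] - predict_from_neighbors(values, i))
        if deviation > threshold:
            anomalies.append(i)
    return anomalies


# Toy data tube with a single spike at index 4.
tube = [10, 11, 12, 13, 40, 15, 16, 17]
print(detect_anomalies(tube, threshold=20))
```

    In this toy predictor a spike also shifts the predictions for its two neighbors by half the spike height, so the threshold has to sit above that level; a fit that excludes the suspect point from its own neighbors' predictions would avoid this.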

    Anomaly detection in data perspectives
    2.
    Granted patent
    Anomaly detection in data perspectives (status: in force)

    Publication No.: US07162489B2

    Publication date: 2007-01-09

    Application No.: US11299539

    Filing date: 2005-12-12

    IPC classification: G06F7/00

    Abstract: The present invention leverages curve-fitting techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of on-screen, drill-down, and drill-across data anomalies in pivot tables and/or OLAP cubes. It determines whether data substantially deviates from a predicted value established by a curve-fitting process such as a piecewise linear function applied to the data tube. A threshold value can also be employed to help determine the degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention indicates to a user the type and location of a detected anomaly from a top-level data perspective.

    Automatic data perspective generation for a target variable
    3.
    Granted patent
    Automatic data perspective generation for a target variable (status: in force)

    Publication No.: US07225200B2

    Publication date: 2007-05-29

    Application No.: US10824108

    Filing date: 2004-04-14

    IPC classification: G06F17/00 G06F7/00

    Abstract: The present invention leverages machine learning techniques to automatically generate conditioning variables for constructing a data perspective for a given target variable. It determines and analyzes the best predictors for the target variable and employs them to help convey information about the target variable to a user. It automatically discretizes continuous and discrete variables utilized as target variable predictors to establish their granularity. In other instances of the present invention, a complexity and/or utility parameter can be specified to facilitate generation of the data perspective by analyzing a best target variable predictor against the complexity of the conditioning variable(s) and/or utility. The present invention can also adjust the conditioning variables (i.e., target variable predictors) of the data perspective to provide an optimum view and/or accept control inputs from a user to guide/control the generation of the data perspective.
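
    One way to picture "finding the best target variable predictors" is the sketch below. The abstract does not name a scoring method, so the use of equal-width binning and empirical mutual information is an assumption made purely for illustration, and `discretize`, `mutual_information`, and `rank_predictors` are hypothetical names.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def discretize(values: Sequence[float], bins: int) -> List[int]:
    """Equal-width binning: map each value to a bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs: Sequence, ys: Sequence) -> float:
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_predictors(
    columns: Dict[str, Sequence[float]], target: Sequence, bins: int = 2
) -> List[str]:
    """Order candidate conditioning variables from most to least informative."""
    scores = {
        name: mutual_information(discretize(col, bins), target)
        for name, col in columns.items()
    }
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Toy example: "price" separates the target classes, "noise" does not.
columns = {"noise": [5.0, 1.0, 5.0, 1.0], "price": [1.0, 2.0, 9.0, 10.0]}
target = ["low", "low", "high", "high"]
print(rank_predictors(columns, target))
```

    The `bins` argument plays the role of the granularity mentioned in the abstract; a complexity parameter could, for instance, cap how many top-ranked variables are used in the generated view.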

    Systems and methods for new time series model probabilistic ARMA
    4.
    Granted patent
    Systems and methods for new time series model probabilistic ARMA (status: in force)

    Publication No.: US07580813B2

    Publication date: 2009-08-25

    Application No.: US10463145

    Filing date: 2003-06-17

    IPC classification: G06F17/50 G05B23/02

    CPC classification: G06F17/18

    Abstract: The present invention utilizes a cross-prediction scheme to predict values of discrete and continuous time observation data, wherein the conditional variance of each continuous time tube variable is fixed to a small positive value. By allowing cross-predictions in an ARMA-based model, values of continuous and discrete observations in a time series are accurately predicted. The present invention accomplishes this by extending an ARMA model such that a first time series “tube” is utilized to facilitate or “cross-predict” values in a second time series tube, forming an “ARMAxp” model. In general, in the ARMAxp model, the distribution of each continuous variable is a decision graph having splits only on discrete variables and having linear regressions with continuous regressors at all leaves, and the distribution of each discrete variable is a decision graph having splits only on discrete variables and having additional distributions at all leaves.
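
    The notion of cross-prediction can be illustrated with a much simpler stand-in than the ARMAxp model itself. The sketch below fits a single linear regression in which one series is predicted from its own previous value and the previous value of a second series. It omits the decision graphs, the discrete variables, and the fixed conditional variance described in the abstract; the model form and the names `solve` and `fit_cross_predictor` are assumptions for illustration.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_cross_predictor(y: Sequence[float], x: Sequence[float]) -> List[float]:
    """Least-squares fit of y[t] = a + b*y[t-1] + c*x[t-1]; returns [a, b, c]."""
    rows = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    targets = [y[t] for t in range(1, len(y))]
    # Normal equations: (R^T R) w = R^T targets.
    rtr = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve(rtr, rty)


# Toy series: y is generated from its own lag and the lag of x.
x = [1, 3, 2, 5, 4, 6, 2, 7]
y = [0.0]
for t in range(1, len(x)):
    y.append(1.0 + 0.5 * y[t - 1] + 2.0 * x[t - 1])

a, b, c = fit_cross_predictor(y, x)
print(a, b, c)
```

    On this noise-free toy data the fitted coefficients should come out close to the generating values 1, 0.5, and 2; the nonzero coefficient on `x[t-1]` is the cross-prediction term.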

    ANOMALY DETECTION IN DATA PERSPECTIVES
    5.
    Patent application
    ANOMALY DETECTION IN DATA PERSPECTIVES (status: lapsed)

    Publication No.: US20050288883A1

    Publication date: 2005-12-29

    Application No.: US10874956

    Filing date: 2004-06-23

    Abstract: The present invention leverages curve-fitting techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of on-screen, drill-down, and drill-across data anomalies in pivot tables and/or OLAP cubes. It determines whether data substantially deviates from a predicted value established by a curve-fitting process such as a piecewise linear function applied to the data tube. A threshold value can also be employed to help determine the degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention indicates to a user the type and location of a detected anomaly from a top-level data perspective.

    Anomaly detection in data perspectives
    6.
    Patent application

    Publication No.: US20060106560A1

    Publication date: 2006-05-18

    Application No.: US11299539

    Filing date: 2005-12-12

    IPC classification: G06F19/00

    Abstract: The present invention leverages curve-fitting techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of on-screen, drill-down, and drill-across data anomalies in pivot tables and/or OLAP cubes. It determines whether data substantially deviates from a predicted value established by a curve-fitting process such as a piecewise linear function applied to the data tube. A threshold value can also be employed to help determine the degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention indicates to a user the type and location of a detected anomaly from a top-level data perspective.

    Determining near-optimal block size for incremental-type expectation maximization (EM) algorithms
    7.
    Granted patent
    Determining near-optimal block size for incremental-type expectation maximization (EM) algorithms (status: in force)

    Publication No.: US07246048B2

    Publication date: 2007-07-17

    Application No.: US11177734

    Filing date: 2005-07-08

    IPC classification: G06F7/60

    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm, as opposed to the standard EM algorithm, is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if it is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes until no new block sizes can be determined.
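
    The search loop described in the last two sentences of this abstract can be sketched as follows. How the speed increase is measured is not specified here, so `speed_increase` is a hypothetical callable supplied by the caller (for example, a short non-converged run of both EM variants); `choose_block_size` is likewise an illustrative name.

```python
from typing import Callable, Iterable, Optional, Tuple


def choose_block_size(
    candidates: Iterable[int],
    speed_increase: Callable[[int], float],
) -> Tuple[Optional[int], float]:
    """Return the candidate block size with the greatest measured speed increase.

    Returns (None, -inf) if there are no candidates.
    """
    best_size: Optional[int] = None
    best_gain = float("-inf")
    for size in candidates:
        gain = speed_increase(size)  # assumed to come from a short trial run
        if gain > best_gain:
            # Greatest speed increase so far: record it as the target size.
            best_size, best_gain = size, gain
    return best_size, best_gain


# Toy measurements standing in for real trial runs (made-up numbers).
toy_gains = {16: 1.2, 32: 1.9, 64: 2.4, 128: 2.3, 256: 1.5}
print(choose_block_size(toy_gains, lambda size: toy_gains[size]))
```

    Because the abstract notes that the speed increase is roughly flat over a range of block sizes, a coarse candidate grid such as powers of two may be sufficient in practice; that choice is an assumption, not something the abstract states.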

    Efficient determination of sample size to facilitate building a statistical model
    8.
    Granted patent
    Efficient determination of sample size to facilitate building a statistical model (status: in force)

    Publication No.: US07409371B1

    Publication date: 2008-08-05

    Application No.: US09873719

    Filing date: 2001-06-04

    IPC classification: G06N5/00

    CPC classification: G06N99/005

    Abstract: A model is constructed for an initial subset of the data using a first parameter estimation algorithm. The model may be evaluated, for example, by applying it to a holdout set of the data. If the model is not acceptable, additional data is added to the data subset and the first parameter estimation algorithm is repeated for the aggregate data subset. An appropriate subset of the data exists when the first parameter estimation algorithm produces an acceptable model. The appropriate subset of the data may then be employed by a second parameter estimation algorithm, which may be a more accurate version of the first algorithm or a different algorithm altogether, to build a statistical model that characterizes the data.
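
    The grow-until-acceptable loop in this abstract might look like the sketch below. The fitting routine, the scoring function, and the acceptance threshold are all supplied by the caller; `find_sample_size` and its parameters are hypothetical names, and the toy "model" (a sample mean) is only there to make the example runnable.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple, TypeVar

Model = TypeVar("Model")


def find_sample_size(
    data: Sequence[float],
    holdout: Sequence[float],
    fit: Callable[[Sequence[float]], Model],
    score: Callable[[Model, Sequence[float]], float],
    acceptable: float,
    initial: int,
    step: int,
) -> Tuple[int, Model]:
    """Grow the training subset until the cheap first-pass model is acceptable.

    Stops early when score(model, holdout) >= acceptable, or when all
    of the data has been used.
    """
    size = min(initial, len(data))
    while True:
        model = fit(data[:size])
        if score(model, holdout) >= acceptable or size >= len(data):
            return size, model
        size = min(size + step, len(data))


# Toy setup: the "model" is a mean, scored by closeness to the holdout mean.
data = [1, 2, 3, 9, 10, 5]
holdout = [4, 5, 6]
size, model = find_sample_size(
    data,
    holdout,
    fit=mean,
    score=lambda m, h: -abs(m - mean(h)),
    acceptable=-1.5,
    initial=2,
    step=2,
)
print(size, model)
```

    The returned `size` identifies the subset that would then be handed to the second, more accurate parameter estimation algorithm mentioned in the abstract.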

    Handwriting recognition with mixtures of Bayesian networks
    9.
    Granted patent
    Handwriting recognition with mixtures of Bayesian networks (status: lapsed)

    Publication No.: US07003158B1

    Publication date: 2006-02-21

    Application No.: US10075962

    Filing date: 2002-02-14

    IPC classification: G06K9/00

    CPC classification: G06K9/00422 G06K9/6296

    Abstract: The invention performs handwriting recognition using mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. Each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of its states. The MBNs encode the probabilities of observing the sets of visual observations corresponding to a handwritten character. Each of the HSBNs encodes the probabilities of observing the sets of visual observations corresponding to a handwritten character, given that the hidden common variable is in a particular state.
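
    The mixture structure described here, where the likelihood of the observations is the sum over states of the external hidden variable of that state's probability times the likelihood under the matching HSBN, can be sketched as below. As a loud simplification, each HSBN is replaced by a list of independent Bernoulli probabilities over binary observations (a Bayesian network with no edges); real HSBNs would carry structure and possibly hidden variables. All names and numbers are made up for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Toy stand-in for an HSBN: P(observation bit is 1), one entry per bit.
Hsbn = List[float]
# Toy MBN: (probability of a hidden-variable state, HSBN for that state).
Mbn = List[Tuple[float, Hsbn]]


def hsbn_likelihood(hsbn: Hsbn, obs: Sequence[int]) -> float:
    """Likelihood of the observed bits under one hypothesis-specific model."""
    p = 1.0
    for prob_on, bit in zip(hsbn, obs):
        p *= prob_on if bit else 1.0 - prob_on
    return p


def mbn_likelihood(mbn: Mbn, obs: Sequence[int]) -> float:
    """Mixture likelihood: sum over hidden states of prior * HSBN likelihood."""
    return sum(prior * hsbn_likelihood(hsbn, obs) for prior, hsbn in mbn)


def recognize(models: Dict[str, Mbn], obs: Sequence[int]) -> str:
    """Pick the character whose MBN assigns the observations the highest likelihood."""
    return max(models, key=lambda ch: mbn_likelihood(models[ch], obs))


# One MBN per character, each with two hidden states (made-up numbers).
models = {
    "I": [(0.5, [0.9, 0.1]), (0.5, [0.8, 0.2])],
    "L": [(0.5, [0.9, 0.9]), (0.5, [0.7, 0.8])],
}
print(recognize(models, [1, 1]), recognize(models, [1, 0]))
```

    Keeping the hidden state outside the per-state models mirrors the abstract's point that the common hidden variable is associated with the MBN but not included in any HSBN.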

    Handwriting recognition with mixtures of Bayesian networks
    10.
    Granted patent
    Handwriting recognition with mixtures of Bayesian networks (status: in force)

    Publication No.: US07200267B1

    Publication date: 2007-04-03

    Application No.: US11324444

    Filing date: 2005-12-30

    IPC classification: G06K9/00 G06K9/62

    CPC classification: G06K9/00422 G06K9/6296

    Abstract: The invention performs handwriting recognition using mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. Each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of its states. The MBNs encode the probabilities of observing the sets of visual observations corresponding to a handwritten character. Each of the HSBNs encodes the probabilities of observing the sets of visual observations corresponding to a handwritten character, given that the hidden common variable is in a particular state.
