Handwriting recognition with mixtures of Bayesian networks
    1.
    Granted invention patent
    Status: Expired

    Publication number: US07003158B1

    Publication date: 2006-02-21

    Application number: US10075962

    Filing date: 2002-02-14

    IPC classification: G06K9/00

    CPC classification: G06K9/00422 G06K9/6296

    Abstract: The invention performs handwriting recognition using mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. Each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of its states. The MBNs encode the probabilities of observing the sets of visual observations corresponding to a handwritten character. Each of the HSBNs encodes the probabilities of observing the sets of visual observations corresponding to a handwritten character, given that the hidden common variable is in a particular state.
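
The mixture structure described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented method: each HSBN is replaced by a toy model with independent binary features, one MBN per character is assumed, every character is given equal prior probability, and all names and parameter values (`hsbn_log_likelihood`, `EXAMPLE_MODELS`, and so on) are hypothetical.

```python
import math

def hsbn_log_likelihood(features, probs):
    """Toy stand-in for one HSBN: independent binary features (an assumption)."""
    return sum(math.log(p if x else 1.0 - p) for x, p in zip(features, probs))

def mbn_log_likelihood(features, weights, hsbns):
    """log p(x) = log sum_c p(C=c) * p(x | C=c), computed with log-sum-exp."""
    terms = [math.log(w) + hsbn_log_likelihood(features, h)
             for w, h in zip(weights, hsbns)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def classify(features, models):
    """Return the character whose MBN scores the observations highest.

    Assumes equal prior probability for every character.
    """
    return max(models, key=lambda ch: mbn_log_likelihood(features, *models[ch]))

# Hypothetical parameters: character -> (weights of the hidden variable's
# states, one toy HSBN per state).
EXAMPLE_MODELS = {
    "a": ([0.5, 0.5], [[0.9, 0.9, 0.1], [0.8, 0.2, 0.1]]),
    "b": ([0.7, 0.3], [[0.1, 0.2, 0.9], [0.2, 0.8, 0.9]]),
}

print(classify([1, 1, 0], EXAMPLE_MODELS))
```

The hidden common variable never appears inside a component model; it only selects which component applies and contributes the mixing weight, which mirrors the abstract's statement that the variable is external to every HSBN.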


    Handwriting recognition with mixtures of bayesian networks
    2.
    Granted invention patent
    Status: Active

    Publication number: US07200267B1

    Publication date: 2007-04-03

    Application number: US11324444

    Filing date: 2005-12-30

    IPC classification: G06K9/00 G06K9/62

    CPC classification: G06K9/00422 G06K9/6296

    Abstract: The invention performs handwriting recognition using mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. Each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of its states. The MBNs encode the probabilities of observing the sets of visual observations corresponding to a handwritten character. Each of the HSBNs encodes the probabilities of observing the sets of visual observations corresponding to a handwritten character, given that the hidden common variable is in a particular state.


    Systems and methods for new time series model probabilistic ARMA
    3.
    Granted invention patent
    Status: Active

    Publication number: US07580813B2

    Publication date: 2009-08-25

    Application number: US10463145

    Filing date: 2003-06-17

    IPC classification: G06F17/50 G05B23/02

    CPC classification: G06F17/18

    Abstract: The present invention utilizes a cross-prediction scheme to predict values of discrete and continuous time observation data, wherein conditional variance of each continuous time tube variable is fixed to a small positive value. By allowing cross-predictions in an ARMA based model, values of continuous and discrete observations in a time series are accurately predicted. The present invention accomplishes this by extending an ARMA model such that a first time series “tube” is utilized to facilitate or “cross-predict” values in a second time series tube to form an “ARMAxp” model. In general, in the ARMAxp model, the distribution of each continuous variable is a decision graph having splits only on discrete variables and having linear regressions with continuous regressors at all leaves, and the distribution of each discrete variable is a decision graph having splits only on discrete variables and having additional distributions at all leaves.


    Determining near-optimal block size for incremental-type expectation maximization (EM) algorithms
    4.
    Granted invention patent
    Status: Active

    Publication number: US07246048B2

    Publication date: 2007-07-17

    Application number: US11177734

    Filing date: 2005-07-08

    IPC classification: G06F7/60

    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.
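
The selection loop in the abstract above (measure the speed increase for a block size, keep it if it is the best so far, move on to the next size) can be sketched as follows. This is a minimal sketch under assumptions: the candidate list and the `measure_speedup` callable are hypothetical placeholders, and the measured values are made-up numbers standing in for short EM runs that stop before convergence.

```python
def choose_block_size(candidates, measure_speedup):
    """Return (block size, speed increase) for the best candidate.

    `measure_speedup(size)` is a hypothetical callable that would run a
    short, non-converged incremental EM and report its speed increase
    relative to standard EM.
    """
    best_size, best_speedup = None, float("-inf")
    for size in candidates:
        speedup = measure_speedup(size)
        if speedup > best_speedup:          # greatest determined so far
            best_size, best_speedup = size, speedup
    return best_size, best_speedup

# Made-up measurements used only to exercise the loop.
EXAMPLE_SPEEDUPS = {1: 1.2, 10: 2.5, 100: 3.1, 1000: 2.0}

target = choose_block_size(sorted(EXAMPLE_SPEEDUPS), EXAMPLE_SPEEDUPS.__getitem__)
print(target)
```

How new candidate block sizes are generated, and when no further size can be determined, is left open by the abstract; the fixed list here is just the simplest stand-in.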


    Efficient determination of sample size to facilitate building a statistical model
    5.
    Granted invention patent
    Status: Active

    Publication number: US07409371B1

    Publication date: 2008-08-05

    Application number: US09873719

    Filing date: 2001-06-04

    IPC classification: G06N5/00

    CPC classification: G06N99/005

    Abstract: A model is constructed for an initial subset of the data using a first parameter estimation algorithm. The model may be evaluated, for example, by applying the model to a holdout data set of the data. If the model is not acceptable, additional data is added to the data subset and the first parameter estimation algorithm is repeated for the aggregate data subset. An appropriate subset of the data exists when the first parameter estimation algorithm produces an acceptable model. The appropriate subset of the data may then be employed by a second parameter estimation algorithm, which may be a more accurate version of the first algorithm or a different algorithm altogether, to build a statistical model to characterize the data.
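
The grow-until-acceptable procedure in the abstract above might look like the sketch below. Everything concrete here is an assumption made for illustration: the doubling schedule, the acceptance threshold, the use of a mean as the first estimator and a median as the second, and the toy data. The data is taken as a prefix, so a real data set would normally be shuffled first.

```python
import statistics

def find_training_subset(data, holdout, fit_fast, score, acceptable,
                         start=100, growth=2):
    """Grow a prefix of `data` until the fast model scores acceptably on holdout."""
    n = min(start, len(data))
    while True:
        subset = data[:n]
        model = fit_fast(subset)                 # first estimation algorithm
        if acceptable(score(model, holdout)) or n == len(data):
            return subset
        n = min(n * growth, len(data))           # add more data and repeat

def mean(values):
    return sum(values) / len(values)

# Toy data chosen so that small prefixes give a poor estimate.
data = [1.0] * 150 + [3.0] * 650
holdout = [2.5] * 10

subset = find_training_subset(
    data, holdout,
    fit_fast=mean,
    score=lambda model, held: -abs(model - mean(held)),
    acceptable=lambda s: s >= -0.3,
)

# Second estimation algorithm, run once on the chosen subset (here a median).
final_model = statistics.median(subset)
print(len(subset), final_model)
```

The point of the design is that the expensive second algorithm runs only once, on a subset whose size was found using the cheap first algorithm.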


    Goal-oriented clustering
    6.
    Granted invention patent
    Status: Active

    Publication number: US06694301B1

    Publication date: 2004-02-17

    Application number: US09540255

    Filing date: 2000-03-31

    IPC classification: G06N5/02

    Abstract: Clustering for purposes of data visualization and making predictions is disclosed. Embodiments of the invention are operable on a number of variables that have a predetermined representation. The variables include input-only variables, output-only variables, and both input-and-output variables. Embodiments of the invention generate a model that has a bottleneck architecture. The model includes a top layer of nodes of at least the input-only variables, one or more middle layers of hidden nodes, and a bottom layer of nodes of the output-only and the input-and-output variables. At least one cluster is determined from this model. The model can be a probabilistic neural network and/or a Bayesian network.


    Determining near-optimal block size for incremental-type expectation maximization (EM) algorithms
    7.
    Granted invention patent
    Status: Expired

    Publication number: US06922660B2

    Publication date: 2005-07-26

    Application number: US09728508

    Filing date: 2000-12-01

    IPC classification: G06F17/10 G06F17/18

    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.


    Bayesian approach for learning regression decision graph models and regression models for time series analysis
    8.
    Granted invention patent
    Status: Active

    Publication number: US07660705B1

    Publication date: 2010-02-09

    Application number: US10102116

    Filing date: 2002-03-19

    IPC classification: G06F17/10

    CPC classification: G06K9/6297

    Abstract: Methods and systems are disclosed for learning a regression decision graph model using a Bayesian model selection approach. In a disclosed aspect, the model structure and/or model parameters can be learned using a greedy search algorithm applied to grow the model so long as the model improves. This approach enables construction of a decision graph having a model structure that includes a plurality of leaves, at least one of which includes a non-trivial linear regression. The resulting model thus can be employed for forecasting, such as for time series data, which can include single or multi-step forecasting.


    Automatic data perspective generation for a target variable
    9.
    Granted invention patent
    Status: Active

    Publication number: US07225200B2

    Publication date: 2007-05-29

    Application number: US10824108

    Filing date: 2004-04-14

    IPC classification: G06F17/00 G06F7/00

    Abstract: The present invention leverages machine learning techniques to provide automatic generation of conditioning variables for constructing a data perspective for a given target variable. The present invention determines and analyzes the best target variable predictors for a given target variable, employing them to facilitate the conveying of information about the target variable to a user. It automatically discretizes continuous and discrete variables utilized as target variable predictors to establish their granularity. In other instances of the present invention, a complexity and/or utility parameter can be specified to facilitate generation of the data perspective via analyzing a best target variable predictor versus the complexity of the conditioning variable(s) and/or utility. The present invention can also adjust the conditioning variables (i.e., target variable predictors) of the data perspective to provide an optimum view and/or accept control inputs from a user to guide/control the generation of the data perspective.


    Staged mixture modeling
    10.
    Granted invention patent
    Status: Active

    Publication number: US07133811B2

    Publication date: 2006-11-07

    Application number: US10270914

    Filing date: 2002-10-15

    IPC classification: G06F17/10

    Abstract: A system and method for generating staged mixture model(s) is provided. The staged mixture model includes a plurality of mixture components each having an associated mixture weight, and an added mixture component having an initial structure, parameters and associated mixture weight. The added mixture component is modified based, at least in part, upon a case that is undesirably addressed by the plurality of mixture components, using a structural expectation maximization (SEM) algorithm to modify the structure, parameters and/or associated mixture weight of the added mixture component. The staged mixture model employs a data-driven staged mixture modeling technique, for example, for building density, regression, and classification model(s). The basic approach is to add mixture component(s) (e.g., sequentially) to the staged mixture model using an SEM algorithm.
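
One way to write down the sequential addition of a component described in the abstract above is shown below. The notation is an assumption introduced only for illustration: $p_n$ is the mixture after $n$ stages, $f_i$ is the $i$-th component with parameters $\theta_i$, and $\pi_i$ is its mixture weight. The abstract does not state this formula, so it should be read as one plausible form rather than the patent's definition.

```latex
\begin{align}
p_n(x) &= \sum_{i=1}^{n} \pi_i \, f_i(x \mid \theta_i),
  \qquad \sum_{i=1}^{n} \pi_i = 1, \\
p_{n+1}(x) &= (1 - \pi_{n+1}) \, p_n(x) + \pi_{n+1} \, f_{n+1}(x \mid \theta_{n+1}).
\end{align}
```

Under this reading, the existing components are held fixed while the structure, the parameters $\theta_{n+1}$ and the weight $\pi_{n+1}$ of the added component are adjusted, with the earlier weights rescaled by $(1 - \pi_{n+1})$ so that the total still sums to one.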
