Detecting instabilities in time series forecasting
    1.
    Granted invention patent
    Status: active

    Publication number: US07617010B2

    Publication date: 2009-11-10

    Application number: US11319894

    Filing date: 2005-12-28

    IPC classification: G05B13/02

    CPC classification: G06F17/30539

    Abstract: A predictive model analysis system comprises a receiver component that receives predictive samples created by way of forward sampling. An analysis component analyzes a plurality of the received predictive samples and automatically determines whether a predictive model is reliable at a time range associated with the plurality of predictive samples, wherein the determination is made based at least in part upon an estimated norm associated with a forward sampling operator.
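
As a rough illustration of the idea in this abstract, the sketch below forward-samples a simple autoregressive model and estimates, step by step, how strongly a small perturbation of the observed history is amplified. The AR model, the function names, the perturbation size and the threshold are all assumptions made for illustration; the patent's actual norm estimate is not described in the abstract and may differ.

```python
import random

def forward_sample(coeffs, history, horizon, noise_std, rng):
    """Draw one future trajectory from an AR(p) model by forward sampling."""
    window = list(history[-len(coeffs):])
    path = []
    for _ in range(horizon):
        # coeffs[0] multiplies the most recent value in the window
        mean = sum(c * v for c, v in zip(coeffs, reversed(window)))
        value = mean + rng.gauss(0.0, noise_std)
        path.append(value)
        window = window[1:] + [value]
    return path

def estimated_gain(coeffs, history, horizon, noise_std,
                   eps=1e-3, n_pairs=200, seed=0):
    """Per-step estimate of how much a small history perturbation is amplified.

    This is a stand-in for a norm of the forward sampling operator
    (an assumption of this sketch, not the patented estimator).
    """
    gains = [0.0] * horizon
    bumped_history = [h + eps for h in history]
    for i in range(n_pairs):
        # Identical seeds give both runs the same noise, isolating the perturbation.
        base = forward_sample(coeffs, history, horizon, noise_std,
                              random.Random(seed + i))
        bumped = forward_sample(coeffs, bumped_history, horizon, noise_std,
                                random.Random(seed + i))
        for t in range(horizon):
            gains[t] = max(gains[t], abs(bumped[t] - base[t]) / eps)
    return gains

def first_unreliable_step(gains, threshold):
    """Return the first 1-based forecast step whose gain exceeds the threshold."""
    for t, g in enumerate(gains):
        if g > threshold:
            return t + 1
    return None
```

Calling `first_unreliable_step(estimated_gain(coeffs, history, 10, 0.1), 2.0)` returns the first horizon at which the forecast would be treated as unreliable under this sketch, or `None` if the gain stays below the threshold.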

    Dependency network based model (or pattern)
    2.
    Granted invention patent
    Status: active

    Publication number: US08140569B2

    Publication date: 2012-03-20

    Application number: US10447462

    Filing date: 2003-05-29

    IPC classification: G06F17/30 G06F7/00

    Abstract: A dependency network is created from a training data set utilizing a scalable method. A statistical model (or pattern), such as, for example, a Bayesian network, is then constructed to allow more convenient inferencing. The model (or pattern) is employed in lieu of the training data set for data access. The computational complexity of the method that produces the model (or pattern) is independent of the size of the original data set. The dependency network directly returns explicitly encoded data in the conditional probability distributions of the dependency network. Non-explicitly encoded data is generated via Gibbs sampling, approximated, or ignored.
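
To make the last sentence of this abstract concrete, here is a minimal sketch of Gibbs sampling over a two-variable dependency network. The conditional probability tables, variable names and sample counts are made up for illustration. The joint distribution over both variables is not stored in either local table, so it is estimated by repeatedly resampling each variable from its conditional.

```python
import random

# Hypothetical local distributions of a two-node dependency network:
# each table gives P(variable = 1 | value of the other variable).
P_A_GIVEN_B = {0: 0.2, 1: 0.7}
P_B_GIVEN_A = {0: 0.3, 1: 0.8}

def gibbs_joint_estimate(n_samples=20000, burn_in=1000, seed=0):
    """Estimate the joint P(A, B) by Gibbs sampling from the local conditionals."""
    rng = random.Random(seed)
    a, b = 0, 0
    counts = {}
    for step in range(burn_in + n_samples):
        # Resample each variable in turn, conditioning on the other's current value.
        a = 1 if rng.random() < P_A_GIVEN_B[b] else 0
        b = 1 if rng.random() < P_B_GIVEN_A[a] else 0
        if step >= burn_in:
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return {state: c / n_samples for state, c in counts.items()}
```

A query such as P(A = 1 | B = 1) is answered directly from `P_A_GIVEN_B`, while a query on the joint, such as P(A = 1, B = 1), is read from the sampled estimate.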

    Systems and methods for new time series model probabilistic ARMA
    3.
    Granted invention patent
    Status: active

    Publication number: US07580813B2

    Publication date: 2009-08-25

    Application number: US10463145

    Filing date: 2003-06-17

    IPC classification: G06F17/50 G05B23/02

    CPC classification: G06F17/18

    Abstract: The present invention utilizes a cross-prediction scheme to predict values of discrete and continuous time observation data, wherein conditional variance of each continuous time tube variable is fixed to a small positive value. By allowing cross-predictions in an ARMA based model, values of continuous and discrete observations in a time series are accurately predicted. The present invention accomplishes this by extending an ARMA model such that a first time series “tube” is utilized to facilitate or “cross-predict” values in a second time series tube to form an “ARMAxp” model. In general, in the ARMAxp model, the distribution of each continuous variable is a decision graph having splits only on discrete variables and having linear regressions with continuous regressors at all leaves, and the distribution of each discrete variable is a decision graph having splits only on discrete variables and having additional distributions at all leaves.

    Apparatus and accompanying methods for visualizing clusters of data and hierarchical cluster classifications
    4.
    Granted invention patent
    Status: active

    Publication number: US06742003B2

    Publication date: 2004-05-25

    Application number: US09845151

    Filing date: 2001-04-30

    IPC classification: G06F17/30

    Abstract: A system that incorporates an interactive graphical user interface for visualizing clusters (categories) and segments (summarized clusters) of data. Specifically, the system automatically categorizes incoming case data into clusters, summarizes those clusters into segments, determines similarity measures for the segments, scores the selected segments through the similarity measures, and then forms and visually depicts hierarchical organizations of those selected clusters. The system also automatically and dynamically reduces, as necessary, a depth of the hierarchical organization, through elimination of unnecessary hierarchical levels and inter-nodal links, based on similarity measures of segments or segment groups. Attribute/value data that tends to meaningfully characterize each segment is also scored, rank ordered based on normalized scores, and then graphically displayed. The system permits a user to browse through the hierarchy, and, to readily comprehend segment inter-relationships, selectively expand and contract the displayed hierarchy, as desired, as well as to compare two selected segments or segment groups together and graphically display the results of that comparison. An alternative discriminant-based cluster scoring technique is also presented.

    Generating improved belief networks
    5.
    Granted invention patent
    Status: lapsed

    Publication number: US06529888B1

    Publication date: 2003-03-04

    Application number: US08739200

    Filing date: 1996-10-30

    IPC classification: G06N5/04

    CPC classification: G06N5/022

    Abstract: An improved belief network generator is provided. A belief network is generated utilizing expert knowledge retrieved from an expert in a given field of expertise and empirical data reflecting observations made in the given field of the expert. In addition to utilizing expert knowledge and empirical data, the belief network generator provides for the use of continuous variables in the generated belief network and missing data in the empirical data.

    Generating improved belief networks
    6.
    Granted invention patent
    Status: lapsed

    Publication number: US5704018A

    Publication date: 1997-12-30

    Application number: US240019

    Filing date: 1994-05-09

    IPC classification: G06N5/02 G06F15/18

    CPC classification: G06N5/022

    Abstract: An improved belief network generator is provided. In a preferred embodiment of the present invention, a belief network is generated utilizing expert knowledge retrieved from an expert in a given field of expertise and empirical data reflecting observations made in the given field of the expert. In addition to utilizing expert knowledge and empirical data, the belief network generator of the preferred embodiment provides for the use of continuous variables in the generated belief network and missing data in the empirical data.

    Trees of classifiers for detecting email spam
    8.
    Granted invention patent
    Status: active

    Publication number: US07930353B2

    Publication date: 2011-04-19

    Application number: US11193691

    Filing date: 2005-07-29

    IPC classification: G06F15/16

    CPC classification: H04L51/12

    Abstract: Decision trees populated with classifier models are leveraged to provide enhanced spam detection utilizing separate email classifiers for each feature of an email. This provides a higher probability of spam detection through tailoring of each classifier model to facilitate in more accurately determining spam on a feature-by-feature basis. Classifiers can be constructed based on linear models such as, for example, logistic-regression models and/or support vector machines (SVM) and the like. The classifiers can also be constructed based on decision trees. “Compound features” based on internal and/or external nodes of a decision tree can be utilized to provide linear classifier models as well. Smoothing of the spam detection results can be achieved by utilizing classifier models from other nodes within the decision tree if training data is sparse. This forms a base model for branches of a decision tree that may not have received substantial training data.
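
The structure described in this abstract can be sketched as a decision tree whose nodes each carry a logistic-regression model, with a fallback to an ancestor's model when a node saw too little training data. Everything named here (the `Node` class, the feature names, the `min_train` cutoff and the fallback rule) is a hypothetical illustration, not the patented method.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    """A tree node holding its own logistic-regression spam classifier."""
    weights: Dict[str, float]
    bias: float = 0.0
    n_train: int = 0                      # training messages that reached this node
    split_feature: Optional[str] = None   # None marks a leaf
    children: Dict[bool, "Node"] = field(default_factory=dict)

def logistic_score(node: Node, features: Dict[str, float]) -> float:
    """Probability of spam under the linear model stored at this node."""
    z = node.bias + sum(w * features.get(name, 0.0)
                        for name, w in node.weights.items())
    return 1.0 / (1.0 + math.exp(-z))

def spam_probability(root: Node, features: Dict[str, float],
                     min_train: int = 50) -> float:
    """Walk the tree, then score with the deepest well-trained model on the path."""
    node, fallback = root, root
    while node.split_feature is not None:
        branch = bool(features.get(node.split_feature, 0.0))
        child = node.children.get(branch)
        if child is None:
            break
        node = child
        # Sparse nodes are skipped, so an ancestor's model acts as the base model.
        if node.n_train >= min_train:
            fallback = node
    return logistic_score(fallback, features)
```

In this sketch a message routed to a sparsely trained branch is scored by the nearest ancestor with at least `min_train` examples, which is one simple way to realize the smoothing the abstract mentions.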

    Scalable methods for learning Bayesian networks
    9.
    Granted invention patent
    Status: active

    Publication number: US07251636B2

    Publication date: 2007-07-31

    Application number: US10732074

    Filing date: 2003-12-10

    IPC classification: G06F15/18

    CPC classification: G06N7/005 G06N99/005

    Abstract: The present invention leverages scalable learning methods to efficiently obtain a Bayesian network for a set of variables of which the total ordering in a domain is known. Certain criteria are employed to generate a Bayesian network which is then evaluated and utilized as a guide to generate another Bayesian network for the set of variables. Successive iterations are performed utilizing a prior Bayesian network as a guide until a stopping criterion is reached, yielding a best-effort Bayesian network for the set of variables.

    Systems and methods for tractable variational approximation for interference in decision-graph Bayesian networks
    10.
    Granted invention patent
    Status: lapsed

    Publication number: US07184993B2

    Publication date: 2007-02-27

    Application number: US10458166

    Filing date: 2003-06-10

    IPC classification: G06F17/00 G06N5/02

    CPC classification: G06N7/005 G06N5/04

    Abstract: The present invention leverages approximations of distributions to provide tractable variational approximations, based on at least one continuous variable, for inference utilization in Bayesian networks where local distributions are decision-graphs. These tractable approximations are employed in lieu of exact inferences that are normally NP-hard to solve. By utilizing Jensen's inequality applied to logarithmic distributions composed of a generalized sum including an introduced arbitrary conditional distribution, a means is acquired to resolve a tightly bound likelihood distribution. The means includes application of Mean-Field Theory, approximations of conditional probability distributions, and/or other means that allow for a tractable variational approximation to be achieved.
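
The Jensen's inequality step mentioned in this abstract is commonly written as follows. The symbols are assumptions chosen for illustration: x denotes the observed variables, h the unobserved ones, and q(h | x) the introduced arbitrary conditional distribution; the abstract's "generalized sum" is shown here as a plain sum over h.

```latex
\log p(x)
  = \log \sum_{h} q(h \mid x)\, \frac{p(x, h)}{q(h \mid x)}
  \;\ge\; \sum_{h} q(h \mid x)\, \log \frac{p(x, h)}{q(h \mid x)}
```

The right-hand side is a lower bound on the log-likelihood for any choice of q, and it is tight when q(h | x) equals the true posterior p(h | x). A mean-field approximation restricts q to a factorized form so that the bound stays tractable.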
