System and method for approximating probabilities using a decision tree
    1.
    Granted invention patent
    Legal status: In force

    Publication No.: US06718315B1

    Publication date: 2004-04-06

    Application No.: US09740067

    Filing date: 2000-12-18

    IPC classification: G06F15/18

    CPC classification: G06N99/005

    Abstract: Disclosed is a system for approximating conditional probabilities using an annotated decision tree where predictor values that did not exist in training data for the system are tracked, stored, and referenced to determine if statistical aggregation should be invoked. Further disclosed is a system for storing statistics for deriving a non-leaf probability corresponding to predictor values, and a system for aggregating such statistics to approximate conditional probabilities.
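
The idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the `Node` layout, the `conditional_probability` helper, and the weather data are all hypothetical, and the fallback rule (stop at the current node and use its aggregated counts when a predictor value was never seen in training) is only one plausible reading of "statistical aggregation".

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Tree node annotated with target counts from training (hypothetical layout)."""
    counts: Counter                     # target value -> training count at this node
    split_on: Optional[str] = None      # predictor tested here; None marks a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)

    def probability(self, target: str) -> float:
        total = sum(self.counts.values())
        return self.counts[target] / total if total else 0.0


def conditional_probability(root: Node, predictors: Dict[str, str], target: str) -> float:
    """Walk the tree; on a predictor value unseen in training, stop at the
    current node and answer from its aggregated (non-leaf) statistics."""
    node = root
    while node.split_on is not None:
        child = node.children.get(predictors.get(node.split_on))
        if child is None:               # value was not present in the training data
            break
        node = child
    return node.probability(target)


# Hypothetical training statistics: 10 examples split on "weather".
tree = Node(
    counts=Counter(yes=6, no=4),
    split_on="weather",
    children={
        "sunny": Node(counts=Counter(yes=5, no=1)),
        "rainy": Node(counts=Counter(yes=1, no=3)),
    },
)

p_seen = conditional_probability(tree, {"weather": "sunny"}, "yes")    # leaf statistics
p_unseen = conditional_probability(tree, {"weather": "snowy"}, "yes")  # root statistics
```

Here the root's counts are simply the sum of its children's counts, so the answer for the unseen value "snowy" is the probability aggregated over all training examples that reached that node.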

    Trees of classifiers for detecting email spam
    2.
    Granted invention patent
    Legal status: In force

    Publication No.: US07930353B2

    Publication date: 2011-04-19

    Application No.: US11193691

    Filing date: 2005-07-29

    IPC classification: G06F15/16

    CPC classification: H04L51/12

    Abstract: Decision trees populated with classifier models are leveraged to provide enhanced spam detection utilizing separate email classifiers for each feature of an email. This provides a higher probability of spam detection through tailoring of each classifier model to facilitate in more accurately determining spam on a feature-by-feature basis. Classifiers can be constructed based on linear models such as, for example, logistic-regression models and/or support vector machines (SVM) and the like. The classifiers can also be constructed based on decision trees. “Compound features” based on internal and/or external nodes of a decision tree can be utilized to provide linear classifier models as well. Smoothing of the spam detection results can be achieved by utilizing classifier models from other nodes within the decision tree if training data is sparse. This forms a base model for branches of a decision tree that may not have received substantial training data.
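
The smoothing described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patent's procedure: the `ClassifierNode` structure, the `classify` function, the `min_train` threshold, and every weight and feature name are hypothetical. Each node holds a small logistic-regression model, and a node trained on too few examples defers to its nearest ancestor that has enough.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ClassifierNode:
    """Decision-tree node carrying its own linear classifier (hypothetical layout)."""
    weights: Dict[str, float]           # logistic-regression weights per feature
    bias: float
    n_train: int                        # number of training emails that reached this node
    split_feature: Optional[str] = None # None marks a leaf
    children: Dict[bool, "ClassifierNode"] = field(default_factory=dict)

    def spam_probability(self, features: Dict[str, float]) -> float:
        z = self.bias + sum(w * features.get(name, 0.0) for name, w in self.weights.items())
        return 1.0 / (1.0 + math.exp(-z))


def classify(root: ClassifierNode, features: Dict[str, float], min_train: int = 50) -> float:
    """Route the email down the tree, then score it with the deepest node on
    the path that was trained on at least `min_train` examples."""
    path = [root]
    node = root
    while node.split_feature is not None:
        branch = features.get(node.split_feature, 0.0) > 0.0
        child = node.children.get(branch)
        if child is None:
            break
        node = child
        path.append(node)
    for candidate in reversed(path):
        if candidate.n_train >= min_train:
            return candidate.spam_probability(features)
    return root.spam_probability(features)  # nothing met the threshold: use the base model


# Hypothetical tree: the "has_attachment" branch saw only 5 training emails.
root = ClassifierNode(
    weights={"has_link": 1.0}, bias=0.0, n_train=1000,
    split_feature="has_attachment",
    children={
        True: ClassifierNode(weights={}, bias=2.0, n_train=5),
        False: ClassifierNode(weights={}, bias=-2.0, n_train=500),
    },
)

p_sparse = classify(root, {"has_attachment": 1.0})  # sparse leaf, so the root model scores it
p_dense = classify(root, {"has_attachment": 0.0})   # well-trained leaf scores it
```

A hard fallback is the simplest choice; a weighted blend of the leaf and ancestor scores would be another way to realize the same idea.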

    Architecture for automated data analysis
    3.
    Granted invention patent
    Legal status: In force

    Publication No.: US06330563B1

    Publication date: 2001-12-11

    Application No.: US09298717

    Filing date: 1999-04-23

    IPC classification: G06F17/30

    Abstract: An architecture for automated data analysis. In one embodiment, a computerized system comprising an automated problem formulation layer, a first learning engine, and a second learning engine. The automated problem formulation layer receives a data set. The data set has a plurality of records, where each record has a value for each of a plurality of raw transactional variables. The layer abstracts the raw transactional variables into cooked transactional variables. The first learning engine generates a model for the cooked transactional variables, while the second learning engine generates a model for the raw transactional variables.

    Bayesian approach for learning regression decision graph models and regression models for time series analysis
    5.
    Granted invention patent
    Legal status: In force

    Publication No.: US07660705B1

    Publication date: 2010-02-09

    Application No.: US10102116

    Filing date: 2002-03-19

    IPC classification: G06F17/10

    CPC classification: G06K9/6297

    Abstract: Methods and systems are disclosed for learning a regression decision graph model using a Bayesian model selection approach. In a disclosed aspect, the model structure and/or model parameters can be learned using a greedy search algorithm applied to grow the model so long as the model improves. This approach enables construction of a decision graph having a model structure that includes a plurality of leaves, at least one of which includes a non-trivial linear regression. The resulting model thus can be employed for forecasting, such as for time series data, which can include single or multi-step forecasting.

    Visualization of high-dimensional data
    6.
    Granted invention patent
    Legal status: In force

    Publication No.: US06519599B1

    Publication date: 2003-02-11

    Application No.: US09517138

    Filing date: 2000-03-02

    IPC classification: G06F17/30

    Abstract: Visualization of high-dimensional data sets is disclosed, particularly the display of a network model for a data set. The network, such as a dependency or a Bayesian network, has a number of nodes having dependencies thereamong. The network can be displayed as items and connections, corresponding to nodes and dependencies, respectively. Selection of a particular item in one embodiment results in the display of the local distribution associated with the node for the item. In one embodiment, only a predetermined number of the items are shown, such as only the items representing the most popular nodes. Furthermore, in one embodiment, in response to receiving a user input, a sub-set of the connections is displayed, proportional to the user input. In another embodiment, a particular item is displayed in an emphasized manner, and the particular connections representing dependencies including the node represented by the particular item, as well as the items representing nodes also in these dependencies, are also displayed in the emphasized manner. Furthermore, in one embodiment, only an indicated sub-set of the items is displayed.

    Systems and methods for new time series model probabilistic ARMA
    7.
    Granted invention patent
    Legal status: In force

    Publication No.: US07580813B2

    Publication date: 2009-08-25

    Application No.: US10463145

    Filing date: 2003-06-17

    IPC classification: G06F17/50, G05B23/02

    CPC classification: G06F17/18

    Abstract: The present invention utilizes a cross-prediction scheme to predict values of discrete and continuous time observation data, wherein conditional variance of each continuous time tube variable is fixed to a small positive value. By allowing cross-predictions in an ARMA based model, values of continuous and discrete observations in a time series are accurately predicted. The present invention accomplishes this by extending an ARMA model such that a first time series “tube” is utilized to facilitate or “cross-predict” values in a second time series tube to form an “ARMAxp” model. In general, in the ARMAxp model, the distribution of each continuous variable is a decision graph having splits only on discrete variables and having linear regressions with continuous regressors at all leaves, and the distribution of each discrete variable is a decision graph having splits only on discrete variables and having additional distributions at all leaves.

    SOCIAL REWARDS FOR ONLINE GAME PLAYING
    9.
    Invention patent application
    Legal status: In force

    Publication No.: US20080153595A1

    Publication date: 2008-06-26

    Application No.: US11614588

    Filing date: 2006-12-21

    IPC classification: A63F9/24

    Abstract: Useful information is acquired from a community of individuals by way of a game that rewards participants with social information about other participants. Points can be awarded to participants simply for participation and/or as a function of game performance. Such points can subsequently be exchanged to reveal information about game partners or other community members. Among other things, such a reward system can motivate individuals to perform tasks that might not otherwise be compelling and/or enjoyable.

    USER INTERACTION-BIASED ADVERTISING
    10.
    Invention patent application
    Legal status: Under examination (published)

    Publication No.: US20080114639A1

    Publication date: 2008-05-15

    Application No.: US11559992

    Filing date: 2006-11-15

    IPC classification: G06Q30/00, G06F17/40

    Abstract: On-line and/or off-line advertisement interactions are tracked for individual users. This information can then be utilized to adjust display parameters for an advertisement. Tracking can be accomplished via a client-side tracking mechanism and/or a server side tracking mechanism. The advertisement interactions allow advertisers to adjust their advertising campaigns to better target their advertisements. The tracked interactions can include, but are not limited to selections (clicking, etc.) and/or conversions (purchases) and the like. Some instances include a display component that can employ the user-specific interaction information to automatically adjust, for example, location, frequency, and/or to whom an advertisement is displayed. The interaction information can also be utilized for revenue generation by charging advertisers for the information and/or for adjusting their advertising campaigns and the like. Instances can be utilized with on-line and/or off-line advertising media.
