Visualization of high-dimensional data
    1.
    Granted invention patent (in force)

    Publication No.: US06519599B1

    Publication date: 2003-02-11

    Application No.: US09517138

    Filing date: 2000-03-02

    IPC classification: G06F17/30

    Abstract: Visualization of high-dimensional data sets is disclosed, particularly the display of a network model for a data set. The network, such as a dependency or a Bayesian network, has a number of nodes having dependencies thereamong. The network can be displayed as items and connections, corresponding to nodes and dependencies, respectively. Selection of a particular item in one embodiment results in the display of the local distribution associated with the node for the item. In one embodiment, only a predetermined number of the items are shown, such as only the items representing the most popular nodes. Furthermore, in one embodiment, in response to receiving a user input, a sub-set of the connections is displayed, proportional to the user input. In another embodiment, a particular item is displayed in an emphasized manner, and the particular connections representing dependencies including the node represented by the particular item, as well as the items representing nodes also in these dependencies, are also displayed in the emphasized manner. Furthermore, in one embodiment, only an indicated sub-set of the items is displayed.
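The "sub-set of the connections ... proportional to the user input" can be pictured with a small sketch. It assumes the user input is a slider value between 0 and 1 and that connections are ranked by a dependency strength; the abstract states neither, and the function name and edge tuples are hypothetical.

```python
def visible_connections(edges, slider):
    """Return the strongest share of connections for a slider value in [0, 1].

    Each edge is (node_a, node_b, strength). Ranking by strength is an
    assumption; the abstract only says the subset is proportional to the input.
    """
    if not 0.0 <= slider <= 1.0:
        raise ValueError("slider must be between 0 and 1")
    ranked = sorted(edges, key=lambda edge: edge[2], reverse=True)
    count = int(slider * len(ranked))  # rounds down to a whole number of edges
    return ranked[:count]


edges = [
    ("age", "income", 0.9),
    ("income", "zip", 0.4),
    ("age", "zip", 0.1),
    ("zip", "car", 0.7),
]
half = visible_connections(edges, 0.5)
```

With the slider at 0.5, the two strongest of the four connections remain visible; at 0 none are drawn and at 1 all are.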

    Architecture for automated data analysis
    3.
    Granted invention patent (in force)

    Publication No.: US06330563B1

    Publication date: 2001-12-11

    Application No.: US09298717

    Filing date: 1999-04-23

    IPC classification: G06F17/30

    Abstract: An architecture for automated data analysis. In one embodiment, a computerized system comprising an automated problem formulation layer, a first learning engine, and a second learning engine. The automated problem formulation layer receives a data set. The data set has a plurality of records, where each record has a value for each of a plurality of raw transactional variables. The layer abstracts the raw transactional variables into cooked transactional variables. The first learning engine generates a model for the cooked transactional variables, while the second learning engine generates a model for the raw transactional variables.

    Goal-oriented clustering
    4.
    Granted invention patent (in force)

    Publication No.: US06694301B1

    Publication date: 2004-02-17

    Application No.: US09540255

    Filing date: 2000-03-31

    IPC classification: G06N5/02

    Abstract: Clustering for purposes of data visualization and making predictions is disclosed. Embodiments of the invention are operable on a number of variables that have a predetermined representation. The variables include input-only variables, output-only variables, and both input-and-output variables. Embodiments of the invention generate a model that has a bottleneck architecture. The model includes a top layer of nodes of at least the input-only variables, one or more middle layer of hidden nodes, and a bottom layer of nodes of the output-only and the input-and-output variables. At least one cluster is determined from this model. The model can be a probabilistic neural network and/or a Bayesian network.

    Cluster-based and rule-based approach for automated web-based targeted advertising with quotas
    5.
    Granted invention patent (in force)

    Publication No.: US07472102B1

    Publication date: 2008-12-30

    Application No.: US09430767

    Filing date: 1999-10-29

    IPC classification: G06N5/00

    CPC classification: G06Q30/02, G06Q10/087

    Abstract: Targeted delivery of items with inventory management using a cluster-based approach or a rule-based approach is disclosed. An example of items is advertisements. Each item is allocated to one or more clusters. The allocation is made based on a predetermined criterion accounting for at least a quota for each item and possibly a constraint for each cluster. The former can refer to the number of times an item must be shown. The latter can refer to the number of times a given group of web pages is likely to be visited by users, and hence is the number of times items can be shown in a given cluster. The invention is not limited to any particular definition of what constitutes a cluster or item.

    Transmission of information during ad click-through
    6.
    Granted invention patent (in force)

    Publication No.: US07058592B1

    Publication date: 2006-06-06

    Application No.: US09450262

    Filing date: 1999-11-29

    IPC classification: G06F17/60

    Abstract: The transmission of information during ad click-through is disclosed. In one embodiment, a computer-implemented method selects an ad to be displayed on a web page, as one of a plurality of ads within a current cluster in which each of the ads has a probability to be selected. The method displays the ad on the web page, and then detects activation—for example, click-through—of the displayed ad. The method transmits information to an entity associated with the ad, such as an advertiser, upon detecting click-through or other activation of the ad. In one embodiment, the information transmitted includes information regarding the current cluster.
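As a rough illustration of the two steps in this abstract, the sketch below draws one ad from the current cluster according to per-ad selection probabilities, then builds the link followed on click-through. Passing the cluster identifier as a URL query parameter is an assumption made here for illustration; the abstract does not say how the information is transmitted. The function names, parameter names, and the advertiser URL are hypothetical.

```python
import random
from urllib.parse import urlencode


def select_ad(cluster_ads, rng):
    """Pick one ad id from the cluster, weighted by its selection probability."""
    ad_ids = list(cluster_ads)
    weights = [cluster_ads[ad_id] for ad_id in ad_ids]
    return rng.choices(ad_ids, weights=weights, k=1)[0]


def click_through_url(advertiser_url, ad_id, cluster_id):
    """Append the ad and cluster identifiers to the advertiser's landing URL.

    Carrying the cluster in the query string is an illustrative assumption.
    """
    query = urlencode({"ad": ad_id, "cluster": cluster_id})
    return f"{advertiser_url}?{query}"


rng = random.Random(0)  # seeded only to make the sketch repeatable
cluster_ads = {"ad-a": 0.7, "ad-b": 0.2, "ad-c": 0.1}
chosen = select_ad(cluster_ads, rng)
url = click_through_url("https://advertiser.example/landing", chosen, "cluster-3")
```

The advertiser receiving such a request could read the `cluster` parameter to learn which cluster the visitor was in when the ad was shown.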

    Decision theoretic approach to targeted solicitation by maximizing expected profit increases
    7.
    Granted invention patent (in force)

    Publication No.: US08103537B2

    Publication date: 2012-01-24

    Application No.: US11257473

    Filing date: 2005-10-24

    IPC classification: G06Q99/00

    Abstract: A decision theoretic approach to targeted solicitation, by maximizing expected profit increases, is disclosed. A decision theoretic model is used to identify a sub-population of a population to solicit, where the model is constructed to maximize an expected increase in profits. A decision tree in particular can be used as the model. The decision tree has paths from a root node to a number of leaf nodes. The decision tree has a split on a solicitation variable in every path from the root node to each leaf node. The solicitation variable has two values, a first value corresponding to a solicitation having been made, and a second value corresponding to a solicitation not having been made.
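Because every path in such a tree splits on the solicitation variable, each sub-population has one estimate for the solicited case and one for the unsolicited case, and the two can be compared. The sketch below is one possible reading, not the patent's method: it assumes a fixed revenue per purchase and a fixed cost per solicitation, and the `Segment` type and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A sub-population with purchase probabilities under both solicitation values."""
    name: str
    p_buy_solicited: float
    p_buy_not_solicited: float


def expected_lift(segment, revenue, cost):
    """Expected profit increase per person from soliciting this segment."""
    gain = (segment.p_buy_solicited - segment.p_buy_not_solicited) * revenue
    return gain - cost


def segments_to_solicit(segments, revenue, cost):
    """Keep only the segments where soliciting is expected to raise profit."""
    return [s.name for s in segments if expected_lift(s, revenue, cost) > 0]


segments = [
    Segment("loyal", 0.30, 0.29),        # buys anyway, little to gain
    Segment("persuadable", 0.10, 0.02),  # soliciting changes behaviour
    Segment("averse", 0.05, 0.08),       # soliciting lowers purchases
]
targets = segments_to_solicit(segments, revenue=100.0, cost=2.0)
```

Under these made-up numbers only the "persuadable" segment is worth soliciting, which illustrates why the comparison is on the increase in profit and not on the raw purchase probability.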

    Noise reduction for a cluster-based approach for targeted item delivery with inventory management
    8.
    Granted invention patent (in force)

    Publication No.: US06665653B1

    Publication date: 2003-12-16

    Application No.: US09565583

    Filing date: 2000-05-04

    IPC classification: G06N5/04

    CPC classification: G06Q30/02

    Abstract: Reduction of noise within a cluster-based approach for item (such as ad) allocation, such as by using a linear program, is described. In one embodiment, probabilities are discretized into a predetermined number of groups, where the mean for the group that a particular probability has been discretized into is substituted for the particular probability when the items are being allocated. In another embodiment, the probabilities are decreased by a power function of the variances for them. In a third embodiment, allocation of items to clusters is not changed unless the sample sizes used to determine the corresponding probabilities for those ads is greater than a threshold. In a fourth embodiment, after allocation is performed a first time, a predetermined number of items are removed, and reallocation is performed.
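The first embodiment, discretizing probabilities into groups and substituting each group's mean, might look like the following sketch. Equal-width groups over [0, 1] are an assumption; the abstract does not say how the groups are formed, and the function name is hypothetical.

```python
def smooth_by_group_mean(probabilities, num_groups):
    """Replace each probability with the mean of the group it falls into.

    Groups are equal-width intervals over [0, 1] (an illustrative choice).
    """
    def group_of(p):
        # A probability of exactly 1.0 is placed in the last group.
        return min(int(p * num_groups), num_groups - 1)

    members = {}
    for p in probabilities:
        members.setdefault(group_of(p), []).append(p)
    means = {g: sum(values) / len(values) for g, values in members.items()}
    return [means[group_of(p)] for p in probabilities]


raw = [0.02, 0.04, 0.30, 0.34, 0.90]
smoothed = smooth_by_group_mean(raw, num_groups=4)
```

Here 0.02 and 0.04 share a group and both become their mean, so small differences between noisy estimates no longer influence the allocation.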

    Fast extraction of one-way and two-way counts from sparse data
    9.
    Granted invention patent (in force)

    Publication No.: US06360224B1

    Publication date: 2002-03-19

    Application No.: US09298723

    Filing date: 1999-04-23

    IPC classification: G06F17/30

    Abstract: Two-way counts utilizing sparse representation of a data set. In one embodiment, a computer-implemented method such that a data set is first input. The data set has a plurality of records. Each record has at least one attribute, where each attribute has a default value. The method stores a sparse representation of each record, such that the value of an attribute of the record is stored only if it varies from the default value. A data model is then generated, utilizing the sparse representation. Generation of the data model includes initially extracting two-way counts from the sparse representation. Finally, the model is output.
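A minimal sketch of the idea, assuming records are dictionaries and counts are kept in `collections.Counter` objects (neither detail comes from the abstract): only non-default values are stored, one-way and two-way counts are taken over those stored values, and any count involving a default value is recovered by subtraction.

```python
from collections import Counter
from itertools import combinations


def to_sparse(record, defaults):
    """Keep only the attributes whose value differs from the default."""
    return {attr: value for attr, value in record.items() if value != defaults[attr]}


def count_sparse(sparse_records):
    """Count single (attr, value) items and co-occurring pairs of them."""
    one_way = Counter()
    two_way = Counter()
    for record in sparse_records:
        items = sorted(record.items())  # fixed order so each pair has one key
        one_way.update(items)
        two_way.update(combinations(items, 2))
    return one_way, two_way


defaults = {"milk": 0, "bread": 0, "eggs": 0}
records = [
    {"milk": 1, "bread": 1, "eggs": 0},
    {"milk": 1, "bread": 0, "eggs": 0},
    {"milk": 0, "bread": 0, "eggs": 0},
]
sparse = [to_sparse(r, defaults) for r in records]
one_way, two_way = count_sparse(sparse)

# Counts for a default value follow by subtraction from the record total.
milk_default_count = len(records) - one_way[("milk", 1)]
```

The work done per record is proportional to the number of non-default values it holds, which is the point of the sparse representation when most attributes sit at their default.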

    Fast clustering with sparse data
    10.
    Granted invention patent (in force)

    Publication No.: US06556958B1

    Publication date: 2003-04-29

    Application No.: US09298600

    Filing date: 1999-04-23

    IPC classification: G06F7/60

    Abstract: Efficient data modeling utilizing sparse representation of a data set. In one embodiment, a computer-implemented method such that a data set is first input. The data set has a plurality of records. Each record has at least one attribute, where each attribute has a default value. The method stores a sparse representation of each record, such that the value of each attribute of the record is stored only if the value of the attribute varies from the default value. A data model is then generated, utilizing the sparse representation, and the model is output. The generation of the data model in one embodiment is in accordance with the Expectation Maximization (EM) algorithm.
