Method and apparatus for generating test data sets in accordance with user feedback
    31.
    Granted invention patent
    Status: Expired

    Publication number: US07085981B2

    Publication date: 2006-08-01

    Application number: US10457333

    Filing date: 2003-06-09

    CPC classification number: G01R31/318307 Y10S707/99943

    Abstract: Techniques for processing data sets and, more particularly, constructing a synthetic data set (test data set) from real data sets (input data sets) in accordance with user feedback. The technique mimics real data sets effectively to generate the corresponding synthetic ones. Multiple real data sets may be used to create a test data set which combines the characteristics of these multiple data sets. Users of the technique have the ability to modify the characteristics of the data sets to create a new data set which has features that a user may desire. For example, a user may change the shape or size of, or distort the different patterns in the data to create a new data set. A user may also choose to inject noise into the system.

    Methods and apparatus for user-centered similarity learning
    32.
    Granted invention patent
    Status: Expired

    Publication number: US06970884B2

    Publication date: 2005-11-29

    Application number: US09929202

    Filing date: 2001-08-14

    Abstract: Techniques are provided for incorporating human or user interaction in accordance with the design and/or performance of data mining applications such as similarity determination and classification. Such user-centered techniques permit the mining of interesting characteristics of data in a data or feature space. For example, such interesting characteristics that may be determined in accordance with the user-centered mining techniques of the invention may include a determination of similarity among different data objects, as well as the determination of individual class labels. These techniques allow effective data mining applications to be performed in accordance with high dimensional data.

    System and method for mining unstructured data sets
    33.
    Granted invention patent
    Status: Expired

    Publication number: US06847955B2

    Publication date: 2005-01-25

    Application number: US09829798

    Filing date: 2001-04-10

    CPC classification number: G06K9/6232 Y10S707/99932

    Abstract: A method for mining incomplete data sets that avoids the process of having to extrapolate the attributes, and instead concentrates on the use of conceptual representations in order to mine the data sets. The idea in using conceptual representations is that even though many attributes may be missing, it is possible to accurately guess the behavior of the data along certain pre-specified directions, i.e., the conceptual directions of the data set.

    System and method for classification using time sequences
    34.
    Granted invention patent
    Status: In force

    Publication number: US06721719B1

    Publication date: 2004-04-13

    Application number: US09361381

    Filing date: 1999-07-26

    CPC classification number: G06N5/025

    Abstract: System and method for generating classification using time sequences comprises inputting a set of time-dependent feature variable graphs along with a set of time-dependent category variable graphs; finding frequent shapes in the time-dependent feature variable graphs; utilizing the frequent shapes to generate combinations of frequent shapes; generating rules relating one or more patterns of combinations of frequent shapes to a category variable; and performing a categorization utilizing the rules generated.

    Method for optimizing profits in electronic delivery of digital objects
    35.
    Granted invention patent
    Status: Expired

    Publication number: US06631413B1

    Publication date: 2003-10-07

    Application number: US09239008

    Filing date: 1999-01-28

    CPC classification number: G06Q10/08

    Abstract: In accordance with the present invention, a method for selecting a channel and delivery time for digital objects for a broadcast delivery service including multiple channels of varying bandwidths includes the steps of selecting digital objects to be sent over the multiple channels, generating a schedule and pricing for the digital objects based on the digital object selected and existing delivery commitments and manipulating the schedule and pricing to provide a profitable delivery of the digital objects. A system is also included.

    Depth first method for generating itemsets
    37.
    Granted invention patent
    Status: Expired

    Publication number: US06389416B1

    Publication date: 2002-05-14

    Application number: US09253243

    Filing date: 1999-02-19

    Abstract: A system and method for generating itemset associations in a memory storage system comprising many transactions, with each transaction including one or more items capable of forming the itemset associations. The method involves generating a lexicographic tree structure having nodes representing itemset associations meeting a minimum support criterion. In a recursive manner, for each lexicographic least itemset (node) P of the lexicographic tree structure, candidate extensions of the node P are first determined. Then, the support of each of the candidate extensions is counted to determine frequent extension itemsets of that node P, while those itemsets not meeting a predetermined support criterion are eliminated. Child nodes corresponding to the frequent extensions and meeting the predetermined support criterion are created. For each frequent child of node P, all itemset associations for all descendants of node P are generated first. Thus, the lexicographic tree structure is generated in a depth-first manner. By projecting transactions upon the lexicographic tree structure in a depth-first manner, the CPU time for counting large itemsets is substantially reduced.
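
The recursion described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `frequent_itemsets`, the use of plain Python sets for transactions, and an absolute count as the support threshold are all assumptions made for illustration. Each call corresponds to one node P of the lexicographic tree; it counts candidate extensions only against the transactions projected onto P, then fully explores each frequent child before moving to the next sibling.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple


def frequent_itemsets(
    transactions: Iterable[Iterable[str]], min_support: int
) -> Dict[Tuple[str, ...], int]:
    """Enumerate frequent itemsets depth-first over a lexicographic tree.

    Hypothetical helper: min_support is an absolute transaction count.
    """
    results: Dict[Tuple[str, ...], int] = {}

    def extend(
        prefix: Tuple[str, ...],
        candidates: List[str],
        projected: List[FrozenSet[str]],
    ) -> None:
        # Count the support of every candidate extension of node `prefix`,
        # looking only at the transactions projected onto that node.
        counts = {item: 0 for item in candidates}
        for transaction in projected:
            for item in candidates:
                if item in transaction:
                    counts[item] += 1

        # Keep only the extensions that meet the support threshold.
        frequent = [item for item in candidates if counts[item] >= min_support]

        for index, item in enumerate(frequent):
            child = prefix + (item,)
            results[child] = counts[item]
            # Project: only transactions containing `item` can support
            # any descendant of `child`.
            child_projected = [t for t in projected if item in t]
            # Depth-first: all descendants of `child` are generated before
            # the next sibling. Only lexicographically later frequent items
            # remain candidates.
            extend(child, frequent[index + 1:], child_projected)

    baskets = [frozenset(t) for t in transactions]
    items = sorted({item for basket in baskets for item in basket})
    extend((), items, baskets)
    return results


if __name__ == "__main__":
    demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in frequent_itemsets(demo, min_support=3).items():
        print(itemset, support)
```

Because each recursive call receives a shrinking list of projected transactions, support counting deeper in the tree touches fewer records, which is the effect the abstract attributes to projection.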

    Maximum factor selection policy for batching VOD requests
    38.
    Granted invention patent
    Status: Expired

    Publication number: US5631694A

    Publication date: 1997-05-20

    Application number: US595313

    Filing date: 1996-02-01

    CPC classification number: H04N7/17336

    Abstract: A VOD scheduler maintains a queue of pending performance for each video. Using the notion of queue selection factor, a batching policy is devised that schedules the video with the highest selection factor. Selection factors are obtained by applying discriminatory weighting factors to the adjusted queue lengths associated with each video where the weight decreases as the popularity of the respective video increases and the queue length is adjusted to take defection into account.
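
A small Python sketch can make the selection rule concrete. The abstract does not give the weighting function or the defection adjustment, so both are assumptions here: the weight is taken as the inverse square root of a popularity value, and the adjusted queue length is the pending count scaled by the fraction of viewers expected to keep waiting. The names `VideoQueue`, `selection_factor`, and `pick_next` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class VideoQueue:
    title: str
    pending: int           # requests currently waiting for this video
    popularity: float      # assumed: relative request frequency, must be > 0
    defection_rate: float  # assumed: fraction of waiting viewers expected to leave


def selection_factor(queue: VideoQueue) -> float:
    """Weight times adjusted queue length (both formulas are assumptions)."""
    # Adjust the queue length for viewers expected to defect before service.
    adjusted_length = queue.pending * (1.0 - queue.defection_rate)
    # The weight shrinks as popularity grows, so a popular video needs a
    # proportionally longer queue to win the next stream.
    weight = 1.0 / math.sqrt(queue.popularity)
    return weight * adjusted_length


def pick_next(queues: Sequence[VideoQueue]) -> VideoQueue:
    """Schedule the video whose queue has the highest selection factor."""
    return max(queues, key=selection_factor)


if __name__ == "__main__":
    queues = [
        VideoQueue("hit", pending=10, popularity=0.64, defection_rate=0.0),
        VideoQueue("niche", pending=6, popularity=0.04, defection_rate=0.5),
    ]
    print(pick_next(queues).title)
```

Under these assumed formulas, a less popular video with a shorter queue can still be scheduled ahead of a popular one, which is the discriminatory weighting the abstract describes.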

    System, method and computer program product for classification of social streams

    Publication number: US09679337B2

    Publication date: 2017-06-13

    Application number: US13595732

    Filing date: 2012-08-27

    CPC classification number: G06Q50/01 G06F17/30707 G06N99/005

    Abstract: A system that labels an unlabeled message of a social stream. The system includes a memory device storing instructions to execute a training model, the training model being trained based on labeled messages and partitioned into a plurality of class partitions, each of which comprises statistical information and a class label, and a Central Processing Unit (CPU) that computes a confidence for each of the class partitions based on information of an unlabeled message and the statistical information of a respective class partition, and that labels the unlabeled message according to respective confidences of the class partitions.
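
The labeling step in this abstract can be sketched in Python under stated assumptions. The abstract does not say what the statistical information is or how confidence is computed, so this sketch assumes each class partition stores word counts and uses cosine similarity between those counts and the message's words as the confidence. `ClassPartition`, `confidence`, and `label_message` are hypothetical names, not the patent's terms for any API.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ClassPartition:
    label: str
    word_counts: Counter  # assumed form of the partition's statistical information


def confidence(partition: ClassPartition, message: str) -> float:
    """Cosine similarity between message words and partition statistics (assumed measure)."""
    words = Counter(message.lower().split())
    # Counter returns 0 for words the partition has never seen.
    dot = sum(count * partition.word_counts[word] for word, count in words.items())
    message_norm = math.sqrt(sum(c * c for c in words.values()))
    partition_norm = math.sqrt(sum(c * c for c in partition.word_counts.values()))
    norm = message_norm * partition_norm
    return dot / norm if norm else 0.0


def label_message(partitions: Sequence[ClassPartition], message: str) -> str:
    """Assign the label of the class partition with the highest confidence."""
    best = max(partitions, key=lambda p: confidence(p, message))
    return best.label


if __name__ == "__main__":
    partitions = [
        ClassPartition("sports", Counter({"goal": 3, "match": 2})),
        ClassPartition("tech", Counter({"phone": 4, "app": 1})),
    ]
    print(label_message(partitions, "great goal in the match"))
```

Taking the single highest confidence is only one way to label "according to respective confidences"; a thresholded or weighted vote over partitions would fit the abstract equally well.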

    Mechanisms for privately sharing semi-structured data
    40.
    Granted invention patent
    Status: In force

    Publication number: US09471645B2

    Publication date: 2016-10-18

    Application number: US12568976

    Filing date: 2009-09-29

    CPC classification number: G06F17/30539 G06F17/30598

    Abstract: Mechanisms are provided for anonymizing data comprising a plurality of graph data sets. The mechanisms receive input data comprising a plurality of graph data sets. Each graph data set comprises data for generating a separate graph from graphs associated with other graph data sets. The mechanisms perform clustering on the graph data sets to generate a plurality of clusters. At least one cluster of the plurality of clusters comprises a plurality of graph data sets. Other clusters in the plurality of clusters comprise one or more graph data sets. The mechanisms also determine, for each cluster in the plurality of clusters, aggregate properties of the cluster. Moreover, the mechanisms generate, for each cluster in the plurality of clusters, pseudo-synthetic data representing the cluster, from the determined aggregate properties of the clusters.
