METHOD FOR CLASSIFICATION OF OBJECTS IN A GRAPH DATA STREAM
    1.
    Type: Patent application
    Status: Active

    Publication number: US20120054129A1

    Publication date: 2012-03-01

    Application number: US12871168

    Filing date: 2010-08-30

    Applicant: Charu Aggarwal

    Inventor: Charu Aggarwal

    CPC classification number: G06N99/005

    Abstract: A method for classifying objects in a graph data stream, including receiving a training stream of graph data, the training stream including a plurality of objects along with class labels that are associated with each of the objects, first determining discriminating sets of edges in the training stream for the class labels, wherein a discriminating set of edges is one that is indicative of the object that contains these edges having a given class label, receiving an incoming data stream of the graph data, wherein class labels have not yet been assigned to objects in the incoming data stream, second determining, based on the discriminating sets of edges, class labels that are associated with the objects in the incoming data stream; and outputting to an information repository object class label pairs based on the second determining.
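
The two determining steps in this abstract can be illustrated with a small sketch. It is not the patented procedure: the abstract does not say how discriminating sets are found, so this example assumes each single edge is a candidate set and treats an edge as discriminating when it appears in at least `min_support` training objects and at least `min_confidence` of those share one label. The function names, thresholds, and toy data are all hypothetical.

```python
from collections import Counter, defaultdict

def find_discriminating_edges(training_stream, min_support=2, min_confidence=0.8):
    """Map each discriminating edge to the class label it indicates.

    Assumption: single edges stand in for the abstract's "sets of edges".
    """
    edge_total = Counter()
    edge_class = defaultdict(Counter)
    for edges, label in training_stream:
        for edge in set(edges):
            edge_total[edge] += 1
            edge_class[edge][label] += 1
    rules = {}
    for edge, total in edge_total.items():
        label, count = edge_class[edge].most_common(1)[0]
        if total >= min_support and count / total >= min_confidence:
            rules[edge] = label
    return rules

def classify(edges, rules, default=None):
    """Label an unlabeled object by majority vote of its discriminating edges."""
    votes = Counter(rules[e] for e in set(edges) if e in rules)
    return votes.most_common(1)[0][0] if votes else default

# Training stream: (edges of the object, class label)
training = [
    ([("a", "b"), ("b", "c")], "spam"),
    ([("a", "b"), ("c", "d")], "spam"),
    ([("x", "y"), ("c", "d")], "ham"),
    ([("x", "y"), ("y", "z")], "ham"),
]
rules = find_discriminating_edges(training)

# Incoming stream: (object id, edges), no labels yet
incoming = [("obj1", [("a", "b"), ("q", "r")]), ("obj2", [("x", "y")])]
pairs = [(obj_id, classify(edges, rules)) for obj_id, edges in incoming]
print(pairs)  # [('obj1', 'spam'), ('obj2', 'ham')]
```

The resulting `pairs` list corresponds to the object/class-label pairs that the abstract describes writing to an information repository.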

    GRAPHICAL MODELS FOR REPRESENTING TEXT DOCUMENTS FOR COMPUTER ANALYSIS
    2.
    Type: Patent application
    Status: Active

    Publication number: US20110302168A1

    Publication date: 2011-12-08

    Application number: US12796266

    Filing date: 2010-06-08

    Applicant: Charu Aggarwal

    Inventor: Charu Aggarwal

    CPC classification number: G06F17/30619

    Abstract: In a method for representing a text document with a graphical model, a document including a plurality of ordered words is received and a graph data structure for the document is created. The graph data structure includes a plurality of nodes and edges, with each node representing a distinct word in the document and each edge identifying a number of times two nodes occur within a predetermined distance from each other. The graph data structure is stored in an information repository.
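
A minimal sketch of the graph structure this abstract describes: one node per distinct word, and one edge per word pair carrying the number of times the two words occur within a fixed distance of each other. The function name `build_word_graph`, the default distance, the choice of undirected edges, and skipping self-pairs are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter

def build_word_graph(words, max_distance=2):
    """Return (nodes, edges) for an ordered list of words.

    nodes: set of distinct words.
    edges: Counter mapping a sorted word pair to the number of times the
           two words occur within max_distance positions of each other.
    """
    nodes = set(words)
    edges = Counter()
    for i, word in enumerate(words):
        last = min(i + max_distance, len(words) - 1)
        for j in range(i + 1, last + 1):
            other = words[j]
            if other != word:  # assumption: no self-loops
                edges[tuple(sorted((word, other)))] += 1
    return nodes, edges

doc = "the cat sat on the mat".split()
nodes, edges = build_word_graph(doc, max_distance=2)
print(edges[("sat", "the")])  # 2: "sat" is within two positions of both "the"s
```

Storing `nodes` and `edges` in a database or file would correspond to the final step of writing the graph to an information repository.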

    System and method for distributed privacy preserving data mining
    3.
    Type: Patent application
    Status: Active

    Publication number: US20060015474A1

    Publication date: 2006-01-19

    Application number: US10892691

    Filing date: 2004-07-16

    Abstract: Distributed privacy preserving data mining techniques are provided. A first entity of a plurality of entities in a distributed computing environment exchanges summary information with a second entity of the plurality of entities via a privacy-preserving data sharing protocol such that the privacy of the summary information is preserved, the summary information associated with an entity relating to data stored at the entity. The first entity may then mine data based on at least the summary information obtained from the second entity via the privacy-preserving data sharing protocol. The first entity may obtain, from the second entity via the privacy-preserving data sharing protocol, information relating to the number of transactions in which a particular itemset occurs and/or information relating to the number of transactions in which a particular rule is satisfied.

    Method and apparatus for privacy preserving data mining by restricting attribute choice
    6.
    Type: Patent application
    Status: Active

    Publication number: US20070233711A1

    Publication date: 2007-10-04

    Application number: US11397297

    Filing date: 2006-04-04

    Abstract: Improved techniques for privacy preserving data mining of multidimensional data records are disclosed. For example, a technique for generating at least one output data set from at least one input data set for use in association with a data mining process comprises the following steps/operations. At least one relevant attribute of the at least one input data set is selected through determination of at least one relevance coefficient. The at least one output data set is generated from the at least one input data set, wherein the at least one output data set comprises the at least one relevant attribute of the at least one input data set, as determined by use of the at least one relevance coefficient.

    System and method of flexible data reduction for arbitrary applications
    7.
    Type: Patent application
    Status: Lapsed

    Publication number: US20060026175A1

    Publication date: 2006-02-02

    Application number: US10901278

    Filing date: 2004-07-28

    Applicant: Charu Aggarwal

    Inventor: Charu Aggarwal

    Abstract: The present invention is directed to the use of an evolutionary algorithm to locate optimal solution subspaces. The evolutionary algorithm uses a point-based coding of the subspace determination problem and searches selectively over the space of possible coded solutions. Each feasible solution to the problem, or individual in the population of feasible solutions, is coded as a string, which facilitates use of the evolutionary algorithm to determine the optimal solution to the fitness function. The fitness of each string is determined by solving the objective function for that string. The resulting fitness value can then be converted to a rank, and all of the members of the population of solutions can be evaluated using selection, crossover, and mutation processes that are applied sequentially and iteratively to the individuals in the population of solutions. The population of solutions is updated as the individuals in the population evolve and converge, that is, become increasingly genetically similar to one another. The iterations of selection, crossover, and mutation are performed until a desired level of convergence among the individuals in the population of solutions has been achieved.
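
The loop of ranking, selection, crossover, and mutation described in this abstract can be sketched generically as follows. This is a textbook-style genetic algorithm, not the patent's method: the bit-string coding (one bit per candidate dimension) replaces the point-based coding the abstract mentions, and the rank-weighted selection, single-point crossover, elitism, convergence measure, and all parameter names are assumptions chosen for illustration.

```python
import random

def evolve(fitness, length, pop_size=20, mutation_rate=0.05,
           convergence=0.95, max_iters=200, seed=0):
    """Evolve bit strings of the given length and return the fittest one."""
    assert pop_size % 2 == 0 and length >= 2
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(max_iters):
        # Convert fitness values to ranks (best individual first).
        ranked = sorted(pop, key=fitness, reverse=True)
        # Assumed convergence measure: share of individuals equal to the best.
        if sum(ind == ranked[0] for ind in ranked) / pop_size >= convergence:
            break
        # Selection: better-ranked strings are more likely to become parents.
        weights = [pop_size - rank for rank in range(pop_size)]
        parents = rng.choices(ranked, weights=weights, k=pop_size)
        # Crossover: swap tails of consecutive parents at a random cut point.
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            cut = rng.randint(1, length - 1)
            children.append(a[:cut] + b[cut:])
            children.append(b[:cut] + a[cut:])
        # Mutation: flip each bit with a small probability.
        for child in children:
            for i in range(length):
                if rng.random() < mutation_rate:
                    child[i] = 1 - child[i]
        children[0] = ranked[0][:]  # elitism: never lose the current best
        pop = children
    return max(pop, key=fitness)

# Toy objective: prefer strings with many 1-bits (stand-in for a real
# subspace-quality function, which the abstract does not define).
best = evolve(sum, length=8)
print(best, sum(best))
```

In a real subspace search the `fitness` argument would score how good the data reduction is for the dimensions selected by the string; the fixed `seed` only makes the sketch reproducible.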

    Methods and apparatus for privacy preserving data mining using statistical condensing approach
    9.
    Type: Patent application
    Status: Active

    Publication number: US20050049991A1

    Publication date: 2005-03-03

    Application number: US10641935

    Filing date: 2003-08-14

    Abstract: Methods and apparatus for generating at least one output data set from at least one input data set for use in association with a data mining process are provided. First, data statistics are constructed from the at least one input data set. Then, an output data set is generated from the data statistics. The output data set differs from the input data set but maintains one or more correlations from within the input data set. The correlations may be the inherent correlations between different dimensions of a multidimensional input data set. A significant amount of information from the input data set may be hidden so that the privacy level of the data mining process may be increased.

    Event mining in social networks
    10.
    Type: Granted patent
    Status: Active

    Publication number: US08914371B2

    Publication date: 2014-12-16

    Application number: US13324513

    Filing date: 2011-12-13

    CPC classification number: G06F17/30516 G06F17/3071 H04L51/32

    Abstract: A method and system for detecting an event from a social stream. The method includes the steps of: receiving a social stream from a social network, where the social stream includes at least one object and the object includes a text, sender information of the text, and recipient information of the text; assigning said object to a cluster based on a similarity value between the object and the clusters; monitoring changes in at least one of the clusters; and triggering an alarm when the changes in at least one of the clusters exceed a first threshold value, where at least one of the steps is carried out using a computer device.
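
The receive, assign, monitor, and alarm steps in this abstract can be sketched as below. The abstract does not define the similarity value or the change measure, so this example assumes a blend of Jaccard similarity over words and over participants, opens a new cluster when no existing one is similar enough, and raises an alarm when a cluster gains more than `threshold` additional objects compared with the previous window. The class, function, and parameter names and the toy stream are hypothetical.

```python
from collections import Counter

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

class StreamClusterer:
    def __init__(self, min_similarity=0.2, text_weight=0.5):
        self.min_similarity = min_similarity
        self.text_weight = text_weight
        self.clusters = []  # each cluster: {"words": set, "users": set}

    def assign(self, text, sender, recipients):
        """Assign one social-stream object to a cluster and return its index."""
        words = set(text.lower().split())
        users = {sender, *recipients}
        best, best_sim = None, 0.0
        for idx, cluster in enumerate(self.clusters):
            sim = (self.text_weight * jaccard(words, cluster["words"])
                   + (1 - self.text_weight) * jaccard(users, cluster["users"]))
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is None or best_sim < self.min_similarity:
            self.clusters.append({"words": set(), "users": set()})
            best = len(self.clusters) - 1
        self.clusters[best]["words"] |= words
        self.clusters[best]["users"] |= users
        return best

def detect_events(stream, window=4, threshold=2):
    """Return (position, cluster index) alarms for fast-growing clusters."""
    clusterer, alarms = StreamClusterer(), []
    previous, current = Counter(), Counter()
    for position, (text, sender, recipients) in enumerate(stream, start=1):
        current[clusterer.assign(text, sender, recipients)] += 1
        if position % window == 0:  # end of a monitoring window
            for cid, count in current.items():
                if count - previous[cid] > threshold:
                    alarms.append((position, cid))
            previous, current = current, Counter()
    return alarms

stream = [
    ("lunch plans today", "ann", ["bob"]),
    ("football match tonight", "carl", ["dave"]),
    ("new phone review", "erin", ["fay"]),
    ("holiday photos uploaded", "gus", ["hal"]),
    ("earthquake downtown now", "ivy", ["jon"]),
    ("earthquake downtown shaking", "jon", ["ivy"]),
    ("big earthquake downtown", "kim", ["ivy"]),
    ("earthquake damage downtown", "jon", ["kim"]),
]
alarms = detect_events(stream)
print(alarms)  # [(8, 4)]: the earthquake cluster grows by 4 in one window
```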
