Distributed data clustering system and method
    1.
    Type: Granted invention patent
    Status: Lapsed (no longer in force)

    Publication number: US07039638B2

    Publication date: 2006-05-02

    Application number: US09844730

    Filing date: 2001-04-27

    Abstract: A distributed data clustering system having an integrator and at least two computing units. Each computing unit is loaded with common global parameter values and a particular local data set. Each computing unit then generates local sufficient statistics based on the local data set and global parameter values. The integrator employs the local sufficient statistics of all the computing units to update the global parameter values.
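
    The loop described in this abstract can be sketched as follows. The abstract does not name a clustering algorithm, so this sketch assumes K-means, where the local sufficient statistics are per-cluster coordinate sums and point counts. The function names (local_sufficient_stats, integrate, cluster) are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Stats = Tuple[List[List[float]], List[int]]  # (per-cluster sums, per-cluster counts)


def local_sufficient_stats(data: Sequence[Point], centers: Sequence[Point]) -> Stats:
    """Computing unit: summarize the local data set under the current global centers."""
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for x in data:
        # Assign the point to its nearest center (squared Euclidean distance).
        j = min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += x[d]
    return sums, counts


def integrate(stats: Sequence[Stats], centers: Sequence[Point]) -> List[Point]:
    """Integrator: combine all local statistics into updated global centers."""
    k, dim = len(centers), len(centers[0])
    updated = []
    for j in range(k):
        n = sum(counts[j] for _, counts in stats)
        if n == 0:
            updated.append(tuple(centers[j]))  # keep an empty cluster's center unchanged
        else:
            updated.append(tuple(sum(sums[j][d] for sums, _ in stats) / n for d in range(dim)))
    return updated


def cluster(partitions: Sequence[Sequence[Point]], centers: Sequence[Point],
            iterations: int = 10) -> List[Point]:
    """Alternate local summarization and global integration for a fixed number of rounds."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        stats = [local_sufficient_stats(p, centers) for p in partitions]
        centers = integrate(stats, centers)
    return centers
```

    Only the sums and counts cross unit boundaries in this sketch; the raw data points stay on the unit that holds them.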

    Data de-duplication
    2.
    Type: Invention patent application
    Status: In force

    Publication number: US20050182780A1

    Publication date: 2005-08-18

    Application number: US10780235

    Filing date: 2004-02-17

    CPC classification number: G06F17/30489 G06F17/30303 Y10S707/99942

    Abstract: Generating masks for de-duplication in a database where distributed entities provide activity data for said database. Determining from activity input data which entities add variable data to a given data field. Generating a list of the masks which effectively remove the variable data portion in the field. Consolidating input data using the generated masks.
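
    The final consolidation step can be illustrated with a minimal sketch. It assumes the masks are regular expressions that strip the variable portion of a field, and that each input record is a (field value, count) pair; both are assumptions made for illustration. The abstract's mask-generation step is not shown, and the function name consolidate is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Sequence, Tuple


def consolidate(records: Iterable[Tuple[str, int]], masks: Sequence[str]) -> Counter:
    """Apply every mask to each field value, then merge counts of values that become equal."""
    compiled = [re.compile(m) for m in masks]
    totals: Counter = Counter()
    for value, count in records:
        for pattern in compiled:
            value = pattern.sub("", value)  # remove the variable portion matched by the mask
        totals[value.strip()] += count
    return totals
```

    For example, a mask such as r"\s*#\d+$" would fold "ACME STORE #0012" and "ACME STORE #0457" into the single value "ACME STORE".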

    Data cleaning
    3.
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20050131855A1

    Publication date: 2005-06-16

    Application number: US10733750

    Filing date: 2003-12-11

    CPC classification number: G06F16/2272

    Abstract: A process for rapid data recovery, data cleaning and an automated self-maintenance of the data recovery mechanism is provided. Dirty input data records are used in conjunction with and to build and revise a fast indexing table wherein index keys point to clean data records with which the input data should be rightly associated. Mechanisms for automated revision of the indexing table are described. Said table forms a tool useful in data mining and knowledge discovery to analysis of heuristic processes.
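
    One way to picture the indexing table is a dictionary whose keys are derived from dirty input and whose values are clean records. The sketch below is an assumption-laden illustration: the key function (lowercase, alphanumeric characters only) and the class name CleaningIndex are hypothetical, and the abstract does not say how keys are built or how revisions are triggered.

```python
from typing import Dict, Iterable, Optional


def normalize(text: str) -> str:
    """Hypothetical index key: lowercase, with everything but letters and digits removed."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


class CleaningIndex:
    """Maps keys derived from dirty input to the clean record they should resolve to."""

    def __init__(self, clean_records: Iterable[str]) -> None:
        self.table: Dict[str, str] = {normalize(r): r for r in clean_records}

    def lookup(self, dirty: str) -> Optional[str]:
        """Return the clean record for a dirty input, or None if no key matches."""
        return self.table.get(normalize(dirty))

    def learn(self, dirty: str, clean: str) -> None:
        """Revise the table so this dirty form resolves to the clean record next time."""
        self.table[normalize(dirty)] = clean
```

    Each lookup is a single dictionary access, which is what makes the table fast; learn stands in for the automated revision the abstract mentions.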

    Classifier indexing
    4.
    Type: Granted invention patent
    Status: In force

    Publication number: US09430562B2

    Publication date: 2016-08-30

    Application number: US12242752

    Filing date: 2008-09-30

    CPC classification number: G06F17/30705

    Abstract: Provided are, among other things, systems, methods and techniques for document-based processing. In one implementation, a document is input; features are extracted from it; an index is queried using at least a subset of the extracted features and, in response, identifications for selected document classifiers are received from a larger pool of document classifiers; the document is processed using individual ones of the selected document classifiers, thereby generating corresponding classifier outputs; and then, based on such classifier outputs, (1) the document is categorized within a computer database and/or (2) feedback information is provided to a user.
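
    The flow in this abstract (extract features, query an index, run only the classifiers it returns) can be sketched as below. The inverted index from trigger features to classifier names, the whitespace tokenizer, and the class name ClassifierIndex are all assumptions for illustration; the abstract does not specify how the index is built.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Set

Classifier = Callable[[Set[str]], bool]


def extract_features(text: str) -> Set[str]:
    """Hypothetical feature extractor: the set of lowercase whitespace-separated tokens."""
    return set(text.lower().split())


class ClassifierIndex:
    """Inverted index from features to the classifiers worth running when they occur."""

    def __init__(self) -> None:
        self.index: Dict[str, Set[str]] = defaultdict(set)
        self.classifiers: Dict[str, Classifier] = {}

    def register(self, name: str, trigger_features: Iterable[str], classifier: Classifier) -> None:
        self.classifiers[name] = classifier
        for feature in trigger_features:
            self.index[feature].add(name)

    def select(self, features: Set[str]) -> Set[str]:
        """Query the index; return names of classifiers triggered by any feature."""
        selected: Set[str] = set()
        for feature in features:
            selected |= self.index.get(feature, set())
        return selected

    def classify(self, text: str) -> Dict[str, bool]:
        """Run only the selected classifiers and collect their outputs."""
        features = extract_features(text)
        return {name: self.classifiers[name](features) for name in sorted(self.select(features))}
```

    With a large pool of classifiers, a document then costs one index query plus a few classifier calls rather than one call per classifier in the pool.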

    Document classification
    6.
    Type: Granted invention patent
    Status: In force

    Publication number: US08856123B1

    Publication date: 2014-10-07

    Application number: US11780803

    Filing date: 2007-07-20

    Applicant: George Forman

    Inventor: George Forman

    CPC classification number: G06F17/30613

    Abstract: Provided are, among other things, systems, methods and techniques for classifying a collection of documents. A term is identified based on an indication of ability of the term's presence within a given document to predict whether the given document should be classified into an identified category. A document index is then queried using the identified term and, in response, search results that define a candidate set of documents are received. Finally, a classifier is applied to documents within the candidate set to determine which of the documents should be classified into the identified category.
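
    The last two steps of this abstract (query a document index with an already identified term, then classify only the candidates) might look like the sketch below. It assumes the predictive term has been chosen beforehand and that the index is a plain term-to-document-id mapping; the function names build_index and classify_collection are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build a simple inverted index: lowercase term -> ids of documents containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            index[term].add(doc_id)
    return index


def classify_collection(docs: Dict[str, str], index: Dict[str, Set[str]], term: str,
                        classifier: Callable[[str], bool]) -> List[str]:
    """Query the index for the term, then apply the classifier to the candidate set only."""
    candidates = index.get(term, set())
    return sorted(doc_id for doc_id in candidates if classifier(docs[doc_id]))
```

    The classifier never sees documents outside the candidate set, which is where the savings over classifying the whole collection would come from.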

    Method and system for developing a classification tool
    7.
    Type: Granted invention patent
    Status: In force

    Publication number: US08311957B2

    Publication date: 2012-11-13

    Application number: US12618181

    Filing date: 2009-11-13

    CPC classification number: G06N5/02

    Abstract: An exemplary embodiment of the present invention provides a computer implemented method of developing a classifier. The method includes obtaining a set of training data comprising labeled cases. The method also includes training a classifier based, at least in part, on the training data. The method also includes applying the classifier to a plurality of unlabeled cases to generate classification scores for each of the unlabeled cases, wherein each classification score corresponds with an instance of a corresponding case. Furthermore, the classification score corresponding to a first instance in a case is computed based, at least in part, on a value of a case-centric feature corresponding to the first instance, wherein the value of the case-centric feature is based, at least in part, on characteristics of the first instance and a second instance in the case.

    Systems and methods for collaborative filtering using collaborative inductive transfer
    8.
    Type: Granted invention patent
    Status: In force

    Publication number: US08180715B2

    Publication date: 2012-05-15

    Application number: US12332930

    Filing date: 2008-12-11

    Applicant: George Forman

    Inventor: George Forman

    CPC classification number: G06N99/005

    Abstract: A database includes a list of members of a first group, a list of members of a second group, and ratings for at least some of the members of the second group. The database is accessed. The ratings are attributed to the members of the first group. A machine learning training set is built for a particular member of the first group. The training set includes class labels corresponding to the particular member's ratings for the members of the second group, and features that include supplied and predicted ratings from at least a subset of processed members of the first group. A predictor for the particular member of the first group is trained based on the machine learning training set. The predictor corresponding to the particular member is used to generate predicted ratings for one or more members of the second group the particular member has not rated.

    Classifier Indexing
    9.
    Type: Invention patent application
    Status: In force

    Publication number: US20100082642A1

    Publication date: 2010-04-01

    Application number: US12242752

    Filing date: 2008-09-30

    CPC classification number: G06F17/30705

    Abstract: Provided are, among other things, systems, methods and techniques for document-based processing. In one implementation, a document is input; features are extracted from it; an index is queried using at least a subset of the extracted features and, in response, identifications for selected document classifiers are received from a larger pool of document classifiers; the document is processed using individual ones of the selected document classifiers, thereby generating corresponding classifier outputs; and then, based on such classifier outputs, (1) the document is categorized within a computer database and/or (2) feedback information is provided to a user.

    Method for assessing electronic devices
    10.
    Type: Granted invention patent
    Status: In force

    Publication number: US07596431B1

    Publication date: 2009-09-29

    Application number: US11590525

    Filing date: 2006-10-31

    CPC classification number: G06F11/3058 G06F11/3034 H05K7/20836

    Abstract: In a method for assessing a plurality of electronic devices, cooling efficiencies for the plurality of electronic devices are calculated, where the cooling efficiencies comprise measures of energy usage requirements to respectively maintain the plurality of electronic devices within predetermined temperature ranges. In addition, the plurality of electronic devices are ranked according to their cooling efficiencies and the plurality of electronic devices are stored according to their rankings.
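
    The ranking step can be sketched as a sort. The sketch assumes each device carries a single measured figure, the cooling power needed to hold it within its temperature range, and that a lower figure means a higher cooling efficiency; the field name cooling_power_w and the function name are hypothetical, and the abstract does not define the efficiency measure.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Device:
    name: str
    cooling_power_w: float  # assumed measure: watts of cooling needed to stay in range


def rank_by_cooling_efficiency(devices: Iterable[Device]) -> List[Device]:
    """Return devices ordered from most to least efficiently cooled (lowest power first)."""
    return sorted(devices, key=lambda d: d.cooling_power_w)
```

    The returned list is the stored ranking in this sketch; sorted is stable, so devices with equal figures keep their input order.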
