SYSTEMS AND METHODS FOR ORGANIZING DATA SETS

    Publication No.: US20170329838A1

    Publication date: 2017-11-16

    Application No.: US15666409

    Application date: 2017-08-01

    Applicant: Kofax, Inc.

    CPC classification number: G06F17/30598 G06F17/30312 G06F17/3053 G06N99/005

    Abstract: According to one embodiment, a computer-implemented method for cleaning up a data set having a possible incorrect label includes: selecting a plurality of training documents; estimating a quality of an organization of a plurality of categories; and determining whether the quality of the organization is greater than a predetermined quality threshold. Corresponding system and computer program product embodiments are also presented. Other aspects and advantages of the present invention will become apparent from the following detailed description, which, when taken in conjunction with the drawings, illustrate by way of example the principles of the invention.
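    The abstract above does not say how the quality of an organization is estimated. As a minimal sketch only, assuming a leave-one-out nearest-centroid check over bag-of-words vectors, the estimate-then-threshold step could look like the following. The function names, the quality measure, and the 0.9 threshold are all hypothetical and are not taken from the patent.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def estimate_quality(docs):
    """Assumed quality measure: the fraction of documents whose label
    agrees with the nearest category centroid built from all OTHER
    documents (leave-one-out)."""
    vecs = [(vectorize(text), label) for text, label in docs]
    agree = 0
    for i, (vec, label) in enumerate(vecs):
        centroids = {}
        for j, (other, other_label) in enumerate(vecs):
            if j != i:
                centroids.setdefault(other_label, Counter()).update(other)
        predicted = max(centroids, key=lambda c: cosine(vec, centroids[c]))
        agree += predicted == label
    return agree / len(vecs)


def is_clean_enough(docs, threshold=0.9):
    """Compare the estimate against a predetermined threshold (0.9 is arbitrary)."""
    return estimate_quality(docs) > threshold


clean = [
    ("invoice total amount due", "invoice"),
    ("invoice amount payment due", "invoice"),
    ("invoice total payment", "invoice"),
    ("patient diagnosis treatment", "medical"),
    ("patient treatment record", "medical"),
    ("patient diagnosis record", "medical"),
]
# One document carries a label that is probably wrong.
noisy = clean + [("patient record diagnosis treatment", "invoice")]
```

    In this toy data the mislabeled document disagrees with its nearest centroid, so the estimate for `noisy` falls below the threshold while `clean` passes.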

    SYSTEMS AND METHODS FOR ORGANIZING DATA SETS
    2.
    Patent application
    Status: Active

    Publication No.: US20150269245A1

    Publication date: 2015-09-24

    Application No.: US14733742

    Application date: 2015-06-08

    Applicant: Kofax, Inc.

    CPC classification number: G06F17/30598 G06F17/30312 G06F17/3053 G06N99/005

    Abstract: A method is provided for organizing data sets. In use, an automatic decision system is created or updated for determining whether data elements fit a predefined organization or not, where the decision system is based on a set of preorganized data elements. A plurality of data elements is organized using the decision system. At least one organized data element is selected for output to a user based on a score or confidence from the decision system for the at least one organized data element. Additionally, at least a portion of the at least one organized data element is output to the user. A response is received from the user comprising at least one of a confirmation, modification, and a negation of the organization of the at least one organized data element. The automatic decision system is recreated or updated based on the user response. Other embodiments are also presented.
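    The loop described in this abstract (build a decision system, organize elements, pick one by score or confidence, ask the user, update) can be sketched roughly as follows. This is an illustration under assumptions, not the patented method: the token-count scorer, the choice to show the least confident element, and the names `DecisionSystem`, `review_round` and `ask_user` are all hypothetical.

```python
from collections import Counter, defaultdict


class DecisionSystem:
    """Toy decision system built from pre-organized (text, label) pairs."""

    def __init__(self, labeled):
        self.labeled = list(labeled)
        self._fit()

    def _fit(self):
        # One token-count profile per category.
        self.profiles = defaultdict(Counter)
        for text, label in self.labeled:
            self.profiles[label].update(text.lower().split())

    def score(self, text):
        """Return (best_label, confidence), with confidence in (0, 1]."""
        tokens = text.lower().split()
        raw = {label: sum(profile[t] for t in tokens) + 1  # +1 smoothing
               for label, profile in self.profiles.items()}
        best = max(raw, key=raw.get)
        return best, raw[best] / sum(raw.values())

    def update(self, text, label):
        """Recreate the system with one more labeled element."""
        self.labeled.append((text, label))
        self._fit()


def review_round(system, unlabeled, ask_user):
    """Organize the elements, show the least confident one (an assumed
    selection rule), and fold the user's response back into the system."""
    organized = [(text, *system.score(text)) for text in unlabeled]
    text, label, _confidence = min(organized, key=lambda row: row[2])
    kind, new_label = ask_user(text, label)
    if kind == "confirm":
        system.update(text, label)
    elif kind == "modify":
        system.update(text, new_label)
    # A negation only says the label is wrong; this sketch leaves the
    # element unlabeled in that case.
    return text, label, kind
```

    Here `ask_user` is expected to return one of `("confirm", None)`, `("modify", new_label)` or `("negate", None)`, matching the three response types the abstract lists.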

    Systems and methods for organizing data sets
    3.
    Granted patent
    Status: Active

    Publication No.: US09378268B2

    Publication date: 2016-06-28

    Application No.: US14733742

    Application date: 2015-06-08

    Applicant: Kofax, Inc.

    CPC classification number: G06F17/30598 G06F17/30312 G06F17/3053 G06N99/005

    Abstract: A method is provided for organizing data sets. In use, an automatic decision system is created or updated for determining whether data elements fit a predefined organization or not, where the decision system is based on a set of preorganized data elements. A plurality of data elements is organized using the decision system. At least one organized data element is selected for output to a user based on a score or confidence from the decision system for the at least one organized data element. Additionally, at least a portion of the at least one organized data element is output to the user. A response is received from the user comprising at least one of a confirmation, modification, and a negation of the organization of the at least one organized data element. The automatic decision system is recreated or updated based on the user response. Other embodiments are also presented.

    DATA CLASSIFICATION USING MACHINE LEARNING TECHNIQUES
    4.
    Patent application
    Status: Under examination (published)

    Publication No.: US20140207717A1

    Publication date: 2014-07-24

    Application No.: US14225298

    Application date: 2014-03-25

    Applicant: Kofax, Inc.

    Abstract: Systems, methods and computer program products for classifying documents are presented. Systems, methods and computer program products for analyzing documents, e.g. for verifying an association of an invoice with an entity are also presented. Systems, methods and computer program products for managing medical records are presented. One exemplary system includes a memory; and a processor in communication with the memory, the processor being configured to process at least some instructions stored in the memory. The memory stores computer executable program code comprising instructions for: training a classifier based on an invoice format associated with a first entity; accessing a plurality of invoices labeled as being associated with at least one of the first entity and other entities; and outputting an identifier of at least one of the invoices having a high probability of not being associated with the first entity.
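    As a rough illustration of the invoice check in this abstract, the sketch below learns which fields usually appear on the first entity's invoices and flags invoices whose fields overlap poorly with that format. A hand-rolled field-overlap score stands in for the trained classifier; the record layout (`id`, `fields`), the function names and the 0.5 threshold are assumptions made for the example.

```python
def train_format_model(reference_invoices):
    """Fraction of the first entity's reference invoices containing each field."""
    total = len(reference_invoices)
    counts = {}
    for invoice in reference_invoices:
        for field in set(invoice["fields"]):
            counts[field] = counts.get(field, 0) + 1
    return {field: count / total for field, count in counts.items()}


def mismatch_score(model, invoice):
    """Score in [0, 1]; higher means the invoice looks less like the learned
    format. This is a heuristic, not a calibrated probability."""
    fields = set(invoice["fields"])
    expected = sum(model.values())
    matched = sum(weight for field, weight in model.items() if field in fields)
    unexpected = sum(1 for field in fields if field not in model)
    denominator = expected + unexpected
    return 1.0 - matched / denominator if denominator else 0.0


def flag_suspects(model, invoices, threshold=0.5):
    """Identifiers of invoices that probably do not belong to the first entity."""
    return [inv["id"] for inv in invoices if mismatch_score(model, inv) > threshold]


reference = [
    {"id": "r1", "fields": ["vendor_id", "po_number", "total"]},
    {"id": "r2", "fields": ["vendor_id", "po_number", "total", "tax"]},
]
incoming = [
    {"id": "a", "fields": ["vendor_id", "po_number", "total"]},
    {"id": "b", "fields": ["iban", "amount", "total"]},
]
model = train_format_model(reference)
suspects = flag_suspects(model, incoming)
```

    Invoice `b` shares only one field with the learned format, so it is the one flagged.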

    Systems and methods for organizing data sets

    Publication No.: US10235446B2

    Publication date: 2019-03-19

    Application No.: US15666409

    Application date: 2017-08-01

    Applicant: Kofax, Inc.

    Abstract: According to one embodiment, a computer-implemented method for cleaning up a data set having a possible incorrect label includes: selecting a plurality of training documents; estimating a quality of an organization of a plurality of categories; and determining whether the quality of the organization is greater than a predetermined quality threshold. Corresponding system and computer program product embodiments are also presented. Other aspects and advantages of the present invention will become apparent from the following detailed description, which, when taken in conjunction with the drawings, illustrate by way of example the principles of the invention.

    SYSTEMS AND METHODS FOR ORGANIZING DATA SETS
    6.
    Patent application
    Status: Under examination (published)

    Publication No.: US20130041863A1

    Publication date: 2013-02-14

    Application No.: US13655267

    Application date: 2012-10-18

    Applicant: KOFAX, INC.

    CPC classification number: G06F16/285 G06F16/22 G06F16/24578 G06N20/00

    Abstract: A method is provided for organizing data sets. In use, an automatic decision system is created or updated for determining whether data elements fit a predefined organization or not, where the decision system is based on a set of preorganized data elements. A plurality of data elements is organized using the decision system. At least one organized data element is selected for output to a user based on a score or confidence from the decision system for the at least one organized data element. Additionally, at least a portion of the at least one organized data element is output to the user. A response is received from the user comprising at least one of a confirmation, modification, and a negation of the organization of the at least one organized data element. The automatic decision system is recreated or updated based on the user response. Other embodiments are also presented.

    Systems and methods for organizing data sets

    Publication No.: US09754014B2

    Publication date: 2017-09-05

    Application No.: US15422435

    Application date: 2017-02-01

    Applicant: Kofax, Inc.

    CPC classification number: G06F17/30598 G06F17/30312 G06F17/3053 G06N99/005

    Abstract: According to one embodiment, a computer-implemented method for confirming/rejecting a most relevant example includes: generating a binary decision model by training a binary classifier using a plurality of training documents; classifying one or more test documents into one of a plurality of categories using the binary decision model, wherein the one or more test documents lack a user-defined category label; selecting a most relevant example of the classified test documents from among the classified test documents; displaying, using a display of the computer, the most relevant example of the classified test documents to a user; receiving, via the computer and from the user, a confirmation or a negation of a classification label of the most relevant example of the classified test documents; and storing the confirmation or the negation of the classification label of the most relevant example of the classified test documents to a memory of the computer.
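    The confirm/reject flow in this abstract can be illustrated with a small binary classifier. The sketch below is an assumption-laden example rather than the claimed implementation: it uses a perceptron over word presence, treats the unlabeled document closest to the decision boundary as the "most relevant example" (the abstract does not define relevance), and stores the response in an in-memory list. All names are hypothetical.

```python
def score(weights, bias, text):
    """Linear score of a document over the set of words it contains."""
    return sum(weights.get(t, 0.0) for t in set(text.lower().split())) + bias


def train_perceptron(labeled, epochs=20):
    """Train a binary perceptron on (text, label) pairs with label in {0, 1}."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in labeled:
            predicted = 1 if score(weights, bias, text) > 0 else 0
            error = label - predicted
            if error:
                for token in set(text.lower().split()):
                    weights[token] = weights.get(token, 0.0) + error
                bias += error
    return weights, bias


def review_most_relevant(model, unlabeled, ask_user, memory):
    """Classify the unlabeled documents, pick the one nearest the decision
    boundary (assumed meaning of 'most relevant'), ask the user to confirm
    or negate its label, and store the answer."""
    weights, bias = model
    text = min(unlabeled, key=lambda t: abs(score(weights, bias, t)))
    label = 1 if score(weights, bias, text) > 0 else 0
    response = ask_user(text, label)  # expected: "confirm" or "negate"
    memory.append({"text": text, "label": label, "response": response})
    return memory[-1]


training = [
    ("invoice total due", 1),
    ("invoice payment", 1),
    ("patient diagnosis", 0),
    ("patient treatment record", 0),
]
model = train_perceptron(training)
```

    A call such as `review_most_relevant(model, docs, ask_user, memory)` shows one document per round; repeating it after retraining on the stored responses would close the loop.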

    SYSTEMS AND METHODS FOR ORGANIZING DATA SETS

    Publication No.: US20170140030A1

    Publication date: 2017-05-18

    Application No.: US15422435

    Application date: 2017-02-01

    Applicant: Kofax, Inc.

    CPC classification number: G06F17/30598 G06F17/30312 G06F17/3053 G06N99/005

    Abstract: According to one embodiment, a computer-implemented method for confirming/rejecting a most relevant example includes: generating a binary decision model by training a binary classifier using a plurality of training documents; classifying one or more test documents into one of a plurality of categories using the binary decision model, wherein the one or more test documents lack a user-defined category label; selecting a most relevant example of the classified test documents from among the classified test documents; displaying, using a display of the computer, the most relevant example of the classified test documents to a user; receiving, via the computer and from the user, a confirmation or a negation of a classification label of the most relevant example of the classified test documents; and storing the confirmation or the negation of the classification label of the most relevant example of the classified test documents to a memory of the computer.
