SYSTEMS AND METHODS FOR ENABLING MANUAL CLASSIFICATION OF UNRECOGNIZED DOCUMENTS TO COMPLETE WORKFLOW FOR ELECTRONIC JOBS AND TO ASSIST MACHINE LEARNING OF A RECOGNITION SYSTEM USING AUTOMATICALLY EXTRACTED FEATURES OF UNRECOGNIZED DOCUMENTS
    21.
    Invention application (under examination, published)

    Publication number: US20090116755A1

    Publication date: 2009-05-07

    Application number: US12266454

    Filing date: 2008-11-06

    CPC classification number: G06K9/00442 G06K9/6885

    Abstract: A method in a document analysis system automatically extracts image and text features from each received electronic document and compares the extracted features with feature sets associated with each category of document to determine whether the document is recognizable as belonging to a document category. If an electronic document is recognized as belonging to one of the document categories, the method classifies the electronic document as belonging to that document category. If, however, an electronic document is unrecognized, the method submits the unrecognized document to a learning phase, in which the unrecognized document is presented to a human trainer for manual classification of the unrecognized electronic document into a document category, and automatically modifies at least one of the features and the weights of the feature set of the document category corresponding to the manually-classified electronic document using the automatically extracted features of the manually-classified document.
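The recognize-or-learn loop in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Category` class, the string-valued features, the normalized weighted-overlap score, the `0.6` threshold, and the additive weight update are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """A document category with a weighted feature set (hypothetical structure)."""
    name: str
    weights: dict[str, float] = field(default_factory=dict)

    def score(self, features: set[str]) -> float:
        # Fraction of the category's total weight that the document's features cover.
        total = sum(self.weights.values())
        if total == 0:
            return 0.0
        return sum(w for f, w in self.weights.items() if f in features) / total


def classify(features, categories, threshold=0.6):
    """Return the best-matching category name, or None if the document is unrecognized."""
    best = max(categories, key=lambda c: c.score(features), default=None)
    if best is not None and best.score(features) >= threshold:
        return best.name
    return None


def learn(features, category, step=1.0):
    """After a human trainer picks `category`, fold the extracted features into it.

    New features are added to the feature set and existing ones gain weight.
    """
    for f in features:
        category.weights[f] = category.weights.get(f, 0.0) + step
```

A document that `classify` returns `None` for would be shown to the trainer, and the trainer's choice would then be passed to `learn` together with the features that were already extracted automatically.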

    Method and system for secure data entry
    22.
    Granted invention patent (in force)

    Publication number: US08270720B1

    Publication date: 2012-09-18

    Application number: US11708201

    Filing date: 2007-02-20

    CPC classification number: G06F21/6254

    Abstract: The present invention includes a method of secure data entry that enables complex data entry work to be performed by unskilled workers that results in data entry with higher productivity, higher quality and higher security than data entry performed by highly skilled workers. The invention identifies data fields on an electronic image of an identified input page, sequences identified data field images, and individually displays data field images for manual data entry. The invention also provides for extracting data from a data field image and displaying extracted data along with the corresponding data field image for approval or correction. Sequenced data field images are optionally reordered or randomized for display and manual entry.
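The sequencing and randomizing of data field images might look like the sketch below. It assumes each field image can be referred to by a `(page_id, field_name)` key; the function names, the key shape, and the use of a seeded shuffle are illustrative choices and are not taken from the patent.

```python
import random


def build_queue(pages, seed=None):
    """Flatten the fields of all pages into one work queue and shuffle it.

    `pages` maps a page id to the list of field names found on that page.
    A worker who receives items from the shuffled queue sees isolated fields
    rather than a whole page.
    """
    queue = [(page_id, name) for page_id, names in pages.items() for name in names]
    random.Random(seed).shuffle(queue)
    return queue


def reassemble(entries):
    """Rebuild per-page records from keyed entries, whatever order they arrived in."""
    result = {}
    for (page_id, name), value in entries.items():
        result.setdefault(page_id, {})[name] = value
    return result
```

Because every entry carries its key, the display order can be changed freely without affecting how the typed values are put back together.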

    SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA BY NARROWING DATA SEARCH SCOPE USING CONTOUR MATCHING
    23.
    Invention application (under examination, published)

    Publication number: US20110255794A1

    Publication date: 2011-10-20

    Application number: US13007466

    Filing date: 2011-01-14

    CPC classification number: G06K9/00442 G06K9/48 G06K9/72 G06K2209/01

    Abstract: A method of extracting data by narrowing a scope of data search using contour matching of select elements in a document is provided. The method includes: analyzing each document to automatically extract images and text features wherein said analyzing compares extracted features with a first search space of candidate features to try and recognize the extracted features; automatically processing each unrecognized feature using a contour recognition engine to generate a contour of the unrecognized feature; automatically selecting a second search space of candidate features through contour matching using the contour of the unrecognized feature, wherein the second search space of candidate features is narrower than the first search space of candidate features; and comparing the unrecognized feature with said second search space to identify the previously unrecognized feature.
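The two-stage search in this abstract can be illustrated with a toy example. The sketch below substitutes a simple word-shape signature (ascender, descender, x-height) for the patent's contour recognition engine, which the abstract does not specify, and uses `difflib.get_close_matches` from the Python standard library for the final comparison. The letter classes and the `0.6` cutoff are assumptions.

```python
import difflib

ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def contour(word):
    """Coarse shape signature: 'A' for tall glyphs, 'D' for descenders, 'x' otherwise."""
    def shape(ch):
        if ch.isupper() or ch.isdigit() or ch in ASCENDERS:
            return "A"
        if ch in DESCENDERS:
            return "D"
        return "x"
    return "".join(shape(ch) for ch in word)


def recognize(feature, first_space):
    """Try the full search space first, then a contour-narrowed second space."""
    if feature in first_space:
        return feature
    # Second search space: only candidates whose contour matches the unrecognized feature.
    target = contour(feature)
    second_space = [c for c in first_space if contour(c) == target]
    matches = difflib.get_close_matches(feature, second_space, n=1, cutoff=0.6)
    return matches[0] if matches else None
```

With a vocabulary such as `["total", "today", "table", "taxes", "invoice", "amount"]`, a misread token like `"tatal"` shares its contour only with `"total"`, so the fuzzy comparison runs against a single candidate instead of the whole list.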

    SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA FROM ELECTRONIC DOCUMENTS USING EXTERNAL DATA
    24.
    Invention application (under examination, published)

    Publication number: US20110255788A1

    Publication date: 2011-10-20

    Application number: US13007422

    Filing date: 2011-01-14

    CPC classification number: G06K9/00442 G06K9/48 G06K9/72 G06K2209/01

    Abstract: In a document analysis system that receives and processes jobs from a plurality of users, in which each job may contain multiple electronic documents, to extract data from the electronic documents, a method of automatically extracting data from each received electronic document at least in part using data external to the electronic document but associated with the job containing the document is provided. The method includes: analyzing each electronic document in a job to automatically extract images and text features; and, if any of the images and text features extracted from the electronic document is not recognized, using data external to said document but associated with said job to identify the unrecognized feature, wherein the external source may be one of at least one other document in the job and a database having known values associated with the job.

    SYSTEMS AND METHODS FOR TRAINING A DOCUMENT CLASSIFICATION SYSTEM USING DOCUMENTS FROM A PLURALITY OF USERS
    25.
    Invention application (under examination, published)

    Publication number: US20090116756A1

    Publication date: 2009-05-07

    Application number: US12266469

    Filing date: 2008-11-06

    CPC classification number: G06K9/00442 G06K9/6885

    Abstract: A method of training a document analysis system that automatically extracts image and text features from each received electronic document and compares the extracted features with feature sets associated with each document category is provided. If an electronic document is recognized as belonging to one of the document categories with predetermined confidence, the method classifies the electronic document as being of that one document category. If an electronic document is not recognized as belonging to one of the document categories with predetermined confidence, however, the method submits the unrecognized document to a training phase in which the document is recognized as belonging to a document category and automatically modifies at least one of the features and the weights of the features of the feature set for the document category for the now-recognized document.

    SYSTEMS AND METHODS FOR PARALLEL PROCESSING OF DOCUMENT RECOGNITION AND CLASSIFICATION USING EXTRACTED IMAGE AND TEXT FEATURES
    26.
    Invention application (under examination, published)

    Publication number: US20090116746A1

    Publication date: 2009-05-07

    Application number: US12266468

    Filing date: 2008-11-06

    CPC classification number: G06K9/00442 G06K9/6885

    Abstract: A method of parallel processing jobs received from a plurality of users by a document analysis system that automatically classifies documents to organize each job, automatically separates each job into its constituent electronic documents and automatically separates each document into subsets of electronic pages. For each page of each subset, the method automatically extracts image features that are indicative of how the document is laid out or textually organized. For each subset, the method automatically compares the extracted features with feature sets associated with each document category to determine a comparison score for the subset. The method then classifies the electronic document as being one of the categories of documents using the comparison score for each of the subsets and organizes the job according to the categories of documents the job contains.
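The subset-level scoring described here could be parallelized roughly as shown below. The sketch assumes each page has already been reduced to a set of layout feature labels, that every category's feature set is non-empty, and that per-subset scores are combined by summation; none of these details come from the abstract. A thread pool from `concurrent.futures` stands in for whatever parallel infrastructure the system actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def split_pages(pages, size):
    """Split a document's pages into consecutive subsets of at most `size` pages."""
    return [pages[i:i + size] for i in range(0, len(pages), size)]


def score_subset(subset, feature_sets):
    """Score one subset against every category as the fraction of category features found."""
    found = set().union(*subset)
    return {cat: len(found & feats) / len(feats) for cat, feats in feature_sets.items()}


def classify_document(pages, feature_sets, subset_size=2):
    """Score the subsets in parallel, then pick the category with the highest total."""
    subsets = split_pages(pages, subset_size)
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda s: score_subset(s, feature_sets), subsets))
    totals = {cat: sum(s[cat] for s in scores) for cat in feature_sets}
    return max(totals, key=totals.get)
```

Each subset is scored independently, so the only shared step is the final aggregation of per-subset scores.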

    SYSTEMS AND METHODS TO AUTOMATICALLY CLASSIFY ELECTRONIC DOCUMENTS USING EXTRACTED IMAGE AND TEXT FEATURES AND USING A MACHINE LEARNING SUBSYSTEM
    27.
    Invention application (under examination, published)

    Publication number: US20090116736A1

    Publication date: 2009-05-07

    Application number: US12266462

    Filing date: 2008-11-06

    CPC classification number: G06K9/00442 G06K9/6885

    Abstract: A document analysis system that automatically classifies documents by recognizing distinctive features in each document comprises a document acquisition system, a document recognition training system, a document classification system, a document recognition system, and a job organization system. The document acquisition system receives jobs, wherein each job contains at least one electronic document. The document feature recognition system automatically extracts image and text features from each received document. The document classification system automatically classifies recognized electronic documents by finding the best match between the extracted features of each document and the feature sets associated with each category of document. The document recognition training system automatically trains the feature set for each corresponding category of documents, wherein the training system, using extracted features of unrecognized documents, automatically modifies the feature set for a document category. The job organization system automatically organizes each job according to the document categories it contains.

    Content collection
    28.
    Granted invention patent (in force)

    Publication number: US07356589B2

    Publication date: 2008-04-08

    Application number: US11197756

    Filing date: 2005-08-04

    Abstract: In a web service system with one or more web servers, a system and method for distributing content directly from each web server to a single computer transfers files generated on web servers to a central location for access by a system operator. If files generated by multiple web servers are aggregated on a single computer, processing and analysis can be performed on all of the files. Generally, in one aspect, the invention relates to a system and method for transmitting content from one computer to another in a web service system. The web service system includes web servers that provide web pages in response to web page requests. First and second web server agents provide an interface between the web service system and first and second computers, respectively. The first web server agent runs on the first computer and identifies at least a portion of a file for transmission to the second web server agent running on the second computer in the web service system. At least a portion of the file from the first web server agent is transmitted to the second web server agent and then stored by the second web server agent.

    Content collection
    29.
    Invention application

    Publication number: US20060026262A1

    Publication date: 2006-02-02

    Application number: US11197756

    Filing date: 2005-08-04

    Abstract: In a web service system with one or more web servers, a system and method for distributing content directly from each web server to a single computer transfers files generated on web servers to a central location for access by a system operator. If files generated by multiple web servers are aggregated on a single computer, processing and analysis can be performed on all of the files. Generally, in one aspect, the invention relates to a system and method for transmitting content from one computer to another in a web service system. The web service system includes web servers that provide web pages in response to web page requests. First and second web server agents provide an interface between the web service system and first and second computers, respectively. The first web server agent runs on the first computer and identifies at least a portion of a file for transmission to the second web server agent running on the second computer in the web service system. At least a portion of the file from the first web server agent is transmitted to the second web server agent and then stored by the second web server agent.
