-
Publication No.: US08719197B2
Publication date: 2014-05-06
Application No.: US13090216
Filing date: 2011-04-19
Abstract: Systems, methods and computer program products for classifying documents are presented. Systems, methods and computer program products for analyzing documents, e.g., documents associated with legal discovery, are also presented. Systems, methods and computer program products for cleaning up data are also presented. Systems, methods and computer program products for verifying an association of an invoice with an entity are also presented. Systems, methods and computer program products for managing medical records are presented. Systems, methods and computer program products for face recognition are presented.
-
Publication No.: US07958067B2
Publication date: 2011-06-07
Application No.: US11752673
Filing date: 2007-05-23
IPC classification: G06F15/18
Abstract: Methods for classifying documents are presented. Methods for analyzing documents associated with legal discovery are also presented. Methods for cleaning up data are also presented. Methods for verifying an association of an invoice with an entity are also presented. Methods for managing medical records are presented. Methods for face recognition are presented.
-
Publication No.: US08693043B2
Publication date: 2014-04-08
Application No.: US10742131
Filing date: 2003-12-19
Applicants: Mauritius A. R. Schmidtler, Scott S. Texeira, Christopher K. Harris, Sameer Samat, Roland Borrey, Anthony Macciola
Inventors: Mauritius A. R. Schmidtler, Scott S. Texeira, Christopher K. Harris, Sameer Samat, Roland Borrey, Anthony Macciola
IPC classification: G06K15/00
CPC classification: G06F17/21, G06K9/00442, H04N1/32112, H04N2201/3225, H04N2201/3243
Abstract: A method and system for delineating document boundaries and identifying document types by analyzing digital images of one or more documents, automatically categorizing one or more pages or subdocuments within the one or more documents, and automatically generating delineation identifiers, such as computer-generated images of separation pages inserted between digital images belonging to different categories, a description of the categorization sequence of the digital images, or a computer-generated electronic label affixed to or associated with said digital images.
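The delineation step described in the abstract of US08693043B2 can be illustrated with a minimal sketch. It assumes each page has already been assigned a category by some upstream classifier, which is not shown here. The function name `delineate`, the `SEPARATOR` marker, and the page identifiers are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# Hypothetical stand-in for a computer-generated separation page.
SEPARATOR = "<separator>"


def delineate(pages: List[Tuple[str, str]]) -> List[str]:
    """Insert a separator wherever the page category changes.

    `pages` holds (page_id, category) pairs in scan order. The categories
    are assumed to come from an upstream page classifier.
    """
    result: List[str] = []
    previous_category = None
    for page_id, category in pages:
        # A change of category marks a boundary between two documents.
        if previous_category is not None and category != previous_category:
            result.append(SEPARATOR)
        result.append(page_id)
        previous_category = category
    return result


batch = [("p1", "invoice"), ("p2", "invoice"), ("p3", "letter")]
print(delineate(batch))
```

This sketch only detects boundaries between pages of different categories, matching the separation-page example in the abstract. It would not split two consecutive documents of the same category.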
-
Publication No.: US08239335B2
Publication date: 2012-08-07
Application No.: US13033536
Filing date: 2011-02-23
CPC classification: G06F17/30707, G06N99/005
Abstract: A system and article of manufacture that enable adapting to a shift in document content, according to one embodiment of the present invention, include instructions for: receiving at least one labeled seed document; receiving unlabeled documents; receiving at least one predetermined cost factor; training a transductive classifier using the at least one predetermined cost factor, the at least one seed document, and the unlabeled documents; classifying the unlabeled documents having a confidence level above a predefined threshold into a plurality of categories using the classifier; reclassifying at least some of the categorized documents into the categories using the classifier; and outputting identifiers of the categorized documents to at least one of a user, another system, and another process. Systems and articles of manufacture for separating documents are also presented. Systems and articles of manufacture for document searching are also presented.
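The sequence of steps in the abstract of US08239335B2 (seed documents, unlabeled documents, a cost factor, a confidence threshold, and reclassification) can be sketched roughly as follows. This is not the patent's transductive classifier: a simple self-training loop over word counts stands in for it, and the cost factor is interpreted here, as an assumption, as the weight given to pseudo-labeled documents relative to seed documents. All function names, parameter defaults, and sample documents are hypothetical.

```python
from collections import Counter, defaultdict


def train(seeds, pseudo_labeled, cost_factor):
    """Build per-category word weights.

    Seed documents count with weight 1.0; pseudo-labeled documents count
    with weight `cost_factor` (an assumed reading of the cost factor).
    """
    weights = defaultdict(Counter)
    for text, category in seeds:
        for word in text.split():
            weights[category][word] += 1.0
    for text, category in pseudo_labeled:
        for word in text.split():
            weights[category][word] += cost_factor
    return weights


def classify(weights, text):
    """Return (best_category, confidence), with confidence in [0, 1]."""
    scores = {cat: sum(counts[w] for w in text.split())
              for cat, counts in weights.items()}
    total = sum(scores.values())
    if total == 0:
        return None, 0.0  # no known words: leave the document unclassified
    best = max(scores, key=scores.get)
    return best, scores[best] / total


def adapt(seeds, unlabeled, cost_factor=0.5, threshold=0.7, rounds=3):
    """Self-training loop: retrain, then reclassify all unlabeled documents."""
    assigned = {}
    for _ in range(rounds):
        pseudo = [(unlabeled[doc_id], cat) for doc_id, cat in assigned.items()]
        weights = train(seeds, pseudo, cost_factor)
        assigned = {}
        for doc_id, text in unlabeled.items():
            category, confidence = classify(weights, text)
            if category is not None and confidence > threshold:
                assigned[doc_id] = category
    return assigned  # document identifier -> category


seeds = [("invoice total amount due", "invoice"), ("dear sir regards", "letter")]
unlabeled = {
    "d1": "invoice amount payable",
    "d2": "payable balance",
    "d3": "dear madam regards",
}
print(adapt(seeds, unlabeled))
```

In this toy run, document `d2` shares no words with the seeds and is only categorized after `d1` has been pseudo-labeled, which is a small-scale picture of adapting to a shift in content.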
-
Publication No.: US07937345B2
Publication date: 2011-05-03
Application No.: US11752719
Filing date: 2007-05-23
IPC classification: G06F15/18
CPC classification: G06F17/30707, G06N99/005
Abstract: A method for adapting to a shift in document content according to one embodiment of the present invention includes receiving at least one labeled seed document; receiving unlabeled documents; receiving at least one predetermined cost factor; training a transductive classifier using the at least one predetermined cost factor, the at least one seed document, and the unlabeled documents; classifying the unlabeled documents having a confidence level above a predefined threshold into a plurality of categories using the classifier; reclassifying at least some of the categorized documents into the categories using the classifier; and outputting identifiers of the categorized documents to at least one of a user, another system, and another process. Methods for separating documents are also presented. Methods for document searching are also presented.
-
Publication No.: US20080086432A1
Publication date: 2008-04-10
Application No.: US11752691
Filing date: 2007-05-23
IPC classification: G06F15/18
CPC classification: G06N20/00, G06F16/353
Abstract: Methods for analyzing prior art are presented. One method includes training a classifier based on a search query; accessing a plurality of prior art documents; performing a document classification technique on at least some of the prior art documents using the classifier; and outputting identifiers of at least some of the prior art documents based on the classification thereof. Methods for adapting a patent classification to a shift in document content are also presented. Methods for matching documents to claims are presented. Methods for classifying a patent or patent application are also presented.
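The prior-art method in the abstract of US20080086432A1 (train on a search query, classify the prior art documents, output identifiers) can be sketched in a few lines. As a loudly stated assumption, a single bag-of-words prototype built from the query, compared by cosine similarity, stands in for the trained classifier. The function names, the threshold, and the document identifiers are hypothetical.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Lowercased bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_prior_art(query: str, documents: dict, threshold: float = 0.3) -> list:
    """Return identifiers of documents classified as relevant to the query."""
    prototype = vectorize(query)  # the "classifier" trained from the query
    return [doc_id for doc_id, text in documents.items()
            if cosine(prototype, vectorize(text)) >= threshold]


docs = {
    "DOC-A": "a transductive classifier for document sorting",
    "DOC-B": "bicycle gear shifting mechanism",
}
print(find_prior_art("transductive document classifier", docs))
```

Only identifiers are returned, mirroring the abstract's final step of outputting identifiers rather than the documents themselves.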
-
Publication No.: US20110196870A1
Publication date: 2011-08-11
Application No.: US13090216
Filing date: 2011-04-19
Abstract: Systems, methods and computer program products for classifying documents are presented. Systems, methods and computer program products for analyzing documents, e.g., documents associated with legal discovery, are also presented. Systems, methods and computer program products for cleaning up data are also presented. Systems, methods and computer program products for verifying an association of an invoice with an entity are also presented. Systems, methods and computer program products for managing medical records are presented. Systems, methods and computer program products for face recognition are presented.
-
Publication No.: US20080082352A1
Publication date: 2008-04-03
Application No.: US11752673
Filing date: 2007-05-23
IPC classification: G06Q10/00
Abstract: Methods for classifying documents are presented. Methods for analyzing documents associated with legal discovery are also presented. Methods for cleaning up data are also presented. Methods for verifying an association of an invoice with an entity are also presented. Methods for managing medical records are presented. Methods for face recognition are presented.
-
Publication No.: US20080086433A1
Publication date: 2008-04-10
Application No.: US11752719
Filing date: 2007-05-23
CPC classification: G06F17/30707, G06N99/005
Abstract: A method for adapting to a shift in document content according to one embodiment of the present invention includes receiving at least one labeled seed document; receiving unlabeled documents; receiving at least one predetermined cost factor; training a transductive classifier using the at least one predetermined cost factor, the at least one seed document, and the unlabeled documents; classifying the unlabeled documents having a confidence level above a predefined threshold into a plurality of categories using the classifier; reclassifying at least some of the categorized documents into the categories using the classifier; and outputting identifiers of the categorized documents to at least one of a user, another system, and another process. Methods for separating documents are also presented. Methods for document searching are also presented.
-
Publication No.: US20050134935A1
Publication date: 2005-06-23
Application No.: US10742131
Filing date: 2003-12-19
Applicants: Mauritius Schmidtler, Scott Texeira, Christopher Harris, Sameer Samat, Roland Borrey, Anthony Macciola
Inventors: Mauritius Schmidtler, Scott Texeira, Christopher Harris, Sameer Samat, Roland Borrey, Anthony Macciola
CPC classification: G06F17/21, G06K9/00442, H04N1/32112, H04N2201/3225, H04N2201/3243
Abstract: A method and system for delineating document boundaries and identifying document types by analyzing digital images of one or more documents, automatically categorizing one or more pages or subdocuments within the one or more documents, and automatically generating delineation identifiers, such as computer-generated images of separation pages inserted between digital images belonging to different categories, a description of the categorization sequence of the digital images, or a computer-generated electronic label affixed to or associated with said digital images.