EVOLUTIONARY TAGGER
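    1.
    Invention application
    EVOLUTIONARY TAGGER (under examination, published)

    Publication No.: US20110231384A1

    Publication date: 2011-09-22

    Application No.: US12634627

    Filing date: 2009-12-09

    IPC classification: G06F17/30 G06N3/12 G06F15/18

    CPC classification: G06F16/367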

    Abstract: The invention is a process, system, workflow system for data retrieval processes, software, Web site, service and SaaS (Software as a Service) created to support data retrieval from various document types into custom or preset retrieval data structures. The program supports manual, automatic and semiautomatic data retrieval using its internal features or external add-ons. It links data points in the structure to the corresponding data points in the document, stores documents, structures and the links between them, and outputs results in various formats. Links between a document and a retrieval data structure are established either automatically or manually by the user. After all required links are set, results can be retrieved from the program as an XML (Extensible Markup Language) structure with the required data, or as PDF (Portable Document Format), HTML (HyperText Markup Language), MS Office formats and others, containing the retrieval data structure, the original document, or both with links between corresponding data points.

    The system incorporates a Text Mining engine, which provides automatic information retrieval capabilities. The engine implements text mining technology based on Evolutionary Bayesian Ontology Classification. This technology uses a Bayesian Ontology to model the problem domain and applies Evolutionary Search to find the most plausible classification decision.

    The ability to learn from data is a key feature of Bayesian Ontology and of our embodiment. The semantic and format dependencies between elements of a natural language text are too numerous and too complex to describe analytically. In addition, we intend to save users the trouble of building their own data retrieval models. Instead, we rely on an algorithm that automatically links the user's data selections to the closest categories in pre-built ontologies and generates selection-specific classifiers. Every individual ontology keeps learning from user corrections throughout its life cycle. The system is specifically built to accumulate data models learned from various types of documents. The more documents the system has processed, the better it generalizes when automatically processing new, unseen documents.
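
    The abstract names Evolutionary Search over a Bayesian model but gives no algorithmic detail. The sketch below is only an illustration of the general idea, not the patented method: it swaps in a plain naive Bayes score for the "Bayesian Ontology" and uses a simple elitist mutation loop. The category names, word likelihoods, and all function names are hypothetical.

```python
import math
import random

# Hypothetical per-category word likelihoods P(word | category); toy numbers.
LIKELIHOOD = {
    "Revenue": {"sales": 0.6, "net": 0.3, "income": 0.1},
    "NetIncome": {"sales": 0.1, "net": 0.4, "income": 0.5},
}
PRIOR = {"Revenue": 0.5, "NetIncome": 0.5}
SMOOTHING = 1e-3  # probability used for words a category has never seen


def log_score(words, category):
    """Naive Bayes log-posterior (up to a constant) of one category for one selection."""
    table = LIKELIHOOD[category]
    return math.log(PRIOR[category]) + sum(
        math.log(table.get(word, SMOOTHING)) for word in words
    )


def fitness(selections, assignment):
    """Total plausibility of giving assignment[i] to selections[i]."""
    return sum(log_score(words, cat) for words, cat in zip(selections, assignment))


def evolve(selections, generations=40, population_size=12, seed=0):
    """Elitist evolutionary search for the most plausible category assignment."""
    rng = random.Random(seed)
    categories = sorted(PRIOR)
    population = [
        [rng.choice(categories) for _ in selections] for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=lambda a: fitness(selections, a), reverse=True)
        survivors = population[: population_size // 2]  # best half is kept unchanged
        children = []
        for parent in survivors:
            child = list(parent)
            # Point mutation: re-draw the category of one randomly chosen selection.
            child[rng.randrange(len(child))] = rng.choice(categories)
            children.append(child)
        population = survivors + children
    return max(population, key=lambda a: fitness(selections, a))
```

    Calling `evolve([["net", "sales"], ["net", "income"]])` returns one category per user selection. Because the best half of each generation survives unchanged, the best score never decreases from one generation to the next.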

    Electronic check image storage and retrieval system
    2.
    Granted invention patent
    Electronic check image storage and retrieval system (lapsed)

    Publication No.: US06574377B1

    Publication date: 2003-06-03

    Application No.: US09340245

    Filing date: 1999-07-01

    IPC classification: G06K9/54

    Abstract: A method and apparatus for storing and retrieving images of documents, e.g., checks. The method comprises placing a plurality of documents in a document imaging machine and forming an electronic image of each document, storing each electronic image in an electronic storage device, providing at least one user interface device in communication on a communication link with the electronic storage device, placing a request for at least one document image on the user interface device, transmitting the request by the communication link to the electronic storage device, searching the electronic storage device for the requested electronic image of the document, retrieving the at least one electronic image or providing an indication that the image was not found, storing the electronic image, if found, in an electronic file, for transmission to the user interface device at user option, providing the electronic image to the user interface device at command of a user at the user interface device for storage at the user interface device and displaying the requested electronic image on a display of the user interface device. Preferably, the electronic images are stored with embedded identifying information in a TIFF file format and the check images can be displayed on a display device which permits the user to view both sides of the checks simultaneously and perform functions such as zooming and rotation of the images.
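
    The request, search, and found/not-found steps in this abstract can be pictured with a small in-memory sketch. It is an illustration under stated assumptions only: the class and field names are hypothetical, a dictionary stands in for the electronic storage device, and the communication link and TIFF encoding are left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CheckImage:
    check_id: str  # identifying information stored with the image
    front: bytes   # image data for the front of the check
    back: bytes    # image data for the back of the check


@dataclass(frozen=True)
class Response:
    found: bool
    image: Optional[CheckImage] = None


class ImageStore:
    """Stand-in for the electronic storage device, keyed by check identifier."""

    def __init__(self) -> None:
        self._images: Dict[str, CheckImage] = {}

    def store(self, image: CheckImage) -> None:
        self._images[image.check_id] = image

    def request(self, check_id: str) -> Response:
        """Search for the requested image; report explicitly when it is missing."""
        image = self._images.get(check_id)
        if image is None:
            return Response(found=False)
        return Response(found=True, image=image)
```

    Keeping both sides of the check in one record mirrors the abstract's requirement that a user can view the front and back simultaneously.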

    Extracting data from semi-structured text documents
    3.
    Invention application
    Extracting data from semi-structured text documents (under examination, published)

    Publication No.: US20060242180A1

    Publication date: 2006-10-26

    Application No.: US10565611

    Filing date: 2004-07-23

    IPC classification: G06F17/00

    CPC classification: G06F16/86 G06F16/38

    Abstract: The invention is a process, system, and workflow for extracting and warehousing data from semi-structured documents in any language. This includes, but is not limited to, one or more methods for: the automatic building of text mining term models; the optimization or evolution of such text mining term models; the implementation of document-specific (or company-specific) memory; and the linking of the extracted data, or metadata, once placed in a target electronic document, to the machine-readable underlying source document, thus providing verification and provenance. The process preferably incorporates a wizard-based method for producing pattern recognition text mining term models to extract data from text. The invention also includes a system, method and workflow for handling a subsequent document of similar design and structure, specifically the automatic extraction of target elements and their addition to a database.
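
    A minimal sketch of the extract-and-link idea, assuming that a "term model" can be approximated by a regular expression. The abstract does not say how term models are represented, so the patterns, field names, and the `Extraction` record are all hypothetical. Each extracted value keeps its character offsets in the source text, which is one simple way to provide the provenance the abstract describes.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Extraction:
    field: str  # name of the target element
    value: str  # text captured from the source document
    start: int  # character offset where the value begins in the source
    end: int    # character offset just past the value


# Hypothetical term models: one regular expression per target field.
TERM_MODELS = {
    "total_revenue": re.compile(r"Total revenue[:\s]+\$?([\d,]+(?:\.\d+)?)", re.IGNORECASE),
    "net_income": re.compile(r"Net income[:\s]+\$?([\d,]+(?:\.\d+)?)", re.IGNORECASE),
}


def extract(text: str) -> List[Extraction]:
    """Apply every term model and record where each value was found."""
    results = []
    for field, pattern in TERM_MODELS.items():
        for match in pattern.finditer(text):
            results.append(
                Extraction(field, match.group(1), match.start(1), match.end(1))
            )
    return results
```

    Because the offsets index into the original text, a reviewer can verify any stored value by slicing the source document with `text[start:end]`. A subsequent document with the same layout can be passed through the same `TERM_MODELS` unchanged.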

    Electronic check image storage and retrieval system
    4.
    Granted invention patent
    Electronic check image storage and retrieval system (lapsed)

    Publication No.: US06181837B2

    Publication date: 2001-01-30

    Application No.: US08342265

    Filing date: 1994-11-18

    IPC classification: G06K9/54

    Abstract: A method and apparatus for storing and retrieving images of documents, e.g., checks. The method comprises placing a plurality of documents in a document imaging machine and forming an electronic image of each document, storing each electronic image in an electronic storage device, providing at least one user interface device in communication on a communication link with the electronic storage device, placing a request for at least one document image on the user interface device, transmitting the request by the communication link to the electronic storage device, searching the electronic storage device for the requested electronic image of the document, retrieving the at least one electronic image or providing an indication that the image was not found, storing the electronic image, if found, in an electronic file, for transmission to the user interface device at user option, providing the electronic image to the user interface device at command of a user at the user interface device for storage at the user interface device and displaying the requested electronic image on a display of the user interface device. Preferably, the electronic images are stored with embedded identifying information in a TIFF® (trademark of Aldus Corp.) file format and the check images can be displayed on a display device which permits the user to view both sides of the checks simultaneously and perform functions such as zooming and rotation of the images.

    XBRL DATA MAPPING BUILDER
    5.
    Invention application
    XBRL DATA MAPPING BUILDER (under examination, published)

    Publication No.: US20110137923A1

    Publication date: 2011-06-09

    Application No.: US12634635

    Filing date: 2009-12-09

    IPC classification: G06F17/30

    Abstract: A method and computer program for automatically mapping eXtensible Business Reporting Language (XBRL) data to the corresponding locations in an initial business document. The program takes an XBRL filing, together with the text of the initial report, and starts a data mapping engine based on Evolutionary Optimization. The engine searches for the most plausible location in the document for every data item. After the data locations have been identified, the program tags them in the document and creates visualization forms so that a user can easily see and verify the correspondence between two representations of the same data: as saved in the XBRL filing and as presented in the document.
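
    The mapping step can be illustrated with a toy sketch. It does not implement the Evolutionary Optimization engine named in the abstract: because the example's search space is tiny, it scores every candidate location exhaustively instead. The scoring rule (does the fact's label appear shortly before the value?), the function names, and the fact names are all hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]


def candidate_spans(text: str, value: str) -> List[Span]:
    """Every place in the document where the fact's value literally occurs."""
    return [(m.start(), m.end()) for m in re.finditer(re.escape(value), text)]


def plausibility(text: str, span: Span, label: str, window: int = 40) -> float:
    """Score 1.0 if the fact's label appears within `window` characters before the value."""
    context = text[max(0, span[0] - window): span[0]].lower()
    return 1.0 if label.lower() in context else 0.0


def map_facts(text: str, facts: Dict[str, Tuple[str, str]]) -> Dict[str, Optional[Span]]:
    """Map each fact name to its most plausible span, or None if the value is absent."""
    mapping: Dict[str, Optional[Span]] = {}
    for name, (label, value) in facts.items():
        spans = candidate_spans(text, value)
        if spans:
            mapping[name] = max(spans, key=lambda s: plausibility(text, s, label))
        else:
            mapping[name] = None
    return mapping
```

    The returned spans are what a tagging or visualization step would highlight, so that a user can compare each XBRL value with its location in the report. A search heuristic such as the evolutionary one in the abstract becomes relevant when facts interact (for example, when two facts must not claim the same location) and exhaustive scoring no longer scales.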

    Method and apparatus for correcting erroneously decoded magnetic ink characters
    6.
    Granted invention patent
    Method and apparatus for correcting erroneously decoded magnetic ink characters (lapsed)

    Publication No.: US5963659A

    Publication date: 1999-10-05

    Application No.: US435830

    Filing date: 1995-05-05

    Abstract: A method and apparatus for storing and retrieving images of documents, e.g., checks. The method comprises placing a plurality of documents in a document imaging machine and forming an electronic image of each document, storing each electronic image in an electronic storage device, providing at least one user interface device in communication on a communication link with the electronic storage device, placing a request for at least one document image on the user interface device, transmitting the request by the communication link to the electronic storage device, searching the electronic storage device for the requested electronic image of the document, retrieving the at least one electronic image or providing an indication that the image was not found, storing the electronic image, if found, in an electronic file, for transmission to the user interface device at user option, providing the electronic image to the user interface device at command of a user at the user interface device for storage at the user interface device and displaying the requested electronic image on a display of the user interface device. Preferably, the electronic images are stored with embedded identifying information in a TIFF® (trademark of Aldus Corp.) file format and the check images can be displayed on a display device which permits the user to view both sides of the checks simultaneously and perform functions such as zooming and rotation of the images.
