Data extraction and conversion methods and apparatuses
    1.
    Invention application
    Status: Under examination (published)

    Publication No.: US20070174306A1

    Publication date: 2007-07-26

    Application No.: US11330792

    Filing date: 2006-01-11

    CPC classification number: G06F17/272 G06F17/2247

    Abstract: Data extraction and conversion processes and apparatuses are described according to some aspects. In one aspect, a data extraction and conversion process comprises applying at least one template to the information sources, analyzing the data from the information sources according to the templates, thereby generating parsed data values, and writing the parsed data values from the information sources into a common format. The templates comprise a plurality of parsing steps in a multi-path configuration. In another aspect, an apparatus comprises a computer-readable medium having a plurality of parsing step modules and configured to receive data from the information sources, an input device configured to select and arrange at least two parsing step modules as parsing steps in a multi-path configuration, thereby creating a template, and processing circuitry configured to generate parsed data values by analyzing data from the information sources according to the template. The processing circuitry also writes the parsed data values in a common format. Both the computer-readable medium and the input device are operably connected to the processing circuitry.

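The process in the abstract above (apply a template, parse, write to a common format) can be illustrated with a minimal sketch. The abstract does not define "multi-path configuration"; the sketch assumes it means that each parsing step may hand off to a different next step depending on whether it matched. `ParsingStep`, `apply_template`, the regular expressions, and the sample sources are all hypothetical and are not taken from the patent.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsingStep:
    """One parsing step: extract a value, then pick the next step (assumed design)."""
    name: str                                        # field name in the common format
    pattern: str                                     # regex with one capture group
    next_if_match: Optional["ParsingStep"] = None    # path taken when the step matches
    next_if_no_match: Optional["ParsingStep"] = None  # alternate path otherwise


def apply_template(first_step, text):
    """Walk the steps along whichever path the data selects and collect values."""
    values = {}
    step = first_step
    while step is not None:
        match = re.search(step.pattern, text)
        if match:
            values[step.name] = match.group(1)
            step = step.next_if_match
        else:
            step = step.next_if_no_match
    return values


# Template: read the sender, then try an ISO date and fall back to a US-style date.
us_date = ParsingStep("date", r"(\d{2}/\d{2}/\d{4})")
iso_date = ParsingStep("date", r"(\d{4}-\d{2}-\d{2})", next_if_no_match=us_date)
sender = ParsingStep("sender", r"From:\s*(\S+)", next_if_match=iso_date)

sources = [
    "From: alice@example.com Sent 2006-01-11",
    "From: bob@example.com Sent 01/11/2006",
]
records = [apply_template(sender, source) for source in sources]

# Write every parsed record in one common format (JSON is an arbitrary choice here).
common_format = json.dumps(records, indent=2)
print(common_format)
```

Both sources end up with the same `sender` and `date` keys even though the second one reaches its date through the fallback path.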

Universal parsing agent system and method
    2.
    Invention application
    Status: Granted

    Publication No.: US20050108267A1

    Publication date: 2005-05-19

    Application No.: US10714541

    Filing date: 2003-11-14

    CPC classification number: G06F17/30569

    Abstract: A system and method for extracting a plurality of structured data from one or more information sources. The method comprises receiving the information sources, receiving at least one pattern descriptor selected from a graphical user interface, and receiving one or more templates, with each template having at least one pattern descriptor. The method then applies the one or more templates to the information sources, generates the plurality of structured data in a common format by parsing the information sources with the templates, and stores the structured data in the common format.

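As a rough illustration of the method in the abstract above, the sketch below treats a template as a set of named pattern descriptors, parses each information source with the first template whose descriptors all match, and stores the result in one common format. It assumes that a pattern descriptor can be modeled as a regular expression and that CSV is an acceptable common format; neither is stated in the abstract. `TEMPLATES`, `parse_source`, `store_common_format`, and the sample sources are hypothetical.

```python
import csv
import io
import re

# Hypothetical templates: template name -> {field name: pattern descriptor (regex)}.
TEMPLATES = {
    "invoice": {"number": r"Invoice\s+#(\d+)", "total": r"Total:\s*\$([\d.]+)"},
    "receipt": {"number": r"Receipt\s+(\d+)", "total": r"Paid\s+([\d.]+)\s+USD"},
}


def parse_source(text, templates):
    """Return the fields of the first template whose descriptors all match, else None."""
    for template_name, descriptors in templates.items():
        row = {}
        for field_name, pattern in descriptors.items():
            match = re.search(pattern, text)
            if match is None:
                break  # this template does not fit; try the next one
            row[field_name] = match.group(1)
        else:
            row["template"] = template_name
            return row
    return None


def store_common_format(rows):
    """Serialize the structured rows as CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["template", "number", "total"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sources = ["Invoice #1042 Total: $19.99", "Receipt 77 Paid 5.00 USD"]
rows = [parse_source(source, TEMPLATES) for source in sources]
output = store_common_format(rows)
print(output)
```

In the patent the pattern descriptors are selected through a graphical user interface; here they are hard-coded only to keep the sketch self-contained.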
