Structured-text cataloging method, structured-text searching method, and portable medium used in the methods
    2.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06226632B1

    Publication date: 2001-05-01

    Application No.: US09589226

    Filing date: 2000-06-08

    IPC classification: G06F17/30

    Abstract: A text cataloging method includes a step of cataloging already-analyzed-text data obtained from an analysis of a logical structure of a text to be cataloged in a text database, a step of creating a structure index by sequentially superposing logical structures of texts to be cataloged, wherein a single metaelement is used for representing a group of elements in the texts having the same position of appearance in one of the texts and the same element type, a single piece of meta-character-string data is used for representing a group of pieces of character-string data in the texts having the same position of appearance in one of the texts, and a context identifier is assigned to each metanode composing a tree-like structure of the structure index for uniquely identifying the metanode; a step of generating structured-full-text data composed of definitions of associative relations between all pieces of character-string data included in already-analyzed-text data of each text to be cataloged, and context identifiers of pieces of meta-character-string data in the structure index used for representing the pieces of character-string data; and a character-string-index updating step, including the sub-steps of extracting partial character strings, generating structured-character-position information, and updating a character-string index.
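
The cataloging steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that "position of appearance" means the ordinal position among the children of the same parent, that an analyzed text is a nested `(element_type, children)` tuple with plain strings as character-string data, and that partial character strings are bigrams. The names `StructureIndex`, `MetaNode`, `catalog`, and `TEXT` are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count

TEXT = "#text"  # hypothetical type name for meta-character-string nodes


@dataclass
class MetaNode:
    context_id: int                               # unique per metanode
    kind: str                                     # element type, or TEXT
    children: dict = field(default_factory=dict)  # (position, kind) -> MetaNode


class StructureIndex:
    def __init__(self):
        self._ids = count(1)
        self.root = MetaNode(0, "#root")
        self.string_index = {}  # bigram -> {(doc_id, context_id, offset)}

    def _meta_child(self, parent, position, kind):
        # Nodes with the same position and type under the same parent
        # are represented by one shared metanode.
        key = (position, kind)
        if key not in parent.children:
            parent.children[key] = MetaNode(next(self._ids), kind)
        return parent.children[key]

    def catalog(self, doc_id, tree):
        """Superpose one analyzed text; return its structured full text."""
        full_text = []
        self._superpose(doc_id, self.root, 0, tree, full_text)
        return full_text

    def _superpose(self, doc_id, parent, position, node, full_text):
        if isinstance(node, str):
            meta = self._meta_child(parent, position, TEXT)
            # Structured full text: (context identifier, character string).
            full_text.append((meta.context_id, node))
            # Character-string index: bigram plus structured position.
            for offset in range(len(node) - 1):
                gram = node[offset:offset + 2]
                entry = (doc_id, meta.context_id, offset)
                self.string_index.setdefault(gram, set()).add(entry)
            return
        kind, kids = node
        meta = self._meta_child(parent, position, kind)
        for i, kid in enumerate(kids):
            self._superpose(doc_id, meta, i, kid, full_text)


doc_a = ("article", [("title", ["Index"]), ("body", ["trees"])])
doc_b = ("article", [("title", ["Search"]), ("body", ["index"])])
index = StructureIndex()
print(index.catalog("a", doc_a))  # [(3, 'Index'), (5, 'trees')]
print(index.catalog("b", doc_b))  # [(3, 'Search'), (5, 'index')]
```

Both texts map onto the same five metanodes, so their character strings share context identifiers 3 (title text) and 5 (body text). The bigram index can then distinguish a hit in a title from a hit in a body.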

    Structured-text cataloging method, structured-text searching method, and portable medium used in the methods
    3.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US6105022A

    Publication date: 2000-08-15

    Application No.: US28513

    Filing date: 1998-02-23

    IPC classification: G06F17/21 G06F17/30

    Abstract: A text cataloging method includes a step of cataloging already-analyzed-text data obtained from an analysis of a logical structure of a text to be cataloged in a text database, a step of creating a structure index by sequentially superposing logical structures of texts to be cataloged, wherein a single metaelement is used for representing a group of elements in the texts having the same position of appearance in one of the texts and the same element type, a single piece of meta-character-string data is used for representing a group of pieces of character-string data in the texts having the same position of appearance in one of the texts, and a context identifier is assigned to each metanode composing a tree-like structure of the structure index for uniquely identifying the metanode; a step of generating structured-full-text data composed of definitions of associative relations between all pieces of character-string data included in already-analyzed-text data of each text to be cataloged, and context identifiers of pieces of meta-character-string data in the structure index used for representing the pieces of character-string data; and a character-string-index updating step, including the sub-steps of extracting partial character strings, generating structured-character-position information, and updating a character-string index.

    Document search method and apparatus and portable medium used therefor
    4.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06377946B1

    Publication date: 2002-04-23

    Application No.: US09256178

    Filing date: 1999-02-24

    IPC classification: G06F17/30

    Abstract: A document search method and apparatus and a portable medium used therefor are described, in which when registering a document in a data base, the logic structures of each document to be registered are superposed one on another to generate a structure index in which the structure elements having the same position of occurrence in the document are represented by a single meta-node. At the time of document search, a mass of the meta-nodes meeting a specified structural condition is determined with reference to the structure index. A string index is searched with the meta-node identifiers as a key thereby to determine a mass of documents meeting the specified condition. As a result, a highly accurate structure-specified search is made possible on a document data base including a mass of structured documents. In the structure-specified search of structured documents, the conditions for the position of occurrence of the logic elements in the document are specified, thereby making possible a highly accurate structure-specified search.
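
The two-stage lookup described in this abstract (structure index first, then string index keyed by meta-node identifier) might look roughly like the Python sketch below. The data layout is an assumption made for illustration: the structure index maps a path of `(element_type, position)` pairs to a meta-node identifier, and the string index maps `(meta_node_id, term)` to a set of document identifiers. The function names `matching_meta_nodes` and `search` are hypothetical.

```python
def matching_meta_nodes(structure_index, condition):
    """Return ids of meta-nodes whose path satisfies the structural condition.

    condition is a sequence of (element_type, position) pairs; a position
    of None accepts any position of occurrence.
    """
    hits = set()
    for path, node_id in structure_index.items():
        if len(path) != len(condition):
            continue
        if all(kind == want_kind and (want_pos is None or pos == want_pos)
               for (kind, pos), (want_kind, want_pos) in zip(path, condition)):
            hits.add(node_id)
    return hits


def search(structure_index, string_index, condition, term):
    """Find documents containing term inside a structure matching condition."""
    docs = set()
    for node_id in matching_meta_nodes(structure_index, condition):
        # The meta-node identifier is part of the string-index key.
        docs |= string_index.get((node_id, term), set())
    return docs


structure_index = {
    (("article", 1),): 1,
    (("article", 1), ("section", 1)): 2,
    (("article", 1), ("section", 2)): 3,
    (("article", 1), ("section", 1), ("para", 1)): 4,
    (("article", 1), ("section", 2), ("para", 1)): 5,
}
string_index = {
    (4, "index"): {"doc-a", "doc-b"},
    (5, "index"): {"doc-c"},
    (5, "search"): {"doc-a"},
}

# "index" in any paragraph of the second section only.
second = [("article", None), ("section", 2), ("para", None)]
print(search(structure_index, string_index, second, "index"))  # {'doc-c'}
```

Because the structural condition is resolved against the small structure index before any text is touched, the string-index lookup only visits entries for meta-nodes that can possibly match.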

    Structured-text cataloging method, structured-text searching method, and portable medium used in the methods
    5.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06745202B2

    Publication date: 2004-06-01

    Application No.: US10303782

    Filing date: 2002-11-26

    IPC classification: G06F17/30

    Abstract: A text cataloging method includes a step of cataloging already-analyzed-text data obtained from an analysis of a logical structure of a text to be cataloged in a text database, a step of creating a structure index by sequentially superposing logical structures of texts to be cataloged, wherein a single metaelement is used for representing a group of elements in the texts having the same position of appearance in one of the texts and the same element type, a single piece of meta-character-string data is used for representing a group of pieces of character-string data in the texts having the same position of appearance in one of the texts, and a context identifier is assigned to each metanode composing a tree-like structure of the structure index for uniquely identifying the metanode; a step of generating structured-full-text data composed of definitions of associative relations between all pieces of character-string data included in already-analyzed-text data of each text to be cataloged, and context identifiers of pieces of meta-character-string data in the structure index used for representing the pieces of character-string data; and a character-string-index updating step, including the sub-steps of extracting partial character strings, generating structured-character-position information, and updating a character-string index.

    Document search method for registering documents, generating a structure index with elements having position of occurrence in documents represented by meta-nodes
    7.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06510425B1

    Publication date: 2003-01-21

    Application No.: US09972004

    Filing date: 2001-10-09

    IPC classification: G06F17/30

    Abstract: A document search method and apparatus and a portable medium used therefor are described, in which when registering a document in a data base, the logic structures of each document to be registered are superposed one on another to generate a structure index in which the structure elements having the same position of occurrence in the document are represented by a single meta-node. At the time of document search, a mass of the meta-nodes meeting a specified structural condition is determined with reference to the structure index. A string index is searched with the meta-node identifiers as a key thereby to determine a mass of documents meeting the specified condition. As a result, a highly accurate structure-specified search is made possible on a document data base including a mass of structured documents. In the structure-specified search of structured documents, the conditions for the position of occurrence of the logic elements in the document are specified, thereby making possible a highly accurate structure-specified search.

    Structured-text cataloging method, structured-text searching method, and portable medium used in the methods
    8.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US06389413B2

    Publication date: 2002-05-14

    Application No.: US09814692

    Filing date: 2001-03-15

    IPC classification: G06F17/30

    Abstract: A text cataloging method includes a step of cataloging already-analyzed-text data obtained from an analysis of a logical structure of a text to be cataloged in a text database, a step of creating a structure index by sequentially superposing logical structures of texts to be cataloged, wherein a single metaelement is used for representing a group of elements in the texts having the same position of appearance in one of the texts and the same element type, a single piece of meta-character-string data is used for representing a group of pieces of character-string data in the texts having the same position of appearance in one of the texts, and a context identifier is assigned to each metanode composing a tree-like structure of the structure index for uniquely identifying the metanode; a step of generating structured-full-text data composed of definitions of associative relations between all pieces of character-string data included in already-analyzed-text data of each text to be cataloged, and context identifiers of pieces of meta-character-string data in the structure index used for representing the pieces of character-string data; and a character-string-index updating step, including the sub-steps of extracting partial character strings, generating structured-character-position information, and updating a character-string index.

    News clipping method and system
    9.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US5970485A

    Publication date: 1999-10-19

    Application No.: US891064

    Filing date: 1997-07-10

    IPC classification: G06F3/14 G06F3/048 G06F17/30

    Abstract: A method of fast clipping, despite a large number of users, can be achieved through analyzing query expressions, storing the number of query terms included in the query expressions in a term number count table, generating a finite automaton for matching the terms occurring in text data with all terms included in the query expressions, generating a user identifier table for storing the identifiers of users in association with the terms included in the query expressions, matching the terms by scanning the text data by the finite automaton, calculating for each user the occurrence count of terms occurring in the text data as substrings coincident with the terms included in the query expressions made to the user identifier table, storing the calculated occurrence count in the term occurrence count region of the table, comparing the calculated term occurrence count of the table with the number of terms included in the query expressions, and when a match is found from the comparison, delivering the text data to the user.
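
As an illustration of the clipping flow in this abstract, the Python sketch below builds one finite automaton over every query term and scans each text once, regardless of the number of users. Several points are assumptions rather than details taken from the patent: the automaton is an Aho-Corasick automaton, each query is a plain AND of its terms, and the occurrence count is the number of distinct query terms found. The names `build_automaton`, `matched_terms`, and `ClippingService` are hypothetical.

```python
from collections import deque


def build_automaton(terms):
    """Build an Aho-Corasick automaton (goto, failure, output tables)."""
    goto, fail, out = [{}], [0], [set()]
    for term in terms:
        state = 0
        for ch in term:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(term)
    queue = deque(goto[0].values())
    while queue:  # breadth-first, so shallower states are finished first
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def matched_terms(text, automaton):
    """Scan text once and return every term that occurs as a substring."""
    goto, fail, out = automaton
    state, found = 0, set()
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= out[state]
    return found


class ClippingService:
    def __init__(self, queries):
        # Term number count table: user -> number of distinct query terms.
        self.term_count = {u: len(set(t)) for u, t in queries.items()}
        # User identifier table: term -> users whose query contains it.
        self.users_by_term = {}
        for user, terms in queries.items():
            for term in terms:
                self.users_by_term.setdefault(term, set()).add(user)
        self.automaton = build_automaton(self.users_by_term)

    def recipients(self, text):
        """Return users whose every query term occurs in text."""
        occurrence = dict.fromkeys(self.term_count, 0)
        for term in matched_terms(text, self.automaton):
            for user in self.users_by_term[term]:
                occurrence[user] += 1
        return {u for u, n in occurrence.items() if n == self.term_count[u]}


service = ClippingService({
    "alice": {"patent", "index"},
    "bob": {"search"},
    "carol": {"index", "news"},
})
print(service.recipients("A patent on structure index search"))
```

Here the text would be delivered to `alice` and `bob` but not to `carol`, whose query also requires "news". The automaton is built once per set of queries, so the per-text cost is a single scan plus table updates.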

    Method and system for management of structured document and medium having processing program therefor
    10.
    Granted invention patent
    Legal status: in force

    Publication No.: US07107527B2

    Publication date: 2006-09-12

    Application No.: US10834044

    Filing date: 2004-04-29

    IPC classification: G06F15/00 G06F17/00

    CPC classification: G06F17/30011 G06F17/3089

    Abstract: In a structured document managing method and system for managing a structured document formed by a plurality of elements, any file forming a registered document is selected as an object of updating from relationship data indicating an entity structure and a logical structure of the registered document and the data content of the selected update object file is updated. There is generated partial relationship data which indicates an entity structure and a logical structure of the update object file after updating. Relationship data of the registered document is updated by use of the generated partial relationship data. Thereby, a logical structure and an entity structure possessed by a document are managed in association with each other in a mutually convertible form.
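
The update flow in this abstract can be sketched very roughly in Python as follows. Everything about the data layout is an assumption made for illustration: a registered document is a dict of files (the entity structure), each file holds `(logical_element_path, text)` pairs, and the relationship data maps each file name to the logical elements it contains. The function names are hypothetical.

```python
def partial_relationship(file_name, content):
    """Relationship data for a single file after it has been updated."""
    return {file_name: [path for path, _ in content]}


def update_document(files, relationship, file_name, new_content):
    """Update one file, then patch the document-wide relationship data."""
    if file_name not in relationship:
        raise KeyError(f"{file_name} is not part of the registered document")
    files = {**files, file_name: new_content}
    # Only the partial data for the updated file is regenerated and merged.
    relationship = {**relationship,
                    **partial_relationship(file_name, new_content)}
    return files, relationship


def logical_view(relationship):
    """Convert the file -> elements mapping into element -> file."""
    return {path: name
            for name, paths in relationship.items()
            for path in paths}


files = {
    "ch1.xml": [("book/chapter[1]/title", "Intro")],
    "ch2.xml": [("book/chapter[2]/title", "Method")],
}
relationship = {name: [p for p, _ in content] for name, content in files.items()}

new_ch2 = [("book/chapter[2]/title", "Method"),
           ("book/chapter[2]/para[1]", "Details")]
files, relationship = update_document(files, relationship, "ch2.xml", new_ch2)
print(logical_view(relationship)["book/chapter[2]/para[1]"])  # ch2.xml
```

Keeping both directions derivable from one mapping is one simple way to make the entity structure and the logical structure mutually convertible.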
