Abstract:
A text cataloging method comprises the following steps. First, already-analyzed-text data, obtained by analyzing the logical structure of a text to be cataloged, is cataloged in a text database. Second, a structure index is created by sequentially superposing the logical structures of the texts to be cataloged: a single metaelement represents a group of elements in the texts that have the same position of appearance and the same element type, a single piece of meta-character-string data represents a group of pieces of character-string data in the texts that have the same position of appearance, and a context identifier is assigned to each metanode of the tree-like structure index so that the metanode is uniquely identified. Third, structured-full-text data is generated; it defines the associative relations between every piece of character-string data in the already-analyzed-text data of each text and the context identifier of the piece of meta-character-string data that represents it in the structure index. Finally, a character-string index is updated through the sub-steps of extracting partial character strings, generating structured-character-position information, and updating the character-string index.
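The superposition step can be illustrated with a minimal Python sketch. It is not the patented implementation: the nested-tuple text format, the `StructureIndex` class, the `#text` label, and the use of sequential integers as context identifiers are all assumptions made for illustration. "Same position of appearance" is read here as "same parent metanode", so sibling elements of the same type collapse into one metanode.

```python
from itertools import count

TEXT = "#text"  # hypothetical label for meta-character-string nodes


class StructureIndex:
    """Tree of metanodes built by superposing logical structures of texts."""

    def __init__(self):
        self._ids = count(1)
        # Maps (parent context id, node label) -> context id of the metanode.
        self.children = {}

    def _metanode(self, parent_id, label):
        # Reuse the metanode if one already exists at this position,
        # otherwise create it and assign a fresh context identifier.
        key = (parent_id, label)
        if key not in self.children:
            self.children[key] = next(self._ids)
        return self.children[key]

    def add_text(self, element, parent_id=0):
        """Superpose one analyzed text.

        `element` is (element_type, [children]); a child is either another
        element or a plain string. Returns (context_id, string) pairs, a
        stand-in for the structured-full-text data of that text.
        """
        etype, kids = element
        ctx = self._metanode(parent_id, etype)
        pairs = []
        for kid in kids:
            if isinstance(kid, str):
                pairs.append((self._metanode(ctx, TEXT), kid))
            else:
                pairs.extend(self.add_text(kid, ctx))
        return pairs


doc1 = ("doc", [("title", ["Indexing"]),
                ("body", [("p", ["first"]), ("p", ["second"])])])
doc2 = ("doc", [("title", ["Search"]), ("author", ["Kim"])])

index = StructureIndex()
pairs1 = index.add_text(doc1)
pairs2 = index.add_text(doc2)
```

In this sketch both `title` strings map to the same context identifier, because the second text reuses the metanodes that the first text created, while `author` receives a new one.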
Abstract:
A document search method and apparatus, and a portable medium used therefor, are described. When a document is registered in a database, the logical structures of the documents to be registered are superposed on one another to generate a structure index in which structure elements having the same position of occurrence in the document are represented by a single meta-node. At search time, the set of meta-nodes meeting a specified structural condition is determined by referring to the structure index. A string index is then searched with the meta-node identifiers as keys to determine the set of documents meeting the specified condition. Because the condition can specify the position of occurrence of logical elements in the document, a highly accurate structure-specified search becomes possible on a document database containing a large collection of structured documents.
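The two-stage lookup described above can be sketched in Python as follows. Everything here is an illustrative assumption rather than the described apparatus: the structure index is a plain dictionary from meta-node identifier to element path, the structural condition is a path suffix, and the string index is keyed by `(meta-node identifier, term)` with whitespace-separated terms.

```python
from collections import defaultdict

# Hypothetical structure index: meta-node identifier -> element path.
structure_index = {
    1: ("doc",),
    2: ("doc", "title"),
    3: ("doc", "body", "p"),
    4: ("doc", "abstract", "p"),
}

# Hypothetical string index: (meta-node identifier, term) -> document ids.
string_index = defaultdict(set)


def register(doc_id, node_id, text):
    """Record every term of `text` under the meta-node it appeared in."""
    for term in text.lower().split():
        string_index[(node_id, term)].add(doc_id)


def matching_nodes(path_suffix):
    """Stage 1: meta-nodes whose path ends with the given element names."""
    suffix = tuple(path_suffix)
    if not suffix:
        return set(structure_index)
    n = len(suffix)
    return {nid for nid, path in structure_index.items() if path[-n:] == suffix}


def search(path_suffix, term):
    """Stage 2: look up the string index with the meta-node ids as keys."""
    docs = set()
    for nid in matching_nodes(path_suffix):
        docs |= string_index.get((nid, term.lower()), set())
    return docs


register("d1", 2, "Structure index")
register("d1", 3, "search engines")
register("d2", 3, "structure search")
register("d2", 4, "index")
```

With this data, `search(["title"], "structure")` considers only meta-node 2, so a document that mentions the term only in a body paragraph is not returned.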
Abstract:
Fast clipping, even with a large number of users, is achieved as follows. Query expressions are analyzed, and the number of query terms included in each query expression is stored in a term number count table. A finite automaton is generated for matching the terms occurring in text data against all terms included in the query expressions, and a user identifier table is generated that stores user identifiers in association with the terms included in the query expressions. The text data is scanned by the finite automaton to match the terms. For each user, the occurrence count of terms that occur in the text data as substrings coincident with the terms in the query expressions is calculated with reference to the user identifier table and stored in the term occurrence count region of the table. The calculated term occurrence count is then compared with the number of terms included in the query expressions, and when the two match, the text data is delivered to the user.
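The counting scheme above can be sketched in Python under two stated assumptions. First, each user's query is treated as a conjunction of terms, which is what comparing the occurrence count with the term count implies. Second, a plain substring test stands in for the finite automaton scan; a real system would match all terms in one pass. The function names `build_tables` and `clip` are hypothetical.

```python
from collections import defaultdict


def build_tables(queries):
    """Build the two tables from `queries`: user id -> iterable of terms."""
    term_count = {}                    # term number count table
    users_by_term = defaultdict(set)   # user identifier table
    for user, terms in queries.items():
        unique = set(terms)
        term_count[user] = len(unique)
        for term in unique:
            users_by_term[term].add(user)
    return term_count, users_by_term


def clip(text, term_count, users_by_term):
    """Return the users whose query terms all occur in `text` as substrings."""
    occurrence = defaultdict(int)      # term occurrence count per user
    for term, users in users_by_term.items():
        if term in text:               # stand-in for the finite automaton scan
            for user in users:
                occurrence[user] += 1
    # Deliver only where the matched-term count equals the query's term count.
    return {u for u, n in occurrence.items() if n == term_count[u]}


queries = {
    "alice": ["index", "search"],
    "bob": ["patent"],
    "carol": ["index", "patent"],
}
term_count, users_by_term = build_tables(queries)
recipients = clip("a structure index speeds up search", term_count, users_by_term)
```

The work per text grows with the number of distinct terms rather than the number of users, since users who share a term are reached through one entry of the user identifier table.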
Abstract:
In a structured document managing method and system for managing a structured document formed by a plurality of elements, a file forming part of a registered document is selected as an update object from relationship data indicating the entity structure and the logical structure of the registered document, and the data content of the selected file is updated. Partial relationship data indicating the entity structure and the logical structure of the updated file is then generated, and the relationship data of the registered document is updated using this partial relationship data. The logical structure and the entity structure of a document are thereby managed in association with each other in a mutually convertible form.