Abstract:
In document retrieval with a relevance feedback function, which modifies a search profile based on the user's evaluation of each search result as pertinent or not pertinent, relevance feedback can be resumed from any desired earlier point. Each evaluation entered by the user, the search profile modified by that evaluation, and the search result obtained with that profile are saved in association with one another. When restoration of a search profile is requested, the search profile corresponding to the evaluation designated by the user is restored.
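The save-and-restore idea above can be sketched as a small history store. This is only an illustration under stated assumptions: the names `Snapshot`, `FeedbackHistory`, `save`, and `restore` are hypothetical, profiles are assumed to be term-to-weight mappings, and later rounds are kept after a restore (the abstract does not say whether they are discarded).

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """One feedback round: the evaluation, the profile it produced, and the results."""
    evaluation: dict   # doc id -> True if judged pertinent (assumed representation)
    profile: dict      # term -> weight after applying the evaluation (assumed)
    results: list      # doc ids retrieved with that profile


@dataclass
class FeedbackHistory:
    snapshots: list = field(default_factory=list)

    def save(self, evaluation, profile, results):
        # Store copies so later modifications cannot alter a saved round.
        self.snapshots.append(Snapshot(dict(evaluation), dict(profile), list(results)))
        return len(self.snapshots) - 1   # index that identifies this round

    def restore(self, index):
        # Return a copy of the profile saved with the designated evaluation.
        return dict(self.snapshots[index].profile)


history = FeedbackHistory()
history.save({"d1": True}, {"retrieval": 1.0}, ["d1", "d2"])
history.save({"d2": False}, {"retrieval": 1.0, "patent": -0.5}, ["d1", "d3"])
restored = history.restore(0)   # go back to the profile of the first round
```

Feedback can then continue from `restored` as if the second evaluation had not been applied.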
Abstract:
A text mining method that allows documents (texts) to be analyzed from a wide variety of viewpoints. The method includes: a distinctive word and/or phrase extraction step of extracting words and/or phrases that characteristically appear in a processing-subject document set, obtained by taking all or part of a set of previously registered documents; a definition information setting step of setting definition information that includes a specified word or phrase or specified bibliographic information; a coincident word and/or phrase acquisition step of acquiring, from among the words and/or phrases extracted in the distinctive word and/or phrase extraction step, those that coincide within a predetermined range with a word, phrase, or bibliographic information included in the definition information; and a multiplex coincident word and/or phrase acquisition step of acquiring words and/or phrases that coincide within a predetermined range with an individual word, phrase, or bibliographic information acquired from each of a plurality of different pieces of definition information.
Abstract:
A document retrieval method using a computer program includes retrieving a first set of documents using a first query expression generated by the computer program. The first set of documents is provided to a user. An evaluation of the first set of documents is received from the user. The first query expression is changed to a second query expression generated by the computer program based on the evaluation.
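The abstract does not state how the second query expression is derived from the evaluation. As one possible illustration, the sketch below uses a Rocchio-style update, a classic relevance feedback technique swapped in here as an assumption. The function name, the whitespace tokenization, and the `alpha`, `beta`, and `gamma` values are all hypothetical.

```python
from collections import Counter


def rocchio_update(query, relevant_docs, nonrelevant_docs,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Return a new term-weight query from the old query and the judged documents."""
    updated = Counter({term: alpha * w for term, w in query.items()})
    for docs, signed_coeff in ((relevant_docs, beta), (nonrelevant_docs, -gamma)):
        for doc in docs:
            # Add (or subtract) the averaged term frequencies of the judged documents.
            for term, tf in Counter(doc.split()).items():
                updated[term] += signed_coeff * tf / len(docs)
    # Keep only terms that still carry positive weight.
    return {term: w for term, w in updated.items() if w > 0}


first_query = {"search": 1.0}
second_query = rocchio_update(
    first_query,
    relevant_docs=["search index"],        # judged pertinent by the user
    nonrelevant_docs=["search engine"],    # judged not pertinent
)
```

Here `second_query` gains the term `index` from the pertinent document, while `engine`, which appears only in the rejected document, is dropped.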
Abstract:
In text mining, if the system extracts only the characteristic words and phrases that frequently co-occur with each component of an analysis axis (the analysis condition), similar words and phrases are extracted for every component. To make clear which characteristic words and phrases do not appear as cooccurrence words and phrases for other components of the analysis axis, it is desirable to present the distinguishing features of each component to the user. For this purpose, the frequency of appearance of a plurality of characteristic words and phrases in the documents satisfying each analysis condition is calculated. As a result, multiple cooccurrence words and phrases and component-cooccurrence words and phrases are displayed so that they can be told apart. The user can therefore appropriately analyze the contents of a plurality of documents.
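The distinction drawn above can be sketched by counting, for each characteristic word, how often it appears in documents that contain each component, then separating words that co-occur with several components from words tied to a single one. This is a minimal sketch: the function name, the representation of documents as token sets, and the sample data are assumptions, not the patented procedure.

```python
from collections import Counter, defaultdict


def split_cooccurrence(docs, components, candidates):
    """docs: list of token sets; components: analysis-axis terms;
    candidates: set of characteristic words to classify."""
    counts = defaultdict(Counter)   # word -> Counter(component -> document frequency)
    for tokens in docs:
        present = candidates & tokens
        for comp in components:
            if comp in tokens:      # the document satisfies this analysis condition
                for word in present:
                    counts[word][comp] += 1
    multiple = {w: dict(c) for w, c in counts.items() if len(c) > 1}
    specific = {w: dict(c) for w, c in counts.items() if len(c) == 1}
    return multiple, specific


docs = [
    {"printer", "jam", "paper"},
    {"printer", "paper", "toner"},
    {"scanner", "paper", "driver"},
]
multiple, specific = split_cooccurrence(
    docs, ["printer", "scanner"], {"paper", "jam", "toner", "driver"}
)
```

In this toy data, `paper` lands in `multiple` because it occurs with both components, while `jam`, `toner`, and `driver` land in `specific`, each tied to one component.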
Abstract:
A document retrieval system is provided whose document display interface makes the important portions easy to recognize, even when the displayed document was retrieved using a query expression specified as a document or a long sentence. When a text is registered, predetermined character strings extracted from the text and their location information are stored in a location information file. A weight for each character string is calculated by a predetermined method and stored in a weight file. When retrieving a document, predetermined character strings are extracted from the designated query expression. A similarity between the query expression and the texts in the database is calculated using the location information and the weights acquired from the location information file and the weight file. When displaying the document, the character strings with high weights are extracted from the character strings used for the retrieval. The display format of each portion containing the extracted character strings is then changed when the text is displayed.
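The display step above can be sketched as follows. This is a minimal illustration, assuming the weights are already available as a plain mapping (standing in for the weight file) and that the changed display format is a pair of bracket markers; the function name, `top_k`, and the markers are hypothetical.

```python
def highlight(text, retrieval_strings, weights, top_k=2,
              open_mark="[[", close_mark="]]"):
    """Wrap the top_k highest-weighted retrieval strings found in text with markers."""
    ranked = sorted(
        (s for s in retrieval_strings if s in weights),
        key=lambda s: weights[s],
        reverse=True,
    )
    for s in ranked[:top_k]:
        # Simplification: overlapping strings are not handled specially.
        text = text.replace(s, f"{open_mark}{s}{close_mark}")
    return text


weights = {"index": 0.9, "n-gram": 0.7, "stores": 0.1}   # assumed precomputed weights
shown = highlight(
    "n-gram index stores location information for each n-gram",
    ["index", "n-gram", "stores"],
    weights,
)
```

Only the two highest-weighted strings are marked, so a low-weight string such as `stores` stays in the normal display format.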
Abstract:
Retrieval conditions entered by a plurality of users are registered. An input text is searched according to these retrieval conditions, and the similarity of the text is calculated for each retrieval condition. The text is delivered to the users whose retrieval conditions are satisfied by the calculated similarity.
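The delivery flow above can be sketched as a simple filtering loop. The abstract does not define the similarity measure or what it means for a condition to be satisfied, so this sketch assumes a term-overlap ratio and a per-user threshold; the function names, the profile layout, and the sample users are hypothetical.

```python
def similarity(condition_terms, text_tokens):
    # Assumed stand-in measure: fraction of the condition's terms found in the text.
    if not condition_terms:
        return 0.0
    return len(condition_terms & text_tokens) / len(condition_terms)


def deliver(text, profiles):
    """profiles: user -> (set of condition terms, similarity threshold)."""
    tokens = set(text.lower().split())
    recipients = []
    for user, (terms, threshold) in profiles.items():
        # Deliver only when the text is similar enough to this user's condition.
        if similarity(terms, tokens) >= threshold:
            recipients.append(user)
    return recipients


profiles = {
    "alice": ({"patent", "retrieval"}, 0.5),
    "bob": ({"weather", "forecast"}, 0.5),
    "carol": ({"retrieval", "index", "ranking"}, 0.6),
}
recipients = deliver("New patent on document retrieval and index structures", profiles)
```

With these sample profiles the text reaches `alice` and `carol` but not `bob`, whose condition shares no terms with it.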
Abstract:
Word boundary identification, such as morphological analysis, is performed on documents to be registered, and the start and end positions of words are identified. Word boundary information is obtained from these identification results. Search indexes are created for substrings of a predetermined length (n-grams) extracted from the document being registered. Each search index includes document identification information, occurrence position information indicating that the string is located at the n-th position from the beginning of the text data, and word boundary information for the n-gram in the document.
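The index layout above can be sketched with an in-memory dictionary. This is a minimal sketch, assuming the word segmentation has already been done (a pre-split word list stands in for morphological analysis) and that boundary information is stored as two boolean flags per posting; the function and field names are hypothetical.

```python
from collections import defaultdict


def build_index(doc_id, words, n=2, index=None):
    """Index every n-gram of the concatenated words with position and boundary flags."""
    index = defaultdict(list) if index is None else index
    text = "".join(words)
    starts, ends, pos = set(), set(), 0
    for w in words:
        starts.add(pos)      # offset where a word begins
        pos += len(w)
        ends.add(pos)        # offset just past the end of a word
    for i in range(len(text) - n + 1):
        index[text[i:i + n]].append({
            "doc": doc_id,
            "pos": i,                          # offset from the start of the text
            "starts_word": i in starts,        # n-gram begins at a word boundary
            "ends_word": (i + n) in ends,      # n-gram finishes at a word boundary
        })
    return index


index = build_index("d1", ["data", "base"], n=2)
```

In this example the bigram `ab` straddles the boundary between `data` and `base`, so both of its flags are false, which lets a search distinguish it from a bigram that begins a word, such as `ba`.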