Abstract:
Document retrieval is performed taking into account similarities between documents in their numeric data. To this end, a set E of intervals is generated such that each element of a set D of numeric values representing a feature A is included in one of the intervals. Each numeric value in each document is indexed by assigning 1 to an interval that includes the element x of the set D and 0 to an interval that does not. Each document data item that includes numeric values is indexed by indexing its text part with term frequencies and its numeric-value part with the numeric value indexing scheme described above. Using the indices thus created for each document data item, similarities between the document data are calculated with a vector space model or a probability model, and the document data are presented in order of similarity.
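The indexing scheme above can be illustrated with a minimal sketch. The abstract does not say how the set E is built, so the equal-width, non-overlapping intervals here are an assumption, as are all function names; cosine similarity stands in for the vector space model.

```python
from bisect import bisect_right
from math import sqrt


def make_boundaries(values, n_intervals):
    """Inner boundaries of equal-width intervals covering all values.

    Assumption: the abstract does not specify how the set E is generated.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals or 1.0  # avoid zero width when all values are equal
    return [lo + i * width for i in range(1, n_intervals)]


def index_numeric(x, boundaries):
    """Binary vector: 1 for the interval that includes x, 0 for the others."""
    vec = [0] * (len(boundaries) + 1)
    vec[bisect_right(boundaries, x)] = 1
    return vec


def index_document(term_counts, vocabulary, x, boundaries):
    """Text part indexed by term frequency, numeric part by interval membership."""
    text_part = [term_counts.get(term, 0) for term in vocabulary]
    return text_part + index_numeric(x, boundaries)


def cosine(u, v):
    """Cosine similarity, one common vector space model measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical data: one numeric feature and a two-term vocabulary.
boundaries = make_boundaries([10, 20, 30, 40], n_intervals=3)
vocabulary = ["steel", "alloy"]
doc_a = index_document({"steel": 2}, vocabulary, 12, boundaries)
doc_b = index_document({"steel": 2}, vocabulary, 15, boundaries)
doc_c = index_document({"steel": 2}, vocabulary, 38, boundaries)
```

With disjoint intervals, two values contribute to the similarity only when they fall in the same interval, so `doc_a` ranks `doc_b` above `doc_c` even though the text parts are identical.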
Abstract:
A summary of a search result is provided from multiple viewpoints in an associative search system. By indexing one document database in plural ways, a summary of a search result can be displayed from multiple viewpoints. Because the documents in the differently indexed versions of the document database are managed by common identifiers, summaries of a document set obtained as a search result can be created using the different indexes.
Abstract:
A system for collecting, effectively and without omissions, spelling variations of particular technical terms occurring in documents. In advance, the system sorts out technical terms that are potential spelling variations from a large-scale collection of terms. By measuring a cost-adjusted edit distance for the terms that are potential spelling variations, the system can collect, with a high degree of accuracy, the terms considered to be spelling variations from among them.
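A cost-adjusted edit distance can be sketched as a Levenshtein distance whose operation costs depend on the characters involved. The abstract does not give the cost values, so the ones below (cheap hyphen or space edits, cheap case changes) are assumptions chosen only for illustration.

```python
# Assumed costs: the abstract does not specify how the costs are adjusted.
CHEAP_CHARS = set("- ")


def indel_cost(ch):
    """Inserting or deleting a hyphen or space is cheap; other characters cost 1."""
    return 0.2 if ch in CHEAP_CHARS else 1.0


def sub_cost(a, b):
    """Identical characters are free; a pure case change is cheap."""
    if a == b:
        return 0.0
    if a.lower() == b.lower():
        return 0.2
    return 1.0


def weighted_edit_distance(s, t):
    """Levenshtein distance with per-character costs, computed row by row."""
    prev = [0.0]
    for ch in t:
        prev.append(prev[-1] + indel_cost(ch))
    for a in s:
        cur = [prev[0] + indel_cost(a)]
        for j, b in enumerate(t, start=1):
            cur.append(min(
                prev[j] + indel_cost(a),       # delete a from s
                cur[j - 1] + indel_cost(b),    # insert b into s
                prev[j - 1] + sub_cost(a, b),  # substitute a with b
            ))
        prev = cur
    return prev[-1]
```

Under these assumed costs, "IL-2" and "IL2" are much closer than "kinase" and "kinate", so a threshold on the distance could separate likely spelling variations from distinct terms.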
Abstract:
A system that operates between a clinical trial facility (trial site) and a party requesting a clinical trial (sponsor) to make clinical trials more efficient. A clinical trial database stores information relating to the sponsor, information relating to the trial site, trial basic information registered by the sponsor (the trial target disease name, the total length of the trial (trial period), the total number of cases, and the total budget), and trial contract conditions (the trial name, trial period, number of cases, and budget). A process and information service unit distributes the trial contract conditions registered in the contract information storage area to trial sites, registers the trial sites that apply for the trial relating to the distributed contract conditions, and distributes the status of applicant responses to the sponsor who registered the matching contract conditions.
Abstract:
A feature of a compound is predicted using information on interactions between substances. A database of interactions between compounds and genes/proteins is constructed on the basis of information collected from bibliographic databases, gene/protein databases, and disease databases, and an interaction network is prepared by mapping the collected information, thereby enabling prediction of the features of a compound.
Abstract:
Data newly obtained on genes by experiments, data obtained from texts, and data obtained through the Internet or the like are integrated to provide novel knowledge. Information accumulated in a data storage system on the associations between terms such as genes, compounds, diseases, and gene functions is used to reconstruct a network of terms connecting a first query and a second query designated by a query input unit, and the network is shown on a display device. A term associating the first query with the second query is thereby displayed. As a result, the user is provided with knowledge of how the first query and the second query are associated.
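One simple way to find terms connecting two queries is a breadth-first search over the stored term associations. This is a sketch under assumptions: the abstract does not describe how the network is reconstructed, and the function name and the example terms are hypothetical.

```python
from collections import deque


def connecting_path(associations, first_query, second_query):
    """Return a shortest chain of associated terms linking the two queries, or None."""
    # Build an undirected graph from (term, term) association pairs.
    graph = {}
    for a, b in associations:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if first_query not in graph or second_query not in graph:
        return None

    parent = {first_query: None}
    queue = deque([first_query])
    while queue:
        term = queue.popleft()
        if term == second_query:
            # Walk the parent links back to the first query.
            path = []
            while term is not None:
                path.append(term)
                term = parent[term]
            return path[::-1]
        for neighbor in sorted(graph[term]):
            if neighbor not in parent:
                parent[neighbor] = term
                queue.append(neighbor)
    return None


# Hypothetical associations between a gene, a compound, and diseases.
associations = [
    ("geneA", "compoundX"),
    ("compoundX", "diseaseY"),
    ("geneB", "diseaseZ"),
]
path = connecting_path(associations, "geneA", "diseaseY")
```

Here `compoundX` is the intermediate term that associates the two queries; a pair with no chain between them, such as `geneA` and `diseaseZ`, yields `None`.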
Abstract:
A known method for selecting words (or word sequences), an important aspect of information retrieval, suffers from two problems: it cannot eliminate high-frequency common words, and the threshold separating important words from unimportant ones is often set arbitrarily. These problems are solved by normalizing the difference between the word distribution in the subset of all documents containing a word to be extracted (or in a subset of that document set) and the word distribution in the set of all documents, using the number of words in the subset of documents containing the word as a parameter. The accuracy of information retrieval support is thereby enhanced.
Abstract:
Document databases registered in an associative search server are properly ordered. In an associative search server capable of performing an associative search by correlating a plurality of document databases, associative search recording table storing means stores the history of associative searches as an associative search recording table. Using this table, showing order changing means properly sets the order in which document database selecting means presents the document databases. Alternatively, registration fee calculating means properly calculates the registration fees of the document databases registered in the associative search server.
Abstract:
A system for displaying the results of a search provided by one of two different search systems so that searching can continue in the other. One search system includes a search takeover data production command used to output the articles obtained by the search as search takeover data. The other search system includes a search takeover data reading command used to read the search takeover data. A document identifier correspondence table associates the identifiers specified in the search takeover data. When a user clicks a search system transfer instruction button in one search system, the search takeover data production command is executed to produce search takeover data, which is passed to the other search system. The latter search system regards the list of article identifiers passed in through the search takeover data reading command as the search results, and thus operates continuously.
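The handoff between the two search systems can be sketched as follows. The JSON format, the function names, and the reading of the correspondence table as a map from one system's identifiers to the other's are all assumptions; the abstract does not specify them.

```python
import json


def produce_takeover_data(result_ids):
    """Production command (sketch): serialize the article identifiers of a search result."""
    return json.dumps({"ids": list(result_ids)})


def read_takeover_data(takeover_data, correspondence):
    """Reading command (sketch): translate identifiers via the correspondence table.

    Identifiers with no entry in the table are skipped, which is an assumed policy.
    """
    ids = json.loads(takeover_data)["ids"]
    return [correspondence[i] for i in ids if i in correspondence]


# Hypothetical identifiers in the two systems.
correspondence = {"A1": "b-100", "A2": "b-200"}
takeover = produce_takeover_data(["A1", "A2", "A9"])
continued_results = read_takeover_data(takeover, correspondence)
```

The receiving system would then treat `continued_results` as its own search result list, which is what lets the user keep searching without repeating the query.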