Abstract:
A method for searching document information. A document search for a desired key word is preceded by two stages of presearch. In the first stage, a character component table is generated that records, for every stored document, which of the character codes occurring in the group of document text data are present in that document; the table is then searched for all the character codes constituting the designated search subject key word, thereby extracting every document that contains all of those character codes. In the second stage, contracted text data is generated for every document by eliminating in advance the adjuncts and the duplicate occurrences of repeated words in the text data, and the documents containing the search subject key word on a word basis are extracted from the documents extracted by the first presearch. After the second stage of presearch, a text search is performed in accordance with a neighbor condition, a contextual condition, or the like.
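The two presearch stages can be illustrated with a minimal Python sketch. It assumes whitespace-separated words, a caller-supplied adjunct list, and in-memory dictionaries; the function names and data layout are hypothetical and are not taken from the abstract.

```python
from typing import Dict, List, Set


def build_char_table(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Character component table: document ID -> set of characters present."""
    return {doc_id: set(text) for doc_id, text in docs.items()}


def first_presearch(table: Dict[str, Set[str]], keyword: str) -> List[str]:
    """Keep only documents that contain every character of the key word."""
    needed = set(keyword)
    return [doc_id for doc_id, chars in table.items() if needed <= chars]


def build_contracted_text(text: str, adjuncts: Set[str]) -> List[str]:
    """Drop adjunct words and repeated words, keeping first occurrences in order."""
    seen: Set[str] = set()
    contracted: List[str] = []
    for word in text.lower().split():
        if word in adjuncts or word in seen:
            continue
        seen.add(word)
        contracted.append(word)
    return contracted


def second_presearch(candidates: List[str],
                     contracted: Dict[str, List[str]],
                     keyword: str) -> List[str]:
    """Keep only candidates whose contracted text contains the key word as a word."""
    return [doc_id for doc_id in candidates if keyword in contracted[doc_id]]


docs = {
    "d1": "the cat sat on the mat",
    "d2": "act on the facts",
    "d3": "dogs bark",
}
adjuncts = {"the", "on"}  # assumed adjunct list for the example

table = build_char_table(docs)
stage1 = first_presearch(table, "cat")
contracted = {d: build_contracted_text(t, adjuncts) for d, t in docs.items()}
stage2 = second_presearch(stage1, contracted, "cat")
```

In this example the first stage keeps d1 and d2, because both contain the characters c, a and t, and the second stage narrows the result to d1, the only document that contains "cat" as a word. A full text search would then run only on the documents that survive both stages.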
Abstract:
A parallel comparator is provided in a stage preceding an automaton executing device. The comparator collates, in parallel and at high speed, partial character strings taken from a plurality of character strings of interest against a character string to be searched, in which the document data to be searched is arranged sequentially from its leading character. Only when a part of the character string to be searched coincides with a partial character string set in the comparator is the remaining portion of the character string collated by the automaton executing device. The comparator can also be set to a "don't care" condition, in which a character at any position in the partial character string is ignored during comparison, and to a negation condition, in which the comparison is made against the negation of a character at any position in the partial character string.
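A software analogue of this two-stage collation might look like the following sketch. It is only an illustration: the hardware comparator works in parallel, whereas this code loops sequentially, and `str.startswith` stands in for the automaton executing device. The cell representation, the mode names "eq", "any" and "not", and the `prefix_len` parameter are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# One comparator cell: (character, mode).
# mode "eq" = must equal, "any" = don't care, "not" = must differ (negation).
Cell = Tuple[Optional[str], str]


def cell_matches(cell: Cell, ch: str) -> bool:
    expected, mode = cell
    if mode == "any":
        return True
    if mode == "not":
        return ch != expected
    return ch == expected


def comparator_hit(cells: List[Cell], text: str, pos: int) -> bool:
    """True if the partial string set in the comparator matches text at pos."""
    if pos + len(cells) > len(text):
        return False
    return all(cell_matches(c, text[pos + i]) for i, c in enumerate(cells))


def search(text: str, terms: List[str], prefix_len: int = 2) -> List[Tuple[int, str]]:
    """Run the cheap prefix comparison first; collate the full term only on a hit."""
    prefilters = [[(ch, "eq") for ch in term[:prefix_len]] for term in terms]
    hits: List[Tuple[int, str]] = []
    for pos in range(len(text)):
        for term, cells in zip(terms, prefilters):
            # Stand-in for the automaton executing device: a plain full comparison.
            if comparator_hit(cells, text, pos) and text.startswith(term, pos):
                hits.append((pos, term))
    return hits


hits = search("the cat and the cart", ["cat", "cart"])
dont_care = comparator_hit([("c", "eq"), (None, "any"), ("t", "eq")], "cut", 0)
negated = comparator_hit([("c", "eq"), ("a", "not")], "cat", 0)
```

Here `hits` records "cat" at position 4 and "cart" at position 16, `dont_care` is true because the middle character is ignored, and `negated` is false because the second character equals the negated character "a".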
Abstract:
A document retrieval method and system for retrieving, from a document database storing document data in the form of character codes, a document which contains given search terms and which meets a given search query condition. The documents loaded from the document database are searched for a document containing terms which match the search terms, and document identification (ID) information is generated that includes a document identifier of the document found, the match terms found to match the search terms, term identifiers of the match terms, and position information of the match terms in the document. A decision is then made as to whether or not the position information of the match terms satisfies a positional condition specified in the search query condition concerning a positional relation between the search terms, and match information indicating satisfaction of the search query condition is generated when the positional condition is satisfied. Through a proximity condition decision, it is ascertained whether the match terms satisfy an inter-term distance condition specified in the search query condition. Through a contextual condition decision, it is determined whether the match terms satisfy a concurrence condition specifying concurrence of the search terms in a same sub-sentence, a same sentence or a same paragraph. Through a logical condition decision, it is decided whether the match terms satisfy a logical condition between the search terms specified in the search query condition.
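The proximity and contextual condition decisions can be sketched in Python as follows. The sketch assumes that positions are word indexes rather than character offsets, and that sentences end at ".", "!" or "?"; both choices, and all function names, are assumptions made for illustration.

```python
import re
from typing import Dict, List


def find_positions(text: str, terms: List[str]) -> Dict[str, List[int]]:
    """Position information: term -> word indexes at which the term occurs."""
    words = re.findall(r"\w+", text.lower())
    return {t: [i for i, w in enumerate(words) if w == t] for t in terms}


def within_distance(positions: Dict[str, List[int]], a: str, b: str, max_gap: int) -> bool:
    """Proximity decision: some occurrence of a lies within max_gap words of b."""
    return any(abs(i - j) <= max_gap for i in positions[a] for j in positions[b])


def same_sentence(text: str, a: str, b: str) -> bool:
    """Contextual decision: both terms occur in at least one common sentence."""
    for sentence in re.split(r"[.!?]", text.lower()):
        words = re.findall(r"\w+", sentence)
        if a in words and b in words:
            return True
    return False


text = "Disk arrays store files. The search engine scans text quickly."
positions = find_positions(text, ["search", "text", "files"])
near = within_distance(positions, "search", "text", 3)
too_far = within_distance(positions, "search", "text", 2)
```

In this text "search" is word 5 and "text" is word 8, so the pair satisfies a distance condition of three words but not of two. A logical condition decision (AND, OR, NOT between terms) could be layered on top by combining such boolean results.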
Abstract:
A method and apparatus for performing a document information search to uncover specified text data containing a given search subject key word from a group of document text data stored in a memory. In the document information search method, two stages of presearch are carried out to perform the document search with respect to a desired subject key word. In a first stage of presearch, a character component table is generated in which the existence of character codes for every document is set forth with respect to all the character codes contained in the group of document text data of stored documents. The character component table is searched for all the character codes comprising a designated search subject key word to thereby extract all documents containing all the character codes comprising the search subject key word. Further, in the presearch step, all texts without the possibility of containing the search subject key word are eliminated. A comprehensive, narrowed text search is thereby performed in accordance with the search subject key word.
Abstract:
A method and apparatus for searching document information, and a magnetic disk unit used to realize the method and apparatus. In the document information search method, a document search for a desired subject key word is preceded by two stages of presearch. In the first stage of presearch (step 402), a character component table (500) is generated that records, for every stored document, which of the character codes occurring in the group of document text data are present in that document, and the character component table is searched for all the character codes constituting the designated search subject key word, thereby extracting every document that contains all of those character codes. In the second stage of presearch (step 403), contracted text data is generated for every document by eliminating in advance the adjuncts and the duplicate occurrences of repeated words in the text data, and the documents containing the search subject key word on a word basis are extracted from the documents extracted by the first presearch. After the second stage of presearch, a text search is performed in accordance with a neighbor condition, a contextual condition, or the like (step 404). Further, as a term comparator means, hardware (1106) dedicated to term comparison in accordance with a finite automaton is employed.
Further, to handle different notations and synonyms, inputted terms are first subjected to different notation development in a different notation development processing portion (2601), each of the resulting terms is subjected to synonym development in a synonym development processing portion (2602) with reference to a synonym dictionary, and the results of synonym development are then subjected to further different notation development in a different notation development processing portion (2603) in accordance with a conversion rule table (2603).
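The three-step term development (notation, then synonyms, then notation again) can be sketched as below. The dictionaries standing in for the conversion rule table and the synonym dictionary, and the function names, are hypothetical; the abstract does not specify their format.

```python
from typing import Dict, List, Set


def develop_notation(term: str, rules: Dict[str, List[str]]) -> Set[str]:
    """Return the term together with its notation variants from the rule table."""
    return {term} | set(rules.get(term, []))


def develop_terms(term: str,
                  rules: Dict[str, List[str]],
                  synonyms: Dict[str, List[str]]) -> Set[str]:
    """Notation development, then synonym development, then notation development."""
    stage1 = develop_notation(term, rules)
    stage2: Set[str] = set()
    for t in stage1:
        stage2 |= {t} | set(synonyms.get(t, []))
    stage3: Set[str] = set()
    for t in stage2:
        stage3 |= develop_notation(t, rules)
    return stage3


# Assumed example tables.
rules = {"colour": ["color"], "color": ["colour"], "tint": ["TINT"]}
synonyms = {"color": ["tint"]}

expanded = develop_terms("colour", rules, synonyms)
```

Starting from "colour", the first step adds "color", the synonym step adds "tint" (reachable only through the variant "color"), and the last step adds the variant "TINT". This shows why notation development is applied both before and after the synonym lookup.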
Abstract:
A range-conditional character string retrieving method and system capable of retrieving a numerical value from a character string at an increased speed by shortening the time taken to generate a finite automaton, of performing range condition retrieval on a character string in which numeric characters are mixed with non-numeric characters such as alphabetic letters, and of performing highly intelligent retrieval of a numerical value with designation of preceding and succeeding characters. A given range condition is partitioned in accordance with the difference in the number of digits between the upper and lower limit values, whereupon retrieval is performed in each of the partitioned ranges in parallel. When a finite automaton transitions from a predetermined state to at least two states depending on the result of collation of a character string subjected to retrieval, the conditions for the state transitions are designated in terms of corresponding codes. A numerical value detecting unit for detecting a numerical value of interest from the character string subjected to retrieval is provided in association with a range decision unit for deciding whether the numerical value detected by the numerical value detecting unit falls within a specified range. A character string collating unit for retrieving a specific character string from the string subjected to retrieval is provided in association with a range condition collating unit for detecting a numerical value falling within a specific range from the specific character string.
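The partitioning of a range condition by digit count, followed by numerical value detection and range decision, can be sketched as follows. The sketch assumes non-negative integers and uses a regular expression in place of the finite automaton; it also checks the partitions one after another rather than in parallel. All names are hypothetical.

```python
import re
from typing import List, Tuple


def partition_by_digits(lo: int, hi: int) -> List[Tuple[int, int]]:
    """Split [lo, hi] into sub-ranges whose bounds have the same number of digits."""
    parts: List[Tuple[int, int]] = []
    start = lo
    while start <= hi:
        # Largest value with the same digit count as start, capped at hi.
        end = min(hi, 10 ** len(str(start)) - 1)
        parts.append((start, end))
        start = end + 1
    return parts


def find_in_range(text: str, lo: int, hi: int) -> List[int]:
    """Detect numbers in mixed text and keep those falling in any partition."""
    parts = partition_by_digits(lo, hi)
    found: List[int] = []
    for match in re.finditer(r"\d+", text):
        value = int(match.group())
        if any(a <= value <= b for a, b in parts):
            found.append(value)
    return found


parts = partition_by_digits(85, 1200)
values = find_in_range("model A120 costs 950 yen, weight 15kg", 85, 1200)
```

For the range 85 to 1200 the partitions are 85 to 99, 100 to 999, and 1000 to 1200, so each sub-range only has to recognize numbers of a single length. In the sample string, 120 and 950 fall in range while 15 does not.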
Abstract:
A typical file server system has a plurality of file servers connected in parallel on a network, and a plurality of client computers share files distributed among the file servers. A specific file server among the plurality of file servers is provided with a load information monitoring device, which measures the respective loads of the plurality of file servers, and with a file access request distributing device, which refers to the loads measured by the load information monitoring device to select a lightly loaded file server from the plurality of file servers and distributes a file access request transmitted from a client computer to the selected file server.
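The selection step of the file access request distributing device can be sketched in a few lines. The sketch assumes that the monitoring device reports one numeric load per server and that the least loaded server is chosen; the abstract only says "a light load", so the strict minimum is an assumption, as are the names used.

```python
from typing import Callable, Dict


def select_server(loads: Dict[str, float]) -> str:
    """Pick the file server with the lowest measured load."""
    return min(loads, key=lambda name: loads[name])


def distribute(request: str,
               loads: Dict[str, float],
               send: Callable[[str, str], str]) -> str:
    """Forward a client's file access request to the selected server."""
    return send(select_server(loads), request)


# Assumed load measurements from the monitoring device.
loads = {"fs1": 0.7, "fs2": 0.2, "fs3": 0.5}
result = distribute("read /data/a.txt", loads, lambda server, req: f"{server} <- {req}")
```

With these loads the request is forwarded to fs2. In practice the load table would be refreshed periodically, so successive requests may go to different servers.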
Abstract:
In an associative memory device, a search key is stored in a first storage element and a storage key is stored in a second storage element, each via a first data bus. The search key is supplied to a comparator via a second data bus, and the storage key stored in the second storage element is also supplied to the comparator. The comparator compares the search key with the storage key. When the storage key is consistent with the search key, the comparator delivers, as the result of the associative operation, a comparison consistency output signal to a priority encoder circuit, which outputs code information having a limited bit length. This code information is transferred to a CPU via a selector circuit. If the comparator delivers a comparison inconsistency output signal, this signal is passed directly to the CPU via the priority encoder circuit, so that the contents of the first storage element are rewritten. The first and second storage elements are designated by an address signal, and data is read or written via the first data bus, so that they can also be used as an ordinary memory device.
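The behaviour of such a device can be modelled in software as below. This is a simplified sketch: it assumes that the priority encoder reports the lowest matching address and that a miss is signalled by `None`; the class and method names are hypothetical.

```python
from typing import List, Optional


class AssociativeMemory:
    """Toy model: addressable storage keys plus a compare-all search."""

    def __init__(self, size: int) -> None:
        self.storage_keys: List[Optional[int]] = [None] * size

    def write(self, address: int, key: int) -> None:
        """Ordinary memory write, selected by address."""
        self.storage_keys[address] = key

    def read(self, address: int) -> Optional[int]:
        """Ordinary memory read, selected by address."""
        return self.storage_keys[address]

    def search(self, search_key: int) -> Optional[int]:
        """Compare the search key with every storage key.

        Returns the lowest matching address (assumed priority order),
        or None when no storage key is consistent with the search key.
        """
        match_lines = [k is not None and k == search_key for k in self.storage_keys]
        for address, hit in enumerate(match_lines):
            if hit:
                return address
        return None


cam = AssociativeMemory(4)
cam.write(1, 0x2A)
cam.write(3, 0x2A)
cam.write(2, 0x10)
hit_address = cam.search(0x2A)
miss = cam.search(0x99)
```

Two entries hold the key 0x2A, and the search returns address 1 because the encoder in this model favours the lowest address. On a miss the caller would load a different search key and try again, which corresponds to rewriting the first storage element.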
Abstract:
In a computer system having an interface with an I/O bus provided with disconnect/reconnect functions and a plurality of magnetic disk subsystems connected to the I/O bus, a file is divided at disk access time with reference to disk management information, file management information and file descriptor translation information, and the resulting plurality of subfiles are read or written asynchronously. High-speed file access can thus be realized with the plurality of magnetic disk subsystems alone, without requiring any special control hardware. The directory can be constructed from the correspondence between the subfiles in a virtual directory and the file requested by the application, which makes the divided storage transparent to the application.