Abstract:
A non-transitory computer-readable medium stores a program that manages compressed file groups held on a plurality of slave servers. Each compressed file group includes compressed files that are to be searched and that contain character strings. Each compressed file group is expanded using the Huffman tree with which it was compressed. A common compression parameter is generated by summing, for each character, the appearance frequencies of that character across the compressed file groups. The expanded files are recompressed using a common Huffman tree based on the common compression parameter, and are distributed such that the sums of the access frequencies of the compressed files from which the recompressed files originate are substantially equal among the slave servers. New archives including the recompressed files are transmitted to the respective slave servers.
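The frequency summation and the balanced redistribution described above can be illustrated with a minimal Python sketch. The function names (`merge_frequencies`, `build_huffman_codes`, `assign_to_servers`) are hypothetical, the Huffman construction is the textbook heap-based one, and the greedy least-loaded placement is only one assumed way of making the access-frequency sums roughly equal; the abstract does not specify any of these details.

```python
import heapq
from collections import Counter
from itertools import count


def merge_frequencies(group_frequencies):
    """Sum, for each character, its appearance frequency over all file groups."""
    total = Counter()
    for freq in group_frequencies:
        total.update(freq)  # Counter.update adds counts rather than replacing them
    return total


def build_huffman_codes(freq):
    """Return {character: bit string} for a Huffman code built from freq."""
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, tuple):  # internal node: (left, right)
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
        else:  # leaf: a single character
            codes[node] = prefix
    return codes


def assign_to_servers(access_freq, n_servers):
    """Assumed greedy balancing: put each file on the least-loaded server."""
    loads = [0] * n_servers
    placement = [[] for _ in range(n_servers)]
    for name, f in sorted(access_freq.items(), key=lambda kv: -kv[1]):
        i = loads.index(min(loads))
        placement[i].append(name)
        loads[i] += f
    return placement, loads
```

In this sketch, the per-group frequency tables would be merged first, the common code table built from the merged totals, and the recompressed files then placed on servers by access frequency.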
Abstract:
A computer-readable recording medium stores an information searching program for a computer that has access to archives including a compressed file group, the compressed files of which are to be searched and contain character strings. The program causes the computer to execute: sorting the compressed files in descending order of access frequency; combining the sorted compressed files in descending order of access frequency such that the combined size of the combined files does not exceed the storage capacity of a cache area for the storage area that stores the compressed file group; and writing the combined compressed files from the storage area into the cache area before the combined compressed files are searched.
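The sort-and-combine step can be sketched in Python as follows. The function name `select_files_for_cache` and the tuple layout are hypothetical, and stopping at the first file that no longer fits (rather than skipping it and trying smaller files) is an assumption; the abstract only requires that the combined size not exceed the cache capacity.

```python
def select_files_for_cache(files, cache_capacity):
    """Pick the most frequently accessed files that fit in the cache.

    files: list of (name, size, access_frequency) tuples (assumed layout).
    Returns (selected names in descending access frequency, total size used).
    """
    ordered = sorted(files, key=lambda f: f[2], reverse=True)
    selected, used = [], 0
    for name, size, _ in ordered:
        if used + size > cache_capacity:
            break  # assumption: stop combining once the next file would overflow
        selected.append(name)
        used += size
    return selected, used


files = [("a", 40, 5), ("b", 30, 9), ("c", 50, 7), ("d", 10, 1)]
selected, used = select_files_for_cache(files, cache_capacity=100)
# selected == ["b", "c"], used == 80; these files would be written to the
# cache area before any search is run against them.
```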
Abstract:
A recording medium stores an information retrieval program that causes a computer to execute: generating a Huffman tree based on an XML tag written in an XML file and on an appearance frequency of character data other than the XML tag; compressing the XML file using the Huffman tree; receiving a retrieval condition that includes a retrieval keyword and type information concerning the retrieval keyword; setting a decompression start flag for the compression code of an XML start tag related to the type information, the decompression start flag instructing that decompression begin with the compression code string that follows the XML start tag; detecting, in the compressed XML file, the compression code for which the decompression start flag has been set; and, when that compression code is detected, decompressing the compression code string using the Huffman tree.
Abstract:
A computer-readable recording medium stores a sequence-map generating program that causes a computer to execute: extracting, from files in which character strings are written, a word having q (q ≥ 2) characters; extracting, from the extracted word, r (r ≤ q) consecutive characters starting at the s-th (1 ≤ s ≤ q − r + 1) character position from the head of the word; and generating, for each character position s from the head, a consecutive-character sequence map including a flag row that indicates, for each file, whether the file includes the extracted consecutive characters.
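As a rough illustration, the map generation can be sketched in Python with one dictionary per start position s, each mapping an r-character sequence to a flag row with one 0/1 entry per file. The names `build_sequence_maps` and `candidate_files` are hypothetical, words are assumed to be whitespace-delimited, and the candidate-narrowing helper is an assumed use of the maps that the abstract does not describe.

```python
from collections import defaultdict


def build_sequence_maps(files, r=2):
    """Return {s: {consecutive characters: [flag per file]}}.

    files: list of file contents as strings (assumed input form).
    s is the 1-based start position within a word, 1 <= s <= q - r + 1.
    """
    maps = defaultdict(dict)
    for file_idx, text in enumerate(files):
        for word in text.split():
            q = len(word)
            if q < 2 or r > q:
                continue  # only words with q >= 2 and r <= q are indexed
            for s in range(1, q - r + 2):
                gram = word[s - 1:s - 1 + r]
                row = maps[s].setdefault(gram, [0] * len(files))
                row[file_idx] = 1
    return dict(maps)


def candidate_files(maps, keyword, n_files, r=2):
    """Assumed usage: AND the flag rows of the keyword's sequences by position."""
    result = [1] * n_files
    for s in range(1, len(keyword) - r + 2):
        gram = keyword[s - 1:s - 1 + r]
        row = maps.get(s, {}).get(gram, [0] * n_files)
        result = [a & b for a, b in zip(result, row)]
    return result


files = ["cat cab", "dog cat", "bat"]
maps = build_sequence_maps(files)
# maps[1]["ca"] == [1, 1, 0]; maps[2]["at"] == [1, 1, 1]
# candidate_files(maps, "cat", len(files)) == [1, 1, 0]
```

A flag of 1 only marks a file as a candidate. Because each sequence is checked independently, a file can pass the filter without containing the keyword itself, so candidates still need to be verified against the actual text.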
Abstract:
An information retrieval apparatus includes contents, an index data generating unit, a character frequency management data generating unit, a compressing/encrypting unit, a retrieval initializing unit, a full text retrieving unit, and a retrieval result displaying unit. The character frequency management data generating unit generates character frequency management data based on the contents. The compressing/encrypting unit compresses the contents and encrypts the character frequency management data. The retrieval initializing unit decrypts the encrypted character frequency management data. On receiving a retrieval keyword, the full text retrieving unit executes full text retrieval on the compressed contents using the character frequency management data and the index data. The retrieval result displaying unit decompresses a retrieval candidate selected from among the retrieval candidates and displays it as a retrieval result.
Abstract:
A dictionary server includes a retrieval-display processing unit. Upon receipt, from a client PC, of a request for retrieval of semantic information related to a term, the retrieval-display processing unit acquires the semantic information, together with header information and link information related to the semantic information, from knowledge reference data, dictionary content data, and dictionary data. Based on the acquired information, the retrieval-display processing unit causes the client PC to display, on a webpage, items related to the semantic information, the header information, and the link information.
Abstract:
A method of controlling decompression includes: transmitting, by a first computer that stores compressed data compressed based on compression parameters, identification information that identifies the first computer to a second computer that stores the compression parameters; and encrypting, by the second computer, the compression parameters using the identification information received from the first computer. The compression parameters include at least a frequency of appearance and an allocated code for each piece of character data. The method also includes: transmitting, by the second computer, the encrypted compression parameters to the first computer; decrypting, by the first computer, the encrypted compression parameters received from the second computer using the identification information; and decompressing, by the first computer, the compressed data based on the decrypted compression parameters.
Abstract:
A computer-readable recording medium stores an information processing program that causes a computer to execute: storing an aggregate of layers of nodes, each node having a pointer to an upper node, pointers to a leaf and/or a lower node, and branches to lower nodes; obtaining a totaling result of appearance frequencies of character codes described in a file; classifying the character codes by layer, based on their appearance probabilities and the totaling result; calculating, for the ith layer and based on the quantity of character codes in the ith layer, a quantity of pointers pointing to leaves, and further calculating, for the ith layer and based on the calculated quantity, a number of times nodes are used and a quantity of pointers pointing to lower nodes; generating a Huffman tree based on the calculation results; and converting the Huffman tree into a node-less Huffman tree and storing the node-less Huffman tree.