Abstract:
A computer-readable recording medium stores therein an information search program that causes a computer to search for text items described in a text file. The information search program causes the computer to execute receiving input of a search keyword; searching an index file for a writing keyword that includes the search keyword, the index file including writing keywords described, for respective entries, in an order identical to the order in which the text items are described in the text file; identifying an entry that corresponds to the writing keyword retrieved at the searching; and outputting the identified entry.
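The search flow described above can be sketched roughly as follows. This is only an illustration: the in-memory `INDEX` list, the integer entry ids, and the function name `search_entries` are assumptions, not details taken from the abstract.

```python
from typing import List, Tuple

# Hypothetical index: (writing keyword, entry id) pairs, listed in the same
# order as the text items appear in the text file.
INDEX: List[Tuple[str, int]] = [
    ("compress", 0),
    ("compression symbol", 1),
    ("decompress", 2),
    ("index file", 3),
]


def search_entries(search_keyword: str, index: List[Tuple[str, int]]) -> List[int]:
    """Return the ids of entries whose writing keyword includes the search keyword."""
    return [
        entry_id
        for writing_keyword, entry_id in index
        if search_keyword in writing_keyword
    ]
```

Because the index keeps the same order as the text file, the returned entry ids can be used directly to locate the corresponding text items.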
Abstract:
A recording medium stores an information processing program that causes a computer to execute storing a compression symbol map group having a bit string indicating for each character code, presence or absence of the character code in a file group, and a Huffman tree whose leaf corresponding to the character code has a pointer to a compression symbol map of the character code, the Huffman tree converting the character code into a compression symbol of the character code; compressing sequentially and according to the Huffman tree, a character code to be compressed and described in a file of the file group; detecting access to the leaf at the compressing; identifying by a pointer in the accessed leaf, a compression symbol map of the character code to be compressed; and updating a bit that indicates presence or absence of the character code to be compressed, in the identified compression symbol map.
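As a rough illustration of updating a compression symbol map while compressing, the sketch below replaces the Huffman tree with a plain dictionary of leaves. The `Leaf` class, the fixed prefix codes passed to `build_leaves`, and the list-based bitmap are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Leaf:
    code: str              # compression symbol, written as a bit string
    symbol_map: List[int]  # reference to this character's presence bitmap


def build_leaves(codes: Dict[str, str], num_files: int) -> Dict[str, Leaf]:
    """Create one leaf per character, each pointing to an all-zero bitmap."""
    return {ch: Leaf(code, [0] * num_files) for ch, code in codes.items()}


def compress_file(text: str, file_no: int, leaves: Dict[str, Leaf]) -> str:
    """Compress one file and mark each character seen as present in it."""
    out = []
    for ch in text:
        leaf = leaves[ch]             # leaf access during compression
        out.append(leaf.code)         # emit the compression symbol
        leaf.symbol_map[file_no] = 1  # update the bit for this file
    return "".join(out)
```

After every file in the group has been compressed, each character's bitmap records which files contain it, so the maps are built as a side effect of compression rather than in a separate pass.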
Abstract:
A computer-readable recording medium stores therein an information retrieval program that causes a computer to execute a retrieval process in which files to be retrieved are narrowed down by using a bit string for each character in the files to find characters making up a retrieval keyword to retrieve a keyword identical to or related to the retrieval keyword in the files to be retrieved. The bit strings indicate the presence of the characters in the files. The information retrieval program causes the computer to execute extracting, from among the bit strings, a bit string of an arbitrary character; and compressing the extracted bit string, by using a special Huffman tree having leaves of plural types of symbol strings covering patterns represented by a predetermined number of bits and a special symbol string having a number of bits greater than the predetermined number of bits.
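The narrowing step, in which per-character bit strings are combined to find candidate files, might look like the sketch below. It does not cover the special Huffman tree used to compress the bit strings. Storing each bit string as a Python integer and the names `build_bitmaps` and `narrow` are assumptions made for the example.

```python
from typing import Dict, List


def build_bitmaps(files: List[str]) -> Dict[str, int]:
    """Map each character to an integer whose bit i is set if file i contains it."""
    bitmaps: Dict[str, int] = {}
    for i, text in enumerate(files):
        for ch in set(text):
            bitmaps[ch] = bitmaps.get(ch, 0) | (1 << i)
    return bitmaps


def narrow(keyword: str, bitmaps: Dict[str, int], num_files: int) -> List[int]:
    """Return the files that contain every character of the keyword."""
    mask = (1 << num_files) - 1           # start with all files as candidates
    for ch in keyword:
        mask &= bitmaps.get(ch, 0)        # keep only files containing ch
    return [i for i in range(num_files) if (mask >> i) & 1]
```

The result is only a candidate list: a file may contain all the characters without containing the keyword itself, so the remaining files still need an actual search.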
Abstract:
An information retrieval apparatus includes contents, an index data generating unit, a character frequency management data generating unit, a compressing/encrypting unit, a retrieval initializing unit, a full text retrieving unit, and a retrieval result displaying unit. The character frequency management data generating unit generates character frequency management data based on the contents. The compressing/encrypting unit compresses the contents and encrypts the character frequency management data. The retrieval initializing unit decrypts the encrypted character frequency management data. The full text retrieving unit executes full text retrieval on the compressed contents using the character frequency management data and index data when receiving a retrieval keyword. The retrieval result displaying unit decompresses a retrieval candidate selected from the retrieval candidates and displays it as a retrieval result.
Abstract:
A computer-readable recording medium stores therein an information processing program that causes a computer to execute storing an aggregate of layers of nodes respectively having a pointer to an upper node, pointers to a leaf and/or a lower node and branches to lower nodes; obtaining a totaling result of appearance frequencies of character codes described in a file; classifying the character codes by layer, based on appearance probabilities thereof and the totaling result; calculating, based on a quantity of character codes in an ith layer and for the ith layer, a quantity of pointers pointing to leaves, and based on the quantity calculated and for the ith layer, further calculating a number of times nodes are used and a quantity of pointers pointing to lower nodes; generating, based on calculation results, a Huffman tree; and converting the Huffman tree into a node-less Huffman tree and storing the node-less Huffman tree.
Abstract:
A computer-readable recording medium stores therein an information searching program for a computer that has access to archives including a compressed file group of compressed files that are to be searched and that have character strings described therein. The program causes the computer to execute: sorting the compressed files in descending order of access frequency; combining the sorted compressed files in that order such that their combined size does not exceed the storage capacity of a cache area for the storage area that stores the compressed file group; and writing the combined compressed files from the storage area into the cache area before the combined compressed files are searched.
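The selection of files to preload into the cache can be sketched as a simple greedy pass. The tuple layout `(name, size, access_count)` and the function name are hypothetical. Stopping at the first file that does not fit, rather than skipping it, is an assumption, since the abstract does not say which is intended.

```python
from typing import List, Tuple


def select_for_cache(
    files: List[Tuple[str, int, int]], capacity: int
) -> List[str]:
    """Pick files in descending access frequency until the cache is full.

    Each file is a (name, size, access_count) tuple.
    """
    ordered = sorted(files, key=lambda f: f[2], reverse=True)
    chosen: List[str] = []
    used = 0
    for name, size, _count in ordered:
        if used + size > capacity:
            break              # assumption: stop at the first file that does not fit
        chosen.append(name)
        used += size
    return chosen
```

The selected files would then be written into the cache area before any search runs, so that the most frequently accessed files are read from the cache.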
Abstract:
An information retrieval apparatus includes an acquiring unit that acquires a numerical value defining a boundary of a numerical range; a detecting unit that detects a number of places in and a head numeral of the numerical value; an extracting unit that extracts, from a bit string group, a bit string indicating whether a numerical value in a numerical value group having the number of places and the head numeral is present in files subject to retrieval; a specifying unit that specifies a file corresponding to a bit in the extracted bit string, the bit indicating the presence of a numerical value of the numerical value group; a determining unit that determines whether a numerical value in the specified file meets the boundary condition; and a designating unit that, based on a determination by the determining unit, designates the specified file to have a numerical value within the numerical range.
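A minimal sketch of the boundary check follows, assuming positive integers, a lower boundary only, and bit strings stored as Python integers keyed by (number of places, head numeral). All function names are hypothetical, and the handling of groups that lie entirely inside the range is not shown.

```python
from typing import Dict, List, Tuple


def group_key(n: int) -> Tuple[int, int]:
    """Return (number of places, head numeral) of a positive integer."""
    s = str(n)
    return len(s), int(s[0])


def build_index(files: List[List[int]]) -> Dict[Tuple[int, int], int]:
    """For each group, set bit i if file i holds a value in that group."""
    index: Dict[Tuple[int, int], int] = {}
    for i, values in enumerate(files):
        for v in values:
            index[group_key(v)] = index.get(group_key(v), 0) | (1 << i)
    return index


def files_meeting_lower_bound(
    bound: int, files: List[List[int]], index: Dict[Tuple[int, int], int]
) -> List[int]:
    """Check only files flagged for the boundary's group against the boundary."""
    key = group_key(bound)
    mask = index.get(key, 0)              # extracted bit string
    hits = []
    for i, values in enumerate(files):
        if (mask >> i) & 1 and any(
            group_key(v) == key and v >= bound for v in values
        ):
            hits.append(i)
    return hits
```

The bit string limits the expensive value-by-value comparison to files that hold at least one number with the same number of places and head numeral as the boundary.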
Abstract:
A file processing method, a data processing apparatus and a storage medium divide data and index data with respect to the data into a plurality of sections, and compress the sections to obtain a compressed file, and store the compressed file in a storage medium together with address information of the sections after the compression.
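One way to picture sectioned compression with stored address information is the sketch below. It uses Python's `zlib` as a stand-in compressor and treats the data as a single byte string; the fixed section size, the `(offset, length)` address format, and the function names are assumptions made for the example.

```python
import zlib
from typing import List, Tuple


def compress_sections(
    data: bytes, section_size: int
) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Compress data section by section and record where each section lands."""
    blob = bytearray()
    addresses: List[Tuple[int, int]] = []   # (offset, length) after compression
    for start in range(0, len(data), section_size):
        comp = zlib.compress(data[start:start + section_size])
        addresses.append((len(blob), len(comp)))
        blob += comp
    return bytes(blob), addresses


def read_section(blob: bytes, addresses: List[Tuple[int, int]], n: int) -> bytes:
    """Decompress only section n, using its stored address."""
    offset, length = addresses[n]
    return zlib.decompress(blob[offset:offset + length])
```

Keeping the post-compression addresses lets a reader decompress a single section without touching the rest of the compressed file.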
Abstract:
An encrypting method including encrypting a first data segment of encryption target data on the basis of first key information, generating second key information on the basis of the first data segment by using a predetermined algorithm, and encrypting a second data segment of the encryption target data, which is different from the first data segment, on the basis of the second key information.
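The key chaining can be illustrated with the toy sketch below. The XOR keystream cipher is a placeholder and is not secure. Using SHA-256 as the "predetermined algorithm" and deriving the next key from the plaintext segment are assumptions, as are all function names.

```python
import hashlib


def keystream_xor(segment: bytes, key: bytes) -> bytes:
    """Toy cipher for illustration only: XOR with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(segment):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(segment, stream))


def derive_next_key(plain_segment: bytes) -> bytes:
    """Assumed 'predetermined algorithm': hash of the plaintext segment."""
    return hashlib.sha256(plain_segment).digest()


def encrypt(data: bytes, first_key: bytes, segment_size: int) -> bytes:
    key, out = first_key, []
    for start in range(0, len(data), segment_size):
        segment = data[start:start + segment_size]
        out.append(keystream_xor(segment, key))
        key = derive_next_key(segment)      # key for the following segment
    return b"".join(out)


def decrypt(cipher: bytes, first_key: bytes, segment_size: int) -> bytes:
    key, out = first_key, []
    for start in range(0, len(cipher), segment_size):
        segment = keystream_xor(cipher[start:start + segment_size], key)
        out.append(segment)
        key = derive_next_key(segment)      # recovered plaintext yields the next key
    return b"".join(out)
```

In this sketch only the first key has to be shared, because each later key is regenerated from the segment decrypted just before it.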