Abstract:
An index creation device reads target text data and creates a bitmap index in which, for each character or word and each tag appearing in the target text data, the appearance positions in the text data are represented as bitmap data.
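As an illustration only, the idea of a bitmap index over words and tags can be sketched as follows. The tokenizer regex, the function names, and the use of a Python integer as the bitmap are assumptions made for this sketch, not details from the abstract.

```python
import re
from collections import defaultdict

def build_bitmap_index(text):
    """Map each token (word or tag) to an integer bitmap.

    Bit i of a token's bitmap is set when the token occurs at token position i.
    The tokenizer is a simplifying assumption: tags look like <name> or </name>.
    """
    tokens = re.findall(r"</?\w+>|\w+", text)
    index = defaultdict(int)
    for pos, tok in enumerate(tokens):
        index[tok] |= 1 << pos
    return tokens, dict(index)

def positions(bitmap):
    """Return the positions of the set bits in a bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

tokens, index = build_bitmap_index("<title>red apple</title> <p>apple pie</p>")
apple_positions = positions(index["apple"])
```

Here the word "apple" and the tag "<title>" share one position space, so a lookup for either is a single dictionary access followed by a bit scan.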
Abstract:
A non-transitory computer-readable recording medium stores a semantic structure search program. The semantic structure search program causes a computer to execute the following process. The computer generates a plurality of search semantic symbols from a search request. Next, the computer specifies a position of a specific word that corresponds to the search request in a search target document, by using the plurality of search semantic symbols and document semantic structure position information. The document semantic structure position information includes relationship information between a plurality of semantic symbols and a plurality of positions of a plurality of words in the search target document. The plurality of semantic symbols represent a semantic structure corresponding to the plurality of words. Thereafter, the computer outputs a search result including the specific word and the position of the specific word in the search target document.
Abstract:
A non-transitory computer-readable recording medium stores therein an index generating program that causes a computer to execute a process including: inputting control statements including plural phrases and having contents that change according to description positions of the plural phrases; generating first index information related to positional information of each of the phrases in the control statements; and generating, from the first index information, a second index information group related to the phrases targeted by each of reserved words included in the control statements.
Abstract:
An information processing device executes a process that includes: determining whether or not encoding target data is in an inflective form of a word when the encoding target data included in target sentence data is encoded; and registering the encoding target data and a code assigned to the encoding target data in a dynamic dictionary in association with each other, in a case where the encoding target data is in the inflective form of the word.
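A minimal sketch of this registration step is shown below. The static dictionary contents, the suffix-based test for an inflective form, the code ranges, and the handling of unknown words are all hypothetical choices made for illustration; the abstract does not specify them.

```python
# Hypothetical static dictionary of base forms and their codes.
STATIC_DICT = {"walk": 0x01, "talk": 0x02, "the": 0x03}
# Toy inflection rule (an assumption): base form plus one of these suffixes.
SUFFIXES = ("ing", "ed", "s")

def is_inflective_form(word):
    """Return True when the word is a known base form plus a known suffix."""
    return any(word.endswith(s) and word[:-len(s)] in STATIC_DICT for s in SUFFIXES)

def encode(words):
    """Encode words, registering inflective forms in a dynamic dictionary."""
    dynamic = {}
    next_code = 0x100  # dynamic codes start above the static range (assumption)
    out = []
    for w in words:
        if w in STATIC_DICT:
            out.append(STATIC_DICT[w])
        elif w in dynamic:
            out.append(dynamic[w])
        elif is_inflective_form(w):
            dynamic[w] = next_code  # register the word and its assigned code together
            out.append(next_code)
            next_code += 1
        else:
            out.append(w)  # unknown words pass through unencoded in this sketch
    return out, dynamic

codes, dynamic_dict = encode(["the", "walked", "talks", "walked"])
```

The second occurrence of "walked" reuses the code registered on its first occurrence, which is the point of keeping the word and its code associated in the dynamic dictionary.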
Abstract:
A non-transitory computer-readable recording medium stores a document encoding program that causes a computer to execute a process including: first generating index information in which an appearance position is associated with each word appearing on document data of a target as bit map data at the time of encoding the document data of the target in word unit; second generating document structure information in which a relationship with respect to the appearance position included in the index information is associated with each specific sub structure included in the document data as bit map data; and retaining the index information and the document structure information in a storage in association with each other.
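One way to picture the two associated bitmaps is the sketch below, which assumes the sub structures are named sections, words are split on whitespace, and Python integers serve as bitmaps. These are illustrative assumptions, not the encoding described in the abstract.

```python
def build_indexes(sections):
    """Build word and structure bitmaps over one shared position space.

    sections: list of (structure_name, text) pairs in document order.
    """
    word_index = {}
    structure_index = {}
    pos = 0
    for name, text in sections:
        for word in text.split():
            word_index[word] = word_index.get(word, 0) | (1 << pos)
            structure_index[name] = structure_index.get(name, 0) | (1 << pos)
            pos += 1
    return word_index, structure_index

def bit_positions(bitmap):
    """Return the positions of the set bits in a bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

word_index, structure_index = build_indexes(
    [("title", "red apple"), ("body", "apple pie recipe")]
)
# A bitwise AND restricts a word's appearance positions to one sub structure.
in_title = bit_positions(word_index["apple"] & structure_index["title"])
in_body = bit_positions(word_index["apple"] & structure_index["body"])
```

Because both bitmaps use the same positions, a query such as "apple within the title" reduces to a single AND of two integers.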
Abstract:
An information processing apparatus includes: a processor configured to: conduct lexical analysis on an interpreter-type source code; compress a source code, on which the lexical analysis has been conducted, by using a compression dictionary that associates an internal code and a compression code; when an execution command of an interpreter is received for the source code compressed, convert the source code compressed into an internal code in accordance with the compression dictionary; and sequentially execute processing in accordance with the internal code converted.
Abstract:
A system includes circuitry configured to: read a plurality of pieces of character information and a plurality of identifiers that are included in a text file; determine whether a piece of character information among the plurality of pieces of character information is included between at least one pair of identifiers among the plurality of identifiers in the text file; and associate the piece of character information with the at least one pair of identifiers when it is determined that the piece of character information is included between the at least one pair of identifiers.
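The determination and association steps might look like the following sketch, assuming the identifiers are XML-style opening and closing tags and that pairs are not nested. Both assumptions, and the function name, are introduced here for illustration.

```python
import re

# Matches text enclosed directly between an opening tag and its matching closing tag.
PAIR = re.compile(r"<(\w+)>([^<]*)</\1>")

def associate_text_with_identifiers(text):
    """Return (identifier_name, enclosed_text) for each matched identifier pair."""
    return [(m.group(1), m.group(2)) for m in PAIR.finditer(text)]

result = associate_text_with_identifiers("<name>Ada</name> stray <year>1843</year>")
```

Text that is not enclosed by a pair, such as "stray" in the sample input, is left unassociated.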
Abstract:
At a preliminary stage, a compressing unit generates frequency information. When the longest matching character string has a length smaller than a predetermined length, the compressing unit outputs a compression code associated with a piece of first data of the longest matching character string among the pieces of first data contained in the frequency information. When the longest matching character string has a length equal to or larger than the predetermined length, the compressing unit outputs a compression code associated with a piece of position information matching with position information about the longest matching character string among the pieces of position information about the second data contained in the frequency information, and a compression code associated with length information about the longest matching character string among the pieces of first data contained in the frequency information.
Abstract:
A non-transitory computer-readable recording medium has stored therein a program for causing a computer to execute a process. The process includes: when obtaining a character string including one unit of character information at one position in the character string, referring to presence/absence information indicating whether or not at least one character string, in a character string group including a plurality of character strings to which compression codes have been assigned, includes the one unit of character information at the one position; and searching the character string group for the obtained character string, except in a case where the presence/absence information indicates that none of the character strings included in the character string group includes the one unit of character information at the one position.
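The filtering idea can be sketched as below. The per-position sets of characters, the check of every position rather than a single one, and the returned (code, searched) pair are assumptions chosen to make the behavior visible; the abstract does not describe the data layout.

```python
def build_presence(strings):
    """For each position, record which characters occur there in any string."""
    presence = {}
    for s in strings:
        for i, ch in enumerate(s):
            presence.setdefault(i, set()).add(ch)
    return presence

def lookup(candidate, codes, presence):
    """Return (code, searched); skip the search when the filter rules it out."""
    for i, ch in enumerate(candidate):
        if ch not in presence.get(i, ()):
            return None, False  # no registered string has ch at position i
    return codes.get(candidate), True

# Hypothetical character string group with assigned compression codes.
codes = {"able": 1, "about": 2, "zebra": 3}
presence = build_presence(codes)

skipped = lookup("apple", codes, presence)
found = lookup("about", codes, presence)
false_positive = lookup("abbut", codes, presence)
```

The filter can only rule strings out: "abbut" passes every per-position check yet is not in the group, so the search still runs and returns no code.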
Abstract:
A non-transitory computer-readable recording medium storing an information processing program for causing a computer to perform processing including: executing preprocessing processing that includes calculating vectors for a plurality of subtexts of text information included in a plurality of pieces of history information in which information on a plurality of question sentences and a plurality of response sentences is recorded; executing training processing that includes training a training model based on training data that defines relationships between the vectors of some subtexts and the vectors of other subtexts among the plurality of subtexts; and executing generation processing that includes calculating, when accepting a new question sentence, the vectors of the subtexts by inputting the vectors of the new question sentence to the training model, and generating a response that corresponds to the new question sentence, based on the calculated vectors.