Abstract:
The aim is to improve the compressibility of data that includes consecutive runs of the same value. First encoding means (1a) encodes each part of the compression target data (4) in which the same value repeats into the run length of that value. Decomposing means (1b) decomposes each run length into a sum of integers belonging to a predetermined integer group. Calculating means (1d) calculates the probability of occurrence of each integer obtained by the decomposition. Second encoding means (1e) encodes each integer by assigning shorter codes to integers with higher probabilities of occurrence.
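The four steps of this abstract can be sketched as follows. This is only an illustration under stated assumptions: the integer group (powers of two), the greedy decomposition, and the use of Huffman code lengths for the second encoding are not specified in the abstract, and all function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import groupby

# Hypothetical integer group; the abstract only says "predetermined".
INTEGER_GROUP = (8, 4, 2, 1)

def run_lengths(data):
    """First encoding: replace each run of equal values with (value, run length)."""
    return [(value, len(list(group))) for value, group in groupby(data)]

def decompose(n, group=INTEGER_GROUP):
    """Write n as a sum of integers from the group, greedily, largest first."""
    parts = []
    for g in group:
        while n >= g:
            parts.append(g)
            n -= g
    return parts

def huffman_code_lengths(freqs):
    """Return {symbol: code length}; more frequent symbols get shorter codes."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Each heap entry: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

data = "a" * 11 + "b" * 3 + "c" * 11 + "d"
runs = run_lengths(data)                                # step (1a)
parts = [p for _, n in runs for p in decompose(n)]      # step (1b)
lengths = huffman_code_lengths(Counter(parts))          # steps (1d) and (1e)
```

In this example the run lengths 11, 3, 11, and 1 decompose into the integers 8, 2, and 1 only, so three short codes cover every run. The most frequent integer, 1, receives the shortest code.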
Abstract:
For a plurality of codons that have different base sequences but indicate the same amino acid, the information processing apparatus calculates a second index indicating the positions of the amino acid in a codon file, based on a first index indicating the positions of those codons in the codon file. Based on the second index, the information processing apparatus identifies amino acid sequences that are repeatedly expressed in the codon file. The information processing apparatus specifies each codon sequence corresponding to the position of each amino acid sequence repeatedly expressed in the codon file as a codon sequence having homology.
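The relationship between the two indexes can be illustrated with a minimal sketch. The codon table below is a small excerpt of the standard genetic code. The function names, the fixed window length, and the in-memory list standing in for the codon file are assumptions made for illustration.

```python
from collections import defaultdict

# Excerpt of the standard genetic code (DNA codons).
CODON_TO_AMINO = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "GAT": "Asp", "GAC": "Asp",
    "TGG": "Trp",
}

def build_codon_index(codons):
    """First index: codon -> positions in the codon file."""
    index = defaultdict(list)
    for pos, codon in enumerate(codons):
        index[codon].append(pos)
    return index

def build_amino_index(codon_index):
    """Second index: amino acid -> positions, merging synonymous codons."""
    index = defaultdict(list)
    for codon, positions in codon_index.items():
        index[CODON_TO_AMINO[codon]].extend(positions)
    for positions in index.values():
        positions.sort()
    return index

def repeated_amino_runs(amino_index, length, total):
    """Find amino acid sequences of the given length that occur more than once."""
    amino_at = [None] * total
    for amino, positions in amino_index.items():
        for pos in positions:
            amino_at[pos] = amino
    seen = defaultdict(list)
    for start in range(total - length + 1):
        seen[tuple(amino_at[start:start + length])].append(start)
    return {seq: starts for seq, starts in seen.items() if len(starts) > 1}

codons = ["GCT", "AAA", "GAT", "TGG", "GCC", "AAG", "GAC"]
codon_index = build_codon_index(codons)
amino_index = build_amino_index(codon_index)
repeats = repeated_amino_runs(amino_index, 3, len(codons))
# Codon sequences at the repeated positions are treated as homologous.
homologous = {seq: [codons[s:s + 3] for s in starts]
              for seq, starts in repeats.items()}
```

Here the codon sequences GCT-AAA-GAT and GCC-AAG-GAC differ in every codon, yet both map to Ala-Lys-Asp, so the amino acid index exposes a repeat that a codon-level comparison would miss.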
Abstract:
The information processing device calculates vectors for a plurality of subtexts of text information included in a plurality of pieces of history information in which information on a plurality of question sentences and a plurality of response sentences is recorded. The information processing device executes training of a training model, based on training data that defines relationships between the vectors of some subtexts and the vectors of other subtexts among the plurality of subtexts. When accepting a new question sentence, the information processing device calculates the vectors of the subtexts by inputting the vectors of the new question sentence to the training model. The information processing device generates a response that corresponds to the new question sentence, based on the calculated vectors.
Abstract:
An information processing device calculates vectors of a plurality of pieces of basic information by performing Poincare Embeddings on the plurality of pieces of basic information, based on a common concept table that classifies the plurality of pieces of space-specific basic information defined in a plurality of spaces with a common concept. The information processing device calculates a vector of structural information with a granularity larger than the basic information, based on the vectors of the plurality of pieces of basic information. The information processing device generates an inverted index that defines a relationship between a position of the basic information in a file that corresponds to the same space and the vector of the basic information and a relationship between a position of the structural information in the file and the vector of the structural information.
Abstract:
A computer: acquires, from a compression dictionary that associates each of codes having a length according to a frequency of appearance of a set of a word and a word meaning of the word with that set, the set of any word and that word meaning, and one of the codes associated with the set of the any word and that word meaning; selects, from among a plurality of fixed-length codes stored in the memory with a same length in association with the set of the word and that word meaning, one of the fixed-length codes associated with the set of the any word and that word meaning; generates a conversion dictionary that associates the selected one of the fixed-length codes with the acquired one of the codes; and specifies, by the conversion dictionary, the individual fixed-length codes associated with each of the codes contained in compressed data.
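The conversion dictionary described here can be sketched as a join of two tables on the (word, word meaning) key. The sample words, meanings, code values, and function names below are hypothetical. The sketch also assumes the variable-length codes are prefix-free bit strings, which the abstract does not state.

```python
# Compression dictionary: (word, meaning) -> variable-length code.
# More frequent sets get shorter codes. Entries are made up for illustration.
compression_dict = {
    ("bank", "financial institution"): "0",
    ("bank", "river side"): "10",
    ("run", "move fast"): "11",
}

# Fixed-length codes of equal width for the same (word, meaning) sets.
fixed_codes = {
    ("bank", "financial institution"): 0x0001,
    ("bank", "river side"): 0x0002,
    ("run", "move fast"): 0x0003,
}

def build_conversion_dict(compression_dict, fixed_codes):
    """Map each variable-length code directly to its fixed-length code."""
    return {var_code: fixed_codes[key]
            for key, var_code in compression_dict.items()}

def to_fixed_codes(bits, conversion):
    """Walk the compressed bit string and emit one fixed-length code per code word."""
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in conversion:
            out.append(conversion[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return out

conversion = build_conversion_dict(compression_dict, fixed_codes)
fixed = to_fixed_codes("01110", conversion)
```

With the conversion dictionary in place, the compressed data maps to fixed-length codes without first being expanded back into words.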
Abstract:
An information processing device (100) compares the codons included in reference codon sequence data and the codons included in analysis-target codon sequence data, at each sequence position of the codons. Then, the information processing device (100) identifies a plurality of codons positioned at the sequence positions that are subsequent to the sequence position at which codons are nonidentical. Moreover, the information processing device (100) refers to a memory unit that stores the type of mutation, which has occurred at a particular codon, in a corresponding manner to a plurality of codons subsequent to the particular codon, on account of occurrence of the mutation in the particular codon. Then, the information processing device (100) identifies the type of mutation corresponding to the plurality of codons.
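The lookup described above might be sketched as follows. The mutation table is a made-up placeholder, not biological reference data. The window of two subsequent codons and the function names are also assumptions.

```python
def first_mismatch(reference, target):
    """Return the first sequence position where the codons differ, or None."""
    for pos, (ref, tgt) in enumerate(zip(reference, target)):
        if ref != tgt:
            return pos
    return None

def identify_mutation(reference, target, table, window=2):
    """Look up the mutation type from the codons that follow the mismatch."""
    pos = first_mismatch(reference, target)
    if pos is None:
        return None
    following = tuple(target[pos + 1:pos + 1 + window])
    return pos, table.get(following, "unknown")

# Reference: ATG GCT AAA GAT TGG.
# Target: the same bases with one "C" inserted inside the second codon,
# which shifts the reading frame of every later codon.
reference = ["ATG", "GCT", "AAA", "GAT", "TGG"]
target = ["ATG", "GCC", "TAA", "AGA", "TTG"]

# Hypothetical memory unit: subsequent codons -> mutation type.
mutation_table = {("TAA", "AGA"): "insertion"}

result = identify_mutation(reference, target, mutation_table)
```

The design point is that the mutation type is read from the pattern of codons after the first mismatch, which is what distinguishes a frame-shifting mutation from a single-codon substitution.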
Abstract:
An information processing apparatus (100) analyzes first text information and second text information and acquires, for the words included in the first and the second text information, first word information and second word information, each of which identifies a combination of a word and a word meaning of the word. The information processing apparatus (100) converts the first word information to a first word meaning vector and converts the second word information to a second word meaning vector. The information processing apparatus (100) learns parameters of a conversion model by using the first word meaning vector and the second word meaning vector.
Abstract:
An index generation device (100) inputs data described by combinations of an item and a value and generates, for each item and each value included in the data, index information regarding its appearance positions. This makes it possible to search efficiently, for example on XBRL data, with a search condition that combines an item and a value.
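A minimal sketch of such an index follows. It assumes the input is already available as a list of (item, value) pairs. The item names, values, and function names are hypothetical, and real XBRL parsing is out of scope.

```python
from collections import defaultdict

def build_index(records):
    """Record the appearance positions of every item and every value."""
    item_index = defaultdict(list)
    value_index = defaultdict(list)
    for pos, (item, value) in enumerate(records):
        item_index[item].append(pos)
        value_index[value].append(pos)
    return item_index, value_index

def search(item_index, value_index, item, value):
    """Positions where the given item appears together with the given value."""
    item_pos = set(item_index.get(item, []))
    value_pos = set(value_index.get(value, []))
    return sorted(item_pos & value_pos)

records = [
    ("Assets", "1000"),
    ("Liabilities", "400"),
    ("Assets", "400"),
    ("Equity", "600"),
]
item_index, value_index = build_index(records)
hits = search(item_index, value_index, "Assets", "400")
```

A combined condition is answered by intersecting two position lists, so the data itself does not need to be scanned.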
Abstract:
An information processing apparatus (100) conducts lexical analysis on an interpreter-type source code and compresses the source code, on which the lexical analysis has been conducted, by using an internal-code correspondence table (410) that associates an internal code and a compression code. When an execution command of an interpreter is received for the compressed source code, the information processing apparatus (100) converts the compressed source code into an internal code in accordance with an internal-code correspondence table (410) and sequentially executes processing in accordance with the converted internal code, whereby when an interpreter-type source code is stored in a compressed state, the execution speed of the interpreter for the source code is improved.
Abstract:
[Problem to be solved] An object in one aspect of an embodiment of the invention is to suppress the noise generated when targets are narrowed down during a string search performed on document data. [Solution] According to an aspect of an embodiment, a computer switches, in accordance with whether a document element that has a predetermined number of child elements is present in a document file, the control that determines in which of a plurality of blocks data in the document file is to be included. The control is switched between control performed for each document element in the hierarchy of the child elements and control performed for each document element in the hierarchy of that document element or in a higher hierarchy. The computer then divides the document file into the plurality of blocks and generates, for each piece of data obtained by the division, index information that indicates whether the piece of data includes predetermined character information.