Abstract:
A question-answering device that reduces the influence of noise on answer generation and can generate highly accurate answers includes: a memory configured to normalize vector representations of the answers included in a set of answers extracted from a prescribed background knowledge source for each of a plurality of mutually different questions, and to store the results as normalized vectors; and a key-value memory access unit responsive to application of a question vector derived from a question, for accessing the memory and updating the question vector by using a degree of relatedness between the question vector and the plurality of questions together with the normalized vectors corresponding to the respective questions.
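The key-value access step above can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the stored questions act as keys, the normalized answer vectors as values, the "degree of relatedness" is modeled as a softmax over dot products, and all names, dimensions, and the additive update are assumptions.

```python
import numpy as np

def normalize(v):
    # L2-normalize an answer vector before storing it in the memory
    return v / np.linalg.norm(v)

def key_value_update(q, question_keys, answer_values):
    """One memory-access step: weight the normalized answer vectors by the
    question vector's relatedness to each stored question, then update q."""
    scores = question_keys @ q                  # degree of relatedness
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                    # softmax over stored questions
    read = weights @ answer_values              # weighted normalized values
    return q + read                             # updated question vector

keys = np.random.rand(5, 8)                     # 5 stored questions, dim 8
values = np.stack([normalize(v) for v in np.random.rand(5, 8)])
q = np.random.rand(8)
q_new = key_value_update(q, keys, values)
```

Normalizing the stored answer vectors is what limits the influence of any single noisy answer on the read result.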
Abstract:
A text classifier 90 for answer identification identifies, with high accuracy, an answer candidate to a question by effectively using background knowledge related to the question. The text classifier includes: a BERT (Bidirectional Encoder Representations from Transformers) model receiving a question and an answer candidate as inputs; a knowledge integration transformer receiving the output of BERT as an input; a background knowledge representation generator receiving a question and an answer as inputs and generating a group of background knowledge representation vectors for the question; and a vector converter converting the question and the answer candidate to embedded vectors and inputting them to the background knowledge representation generator. The knowledge integration transformer receives the group of background knowledge representation vectors as attention and outputs a label indicating whether the answer candidate includes the correct answer to the question.
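The core idea of feeding background knowledge vectors "as attention" can be sketched in miniature. This is a hypothetical single-head, single-vector attention step in NumPy; the function name, dimensions, and residual update are assumptions, not the patent's architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def knowledge_attention(h, knowledge_vecs):
    """One attention step of the knowledge-integration idea: an encoder
    state h attends over the background-knowledge representation vectors
    and is updated with their weighted sum."""
    weights = softmax(knowledge_vecs @ h)       # attention over knowledge
    return h + weights @ knowledge_vecs         # knowledge-integrated state

h = np.random.rand(6)                           # stand-in for a BERT output state
K = np.random.rand(4, 6)                        # 4 background-knowledge vectors
h2 = knowledge_attention(h, K)
```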
Abstract:
[Object] To provide a device capable of efficiently collecting contradictory expressions in units smaller than a sentence.
[Solution] A contradictory expression collecting device includes: a first-stage contradiction pattern classifying unit extracting pattern pairs consisting of mutually contradictory patterns by machine learning, using as training data pattern pairs consisting of patterns of the form "subject X predicate object Y"; an additional contradiction pattern pair deriving unit 130 deriving new pattern pairs by rewriting one pattern of each extracted pair using an entailment relation; a training data expanding unit expanding the training data by adding those newly derived pattern pairs that very likely consist of mutually contradictory patterns; and an SVM 142 performing a second-stage classification that separates given pattern pairs into pairs consisting of mutually contradictory patterns and other pairs, based on machine learning using the expanded training data.
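The expansion step can be sketched as follows. This is a toy illustration, assuming patterns are `("X", predicate, "Y")` triples and using an invented one-entry entailment lexicon; the real device derives entailments from linguistic resources, not a hand-written dictionary.

```python
# Toy entailment lexicon: "X purchases Y" entails "X buys Y" (illustrative).
ENTAILS = {"purchase": "buy"}

def derive_additional_pairs(contradiction_pairs):
    """Rewrite one pattern of each contradictory pair via an entailment
    relation to derive new candidate contradiction pairs."""
    derived = []
    for left, right in contradiction_pairs:
        x, pred, y = left
        if pred in ENTAILS:
            derived.append(((x, ENTAILS[pred], y), right))
    return derived

seed = [(("X", "purchase", "Y"), ("X", "sell", "Y"))]
expanded = seed + derive_additional_pairs(seed)
```

Only derived pairs judged highly likely to be contradictory would be kept; the sketch omits that filtering step.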
Abstract:
A training data generator and a training device include: a supposed input storage storing a plurality of supposed inputs assumed as inputs to a dialogue apparatus; an expanded causality DB storing a plurality of causality expressions; a training data preparing unit extracting, for each of the supposed inputs stored in the supposed input storage, a causality expression having a prescribed relation to that supposed input from the plurality of causality expressions, forming a training data sample having the supposed input as an input and the extracted causality expression as an answer, and storing it in a training data storage; and a training unit training a response generating neural network, designed to generate an output sentence for a natural-language input sentence, by using the training data samples stored in the training data storage.
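The pairing performed by the training data preparing unit can be sketched as below. The "prescribed relation" is modeled here simply as sharing a word with the cause part; that matching criterion, the DB format, and all names are assumptions for illustration.

```python
def prepare_training_data(supposed_inputs, causality_db):
    """For each supposed input, extract a causality expression related to
    it (here: sharing a word with the cause part) and form an
    (input, answer) training sample."""
    samples = []
    for inp in supposed_inputs:
        words = set(inp.split())
        for cause, effect in causality_db:
            if words & set(cause.split()):
                samples.append((inp, f"{cause} -> {effect}"))
                break
    return samples

db = [("heavy rain falls", "rivers flood"), ("prices rise", "demand drops")]
samples = prepare_training_data(["what happens when heavy rain falls"], db)
```

The resulting samples would then be fed to the response generating network as (input sentence, answer) training pairs.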
Abstract:
A context analysis apparatus includes an analysis control unit detecting a predicate whose subject is omitted together with antecedent candidates for that subject, and an anaphora/ellipsis analysis unit determining the word to be identified. The anaphora/ellipsis analysis unit includes: word vector generating units generating a plurality of different types of word vectors from the sentences for the antecedent candidates; a convolutional neural network receiving a word vector as an input and trained to output a score indicating the probability of each antecedent candidate being the omitted word; and a list storage unit and an identification unit determining the antecedent candidate having the highest score. The word vectors include a plurality of word vectors each extracted at least by using the object of analysis and the character sequences of the entire sentences other than the candidates. Similar processing is also possible for other words, such as referring expressions.
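The identification unit's selection step can be sketched as follows. The trained CNN is replaced by an arbitrary linear scorer, and candidate names, vector dimensions, and function names are all invented for illustration.

```python
import numpy as np

# Toy stand-in for the trained CNN: a fixed linear scorer over word vectors.
rng = np.random.default_rng(0)
W = rng.standard_normal(4)

def cnn_score(word_vec):
    # Unnormalized score standing in for the CNN's probability output
    return float(W @ word_vec)

def identify_antecedent(candidates):
    """candidates: list of (word, vector). Returns the word whose vector
    receives the highest score, as the identification unit would."""
    best = max(candidates, key=lambda c: cnn_score(c[1]))
    return best[0]

cands = [("Alice", rng.standard_normal(4)), ("Bob", rng.standard_normal(4))]
winner = identify_antecedent(cands)
```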
Abstract:
An annotation data generation assisting system includes: an input/output device receiving input through an interactive process; a morphological analysis system and a dependency parsing system performing morphological analysis and dependency parsing on text data in a text archive; first to fourth candidate generating units detecting a zero anaphor or a referring expression in the dependency relations of a predicate in a sequence of morphemes, identifying its position as an object of annotation, and estimating candidate expressions to be inserted by using language knowledge; a candidate DB storing the estimated candidates; and an interactive annotation device reading annotation candidates from the candidate DB and annotating a candidate selected through the interactive process on the input/output device.
Abstract:
A causality recognizing apparatus includes: a candidate vector generating unit configured to receive a causality candidate and generate a candidate vector representing the word sequence forming the candidate; a context vector generating unit generating a context vector representing the context in which the noun phrases of the cause and effect parts of the causality candidate appear; a binary pattern vector generating unit, an answer vector generating unit, and a related passage vector generating unit generating word vectors representing background knowledge for determining whether there is causality between the noun phrase in the cause part and the noun phrase in the effect part; and a multicolumn convolutional neural network trained in advance to receive these word vectors and determine whether or not the causality candidate has causality.
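The multicolumn combination can be sketched in miniature: each input vector passes through its own "column" before the per-column features are concatenated and scored. The linear+ReLU columns, sigmoid output, and all dimensions here are stand-ins, not the patented network.

```python
import numpy as np

rng = np.random.default_rng(1)

def column(x, w):
    # One 'column': a linear map + ReLU standing in for a conv column
    return np.maximum(w @ x, 0.0)

# Separate columns for the candidate, context, and background-knowledge vectors.
dims = 6
Ws = [rng.standard_normal((4, dims)) for _ in range(3)]
inputs = [rng.standard_normal(dims) for _ in range(3)]

features = np.concatenate([column(x, w) for x, w in zip(inputs, Ws)])
w_out = rng.standard_normal(features.size)
score = 1 / (1 + np.exp(-(w_out @ features)))   # causality probability
has_causality = score > 0.5
```

Keeping one column per knowledge source lets each source be encoded independently before the joint causality decision.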
Abstract:
A dialogue system includes: a question generating unit receiving an input sentence from a user and generating a question from an expression included in the input sentence, by using a dependency relation; an answer obtaining unit inputting the question generated by the question generating unit to a question-answering system and obtaining an answer to the question from the question-answering system; and an utterance generating unit generating an output sentence for the input sentence, based on the answer obtained by the answer obtaining unit.
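The three-unit pipeline can be sketched end to end. The question template, the toy QA system, and the reply wording are all invented; the real system builds the question from dependency relations rather than a fixed template.

```python
def generate_question(input_sentence):
    """Toy question generator: wrap the user's expression in a 'why'
    question (the real unit uses dependency relations, not a template)."""
    return f"Why does {input_sentence.rstrip('.')}?"

def toy_qa(question):
    # Stand-in for an external question-answering system
    return "warm seawater strengthens typhoons"

def dialogue_turn(input_sentence, qa_system):
    question = generate_question(input_sentence)   # question generating unit
    answer = qa_system(question)                   # answer obtaining unit
    return f"I heard that {answer}."               # utterance generating unit

reply = dialogue_turn("typhoons get stronger", toy_qa)
```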
Abstract:
A program trains a representation generator that generates a representation of the answer part included in a passage, in order to classify whether the passage is related to an answer. The program causes a computer to operate as: a fake representation generator responsive to a question and a passage for outputting a fake representation representing the answer part of the passage; a real representation generator outputting, for the question and a core answer, a real representation representing the core answer in the same format as the fake representation; a discriminator discriminating whether each of the fake and real representations is real or fake; and a generative adversarial network unit training the discriminator and the fake representation generator through a generative adversarial network such that misjudgment of the fake representation is maximized while misjudgment of the real representation is minimized.
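The opposing objectives can be made concrete with the standard GAN cross-entropy losses. The discriminator probabilities below are fixed toy numbers; in the actual training the generator and discriminator would be updated alternately on these losses.

```python
import numpy as np

def bce(pred, target):
    # Binary cross-entropy for a single discriminator probability
    return -(target * np.log(pred) + (1 - target) * np.log(1 - pred))

# Toy discriminator outputs for a real representation and a fake one.
p_real, p_fake = 0.9, 0.3

# Discriminator objective: judge real representations real, fake ones fake
# (i.e., minimize its misjudgment of the real representation).
d_loss = bce(p_real, 1.0) + bce(p_fake, 0.0)

# Generator objective: make the discriminator misjudge the fake as real
# (i.e., maximize the discriminator's error on the fake representation).
g_loss = bce(p_fake, 1.0)
```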
Abstract:
A summary generating apparatus includes: a text storage device storing text together with information indicating a portion to be focused on; word vector converters vectorizing each word of the text and adding to each vector an element indicating whether the word is focused on, thereby converting the text to a word vector sequence; an LSTM, implemented by a neural network performing sequence-to-sequence conversion, pre-trained by machine learning to output, in response to each of the word vectors of the word vector sequence input in a prescribed order, a summary of the text consisting of the words represented by the word sequence; and input units inputting each word vector of the word vector sequence in the prescribed order to the neural network.
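The focus-flag conversion step can be sketched as follows; the seq2seq LSTM itself is omitted. The toy embeddings, the flag encoding (1.0/0.0 appended as the last element), and all names are assumptions for illustration.

```python
import numpy as np

def to_focused_vectors(words, embeddings, focus):
    """Convert text to a word-vector sequence, appending one element per
    vector that flags whether the word is in the focused-on portion."""
    seq = []
    for w in words:
        flag = 1.0 if w in focus else 0.0
        seq.append(np.append(embeddings[w], flag))
    return np.stack(seq)

emb = {"storms": np.array([0.1, 0.2]),
       "hit": np.array([0.3, 0.4]),
       "Kyoto": np.array([0.5, 0.6])}
vecs = to_focused_vectors(["storms", "hit", "Kyoto"], emb, focus={"Kyoto"})
```

This sequence, fed to the pre-trained LSTM in order, lets the decoder condition the summary on the flagged portion.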