Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes obtaining partitioned training data for the neural network, wherein the partitioned training data comprises a plurality of training items each of which is assigned to a respective one of a plurality of partitions, wherein each partition is associated with a respective difficulty level; and training the neural network on each of the partitions in a sequence from a partition associated with an easiest difficulty level to a partition associated with a hardest difficulty level, wherein, for each of the partitions, training the neural network comprises: training the neural network on a sequence of training items that includes training items selected from the training items in the partition interspersed with training items selected from the training items in all of the partitions.
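The ordering described above can be sketched as a generator that visits partitions from easiest to hardest and, within each one, intersperses draws from the current partition with draws from the pool of all partitions. This is only an illustrative sketch: the function name `curriculum_sequence`, the parameters `steps_per_partition` and `mix_every`, and the fixed "every k-th item" mixing rule are assumptions, since the abstract does not say how the interspersing is scheduled.

```python
import random

def curriculum_sequence(partitions, steps_per_partition=6, mix_every=3, seed=0):
    """Yield (source, item) pairs for training.

    `partitions` is a list of lists of training items, assumed to be
    ordered from the easiest difficulty level to the hardest.
    """
    rng = random.Random(seed)
    # Pool of every training item, used for the interspersed draws.
    all_items = [item for part in partitions for item in part]
    for part in partitions:
        for step in range(1, steps_per_partition + 1):
            if step % mix_every == 0:
                # Assumed rule: every `mix_every`-th item comes from all partitions.
                yield "all", rng.choice(all_items)
            else:
                # Otherwise sample from the current partition only.
                yield "partition", rng.choice(part)

easy, hard = ["a1", "a2"], ["b1", "b2"]
schedule = list(curriculum_sequence([easy, hard]))
```

Each yielded item would be fed to one training step; a real system would likely sample minibatches rather than single items.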
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural translation systems with rare word processing. One of the methods trains a neural network translation system to track, for unknown words in target language sentences, their origin in the corresponding source language sentences. The method includes deriving alignment data from a parallel corpus, the alignment data identifying, in each pair of source and target language sentences in the parallel corpus, aligned source and target words; annotating the sentences in the parallel corpus according to the alignment data and a rare word model to generate a training dataset of paired source and target language sentences; and training a neural network translation model on the training dataset.
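The annotation step can be illustrated with a small sketch. The abstract does not specify the rare word model, so the scheme below is an assumption: out-of-vocabulary target words are replaced by a hypothetical token `<unk:d>`, where `d` is the offset from the target position to the aligned source position, and unaligned rare words become a plain `<unk>`. The function name `annotate_pair` and the dictionary form of the alignment are also assumptions.

```python
def annotate_pair(source, target, alignment, source_vocab, target_vocab):
    """Annotate one sentence pair for training.

    `alignment` maps a target word index to its aligned source word index.
    The `<unk:d>` token format is hypothetical, chosen for illustration.
    """
    # Rare source words are mapped to a generic unknown token.
    annotated_source = [w if w in source_vocab else "<unk>" for w in source]
    annotated_target = []
    for j, word in enumerate(target):
        if word in target_vocab:
            annotated_target.append(word)
        elif j in alignment:
            # Record where the rare word came from, relative to position j.
            annotated_target.append(f"<unk:{alignment[j] - j:+d}>")
        else:
            # Rare word with no aligned source word.
            annotated_target.append("<unk>")
    return annotated_source, annotated_target

src = ["the", "ecotax", "portico"]
tgt = ["le", "portique", "écotaxe"]
ann_src, ann_tgt = annotate_pair(
    src, tgt, alignment={0: 0, 1: 2, 2: 1},
    source_vocab={"the"}, target_vocab={"le"},
)
```

After translation, a post-processing step could use the recorded offset to look up the source word and translate or copy it, which is how the annotated pointers would be used under this assumed scheme.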
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory using reinforcement learning. One of the methods includes providing an output derived from the system output portion of the neural network output as a system output in the sequence of system outputs; selecting a memory access process from a predetermined set of memory access processes for accessing the external memory from the reinforcement learning portion of the neural network output; writing data to and reading data from locations in the external memory in accordance with the selected memory access process using the differentiable portion of the neural network output; and combining the data read from the external memory with a next system input in the sequence of system inputs to generate a next neural network input in the sequence of neural network inputs.
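One time step of the method above might be sketched as follows. Everything concrete here is an assumption made for illustration: the neural network output is modeled as a dictionary with `system`, `rl_logits`, `address_logits`, and `write_vector` entries, the predetermined set contains two hypothetical processes, the access process is sampled from a softmax over the reinforcement learning portion, and reads and writes use soft attention weights. The abstract does not specify any of these details, and the training procedure is not shown.

```python
import math
import random

# Hypothetical predetermined set of memory access processes.
ACCESS_PROCESSES = ("read_only", "write_then_read")

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def memory_step(nn_output, memory, next_input, rng):
    """Run one step; `memory` is a list of equal-length rows, updated in place."""
    # System output portion: passed through as the system output.
    system_output = nn_output["system"]
    # Reinforcement learning portion: sample a discrete access process.
    probs = softmax(nn_output["rl_logits"])
    process = rng.choices(ACCESS_PROCESSES, weights=probs, k=1)[0]
    # Differentiable portion: soft addressing weights over memory rows.
    weights = softmax(nn_output["address_logits"])
    if process == "write_then_read":
        for i, w in enumerate(weights):
            memory[i] = [(1 - w) * old + w * new
                         for old, new in zip(memory[i], nn_output["write_vector"])]
    # Weighted read across all rows.
    read = [sum(w * row[k] for w, row in zip(weights, memory))
            for k in range(len(memory[0]))]
    # Combine the read data with the next system input.
    next_nn_input = list(next_input) + read
    return system_output, process, next_nn_input

memory = [[0.0, 0.0], [1.0, 1.0]]
nn_output = {"system": [0.5], "rl_logits": [100.0, -100.0],
             "address_logits": [0.0, 0.0], "write_vector": [2.0, 2.0]}
result = memory_step(nn_output, memory, [3.0], random.Random(0))
```

Because choosing an access process is a discrete sample, it is not differentiable, which is presumably why that portion of the output is handled with reinforcement learning while the addressing weights and write vector can be trained by backpropagation.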