-
Publication No.: US11783811B2
Publication date: 2023-10-10
Application No.: US17031345
Filing date: 2020-09-24
Inventors: Gakuto Kurata, George Andrei Saon
Abstract: A computer-implemented method is provided for model training. The method includes training a second end-to-end neural speech recognition model that has a bidirectional encoder to output the same symbols from an output probability lattice of the second end-to-end neural speech recognition model as from an output probability lattice of a trained first end-to-end neural speech recognition model having a unidirectional encoder. The method also includes building a third end-to-end neural speech recognition model that has a unidirectional encoder by training the third end-to-end neural speech recognition model as a student by using the trained second end-to-end neural speech recognition model as a teacher in a knowledge distillation method.
-
Publication No.: US11741355B2
Publication date: 2023-08-29
Application No.: US16047526
Filing date: 2018-07-27
Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran
CPC classification: G06N3/08, G06N3/045, G10L15/02, G10L25/51, G10L2015/025
Abstract: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output among a plurality of soft label outputs from each teacher neural network among the plurality of teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.
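The abstract above does not say how the soft labels from several teachers are combined. As a minimal sketch, assuming the student loss is the plain average of the cross-entropies against each teacher's soft label, the idea could look like the following. The function names and the averaging rule are illustrative assumptions, not taken from the patent.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy of a prediction against a soft (probabilistic) target."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))

def multi_teacher_loss(student_logits, teacher_soft_labels):
    """Assumed combination rule: average the loss over all teachers."""
    student_probs = softmax(student_logits)
    losses = [cross_entropy(soft, student_probs) for soft in teacher_soft_labels]
    return sum(losses) / len(losses)

# Two hypothetical teachers that both favour class 0 for the same input.
teacher_soft_labels = [
    softmax([2.0, 0.5, 0.1]),
    softmax([1.5, 1.0, 0.2]),
]
loss_close = multi_teacher_loss([1.8, 0.7, 0.1], teacher_soft_labels)
loss_far = multi_teacher_loss([0.1, 0.2, 3.0], teacher_soft_labels)
```

A student whose output agrees with the teachers (`loss_close`) receives a smaller loss than one that favours a different class (`loss_far`).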
-
Publication No.: US11574181B2
Publication date: 2023-02-07
Application No.: US16406426
Filing date: 2019-05-08
Inventors: Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata
Abstract: Fusion of neural networks is performed by obtaining a first neural network and a second neural network. The first and the second neural networks are the result of a parent neural network subjected to different training. A similarity score is calculated between a first component of the first neural network and a corresponding second component of the second neural network. An interpolation weight is determined for the first and the second components by using the similarity score. A neural network parameter of the first component is updated based on the interpolation weight and a corresponding neural network parameter of the second component to obtain a fused neural network.
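The fusion steps in the abstract above can be sketched as follows. The abstract names neither the similarity measure nor the mapping from similarity score to interpolation weight, so this sketch assumes cosine similarity and a weight of `0.5 * max(0, score)`. Both choices, and the dictionary layout of the networks, are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Similarity score between two flattened parameter vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def fuse(first, second):
    """Fuse two networks given as {component name: flat parameter list}."""
    fused = {}
    for name, params_a in first.items():
        params_b = second[name]
        score = cosine_similarity(params_a, params_b)
        # Assumed rule: similar components are averaged almost evenly,
        # dissimilar components stay close to the first network.
        weight = 0.5 * max(0.0, score)
        fused[name] = [(1.0 - weight) * a + weight * b
                       for a, b in zip(params_a, params_b)]
    return fused

# Two hypothetical networks derived from the same parent.
first = {"layer1": [1.0, 0.0], "layer2": [1.0, 2.0]}
second = {"layer1": [0.0, 1.0], "layer2": [1.0, 2.0]}
fused = fuse(first, second)
```

Here `layer1` is orthogonal between the two networks, so the fused result keeps the first network's parameters, while the identical `layer2` is unchanged by the interpolation.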
-
Publication No.: US20220188622A1
Publication date: 2022-06-16
Application No.: US17118139
Filing date: 2020-12-10
Inventors: Toru Nagano, Takashi Fukuda, Gakuto Kurata
Abstract: An approach to identifying alternate soft labels for training a student model may be provided. A teacher model may generate a soft label for labeled training data. The training data can be an acoustic file for speech or a spoken natural language. A pool of soft labels previously generated by teacher models can be searched at the label level to identify soft labels that are similar to the generated soft label. The similar soft labels can have similar length or sequence at the word, phoneme, and/or state level. The identified similar soft labels can be used in conjunction with the generated soft label to train a student model.
-
Publication No.: US11195513B2
Publication date: 2021-12-07
Application No.: US15717194
Filing date: 2017-09-27
Inventors: Gakuto Kurata, Toru Nagano, Yuta Tsuboi
IPC classification: G10L15/02, G10L15/16, G10L15/22, G10L15/14, G10L15/187, G10L15/06, G06N3/04, G06N3/08, G06F40/129, G06F40/242, G10L13/08
Abstract: A technique for estimating phonemes for a word written in a different language is disclosed. A sequence of graphemes of a given word in a source language is received. The sequence of the graphemes in the source language is converted into a sequence of phonemes in the source language. One or more sequences of phonemes in a target language are generated from the sequence of the phonemes in the source language by using a neural network model. One sequence of phonemes in the target language is determined for the given word. Also, a technique for estimating graphemes of a word from phonemes in a different language is disclosed.
-
Publication No.: US11011156B2
Publication date: 2021-05-18
Application No.: US16381426
Filing date: 2019-04-11
Inventors: Gakuto Kurata
IPC classification: G10L15/06, G10L15/22, G10L15/183, G10L15/16
Abstract: A computer-implemented method for training a model is disclosed. The model is capable of retaining a history of one or more preceding elements and has a direction of prediction. The method includes obtaining a training sequence of elements. The method also includes splitting the training sequence into a plurality of parts. The method further includes selecting one part of the plurality of the parts depending on the direction of the model to generate modified training data. The method further includes training the model using the modified training data.
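The split-and-select step in the abstract above can be sketched as follows. The abstract does not state which part goes with which direction, so this sketch assumes that a forward model keeps the latter part and a backward model keeps the former part. The two-way random split and the function name are also assumptions for illustration.

```python
import random

def make_modified_example(sequence, direction, rng):
    """Split a training sequence in two and keep one part by model direction."""
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    if len(sequence) < 2:
        return list(sequence)
    split = rng.randrange(1, len(sequence))  # both parts are non-empty
    former, latter = list(sequence[:split]), list(sequence[split:])
    # Assumed rule: the forward model trains on the part whose preceding
    # history was cut off; the backward model on the mirrored case.
    return latter if direction == "forward" else former

words = ["please", "call", "me", "back", "tomorrow"]
rng = random.Random(0)
forward_part = make_modified_example(words, "forward", rng)
backward_part = make_modified_example(words, "backward", rng)
```

`forward_part` is always a proper suffix of `words` and `backward_part` a proper prefix. The resulting parts would then be fed to whatever training loop the model already uses.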
-
Publication No.: US10991363B2
Publication date: 2021-04-27
Application No.: US15804305
Filing date: 2017-11-06
Abstract: An apparatus, method, and computer program product for adapting an acoustic model to a specific environment are defined. An adapted model is obtained by adapting an original model to the specific environment using adaptation data, the original model being trained using training data and being used to calculate probabilities of context-dependent phones given an acoustic feature. Adapted probabilities are obtained by adapting original probabilities using the training data and the adaptation data, the original probabilities being trained using the training data and being prior probabilities of context-dependent phones. An adapted acoustic model is obtained from the adapted model and the adapted probabilities.
-
Publication No.: US10990902B2
Publication date: 2021-04-27
Application No.: US16582343
Filing date: 2019-09-25
Inventors: Gakuto Kurata
Abstract: A method, system, and computer program product for learning a recognition model for recognition processing are provided. The method includes preparing one or more examples for learning, each of which includes an input segment, an additional segment adjacent to the input segment, and an assigned label. The input segment and the additional segment are extracted from original training data. A classification model is trained, using the input segment and the additional segment in the examples, to initialize parameters of the classification model so that extended segments including the input segment and the additional segment are reconstructed from the input segment. Then, the classification model is tuned to predict a target label, using the input segment and the assigned label in the examples, based on the initialized parameters. At least a portion of the obtained classification model is included in the recognition model.
-
Publication No.: US20210110240A1
Publication date: 2021-04-15
Application No.: US17132631
Filing date: 2020-12-23
Inventors: Satoshi Hara, Gakuto Kurata, Shigeru Nakagawa, Seiji Takeda
Abstract: A computer-implemented method for training a neural network to capture a structural feature specific to a set of chemical compounds is disclosed. In the method, the computer system reads an expression describing a structure of the chemical compound for each chemical compound in the set and enumerates one or more combinations of a position and a type of a structural element appearing in the expression for each chemical compound in the set. The computer system also generates training data based on the one or more enumerated combinations for each chemical compound in the set. The training data includes one or more values with a length, each of which indicates whether or not a corresponding type of the structural element appears at a corresponding position for each combination. Furthermore, the computer system trains the neural network based on the training data for the set of the chemical compounds.
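The enumeration and encoding steps in the abstract above can be sketched as follows. This assumes the expression is a SMILES-like string and that each character counts as one structural element. The abstract does not fix either point, and the fixed `max_length` and the list of element types are hypothetical parameters.

```python
def enumerate_elements(expression):
    """List (position, type) pairs; each character is treated as one element."""
    return [(pos, ch) for pos, ch in enumerate(expression)]

def encode(expression, element_types, max_length):
    """Binary vector: entry (pos, type) is 1 if that type appears at that position."""
    index = {t: i for i, t in enumerate(element_types)}
    width = len(element_types)
    vector = [0] * (max_length * width)
    for pos, typ in enumerate_elements(expression):
        # Positions beyond max_length and unknown types are skipped here.
        if pos < max_length and typ in index:
            vector[pos * width + index[typ]] = 1
    return vector

# Ethanol written as "CCO", with a toy vocabulary of three element types.
element_types = ["C", "O", "="]
vector = encode("CCO", element_types, max_length=4)
# vector == [1, 0, 0,  1, 0, 0,  0, 1, 0,  0, 0, 0]
```

Each group of three entries corresponds to one position in the expression, and the final group is all zeros because "CCO" has only three characters.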
-
Publication No.: US20210082399A1
Publication date: 2021-03-18
Application No.: US16570022
Filing date: 2019-09-13
Inventors: Gakuto Kurata, Kartik Audhkhasi
Abstract: A technique for aligning spike timing of models is disclosed. A first model having a first architecture trained with a set of training samples is generated. Each training sample includes an input sequence of observations and an output sequence of symbols having a different length from the input sequence. Then, one or more second models are trained with the trained first model by minimizing a guide loss jointly with a normal loss for each second model, and a sequence recognition task is performed using the one or more second models. The guide loss evaluates dissimilarity in spike timing between the trained first model and each second model being trained.
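One plausible form of a guide loss as described above is sketched below. It assumes frame-level posteriors with a blank symbol at index 0, marks each frame where the first model's most likely symbol is non-blank as a spike, and rewards the second model for placing probability on the same symbol at the same frame. The mask construction and the exact loss formula are assumptions rather than the patented definition, and the normal loss it would be added to is omitted.

```python
def spike_mask(first_posteriors, blank=0):
    """1.0 where the first model's top symbol at a frame is non-blank, else 0.0."""
    mask = []
    for frame in first_posteriors:
        best = max(range(len(frame)), key=frame.__getitem__)
        mask.append([1.0 if (k == best and k != blank) else 0.0
                     for k in range(len(frame))])
    return mask

def guide_loss(first_posteriors, second_posteriors, blank=0):
    """Negative probability mass the second model puts on the first model's spikes."""
    mask = spike_mask(first_posteriors, blank)
    return -sum(m * p
                for mask_row, post_row in zip(mask, second_posteriors)
                for m, p in zip(mask_row, post_row))

# Three frames, three symbols (index 0 is blank); the first model spikes at frame 1.
first = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.9, 0.05, 0.05]]
aligned = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.9, 0.05, 0.05]]
shifted = [[0.2, 0.7, 0.1], [0.8, 0.1, 0.1], [0.9, 0.05, 0.05]]
loss_aligned = guide_loss(first, aligned)
loss_shifted = guide_loss(first, shifted)
```

A second model whose spike falls on the same frame (`aligned`) gets a lower guide loss than one whose spike is one frame early (`shifted`).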