-
公开(公告)号:US11741355B2
公开(公告)日:2023-08-29
申请号:US16047526
申请日:2018-07-27
发明人: Takashi Fukuda , Masayuki Suzuki , Osamu Ichikawa , Gakuto Kurata , Samuel Thomas , Bhuvana Ramabhadran
CPC分类号: G06N3/08 , G06N3/045 , G10L15/02 , G10L25/51 , G10L2015/025
摘要: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output among a plurality of soft label outputs from each teacher neural network among the plurality of teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.
-
公开(公告)号:US10726828B2
公开(公告)日:2020-07-28
申请号:US15609665
申请日:2017-05-31
发明人: Takashi Fukuda , Osamu Ichikawa , Gakuto Kurata , Masayuki Suzuki
IPC分类号: G10L15/06 , G10L15/05 , G10L21/003 , G10L25/78
摘要: A method, computer system, and a computer program product for generating a plurality of voice data having a particular speaking style is provided. The present invention may include preparing a plurality of original voice data corresponding to at least one word or at least one phrase is prepared. The present invention may also include attenuating a low frequency component and a high frequency component in the prepared plurality of original voice data. The present invention may then include reducing power at a beginning and an end of the prepared plurality of original voice data. The present invention may further include storing a plurality of resultant voice data obtained after the attenuating and the reducing.
-
公开(公告)号:US20200034703A1
公开(公告)日:2020-01-30
申请号:US16047526
申请日:2018-07-27
发明人: Takashi Fukuda , Masayuki Suzuki , Osamu Ichikawa , Gakuto Kurata , Samuel Thomas , Bhuvana Ramabhadran
摘要: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output among a plurality of soft label outputs from each teacher neural network among the plurality of teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.
-
公开(公告)号:US10460723B2
公开(公告)日:2019-10-29
申请号:US15992778
申请日:2018-05-30
摘要: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method includes estimating sound identification information from a neural network having periodic indications and components of a frequency spectrum of an audio signal data inputted thereto. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
-
公开(公告)号:US20180277104A1
公开(公告)日:2018-09-27
申请号:US15992778
申请日:2018-05-30
IPC分类号: G10L15/16 , G10L15/02 , G10L15/06 , G10L25/24 , G10L21/038
摘要: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method includes estimating sound identification information from a neural network having periodic indications and components of a frequency spectrum of an audio signal data inputted thereto. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
-
公开(公告)号:US20200058297A1
公开(公告)日:2020-02-20
申请号:US16665159
申请日:2019-10-28
摘要: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information from a neural network having periodic indications and components of a frequency spectrum of the audio signal data inputted thereto. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
-
公开(公告)号:US20200034702A1
公开(公告)日:2020-01-30
申请号:US16047287
申请日:2018-07-27
发明人: Takashi Fukuda , Masayuki Suzuki , Osamu Ichikawa , Gakuto Kurata , Samuel Thomas , Bhuvana Ramabhadran
摘要: A student neural network may be trained by a computer-implemented method, including: selecting a teacher neural network among a plurality of teacher neural networks, inputting an input data to the selected teacher neural network to obtain a soft label output generated by the selected teacher neural network, and training a student neural network with at least the input data and the soft label output from the selected teacher neural network.
-
公开(公告)号:US10546238B2
公开(公告)日:2020-01-28
申请号:US16378696
申请日:2019-04-09
发明人: Takashi Fukuda , Osamu Ichikawa
摘要: A technique for training a neural network including an input layer, one or more hidden layers and an output layer, in which the trained neural network can be used to perform a task such as speech recognition. In the technique, a base of the neural network having at least a pre-trained hidden layer is prepared. A parameter set associated with one pre-trained hidden layer in the neural network is decomposed into a plurality of new parameter sets. The number of hidden layers in the neural network is increased by using the plurality of the new parameter sets. Pre-training for the neural network is performed.
-
公开(公告)号:US20200005769A1
公开(公告)日:2020-01-02
申请号:US16019676
申请日:2018-06-27
发明人: Osamu Ichikawa , Takashi Fukuda
摘要: A method is provided for training a neural network-based (NN-based) acoustic model. The method includes receiving, by a processor, the neural network-based (NN-based) acoustic model, trained by a one-hot scheme and having an input layer, a set of middle layers, and an original output layer. At least each of the middle layers subsequent to a first one of the middle layers have trained parameters. The method further includes stacking, by the processor, a new output layer on the original output layer of the NN-based acoustic model to form a new NN-based acoustic model. The new output layer has a same size as the original output layer. The method also includes retraining, by the processor, only the new output layer and the original output layer of the new NN-based acoustic model in the one-hot scheme, with the trained parameters of middle layers subsequent to at least the first one being fixed.
-
公开(公告)号:US10373607B2
公开(公告)日:2019-08-06
申请号:US15621778
申请日:2017-06-13
发明人: Takashi Fukuda , Osamu Ichikawa , Futoshi Iwama
IPC分类号: G10L13/08 , G10L15/01 , G10L15/193
摘要: A method, for testing words defined in a pronunciation lexicon used in an automatic speech recognition (ASR) system, is provided. The method includes: obtaining test sentences which can be accepted by a language model used in the ASR system. The test sentences cover words defined in the pronunciation lexicon. The method further includes obtaining variations of speech data corresponding to each test sentence, and obtaining a plurality of texts by recognizing the variations of speech data, or a plurality of texts generated by recognizing the variation of speech data. The method also includes constructing a word graph, using the plurality of texts, for each test sentence, where each word in the word graph corresponds to each word defined in the pronunciation lexicon; and determining whether or not all or parts of words in a test sentence are present in a path of the word graph derived from the test sentence.
-
-
-
-
-
-
-
-
-