-
Publication No.: US12027153B2
Publication date: 2024-07-02
Application No.: US17580846
Filing date: 2022-01-21
Inventors: Takashi Fukuda, Tohru Nagano
CPC classification: G10L15/02, G06F7/24, G10L15/063, G10L2015/025
Abstract: A computer-implemented method for preparing training data for a speech recognition model is provided, including obtaining a plurality of sentences from a corpus, dividing each phoneme in each sentence of the plurality of sentences into three hidden states, calculating, for each sentence of the plurality of sentences, a score based on a variation in duration of the three hidden states of each phoneme in the sentence, and sorting the plurality of sentences by using the calculated scores.
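The scoring-and-sorting step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the "variation in duration" is measured or in which direction the sentences are sorted, so the population variance, the per-sentence mean, the ascending order, and all names below (`duration_variation_score`, `sort_sentences`, the toy corpus) are assumptions made for illustration only.

```python
from statistics import pvariance

def duration_variation_score(sentence):
    """Score one sentence.

    `sentence` is a list of phonemes; each phoneme is a tuple holding the
    durations (in frames) of its three hidden states.
    Assumption: the score is the mean, over phonemes, of the population
    variance of the three state durations.
    """
    per_phoneme = [pvariance(states) for states in sentence]
    return sum(per_phoneme) / len(per_phoneme)

def sort_sentences(corpus):
    """Sort (name, sentence) pairs by ascending score (assumed direction)."""
    return sorted(corpus, key=lambda item: duration_variation_score(item[1]))

# Hypothetical toy corpus: three sentences with two phonemes each.
corpus = [
    ("sent_a", [(3, 3, 3), (2, 2, 2)]),  # perfectly even state durations
    ("sent_b", [(1, 8, 1), (2, 6, 1)]),  # very uneven state durations
    ("sent_c", [(2, 3, 4), (3, 3, 3)]),  # mildly uneven state durations
]
ranked = [name for name, _ in sort_sentences(corpus)]
print(ranked)
```

With this assumed score, sentences whose state durations are even come first and those with strongly uneven durations come last.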
-
Publication No.: US11741355B2
Publication date: 2023-08-29
Application No.: US16047526
Filing date: 2018-07-27
Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran
CPC classification: G06N3/08, G06N3/045, G10L15/02, G10L25/51, G10L2015/025
Abstract: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output among a plurality of soft label outputs from each teacher neural network among the plurality of teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.
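As a rough illustration of training against several teachers' soft labels, the sketch below computes a student loss for one input. The abstract does not specify how the soft labels are combined, so averaging the cross-entropy over teachers is an assumption, and the function names and logits are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_loss(student_logits, soft_labels):
    """Cross-entropy of the student's distribution against each teacher's
    soft label, averaged over teachers (assumed combination rule)."""
    student_probs = softmax(student_logits)
    losses = [
        -sum(t * math.log(s) for t, s in zip(label, student_probs))
        for label in soft_labels
    ]
    return sum(losses) / len(losses)

# The same input is fed to every teacher; here the teachers are stood in
# for by fixed, hypothetical logits over three classes.
teacher_logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
soft_labels = [softmax(logits) for logits in teacher_logits]

loss_close = soft_label_loss([1.8, 0.7, -0.8], soft_labels)  # student agrees
loss_far = soft_label_loss([-1.0, 0.0, 2.0], soft_labels)    # student disagrees
print(loss_close < loss_far)
```

A training loop would minimize this loss with respect to the student's parameters; the gradient step is omitted here.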
-
Publication No.: US11574181B2
Publication date: 2023-02-07
Application No.: US16406426
Filing date: 2019-05-08
Inventors: Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata
Abstract: Fusion of neural networks is performed by obtaining a first neural network and a second neural network. The first and the second neural networks are the result of a parent neural network subjected to different training. A similarity score is calculated for a first component of the first neural network and a corresponding second component of the second neural network. An interpolation weight is determined for the first and the second components by using the similarity score. A neural network parameter of the first component is updated based on the interpolation weight and a corresponding neural network parameter of the second component to obtain a fused neural network.
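The fusion procedure can be sketched as follows. The abstract names neither the similarity measure nor the mapping from similarity to interpolation weight, so cosine similarity and the rule `w = 0.5 * max(0, similarity)` are assumptions chosen for illustration, as are the layer names and the dictionary representation of a network.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two flattened parameter vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def fuse(first, second):
    """Fuse two networks component by component.

    Assumed rule: similar components are averaged (weight up to 0.5 on the
    second network); dissimilar components keep the first network's values.
    """
    fused = {}
    for name, params_a in first.items():
        params_b = second[name]
        similarity = cosine_similarity(params_a, params_b)
        w = 0.5 * max(0.0, similarity)  # interpolation weight (assumption)
        fused[name] = [(1 - w) * x + w * y for x, y in zip(params_a, params_b)]
    return fused

# Two hypothetical networks derived from the same parent.
net_a = {"layer1": [1.0, 0.0], "layer2": [1.0, 2.0]}
net_b = {"layer1": [0.0, 1.0], "layer2": [3.0, 6.0]}
fused = fuse(net_a, net_b)
print(fused)
```

In this toy case `layer1` is orthogonal between the two networks and is left unchanged, while `layer2` points in the same direction in both and is averaged.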
-
Publication No.: US20220414448A1
Publication date: 2022-12-29
Application No.: US17356907
Filing date: 2021-06-24
Inventors: Takashi Fukuda, Samuel Thomas
Abstract: Methods and systems for training a neural network include training language-specific teacher models using different respective source language datasets. A student model is trained, using the different respective source language datasets and soft labels generated by the language-specific teacher models, including shuffling the source language datasets and shuffling weights of language-dependent layers in language-specific parts of the student model. Weights of language-independent layers of the student model are copied to language-independent layers of a target model to initialize the language-independent layers of the target model. The target model is trained with a target language dataset.
-
Publication No.: US20220188622A1
Publication date: 2022-06-16
Application No.: US17118139
Filing date: 2020-12-10
Inventors: Toru Nagano, Takashi Fukuda, Gakuto Kurata
Abstract: An approach to identifying alternate soft labels for training a student model may be provided. A teaching model may generate a soft label for labeled training data. The training data can be an acoustic file for speech or a spoken natural language. A pool of soft labels previously generated by teacher models can be searched at the label level to identify soft labels that are similar to the generated soft label. The similar soft labels can have similar length or sequence at the word, phoneme, and/or state level. The identified similar soft labels can be used in conjunction with the generated soft label to train a student model.
-
Publication No.: US20220180206A1
Publication date: 2022-06-09
Application No.: US17116117
Filing date: 2020-12-09
Inventors: Takashi Fukuda
Abstract: Methods and systems for training a neural network include clustering a full set of training data samples into specialized training clusters. Specialized teacher neural networks are trained using respective specialized training clusters of the specialized training clusters. Soft labels are generated for the full set of training data samples using the specialized teacher neural networks. A student model is trained using the full set of training data samples, the specialized training clusters, and the soft labels.
-
Publication No.: US10726828B2
Publication date: 2020-07-28
Application No.: US15609665
Filing date: 2017-05-31
Inventors: Takashi Fukuda, Osamu Ichikawa, Gakuto Kurata, Masayuki Suzuki
IPC classification: G10L15/06, G10L15/05, G10L21/003, G10L25/78
Abstract: A method, computer system, and a computer program product for generating a plurality of voice data having a particular speaking style is provided. The present invention may include preparing a plurality of original voice data corresponding to at least one word or at least one phrase. The present invention may also include attenuating a low frequency component and a high frequency component in the prepared plurality of original voice data. The present invention may then include reducing power at a beginning and an end of the prepared plurality of original voice data. The present invention may further include storing a plurality of resultant voice data obtained after the attenuating and the reducing.
-
Publication No.: US20200075042A1
Publication date: 2020-03-05
Application No.: US16116042
Filing date: 2018-08-29
Inventors: Masayuki Suzuki, Takashi Fukuda, Toru Nagano
Abstract: A technique for detecting a music segment in an audio signal is disclosed. A time window is set for each section in an audio signal. A maximum and a statistic of the audio signal within the time window are calculated. A density index is computed for the section using the maximum and the statistic. The density index is a measure of the statistic relative to the maximum. The section is estimated as a music segment based, at least in part, on a condition with respect to the density index.
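A minimal sketch of the density-index idea follows. The abstract leaves the statistic, the window layout, and the decision condition open; here the statistic is assumed to be the mean absolute amplitude, each window is taken to coincide with one non-overlapping section, and a section is flagged as music when the index meets a threshold of 0.5. These choices, the function names, and the sample values are all hypothetical.

```python
def density_index(samples):
    """Ratio of the mean absolute amplitude (assumed statistic) to the
    maximum absolute amplitude within one window."""
    magnitudes = [abs(s) for s in samples]
    peak = max(magnitudes)
    if peak == 0:
        return 0.0  # silent window: no meaningful density
    return (sum(magnitudes) / len(magnitudes)) / peak

def detect_music(signal, window, threshold=0.5):
    """Flag each non-overlapping window whose density index reaches the
    threshold (assumed condition and threshold value)."""
    flags = []
    for start in range(0, len(signal) - window + 1, window):
        section = signal[start:start + window]
        flags.append(density_index(section) >= threshold)
    return flags

steady = [0.8, -0.9, 0.85, -0.8]  # consistently loud: high density
bursty = [0.0, 0.9, 0.0, 0.05]    # one isolated peak: low density
flags = detect_music(steady + bursty, window=4)
print(flags)
```

The intuition behind the assumed direction is that sustained signals keep their average level close to their peak, whereas signals with pauses and bursts do not.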
-
Publication No.: US20200034703A1
Publication date: 2020-01-30
Application No.: US16047526
Filing date: 2018-07-27
Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran
Abstract: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output among a plurality of soft label outputs from each teacher neural network among the plurality of teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.
-
Publication No.: US10460723B2
Publication date: 2019-10-29
Application No.: US15992778
Filing date: 2018-05-30
Abstract: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method includes estimating sound identification information from a neural network having periodic indications and components of a frequency spectrum of audio signal data inputted thereto. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
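The initial isolation step described at the end of this abstract can be sketched as a weight-matrix initialization. The input ordering (spectrum components first, periodic indications last), the random initialization range, and the function and parameter names are assumptions for illustration; only the zeroing of weights between the first nodes and the periodic-indication inputs comes from the abstract.

```python
import random

def init_first_layer(n_spectrum, n_periodic, n_first, n_second, seed=0):
    """Build a first-layer weight matrix (one row per node, one column per
    input) and zero the weights that connect the first nodes to the
    periodic-indication inputs."""
    rng = random.Random(seed)
    n_inputs = n_spectrum + n_periodic
    weights = [
        [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
        for _ in range(n_first + n_second)
    ]
    # Assumed layout: rows 0..n_first-1 are the first nodes, and columns
    # n_spectrum..n_inputs-1 are the periodic-indication inputs.
    for row in weights[:n_first]:
        for col in range(n_spectrum, n_inputs):
            row[col] = 0.0
    return weights

weights = init_first_layer(n_spectrum=4, n_periodic=2, n_first=3, n_second=2)
for row in weights:
    print([round(w, 3) for w in row])
```

The second nodes keep their random connections to every input, so at the start of training only they see the periodic indications.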