-
Publication No.: US11494433B2
Publication date: 2022-11-08
Application No.: US16406416
Filing date: 2019-05-08
Inventors: Yoshinori Kabeya, Toru Nagano, Masayuki Suzuki, Issei Yoshida
IPC classification: G06F16/683, G06F16/632, G06F16/635, G06F16/638, G10L15/22, G06F16/332
Abstract: A system and method for expanding a question and answer (Q&A) database. The method includes obtaining a set of Q&A documents and speech recognition results, each Q&A document in the set having an identifier, and each speech recognition result having an identifier common with the identifier of a relevant Q&A document, and adding one or more repetition parts extracted from the speech recognition results to a corresponding Q&A document in the set to generate an expanded set of Q&A documents for increasing Q&A document extraction accuracy.
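The expansion step in this abstract can be illustrated with a minimal sketch. The abstract does not define how a "repetition part" is detected, so the code below assumes, purely for illustration, that it is any word bigram occurring at least twice in a transcript. The function names `repeated_ngrams` and `expand_qa` and the dictionary layout are hypothetical.

```python
from collections import Counter


def repeated_ngrams(transcript, n=2):
    """Return word n-grams that occur at least twice, in first-seen order.

    Assumption: a "repetition part" is approximated by a repeated n-gram.
    """
    words = transcript.lower().split()
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(grams)
    seen = []
    for gram in grams:
        if counts[gram] >= 2 and gram not in seen:
            seen.append(gram)
    return seen


def expand_qa(qa_docs, asr_results, n=2):
    """Append repetition parts to the Q&A document sharing the same identifier.

    qa_docs: {identifier: document text}
    asr_results: list of (identifier, speech recognition transcript)
    """
    expanded = dict(qa_docs)
    for doc_id, transcript in asr_results:
        if doc_id not in expanded:
            continue  # no Q&A document shares this identifier
        extra = repeated_ngrams(transcript, n)
        if extra:
            expanded[doc_id] = expanded[doc_id] + " " + " ".join(extra)
    return expanded
```

Matching on the shared identifier is the only part taken directly from the abstract; the bigram heuristic stands in for whatever extraction the patent actually claims.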
-
Publication No.: US11227579B2
Publication date: 2022-01-18
Application No.: US16535829
Filing date: 2019-08-08
Inventors: Toru Nagano, Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata
IPC classification: G10L13/033, G10L15/18, G06F40/205, G06F40/284
Abstract: A technique for data augmentation for speech data is disclosed. Original speech data including a sequence of feature frames is obtained. A partially prolonged copy of the original speech data is generated by inserting one or more new frames into the sequence of the feature frames. The partially prolonged copy is output as augmented speech data for training an acoustic model.
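A minimal sketch of the frame-insertion idea follows. The abstract does not say where new frames go or how they are computed, so this example assumes each new frame is a copy of the preceding frame and that insertion positions are chosen at random. The function name `prolong` and the `insert_prob` parameter are hypothetical.

```python
import random


def prolong(frames, insert_prob=0.2, seed=None):
    """Return a partially prolonged copy of a sequence of feature frames.

    Assumption: after each original frame, a new frame (a copy of that
    frame) is inserted with probability insert_prob.
    """
    rng = random.Random(seed)
    out = []
    for frame in frames:
        out.append(frame)
        if rng.random() < insert_prob:
            out.append(list(frame))  # inserted frame: copy of the previous one
    return out
```

With `insert_prob=0.0` the sequence is returned unchanged, and with `insert_prob=1.0` every frame is doubled; values in between stretch only parts of the utterance.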
-
Publication No.: US11138965B2
Publication date: 2021-10-05
Application No.: US15801820
Filing date: 2017-11-02
Inventors: Gakuto Kurata, Toru Nagano, Yuta Tsuboi
IPC classification: G10L15/02, G10L15/14, G10L15/16, G10L15/22, G10L15/06, G06N3/04, G06N3/08, G06F40/129, G06F40/242, G10L15/187, G10L13/08
Abstract: A technique for estimating phonemes for a word written in a different language is disclosed. A sequence of graphemes of a given word in a source language is received. The sequence of the graphemes in the source language is converted into a sequence of phonemes in the source language. One or more sequences of phonemes in a target language are generated from the sequence of the phonemes in the source language by using a neural network model. One sequence of phonemes in the target language is determined for the given word. Also, a technique for estimating graphemes of a word from phonemes in a different language is disclosed.
-
Publication No.: US11037583B2
Publication date: 2021-06-15
Application No.: US16116042
Filing date: 2018-08-29
Inventors: Masayuki Suzuki, Takashi Fukuda, Toru Nagano
Abstract: A technique for detecting a music segment in an audio signal is disclosed. A time window is set for each section in an audio signal. A maximum and a statistic of the audio signal within the time window are calculated. A density index is computed for the section using the maximum and the statistic. The density index is a measure of the statistic relative to the maximum. The section is estimated as a music segment based, at least in part, on a condition with respect to the density index.
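The density index described above can be sketched as a ratio. The abstract names neither the statistic nor the condition, so this example assumes the statistic is the mean absolute amplitude, the time window coincides with a fixed-length section, and a section counts as music when the index reaches a threshold of 0.5. All three choices, and the function names, are illustrative assumptions.

```python
def density_index(window):
    """Statistic relative to the maximum within one time window.

    Assumption: the statistic is the mean absolute amplitude.
    """
    mags = [abs(x) for x in window]
    peak = max(mags, default=0.0)
    if peak == 0.0:
        return 0.0  # silent or empty window
    return (sum(mags) / len(mags)) / peak


def music_sections(signal, window_size, threshold=0.5):
    """Flag each fixed-length section whose density index meets the threshold."""
    flags = []
    for start in range(0, len(signal), window_size):
        window = signal[start:start + window_size]
        flags.append(density_index(window) >= threshold)
    return flags
```

A sustained signal keeps its average close to its peak and yields an index near 1, while a signal with isolated peaks yields a low index.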
-
Publication No.: US10250976B1
Publication date: 2019-04-02
Application No.: US15886092
Filing date: 2018-02-01
Inventors: Hiroshi Horii, Toru Nagano
Abstract: Methods and systems for controlling sensors include analyzing streams of sensor data from respective sensors to determine whether multiple streams from the plurality of streams share a context. All but one of the sensors associated with the multiple streams are deactivated to conserve battery power across the sensors.
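The deactivation rule can be sketched once each stream has been assigned a context. How the context is inferred from the sensor data is not described in the abstract, so the code below assumes a context label per sensor is already available and keeps the sensor with the lexicographically smallest identifier in each group. The function name and the input layout are hypothetical.

```python
from collections import defaultdict


def sensors_to_deactivate(contexts):
    """Return sensor ids to switch off, keeping one sensor per shared context.

    contexts: {sensor_id: context_label}, assumed to be derived elsewhere
    from the sensor data streams.
    """
    groups = defaultdict(list)
    for sensor_id, label in contexts.items():
        groups[label].append(sensor_id)
    off = []
    for ids in groups.values():
        off.extend(sorted(ids)[1:])  # keep the first id, deactivate the rest
    return sorted(off)
```

A sensor whose context is not shared with any other sensor is never deactivated, since its group has only one member.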
-
Publication No.: US20190096388A1
Publication date: 2019-03-28
Application No.: US15717194
Filing date: 2017-09-27
Inventors: Gakuto Kurata, Toru Nagano, Yuta Tsuboi
Abstract: A technique for estimating phonemes for a word written in a different language is disclosed. A sequence of graphemes of a given word in a source language is received. The sequence of the graphemes in the source language is converted into a sequence of phonemes in the source language. One or more sequences of phonemes in a target language are generated from the sequence of the phonemes in the source language by using a neural network model. One sequence of phonemes in the target language is determined for the given word. Also, a technique for estimating graphemes of a word from phonemes in a different language is disclosed.
-
Publication No.: US20170345415A1
Publication date: 2017-11-30
Application No.: US15678195
Filing date: 2017-08-16
Inventors: Gakuto Kurata, Toru Nagano, Masayuki Suzuki
IPC classification: G10L15/06, G10L15/02, G10L21/0208, G10L25/18, G10L25/24, G10L15/065, G10L15/187
CPC classification: G10L15/063, G10L15/02, G10L15/065, G10L15/187, G10L21/0208, G10L25/18, G10L25/24, G10L2015/025
Abstract: Embodiments include methods and systems for improving an acoustic model. Aspects include acquiring a first standard deviation value by calculating standard deviation of a feature from first training data and acquiring a second standard deviation value by calculating standard deviation of a feature from second training data acquired in a different environment from an environment of the first training data. Aspects also include creating a feature adapted to an environment where the first training data is recorded, by multiplying the feature acquired from the second training data by a ratio obtained by dividing the first standard deviation value by the second standard deviation value. Aspects further include reconstructing an acoustic model constructed using training data acquired in the same environment as the environment of the first training data using the feature adapted to the environment where the first training data is recorded.
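The scaling step in this abstract is simple arithmetic: each feature value from the second environment is multiplied by the ratio of the first standard deviation to the second. The sketch below applies it to a one-dimensional feature and assumes the population standard deviation; the abstract does not say which estimator is used, and the function name is hypothetical.

```python
from statistics import pstdev


def adapt_features(first_env, second_env):
    """Scale second-environment features to the spread of the first environment.

    Assumption: population standard deviation over a scalar feature.
    """
    sd_first = pstdev(first_env)
    sd_second = pstdev(second_env)
    if sd_second == 0:
        raise ValueError("second training data has zero standard deviation")
    ratio = sd_first / sd_second
    return [value * ratio for value in second_env]
```

After scaling, the adapted features have the same standard deviation as the first training data, which is what makes them usable for reconstructing a model built in that environment.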
-
Publication No.: US20220188622A1
Publication date: 2022-06-16
Application No.: US17118139
Filing date: 2020-12-10
Inventors: Toru Nagano, Takashi Fukuda, Gakuto Kurata
Abstract: An approach to identifying alternate soft labels for training a student model may be provided. A teacher model may generate a soft label for labeled training data. The training data can be an acoustic file for speech or a spoken natural language. A pool of soft labels previously generated by teacher models can be searched at the label level to identify soft labels that are similar to the generated soft label. The similar soft labels can have similar length or sequence at the word, phoneme, and/or state level. The identified similar soft labels can be used in conjunction with the generated soft label to train a student model.
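The pool search can be illustrated with a rough sketch. Soft labels are normally probability distributions per frame; for brevity this example assumes each soft label has already been reduced to its most likely phoneme sequence, and it measures sequence similarity with the standard library's `difflib.SequenceMatcher`. The similarity measure, the `min_ratio` threshold, and the function name are illustrative assumptions and are not taken from the patent.

```python
from difflib import SequenceMatcher


def similar_labels(generated, pool, min_ratio=0.8):
    """Return pool entries whose label sequence resembles the generated one.

    generated: list of phoneme symbols for the newly generated soft label
    pool: list of phoneme-symbol lists from previously generated soft labels
    """
    matches = []
    for candidate in pool:
        ratio = SequenceMatcher(None, generated, candidate).ratio()
        if ratio >= min_ratio:
            matches.append(candidate)
    return matches
```

The selected entries would then be used alongside the generated soft label when training the student model.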
-
Publication No.: US11195513B2
Publication date: 2021-12-07
Application No.: US15717194
Filing date: 2017-09-27
Inventors: Gakuto Kurata, Toru Nagano, Yuta Tsuboi
IPC classification: G10L15/02, G10L15/16, G10L15/22, G10L15/14, G10L15/187, G10L15/06, G06N3/04, G06N3/08, G06F40/129, G06F40/242, G10L13/08
Abstract: A technique for estimating phonemes for a word written in a different language is disclosed. A sequence of graphemes of a given word in a source language is received. The sequence of the graphemes in the source language is converted into a sequence of phonemes in the source language. One or more sequences of phonemes in a target language are generated from the sequence of the phonemes in the source language by using a neural network model. One sequence of phonemes in the target language is determined for the given word. Also, a technique for estimating graphemes of a word from phonemes in a different language is disclosed.
-
Publication No.: US11151449B2
Publication date: 2021-10-19
Application No.: US15878933
Filing date: 2018-01-24
Inventors: Masayuki Suzuki, Toru Nagano
Abstract: A method, computer program product, and apparatus for adapting a trained neural network having one or more batch normalization layers are provided. The method includes adapting only the one or more batch normalization layers using adaptation data. The method also includes adapting the whole of the neural network having the one or more adapted batch normalization layers, using the adaptation data.
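The two-stage schedule in this abstract can be sketched on a toy model. The example assumes a single weight `w` followed by a batch normalization layer with frozen statistics, which reduces to a scale `gamma` and shift `beta`. Stage 1 updates only `gamma` and `beta`; stage 2 updates every parameter. The model, the learning rate, and all names are hypothetical simplifications, not the patented procedure.

```python
def predict(p, x):
    # Batch normalization with frozen statistics reduces to a scale and shift.
    return p["gamma"] * (p["w"] * x) + p["beta"]


def loss(p, data):
    """Mean squared error over (input, target) pairs."""
    return sum((predict(p, x) - t) ** 2 for x, t in data) / len(data)


def adapt(p, data, trainable, lr=0.01, steps=200):
    """Run gradient descent, updating only the parameters named in trainable."""
    p = dict(p)
    n = len(data)
    for _ in range(steps):
        grads = {"w": 0.0, "gamma": 0.0, "beta": 0.0}
        for x, t in data:
            d = 2.0 * (predict(p, x) - t) / n
            grads["w"] += d * p["gamma"] * x
            grads["gamma"] += d * p["w"] * x
            grads["beta"] += d
        for name in trainable:
            p[name] -= lr * grads[name]
    return p


BN_PARAMS = ("gamma", "beta")
ALL_PARAMS = ("w", "gamma", "beta")

# Adaptation data for the toy example: targets follow 3x + 1.
data = [(x, 3.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
start = {"w": 1.0, "gamma": 1.0, "beta": 0.0}
stage1 = adapt(start, data, BN_PARAMS)    # adapt batch normalization only
stage2 = adapt(stage1, data, ALL_PARAMS)  # then adapt the whole network
```

In a real network the same schedule would typically be expressed by freezing all non-batch-normalization parameters during the first pass and unfreezing them for the second.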