Patent search: ap:("INTERNATIONAL BUSINESS MACHINES CORPORATION") AND inv:"Takashi Fukuda" (page 1)

1.

Granted patent
Data sorting for generating RNN-T models (status: in force)

Publication No.: US12027153B2

Publication date: 2024-07-02

Application No.: US17580846

Filing date: 2022-01-21

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Tohru Nagano

IPC classification: G10L15/02, G06F7/24, G10L15/06

CPC classification: G10L15/02, G06F7/24, G10L15/063, G10L2015/025

Abstract: A computer-implemented method for preparing training data for a speech recognition model is provided, including obtaining a plurality of sentences from a corpus, dividing each phoneme in each sentence of the plurality of sentences into three hidden states, calculating, for each sentence of the plurality of sentences, a score based on a variation in duration of the three hidden states of each phoneme in the sentence, and sorting the plurality of sentences by using the calculated scores.
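The scoring and sorting step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the variation in duration is measured or in which direction the sentences are sorted, so the choices here are assumptions: the score is the population standard deviation of the three state durations, averaged over the phonemes of a sentence, and sentences are sorted in ascending order. The names `duration_variation_score`, `sort_sentences`, and the toy corpus are hypothetical.

```python
from statistics import mean, pstdev


def duration_variation_score(phoneme_state_durations):
    """Average spread of the three hidden-state durations per phoneme.

    Each element is a (state1, state2, state3) tuple of durations in frames.
    Using the population standard deviation is an assumption.
    """
    return mean(pstdev(states) for states in phoneme_state_durations)


def sort_sentences(corpus):
    """Return sentence ids ordered from lowest to highest score."""
    return sorted(corpus, key=lambda sid: duration_variation_score(corpus[sid]))


# Hypothetical alignments: sentence id -> per-phoneme state durations.
corpus = {
    "utt_a": [(3, 3, 3), (4, 4, 4)],  # evenly divided phonemes
    "utt_b": [(1, 9, 2), (2, 8, 2)],  # strongly uneven durations
    "utt_c": [(3, 4, 3), (4, 5, 4)],  # mildly uneven durations
}
order = sort_sentences(corpus)
print(order)
```

In practice the state durations would come from a forced alignment of each sentence, which this sketch takes as given.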

2.

Granted patent
Training of student neural network with teacher neural networks (status: in force)

Publication No.: US11741355B2

Publication date: 2023-08-29

Application No.: US16047526

Filing date: 2018-07-27

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran

IPC classification: G06N3/08, G06N3/045, G10L25/51, G10L15/02

CPC classification: G06N3/08, G06N3/045, G10L15/02, G10L25/51, G10L2015/025

Abstract: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output among a plurality of soft label outputs from each teacher neural network among the plurality of teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.
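As a rough illustration of training against several teachers' soft labels, the sketch below computes a student loss for one input. The abstract does not state how the soft labels are combined, so averaging the per-teacher cross-entropies is an assumption, and the function names and logit values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted):
    """Cross-entropy of predicted probabilities against a soft target."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


def multi_teacher_loss(student_logits, teacher_logits_list):
    """Average cross-entropy between the student and each teacher's soft label."""
    student_probs = softmax(student_logits)
    soft_labels = [softmax(t) for t in teacher_logits_list]
    losses = [cross_entropy(label, student_probs) for label in soft_labels]
    return sum(losses) / len(losses)


# Two hypothetical teachers scoring the same (common) input over three classes.
teachers = [[2.0, 0.0, 0.0], [1.5, 0.2, 0.0]]
loss_good = multi_teacher_loss([2.0, 0.0, 0.0], teachers)  # agrees with teachers
loss_bad = multi_teacher_loss([0.0, 0.0, 2.0], teachers)   # disagrees
print(loss_good, loss_bad)
```

A student that agrees with the teachers gets the lower loss, which is the signal a training loop would minimize.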

3.

Granted patent
Fusion of neural networks (status: in force)

Publication No.: US11574181B2

Publication date: 2023-02-07

Application No.: US16406426

Filing date: 2019-05-08

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata

IPC classification: G06N3/08, G06N3/04

Abstract: Fusion of neural networks is performed by obtaining a first neural network and a second neural network. The first and the second neural networks are the result of a parent neural network subjected to different training. A similarity score is calculated for a first component of the first neural network and a corresponding second component of the second neural network. An interpolation weight is determined for the first and the second components by using the similarity score. A neural network parameter of the first component is updated based on the interpolation weight and a corresponding neural network parameter of the second component to obtain a fused neural network.
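The fusion steps described above might look like the following sketch, in which each component is a named layer with a flat list of parameters. The abstract does not specify the similarity measure or how the interpolation weight follows from it; cosine similarity and the mapping in `interpolation_weight` are assumptions, as are all names and values.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two flat parameter vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def interpolation_weight(similarity):
    """Hypothetical mapping: similar components are averaged (weight 0.5),
    dissimilar ones keep the first network's parameters (weight 0)."""
    return 0.5 * max(0.0, min(1.0, similarity))


def fuse(first, second):
    """Update each component of `first` by interpolating toward `second`."""
    fused = {}
    for name, p1 in first.items():
        p2 = second[name]
        w = interpolation_weight(cosine_similarity(p1, p2))
        fused[name] = [(1 - w) * a + w * b for a, b in zip(p1, p2)]
    return fused


# Two hypothetical networks derived from the same parent.
first = {"layer1": [1.0, 0.0], "layer2": [1.0, 2.0]}
second = {"layer1": [0.0, 1.0], "layer2": [3.0, 6.0]}
fused = fuse(first, second)
print(fused)
```

Here `layer1` is orthogonal between the two networks and is left unchanged, while `layer2` points in the same direction in both and is averaged.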

4.

Patent application
CROSS-LINGUAL KNOWLEDGE TRANSFER LEARNING (status: in force)

Publication No.: US20220414448A1

Publication date: 2022-12-29

Application No.: US17356907

Filing date: 2021-06-24

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Samuel Thomas

IPC classification: G06N3/08, G06N3/04, G06N20/20, G06F7/76

Abstract: Methods and systems for training a neural network include training language-specific teacher models using different respective source language datasets. A student model is trained, using the different respective source language datasets and soft labels generated by the language-specific teacher models, including shuffling the source language datasets and shuffling weights of language-dependent layers in language-specific parts of the student model. Weights of language-independent layers of the student model are copied to language-independent layers of a target model to initialize the language-independent layers of the target model. The target model is trained with a target language dataset.
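The weight-copy initialization at the end of this abstract can be sketched as follows. Models are represented as plain dictionaries from layer name to weights; the layer names and the helper `init_target_from_student` are hypothetical, and the teacher training, shuffling, and target training steps are not shown.

```python
import copy


def init_target_from_student(student, target, shared_layers):
    """Return a copy of `target` whose language-independent layers are
    initialized with the student's weights for those layers."""
    initialized = copy.deepcopy(target)
    for name in shared_layers:
        initialized[name] = copy.deepcopy(student[name])
    return initialized


# Hypothetical models: two language-independent layers plus one
# language-specific layer each.
student = {"shared_1": [0.1, 0.2], "shared_2": [0.3], "lang_en": [0.9]}
target = {"shared_1": [0.0, 0.0], "shared_2": [0.0], "lang_target": [0.5]}

initialized = init_target_from_student(student, target, ["shared_1", "shared_2"])
print(initialized)
```

Deep copies are used so that later training of the target model cannot modify the student's weights.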

5.

Patent application
ALTERNATIVE SOFT LABEL GENERATION (status: in force)

Publication No.: US20220188622A1

Publication date: 2022-06-16

Application No.: US17118139

Filing date: 2020-12-10

Applicant: International Business Machines Corporation

Inventors: Toru Nagano, Takashi Fukuda, Gakuto Kurata

IPC classification: G06N3/08, G06N5/02, G06K9/62, G10L15/16, G10L15/02

Abstract: An approach to identifying alternate soft labels for training a student model may be provided. A teaching model may generate a soft label for labeled training data. The training data can be an acoustic file for speech or a spoken natural language. A pool of soft labels previously generated by teacher models can be searched at the label level to identify soft labels that are similar to the generated soft label. The similar soft labels can have similar length or sequence at the word, phoneme, and/or state level. The identified similar soft labels can be used in conjunction with the generated soft label to train a student model.
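The pool search in this abstract could be sketched as below, working at the phoneme level. The abstract says only that similar soft labels have similar length or sequence; requiring equal length and a `difflib.SequenceMatcher` ratio above a threshold is an assumption, and the entry layout, the 0.6 threshold, and the function name are hypothetical.

```python
from difflib import SequenceMatcher


def find_alternative_soft_labels(generated, pool, min_ratio=0.6):
    """Return pool entries whose phoneme sequence has the same length as,
    and is similar in order to, the generated entry's sequence."""
    alternatives = []
    for entry in pool:
        same_length = len(entry["phonemes"]) == len(generated["phonemes"])
        ratio = SequenceMatcher(None, generated["phonemes"], entry["phonemes"]).ratio()
        if same_length and ratio >= min_ratio:
            alternatives.append(entry)
    return alternatives


# Hypothetical entries: a label sequence plus one soft-label vector per phoneme.
generated = {"phonemes": ["k", "ae", "t"],
             "soft_label": [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]}
pool = [
    {"phonemes": ["k", "ae", "p"], "soft_label": [[0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]},
    {"phonemes": ["d", "ao", "g"], "soft_label": [[0.5, 0.5], [0.4, 0.6], [0.3, 0.7]]},
    {"phonemes": ["k", "ae"], "soft_label": [[0.9, 0.1], [0.8, 0.2]]},
]
alternatives = find_alternative_soft_labels(generated, pool)
print([entry["phonemes"] for entry in alternatives])
```

The matched entries would then be used alongside the generated soft label when training the student.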

6.

Patent application
KNOWLEDGE DISTILLATION USING DEEP CLUSTERING (status: in force)

Publication No.: US20220180206A1

Publication date: 2022-06-09

Application No.: US17116117

Filing date: 2020-12-09

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda

IPC classification: G06N3/08, G06F16/28

Abstract: Methods and systems for training a neural network include clustering a full set of training data samples into specialized training clusters. Specialized teacher neural networks are trained using respective specialized training clusters of the specialized training clusters. Soft labels are generated for the full set of training data samples using the specialized teacher neural networks. A student model is trained using the full set of training data samples, the specialized training clusters, and the soft labels.

7.

Granted patent
Generation of voice data as data augmentation for acoustic model training (status: in force)

Publication No.: US10726828B2

Publication date: 2020-07-28

Application No.: US15609665

Filing date: 2017-05-31

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Osamu Ichikawa, Gakuto Kurata, Masayuki Suzuki

IPC classification: G10L15/06, G10L15/05, G10L21/003, G10L25/78

Abstract: A method, computer system, and a computer program product for generating a plurality of voice data having a particular speaking style is provided. The present invention may include preparing a plurality of original voice data corresponding to at least one word or at least one phrase. The present invention may also include attenuating a low frequency component and a high frequency component in the prepared plurality of original voice data. The present invention may then include reducing power at a beginning and an end of the prepared plurality of original voice data. The present invention may further include storing a plurality of resultant voice data obtained after the attenuating and the reducing.
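The two signal-processing steps in this abstract, attenuating low and high frequencies and reducing power at both ends, can be sketched as below. The abstract does not name a filter design or a fade shape; the one-pole filters, the linear fade, and all parameter values are assumptions chosen to keep the example dependency-free.

```python
def one_pole_lowpass(samples, alpha):
    """Simple one-pole low-pass filter; larger alpha passes more highs."""
    out, prev = [], 0.0
    for s in samples:
        prev = prev + alpha * (s - prev)
        out.append(prev)
    return out


def attenuate_low_and_high(samples, low_alpha=0.05, high_alpha=0.5):
    """Band-limit the signal: smooth away highs, then subtract the slow trend."""
    smoothed = one_pole_lowpass(samples, high_alpha)  # attenuates high frequencies
    slow = one_pole_lowpass(smoothed, low_alpha)      # tracks low-frequency content
    return [s - l for s, l in zip(smoothed, slow)]    # removes low frequencies


def fade_edges(samples, fade_len):
    """Reduce power at the beginning and end with a linear gain ramp."""
    out = list(samples)
    n = len(out)
    for i in range(min(fade_len, n // 2)):
        gain = i / fade_len
        out[i] *= gain
        out[n - 1 - i] *= gain
    return out


def augment(samples, fade_len=40):
    """Apply both steps to one utterance."""
    return fade_edges(attenuate_low_and_high(samples), fade_len)


# A constant (zero-frequency) input should be almost entirely removed.
signal = [1.0] * 400
out = augment(signal)
print(out[300])
```

A production version would more likely use a proper band-pass design, but the structure (filter, then fade, then store the result) is the same.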

8.

Patent application
DETECTION OF MUSIC SEGMENT IN AUDIO SIGNAL (status: under examination, published)

Publication No.: US20200075042A1

Publication date: 2020-03-05

Application No.: US16116042

Filing date: 2018-08-29

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Masayuki Suzuki, Takashi Fukuda, Toru Nagano

IPC classification: G10L25/81, G10L25/21

Abstract: A technique for detecting a music segment in an audio signal is disclosed. A time window is set for each section in an audio signal. A maximum and a statistic of the audio signal within the time window are calculated. A density index is computed for the section using the maximum and the statistic. The density index is a measure of the statistic relative to the maximum. The section is estimated as a music segment based, at least in part, on a condition with respect to the density index.
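A minimal sketch of the density index follows. The abstract leaves open which statistic is used and what the condition is; taking the mean absolute amplitude as the statistic, the ratio mean/maximum as the index, and "index above 0.5" as the music condition are all assumptions, as are the function names.

```python
def density_index(window):
    """Mean absolute amplitude relative to the peak amplitude in the window."""
    mags = [abs(s) for s in window]
    peak = max(mags)
    return (sum(mags) / len(mags)) / peak if peak > 0 else 0.0


def detect_music_sections(signal, window_size, threshold=0.5):
    """Flag each non-overlapping window as music if its density index
    exceeds the (hypothetical) threshold."""
    flags = []
    for start in range(0, len(signal) - window_size + 1, window_size):
        window = signal[start:start + window_size]
        flags.append(density_index(window) > threshold)
    return flags


# First window: consistently loud (dense). Second window: one isolated peak.
signal = [0.8, -0.9, 0.85, -0.8, 0.0, 1.0, 0.0, 0.0]
flags = detect_music_sections(signal, window_size=4)
print(flags)
```

The intuition behind this reading is that a window whose typical level stays close to its peak is "dense", whereas a window dominated by isolated peaks is not.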

9.

Patent application
TRAINING OF STUDENT NEURAL NETWORK WITH TEACHER NEURAL NETWORKS (status: under examination, published)

Publication No.: US20200034703A1

Publication date: 2020-01-30

Application No.: US16047526

Filing date: 2018-07-27

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Masayuki Suzuki, Osamu Ichikawa, Gakuto Kurata, Samuel Thomas, Bhuvana Ramabhadran

IPC classification: G06N3/08, G06N3/04, G10L15/02, G10L25/51

Abstract: A student neural network may be trained by a computer-implemented method, including: inputting common input data to each teacher neural network among a plurality of teacher neural networks to obtain a soft label output among a plurality of soft label outputs from each teacher neural network among the plurality of teacher neural networks, and training a student neural network with the input data and the plurality of soft label outputs.

10.

Granted patent
Sound identification utilizing periodic indications (status: in force)

Publication No.: US10460723B2

Publication date: 2019-10-29

Application No.: US15992778

Filing date: 2018-05-30

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Takashi Fukuda, Osamu Ichikawa, Bhuvana Ramabhadran

IPC classification: G10L15/00, G10L15/16, G10L15/02, G10L15/06, G10L25/24

Abstract: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method includes estimating sound identification information from a neural network having periodic indications and components of a frequency spectrum of an audio signal data inputted thereto. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
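The initial isolation step at the end of this abstract can be sketched as a first-layer weight initialization. The layout is an assumption: input nodes are ordered with spectrum components first and periodic indications last, the first nodes occupy the first rows of the weight matrix, and all other weights get small random values. The abstract says nothing about the weights of the second nodes, so they are left untouched here.

```python
import random


def init_first_layer(n_spectrum, n_periodic, n_first, n_second, seed=0):
    """Build a (n_first + n_second) x (n_spectrum + n_periodic) weight matrix
    in which first nodes start with zero weight on periodic-indication inputs."""
    rng = random.Random(seed)
    n_inputs = n_spectrum + n_periodic
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
               for _ in range(n_first + n_second)]
    # Isolate the periodic indications from the first nodes.
    for node in range(n_first):
        for j in range(n_spectrum, n_inputs):
            weights[node][j] = 0.0
    return weights


weights = init_first_layer(n_spectrum=4, n_periodic=2, n_first=3, n_second=2)
for row in weights:
    print([round(w, 3) for w in row])
```

Because the abstract says the isolation is applied "initially", the zeroed weights are only a starting point here and are not frozen during later training.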
