Fusion of neural networks
    Granted patent

    Publication No.: US11574181B2

    Publication date: 2023-02-07

    Application No.: US16406426

    Filing date: 2019-05-08

    IPC classification: G06N3/08 G06N3/04

    Abstract: Fusion of neural networks is performed by obtaining a first neural network and a second neural network. The first and the second neural networks are the result of a parent neural network subjected to different training. A similarity score is calculated between a first component of the first neural network and a corresponding second component of the second neural network. An interpolation weight is determined for the first and the second components by using the similarity score. A neural network parameter of the first component is updated based on the interpolation weight and a corresponding neural network parameter of the second component to obtain a fused neural network.
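The steps in this abstract (similarity score per component, interpolation weight, parameter update) can be illustrated with a minimal sketch. The abstract does not say which similarity measure or weight mapping is used, so the cosine similarity, the `interpolation_weight` mapping, and the dict-of-lists network representation below are all assumptions made for illustration, not the patented method.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two flat parameter vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def interpolation_weight(similarity, max_weight=0.5):
    """Hypothetical mapping: similar components are averaged evenly,
    dissimilar components stay close to the first network."""
    return max_weight * min(1.0, max(0.0, similarity))


def fuse(first, second):
    """Fuse two networks given as {component_name: [parameters]} dicts
    that share the same keys (both derived from one parent network)."""
    fused = {}
    for name, params_a in first.items():
        params_b = second[name]
        w = interpolation_weight(cosine_similarity(params_a, params_b))
        # Update each parameter of the first component toward the second one.
        fused[name] = [(1.0 - w) * a + w * b for a, b in zip(params_a, params_b)]
    return fused
```

With this mapping, a component whose parameters point in unrelated directions in the two networks keeps the first network's values, while a near-identical component is averaged.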

    CROSS-LINGUAL KNOWLEDGE TRANSFER LEARNING

    Publication No.: US20220414448A1

    Publication date: 2022-12-29

    Application No.: US17356907

    Filing date: 2021-06-24

    Abstract: Methods and systems for training a neural network include training language-specific teacher models using different respective source language datasets. A student model is trained, using the different respective source language datasets and soft labels generated by the language-specific teacher models, including shuffling the source language datasets and shuffling weights of language-dependent layers in language-specific parts of the student model. Weights of language-independent layers of the student model are copied to language-independent layers of a target model to initialize them. The target model is trained with a target language dataset.

    ALTERNATIVE SOFT LABEL GENERATION

    Publication No.: US20220188622A1

    Publication date: 2022-06-16

    Application No.: US17118139

    Filing date: 2020-12-10

    Abstract: An approach to identifying alternate soft labels for training a student model may be provided. A teaching model may generate a soft label for labeled training data. The training data can be an acoustic file for speech or a spoken natural language. A pool of soft labels previously generated by teacher models can be searched at the label level to identify soft labels that are similar to the generated soft label. The similar soft labels can have similar length or sequence at the word, phoneme, and/or state level. The identified similar soft labels can be used in conjunction with the generated soft label to train a student model.
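The pool search described in this abstract can be sketched roughly as follows. The abstract does not define the data layout or the similarity test, so the entry format (a dict with a `units` label sequence and per-frame `posteriors`), the function name `find_similar_soft_labels`, and the matching rule (same unit sequence and a frame count within `max_len_diff`) are hypothetical choices for illustration only.

```python
def find_similar_soft_labels(query, pool, max_len_diff=0):
    """Return pool entries whose label sequence matches the query's and whose
    frame count differs from the query's by at most max_len_diff.

    Each entry is assumed to look like:
        {"units": ["k", "ae", "t"], "posteriors": [[...], [...], ...]}
    where "units" is a word-, phoneme-, or state-level label sequence.
    """
    matches = []
    for entry in pool:
        same_sequence = entry["units"] == query["units"]
        length_gap = abs(len(entry["posteriors"]) - len(query["posteriors"]))
        if same_sequence and length_gap <= max_len_diff:
            matches.append(entry)
    return matches
```

The matched entries would then be used together with the query's own soft label as training targets for the student model.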

    KNOWLEDGE DISTILLATION USING DEEP CLUSTERING

    Publication No.: US20220180206A1

    Publication date: 2022-06-09

    Application No.: US17116117

    Filing date: 2020-12-09

    Inventor: Takashi Fukuda

    IPC classification: G06N3/08 G06F16/28

    Abstract: Methods and systems for training a neural network include clustering a full set of training data samples into specialized training clusters. Specialized teacher neural networks are trained using respective specialized training clusters of the specialized training clusters. Soft labels are generated for the full set of training data samples using the specialized teacher neural networks. A student model is trained using the full set of training data samples, the specialized training clusters, and the soft labels.

    Generation of voice data as data augmentation for acoustic model training

    Publication No.: US10726828B2

    Publication date: 2020-07-28

    Application No.: US15609665

    Filing date: 2017-05-31

    Abstract: A method, computer system, and a computer program product for generating a plurality of voice data having a particular speaking style is provided. The present invention may include preparing a plurality of original voice data corresponding to at least one word or at least one phrase. The present invention may also include attenuating a low frequency component and a high frequency component in the prepared plurality of original voice data. The present invention may then include reducing power at a beginning and an end of the prepared plurality of original voice data. The present invention may further include storing a plurality of resultant voice data obtained after the attenuating and the reducing.
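The two signal operations named in this abstract (attenuating low and high frequency components, then reducing power at the beginning and end) could look like the sketch below. The abstract does not specify filter types or parameters, so the one-pole high-pass and low-pass filters, the linear fade, and every default value here are assumptions chosen to keep the example short.

```python
def high_pass(samples, alpha):
    """One-pole high-pass filter: attenuates low-frequency content."""
    out, prev_x, prev_y = [], 0.0, 0.0
    for s in samples:
        prev_y = alpha * (prev_y + s - prev_x)
        prev_x = s
        out.append(prev_y)
    return out


def low_pass(samples, alpha):
    """One-pole low-pass filter: attenuates high-frequency content."""
    out, prev_y = [], 0.0
    for s in samples:
        prev_y = prev_y + alpha * (s - prev_y)
        out.append(prev_y)
    return out


def fade_edges(samples, n_fade):
    """Linearly reduce the gain over the first and last n_fade samples."""
    out = list(samples)
    n = len(out)
    k = min(n_fade, n // 2)
    for i in range(k):
        gain = i / k
        out[i] *= gain
        out[n - 1 - i] *= gain
    return out


def augment(samples, hp_alpha=0.9, lp_alpha=0.5, n_fade=20):
    """Band-limit the utterance, then soften its beginning and end."""
    return fade_edges(low_pass(high_pass(samples, hp_alpha), lp_alpha), n_fade)
```

Applying `augment` to each original recording and storing the results yields the additional training data the abstract describes.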

    DETECTION OF MUSIC SEGMENT IN AUDIO SIGNAL
    Patent application

    Publication No.: US20200075042A1

    Publication date: 2020-03-05

    Application No.: US16116042

    Filing date: 2018-08-29

    IPC classification: G10L25/81 G10L25/21

    Abstract: A technique for detecting a music segment in an audio signal is disclosed. A time window is set for each section in an audio signal. A maximum and a statistic of the audio signal within the time window are calculated. A density index is computed for the section using the maximum and the statistic. The density index is a measure of the statistic relative to the maximum. The section is estimated as a music segment based, at least in part, on a condition with respect to the density index.
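A small sketch of the density index idea follows. The abstract only says the index measures a statistic relative to the maximum and that a condition on it is tested; the choice of mean absolute amplitude as the statistic, the window being equal to the section, and the `threshold` value are assumptions for illustration.

```python
def density_index(window):
    """Mean absolute amplitude divided by peak absolute amplitude (0.0 for silence)."""
    magnitudes = [abs(s) for s in window]
    peak = max(magnitudes)
    if peak == 0.0:
        return 0.0
    return (sum(magnitudes) / len(magnitudes)) / peak


def detect_music(signal, section_len, threshold=0.5):
    """Flag each section as music when its density index reaches the threshold.

    Assumption: the time window coincides with the section itself.
    """
    flags = []
    for start in range(0, len(signal), section_len):
        window = signal[start:start + section_len]
        flags.append(density_index(window) >= threshold)
    return flags
```

A section whose energy is spread evenly scores near 1, whereas a section dominated by a few isolated peaks scores near 0.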

    Sound identification utilizing periodic indications

    Publication No.: US10460723B2

    Publication date: 2019-10-29

    Application No.: US15992778

    Filing date: 2018-05-30

    Abstract: A computer-implemented method is provided. The computer-implemented method is performed by a speech recognition system having at least a processor. The method includes estimating sound identification information from a neural network having periodic indications and components of a frequency spectrum of audio signal data inputted thereto. The method further includes performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information. The neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes. The method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
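The initial isolation step in this abstract amounts to zeroing a block of the first layer's weight matrix. The sketch below shows that masking on a plain list-of-lists matrix; the function name `isolate_periodic_inputs`, the row-per-node layout, and the index arguments are hypothetical, and the rest of the training procedure is not shown.

```python
def isolate_periodic_inputs(weights, first_nodes, periodic_inputs):
    """Return a copy of the first-layer weights in which every connection from a
    periodic-indication input to a "first node" is set to 0.

    weights[i][j] is assumed to be the weight from input node j to layer node i.
    first_nodes:     indices of layer nodes that should ignore periodic inputs.
    periodic_inputs: indices of input nodes carrying the periodic indications.
    """
    masked = [list(row) for row in weights]  # leave the caller's matrix untouched
    for i in first_nodes:
        for j in periodic_inputs:
            masked[i][j] = 0.0
    return masked
```

After this initialization, the first nodes see only the frequency spectrum components, while the second nodes remain connected to all inputs.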