APPARATUS AND METHOD WITH SPEECH RECOGNITION AND LEARNING

    Publication No.: US20210020167A1

    Publication date: 2021-01-21

    Application No.: US16736895

    Filing date: 2020-01-08

    Abstract: A processor-implemented speech recognition method includes: applying, to an input layer of a neural network, a frame of a speech sequence; obtaining an output of a hidden layer of the neural network corresponding to the frame; calculating a statistical value of at least one previous output of the hidden layer corresponding to at least one previous frame of the speech sequence; normalizing the output based on the statistical value; applying the normalized output to a subsequent layer of the neural network; and recognizing the speech sequence based on the applying of the normalized output.
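
    The normalization step in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the "statistical value" is a running per-dimension mean and variance over previous hidden outputs, and the class name `RunningNormalizer` and the `eps` parameter are hypothetical.

```python
import math


class RunningNormalizer:
    """Normalizes a hidden-layer output using statistics of PREVIOUS frames only.

    Assumption: the statistics are a running mean and population variance,
    tracked per dimension with Welford's online algorithm.
    """

    def __init__(self, dim, eps=1e-5):
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations
        self.eps = eps

    def normalize(self, hidden):
        if self.count == 0:
            # No previous frame yet, so pass the output through unchanged.
            out = list(hidden)
        else:
            out = []
            for h, m, s in zip(hidden, self.mean, self.m2):
                var = s / self.count
                out.append((h - m) / math.sqrt(var + self.eps))
        # The current frame only affects the statistics of later frames.
        self._update(hidden)
        return out

    def _update(self, hidden):
        self.count += 1
        for i, h in enumerate(hidden):
            delta = h - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (h - self.mean[i])


# Usage: feed hidden outputs frame by frame, in order.
normalizer = RunningNormalizer(dim=2)
first = normalizer.normalize([1.0, 2.0])
second = normalizer.normalize([3.0, 2.0])
third = normalizer.normalize([4.0, 2.0])
```

    Because each frame is normalized with statistics from earlier frames only, the sketch works in a streaming setting where the full utterance is not yet available.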

    SPEECH SIGNAL RECOGNITION SYSTEM AND METHOD
    Type: Patent application

    Publication No.: US20190088251A1

    Publication date: 2019-03-21

    Application No.: US15916512

    Filing date: 2018-03-09

    Abstract: A speech signal recognition method, apparatus, and system. The speech signal recognition method may include obtaining by or from a terminal an output of a personalization layer, with respect to a speech signal provided by a user of the terminal, having been implemented by input of the speech signal to the personalization layer, the personalization layer being previously trained based on speech features of the user, implementing a global model by providing the obtained output of the personalization layer to the global model, the global model being configured to output a phonemic signal indicating a phoneme included in the speech signal through the global model being previously trained based on speech features common to a plurality of users, and re-training the personalization layer based on the phonemic signal output from the global model, where the personalization layer and the global model collectively represent an acoustic model.
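
    The split between a per-user personalization layer and a shared global model can be sketched as below. All of this is an assumption for illustration: the personalization layer is a per-feature scale and bias, the global model is a frozen linear classifier with softmax, and re-training uses the global model's most likely phoneme as a pseudo-label. The abstract does not specify any of these choices, and every name here is hypothetical.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


class PersonalizationLayer:
    """Per-user layer (assumed here to be an element-wise scale and bias)."""

    def __init__(self, dim):
        self.scale = [1.0] * dim
        self.bias = [0.0] * dim

    def forward(self, x):
        return [s * v + b for s, v, b in zip(self.scale, x, self.bias)]


class GlobalModel:
    """Shared model, frozen in this sketch; weights[k][i] maps feature i to phoneme k."""

    def __init__(self, weights):
        self.weights = weights

    def forward(self, h):
        logits = [sum(w * v for w, v in zip(row, h)) for row in self.weights]
        return softmax(logits)


def retrain_step(layer, model, x, lr=0.1):
    """One gradient step on the personalization layer only."""
    h = layer.forward(x)
    probs = model.forward(h)
    # Assumption: the phonemic output is turned into a pseudo-label via argmax.
    target = max(range(len(probs)), key=probs.__getitem__)
    # Cross-entropy gradient with respect to the logits: probs - one_hot(target).
    dlogits = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    # Backpropagate through the frozen global model to the layer output.
    dh = [
        sum(dlogits[k] * model.weights[k][i] for k in range(len(probs)))
        for i in range(len(h))
    ]
    for i in range(len(h)):
        layer.scale[i] -= lr * dh[i] * x[i]
        layer.bias[i] -= lr * dh[i]
    return target, probs[target]


layer = PersonalizationLayer(dim=2)
model = GlobalModel(weights=[[1.0, 0.0], [0.0, 1.0]])
features = [2.0, 1.0]  # toy speech features
phoneme_1, confidence_1 = retrain_step(layer, model, features)
phoneme_2, confidence_2 = retrain_step(layer, model, features)
```

    Only `layer.scale` and `layer.bias` change during re-training; the global model's weights stay fixed, which mirrors the abstract's separation between user-specific and user-independent parts of the acoustic model.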

    METHOD AND DEVICE WITH TRAINING DATABASE CONSTRUCTION

    Publication No.: US20240086684A1

    Publication date: 2024-03-14

    Application No.: US18467457

    Filing date: 2023-09-14

    CPC classification number: G06N3/0455 G06F16/2365

    Abstract: An electronic device includes one or more processors and a memory storing instructions configured to, when executed by the one or more processors, cause the one or more processors to: implement a machine learning-based conditional generative model configured to reconstruct target data from latent vectors, the conditional generative model trained based on an existing data set for a target task; determine an extrapolation weight; generate an augmented latent vector and augmented condition data by extrapolating, based on the extrapolation weight, from a latent vector corresponding to the existing dataset and from existing condition data corresponding to the existing dataset; and generate a new dataset comprising augmented target data generated by the conditional generative model based on the augmented condition data and based on the augmented latent vector.
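
    The extrapolation step can be illustrated with a short sketch. The abstract does not give the extrapolation formula, so this assumes a linear extrapolation from one existing sample away from a second existing sample, `a + weight * (a - b)`, applied to both the latent vector and the condition data. `toy_decoder` is a hypothetical stand-in for the trained conditional generative model.

```python
def extrapolate(a, b, weight):
    """Move from point a further away from point b by the given weight."""
    return [x + weight * (x - y) for x, y in zip(a, b)]


def augment(z_a, z_b, c_a, c_b, weight, decoder):
    """Build one augmented sample from two existing (latent, condition) pairs."""
    z_aug = extrapolate(z_a, z_b, weight)
    c_aug = extrapolate(c_a, c_b, weight)
    # The decoder reconstructs target data from the augmented latent and condition.
    return decoder(z_aug, c_aug), z_aug, c_aug


def toy_decoder(z, c):
    # Hypothetical placeholder: a real system would call the trained model here.
    return [zi + c[0] for zi in z]


target_aug, z_aug, c_aug = augment(
    z_a=[1.0, 2.0], z_b=[0.0, 0.0],
    c_a=[2.0], c_b=[1.0],
    weight=0.5,
    decoder=toy_decoder,
)
```

    With a positive weight the augmented condition lies outside the range spanned by the two existing conditions, which is the point of extrapolating rather than interpolating.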

    APPARATUS AND METHOD WITH MODEL TRAINING

    Publication No.: US20210110273A1

    Publication date: 2021-04-15

    Application No.: US16844534

    Filing date: 2020-04-09

    Abstract: A processor-implemented model training method and apparatus are provided. The method calculates an entropy of each of a plurality of previously trained models based on training data, selects a previously trained model from the plurality of previously trained models based on the calculated entropy, and trains a target model, distinguished from the plurality of previously trained models, based on the training data and the selected previously trained model.
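
    The entropy-based selection can be sketched as follows. The abstract does not say whether a low or a high entropy is preferred, so this sketch assumes the model with the lowest mean entropy on the training data is selected. Models are represented as plain callables that return a probability distribution; the names `select_teacher` and `mean_entropy` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(model, data):
    return sum(entropy(model(x)) for x in data) / len(data)


def select_teacher(models, data):
    """Pick one previously trained model based on its mean entropy.

    Assumption: lowest mean entropy wins.
    """
    scores = {name: mean_entropy(model, data) for name, model in models.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Toy stand-ins for previously trained models.
models = {
    "confident": lambda x: [0.9, 0.1],
    "uncertain": lambda x: [0.5, 0.5],
}
training_data = [0, 1, 2]

best, scores = select_teacher(models, training_data)
# The selected model's outputs can then serve as soft targets for the target model.
soft_targets = [models[best](x) for x in training_data]
```

    Training the target model on `soft_targets` together with the original training data would correspond to the final step of the abstract; the training loop itself is omitted here.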

    SPEECH RECOGNITION APPARATUS AND METHOD
    Type: Patent application

    Publication No.: US20200074986A1

    Publication date: 2020-03-05

    Application No.: US16351612

    Filing date: 2019-03-13

    Inventor: Ki Soo KWON

    Abstract: A processor-implemented method of personalizing a speech recognition model includes: obtaining statistical information of first scaling vectors combined with a base model for speech recognition; obtaining utterance data of a user; and generating a personalized speech recognition model by modifying a second scaling vector combined with the base model based on the utterance data of the user and the statistical information.
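
    One way to read this abstract is as a regularized update of a scaling vector. The sketch below assumes that the "statistical information" is the per-dimension mean and variance of the first scaling vectors, and that it acts as a Gaussian prior pulling the user's scaling vector toward that mean. `data_grad` is a hypothetical gradient of the recognition loss on the user's utterances; none of these choices is stated in the abstract.

```python
def scaling_statistics(vectors):
    """Per-dimension mean and population variance of the first scaling vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    var = [sum((v[i] - mean[i]) ** 2 for v in vectors) / n for i in range(dim)]
    return mean, var


def personalize_step(scale, data_grad, mean, var, lr=0.1, strength=1.0, eps=1e-6):
    """One gradient step on the user's (second) scaling vector.

    Assumption: the update combines the loss gradient from user utterances
    with the gradient of a Gaussian prior, (scale - mean) / var.
    """
    updated = []
    for s, g, m, v in zip(scale, data_grad, mean, var):
        prior_grad = strength * (s - m) / (v + eps)
        updated.append(s - lr * (g + prior_grad))
    return updated


first_vectors = [[1.0, 2.0], [3.0, 2.0]]
mean, var = scaling_statistics(first_vectors)

# Start from the mean, so only the user-data gradient moves the vector.
user_scale = personalize_step([2.0, 2.0], [1.0, 0.0], mean, var)
```

    Dimensions with a small variance across the first scaling vectors are held close to their mean, while dimensions that vary widely are freer to adapt to the user.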
