Speech waveform generation
    Granted invention patent

    Publication number: US11869482B2

    Publication date: 2024-01-09

    Application number: US17272325

    Filing date: 2018-09-30

    CPC classification number: G10L13/047

    Abstract: A method and apparatus for generating a speech waveform. Fundamental frequency information, glottal features and vocal tract features associated with an input may be received, wherein the glottal features include a phase feature, a shape feature, and an energy feature (1310). A glottal waveform is generated based on the fundamental frequency information and the glottal features through a first neural network model (1320). A speech waveform is generated based on the glottal waveform and the vocal tract features through a second neural network model (1330).
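The two-stage data flow described in this abstract can be sketched roughly as follows. This is only an illustration of how the inputs and outputs chain together, not the patented implementation: both neural network models are replaced by hand-written placeholder functions, and every name here (`GlottalFeatures`, `glottal_model`, `vocal_tract_model`, `generate_speech`), the sample rate, and the way each feature is used are assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class GlottalFeatures:
    """The three glottal features named in the abstract (toy interpretation)."""
    phase: float   # used here as a phase offset in radians (assumption)
    shape: float   # used here as a pulse-sharpness factor (assumption)
    energy: float  # used here as an amplitude scale (assumption)


def glottal_model(f0_hz: float, feats: GlottalFeatures,
                  n_samples: int, sample_rate: int = 16000) -> List[float]:
    """Placeholder for the FIRST neural network model.

    Maps fundamental frequency plus glottal features to a glottal waveform.
    A real system would run a trained network here; this toy uses a
    tanh-shaped sinusoid so the example stays runnable.
    """
    out = []
    for n in range(n_samples):
        s = math.sin(2.0 * math.pi * f0_hz * n / sample_rate + feats.phase)
        out.append(feats.energy * math.tanh(feats.shape * s))
    return out


def vocal_tract_model(glottal: Sequence[float],
                      vocal_tract_feats: Sequence[float]) -> List[float]:
    """Placeholder for the SECOND neural network model.

    Maps the glottal waveform plus vocal tract features to a speech waveform.
    This toy treats the features as FIR filter taps (an assumption).
    """
    out = []
    for n in range(len(glottal)):
        acc = 0.0
        for k, tap in enumerate(vocal_tract_feats):
            if n - k >= 0:
                acc += tap * glottal[n - k]
        out.append(acc)
    return out


def generate_speech(f0_hz: float, feats: GlottalFeatures,
                    vocal_tract_feats: Sequence[float],
                    n_samples: int) -> List[float]:
    """Chain the two stages: features -> glottal waveform -> speech waveform."""
    glottal = glottal_model(f0_hz, feats, n_samples)
    return vocal_tract_model(glottal, vocal_tract_feats)


# Hypothetical inputs: 120 Hz pitch, 10 ms of audio at 16 kHz.
speech = generate_speech(120.0, GlottalFeatures(0.0, 3.0, 0.8),
                         [1.0, -0.5, 0.25], 160)
print(len(speech))
```

The point of the split is that the glottal excitation and the vocal tract response are modelled separately, so the second stage only ever sees the output of the first together with the vocal tract features.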

    Domain adaptation in speech recognition via teacher-student learning

    Publication number: US20190051290A1

    Publication date: 2019-02-14

    Application number: US15675249

    Filing date: 2017-08-11

    Abstract: Improvements in speech recognition in a new domain are provided via student/teacher training of models for different speech domains. A student model for a new domain is created based on a teacher model trained in an existing domain. The student model is trained in parallel with the operation of the teacher model, with inputs in the new and existing domains respectively, to develop a neural network that is adapted to recognize speech in the new domain. The data in the new domain may exclude transcription labels; instead, they are parallelized with the data in the existing domain that the teacher model analyzes. The outputs of the teacher model are compared with the outputs of the student model, and the differences are used to adjust the parameters of the student model to better recognize speech in the new domain.
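The training loop this abstract outlines can be sketched in miniature as below. It is a toy under stated assumptions, not the patent's implementation: the "models" are single linear-softmax layers rather than deep networks, the new domain is simulated by attenuating the existing-domain input, the difference between outputs is measured with KL divergence, and all names (`forward`, `train_student`, `mean_kl`) and hyperparameters are hypothetical. What it does keep from the abstract is that the student starts from the teacher, sees only the new-domain half of each parallel pair, and is updated from the teacher's outputs without any transcription labels.

```python
import math
import random


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(W, x):
    """Linear layer followed by softmax: class posteriors for input x."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])


def kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mean_kl(teacher_W, student_W, pairs):
    """Average teacher/student disagreement over the parallel data."""
    total = sum(kl(forward(teacher_W, x_old), forward(student_W, x_new))
                for x_old, x_new in pairs)
    return total / len(pairs)


def train_student(teacher_W, pairs, lr=0.5, epochs=200):
    """Fit the student to the teacher's posteriors; no labels are used."""
    student = [row[:] for row in teacher_W]  # student is created from the teacher
    for _ in range(epochs):
        for x_old, x_new in pairs:
            p = forward(teacher_W, x_old)   # teacher on existing-domain input (frozen)
            q = forward(student, x_new)     # student on parallel new-domain input
            # Gradient of KL(p || q) w.r.t. the student logits is (q - p).
            for k in range(len(student)):
                for j in range(len(x_new)):
                    student[k][j] -= lr * (q[k] - p[k]) * x_new[j]
    return student


# Hypothetical teacher weights and parallel data (assumptions for the demo).
teacher_W = [[2.0, -1.0, 0.5], [-1.0, 1.5, 0.0], [0.0, -0.5, 1.0]]
rng = random.Random(0)
existing = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(20)]
# Simulated new domain: the same signal, attenuated by half.
pairs = [(x, [0.5 * v for v in x]) for x in existing]

kl_before = mean_kl(teacher_W, teacher_W, pairs)  # teacher weights on new-domain input
student_W = train_student(teacher_W, pairs)
kl_after = mean_kl(teacher_W, student_W, pairs)
print(kl_before, kl_after)
```

Because the teacher's posteriors serve as soft targets, the only data requirement is that each new-domain sample is aligned with an existing-domain sample, which is why untranscribed data is sufficient in this setup.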
