Patent search ap:("Microsoft Technology Licensing Page LLC") AND inv:"Xi Wang"

1.

Granted invention patent
Domain adaptation in speech recognition via teacher-student learning (status: active)

Publication No.: US10885900B2

Publication date: 2021-01-05

Application No.: US15675249

Filing date: 2017-08-11

Applicant: Microsoft Technology Licensing, LLC

Inventor: Jinyu Li, Michael Lewis Seltzer, Xi Wang, Rui Zhao, Yifan Gong

IPC: G10L15/16, G06N3/08, G10L15/06, G10L15/183, G10L15/065, G10L25/30, G06N3/04, G06N3/12, G06N5/00

Abstract: Improvements in speech recognition in a new domain are provided via the student/teacher training of models for different speech domains. A student model for a new domain is created based on the teacher model trained in an existing domain. The student model is trained in parallel to the operation of the teacher model, with inputs in the new and existing domains respectively, to develop a neural network that is adapted to recognize speech in the new domain. The data in the new domain may exclude transcription labels and are instead parallelized with the data in the existing domain analyzed by the teacher model. The outputs from the teacher model are compared with the outputs of the student model, and the differences are used to adjust the parameters of the student model to better recognize speech in the new domain.
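
The training loop described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the single linear layer, the KL-divergence loss, the learning rate, and every name here (`student_step`, `teacher_w`, `existing_x`, `new_x`) are assumptions chosen for illustration. The teacher is frozen and sees the existing-domain input, the student sees the parallel new-domain input, and no transcription labels are used.

```python
import math


def softmax(logits):
    """Convert logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, features):
    """Toy stand-in for a neural network: one linear layer, no bias."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def kl_divergence(p, q):
    """KL(p || q), the difference measure assumed for this sketch."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def student_step(student_w, teacher_w, existing_x, new_x, lr=0.5):
    """One update: compare teacher and student outputs, adjust the student only."""
    teacher_post = softmax(linear(teacher_w, existing_x))  # teacher stays frozen
    student_post = softmax(linear(student_w, new_x))
    loss = kl_divergence(teacher_post, student_post)
    # Gradient of KL(teacher || student) w.r.t. student logits is (student - teacher).
    for k, row in enumerate(student_w):
        err = student_post[k] - teacher_post[k]
        for j in range(len(row)):
            row[j] -= lr * err * new_x[j]
    return loss


# Hypothetical data: one existing-domain frame and its parallel new-domain frame.
teacher_w = [[2.0, -1.0], [-1.0, 2.0]]
student_w = [row[:] for row in teacher_w]  # student is created from the teacher
existing_x = [1.0, 0.0]
new_x = [0.6, 0.5]

losses = [student_step(student_w, teacher_w, existing_x, new_x) for _ in range(200)]
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

The loss falls as the student's outputs on the new-domain frame approach the teacher's outputs on the parallel existing-domain frame, which is the comparison the abstract describes.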

2.

Granted invention patent
Speech waveform generation (status: active)

Publication No.: US11869482B2

Publication date: 2024-01-09

Application No.: US17272325

Filing date: 2018-09-30

Applicant: Microsoft Technology Licensing, LLC

Inventor: Yang Cui, Xi Wang, Lei He, Kao-Ping Soong

IPC: G10L13/047

CPC classification number: G10L13/047

Abstract: A method and apparatus for generating a speech waveform. Fundamental frequency information, glottal features and vocal tract features associated with an input may be received, wherein the glottal features include a phase feature, a shape feature, and an energy feature (1310). A glottal waveform is generated based on the fundamental frequency information and the glottal features through a first neural network model (1320). A speech waveform is generated based on the glottal waveform and the vocal tract features through a second neural network model (1330).
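
The two-stage data flow in this abstract can be sketched as follows. This shows only the order of operations, under loud assumptions: the abstract specifies two neural network models, whereas `toy_glottal_model` and `toy_vocal_tract_model` below are hypothetical non-neural stand-ins (a shaped sinusoid and an FIR filter), and `GlottalFeatures`, `generate_speech`, and `SAMPLE_RATE` are names invented for this example.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence

SAMPLE_RATE = 16000  # assumed sampling rate in Hz


@dataclass
class GlottalFeatures:
    phase: float   # phase feature
    shape: float   # shape feature
    energy: float  # energy feature


def generate_speech(
    f0: float,
    glottal: GlottalFeatures,
    vocal_tract: Sequence[float],
    glottal_model: Callable[[float, GlottalFeatures, int], List[float]],
    vocal_tract_model: Callable[[Sequence[float], Sequence[float]], List[float]],
    num_samples: int,
) -> List[float]:
    # Stage 1: fundamental frequency + glottal features -> glottal waveform.
    glottal_waveform = glottal_model(f0, glottal, num_samples)
    # Stage 2: glottal waveform + vocal tract features -> speech waveform.
    return vocal_tract_model(glottal_waveform, vocal_tract)


def toy_glottal_model(f0, glottal, num_samples):
    """Stand-in for the first model: a sinusoid warped by the shape feature."""
    amp = math.sqrt(glottal.energy)
    out = []
    for n in range(num_samples):
        s = math.sin(2 * math.pi * f0 * n / SAMPLE_RATE + glottal.phase)
        out.append(amp * math.copysign(abs(s) ** glottal.shape, s))
    return out


def toy_vocal_tract_model(glottal_waveform, vocal_tract):
    """Stand-in for the second model: FIR filter with vocal_tract as taps."""
    out = []
    for n in range(len(glottal_waveform)):
        acc = 0.0
        for k, c in enumerate(vocal_tract):
            if n - k >= 0:
                acc += c * glottal_waveform[n - k]
        out.append(acc)
    return out


features = GlottalFeatures(phase=0.0, shape=0.8, energy=1.0)
speech = generate_speech(
    120.0, features, [1.0, 0.5, 0.25],
    toy_glottal_model, toy_vocal_tract_model, num_samples=160,
)
print(len(speech))
```

Passing the models as callables keeps the two stages separable, so either stand-in could be swapped for a trained network without changing `generate_speech`.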

3.

Invention patent application
DOMAIN ADAPTATION IN SPEECH RECOGNITION VIA TEACHER-STUDENT LEARNING (status: under examination, published)

Publication No.: US20190051290A1

Publication date: 2019-02-14

Application No.: US15675249

Filing date: 2017-08-11

Applicant: Microsoft Technology Licensing, LLC

Inventor: Jinyu Li, Michael Lewis Seltzer, Xi Wang, Rui Zhao, Yifan Gong

IPC: G10L15/16, G06N3/08, G10L15/06, G10L15/183

Abstract: Improvements in speech recognition in a new domain are provided via the student/teacher training of models for different speech domains. A student model for a new domain is created based on the teacher model trained in an existing domain. The student model is trained in parallel to the operation of the teacher model, with inputs in the new and existing domains respectively, to develop a neural network that is adapted to recognize speech in the new domain. The data in the new domain may exclude transcription labels and are instead parallelized with the data in the existing domain analyzed by the teacher model. The outputs from the teacher model are compared with the outputs of the student model, and the differences are used to adjust the parameters of the student model to better recognize speech in the new domain.
