-
Publication No.: US11429860B2
Publication Date: 2022-08-30
Application No.: US14853485
Filing Date: 2015-09-14
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jinyu Li , Rui Zhao , Jui-Ting Huang , Yifan Gong
Abstract: Systems and methods are provided for generating a DNN classifier by “learning” a “student” DNN model from a larger, more accurate “teacher” DNN model. The student DNN may be trained from unlabeled training data because its supervised signal is obtained by passing the unlabeled training data through the teacher DNN. In one embodiment, an iterative process is applied to train the student DNN by minimizing the divergence of the output distributions from the teacher and student DNN models. For each iteration until convergence, the difference in the output distributions is used to update the student DNN model, and output distributions are determined again, using the unlabeled training data. The resulting trained student model may be suitable for providing accurate signal processing applications on devices having limited computational or storage resources, such as mobile or wearable devices. In an embodiment, the teacher DNN model comprises an ensemble of DNN models.
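The iterative loop described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the fixed "teacher" weights, the single-layer softmax "student", the learning rate, the convergence tolerance, and all function names are assumptions chosen only to keep the example small. It shows unlabeled inputs being passed through the teacher, the Kullback-Leibler divergence between teacher and student output distributions being measured, and the student being updated from the difference of those distributions until the divergence stops changing.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, x):
    """One score per class: dot product of each weight row with x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Hypothetical stand-in for the larger teacher model (assumption):
# a fixed softmax classifier over 3 classes and 2 input features.
TEACHER_W = [[2.0, -1.0], [-1.5, 0.5], [0.3, 1.2]]

def teacher_posteriors(x):
    return softmax(linear(TEACHER_W, x))

def mean_kl(student_w, data):
    """Average teacher-to-student divergence over the unlabeled data."""
    total = sum(
        kl_divergence(teacher_posteriors(x), softmax(linear(student_w, x)))
        for x in data
    )
    return total / len(data)

def train_student(data, n_classes=3, lr=0.5, max_iters=500, tol=1e-6):
    """Gradient descent on the mean KL divergence; no labels are used."""
    dim = len(data[0])
    student_w = [[0.0] * dim for _ in range(n_classes)]
    history = [mean_kl(student_w, data)]
    for _ in range(max_iters):
        grad = [[0.0] * dim for _ in range(n_classes)]
        for x in data:
            p_t = teacher_posteriors(x)            # supervision signal
            p_s = softmax(linear(student_w, x))    # current student output
            for k in range(n_classes):
                # d KL / d logit_k = p_s[k] - p_t[k]
                diff = p_s[k] - p_t[k]
                for j in range(dim):
                    grad[k][j] += diff * x[j] / len(data)
        for k in range(n_classes):
            for j in range(dim):
                student_w[k][j] -= lr * grad[k][j]
        history.append(mean_kl(student_w, data))
        if abs(history[-2] - history[-1]) < tol:   # convergence check
            break
    return student_w, history

rng = random.Random(0)
unlabeled = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
student_w, history = train_student(unlabeled)
print(f"mean KL before: {history[0]:.4f}, after: {history[-1]:.6f}")
```

In this toy setting the student has the same capacity as the teacher, so the divergence can approach zero; with a genuinely smaller student, as the abstract describes, some residual divergence would be expected.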
-
Publication No.: US20210407498A1
Publication Date: 2021-12-30
Application No.: US17474829
Filing Date: 2021-09-14
Applicant: Microsoft Technology Licensing, LLC
Inventor: Emilian Stoimenov , Rui Zhao , Kaustubh Prakash Kalgaonkar , Ivaylo Andreanov Enchev , Khuram Shahid , Anthony Phillip Stark , Guoli Ye , Mahadevan Srinivasan , Yifan Gong , Hosam Adel Khalil
Abstract: Generally discussed herein are devices, systems, and methods for on-device detection of a wake word. A device can include a memory including model parameters that define a custom wake word detection model, the wake word detection model including a recurrent neural network transducer (RNNT) and a lookup table (LUT), the LUT indicating a hidden vector to be provided in response to a phoneme of a user-specified wake word, a microphone to capture audio, and processing circuitry to receive the audio from the microphone, determine, using the wake word detection model, whether the audio includes an utterance of the user-specified wake word, and wake up a personal assistant after determining the audio includes the utterance of the user-specified wake word.
-
Publication No.: US10885900B2
Publication Date: 2021-01-05
Application No.: US15675249
Filing Date: 2017-08-11
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jinyu Li , Michael Lewis Seltzer , Xi Wang , Rui Zhao , Yifan Gong
IPC: G10L15/16 , G06N3/08 , G10L15/06 , G10L15/183 , G10L15/065 , G10L25/30 , G06N3/04 , G06N3/12 , G06N5/00
Abstract: Improvements in speech recognition in a new domain are provided via the student/teacher training of models for different speech domains. A student model for a new domain is created based on the teacher model trained in an existing domain. The student model is trained in parallel to the operation of the teacher model, with inputs in the new and existing domains respectively, to develop a neural network that is adapted to recognize speech in the new domain. The data in the new domain may exclude transcription labels; instead, they are parallelized with the data in the existing domain analyzed by the teacher model. The outputs from the teacher model are compared with the outputs of the student model, and the differences are used to adjust the parameters of the student model to better recognize speech in the new domain.
-
Publication No.: US10380150B2
Publication Date: 2019-08-13
Application No.: US15848929
Filing Date: 2017-12-20
Applicant: Microsoft Technology Licensing, LLC
Inventor: Shen Huang , Yongzheng Zhang , Chi-Yi Kuan , Hu Wang , Rui Zhao , Zhou Jin
IPC: G06F16/332 , G06Q50/00 , G06F17/27 , G06K9/62
Abstract: A method and system for identifying user expectations in question answering in an on-line social network system are described. The automated support system is configured to address the technical problem of optimizing the processing of user input submitted to a computer in the form of natural language. The automated support system uses machine learning algorithms to automatically extract, from the user input, information indicative of the user's expectations and to obtain data relevant to the input based on that information.
-
Publication No.: US09967418B1
Publication Date: 2018-05-08
Application No.: US15339587
Filing Date: 2016-10-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Sandeep Kanumuri , Naveen Thumpudi , Sathyanarayanan Karivaradaswamy , Rui Zhao
IPC: H04N1/00 , G06F9/44 , G06F13/10 , H04N21/443
CPC classification number: H04N1/00962 , G06F9/4411 , G06F13/102 , H04N1/00938 , H04N5/232 , H04N21/4431 , H04N2201/0084
Abstract: The present disclosure provides devices and techniques for processing a media capture stream captured by a camera device using a chain device media foundation transform (DMFT). The techniques include configuring multiple DMFTs such that an original equipment manufacturer (OEM) may have flexibility in independently selecting various functionalities from different sources (e.g., OS, OEM, IHV, ISV, or VARs) in order to maximize hardware capabilities while minimizing the drawbacks of creating a single DMFT. To that end, the implementation of the present disclosure includes devices and techniques for chainable DMFTs such that a device transform manager may select a set of functionalities (e.g., face recognition, color effects, etc.) from multiple vendors to customize the camera's capabilities according to the OEM specification.