Patent search ap:("Microsoft Technology Licensing, LLC") AND inv:"Rui Zhao"

1.

Granted invention patent
Code-switching speech recognition with end-to-end connectionist temporal classification model (Status: In force)

Publication (announcement) number: US10964309B2

Publication (announcement) date: 2021-03-30

Application number: US16410556

Application date: 2019-05-13

Applicant: Microsoft Technology Licensing, LLC

Inventor: Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong, Ke Li

IPC: G10L15/00, G10L15/06, G10L15/04

Abstract: A CS CTC model may be initialized from a major language CTC model by keeping network hidden weights and replacing output tokens with a union of major and secondary language output tokens. The initialized model may be trained by updating parameters with training data from both languages, and a LID model may also be trained with the data. During a decoding process for each of a series of audio frames, if silence dominates a current frame then a silence output token may be emitted. If silence does not dominate the frame, then a major language output token posterior vector from the CS CTC model may be multiplied with the LID major language probability to create a probability vector from the major language. A similar step is performed for the secondary language, and the system may emit an output token associated with the highest probability across all tokens from both languages.
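The per-frame decoding rule in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `decode_frame`, the `silence_threshold` test for "silence dominates", and the use of `1 - lid_major` as the secondary-language LID probability are all assumptions made for the example.

```python
import numpy as np

def decode_frame(posteriors, lid_major, major_ids, secondary_ids,
                 silence_id, silence_threshold=0.5):
    """Pick one output token id for a single audio frame.

    posteriors: 1-D array of CS CTC posteriors over the union token set.
    lid_major: LID probability that the frame is in the major language.
    silence_threshold: hypothetical cutoff for "silence dominates".
    """
    # Emit the silence token when silence dominates the frame (assumed test).
    if posteriors[silence_id] > silence_threshold:
        return silence_id

    # Weight each language's token posteriors by its LID probability.
    # Assumption: two languages, so the secondary probability is 1 - lid_major.
    major_scores = posteriors[major_ids] * lid_major
    secondary_scores = posteriors[secondary_ids] * (1.0 - lid_major)

    # Emit the token with the highest weighted probability across both languages.
    best_major = int(np.argmax(major_scores))
    best_secondary = int(np.argmax(secondary_scores))
    if major_scores[best_major] >= secondary_scores[best_secondary]:
        return major_ids[best_major]
    return secondary_ids[best_secondary]

# Toy token layout: 0 = silence, 1-2 = major language, 3-4 = secondary language.
frame = np.array([0.10, 0.50, 0.05, 0.30, 0.05])
token_when_secondary_likely = decode_frame(frame, 0.2, [1, 2], [3, 4], 0)
token_when_major_likely = decode_frame(frame, 0.9, [1, 2], [3, 4], 0)
```

In the toy frame, the same posteriors yield a secondary-language token when the LID model favors the secondary language and a major-language token otherwise, which is the behavior the abstract describes.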

2.

Invention application
ON-DEVICE CUSTOM WAKE WORD DETECTION (Status: Under examination, published)

Publication (announcement) number: US20200349927A1

Publication (announcement) date: 2020-11-05

Application number: US16522416

Application date: 2019-07-25

Applicant: Microsoft Technology Licensing, LLC

Inventor: Emilian Stoimenov, Rui Zhao, Kaustubh Prakash Kalgaonkar, Ivaylo Andreanov Enchev, Khuram Shahid, Anthony Phillip Stark, Guoli Ye, Mahadevan Srinivasan, Yifan Gong, Hosam Adel Khalil

IPC: G10L15/16, G10L17/24, G06N3/08

Abstract: Generally discussed herein are devices, systems, and methods for on-device detection of a wake word. A device can include a memory including model parameters that define a custom wake word detection model, the wake word detection model including a recurrent neural network transducer (RNNT) and a lookup table (LUT), the LUT indicating a hidden vector to be provided in response to a phoneme of a user-specified wake word, a microphone to capture audio, and processing circuitry to receive the audio from the microphone, determine, using the wake word detection model, whether the audio includes an utterance of the user-specified wake word, and wake up a personal assistant after determining the audio includes the utterance of the user-specified wake word.

3.

Invention application
Learning Student DNN Via Output Distribution (Status: Under examination, published)

Publication (announcement) number: US20160078339A1

Publication (announcement) date: 2016-03-17

Application number: US14853485

Application date: 2015-09-14

Applicant: Microsoft Technology Licensing, LLC

Inventor: Jinyu Li, Rui Zhao, Jui-Ting Huang, Yifan Gong

IPC: G06N3/08, G09B5/00, G06N99/00

CPC classification number: G06N3/084, G06N3/0454, G06N7/005, G06N20/00, G09B5/00

Abstract: Systems and methods are provided for generating a DNN classifier by “learning” a “student” DNN model from a larger, more accurate “teacher” DNN model. The student DNN may be trained from unlabeled training data because its supervised signal is obtained by passing the unlabeled training data through the teacher DNN. In one embodiment, an iterative process is applied to train the student DNN by minimizing the divergence of the output distributions from the teacher and student DNN models. For each iteration until convergence, the difference in the output distributions is used to update the student DNN model, and output distributions are determined again, using the unlabeled training data. The resulting trained student model may be suitable for providing accurate signal processing applications on devices having limited computational or storage resources such as mobile or wearable devices. In an embodiment, the teacher DNN model comprises an ensemble of DNN models.
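The iterative divergence-minimization loop can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not the method claimed in the application: the teacher is a small random two-layer network, the student is a single linear softmax layer, the divergence is KL(teacher || student), and training runs for a fixed number of gradient steps rather than testing for convergence. All names (`train_student`, `kl_divergence`, and so on) are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Row-wise softmax with the usual max-subtraction for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    # Mean KL(teacher || student) over all examples.
    log_ratio = np.log(teacher_probs + eps) - np.log(student_probs + eps)
    return float(np.mean(np.sum(teacher_probs * log_ratio, axis=1)))

def train_student(features, teacher_probs, num_steps=500, learning_rate=0.5):
    """Fit a linear softmax student to the teacher's output distributions."""
    num_examples, num_features = features.shape
    weights = np.zeros((num_features, teacher_probs.shape[1]))
    for _ in range(num_steps):
        student_probs = softmax(features @ weights)
        # Gradient of KL(teacher || student) w.r.t. the student's weights:
        # the difference of the output distributions, propagated to the weights.
        gradient = features.T @ (student_probs - teacher_probs) / num_examples
        weights -= learning_rate * gradient
    return weights

rng = np.random.default_rng(0)
unlabeled = rng.normal(size=(200, 5))            # unlabeled training data
teacher_w1 = rng.normal(size=(5, 16))            # toy "larger" teacher
teacher_w2 = rng.normal(size=(16, 3))
teacher_probs = softmax(np.tanh(unlabeled @ teacher_w1) @ teacher_w2)

student_weights = train_student(unlabeled, teacher_probs)
kl_before = kl_divergence(teacher_probs, softmax(unlabeled @ np.zeros((5, 3))))
kl_after = kl_divergence(teacher_probs, softmax(unlabeled @ student_weights))
```

No labels appear anywhere in the loop: the teacher's output distribution on the unlabeled data is the only training signal, which is the point the abstract makes.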


4.

Granted invention patent
Generating and using text-to-speech data for speech recognition models (Status: In force)

Publication (announcement) number: US12205596B2

Publication (announcement) date: 2025-01-21

Application number: US18108316

Application date: 2023-02-10

Applicant: Microsoft Technology Licensing, LLC

Inventor: Guoli Ye, Yan Huang, Wenning Wei, Lei He, Eva Sharma, Jian Wu, Yao Tian, Edward C. Lin, Yifan Gong, Rui Zhao, Jinyu Li, William Maxwell Gale

IPC: G10L15/26, G10L13/08, G10L15/06, G10L15/16

Abstract: Systems, methods, and devices are provided for generating and using text-to-speech (TTS) data for improved speech recognition models. A main model is trained with keyword-independent baseline training data. In some instances, acoustic and language model sub-components of the main model are modified with new TTS training data. In some instances, the new TTS training data is obtained from a multi-speaker neural TTS system for a keyword that is underrepresented in the baseline training data. In some instances, the new TTS training data is used for pronunciation learning and normalization of keyword-dependent confidence scores in keyword spotting (KWS) applications. In some instances, the new TTS training data is used for rapid speaker adaptation in speech recognition models.

5.

Invention application
UNEQUAL PROBABILITY SAMPLING BASED ON A LIKELIHOOD MODEL SCORE TO EVALUATE PREVALENCE OF INAPPROPRIATE ENTITIES (Status: Under examination, published)

Publication (announcement) number: US20200382530A1

Publication (announcement) date: 2020-12-03

Application number: US16424972

Application date: 2019-05-29

Applicant: Microsoft Technology Licensing, LLC

Inventor: Wenqian Li, Zhou Jin, Rui Zhao, Xiaosu Huang, Chi-Yi Kuan

IPC: H04L29/06, G06K9/62, G06F17/18, G06N20/00

Abstract: Techniques for performing unequal sampling are provided. In one technique, multiple scores generated by a prediction model are identified, each score corresponding to a different entity of multiple entities. Multiple buckets are determined, each bucket corresponding to a different range of scores. Each entity is assigned to a bucket based on the score corresponding to the entity. A probability distribution function is generated based on the scores and a number of scores belonging to each bucket. For each entity, a probability of sampling the entity is determined based on the probability distribution function and a score corresponding to the entity. A subset of the entities is sampled based on the probability determined for each entity.
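The bucket-then-sample flow can be sketched as below. The abstract does not say how the probability distribution function is built, so the weighting here is purely an assumption for illustration: each bucket's sampling mass is its mean score, split evenly among the entities in that bucket. Scores are assumed to lie in (0, 1], and the function names are hypothetical.

```python
import numpy as np

def sampling_probabilities(scores, num_buckets=4):
    """Assign entities to score buckets and derive per-entity sampling probabilities."""
    scores = np.asarray(scores, dtype=float)
    edges = np.linspace(0.0, 1.0, num_buckets + 1)
    # Use interior edges only, so a score of exactly 1.0 lands in the last bucket.
    bucket_of = np.digitize(scores, edges[1:-1])

    # Number of scores and total score per bucket.
    counts = np.bincount(bucket_of, minlength=num_buckets)
    sums = np.bincount(bucket_of, weights=scores, minlength=num_buckets)
    mean_score = np.divide(sums, counts, out=np.zeros(num_buckets), where=counts > 0)

    # Assumed weighting: bucket mass = mean score, shared equally within the bucket.
    weights = mean_score[bucket_of] / counts[bucket_of]
    return bucket_of, weights / weights.sum()

def sample_entities(scores, sample_size, seed=0):
    """Draw a subset of entity indices without replacement using the probabilities."""
    _, probabilities = sampling_probabilities(scores)
    rng = np.random.default_rng(seed)
    return rng.choice(len(probabilities), size=sample_size, replace=False, p=probabilities)

model_scores = [0.05, 0.10, 0.15, 0.20, 0.90]
buckets, probabilities = sampling_probabilities(model_scores)
chosen = sample_entities(model_scores, sample_size=2)
```

Under this assumed weighting, the lone high-scoring entity is far more likely to be drawn than any single entity in the crowded low-score bucket, which is the kind of unequal sampling the title refers to.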

6.

Granted invention patent
Scalable mining of trending insights from text (Status: In force)

Publication (announcement) number: US10733221B2

Publication (announcement) date: 2020-08-04

Application number: US15085714

Application date: 2016-03-30

Applicant: Microsoft Technology Licensing, LLC

Inventor: Yongzheng Zhang, Rui Zhao, Chi-Yi Kuan, Yi Zheng

IPC: G06F16/33, G06F16/93, G06F16/35, G06F16/951

Abstract: A system and method for identifying trending topics in a document corpus are provided. First, multiple topics are identified, some of which may be filtered or removed based on co-occurrence. Then, for each remaining topic, a frequency of the topic in the document corpus is determined, one or more frequencies of the topic in one or more other document corpora are determined, and a trending score of the topic is generated based on the determined frequencies. Lastly, the remaining topics are ranked based on the generated trending scores.
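The frequency comparison and ranking steps can be sketched as follows. The abstract does not give the scoring formula, so this example assumes a smoothed ratio of a topic's document frequency in the current corpus to its average frequency in the other corpora. Topic identification and co-occurrence filtering are assumed to have happened already; each document is represented as a set of topic strings, and all names are hypothetical.

```python
from collections import Counter

def topic_frequencies(corpus):
    """Return the fraction of documents in the corpus that mention each topic."""
    counts = Counter(topic for document in corpus for topic in document)
    total = max(len(corpus), 1)
    return {topic: count / total for topic, count in counts.items()}

def rank_trending(current_corpus, other_corpora, smoothing=0.01):
    """Rank topics by an assumed trending score: current vs. baseline frequency."""
    current = topic_frequencies(current_corpus)
    others = [topic_frequencies(corpus) for corpus in other_corpora]
    scores = {}
    for topic, frequency in current.items():
        # Baseline: average frequency of the topic across the other corpora.
        baseline = sum(freq.get(topic, 0.0) for freq in others) / max(len(others), 1)
        # Smoothing keeps the ratio finite for topics unseen in the other corpora.
        scores[topic] = (frequency + smoothing) / (baseline + smoothing)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

this_week = [{"outage", "pricing"}, {"outage"}, {"pricing"}]
last_week = [{"pricing"}, {"pricing"}, {"login"}]
ranked = rank_trending(this_week, [last_week])
```

Here "outage" ranks first because it is new relative to the comparison corpus, while "pricing" scores 1.0 because its frequency is unchanged.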

7.

Granted invention patent
Augmented training data for end-to-end models (Status: In force)

Publication (announcement) number: US11862144B2

Publication (announcement) date: 2024-01-02

Application number: US17124341

Application date: 2020-12-16

Applicant: Microsoft Technology Licensing, LLC

Inventor: Rui Zhao, Jinyu Li, Yifan Gong

IPC: G10L15/06, G10L13/07, G10L15/19, G10L15/26

CPC classification number: G10L15/063, G10L13/07, G10L15/19, G10L15/26

Abstract: A computer system is provided that includes a processor configured to store a set of audio training data that includes a plurality of audio segments and metadata indicating a word or phrase associated with each audio segment. For a target training statement of a set of structured text data, the processor is configured to generate a concatenated audio signal that matches a word content of a target training statement by comparing the words or phrases of a plurality of text segments of the target training statement to respective words or phrases of audio segments of the stored set of audio training data, selecting a plurality of audio segments from the set of audio training data based on a match in the words or phrases between the plurality of text segments of the target training statement and the selected plurality of audio segments, and concatenating the selected plurality of audio segments.
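The match-and-concatenate step can be sketched as below. This is an illustrative assumption rather than the patented procedure: the stored audio training data is modeled as a dictionary from a lowercase word or phrase to an array of samples, and text segments are matched greedily, longest phrase first. The names `build_concatenated_audio` and `segment_bank` are hypothetical.

```python
import numpy as np

def build_concatenated_audio(target_statement, segment_bank, max_phrase_len=3):
    """Build audio for a target statement from stored word/phrase segments.

    segment_bank: dict mapping a lowercase word or phrase to a 1-D sample array.
    """
    words = target_statement.lower().split()
    pieces = []
    position = 0
    while position < len(words):
        # Try the longest phrase first, then fall back to shorter ones.
        for length in range(min(max_phrase_len, len(words) - position), 0, -1):
            phrase = " ".join(words[position:position + length])
            if phrase in segment_bank:
                pieces.append(segment_bank[phrase])
                position += length
                break
        else:
            raise KeyError(f"no audio segment for {words[position]!r}")
    # An empty statement yields an empty signal.
    return np.concatenate(pieces) if pieces else np.zeros(0)

# Toy bank: constant-valued arrays stand in for real audio samples.
segment_bank = {
    "turn": np.zeros(1),
    "turn on": np.ones(3),
    "the": np.full(2, 2.0),
    "lights": np.full(4, 3.0),
}
audio = build_concatenated_audio("Turn on the lights", segment_bank)
```

The resulting signal, paired with the target training statement as its transcript, would form one augmented training example of the kind the title describes.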

8.

Granted invention patent
On-device custom wake word detection (Status: In force)

Publication (announcement) number: US11798535B2

Publication (announcement) date: 2023-10-24

Application number: US17474829

Application date: 2021-09-14

Applicant: Microsoft Technology Licensing, LLC

Inventor: Emilian Stoimenov, Rui Zhao, Kaustubh Prakash Kalgaonkar, Ivaylo Andreanov Enchev, Khuram Shahid, Anthony Phillip Stark, Guoli Ye, Mahadevan Srinivasan, Yifan Gong, Hosam Adel Khalil

IPC: G10L15/16, G06N3/08, G10L17/24, G10L15/08

CPC classification number: G10L15/16, G06N3/08, G10L17/24, G10L2015/088

Abstract: Generally discussed herein are devices, systems, and methods for on-device detection of a wake word. A device can include a memory including model parameters that define a custom wake word detection model, the wake word detection model including a recurrent neural network transducer (RNNT) and a lookup table (LUT), the LUT indicating a hidden vector to be provided in response to a phoneme of a user-specified wake word, a microphone to capture audio, and processing circuitry to receive the audio from the microphone, determine, using the wake word detection model, whether the audio includes an utterance of the user-specified wake word, and wake up a personal assistant after determining the audio includes the utterance of the user-specified wake word.

9.

Granted invention patent
Unequal probability sampling based on a likelihood model score to evaluate prevalence of inappropriate entities (Status: In force)

Publication (announcement) number: US11463461B2

Publication (announcement) date: 2022-10-04

Application number: US16424972

Application date: 2019-05-29

Applicant: Microsoft Technology Licensing, LLC

Inventor: Wenqian Li, Zhou Jin, Rui Zhao, Xiaosu Huang, Chi-Yi Kuan

IPC: H04L29/06, H04L9/40, G06N20/00, G06F17/18, G06K9/62

Abstract: Techniques for performing unequal sampling are provided. In one technique, multiple scores generated by a prediction model are identified, each score corresponding to a different entity of multiple entities. Multiple buckets are determined, each bucket corresponding to a different range of scores. Each entity is assigned to a bucket based on the score corresponding to the entity. A probability distribution function is generated based on the scores and a number of scores belonging to each bucket. For each entity, a probability of sampling the entity is determined based on the probability distribution function and a score corresponding to the entity. A subset of the entities is sampled based on the probability determined for each entity.

10.

Invention application
GENERATING AND USING TEXT-TO-SPEECH DATA FOR SPEECH RECOGNITION MODELS (Status: In force)

Publication (announcement) number: US20210304769A1

Publication (announcement) date: 2021-09-30

Application number: US15931788

Application date: 2020-05-14

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Guoli Ye, Yan Huang, Wenning Wei, Lei He, Eva Sharma, Jian Wu, Yao Tian, Edward C. Lin, Yifan Gong, Rui Zhao, Jinyu Li, William Maxwell Gale

IPC: G10L15/26, G10L15/16, G10L15/06, G10L13/08

Abstract: Systems, methods, and devices are provided for generating and using text-to-speech (TTS) data for improved speech recognition models. A main model is trained with keyword-independent baseline training data. In some instances, acoustic and language model sub-components of the main model are modified with new TTS training data. In some instances, the new TTS training data is obtained from a multi-speaker neural TTS system for a keyword that is underrepresented in the baseline training data. In some instances, the new TTS training data is used for pronunciation learning and normalization of keyword-dependent confidence scores in keyword spotting (KWS) applications. In some instances, the new TTS training data is used for rapid speaker adaptation in speech recognition models.
