-
Publication number: US11929076B2
Publication date: 2024-03-12
Application number: US18060949
Application date: 2022-12-01
Applicant: Microsoft Technology Licensing, LLC
Inventor: Hosam Adel Khalil, Emilian Stoimenov, Christopher Hakan Basoglu, Kshitiz Kumar, Jian Wu
CPC classification number: G10L15/32, G10L15/16, G10L15/30, G10L19/167, G10L25/51, G10L2015/088
Abstract: Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.
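As an illustration of the append-then-merge step described in this abstract, the following is a minimal sketch, not the patented implementation. It assumes each recognized word carries start and end timestamps and that a primary result simply replaces any secondary words inside its time span; the `Word` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the audio stream
    end: float
    source: str   # "secondary" (fast engine) or "primary" (accurate engine)


def append_secondary(word_list, words):
    """Append fast, provisional words so text can be shown early."""
    word_list.extend(words)


def merge_primary(word_list, primary_words):
    """Return a new list in which secondary words that fall inside the
    primary result's time span are replaced by the primary words.
    Assumption: time overlap is the merge criterion."""
    if not primary_words:
        return word_list
    span_start = primary_words[0].start
    span_end = primary_words[-1].end
    kept = [
        w for w in word_list
        if w.source == "primary" or w.end <= span_start or w.start >= span_end
    ]
    merged = kept + list(primary_words)
    merged.sort(key=lambda w: w.start)
    return merged
```

In this sketch the fast engine's words are visible immediately, and the slower engine's output later overwrites only the region it covers, which is one way to reduce perceived latency without giving up the more accurate transcript.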
-
Publication number: US12216809B2
Publication date: 2025-02-04
Application number: US17364254
Application date: 2021-06-30
Applicant: Microsoft Technology Licensing, LLC
Inventor: Chun Hin Nelson Siu, Hosam Adel Khalil, Ajoy Nandi, Carmen Quan, Denis Fisenko, Md Nizam Uddin Chy, Min Hu, Christopher Hakan Basoglu, Sayan Dev Pathak
Abstract: Techniques are provided for early processing of a part of a user input to produce a response to the entire or final user input. While the user input is being received, a partial user input, which is a part of the final user input, is processed to produce a response. The response is a candidate response for the final user input. After the final user input is received, and if the partial user input is determined to match or be equivalent to the final user input, the first response, which is already available, is provided to one or more output devices for presentation. If the final user input is determined to differ from the partial user input, the final user input is processed to produce a second response to the final user input, and the second response is provided for presentation. In some instances, multiple partial user inputs are received and processed.
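The early-processing flow in this abstract can be sketched roughly as follows. This is an illustrative sketch only: the class name, the `normalize` equivalence test (case and whitespace insensitive), and the `respond` stand-in for the real processing step are all assumptions, not details from the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def normalize(text):
    """Assumed equivalence test: ignore case and extra whitespace."""
    return " ".join(text.lower().split())


def respond(user_input):
    """Hypothetical stand-in for the expensive response generation."""
    return f"response to: {normalize(user_input)}"


class SpeculativeResponder:
    def __init__(self, process):
        self._process = process
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._candidates = {}  # normalized partial input -> Future

    def on_partial(self, partial):
        """Start computing a candidate response while input is still arriving."""
        key = normalize(partial)
        if key not in self._candidates:
            self._candidates[key] = self._pool.submit(self._process, partial)

    def on_final(self, final):
        """Return (response, reused). Reuse a candidate if a partial input
        matched the final input; otherwise process the final input now."""
        future = self._candidates.get(normalize(final))
        if future is not None:
            return future.result(), True
        return self._process(final), False

    def close(self):
        self._pool.shutdown(wait=True)
```

Several partial inputs can be registered through repeated `on_partial` calls; only a candidate whose partial input is equivalent to the final input is reused.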
-
Publication number: US12287816B1
Publication date: 2025-04-29
Application number: US18385408
Application date: 2023-10-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Sayan Dev Pathak, Osama Abuelsorour, Christopher Hakan Basoglu, Harini Kesavamoorthy, Girish Milind Mahajan, Salman Mohammad Quazi, Valeriy Viktorovich Kirshin
IPC: G06F16/33, G06F16/332, G06F16/3329, G06F16/334
Abstract: A technique partitions a user's original query into plural smaller component queries, each of which has a common part and an instance-specific part. The technique distributes the component queries to plural processor instances of a processor. The plural processor instances transform the respective component queries into query-component responses by acting in parallel, independent of each other. The technique generates a final response based on the query-component responses, e.g., by assembling the component-query responses into the final response. The technique reduces latency because the processor instances work on parts of the user's original query at the same time, rather than as a single stream of consecutive tokens. The plural processor instances have access to a shared cache memory, and utilize relevant data that has been computed in response to previous queries.
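A rough sketch of the partition, parallel-process, and assemble flow described in this abstract is shown below. It is a toy illustration under stated assumptions: threads stand in for the processor instances, a dictionary behind a lock stands in for the shared cache memory, and the "computation" on the common part is a trivial string transform. All names here are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedCache:
    """Assumed stand-in for the shared cache: a thread-safe dictionary."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}
        self.hits = 0

    def get_or_compute(self, key, compute):
        with self._lock:
            if key in self._data:
                self.hits += 1
                return self._data[key]
        value = compute(key)  # computed outside the lock
        with self._lock:
            return self._data.setdefault(key, value)


def partition(common, items):
    """Each component query = common part + instance-specific part."""
    return [(common, item) for item in items]


def run_instance(cache, component):
    """One processor instance: reuse cached work for the common part."""
    common, specific = component
    prefix = cache.get_or_compute(common, lambda c: c.strip().upper())
    return f"{prefix}: {specific}"


def answer(cache, common, items):
    components = partition(common, items)
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in input order, so assembly is a join.
        parts = list(pool.map(lambda c: run_instance(cache, c), components))
    return "\n".join(parts)
```

Because the cache outlives a single call to `answer`, a later query with the same common part reuses the stored result instead of recomputing it.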
-
Publication number: US11741302B1
Publication date: 2023-08-29
Application number: US17747585
Application date: 2022-05-18
Applicant: Microsoft Technology Licensing, LLC
Inventor: Sayan Dev Pathak, Christopher Hakan Basoglu, Amit Agarwal, Shuangyu Chang, Amy Shah
IPC: G06F40/253, G06F40/166, G06F40/205
CPC classification number: G06F40/253, G06F40/166, G06F40/205
Abstract: A data processing system implements obtaining a first textual content, segmenting the first textual content into a plurality of first segments, and providing each segment of the plurality of first segments to a first natural language processing (NLP) model to obtain a set of first readability scores for the plurality of first segments. The first NLP model is configured to analyze a textual input and to output a readability score representing a measurement of readability of the textual input. The system further implements aggregating the set of first segment readability scores to determine a first readability score for the first textual content, and performing at least one of causing the first readability score to be presented to a user or performing one or more actions on the first textual content based on the first readability score.
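The segment, score, and aggregate pipeline in this abstract could look something like the sketch below. The abstract does not specify the model or the aggregation rule, so both are assumptions here: `score_segment` is a crude word-length heuristic standing in for the NLP model, and the aggregate is a word-count-weighted mean.

```python
import re


def segment(text):
    """Split text into sentence-like segments at ., ! or ? boundaries."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def score_segment(seg):
    """Hypothetical stand-in for the NLP model: shorter average word
    length maps to a higher score, clamped to the range 0..1."""
    words = seg.split()
    avg_len = sum(len(w) for w in words) / len(words)
    return max(0.0, min(1.0, 1.0 - (avg_len - 3.0) / 10.0))


def readability(text, score=score_segment):
    """Aggregate per-segment scores into one document score.
    Assumption: weight each segment by its word count."""
    segments = segment(text)
    if not segments:
        return None
    weights = [len(s.split()) for s in segments]
    scores = [score(s) for s in segments]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
```

The `score` parameter makes the scorer pluggable, so a real model could be passed in without changing the segmentation or aggregation steps.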
-
Publication number: US11532312B2
Publication date: 2022-12-20
Application number: US17123087
Application date: 2020-12-15
Applicant: Microsoft Technology Licensing, LLC
Inventor: Hosam Adel Khalil, Emilian Stoimenov, Christopher Hakan Basoglu, Kshitiz Kumar, Jian Wu
Abstract: Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.