Patent search: cpc:"G10L2015/085" (page 1)

1.

Granted patent
Voice recognition system (In force)

Publication No.: US11996103B2

Publication date: 2024-05-28

Application No.: US17811605

Filing date: 2022-07-11

Applicant: Google LLC

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar

IPC classification: G10L15/00, G06F16/632, G10L15/04, G10L15/19, G10L15/197, G10L15/22, G10L15/26, G10L15/08, G10L15/183

CPC classification: G10L15/26, G06F16/632, G10L15/04, G10L15/19, G10L15/197, G10L2015/085, G10L15/183, G10L15/22

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for voice recognition. In one aspect, a method includes the actions of receiving a voice input; determining a transcription for the voice input, wherein determining the transcription for the voice input includes, for a plurality of segments of the voice input: obtaining a first candidate transcription for a first segment of the voice input; determining one or more contexts associated with the first candidate transcription; adjusting a respective weight for each of the one or more contexts; and determining a second candidate transcription for a second segment of the voice input based in part on the adjusted weights; and providing the transcription of the plurality of segments of the voice input for output.
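The segment-by-segment loop in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the patent: the context phrase lists, the additive `boost`, the log-score values, and the function names are all hypothetical.

```python
# Hypothetical context phrase lists; not from the patent.
CONTEXT_PHRASES = {
    "navigation": {"directions", "to", "route", "main", "street"},
    "music": {"play", "song", "album"},
}

def detect_contexts(transcript):
    """Return the contexts whose phrase list shares a word with the transcript."""
    words = set(transcript.lower().split())
    return {name for name, phrases in CONTEXT_PHRASES.items() if words & phrases}

def adjust_weights(weights, contexts, boost=1.0):
    """Return a copy of `weights` with each detected context raised by `boost`."""
    updated = dict(weights)
    for name in contexts:
        updated[name] = updated.get(name, 0.0) + boost
    return updated

def rescore(candidates, weights):
    """Pick the best (text, base_score) candidate after adding context bonuses."""
    def score(item):
        text, base = item
        words = set(text.lower().split())
        bonus = sum(w for name, w in weights.items() if words & CONTEXT_PHRASES[name])
        return base + bonus
    return max(candidates, key=score)[0]

def transcribe(segments):
    """Decode segments in order, letting earlier segments bias later ones."""
    weights = {name: 0.0 for name in CONTEXT_PHRASES}
    pieces = []
    for candidates in segments:
        best = rescore(candidates, weights)
        pieces.append(best)
        weights = adjust_weights(weights, detect_contexts(best))
    return " ".join(pieces)

# Each segment is a list of (candidate text, made-up log score) pairs.
segments = [
    [("directions to", -1.0), ("erections two", -1.5)],
    [("mane treat", -2.0), ("main street", -2.2)],
]
print(transcribe(segments))  # directions to main street
```

In this toy run the second segment's acoustically preferred candidate ("mane treat") loses to "main street" only because the first segment raised the weight of the hypothetical navigation context.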

2.

Patent application
APPLYING NEURAL NETWORK LANGUAGE MODELS TO WEIGHTED FINITE STATE TRANSDUCERS FOR AUTOMATIC SPEECH RECOGNITION (Pending, published)

Publication No.: US20180374484A1

Publication date: 2018-12-27

Application No.: US16035513

Filing date: 2018-07-13

Applicant: Apple Inc.

Inventors: Rongqing HUANG, Ilya OPARIN

IPC classification: G10L15/28, G10L15/193, G10L15/197, G10L15/14, G10L15/16, G10L15/08

CPC classification: G10L15/285, G10L15/142, G10L15/16, G10L15/193, G10L15/197, G10L2015/085

Abstract: Systems and processes for converting speech-to-text are provided. In one example process, speech input can be received. A sequence of states and arcs of a weighted finite state transducer (WFST) can be traversed. A negating finite state transducer (FST) can be traversed. A virtual FST can be composed using a neural network language model and based on the sequence of states and arcs of the WFST. The one or more virtual states of the virtual FST can be traversed to determine a probability of a candidate word given one or more history candidate words. Text corresponding to the speech input can be determined based on the probability of the candidate word given the one or more history candidate words. An output can be provided based on the text corresponding to the speech input.

3.

Patent application
System and Method of Lattice-Based Search for Spoken Utterance Retrieval (Pending, published)

Publication No.: US20180253490A1

Publication date: 2018-09-06

Application No.: US15972842

Filing date: 2018-05-07

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Murat Saraclar, Richard William Sproat

IPC classification: G06F17/30, G10L15/14, G10L15/197, G10L13/00, G10L15/08

CPC classification: G06F16/632, G10L13/00, G10L15/08, G10L15/142, G10L15/197, G10L2015/085

Abstract: A system and method are disclosed for retrieving audio segments from a spoken document. The spoken document preferably is one having moderate word error rates such as telephone calls or teleconferences. The method comprises converting speech associated with a spoken document into a lattice representation and indexing the lattice representation of speech. These steps are typically performed off-line. Upon receiving a query from a user, the method further comprises searching the indexed lattice representation of speech and returning retrieved audio segments from the spoken document that match the user query.
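The offline-index and online-search split described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented method: a lattice is flattened to a list of word arcs, and the `Arc` fields, the `min_posterior` cutoff, and the sample values are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple

class Arc(NamedTuple):
    """One word hypothesis in a (flattened) lattice; field names are assumptions."""
    word: str
    start: float      # seconds
    end: float        # seconds
    posterior: float  # probability mass assigned to this arc

def build_index(lattices):
    """Offline step: map each word to the (utterance_id, arc) pairs where it occurs."""
    index = defaultdict(list)
    for utt_id, arcs in lattices.items():
        for arc in arcs:
            index[arc.word.lower()].append((utt_id, arc))
    return index

def search(index, query, min_posterior=0.3):
    """Online step: return (utterance_id, start, end, posterior) hits, best first."""
    hits = [
        (utt_id, arc.start, arc.end, arc.posterior)
        for utt_id, arc in index.get(query.lower(), [])
        if arc.posterior >= min_posterior
    ]
    return sorted(hits, key=lambda hit: hit[3], reverse=True)

# Competing hypotheses ("budget" vs "bucket") for the same time span are both
# kept, which is what distinguishes indexing a lattice from indexing 1-best text.
lattices = {
    "call-1": [
        Arc("budget", 1.2, 1.6, 0.55),
        Arc("bucket", 1.2, 1.6, 0.40),
        Arc("review", 1.6, 2.1, 0.90),
    ],
    "call-2": [Arc("budget", 4.0, 4.5, 0.20)],
}
index = build_index(lattices)
print(search(index, "budget"))  # [('call-1', 1.2, 1.6, 0.55)]
```

Because the lower-ranked hypothesis "bucket" is also indexed, a query for it would still return the 1.2 to 1.6 second span of `call-1`, which a transcript-only index would miss.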

4.

Patent application
VOICE RECOGNITION SYSTEM (Pending, published)

Publication No.: US20180190293A1

Publication date: 2018-07-05

Application No.: US15910872

Filing date: 2018-03-02

Applicant: Google LLC

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar

IPC classification: G10L15/26, G06F17/30, G10L15/04

CPC classification: G10L15/26, G06F17/30755, G10L15/04, G10L15/183, G10L15/19, G10L15/197, G10L15/22, G10L2015/085

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for voice recognition. In one aspect, a method includes the actions of receiving a voice input; determining a transcription for the voice input, wherein determining the transcription for the voice input includes, for a plurality of segments of the voice input: obtaining a first candidate transcription for a first segment of the voice input; determining one or more contexts associated with the first candidate transcription; adjusting a respective weight for each of the one or more contexts; and determining a second candidate transcription for a second segment of the voice input based in part on the adjusted weights; and providing the transcription of the plurality of segments of the voice input for output.

5.

Patent application
DISAMBIGUATION OF VEHICLE SPEECH COMMANDS (Pending, published)

Publication No.: US20170323635A1

Publication date: 2017-11-09

Application No.: US15146256

Filing date: 2016-05-04

Applicant: GM Global Technology Operations LLC

Inventors: Xufang ZHAO, Gaurav TALWAR

IPC classification: G10L15/08, G10L25/54, G06F3/16, G10L15/22

CPC classification: G10L15/08, G06F3/167, G10L15/22, G10L25/54, G10L2015/085, G10L2015/088, G10L2015/223

Abstract: A system and method of recognizing speech in a vehicle. The method includes receiving a voice command at the vehicle via a microphone in the vehicle, and obtaining a recognition result from speech recognition performed on the received voice command. The recognition result may represent the voice command and be indicative of any of two or more available vehicle commands. The method may further include selecting one of the two or more available vehicle commands based on a secondary characteristic and an attribute of the selected one of the vehicle commands. The system may be implemented as vehicle electronics that include a microphone located within the vehicle and configured to receive a voice command from a user located within the vehicle, and a controller in communication with the microphone. The controller may be configured to perform speech recognition on the voice command and obtain a disambiguated recognition result.

6.

Granted patent
Method and system for order-free spoken term detection (In force)

Publication No.: US09704482B2

Publication date: 2017-07-11

Application No.: US14644817

Filing date: 2015-03-11

Applicant: International Business Machines Corporation

Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon

IPC classification: G10L15/00, G10L15/193, G10L15/08

CPC classification: G10L15/193, G10L2015/085, G10L2015/088

Abstract: A method for spoken term detection, comprising: generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system; generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance i; receiving a plurality of keyword queries; and searching the index for a plurality of keyword hits.

7.

Granted patent
Word cloud audio navigation (In force)

Publication No.: US09679567B2

Publication date: 2017-06-13

Application No.: US14610798

Filing date: 2015-01-30

Applicant: Avaya Inc.

Inventors: Michael Doyle, Thomas Greenwood

IPC classification: H04N5/765, H04N9/80, G10L15/26, G11B27/32, H04L29/06, H04M3/42, G10L15/08, G10L21/10, G10L25/57, H04N7/14, H04N21/858

CPC classification: G10L15/265, G10L15/08, G10L15/083, G10L21/10, G10L25/57, G10L2015/085, G11B27/322, H04L63/06, H04L63/30, H04L65/1076, H04M3/42221, H04M2201/38, H04M2201/40, H04N7/141, H04N21/8586

Abstract: The present invention is directed generally to linking a collection of words and/or phrases with locations in a video and/or audio stream where the words and/or phrases occur and/or associations of a collection of words and/or phrases with a call history.

8.

Patent application
APPLYING NEURAL NETWORK LANGUAGE MODELS TO WEIGHTED FINITE STATE TRANSDUCERS FOR AUTOMATIC SPEECH RECOGNITION (Pending, published)

Publication No.: US20170162203A1

Publication date: 2017-06-08

Application No.: US15156161

Filing date: 2016-05-16

Applicant: Apple Inc.

Inventors: Rongqing HUANG, Ilya OPARIN

IPC classification: G10L15/28, G10L15/14, G10L15/16, G10L15/197

CPC classification: G10L15/285, G10L15/142, G10L15/16, G10L15/193, G10L15/197, G10L2015/085

Abstract: Systems and processes for converting speech-to-text are provided. In one example process, speech input can be received. A sequence of states and arcs of a weighted finite state transducer (WFST) can be traversed. A negating finite state transducer (FST) can be traversed. A virtual FST can be composed using a neural network language model and based on the sequence of states and arcs of the WFST. The one or more virtual states of the virtual FST can be traversed to determine a probability of a candidate word given one or more history candidate words. Text corresponding to the speech input can be determined based on the probability of the candidate word given the one or more history candidate words. An output can be provided based on the text corresponding to the speech input.

9.

Granted patent
Methods, apparatus and computer programs for automatic speech recognition (In force)

Publication No.: US09502024B2

Publication date: 2016-11-22

Application No.: US14191176

Filing date: 2014-02-26

Applicant: Nuance Communications, Inc.

Inventors: John Brian Pickering, Timothy David Poultney, Benjamin Terrick Staniford, Matthew Whitbourne

IPC classification: G10L15/00, G10L15/08

CPC classification: G10L15/08, G10L2015/025, G10L2015/085

Abstract: An automatic speech recognition (ASR) system includes a speech-responsive application and a recognition engine. The ASR system generates user prompts to elicit certain spoken inputs, and the speech-responsive application performs operations when the spoken inputs are recognized. The recognition engine compares sounds within an input audio signal with phones within an acoustic model, to identify candidate matching phones. A recognition confidence score is calculated for each candidate matching phone, and the confidence scores are used to help identify one or more likely sequences of matching phones that appear to match a word within the grammar of the speech-responsive application. The per-phone confidence scores are evaluated against predefined confidence score criteria (for example, identifying scores below a ‘low confidence’ threshold) and the results of the evaluation are used to influence subsequent selection of user prompts. One such system uses confidence scores to select prompts for targeted recognition training, encouraging input of sounds identified as having low confidence scores. Another system selects prompts to discourage input of sounds that were not easily recognized.
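The two prompt-selection behaviours in this abstract (training on weak sounds versus steering away from them) can be sketched as below. This is a hypothetical illustration only: the 0.5 threshold, the use of a mean score per phone, the phone labels, and the prompt table are assumptions and do not come from the patent.

```python
def low_confidence_phones(phone_scores, threshold=0.5):
    """Return phones whose mean confidence falls below the threshold."""
    means = {phone: sum(s) / len(s) for phone, s in phone_scores.items() if s}
    return {phone for phone, mean in means.items() if mean < threshold}

def select_prompt(prompts, weak_phones, mode="train"):
    """Choose a prompt by how many weak phones its expected reply contains.

    mode="train" prefers prompts that elicit weak phones;
    mode="avoid" prefers prompts that steer away from them.
    """
    def overlap(prompt):
        return len(set(prompts[prompt]) & weak_phones)
    chooser = max if mode == "train" else min
    return chooser(prompts, key=overlap)

# Made-up per-phone confidence history from earlier recognitions.
phone_scores = {"s": [0.9, 0.8], "th": [0.3, 0.4], "r": [0.45, 0.35]}

# Hypothetical prompts mapped to the phones of the replies they elicit.
prompts = {
    "Say 'yes' or 'no'": ["y", "eh", "s", "n", "ow"],
    "Say 'three'": ["th", "r", "iy"],
}

weak = low_confidence_phones(phone_scores)
print(sorted(weak))                           # ['r', 'th']
print(select_prompt(prompts, weak, "train"))  # Say 'three'
print(select_prompt(prompts, weak, "avoid"))  # Say 'yes' or 'no'
```

A single `mode` switch is used here only to show both variants side by side; the abstract presents them as separate systems.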


10.

Patent application
METHODS, APPARATUS AND COMPUTER PROGRAMS FOR AUTOMATIC SPEECH RECOGNITION (In force)

Publication No.: US20140249816A1

Publication date: 2014-09-04

Application No.: US14191176

Filing date: 2014-02-26

Applicant: Nuance Communications, Inc.

Inventors: John Brian Pickering, Timothy David Poultney, Benjamin Terrick Staniford, Matthew Whitbourne

IPC classification: G10L15/08

CPC classification: G10L15/08, G10L2015/025, G10L2015/085

Abstract: An automatic speech recognition (ASR) system includes a speech-responsive application and a recognition engine. The ASR system generates user prompts to elicit certain spoken inputs, and the speech-responsive application performs operations when the spoken inputs are recognised. The recognition engine compares sounds within an input audio signal with phones within an acoustic model, to identify candidate matching phones. A recognition confidence score is calculated for each candidate matching phone, and the confidence scores are used to help identify one or more likely sequences of matching phones that appear to match a word within the grammar of the speech-responsive application. The per-phone confidence scores are evaluated against predefined confidence score criteria (for example, identifying scores below a ‘low confidence’ threshold) and the results of the evaluation are used to influence subsequent selection of user prompts. One such system uses confidence scores to select prompts for targeted recognition training, encouraging input of sounds identified as having low confidence scores. Another system selects prompts to discourage input of sounds that were not easily recognised.

