Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Alok Ulhas Parlikar"

1.

Invention grant
Active learning for lexical annotations (status: in force)

Publication (announcement) number: US09508341B1

Publication (announcement) date: 2016-11-29

Application number: US14476075

Application date: 2014-09-03

Applicant: Amazon Technologies, Inc.

Inventor: Alok Ulhas Parlikar, Andrew Jake Rosenbaum, Jeffrey Paul Lilly, Jeffrey Penrod Adams

IPC: G10L15/18, G10L13/00

CPC classification number: G10L15/18, G10L13/00, G10L15/187

Abstract: Features are disclosed for active learning to identify the words which are likely to improve the guessing and automatic speech recognition (ASR) after manual annotation. When a speech recognition system needs pronunciations for words, a lexicon is typically used. For unknown words, pronunciation-guessing (G2P) may be included to provide pronunciations in an unattended (e.g., automatic) fashion. However, having manually (e.g., by a human) annotated pronunciations provides better ASR than having automatic pronunciations that may, in some instances, be wrong. The included active learning features help to direct these limited annotation resources.

2.

Invention grant
Predicting pronunciation in speech recognition (status: in force)

Publication (announcement) number: US10339920B2

Publication (announcement) date: 2019-07-02

Application number: US14196055

Application date: 2014-03-04

Applicant: Amazon Technologies, Inc.

Inventor: Jeffrey Penrod Adams, Alok Ulhas Parlikar, Jeffrey Paul Lilly, Ariya Rastrow

IPC: G10L15/187, G10L13/08, G10L15/02, G10L15/08

Abstract: An automatic speech recognition (ASR) device may be configured to predict pronunciations of textual identifiers (for example, song names, etc.) based on predicting one or more languages of origin of the textual identifier. The one or more languages of origin may be determined based on the textual identifier. The pronunciations may include a hybrid pronunciation including a pronunciation in one language, a pronunciation in a second language and a hybrid pronunciation that combines multiple languages. The pronunciations may be added to a lexicon and matched to the content item (e.g., song) and/or textual identifier. The ASR device may receive a spoken utterance from a user requesting the ASR device to access the content item. The ASR device determines whether the spoken utterance matches one of the pronunciations of the content item in the lexicon. The ASR device then accesses the content when the spoken utterance matches one of the potential textual identifier pronunciations.

3.

Invention application
PREDICTING PRONUNCIATION IN SPEECH RECOGNITION (status: under examination, published)

Publication (announcement) number: US20150255069A1

Publication (announcement) date: 2015-09-10

Application number: US14196055

Application date: 2014-03-04

Applicant: Amazon Technologies, Inc.

Inventor: Jeffrey Penrod Adams, Alok Ulhas Parlikar, Jeffrey Paul Lilly, Ariya Rastrow

IPC: G10L17/22, G10L15/08

CPC classification number: G10L15/08, G06F17/275, G10L13/08, G10L15/187, G10L2015/025

Abstract: An automatic speech recognition (ASR) device may be configured to predict pronunciations of textual identifiers (for example, song names, etc.) based on predicting one or more languages of origin of the textual identifier. The one or more languages of origin may be determined based on the textual identifier. The pronunciations may include a hybrid pronunciation including a pronunciation in one language, a pronunciation in a second language and a hybrid pronunciation that combines multiple languages. The pronunciations may be added to a lexicon and matched to the content item (e.g., song) and/or textual identifier. The ASR device may receive a spoken utterance from a user requesting the ASR device to access the content item. The ASR device determines whether the spoken utterance matches one of the pronunciations of the content item in the lexicon. The ASR device then accesses the content when the spoken utterance matches one of the potential textual identifier pronunciations.
