-
Publication No.: US10325598B2
Publication date: 2019-06-18
Application No.: US15645918
Filing date: 2017-07-10
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye, Hugh Evan Secker-Walker, Tony David, Reinhard Kneser, Jeffrey Penrod Adams, Stan Weidner Salvador, Mahesh Krishnamoorthy
Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternatively, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
-
Publication No.: US09978359B1
Publication date: 2018-05-22
Application No.: US14098677
Filing date: 2013-12-06
Applicant: Amazon Technologies, Inc.
Abstract: A text-to-speech (TTS) processing system may be configured for iterative processing. Speech units for unit selection may be tagged according to extra segmental features, such as emotional features, dramatic features, etc. Preliminary TTS results based on input text may be provided to a user through a user interface. The user may offer corrections to the preliminary results. Those corrections may correspond to the extra segmental features. The user corrections may then be input into the TTS system along with the input text to provide refined TTS results. This process may be repeated iteratively to obtain desired TTS results.
-
Publication No.: US09953630B1
Publication date: 2018-04-24
Application No.: US13907178
Filing date: 2013-05-31
Applicant: Amazon Technologies, Inc.
Inventor: Colleen Maree Aubrey, Jeffrey Penrod Adams
IPC: G06F17/20, G06F17/28, G06F17/27, G06F3/00, G06F3/048, G06F15/173, G06F15/16, G10L21/00, G10L25/00, G10L15/00, G10L15/26, G10L15/18, G10L15/04, G10L17/00, G10L15/16, G10L19/00, H04N7/00, H04N7/14, G09G5/00
CPC classification number: G10L15/005, G06F9/454
Abstract: A computing device reduces the complexity of setting a preferred language on the computing device based on verbal communications with a user. The device may detect when a user is having difficulty navigating the device in a current language and detect the language spoken by the user to cause a language setting to change. The computing device may cross-reference other information associated with the user, such as other applications or content, when selecting a preferred language.
-
Publication No.: US09922639B1
Publication date: 2018-03-20
Application No.: US13739826
Filing date: 2013-01-11
Applicant: Amazon Technologies, Inc.
Inventor: Gilles Jean Roger Belin, Charles S. Rogers, III, Robert David Owen, Jeffrey Penrod Adams, Rajiv Ramachandran, Gregory Michael Hart
CPC classification number: G10L15/00, G10L15/22, G10L15/26, G10L2015/221
Abstract: An interactive system may be implemented in part by an audio device located within a user environment, which may accept speech commands from a user and may also interact with the user by means of generated speech. In order to improve performance of the interactive system, a user may use a separate device, such as a personal computer or mobile device, to access a graphical user interface that lists details of historical speech interactions. The graphical user interface may be configured to allow the user to provide feedback and/or corrections regarding the details of specific interactions.
-
Publication No.: US20150255069A1
Publication date: 2015-09-10
Application No.: US14196055
Filing date: 2014-03-04
Applicant: Amazon Technologies, Inc.
Inventor: Jeffrey Penrod Adams, Alok Ulhas Parlikar, Jeffrey Paul Lilly, Ariya Rastrow
CPC classification number: G10L15/08, G06F17/275, G10L13/08, G10L15/187, G10L2015/025
Abstract: An automatic speech recognition (ASR) device may be configured to predict pronunciations of textual identifiers (for example, song names) based on predicting one or more languages of origin of the textual identifier. The one or more languages of origin may be determined based on the textual identifier. The pronunciations may include a pronunciation in one language, a pronunciation in a second language, and a hybrid pronunciation that combines multiple languages. The pronunciations may be added to a lexicon and matched to the content item (e.g., song) and/or textual identifier. The ASR device may receive a spoken utterance from a user requesting the ASR device to access the content item. The ASR device determines whether the spoken utterance matches one of the pronunciations of the content item in the lexicon. The ASR device then accesses the content when the spoken utterance matches one of the potential textual identifier pronunciations.
-
Publication No.: US12008983B1
Publication date: 2024-06-11
Application No.: US17731968
Filing date: 2022-04-28
Applicant: Amazon Technologies, Inc.
Inventor: Gilles Jean Roger Belin, Charles S. Rogers, III, Robert David Owen, Jeffrey Penrod Adams, Rajiv Ramachandran, Gregory Michael Hart
CPC classification number: G10L15/00, G10L15/22, G10L2015/221, G10L15/26
Abstract: An interactive system may be implemented in part by an audio device located within a user environment, which may accept speech commands from a user and may also interact with the user by means of generated speech. In order to improve performance of the interactive system, a user may use a separate device, such as a personal computer or mobile device, to access a graphical user interface that lists details of historical speech interactions. The graphical user interface may be configured to allow the user to provide feedback and/or corrections regarding the details of specific interactions.
-
Publication No.: US11990119B1
Publication date: 2024-05-21
Application No.: US17195127
Filing date: 2021-03-08
Applicant: Amazon Technologies, Inc.
Inventor: Gilles Jean Roger Belin, Charles S. Rogers, III, Robert David Owen, Jeffrey Penrod Adams, Rajiv Ramachandran, Gregory Michael Hart
CPC classification number: G10L15/00, G10L15/22, G10L15/07, G10L2015/221, G10L15/26
Abstract: An interactive system may be implemented in part by an audio device located within a user environment, which may accept speech commands from a user and may also interact with the user by means of generated speech. In order to improve performance of the interactive system, a user may use a separate device, such as a personal computer or mobile device, to access a graphical user interface that lists details of historical speech interactions. The graphical user interface may be configured to allow the user to provide feedback and/or corrections regarding the details of specific interactions.
-
Publication No.: US20230395095A1
Publication date: 2023-12-07
Application No.: US18182811
Filing date: 2023-03-13
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye, Jeffrey Penrod Adams
Abstract: A speech recognition system utilizes automatic speech recognition techniques, such as end-pointing techniques, in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
-
Publication No.: US11037584B2
Publication date: 2021-06-15
Application No.: US16715026
Filing date: 2019-12-16
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye, Jeffrey Penrod Adams
IPC: G10L15/22, G10L25/87, G10L15/00, G10L25/78, G10L21/0216
Abstract: A speech recognition system utilizes automatic speech recognition techniques, such as end-pointing techniques, in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
-
Publication No.: US10950220B1
Publication date: 2021-03-16
Application No.: US16664459
Filing date: 2019-10-25
Applicant: Amazon Technologies, Inc.
Inventor: Gilles Jean Roger Belin, Charles S. Rogers, III, Robert David Owen, Jeffrey Penrod Adams, Rajiv Ramachandran, Gregory Michael Hart
Abstract: An interactive system may be implemented in part by an audio device located within a user environment, which may accept speech commands from a user and may also interact with the user by means of generated speech. In order to improve performance of the interactive system, a user may use a separate device, such as a personal computer or mobile device, to access a graphical user interface that lists details of historical speech interactions. The graphical user interface may be configured to allow the user to provide feedback and/or corrections regarding the details of specific interactions.