-
Publication No.: US11651780B2
Publication date: 2023-05-16
Application No.: US17340431
Filing date: 2021-06-07
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye, Jeffrey Penrod Adams
IPC: G10L15/26, G10L15/00, G10L25/87, G10L25/78, G10L21/0216
CPC classification number: G10L25/87, G10L15/00, G10L25/78, G10L2021/02166
Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
-
Publication No.: US20220059124A1
Publication date: 2022-02-24
Application No.: US17340431
Filing date: 2021-06-07
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye, Jeffrey Penrod Adams
Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
-
Publication No.: US10102850B1
Publication date: 2018-10-16
Application No.: US13775954
Filing date: 2013-02-25
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye, Jeffrey Penrod Adams
IPC: G10L15/22
Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
-
Publication No.: US09996148B1
Publication date: 2018-06-12
Application No.: US13786254
Filing date: 2013-03-05
Applicant: Amazon Technologies, Inc.
IPC: G06F3/01
CPC classification number: G06F3/01
Abstract: Features are disclosed for presenting multiple media items based on one or more rules defining how the items are to be presented. One media item may be presented, and during presentation any number of additional media items may be received or scheduled for presentation. Rules may define which media items have priority over others, which media items may interrupt others or be interrupted, which media items may be delayed or presented early, whether particular media items are time-critical such that they are not to be delayed but rather should take presentation priority over others, etc. Metadata may be associated with particular media items or categories thereof. The metadata can provide details regarding how the rules should be applied to those media items. User feedback may also be obtained, and may affect the further application of the rules.
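Illustrative note (not part of the patent record): the rule-driven presentation described in this abstract could be sketched roughly as follows. The `MediaItem` fields and the `next_action` function are hypothetical names chosen for this sketch, and the single priority comparison is an assumed simplification of the rules and metadata the abstract mentions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaItem:
    # Hypothetical metadata; the abstract only says metadata guides rule application.
    name: str
    priority: int                # higher value takes precedence (assumed convention)
    can_interrupt: bool = False  # may this item interrupt another?
    interruptible: bool = True   # may this item be interrupted?


def next_action(current: Optional[MediaItem], incoming: MediaItem) -> str:
    """Decide what to do with an incoming item while `current` is being presented."""
    if current is None:
        return "present"  # nothing is playing, so present immediately
    if (incoming.can_interrupt
            and current.interruptible
            and incoming.priority > current.priority):
        return "interrupt"  # incoming item takes over presentation
    return "queue"  # otherwise delay the incoming item


music = MediaItem("music", priority=1)
alarm = MediaItem("alarm", priority=10, can_interrupt=True)
print(next_action(music, alarm))
print(next_action(alarm, music))
```

A fuller version would also need the time-critical flag and user feedback that the abstract mentions; both are left out of this sketch.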
-
Publication No.: US11322152B2
Publication date: 2022-05-03
Application No.: US16443160
Filing date: 2019-06-17
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye, Hugh Evan Secker-Walker, Tony David, Reinhard Kneser, Jeffrey Penrod Adams, Stan Weidner Salvador, Mahesh Krishnamoorthy
Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternately, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
-
Publication No.: US10460719B1
Publication date: 2019-10-29
Application No.: US15925397
Filing date: 2018-03-19
Applicant: Amazon Technologies, Inc.
Inventor: Gilles Jean Roger Belin, Charles S. Rogers, III, Robert David Owen, Jeffrey Penrod Adams, Rajiv Ramachandran, Gregory Michael Hart
Abstract: An interactive system may be implemented in part by an audio device located within a user environment, which may accept speech commands from a user and may also interact with the user by means of generated speech. In order to improve performance of the interactive system, a user may use a separate device, such as a personal computer or mobile device, to access a graphical user interface that lists details of historical speech interactions. The graphical user interface may be configured to allow the user to provide feedback and/or corrections regarding the details of specific interactions.
-
Publication No.: US09704486B2
Publication date: 2017-07-11
Application No.: US13711510
Filing date: 2012-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye, Hugh Evan Secker-Walker, Tony David, Reinhard Kneser, Jeffrey Penrod Adams, Stan Weidner Salvador, Mahesh Krishnamoorthy
IPC: G10L15/00, G10L15/04, G10L15/14, G10L15/20, G10L17/00, G10L21/00, G10L25/00, G10L15/28, G10L25/78, G10L15/08, G10L15/30
CPC classification number: G10L15/28, G10L15/30, G10L25/78, G10L2015/088
Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternately, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
-
Publication No.: US09697827B1
Publication date: 2017-07-04
Application No.: US13711478
Filing date: 2012-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Jeffrey Paul Lilly, Ryan Paul Thomas, Jeffrey Penrod Adams
Abstract: Features are disclosed for reducing errors in speech recognition processing. Methods for reducing errors can include receiving multiple speech recognition hypotheses based on an utterance indicative of a command or query of a user and determining a command or query within a grammar having a least amount of difference from one of the speech recognition hypotheses. The determination of the least amount of difference may be based at least in part on a comparison of individual subword units along at least some of the sequence paths of the speech recognition hypotheses and the grammar. For example, the comparison may be performed on the phoneme level instead of the word level.
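Illustrative note (not part of the patent record): one plausible reading of "least amount of difference" at the phoneme level is a minimum edit distance between phoneme sequences. The sketch below assumes plain Levenshtein distance with unit costs and a grammar given as a flat mapping from command to phoneme list; the patent itself compares along sequence paths and may use a different measure. The names `edit_distance` and `closest_command` and the ARPAbet-style symbols are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance over phoneme symbols (unit cost per edit)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_command(
    hypotheses: List[List[str]],
    grammar: Dict[str, List[str]],
) -> Optional[Tuple[str, int]]:
    """Return (command, distance) for the grammar entry nearest to any hypothesis."""
    best: Optional[Tuple[str, int]] = None
    for command, phonemes in grammar.items():
        for hyp in hypotheses:
            d = edit_distance(hyp, phonemes)
            if best is None or d < best[1]:
                best = (command, d)
    return best


# Hypothetical grammar and recognizer hypotheses, for illustration only.
grammar = {
    "play music": ["P", "L", "EY", "M", "Y", "UW", "Z", "IH", "K"],
    "pause music": ["P", "AO", "Z", "M", "Y", "UW", "Z", "IH", "K"],
}
hypotheses = [
    ["P", "L", "EY", "M", "Y", "UW", "S", "IH", "K"],
    ["P", "R", "EY", "M", "Y", "UW", "Z", "IH", "K"],
]
print(closest_command(hypotheses, grammar))
```

Comparing phonemes rather than words lets a hypothesis that is wrong by a single sound still land on the intended command, which is the benefit the abstract points to.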
-
Publication No.: US20200043499A1
Publication date: 2020-02-06
Application No.: US16443160
Filing date: 2019-06-17
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye, Hugh Evan Secker-Walker, Tony David, Reinhard Kneser, Jeffrey Penrod Adams, Stan Weidner Salvador, Mahesh Krishnamoorthy
IPC: G10L15/28
Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternately, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
-
Publication No.: US10339920B2
Publication date: 2019-07-02
Application No.: US14196055
Filing date: 2014-03-04
Applicant: Amazon Technologies, Inc.
Inventor: Jeffrey Penrod Adams, Alok Ulhas Parlikar, Jeffrey Paul Lilly, Ariya Rastrow
IPC: G10L15/187, G10L13/08, G10L15/02, G10L15/08
Abstract: An automatic speech recognition (ASR) device may be configured to predict pronunciations of textual identifiers (for example, song names, etc.) based on predicting one or more languages of origin of the textual identifier. The one or more languages of origin may be determined based on the textual identifier. The pronunciations may include a hybrid pronunciation including a pronunciation in one language, a pronunciation in a second language and a hybrid pronunciation that combines multiple languages. The pronunciations may be added to a lexicon and matched to the content item (e.g., song) and/or textual identifier. The ASR device may receive a spoken utterance from a user requesting the ASR device to access the content item. The ASR device determines whether the spoken utterance matches one of the pronunciations of the content item in the lexicon. The ASR device then accesses the content when the spoken utterance matches one of the potential textual identifier pronunciations.