-
Publication No.: US20200043499A1
Publication Date: 2020-02-06
Application No.: US16443160
Filing Date: 2019-06-17
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye , Hugh Evan Secker-Walker , Tony David , Reinhard Kneser , Jeffrey Penrod Adams , Stan Weidner Salvador , Mahesh Krishnamoorthy
IPC: G10L15/28
Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternately, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
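The keyword-gated activation described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Device` class, its fields, and the `on_audio` method are hypothetical names, and a plain text string stands in for the output of a low-power keyword spotter.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical device model; all names here are illustrative assumptions."""
    keywords: frozenset = frozenset({"wake"})
    network_active: bool = False               # network interface starts powered down
    sent: list = field(default_factory=list)   # stand-in for data sent to a server

    def on_audio(self, spotted_text: str) -> bool:
        """Activate the network module only when a keyword is present.

        `spotted_text` stands in for the result of a low-power keyword
        spotter; a real device would operate on audio samples, not text.
        """
        words = set(spotted_text.lower().split())
        if words & self.keywords:
            self.network_active = True
            self.sent.append(spotted_text)     # stand-in for transmission
            return True
        return False
```

In this sketch, input without a keyword leaves `network_active` false, so the higher-power module is never woken for it.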
-
Publication No.: US10325598B2
Publication Date: 2019-06-18
Application No.: US15645918
Filing Date: 2017-07-10
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye , Hugh Evan Secker-Walker , Tony David , Reinhard Kneser , Jeffrey Penrod Adams , Stan Weidner Salvador , Mahesh Krishnamoorthy
Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternately, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
-
Publication No.: US09818407B1
Publication Date: 2017-11-14
Application No.: US13761812
Filing Date: 2013-02-07
Applicant: Amazon Technologies, Inc.
Inventor: Hugh Evan Secker-Walker , Kenneth John Basye , Nikko Strom , Ryan Paul Thomas
CPC classification number: G10L25/78 , G10L15/04 , G10L15/142 , G10L15/30 , G10L15/32 , G10L25/18 , G10L25/24 , G10L25/87
Abstract: An efficient audio streaming method and apparatus includes a client process implemented on a client or local device and a server process implemented on a remote server or server(s). The client process and server process each have speech recognition components and communicate over a network, and together efficiently manage the detection of speech in an audio signal streamed by the local device to the server for speech recognition and potentially further processing at the server. The client process monitors audio input and in a first detection stage, implements endpointing on the local device to determine when speech is detected. The client process may further determine if a “wakeword” is detected, and then the client process opens a connection and begins streaming audio to the server process via the network. The server process receives the speech audio stream and monitors the audio, implementing endpointing in the server process, to determine when to tell the client process to close the connection and stop streaming audio. The client process continues streaming audio to the server until the server process determines disconnect criteria have been met and tells the client process to stop streaming audio.
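The two-stage flow in this abstract (local detection opens the stream, server-side endpointing closes it) can be sketched as a small state machine. This is only an illustration under assumptions: the `Client`, `Server`, and `run` names are hypothetical, text tokens stand in for audio frames, and the disconnect criterion used here (a run of consecutive non-speech frames) is an assumed example, since the abstract does not specify the criteria.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()       # monitoring audio locally, no connection open
    STREAMING = auto()  # connection open, frames forwarded to the server


class Client:
    """Hypothetical client process; a text token stands in for an audio frame."""

    def __init__(self, wakeword="computer"):
        self.wakeword = wakeword
        self.state = State.IDLE
        self.streamed = []  # frames sent to the server

    def on_frame(self, frame, is_speech):
        if self.state is State.IDLE:
            # First stage: local endpointing plus wakeword check.
            if is_speech and frame == self.wakeword:
                self.state = State.STREAMING
                self.streamed.append(frame)
        else:
            # Keep streaming until the server says to stop.
            self.streamed.append(frame)

    def on_server_stop(self):
        self.state = State.IDLE


class Server:
    """Hypothetical server process with an assumed silence-based criterion."""

    def __init__(self, max_silence=2):
        self.max_silence = max_silence
        self.silence = 0

    def receive(self, is_speech):
        """Return True once the assumed disconnect criterion is met."""
        self.silence = 0 if is_speech else self.silence + 1
        return self.silence >= self.max_silence


def run(frames, client, server):
    """Feed (frame, is_speech) pairs through the client and server."""
    for frame, is_speech in frames:
        client.on_frame(frame, is_speech)
        if client.state is State.STREAMING and server.receive(is_speech):
            client.on_server_stop()
            server.silence = 0
    return client.streamed
```

The point of the sketch is that the decision to stop streaming sits with the server, not the client: the client only returns to `IDLE` when `on_server_stop` is called.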
-
Publication No.: US11978478B2
Publication Date: 2024-05-07
Application No.: US18182811
Filing Date: 2023-03-13
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye , Jeffrey Penrod Adams
IPC: G10L25/87 , G10L15/00 , G10L21/00 , G10L21/0216 , G10L25/78
CPC classification number: G10L25/87 , G10L15/00 , G10L2021/02166 , G10L25/78
Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
-
Publication No.: US10566012B1
Publication Date: 2020-02-18
Application No.: US16158775
Filing Date: 2018-10-12
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye , Jeffrey Penrod Adams
Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
-
Publication No.: US20180096689A1
Publication Date: 2018-04-05
Application No.: US15645918
Filing Date: 2017-07-10
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye , Hugh Evan Secker-Walker , Tony David , Reinhard Kneser , Jeffrey Penrod Adams , Stan Weidner Salvador , Mahesh Krishnamoorthy
CPC classification number: G10L15/28 , G10L15/30 , G10L25/78 , G10L2015/088
Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternately, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
-
Publication No.: US09697828B1
Publication Date: 2017-07-04
Application No.: US14311163
Filing Date: 2014-06-20
Applicant: Amazon Technologies, Inc.
Inventor: Rohit Prasad , Kenneth John Basye , Spyridon Matsoukas , Rajiv Ramachandran , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister
IPC: G10L15/18
CPC classification number: G10L15/18 , G10L15/08 , G10L15/30 , G10L2015/088
Abstract: Features are disclosed for detecting words in audio using environmental information and/or contextual information in addition to acoustic features associated with the words to be detected. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
-
Publication No.: US11322152B2
Publication Date: 2022-05-03
Application No.: US16443160
Filing Date: 2019-06-17
Applicant: Amazon Technologies, Inc.
Inventor: Kenneth John Basye , Hugh Evan Secker-Walker , Tony David , Reinhard Kneser , Jeffrey Penrod Adams , Stan Weidner Salvador , Mahesh Krishnamoorthy
Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternately, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
-
Publication No.: US20210134276A1
Publication Date: 2021-05-06
Application No.: US17090716
Filing Date: 2020-11-05
Applicant: Amazon Technologies, Inc.
Inventor: Rohit Prasad , Kenneth John Basye , Spyridon Matsoukas , Rajiv Ramachandran , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister
Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
-
Publication No.: US20180012593A1
Publication Date: 2018-01-11
Application No.: US15641169
Filing Date: 2017-07-03
Applicant: Amazon Technologies, Inc.
Inventor: Rohit Prasad , Kenneth John Basye , Spyridon Matsoukas , Rajiv Ramachandran , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister
IPC: G10L15/18
CPC classification number: G10L15/18 , G10L15/08 , G10L15/30 , G10L2015/088
Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
-