-
Publication No.: US20230223023A1
Publication date: 2023-07-13
Application No.: US18149181
Filing date: 2023-01-03
Applicant: Amazon Technologies, Inc.
Inventor: Ariya Rastrow, Eli Joshua Fidler, Roland Maximilian Rolf Maas, Nikko Strom, Aaron Eakin, Diamond Bishop, Bjorn Hoffmeister, Sanjeev Mishra
CPC classification number: G10L15/22, G10L15/26, G10L15/1815, G10L2015/088, G10L2015/223, G10L2015/228
Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.
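The two-stage flow in this abstract (a fast interrupt detector followed by a slower, more accurate device-directed classifier) can be illustrated with a minimal sketch. The `Player` class, the `handle_interrupt` function, and the `ducked_volume` value are hypothetical names and numbers chosen for illustration; they do not come from the patent, and the classifier verdict is passed in as a plain boolean rather than computed.

```python
from dataclasses import dataclass, field


@dataclass
class Player:
    """Hypothetical stand-in for the device's audio output state."""
    volume: float = 1.0
    playing: bool = True
    log: list = field(default_factory=list)


def handle_interrupt(player: Player, is_device_directed: bool,
                     ducked_volume: float = 0.2) -> str:
    """Sketch of the accept/reject flow after an interrupt event.

    Assumption: `is_device_directed` is the verdict that a device-directed
    classifier would produce after seeing the whole utterance.
    """
    original_volume = player.volume

    # Stage 1: react quickly to the potential voice command by lowering volume.
    player.volume = ducked_volume
    player.log.append("duck")

    # Stage 2: act on the classifier's verdict.
    if is_device_directed:
        # Accept: end the output audio and hand the audio off for speech processing.
        player.playing = False
        player.log.append("accept")
        return "send_audio_for_speech_processing"

    # Reject: restore the volume and keep playing.
    player.volume = original_volume
    player.log.append("reject")
    return "resume_output"
```

Splitting the decision this way lets the low-latency step be cheap and reversible (a volume change), while the irreversible step (ending playback) waits for the higher-accuracy verdict.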
-
Publication No.: US12236950B2
Publication date: 2025-02-25
Application No.: US18149181
Filing date: 2023-01-03
Applicant: Amazon Technologies, Inc.
Inventor: Ariya Rastrow, Eli Joshua Fidler, Roland Maximilian Rolf Maas, Nikko Strom, Aaron Eakin, Diamond Bishop, Bjorn Hoffmeister, Sanjeev Mishra
Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.
-
Publication No.: US11551685B2
Publication date: 2023-01-10
Application No.: US16822744
Filing date: 2020-03-18
Applicant: Amazon Technologies, Inc.
Inventor: Ariya Rastrow, Eli Joshua Fidler, Roland Maximilian Rolf Maas, Nikko Strom, Aaron Eakin, Diamond Bishop, Bjorn Hoffmeister, Sanjeev Mishra
Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.
-
Publication No.: US20210166686A1
Publication date: 2021-06-03
Application No.: US17070784
Filing date: 2020-10-14
Applicant: Amazon Technologies, Inc.
Inventor: Siu Ming Mok, Joseph Dean Nason Pemberton, Robert David Owen, Diamond Bishop, Eliav Samuel Zimmern Kahan
Abstract: Techniques are described for enabling a device to send further input audio data to a speech processing server following a completed utterance dialog, so that subsequent commands can be invoked without subsequent keywords being spoken. A system receives input audio data corresponding to an utterance from a device upon the device detecting speech corresponding to a keyword. The system performs speech processing on the input audio data to determine a command. The system determines output data responsive to the command and sends it to the device, thus completing operations regarding the utterance. The system may also send an instruction to the device to send to the system further input audio data corresponding to further input audio without the device first detecting a wake command.
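The gating behavior in this abstract (audio is normally sent only after a keyword, but a system instruction can open a keyword-free follow-up) can be sketched as below. The `Device` class and its method names are hypothetical, and the choice to allow exactly one follow-up per instruction is an assumption made for illustration; the abstract does not specify how long the follow-up mode lasts.

```python
class Device:
    """Hypothetical device-side gate deciding whether to send captured audio."""

    def __init__(self) -> None:
        self.follow_up_enabled = False

    def on_system_instruction(self, allow_follow_up: bool) -> None:
        # The system may instruct the device to send further audio
        # without first detecting a wake command.
        self.follow_up_enabled = allow_follow_up

    def should_send_audio(self, keyword_detected: bool) -> bool:
        if keyword_detected:
            return True
        if self.follow_up_enabled:
            # Assumption: each instruction permits a single follow-up utterance.
            self.follow_up_enabled = False
            return True
        return False
```

For example, after `on_system_instruction(True)`, the next call to `should_send_audio(False)` returns `True`, and the one after that returns `False` until a keyword is detected or a new instruction arrives.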
-
Publication No.: US20210142794A1
Publication date: 2021-05-13
Application No.: US17099875
Filing date: 2020-11-17
Applicant: Amazon Technologies, Inc.
Inventor: Lambert Leo Mathias, Bala Murali Krishna Ummaneni, Ryan Scott Aldrich, Diamond Bishop, Ruhi Sarikaya, Chetan Nagaraj Naik
IPC: G10L15/18, G10L15/22, G06F40/30, G06F40/295, G06F16/9032
Abstract: A system for processing user utterances and/or text based queries that tracks entities and other context data of a current dialog between the system and the user and can fill slots for new intents of the dialog by performing statistical processing on previously mentioned entities with respect to current slots to be filled. The system may compare a previously mentioned entity to a current slot to be filled using vector representations, such as word embeddings, of the current utterance, dialog history, current intent, name of an entity under consideration, category of the current slot to be filled, distance between the current dialog turn and the dialog turn that mentioned the entity, and other considerations. The individual vectors may be weighted according to an attention operation and processed by a trained decoder to output a score indicating whether the entity in consideration is relevant to the particular slot. In this manner, slots may be filled using entities from previous dialog turns, thus performing statistical anaphora resolution and leading to improved system performance.
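The scoring step in this abstract (individual vectors weighted by an attention operation, then passed to a decoder that outputs a relevance score) can be illustrated with a small sketch. Everything here is an assumption made for illustration: dot-product attention against a `query` vector, a single linear layer with a sigmoid standing in for the trained decoder, and the function and parameter names. The abstract does not specify the form of the attention operation or of the decoder.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def relevance_score(feature_vectors, query, decoder_weights, decoder_bias=0.0):
    """Score how relevant a candidate entity is to a slot, in (0, 1).

    feature_vectors: hypothetical embeddings (utterance, dialog history,
        intent, entity name, slot category, turn distance, ...), all of
        the same dimension as `query`.
    query: hypothetical attention query vector.
    decoder_weights, decoder_bias: stand-in for a trained decoder.
    """
    # Attention: weight each feature vector by its similarity to the query.
    weights = softmax([dot(v, query) for v in feature_vectors])

    # Pool the feature vectors into one vector using the attention weights.
    dim = len(query)
    pooled = [sum(w * v[i] for w, v in zip(weights, feature_vectors))
              for i in range(dim)]

    # Decoder stand-in: linear layer followed by a sigmoid.
    logit = dot(pooled, decoder_weights) + decoder_bias
    return 1.0 / (1.0 + math.exp(-logit))
```

In use, an entity from an earlier dialog turn would fill the slot when its score exceeds some threshold; the threshold itself is a design choice not given in the abstract.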
-
Publication No.: US20210295833A1
Publication date: 2021-09-23
Application No.: US16822744
Filing date: 2020-03-18
Applicant: Amazon Technologies, Inc.
Inventor: Ariya Rastrow, Eli Joshua Fidler, Roland Maximilian Rolf Maas, Nikko Strom, Aaron Eakin, Diamond Bishop, Bjorn Hoffmeister, Sanjeev Mishra
Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.
-