Device-directed utterance detection

    公开(公告)号:US12236950B2

    公开(公告)日:2025-02-25

    申请号:US18149181

    申请日:2023-01-03

    Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.

    Device-directed utterance detection

    公开(公告)号:US11551685B2

    公开(公告)日:2023-01-10

    申请号:US16822744

    申请日:2020-03-18

    Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.

    SPEECH-BASED ATTENTION SPAN FOR VOICE USER INTERFACE

    公开(公告)号:US20210166686A1

    公开(公告)日:2021-06-03

    申请号:US17070784

    申请日:2020-10-14

    Abstract: Techniques for enabling a device to send to a speech processing server further input audio data following a completed utterance dialog to prevent the need for subsequent keywords to be spoken to invoke subsequent commands are described. A system receives input audio data corresponding to an utterance from a device upon the device detecting speech corresponding to a keyword. The system performs speech processing on the input audio data to determine a command. The system determines output data responsive to the command and sends same to the device, thus completing operations regarding the utterance. The system may also send an instruction to the device to: send to the system further input audio data corresponding to further input audio without the device first detecting a wake command.

    SPEECH PROCESSING DIALOG MANAGEMENT

    公开(公告)号:US20210142794A1

    公开(公告)日:2021-05-13

    申请号:US17099875

    申请日:2020-11-17

    Abstract: A system for processing user utterances and/or text based queries that tracks entities and other context data of a current dialog between the system and the user and can fill slots for new intents of the dialog by performing statistical processing on previously mentioned entities with respect to current slots to be filled. The system may compare a previously mentioned entity to a current slot to be filled using vector representations, such as word embeddings, of the current utterance, dialog history, current intent, name of an entity under consideration, category of the current slot to be filled, distance between the current dialog turn and the dialog turn that mentioned the entity, and other considerations. The individual vectors may be weighted according to an attention operation and processed by a trained decoder to output a score indicating whether the entity in consideration is relevant to the particular slot. In this manner, slots may be filled using entities from previous dialog turns, thus performing statistical anaphora resolution and leading to improved system performance.

    DEVICE-DIRECTED UTTERANCE DETECTION

    公开(公告)号:US20210295833A1

    公开(公告)日:2021-09-23

    申请号:US16822744

    申请日:2020-03-18

    Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.

Patent Agency Ranking