INTERMEDIATE DATA FOR INTER-DEVICE SPEECH PROCESSING

    Publication number: US20240029743A1

    Publication date: 2024-01-25

    Application number: US18206231

    Application date: 2023-06-06

    CPC classification number: G10L17/26 G10L15/183 G10L15/34 G10L15/22

    Abstract: Some speech processing systems may handle some commands on-device rather than sending the audio data to a second device or system for processing. The first device may have limited speech processing capabilities sufficient for handling common language and/or commands, while the second device (e.g., an edge device and/or a remote system) may call on additional language models, entity libraries, skill components, etc. to perform additional tasks. An intermediate data generator may facilitate dividing speech processing operations between devices by generating a stream of data that includes a first-pass ASR output (e.g., a word or sub-word lattice) and other characteristics of the audio data such as whisper detection, speaker identification, media signatures, etc. The second device can perform the additional processing using the data stream; e.g., without using the audio data. Thus, privacy may be enhanced by processing the audio data locally without sending it to other devices/systems.
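As a rough illustration of the intermediate data stream described in this abstract, the following Python sketch packages a first-pass ASR lattice together with audio characteristics into a message that contains no audio. All class and field names (`LatticeArc`, `IntermediateData`, `to_message`) and the JSON encoding are assumptions made for illustration; the abstract does not specify a format.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class LatticeArc:
    """One arc of a hypothetical word or sub-word lattice."""
    start_node: int
    end_node: int
    token: str
    score: float  # assumed log-probability from the first-pass ASR


@dataclass
class IntermediateData:
    """Hypothetical payload sent to the second device instead of audio."""
    lattice: List[LatticeArc]
    whisper_detected: bool = False
    speaker_id: Optional[str] = None
    media_signature: Optional[str] = None

    def to_message(self) -> str:
        # Serialize only derived features; raw audio is never included.
        return json.dumps(asdict(self))


message = IntermediateData(
    lattice=[LatticeArc(0, 1, "play", -0.1), LatticeArc(1, 2, "jazz", -0.4)],
    whisper_detected=True,
).to_message()
```

The second device would decode `message` and run its larger language models over the lattice, which is one way the audio could stay on the first device.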

    Speech interface device with caching component

    Publication number: US10777203B1

    Publication date: 2020-09-15

    Application number: US15934761

    Application date: 2018-03-23

    Abstract: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as a remote ASR result(s) and a remote NLU result(s). The response data from the remote speech processing system may include one or more cacheable status indicators associated with the NLU result(s) and/or remote directive data, which indicate whether the remote NLU result(s) and/or the remote directive data are individually cacheable. A caching component of the speech interface device allows for caching at least some of this cacheable remote speech processing information, and using the cached information locally on the speech interface device when responding to user speech in the future. This allows for responding to user speech, even when the speech interface device is unable to communicate with a remote speech processing system over a wide area network.
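The caching behavior in this abstract can be sketched as follows. This is a minimal Python illustration, not the patented implementation: the `RemoteResponse` shape, the per-item cacheable flags as booleans, and the choice to key the cache on normalized ASR text are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class RemoteResponse:
    """Hypothetical response data from the remote speech processing system."""
    asr_text: str
    nlu_result: dict
    directive: dict
    nlu_cacheable: bool        # cacheable status indicator for the NLU result
    directive_cacheable: bool  # cacheable status indicator for the directive


class ResponseCache:
    """Stores only the items the remote system marked as cacheable."""

    def __init__(self) -> None:
        self._nlu: Dict[str, dict] = {}
        self._directives: Dict[str, dict] = {}

    @staticmethod
    def _key(asr_text: str) -> str:
        # Assumption: normalized ASR text is the cache key.
        return asr_text.strip().lower()

    def store(self, response: RemoteResponse) -> None:
        key = self._key(response.asr_text)
        if response.nlu_cacheable:
            self._nlu[key] = response.nlu_result
        if response.directive_cacheable:
            self._directives[key] = response.directive

    def lookup(self, asr_text: str) -> Tuple[Optional[dict], Optional[dict]]:
        """Return (nlu_result, directive); each is None on a cache miss."""
        key = self._key(asr_text)
        return self._nlu.get(key), self._directives.get(key)


cache = ResponseCache()
cache.store(RemoteResponse(
    asr_text="Turn on the lights",
    nlu_result={"intent": "TurnOn"},
    directive={"name": "SetPower"},
    nlu_cacheable=True,
    directive_cacheable=False,
))
```

When the wide area network is unavailable, the device could call `cache.lookup(...)` with its local ASR output and fall back to local processing for anything that was not cached.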

    Hybrid speech interface device
    Granted invention patent

    Publication number: US12266367B2

    Publication date: 2025-04-01

    Application number: US17234111

    Application date: 2021-04-19

    Abstract: A speech interface device is configured with “hybrid” capabilities, which allows the speech interface device to perform actions in response to user speech, even when the speech interface device is unable to communicate with a remote system over a wide area network (e.g., the Internet). A hybrid request selector of the speech interface device sends audio data representing user speech to both a remote speech processing system and a local speech processing component executing on the speech interface device, and then waits for a response from either or both components. The local speech processing component may start execution based on the audio data and subsequently suspend the execution until further instruction from the hybrid request selector. The hybrid request selector can then determine which response to use, and, depending on which response is chosen, may instruct the local speech processing component to either continue or terminate the suspended execution.
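The flow in this abstract (send audio to both paths, let the local path suspend, then continue or terminate it) might look roughly like the Python sketch below. The names `LocalProcessor`, `hybrid_select`, `remote_ok`, and `remote_down` are hypothetical, and the policy of preferring the remote response whenever one arrives in time is an assumption; the abstract only says the selector determines which response to use.

```python
import asyncio
from typing import Awaitable, Callable


class LocalProcessor:
    """Hypothetical local component that suspends before performing an action."""

    def __init__(self) -> None:
        self.state = "idle"

    async def process(self, audio: bytes) -> str:
        # Local processing produces a plan, then waits for the selector.
        self.state = "suspended"
        return "local-plan"

    def resume(self) -> None:
        self.state = "executed"

    def terminate(self) -> None:
        self.state = "terminated"


async def hybrid_select(
    audio: bytes,
    remote_call: Callable[[bytes], Awaitable[str]],
    local: LocalProcessor,
    remote_timeout_s: float = 0.5,
) -> str:
    # Start the local path concurrently with the remote request.
    local_task = asyncio.create_task(local.process(audio))
    try:
        remote = await asyncio.wait_for(remote_call(audio), remote_timeout_s)
    except (asyncio.TimeoutError, ConnectionError):
        remote = None
    local_plan = await local_task
    if remote is not None:
        local.terminate()  # assumed policy: remote response wins
        return remote
    local.resume()         # no remote response: continue local execution
    return local_plan


async def remote_ok(audio: bytes) -> str:
    return "remote-directive"   # stand-in for a reachable remote system


async def remote_down(audio: bytes) -> str:
    raise ConnectionError("wide area network unavailable")
```

For example, `asyncio.run(hybrid_select(b"...", remote_down, LocalProcessor()))` would fall through to the suspended local plan.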

    Device arbitration by multiple speech processing systems

    Publication number: US12159086B2

    Publication date: 2024-12-03

    Application number: US17993242

    Application date: 2022-11-23

    Abstract: A device can perform device arbitration, even when the device is unable to communicate with a remote system over a wide area network (e.g., the Internet). Upon detecting a wakeword in an utterance, the device can wait a period of time for data to arrive at the device, which, if received, indicates to the device that another speech interface device in the environment detected an utterance. If the device receives data prior to the period of time lapsing, the device can determine the earliest-occurring wakeword based on multiple wakeword occurrence times, and may designate whichever device that detected the wakeword first as the designated device to perform an action with respect to the user speech. To account for differences in sound capture latency between speech interface devices, a pre-calculated time offset value can be applied to wakeword occurrence time(s) during device arbitration.
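The arbitration rule in this abstract can be illustrated with a short Python sketch. The names (`WakewordEvent`, `arbitrate`), the representation of peer data as `(arrival_delay_s, event)` pairs, and the choice to subtract the offset from the occurrence time are assumptions; the abstract states only that a pre-calculated offset is applied to wakeword occurrence times.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class WakewordEvent:
    device_id: str
    occurrence_time_s: float
    # Pre-calculated sound capture latency for this device (assumed sign:
    # subtracting it yields the time the wakeword was actually spoken).
    capture_latency_offset_s: float = 0.0

    @property
    def adjusted_time_s(self) -> float:
        return self.occurrence_time_s - self.capture_latency_offset_s


def arbitrate(
    local: WakewordEvent,
    peers: Iterable[Tuple[float, WakewordEvent]],
    wait_window_s: float,
) -> str:
    """Return the id of the device designated to act on the utterance.

    `peers` holds (arrival_delay_s, event) pairs; data that arrives after
    the wait window has lapsed is ignored.
    """
    candidates = [local]
    candidates += [event for delay, event in peers if delay <= wait_window_s]
    # Earliest adjusted wakeword time wins; device id breaks exact ties.
    winner = min(candidates, key=lambda e: (e.adjusted_time_s, e.device_id))
    return winner.device_id


kitchen = WakewordEvent("kitchen", 10.020, capture_latency_offset_s=0.015)
living_room = WakewordEvent("living-room", 10.010)
```

Here the kitchen device reports a later raw time but has a larger capture latency, so after the offset is applied it is treated as having heard the wakeword first.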

    Predicting on-device command execution

    Publication number: US12046234B1

    Publication date: 2024-07-23

    Application number: US17359932

    Application date: 2021-06-28

    CPC classification number: G10L15/22 G10L15/183 G10L2015/223

    Abstract: Some natural language command processing systems may handle some commands on a user device rather than sending input to another system for processing. Such a system may include an arbitration component for arbitrating between device and/or system processing. The arbitration component may execute in the system and render a device-specific decision as to whether the device will be able to process the input and/or execute the command, based on information known to the system about the device's capabilities. If the arbitration component predicts that the device will not be able to execute the command, the system may execute the command without waiting for a signal from the device. If the arbitration component predicts that the device will be able to execute the command, the system may halt processing to prevent duplicate execution.
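A simplified Python sketch of the system-side decision described in this abstract follows. It assumes the system's knowledge of a device is a set of supported intents, which is a stand-in for whatever capability information the real arbitration component uses; `DeviceProfile`, `SystemAction`, and `decide` are hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class SystemAction(Enum):
    EXECUTE = "execute"  # system runs the command without waiting for the device
    HALT = "halt"        # system stops processing to avoid duplicate execution


@dataclass
class DeviceProfile:
    """Hypothetical record of what the system knows about one device."""
    device_id: str
    supported_intents: Set[str] = field(default_factory=set)


def decide(profile: DeviceProfile, intent: str) -> SystemAction:
    # Predict whether the device can execute the command on its own.
    if intent in profile.supported_intents:
        return SystemAction.HALT
    return SystemAction.EXECUTE


speaker = DeviceProfile("speaker-1", {"SetVolume", "StopPlayback"})
```

With this profile, `decide(speaker, "SetVolume")` predicts on-device execution, while an unsupported intent is handled by the system immediately.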

    SPEECH INTERFACE DEVICE WITH CACHING COMPONENT

    Publication number: US20240249725A1

    Publication date: 2024-07-25

    Application number: US18425465

    Application date: 2024-01-29

    CPC classification number: G10L15/30 G10L15/18 H04L67/5683

    Abstract: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as a remote ASR result(s) and a remote NLU result(s). The response data from the remote speech processing system may include one or more cacheable status indicators associated with the NLU result(s) and/or remote directive data, which indicate whether the remote NLU result(s) and/or the remote directive data are individually cacheable. A caching component of the speech interface device allows for caching at least some of this cacheable remote speech processing information, and using the cached information locally on the speech interface device when responding to user speech in the future. This allows for responding to user speech, even when the speech interface device is unable to communicate with a remote speech processing system over a wide area network.

    Device arbitration by multiple speech processing systems

    Publication number: US11513766B2

    Publication date: 2022-11-29

    Application number: US16895869

    Application date: 2020-06-08

    Abstract: A device can perform device arbitration, even when the device is unable to communicate with a remote system over a wide area network (e.g., the Internet). Upon detecting a wakeword in an utterance, the device can wait a period of time for data to arrive at the device, which, if received, indicates to the device that another speech interface device in the environment detected an utterance. If the device receives data prior to the period of time lapsing, the device can determine the earliest-occurring wakeword based on multiple wakeword occurrence times, and may designate whichever device that detected the wakeword first as the designated device to perform an action with respect to the user speech. To account for differences in sound capture latency between speech interface devices, a pre-calculated time offset value can be applied to wakeword occurrence time(s) during device arbitration.

    Device arbitration by multiple speech processing systems

    Publication number: US10679629B2

    Publication date: 2020-06-09

    Application number: US15948519

    Application date: 2018-04-09

    Abstract: A device can perform device arbitration, even when the device is unable to communicate with a remote system over a wide area network (e.g., the Internet). Upon detecting a wakeword in an utterance, the device can wait a period of time for data to arrive at the device, which, if received, indicates to the device that another speech interface device in the environment detected an utterance. If the device receives data prior to the period of time lapsing, the device can determine the earliest-occurring wakeword based on multiple wakeword occurrence times, and may designate whichever device that detected the wakeword first as the designated device to perform an action with respect to the user speech. To account for differences in sound capture latency between speech interface devices, a pre-calculated time offset value can be applied to wakeword occurrence time(s) during device arbitration.

    DEVICE ARBITRATION BY MULTIPLE SPEECH PROCESSING SYSTEMS

    Publication number: US20190311720A1

    Publication date: 2019-10-10

    Application number: US15948519

    Application date: 2018-04-09

    Abstract: A device can perform device arbitration, even when the device is unable to communicate with a remote system over a wide area network (e.g., the Internet). Upon detecting a wakeword in an utterance, the device can wait a period of time for data to arrive at the device, which, if received, indicates to the device that another speech interface device in the environment detected an utterance. If the device receives data prior to the period of time lapsing, the device can determine the earliest-occurring wakeword based on multiple wakeword occurrence times, and may designate whichever device that detected the wakeword first as the designated device to perform an action with respect to the user speech. To account for differences in sound capture latency between speech interface devices, a pre-calculated time offset value can be applied to wakeword occurrence time(s) during device arbitration.
