Accessory control using keywords
    1.
    Granted invention patent

    Publication No.: US12132952B1

    Publication date: 2024-10-29

    Application No.: US17895661

    Filing date: 2022-08-25

    Abstract: A system configured to use keywords to augment audio playback with visual effects and/or other effects to provide an immersive audio experience. For example, a device can detect a keyword and control a color and intensity of external lights to provide visual feedback. In addition to visual effects, the device can trigger additional effects using smart plugs or other smart devices. In a listening enhancement mode in which the device outputs audio content, the device performs keyword detection by monitoring playback audio data for preconfigured keywords. In a storytelling mode in which a user reads a book out loud, the device may perform keyword detection by monitoring microphone audio data for the preconfigured keywords. Controlling the visual effects in response to keyword detection is enabled by a new pipeline that sends information back from a wakeword engine to an audio processor.
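The two modes in this abstract differ mainly in which audio stream is scanned for keywords. A minimal Python sketch of that dispatch follows. The keyword table, the `LightEffect` type, and every function name are hypothetical and not taken from the patent, and keyword spotting is reduced to a lookup over words that are assumed to be already recognized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LightEffect:
    """Hypothetical description of a visual effect for an external light."""
    color: tuple      # (R, G, B), each 0-255
    intensity: float  # 0.0 (off) to 1.0 (full brightness)


# Assumed preconfigured keywords; the patent abstract does not list any.
KEYWORD_EFFECTS = {
    "thunder": LightEffect((255, 255, 255), 1.0),
    "ocean": LightEffect((0, 80, 255), 0.6),
    "fire": LightEffect((255, 90, 0), 0.8),
}


def audio_source_for_mode(mode):
    """Return which audio stream is monitored for keywords in a given mode."""
    if mode == "listening_enhancement":
        return "playback"    # the device outputs the content and scans it
    if mode == "storytelling":
        return "microphone"  # the user reads aloud and the device listens
    raise ValueError(f"unknown mode: {mode!r}")


def effects_for_words(words):
    """Map recognized words to (keyword, effect) pairs, in order of occurrence."""
    hits = []
    for word in words:
        key = word.lower()
        if key in KEYWORD_EFFECTS:
            hits.append((key, KEYWORD_EFFECTS[key]))
    return hits
```

In this sketch the same `effects_for_words` lookup serves both modes; only the stream chosen by `audio_source_for_mode` changes.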

    Swaddle
    2.
    Design patent
    Swaddle (in force)

    Publication No.: USD1045337S1

    Publication date: 2024-10-08

    Application No.: US29871611

    Filing date: 2023-02-23

    Applicant: Joy Chopak

    Designer: Joy Chopak

    Abstract: FIG. 1 is a front perspective view of a swaddle showing my new design;
    FIG. 2 is a front elevation view thereof;
    FIG. 3 is a rear elevation view thereof;
    FIG. 4 is a right side elevation view thereof;
    FIG. 5 is a left side elevation view thereof;
    FIG. 6 is the top plan view thereof;
    FIG. 7 is the bottom plan view thereof; and,
    FIG. 8 is a front perspective view of the swaddle shown in an open configuration.
    All of the broken lines illustrate portions of the swaddle that form no part of the claimed design.

    Sound source localization using acoustic wave decomposition
    3.

    Publication No.: US12101599B1

    Publication date: 2024-09-24

    Application No.: US17952806

    Filing date: 2022-09-26

    Inventor: Mohamed Mansour

    IPC classification: H04R3/00 H04R1/40

    CPC classification: H04R1/406 H04R3/005

    Abstract: Disclosed are techniques for an improved method for performing sound source localization (SSL) to determine a direction of arrival of an audible sound using a combination of timing information and amplitude information. For example, a device may decompose an observed sound field into directional components, then estimate a time-delay likelihood value and an energy-based likelihood value for each of the directional components. Using a combination of these likelihood values, the device can determine the direction of arrival corresponding to a maximum likelihood value. In some examples, the device may perform Acoustic Wave Decomposition processing to determine the directional components. In order to reduce a processing consumption associated with performing AWD processing, the device splits this process into two phases: a search phase that selects a subset of a device dictionary to reduce a complexity, and a decomposition phase that solves an optimization problem using the subset of the device dictionary.
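Two ideas in this abstract lend themselves to a short illustration: pruning the device dictionary before decomposition, and combining two per-direction likelihoods to pick a direction of arrival. The Python sketch below makes several assumptions that the abstract does not state: dictionary entries are scored by the magnitude of their inner product with the observation, the likelihoods are log-likelihoods combined by a weighted sum, and all function names are hypothetical. The decomposition phase's optimization problem is not shown.

```python
def search_phase(dictionary, observation, k):
    """Select the k dictionary directions that best match the observation.

    `dictionary` maps a direction label to a steering vector (a list of
    real or complex numbers). Scoring by inner-product magnitude is an
    assumption made for this sketch.
    """
    def score(vector):
        return abs(sum(d.conjugate() * y for d, y in zip(vector, observation)))

    ranked = sorted(dictionary, key=lambda label: score(dictionary[label]),
                    reverse=True)
    return ranked[:k]


def pick_direction(directions, delay_loglik, energy_loglik, weight=0.5):
    """Return (direction, combined score) with the largest combined value.

    `weight` balances the time-delay term against the energy-based term;
    a weighted sum of log-likelihoods is an assumed combination rule.
    """
    if not directions:
        raise ValueError("at least one direction is required")
    if not (len(directions) == len(delay_loglik) == len(energy_loglik)):
        raise ValueError("all inputs must have the same length")
    combined = [weight * t + (1.0 - weight) * e
                for t, e in zip(delay_loglik, energy_loglik)]
    best = max(range(len(combined)), key=combined.__getitem__)
    return directions[best], combined[best]
```

The search phase shrinks the candidate set so that the more expensive decomposition step only has to consider `k` directions rather than the full dictionary.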

    Multiple virtual assistants
    4.
    Granted invention patent

    Publication No.: US12087299B2

    Publication date: 2024-09-10

    Application No.: US18085763

    Filing date: 2022-12-21

    Abstract: A speech-processing system may provide access to multiple virtual assistants via one or more voice-controlled devices. Each assistant may leverage language processing and language generation features of the speech-processing system, while handling different commands and/or providing access to different back applications. Different assistants may be available for use with a particular voice-controlled device based on time, location, the particular user, etc. The voice-controlled device may include components for facilitating user interaction with multiple assistants. For example, a multi-assistant component may facilitate enabling/disabling assistants, assigning gestures and/or wakewords, etc. The multi-assistant component may handle routing commands to a command processing subsystem corresponding to an assistant invoked by the command. The voice-controlled device may further include observer components, each configured to monitor the voice-controlled device for invocations of a particular assistant.
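The routing role of the multi-assistant component can be sketched in a few lines of Python. Everything here is illustrative: the class and method names are hypothetical, wakewords are matched as the first word of a text utterance rather than detected in audio, and each assistant's command processing subsystem is reduced to a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Assistant:
    """Hypothetical record for one assistant known to the device."""
    name: str
    wakeword: str
    handler: Callable[[str], str]  # stand-in for a command processing subsystem
    enabled: bool = True


class MultiAssistantComponent:
    """Enables/disables assistants and routes commands by wakeword."""

    def __init__(self):
        self._by_wakeword = {}

    def register(self, assistant: Assistant) -> None:
        self._by_wakeword[assistant.wakeword.lower()] = assistant

    def set_enabled(self, wakeword: str, enabled: bool) -> None:
        self._by_wakeword[wakeword.lower()].enabled = enabled

    def route(self, utterance: str) -> Optional[str]:
        """Send the command to the assistant invoked by the leading wakeword.

        Returns None when no enabled assistant matches.
        """
        first, _, command = utterance.strip().partition(" ")
        assistant = self._by_wakeword.get(first.lower().rstrip(",.!?"))
        if assistant is None or not assistant.enabled:
            return None
        return assistant.handler(command.strip())
```

A disabled assistant is treated the same as an unknown wakeword in this sketch; a real device might instead respond with a prompt to enable it.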

    Speech processing for multiple inputs
    5.

    Publication No.: US12080291B2

    Publication date: 2024-09-03

    Application No.: US17708077

    Filing date: 2022-03-30

    Abstract: This disclosure proposes systems and methods enabling on-device/hybrid processing of speech requests using a hub device. The hub device is capable of receiving audio data from surrounding devices and performing speech processing on the audio data to improve latency and/or provide functionality to other devices within a private network. The hub device may receive multiple requests corresponding to different utterances. If the hub device receives a second utterance while processing a first utterance, the hub device may send an error notification, process the first utterance and the second utterance sequentially, suspend processing of the first utterance to process the second utterance first, send the second utterance to another hub device or remote system, or suspend processing of the first utterance and send the first utterance to the remote system in order to process the second utterance.
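The five ways of handling a second utterance that arrives mid-processing can be written out as an explicit policy table. The Python sketch below is only an illustration: the `Policy` names and action labels are hypothetical, the error notification is assumed to concern the second utterance, and the `resume` step after preemption is an assumption about what happens once the second utterance has been handled.

```python
from enum import Enum, auto


class Policy(Enum):
    """Hypothetical names for the options listed in the abstract."""
    SEND_ERROR = auto()      # reject the second utterance with an error
    SEQUENTIAL = auto()      # finish the first, then process the second
    PREEMPT = auto()         # suspend the first, process the second first
    FORWARD_SECOND = auto()  # send the second to another hub or remote system
    OFFLOAD_FIRST = auto()   # hand the first to the remote system


def plan(policy, first, second):
    """Return an ordered list of (action, utterance) steps for the hub."""
    if policy is Policy.SEND_ERROR:
        return [("process_locally", first), ("send_error", second)]
    if policy is Policy.SEQUENTIAL:
        return [("process_locally", first), ("process_locally", second)]
    if policy is Policy.PREEMPT:
        return [("suspend", first), ("process_locally", second),
                ("resume", first)]
    if policy is Policy.FORWARD_SECOND:
        return [("process_locally", first), ("forward", second)]
    if policy is Policy.OFFLOAD_FIRST:
        return [("suspend", first), ("forward", first),
                ("process_locally", second)]
    raise ValueError(f"unsupported policy: {policy!r}")
```

For example, `plan(Policy.OFFLOAD_FIRST, "u1", "u2")` suspends and forwards `u1` before processing `u2` locally.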

    Multi-device localization
    6.
    Granted invention patent

    Publication No.: US12058509B1

    Publication date: 2024-08-06

    Application No.: US17546567

    Filing date: 2021-12-09

    Abstract: A system configured to create a flexible home theater group using a variety of different devices. To enable the home theater group to generate synchronized audio, the system performs device localization to generate map data, which represents locations of devices in a device map. The map data may include a listening position and/or television, such that the map data is centered on the listening position with the television along a vertical axis. To generate the map data, the system selects a primary device that determines calibration data indicating a sequence when each of the individual devices generates playback audio. The primary device sends the calibration data to secondary devices and each device generates playback audio at a designated time in the sequence, enabling other devices to capture the output audio and determine a relative position of the playback device (for example using angle of arrival and distance information).
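Three steps from this abstract reduce to simple geometry and scheduling: assigning each device a playback slot, turning an angle of arrival and a distance into a relative position, and re-centering the map on the listening position with the television on the vertical axis. The Python sketch below is a rough illustration under stated assumptions: fixed-length time slots, a 2-D map, angles measured counterclockwise from the +x axis, and hypothetical function names throughout.

```python
import math


def calibration_sequence(device_ids, start_s=0.0, slot_s=1.0):
    """Assign each device a playback time so one device plays per slot."""
    return {dev: start_s + i * slot_s for i, dev in enumerate(device_ids)}


def polar_to_xy(angle_deg, distance):
    """Convert an angle of arrival and a distance into x/y offsets.

    Assumed convention: 0 degrees points along +x, angles grow
    counterclockwise.
    """
    theta = math.radians(angle_deg)
    return distance * math.cos(theta), distance * math.sin(theta)


def center_map(positions, listener, television):
    """Re-express positions with the listener at the origin and the TV on +y."""
    lx, ly = positions[listener]
    tx, ty = positions[television]
    # Rotation angle that turns the listener-to-TV direction into the +y axis.
    beta = math.pi / 2 - math.atan2(ty - ly, tx - lx)
    cos_b, sin_b = math.cos(beta), math.sin(beta)
    centered = {}
    for name, (x, y) in positions.items():
        dx, dy = x - lx, y - ly  # translate so the listener is the origin
        centered[name] = (dx * cos_b - dy * sin_b, dx * sin_b + dy * cos_b)
    return centered
```

With a listener at (1, 1) and a television at (3, 1), `center_map` places the television straight ahead on the positive vertical axis and rotates every other device by the same angle.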

    Non-speech input to speech processing system
    7.

    Publication No.: US11990120B2

    Publication date: 2024-05-21

    Application No.: US16902992

    Filing date: 2020-06-16

    Inventor: Travis Grizzel

    Abstract: A system and method for associating motion data with utterance audio data for use with a speech processing system. A device, such as a wearable device, may be capable of capturing utterance audio data and sending it to a remote server for speech processing, for example for execution of a command represented in the utterance. The device may also capture motion data using motion sensors of the device. The motion data may correspond to gestures, such as head gestures, that may be interpreted by the speech processing system to determine and execute commands. The device may associate the motion data with the audio data so the remote server knows what motion data corresponds to what portion of audio data for purposes of interpreting and executing commands. Metadata sent with the audio data and/or motion data may include association data such as timestamps, session identifiers, message identifiers, etc.
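One way a server could use the association data named in this abstract is to match each motion event to the audio chunk from the same session whose time span contains the event. The Python sketch below assumes exactly that rule; the record types, field names, and millisecond timestamps are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioChunk:
    """Hypothetical metadata sent with a portion of utterance audio."""
    session_id: str
    message_id: str
    start_ms: int
    end_ms: int  # exclusive


@dataclass(frozen=True)
class MotionEvent:
    """Hypothetical metadata sent with a detected gesture."""
    session_id: str
    timestamp_ms: int
    gesture: str


def associate(chunks, events):
    """Group gestures under the message_id of the audio chunk they overlap.

    An event matches a chunk when both share a session identifier and the
    event timestamp falls inside the chunk's [start_ms, end_ms) span.
    Events that match no chunk are dropped.
    """
    by_message = {chunk.message_id: [] for chunk in chunks}
    for event in events:
        for chunk in chunks:
            same_session = chunk.session_id == event.session_id
            in_span = chunk.start_ms <= event.timestamp_ms < chunk.end_ms
            if same_session and in_span:
                by_message[chunk.message_id].append(event.gesture)
                break
    return by_message
```

Keeping the session identifier in the match prevents a gesture from one interaction being attached to audio from another that happens to overlap in time.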