Memory allocation for keyword spotting engines

    Publication No.: US11727919B2

    Publication date: 2023-08-15

    Application No.: US17303066

    Filing date: 2021-05-19

    Applicant: Sonos, Inc.

    Abstract: Network microphone devices configured to detect keywords can include microphones for capturing sound samples. Features can be extracted from the sound samples by storing the sound samples in a first portion of a dynamic-access memory block, performing first computations based on spectral coefficients of the sound samples using a second portion of the memory block, and storing results of the first computations as extracted features in a third portion of the memory block. The second and third portions of the memory block can be designated as temporary memory. The extracted features are then processed using a neural network by storing the extracted features in a fourth portion of the memory block, performing second computations on the extracted features using the temporary memory, the second computations comprising computing at least one layer of the neural network, and storing an output of the neural network as a classification in the temporary memory.
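
The portion-by-portion flow in this abstract can be illustrated with a toy sketch. Everything named below (`MemoryBlock`, `spot_keyword`, the portion names, the 16-sample frame, the 4 spectral bins, and the identity-weight "network") is an assumption made for illustration and is not taken from the patent. The sketch only shows one preallocated block whose second and third portions are reused as temporary memory across the feature-extraction and classification stages.

```python
import math

class MemoryBlock:
    """One preallocated buffer carved into named portions (hypothetical helper)."""

    def __init__(self, layout):
        self.portions = {}
        offset = 0
        for name, size in layout:
            self.portions[name] = (offset, size)
            offset += size
        self.data = [0.0] * offset  # allocated once, never resized

    def write(self, name, values):
        start, size = self.portions[name]
        if len(values) > size:
            raise ValueError(f"{name}: {len(values)} values exceed {size} slots")
        self.data[start:start + len(values)] = values

    def read(self, name, count):
        start, _ = self.portions[name]
        return self.data[start:start + count]

N_SAMPLES, N_BINS = 16, 4  # illustrative sizes, not from the patent
# Hypothetical one-layer "network": identity weights, so class k follows bin k.
WEIGHTS = [[1.0 if i == j else 0.0 for j in range(N_BINS)] for i in range(N_BINS)]

def spot_keyword(samples):
    block = MemoryBlock([
        ("samples", N_SAMPLES),    # first portion: raw sound samples
        ("scratch", N_BINS),       # second portion: temporary
        ("features_tmp", N_BINS),  # third portion: temporary
        ("features", N_BINS),      # fourth portion: network input
    ])
    block.write("samples", samples)
    x = block.read("samples", N_SAMPLES)

    # First computations: spectral magnitudes (naive DFT) held in the second portion.
    mags = []
    for k in range(N_BINS):
        re = sum(x[n] * math.cos(2 * math.pi * k * n / N_SAMPLES) for n in range(N_SAMPLES))
        im = sum(x[n] * math.sin(2 * math.pi * k * n / N_SAMPLES) for n in range(N_SAMPLES))
        mags.append(math.hypot(re, im))
    block.write("scratch", mags)

    # Results of the first computations become features in the third portion.
    block.write("features_tmp", [math.log1p(m) for m in block.read("scratch", N_BINS)])

    # Network stage: features move to the fourth portion, freeing the temporaries.
    block.write("features", block.read("features_tmp", N_BINS))
    feats = block.read("features", N_BINS)

    # Second computations: one dense layer, computed in the reused temporary memory.
    block.write("scratch", [sum(w * f for w, f in zip(row, feats)) for row in WEIGHTS])
    scores = block.read("scratch", N_BINS)

    # The classification is stored in temporary memory as well.
    best = max(range(N_BINS), key=lambda i: scores[i])
    block.write("features_tmp", [float(best)])
    return int(block.read("features_tmp", 1)[0])
```

Feeding the function a pure tone centred on bin 2, for example `[math.sin(2 * math.pi * 2 * n / 16) for n in range(16)]`, selects class 2 under these toy weights. A real keyword spotter would use learned weights and a proper filterbank in place of the naive DFT.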

    Automated audio-to-text transcription in multi-device teleconferences

    Publication No.: US11721344B2

    Publication date: 2023-08-08

    Application No.: US18101301

    Filing date: 2023-01-25

    Applicant: Nextiva, Inc.

    CPC classification: G10L15/26 G10L17/00 H04M3/568

    Abstract: A system and method are disclosed for generating a teleconference space for two or more communication devices using a computer coupled with a database and comprising a processor and memory. The computer generates a teleconference space and transmits requests to join the teleconference space to the two or more communication devices. The computer stores in memory identification information, and audiovisual data associated with one or more users, for each of the two or more communication devices. The computer stores audio transcription data, transmitted to the computer by each of the two or more communication devices and associated with one or more communication device users, in the computer memory. The computer merges the audio transcription data from each of the two or more communication devices into a master audio transcript, and transmits the master audio transcript to each of the two or more communication devices.
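
The merge step in this abstract might look like the following minimal sketch. The abstract does not say how the per-device transcripts are ordered or keyed, so the `Segment` fields, the use of a start time in seconds, and the function names `merge_transcripts` and `render` are all assumptions introduced here for illustration.

```python
import heapq
from typing import NamedTuple

class Segment(NamedTuple):
    start: float     # seconds since the teleconference began (assumed field)
    device_id: str   # which communication device sent this segment
    speaker: str
    text: str

def merge_transcripts(per_device):
    """Merge per-device segment lists into one master transcript ordered by start time."""
    # Sort each device's segments first, since heapq.merge expects sorted inputs.
    streams = [sorted(segs, key=lambda s: s.start) for segs in per_device.values()]
    return list(heapq.merge(*streams, key=lambda s: s.start))

def render(master):
    """Format the master transcript as plain text, one segment per line."""
    return "\n".join(f"[{s.start:06.1f}] {s.speaker}: {s.text}" for s in master)

# Example data (invented for the sketch).
per_device = {
    "phone-1": [
        Segment(0.0, "phone-1", "Ana", "Hello everyone."),
        Segment(7.5, "phone-1", "Ana", "Let's start."),
    ],
    "laptop-2": [
        Segment(2.5, "laptop-2", "Ben", "Hi Ana."),
    ],
}
master = merge_transcripts(per_device)
```

In this sketch the master transcript would then be sent back to every device. Timestamp ordering is only one possible merge rule, and clocks on real devices would need to be aligned before it could be relied on.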

    Home graph
    Granted invention patent

    Publication No.: US11676590B2

    Publication date: 2023-06-13

    Application No.: US17077974

    Filing date: 2020-10-22

    Applicant: Sonos, Inc.

    Abstract: Example techniques involve a control hierarchy for a “smart” home having smart appliances and related devices, such as wireless illumination devices, home-automation devices (e.g., thermostats, door locks, etc.), and audio playback devices, among others. An example home includes various rooms in which smart devices might be located. Under the example control hierarchy described herein and referred to as “home graph,” a name of a room (e.g., “Kitchen”) may represent a smart device (or smart devices) within that room. In other words, from the perspective of a user, the smart devices within a room are that room. This hierarchy permits a user to refer to a smart device within a given room by way of the name of the room when controlling smart devices within the home using a voice user interface (VUI) or graphical user interface (GUI).
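
The idea that a room name stands for the devices inside that room can be sketched as a simple lookup. The `Device` and `HomeGraph` classes, the `kind` field, and the case-insensitive matching are assumptions made for this illustration. The patent's actual hierarchy may have more levels than the single room-to-devices mapping shown here.

```python
from dataclasses import dataclass, field

@dataclass
class Device:
    name: str
    kind: str  # e.g. "light", "speaker", "thermostat" (assumed categories)

@dataclass
class HomeGraph:
    # Maps a lower-cased room name to the devices located in that room.
    rooms: dict = field(default_factory=dict)

    def add(self, room, device):
        self.rooms.setdefault(room.lower(), []).append(device)

    def resolve(self, target, kind=None):
        """Return the devices a room name refers to, optionally filtered by kind."""
        devices = self.rooms.get(target.lower(), [])
        return [d for d in devices if kind is None or d.kind == kind]

home = HomeGraph()
home.add("Kitchen", Device("Ceiling lamp", "light"))
home.add("Kitchen", Device("Counter speaker", "speaker"))
home.add("Bedroom", Device("Bedside lamp", "light"))

# A command such as "play music in the Kitchen" would target the kitchen speakers.
kitchen_speakers = home.resolve("Kitchen", kind="speaker")
```

With this layout, a voice or GUI command only needs the room name. The graph decides which devices in that room the command applies to.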

    DISPLAY CONTROL SYSTEM, DISPLAY CONTROL METHOD AND INFORMATION STORAGE MEDIUM

    Publication No.: US20230178081A1

    Publication date: 2023-06-08

    Application No.: US18053364

    Filing date: 2022-11-07

    Abstract: An input relay unit receives speech data indicating a speech entered by a speaker. An input relay unit receives a confirmation request that is output in response to a predetermined operation of the speaker. A character string relay unit controls translation of the speech indicated by the speech data, which has been received before the reception of the confirmation request, to be started in response to the reception of the confirmation request. A display control unit controls a display unit to display a screen including an image obtained by overlaying a character string representing a translation result of a speech indicated by speech data that has been received before the reception of the confirmation request on an image captured by a capturing unit.
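
The timing rule in this abstract, where translation starts only once the confirmation request arrives and covers only the speech received before it, can be sketched as a small buffer. The `SpeechRelay` class, the use of text strings in place of real speech data, the injected `translate` callable, and the dictionary standing in for an overlaid image are all hypothetical simplifications.

```python
class SpeechRelay:
    """Buffers speech until a confirmation request arrives, then translates it."""

    def __init__(self, translate):
        self._translate = translate  # hypothetical translation callable
        self._pending = []           # speech received since the last confirmation

    def receive_speech(self, chunk):
        # Speech is only buffered here; no translation is started yet.
        self._pending.append(chunk)

    def receive_confirmation(self, frame):
        # Translation starts now and covers only what arrived before this call.
        chunks, self._pending = self._pending, []
        caption = self._translate(" ".join(chunks))
        # Stand-in for overlaying the caption on a captured camera frame.
        return {"frame": frame, "caption": caption}

# Toy translator for the example: upper-casing stands in for real translation.
relay = SpeechRelay(translate=str.upper)
relay.receive_speech("good morning")
relay.receive_speech("everyone")
screen = relay.receive_confirmation(frame="camera-frame-001")
```

Speech that arrives after a confirmation is held for the next one, so each displayed caption corresponds to exactly one confirmed utterance.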