Streaming action fulfillment based on partial hypotheses

    Publication No.: US11996101B2

    Publication date: 2024-05-28

    Application No.: US18160342

    Filing date: 2023-01-27

    Applicant: Google LLC

    Abstract: A method for streaming action fulfillment receives audio data corresponding to an utterance where the utterance includes a query to perform an action that requires performance of a sequence of sub-actions in order to fulfill the action. While receiving the audio data, but before receiving an end of speech condition, the method processes the audio data to generate intermediate automated speech recognition (ASR) results, performs partial query interpretation on the intermediate ASR results to determine whether the intermediate ASR results identify an application type needed to perform the action and, when the intermediate ASR results identify a particular application type, performs a first sub-action in the sequence of sub-actions by launching a first application to execute on the user device where the first application is associated with the particular application type. The method, in response to receiving an end of speech condition, fulfills performance of the action.
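
The flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the keyword table, the function names, and the event strings are all hypothetical, and real partial query interpretation would be far richer than a word lookup.

```python
from typing import List, Optional

# Hypothetical table mapping a spoken keyword to an application type.
APP_TYPE_KEYWORDS = {"play": "media", "navigate": "maps", "call": "phone"}


def identify_app_type(partial_text: str) -> Optional[str]:
    """Toy partial query interpretation: return the first application type
    named by any word in an intermediate ASR result, or None."""
    for word in partial_text.lower().split():
        if word in APP_TYPE_KEYWORDS:
            return APP_TYPE_KEYWORDS[word]
    return None


def stream_fulfill(partials: List[str], final_text: str) -> List[str]:
    """Process intermediate ASR results as they arrive, launch an application
    as soon as its type is identified, then fulfill at end of speech."""
    events = []
    launched = None
    for partial in partials:
        # Only the first sub-action (launching the app) runs before end of speech.
        if launched is None:
            launched = identify_app_type(partial)
            if launched is not None:
                events.append(f"launch:{launched}")
    # The end-of-speech condition triggers fulfillment of the full action.
    events.append(f"fulfill:{final_text}")
    return events
```

Calling `stream_fulfill(["play", "play jazz"], "play jazz music")` records the launch on the first partial result, before the utterance has finished, and the fulfillment only after the final text is available.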

    PRESERVING SPEECH HYPOTHESES ACROSS COMPUTING DEVICES AND/OR DIALOG SESSIONS

    Publication No.: US20240169977A1

    Publication date: 2024-05-23

    Application No.: US18430196

    Filing date: 2024-02-01

    Applicant: GOOGLE LLC

    CPC classification number: G10L15/14 G10L15/22 G10L15/26

    Abstract: Implementations can receive, at a computing device, audio data corresponding to a spoken utterance of a user, process the audio data to generate, for one or more parts of the spoken utterance, a plurality of speech hypotheses, select a given one of the speech hypotheses, cause the given one of the speech hypotheses to be incorporated as a portion of a transcription associated with the software application, and store the plurality of speech hypotheses. In some implementations, the plurality of speech hypotheses can be loaded at an additional computing device when the transcription is accessed at the additional computing device. In additional or alternative implementations, the plurality of speech hypotheses can be loaded into memory of the computing device when the software application is reactivated and/or when a subsequent dialog session associated with the transcription is initiated.
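
One way to picture the idea is to store the alternative hypotheses next to the transcription so that they survive a device change or a later session. The sketch below assumes a hypothetical record layout (a list of hypothesis lists, best candidate first) and uses JSON as a stand-in for whatever storage the implementations actually use.

```python
import json
from typing import Dict, List


def build_record(parts: List[List[str]]) -> Dict:
    """Select the first hypothesis of each part for the transcription and
    keep every hypothesis alongside it (hypothetical layout)."""
    transcription = " ".join(hyps[0] for hyps in parts)
    return {"transcription": transcription, "hypotheses": parts}


def save(record: Dict) -> str:
    """Serialize the record, e.g. before the application is closed."""
    return json.dumps(record)


def load(blob: str) -> Dict:
    """Restore the record on another device or in a later session."""
    return json.loads(blob)


def swap(record: Dict, part_index: int, alt_index: int) -> str:
    """Rebuild the transcription with an alternative hypothesis for one part."""
    parts = record["hypotheses"]
    chosen = [hyps[0] for hyps in parts]
    chosen[part_index] = parts[part_index][alt_index]
    return " ".join(chosen)
```

Because the alternatives are stored with the transcription, a correction such as `swap(load(blob), 1, 1)` remains possible after the record has been reloaded elsewhere.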

    PROCESSING CONTINUED CONVERSATIONS OVER MULTIPLE DEVICES

    Publication No.: US20240127799A1

    Publication date: 2024-04-18

    Application No.: US17967183

    Filing date: 2022-10-17

    Applicant: GOOGLE LLC

    CPC classification number: G10L15/08 G10L25/78

    Abstract: Implementations relate to facilitating continued conversations of a user with an automated assistant when the user changes locations relative to one or more devices in an ecosystem of linked assistant devices. The user initially invokes a first device and provides a request, which is processed by the first device. The first device provides a notification to one or more other devices in the ecosystem to indicate that the user is likely to issue a further assistant request. The first device processes subsequent audio data to determine whether the subsequent audio data includes a further assistant request. The one or more other notified devices process device-specific sensor data to determine whether the user is co-present with one of the other devices. If user presence is detected, an indication is provided to the first device, causing the first device to cease processing subsequent audio data. Further, the co-present device starts to process subsequent audio data.
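
The hand-off described in this abstract can be sketched as a small state change across devices. The `Device` class, the `presence` dictionary standing in for device-specific sensor results, and the function name are all assumptions made for illustration.

```python
from typing import Dict, List


class Device:
    """Toy assistant device with two flags (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.notified = False   # told that a further request is likely
        self.listening = False  # currently processing subsequent audio


def hand_off(first: Device, others: List[Device],
             presence: Dict[str, bool]) -> Device:
    """Return the device that ends up processing subsequent audio.

    presence maps a device name to whether its sensors detected the user.
    """
    # The first device keeps listening and notifies the rest of the ecosystem.
    first.listening = True
    for device in others:
        device.notified = True
    # A notified device that detects the user takes over from the first device.
    for device in others:
        if presence.get(device.name, False):
            first.listening = False
            device.listening = True
            return device
    return first
```

If no other device detects the user, the first device simply continues to process subsequent audio.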

    Handling Contradictory Queries on a Shared Device

    Publication No.: US20240119088A1

    Publication date: 2024-04-11

    Application No.: US17938455

    Filing date: 2022-10-06

    Applicant: Google LLC

    CPC classification number: G06F16/632 G06F16/639 G10L17/02 G10L17/06 G10L17/22

    Abstract: A method for handling contradictory queries on a shared device includes receiving a first query issued by a first user, the first query specifying a first long-standing operation for a digital assistant to perform, and while the digital assistant is performing the first long-standing operation, receiving a second query, the second query specifying a second long-standing operation for the digital assistant to perform. The method also includes determining that the second query was issued by another user different than the first user and determining, using a query resolver, that performing the second long-standing operation would conflict with the first long-standing operation. The method further includes identifying one or more compromise operations for the digital assistant to perform, and instructing the digital assistant to perform a selected compromise operation among the identified one or more compromise operations.
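
A minimal sketch of the resolution steps follows. The conflict rule (two users competing for the same output resource), the list of compromise operations, and the choice of the first compromise are assumptions for illustration; the abstract does not specify how the query resolver decides.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Query:
    user: str
    operation: str  # a long-standing operation, e.g. "play_jazz"
    resource: str   # hypothetical: the device output the operation occupies


def conflicts(active: Query, incoming: Query) -> bool:
    """Toy query resolver: a conflict exists when a different user asks for
    an operation that needs the resource already in use."""
    return active.user != incoming.user and active.resource == incoming.resource


def compromise_operations(active: Query, incoming: Query) -> List[str]:
    """Candidate compromises between the two operations (illustrative only)."""
    return [
        f"queue {incoming.operation} after {active.operation}",
        f"alternate {active.operation} and {incoming.operation}",
    ]


def resolve(active: Query, incoming: Query) -> str:
    """Return the operation the assistant is instructed to perform."""
    if not conflicts(active, incoming):
        return incoming.operation
    # Select one compromise; here simply the first candidate.
    return compromise_operations(active, incoming)[0]
```

With `Query("alice", "play_jazz", "speaker")` active, an incoming `Query("bob", "play_rock", "speaker")` resolves to the queued compromise rather than replacing the music already playing.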

    GROUP HOTWORDS

    Publication No.: US20240105178A1

    Publication date: 2024-03-28

    Application No.: US18535701

    Filing date: 2023-12-11

    Applicant: Google LLC

    Abstract: A method includes a first assistant-enabled device (AED) receiving an assignment instruction assigning a group hotword to a selected group of AEDs that includes the first AED and one or more other AEDs. Each AED is configured to wake-up from a low-power state when the group hotword is detected in streaming audio by at least one of the AEDs. The method also includes receiving audio data that corresponds to an utterance spoken by the user and includes a query that specifies an operation to perform. In response to detecting the group hotword in the audio data, the method also includes triggering the first AED to wake-up from the low-power state and executing a collaboration routine to cause the first AED and each other AED in the selected group of AEDs to collaborate with one another to fulfill performance of the operation specified by the query.

    INFERRING SEMANTIC LABEL(S) FOR ASSISTANT DEVICE(S) BASED ON DEVICE-SPECIFIC SIGNAL(S)

    Publication No.: US20240104140A1

    Publication date: 2024-03-28

    Application No.: US18531015

    Filing date: 2023-12-06

    Applicant: GOOGLE LLC

    CPC classification number: G06F16/90332 G10L15/30 G16Y10/80 G16Y40/35

    Abstract: Implementations can identify a given assistant device from among a plurality of assistant devices in an ecosystem, obtain device-specific signal(s) that are generated by the given assistant device, process the device-specific signal(s) to generate candidate semantic label(s) for the given assistant device, select a given semantic label for the given assistant device from among the candidate semantic label(s), and assign, in a device topology representation of the ecosystem, the given semantic label to the given assistant device. Implementations can optionally receive a spoken utterance that includes a query or command at the assistant device(s), determine that a semantic property of the query or command matches the given semantic label assigned to the given assistant device, and cause the given assistant device to satisfy the query or command.
