Correcting speech misrecognition of spoken utterances

    Publication number: US11823664B2

    Publication date: 2023-11-21

    Application number: US17982834

    Application date: 2022-11-08

    Applicant: GOOGLE LLC

    CPC classification number: G10L15/08 G06F3/16 G10L13/02 G10L2015/088

    Abstract: Implementations can receive audio data corresponding to a spoken utterance of a user, process the audio data to generate a plurality of speech hypotheses, determine an action to be performed by an automated assistant based on the speech hypotheses, and cause the computing device to render an indication of the action. In response to the computing device rendering the indication, implementations can receive additional audio data corresponding to an additional spoken utterance of the user, process the additional audio data to determine that a portion of the spoken utterance is similar to an additional portion of the additional spoken utterance, supplant the action with an alternate action, and cause the automated assistant to initiate performance of the alternate action. Some implementations can determine whether to render the indication of the action based on a confidence level associated with the action.
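    The correction step in this abstract can be illustrated with a minimal sketch. It assumes the speech hypotheses are plain strings ranked best first, that similarity between the original and follow-up utterances is measured with `difflib.SequenceMatcher`, and that a similar follow-up means the user is repeating themselves to correct a misrecognition. The function name `pick_alternate` and the 0.6 threshold are hypothetical and do not come from the patent.

```python
from difflib import SequenceMatcher


def pick_alternate(hypotheses, chosen, followup, threshold=0.6):
    """Return a replacement hypothesis, or None if no correction is implied.

    hypotheses: ranked list of transcription strings (best first).
    chosen:     the hypothesis whose action was indicated to the user.
    followup:   transcription of the additional spoken utterance.
    threshold:  assumed similarity cutoff; not specified by the source.
    """
    similarity = SequenceMatcher(None, chosen.lower(), followup.lower()).ratio()
    if similarity < threshold:
        # The follow-up looks like a new request, not a repeated one.
        return None
    # Supplant the chosen hypothesis with the next-ranked alternative.
    for text in hypotheses:
        if text != chosen:
            return text
    return None


ranked = ["call don", "call ron", "call dawn"]
alternate = pick_alternate(ranked, "call don", "call ron")
```

    Falling back to the next-ranked hypothesis is only one possible policy; the abstract does not say how the alternate action is chosen.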

    Hotphrase Triggering Based On A Sequence Of Detections

    Publication number: US20230298588A1

    Publication date: 2023-09-21

    Application number: US18323725

    Application date: 2023-05-25

    Applicant: Google LLC

    Abstract: A method includes receiving audio data corresponding to an utterance spoken by the user and captured by the user device. The utterance includes a command for a digital assistant to perform an operation. The method also includes determining, using a hotphrase detector configured to detect each trigger word in a set of trigger words associated with a hotphrase, whether any of the trigger words in the set of trigger words are detected in the audio data during the corresponding fixed-duration time window. The method also includes identifying, in the audio corresponding to the utterance, the hotphrase when each other trigger word in the set of trigger words was also detected in the audio data. The method also includes triggering an automated speech recognizer to perform speech recognition on the audio data when the hotphrase is identified in the audio data corresponding to the utterance.
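    The sequence-of-detections idea can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: it assumes each detection is a (word, time) pair, that trigger words must appear in order, and that each word must follow the previous one within a single fixed window of `window_s` seconds. The names `Detection` and `hotphrase_identified` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    word: str
    time_s: float


def hotphrase_identified(detections, trigger_words, window_s):
    """Return True if all trigger words were detected in order, each within
    window_s seconds of the previous trigger word."""
    # latest[i] holds the most recent time at which the first i+1 trigger
    # words were matched as a valid chain, or None if never matched.
    latest = [None] * len(trigger_words)
    for det in sorted(detections, key=lambda d: d.time_s):
        # Walk indices in descending order so that one detection cannot
        # extend a chain it has just started (matters for repeated words).
        for i in range(len(trigger_words) - 1, -1, -1):
            if trigger_words[i] != det.word:
                continue
            if i == 0:
                latest[0] = det.time_s
            elif latest[i - 1] is not None and det.time_s - latest[i - 1] <= window_s:
                latest[i] = det.time_s
    return bool(trigger_words) and latest[-1] is not None


detections = [Detection("hey", 0.0), Detection("hey", 5.0), Detection("google", 5.5)]
should_trigger_asr = hotphrase_identified(detections, ["hey", "google"], window_s=1.0)
```

    Tracking the latest valid time per prefix lets a later repetition of an early trigger word restart the chain, so a stale first detection does not block recognition.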

    Automated assistant training and/or execution of inter-user procedures

    Publication number: US11748660B2

    Publication date: 2023-09-05

    Application number: US17028262

    Application date: 2020-09-22

    Applicant: Google LLC

    Abstract: Implementations relate to an automated assistant that can automate repeatedly performed procedures. The automation can involve communicating with different users, organizations, and/or other automated assistants. The automated assistant, with prior permission from respective user(s), can detect repeated performance of a particular series of manually initiated computational actions. Based on this determination, the automated assistant can determine automated assistant computational action(s) that can be performed by the automated assistant in order to reduce latency in performing a procedure, reduce quantity and/or size of transmissions in performing the procedure, and/or reduce an amount of client device resources required for performing the procedure. Such actions can include communicating with an additional automated assistant that may be associated with another user and/or organization. In these and other manners, manually initiated computational actions that include electronic communications amongst users can be converted to backend operations amongst instances of automated assistants to achieve technical benefits.

    Detecting and improving simultaneous navigation sessions on multiple devices

    Publication number: US20230194294A1

    Publication date: 2023-06-22

    Application number: US17057079

    Application date: 2020-09-11

    Applicant: GOOGLE LLC

    CPC classification number: G01C21/3661 G10L15/22 G10L2015/228

    Abstract: A first computing device can implement a method for providing navigation instructions. The method includes initiating a first navigation session for providing a first set of navigation instructions to a user from a starting location to a destination location along a first route. The method also includes detecting a second computing device in proximity to the first computing device, and determining that the second computing device is implementing a second navigation session for providing a second set of navigation instructions to the destination location along a second route. Further, the method includes adjusting the first navigation session in accordance with the second navigation session.

    Decaying automated speech recognition processing results

    Publication number: US11676594B2

    Publication date: 2023-06-13

    Application number: US17111467

    Application date: 2020-12-03

    Applicant: Google LLC

    CPC classification number: G10L15/22 G10L25/78

    Abstract: A method for decaying speech processing includes receiving, at a voice-enabled device, an indication of a microphone trigger event indicating a possible interaction with the device through speech where the device has a microphone that, when open, is configured to capture speech for speech recognition. In response to receiving the indication of the microphone trigger event, the method also includes instructing the microphone to open or remain open for a duration window to capture an audio stream in an environment of the device and providing the audio stream captured by the open microphone to a speech recognition system. During the duration window, the method further includes decaying a level of the speech recognition processing based on a function of the duration window and instructing the speech recognition system to use the decayed level of speech recognition processing over the audio stream captured by the open microphone.
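    One way to picture a processing level that decays as a function of the duration window is a linear ramp. The abstract does not specify the decay function, so the linear shape, the `floor` parameter, and the name `decayed_level` are assumptions made for illustration.

```python
def decayed_level(elapsed_s, window_s, full=1.0, floor=0.0):
    """Linearly decay a processing level from `full` to `floor` over the window.

    elapsed_s: seconds since the microphone trigger event.
    window_s:  length of the open-microphone duration window in seconds.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    # Clamp progress to [0, 1] so the level never leaves [floor, full].
    progress = min(max(elapsed_s / window_s, 0.0), 1.0)
    return floor + (full - floor) * (1.0 - progress)


# Sample the level at a few points in a 10-second window.
levels = [decayed_level(t, 10.0) for t in (0.0, 5.0, 10.0, 15.0)]
```

    A speech recognition system could map this level to a concrete setting, such as a smaller model or a narrower search, but that mapping is outside what the abstract states.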

    Automated assistant for facilitating communications through dissimilar messaging features of different applications

    Publication number: US20230178078A1

    Publication date: 2023-06-08

    Application number: US18103333

    Application date: 2023-01-30

    Applicant: Google LLC

    CPC classification number: G10L15/22 H04L51/56 G10L15/1815 G10L2015/223

    Abstract: Implementations relate to an automated assistant that can respond to communications received via a third party application and/or other third party communication modality. The automated assistant can determine that the user is participating in multiple different conversations via multiple different third party communication services. In some implementations, conversations can be processed to identify particular features of the conversations. When the automated assistant is invoked to provide input to a conversation, the automated assistant can compare the input to the identified conversation features in order to select the particular conversation that is most relevant to the input. In this way, the automated assistant can assist with any of multiple disparate conversations that are each occurring via a different third party application.
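    Selecting the conversation most relevant to an input can be sketched as a feature comparison. The example below assumes conversation features are sets of keywords and scores relevance with Jaccard overlap; both choices, along with the name `select_conversation` and the sample application names, are hypothetical and not drawn from the patent.

```python
def select_conversation(user_input, conversation_features):
    """Return the key of the conversation whose features best match the input.

    conversation_features: dict mapping a conversation name to a set of
    keyword features previously identified for that conversation.
    """
    if not conversation_features:
        return None
    tokens = set(user_input.lower().split())

    def jaccard(features):
        features = {f.lower() for f in features}
        union = tokens | features
        return len(tokens & features) / len(union) if union else 0.0

    # Pick the conversation with the highest overlap score.
    return max(conversation_features, key=lambda name: jaccard(conversation_features[name]))


features = {
    "chat_app_a": {"dinner", "friday", "restaurant"},
    "chat_app_b": {"report", "deadline", "slides"},
}
target = select_conversation("tell them the slides are done before the deadline", features)
```

    Here `target` is "chat_app_b", because the input shares "slides" and "deadline" with that conversation and nothing with the other.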
