-
Publication number: US11748660B2
Publication date: 2023-09-05
Application number: US17028262
Application date: 2020-09-22
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
IPC: G06N20/00, G10L15/22, G06F3/16, G06F3/0488, G06F18/22, G06F18/214
CPC classification number: G06N20/00, G06F3/0488, G06F3/167, G06F18/214, G06F18/22, G10L15/22, G10L2015/223
Abstract: Implementations relate to an automated assistant that can automate repeatedly performed procedures. The automation can involve communicating with different users, organizations, and/or other automated assistants. The automated assistant, with prior permission from respective user(s), can detect repeated performance of a particular series of manually initiated computational actions. Based on this determination, the automated assistant can determine automated assistant computational action(s) that can be performed by the automated assistant in order to reduce latency in performing a procedure, reduce quantity and/or size of transmissions in performing the procedure, and/or reduce an amount of client device resources required for performing the procedure. Such actions can include communicating with an additional automated assistant that may be associated with another user and/or organization. In these and other manners, manually initiated computational actions that include electronic communications amongst users can be converted to backend operations amongst instances of automated assistants to achieve technical benefits.
-
Publication number: US20230267911A1
Publication date: 2023-08-24
Application number: US18309754
Application date: 2023-04-28
Applicant: Google LLC
Inventor: Matthew Sharifi, Jakob Nicolaus Foerster
IPC: G10L13/00, G06F40/253, G06F40/289, G10L13/08
CPC classification number: G10L13/00, G06F40/253, G06F40/289, G10L13/08
Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determine a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generate audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
-
Publication number: US11734287B2
Publication date: 2023-08-22
Application number: US17676615
Application date: 2022-02-21
Applicant: Google LLC
Inventor: Matthew Sharifi, David Petrou, Abhanshu Sharma
IPC: G06F16/2457, G06F16/583, G06F16/58, G06F16/2452, G06F16/903
CPC classification number: G06F16/24578, G06F16/24522, G06F16/583, G06F16/5866, G06F16/90335
Abstract: Methods, systems, and apparatus for receiving a query image, receiving one or more entities that are associated with the query image, identifying, for one or more of the entities, one or more candidate search queries that are pre-associated with the one or more entities, generating a respective relevance score for each of the candidate search queries, selecting, as a representative search query for the query image, a particular candidate search query based at least on the generated respective relevance scores and providing the representative search query for output in response to receiving the query image.
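Illustrative note: the selection step in this abstract (collect the candidate queries pre-associated with the image's entities, score each, return the best) can be sketched as follows. This is a minimal sketch, not the patented method: the function name `select_representative_query`, the dictionary layout, and the use of precomputed relevance scores are all assumptions made for illustration. The abstract does not say how the scores are produced.

```python
from typing import Dict, List, Optional


def select_representative_query(
    entities: List[str],
    candidate_queries: Dict[str, List[str]],
    relevance: Dict[str, float],
) -> Optional[str]:
    """Pick the highest-scoring candidate query for the given entities.

    `candidate_queries` maps an entity to its pre-associated queries and
    `relevance` maps a query to a score; both are hypothetical inputs.
    """
    candidates: List[str] = []
    for entity in entities:
        for query in candidate_queries.get(entity, []):
            if query not in candidates:  # keep first occurrence only
                candidates.append(query)
    if not candidates:
        return None
    # Queries without a known score are treated as least relevant (0.0).
    return max(candidates, key=lambda q: relevance.get(q, 0.0))


entities = ["Eiffel Tower", "Paris"]
candidate_queries = {
    "Eiffel Tower": ["eiffel tower height", "eiffel tower tickets"],
    "Paris": ["paris weather"],
}
relevance = {
    "eiffel tower height": 0.4,
    "eiffel tower tickets": 0.9,
    "paris weather": 0.2,
}
best = select_representative_query(entities, candidate_queries, relevance)
```

With the sample data above, `best` is the query with the largest score among the three candidates.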
-
Publication number: US20230229390A1
Publication date: 2023-07-20
Application number: US18189181
Application date: 2023-03-23
Applicant: Google LLC
Inventor: Jan Althaus, Matthew Sharifi
IPC: G06F3/16, G06F3/0481, G06F3/0484, G10L15/08, G10L15/22
CPC classification number: G06F3/167, G06F3/0481, G06F3/0484, G10L15/08, G10L15/22, G10L2015/088, G10L2015/223
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing hotword recognition and passive assistance are disclosed. In one aspect, a method includes the actions of receiving, by a computing device that is operating in a low-power mode and that includes a display that displays a graphical interface while the computing device is in the low-power mode and that is configured to exit the low-power mode in response to detecting a first hotword, audio data corresponding to an utterance. The method further includes determining that the audio data includes a second, different hotword. The method further includes obtaining a transcription of the utterance by performing speech recognition on the audio data. The method further includes generating an additional user interface. The method further includes providing, for output on the display, the additional graphical interface.
-
Publication number: US20230194294A1
Publication date: 2023-06-22
Application number: US17057079
Application date: 2020-09-11
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi, Victor Carbune
CPC classification number: G01C21/3661, G10L15/22, G10L2015/228
Abstract: A first computing device can implement a method for providing navigation instructions. The method includes initiating a first navigation session for providing a first set of navigation instructions to a user from a starting location to a destination location along a first route. The method also includes detecting a second computing device in proximity to the first computing device, and determining that the second computing device is implementing a second navigation session for providing a second set of navigation instructions to the destination location along a second route. Further, the method includes adjusting the first navigation session in accordance with the second navigation session.
-
Publication number: US11676594B2
Publication date: 2023-06-13
Application number: US17111467
Application date: 2020-12-03
Applicant: Google LLC
Inventor: Matthew Sharifi, Victor Carbune
Abstract: A method for decaying speech processing includes receiving, at a voice-enabled device, an indication of a microphone trigger event indicating a possible interaction with the device through speech where the device has a microphone that, when open, is configured to capture speech for speech recognition. In response to receiving the indication of the microphone trigger event, the method also includes instructing the microphone to open or remain open for a duration window to capture an audio stream in an environment of the device and providing the audio stream captured by the open microphone to a speech recognition system. During the duration window, the method further includes decaying a level of the speech recognition processing based on a function of the duration window and instructing the speech recognition system to use the decayed level of speech recognition processing over the audio stream captured by the open microphone.
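Illustrative note: the abstract says the level of speech recognition processing decays "based on a function of the duration window" but does not name the function. The sketch below assumes a simple linear decay from full processing (1.0) down to a floor value. The function name `decayed_level`, the linear shape, and the `floor` parameter are assumptions for illustration only.

```python
def decayed_level(elapsed_s: float, window_s: float, floor: float = 0.0) -> float:
    """Return a processing level in [floor, 1.0] for a point in the window.

    Assumes linear decay: 1.0 when the microphone opens, `floor` once the
    duration window has fully elapsed. The real decay function is not
    specified in the abstract.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    if not 0.0 <= floor <= 1.0:
        raise ValueError("floor must be between 0.0 and 1.0")
    # Clamp progress so times outside the window map to its endpoints.
    fraction = min(max(elapsed_s / window_s, 0.0), 1.0)
    return 1.0 - (1.0 - floor) * fraction


# Sample the level at a few points of a hypothetical 10-second window.
levels = [decayed_level(t, 10.0) for t in (0.0, 5.0, 10.0, 20.0)]
```

A caller could map the returned level to a concrete setting, for example choosing a smaller recognition model as the level drops; that mapping is likewise outside what the abstract states.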
-
Publication number: US20230178078A1
Publication date: 2023-06-08
Application number: US18103333
Application date: 2023-01-30
Applicant: Google LLC
Inventor: Victor Carbune, Matthew Sharifi
CPC classification number: G10L15/22, H04L51/56, G10L15/1815, G10L2015/223
Abstract: Implementations relate to an automated assistant that can respond to communications received via a third party application and/or other third party communication modality. The automated assistant can determine that the user is participating in multiple different conversations via multiple different third party communication services. In some implementations, conversations can be processed to identify particular features of the conversations. When the automated assistant is invoked to provide input to a conversation, the automated assistant can compare the input to the identified conversation features in order to select the particular conversation that is most relevant to the input. In this way, the automated assistant can assist with any of multiple disparate conversations that are each occurring via a different third party application.
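Illustrative note: the comparison of an assistant input against identified conversation features can be sketched with a simple word-overlap score. This is only a sketch under stated assumptions: the abstract does not say what the features are or how they are compared, so the keyword sets, the Jaccard similarity, and the names `tokenize` and `select_conversation` are all hypothetical.

```python
from typing import Dict, Optional, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split on whitespace (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def select_conversation(
    user_input: str, conversation_features: Dict[str, Set[str]]
) -> Optional[str]:
    """Return the id of the conversation whose features best match the input.

    Uses Jaccard similarity between input tokens and each conversation's
    keyword set; returns None when nothing overlaps at all.
    """
    tokens = tokenize(user_input)
    best_id: Optional[str] = None
    best_score = 0.0
    for conv_id, features in conversation_features.items():
        union = tokens | features
        score = len(tokens & features) / len(union) if union else 0.0
        if score > best_score:
            best_id, best_score = conv_id, score
    return best_id


# Hypothetical features for two conversations in two different apps.
features = {
    "chat_app_a": {"dinner", "friday", "restaurant"},
    "chat_app_b": {"project", "deadline", "report"},
}
target = select_conversation(
    "tell them the report is done before the deadline", features
)
```

Here the input shares "report" and "deadline" with the second conversation and nothing with the first, so `target` identifies the second conversation.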
-
Publication number: US20230173657A1
Publication date: 2023-06-08
Application number: US17544117
Application date: 2021-12-07
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi, Victor Carbune
CPC classification number: B25J9/0003, G10L15/22, G05D1/0088, G05D1/12, G06V20/50, G06F3/165, H04R1/323, G06F3/167, G05D2201/0207
Abstract: Implementations set forth herein relate to a robotic computing device that can seek additional information from other nearby device(s) for fulfilling a request and/or delegating certain operations to the other nearby device(s). Delegating certain operations can involve the robotic computing device maneuvering to a location of a nearby device and soliciting the nearby device for assistance by providing an input from the robotic computing device to the nearby device. In some instances, the input can include an audible rendering of an invocation phrase and a command phrase for invoking an automated assistant that is accessible via the nearby device. A determination of whether to delegate certain operations or seek additional information can be based on a variety of factors such as predicted efficiency and estimated accuracy of performance for performing certain operations.
-
Publication number: US11670281B2
Publication date: 2023-06-06
Application number: US17153463
Application date: 2021-01-20
Applicant: Google LLC
Inventor: Matthew Sharifi, Jakob Nicolaus Foerster
IPC: G10L13/00, G06F40/253, G06F40/289, G10L13/08
CPC classification number: G10L13/00, G06F40/253, G06F40/289, G10L13/08
Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determine a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generate audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
-
Publication number: US20230059469A1
Publication date: 2023-02-23
Application number: US17982834
Application date: 2022-11-08
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi, Victor Carbune
Abstract: Implementations can receive audio data corresponding to a spoken utterance of a user, process the audio data to generate a plurality of speech hypotheses, determine an action to be performed by an automated assistant based on the speech hypotheses, and cause the computing device to render an indication of the action. In response to the computing device rendering the indication, implementations can receive additional audio data corresponding to an additional spoken utterance of the user, process the additional audio data to determine that a portion of the spoken utterance is similar to an additional portion of the additional spoken utterance, supplant the action with an alternate action, and cause the automated assistant to initiate performance of the alternate action. Some implementations can determine whether to render the indication of the action based on a confidence level associated with the action.