-
141.
Publication No.: US12154576B2
Publication Date: 2024-11-26
Application No.: US17573431
Application Date: 2022-01-11
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi , Victor Carbune
IPC: G10L17/24 , G06F21/31 , G06F21/34 , G06F21/35 , G06F21/40 , G10L17/02 , G10L17/06 , G10L17/16 , H04L9/32 , H04L9/40 , H04W12/062
Abstract: Implementations set forth herein relate to an automated assistant that can solicit other devices for data that can assist with user authentication. User authentication can be streamlined for certain requests by removing a requirement that all authentication be performed at a single device and/or by a single application. For instance, the automated assistant can rely on data from other devices, which can indicate a degree to which a user is predicted to be present at a location of an assistant-enabled device. The automated assistant can process this data to make a determination regarding whether the user should be authenticated in response to an assistant input and/or pre-emptively before the user provides an assistant input. In some implementations, the automated assistant can perform one or more factors of authentication and utilize the data to verify the user in lieu of performing one or more other factors of authentication.
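The idea of letting presence data from other devices stand in for one authentication factor can be sketched as follows. This is a minimal illustration, not the patented method: the `PresenceSignal` type, the independence assumption used to combine signals, the 0.9 threshold, and the factor names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PresenceSignal:
    device: str        # hypothetical device name, e.g. "watch"
    confidence: float  # predicted probability (0..1) that the user is at the assistant device


def combined_presence(signals):
    """Probability that at least one signal is correct, assuming independent signals."""
    miss = 1.0
    for signal in signals:
        miss *= 1.0 - signal.confidence
    return 1.0 - miss


def factors_required(signals, threshold=0.9):
    """Return the authentication factors still needed for a request (assumed names)."""
    if combined_presence(signals) >= threshold:
        # Presence data is used in lieu of the second factor.
        return ["voice_match"]
    return ["voice_match", "pin"]
```

With a watch reporting 0.8 and a phone reporting 0.7, the combined presence is about 0.94, so only the voice match would be requested under these assumptions.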
-
142.
Publication No.: US12154561B2
Publication Date: 2024-11-26
Application No.: US17554280
Application Date: 2021-12-17
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi , Victor Carbune
Abstract: An overall endpointing measure can be generated based on an audio-based endpointing measure and (1) an accelerometer-based endpointing measure and/or (2) a gaze-based endpointing measure. The overall endpointing measure can be used in determining whether a candidate endpoint is an actual endpoint. Various implementations include generating the audio-based endpointing measure by processing an audio data stream, capturing a spoken utterance of a user, using an audio model. Various implementations additionally or alternatively include generating the accelerometer-based endpointing measure by processing a stream of accelerometer data using an accelerometer model. Various implementations additionally or alternatively include processing an image data stream using a gaze model to generate the gaze-based endpointing measure.
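One way to picture the overall endpointing measure is as a weighted average of whichever per-modality measures are available. The abstract does not say how the measures are combined, so the weighting scheme, the default weights, and the 0.5 threshold below are assumptions made only for illustration.

```python
def overall_endpointing_measure(audio, accel=None, gaze=None, weights=(0.6, 0.2, 0.2)):
    """Weighted average of the available measures, each expected in [0, 1].

    The audio measure is always required; at least one of the accelerometer
    or gaze measures must also be supplied.
    """
    if accel is None and gaze is None:
        raise ValueError("need an accelerometer-based and/or gaze-based measure")
    pairs = [(audio, weights[0])]
    if accel is not None:
        pairs.append((accel, weights[1]))
    if gaze is not None:
        pairs.append((gaze, weights[2]))
    total_weight = sum(weight for _, weight in pairs)
    return sum(measure * weight for measure, weight in pairs) / total_weight


def is_actual_endpoint(measure, threshold=0.5):
    """Treat the candidate endpoint as actual when the measure reaches the threshold."""
    return measure >= threshold
```

In this sketch, a weak audio cue (0.4) combined with a strong accelerometer cue (0.9) yields roughly 0.525, which crosses the assumed threshold even though the audio measure alone would not.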
-
143.
Publication No.: US20240386886A1
Publication Date: 2024-11-21
Application No.: US18197573
Application Date: 2023-05-15
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi , Victor Carbune
IPC: G10L15/22
Abstract: Implementations herein relate to customizing an automated assistant using domain-specific resources. One or more resources are processed to generate a natural language representation of the contents of the resources. The natural language representation is utilized to customize an automated assistant for interactions with a user. Various implementations include priming and fine-tuning large language models that are utilized to implement the automated assistant. Various implementations are directed to biasing speech recognition based on terms identified in the resources. Various implementations are directed to customizing the tone of the automated assistant based on information included in the resources.
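The speech-recognition biasing mentioned above can be illustrated by rescoring candidate transcriptions so that hypotheses containing terms from the resources are favored. This is a sketch under assumptions: the abstract does not describe the biasing mechanism, and the function name, the additive bonus, and the sample scores are hypothetical.

```python
def pick_biased_hypothesis(hypotheses, domain_terms, bonus=0.1):
    """Pick the best transcription after boosting hypotheses that use domain terms.

    hypotheses: list of (text, score) pairs from a recognizer (scores are assumed).
    domain_terms: lowercase terms identified in the domain-specific resources.
    """
    def biased_score(item):
        text, score = item
        hits = sum(1 for word in text.lower().split() if word in domain_terms)
        return score + bonus * hits

    return max(hypotheses, key=biased_score)[0]
```

For a coffee-shop menu containing "flat" and "white", the hypothesis "order a flat white" would overtake a slightly higher-scored "order a flat right" once the bonus is applied.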
-
144.
Publication No.: US12147470B2
Publication Date: 2024-11-19
Application No.: US17938455
Application Date: 2022-10-06
Applicant: Google LLC
Inventor: Matthew Sharifi , Victor Carbune
IPC: G06F16/632 , G06F16/638 , G10L17/02 , G10L17/06
Abstract: A method for handling contradictory queries on a shared device includes receiving a first query issued by a first user, the first query specifying a first long-standing operation for a digital assistant to perform, and while the digital assistant is performing the first long-standing operation, receiving a second query, the second query specifying a second long-standing operation for the digital assistant to perform. The method also includes determining that the second query was issued by another user different than the first user and determining, using a query resolver, that performing the second long-standing operation would conflict with the first long-standing operation. The method further includes identifying one or more compromise operations for the digital assistant to perform, and instructing the digital assistant to perform a selected compromise operation among the identified one or more compromise operations.
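The conflict check and compromise step described above can be sketched as below. The `Query` type, the notion of a shared `resource`, and the two compromise strings are hypothetical stand-ins; the abstract does not specify how the query resolver decides that operations conflict or which compromises it proposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    user: str
    operation: str  # the long-standing operation, e.g. "play_jazz"
    resource: str   # hypothetical: the device output the operation occupies


def conflicts(active, incoming):
    """Assumed rule: different users competing for the same resource conflict."""
    return active.user != incoming.user and active.resource == incoming.resource


def compromise_operations(active, incoming):
    """List candidate compromises; empty when there is no conflict."""
    if not conflicts(active, incoming):
        return []
    return [
        f"queue {incoming.operation} after {active.operation}",
        f"alternate between {active.operation} and {incoming.operation}",
    ]
```

A selected compromise would then be passed to the assistant; here the selection policy is left open, since the abstract only says that one of the identified operations is chosen.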
-
145.
Publication No.: US20240347060A1
Publication Date: 2024-10-17
Application No.: US18750663
Application Date: 2024-06-21
Applicant: GOOGLE LLC
Inventor: Victor Carbune , Matthew Sharifi , Ondrej Skopek , Justin Lu , Daniel Valcarce , Kevin Kilgour , Mohamad Hassan Rom , Nicolo D'Ercole , Michael Golikov
CPC classification number: G10L15/22 , G10L15/05 , G10L15/1815 , G10L25/78 , G10L2015/088 , G10L2015/223
Abstract: Some implementations process, using warm word model(s), a stream of audio data to determine a portion of the audio data that corresponds to particular word(s) and/or phrase(s) (e.g., a warm word) associated with an assistant command, process, using an automatic speech recognition (ASR) model, a preamble portion of the audio data (e.g., that precedes the warm word) and/or a postamble portion of the audio data (e.g., that follows the warm word) to generate ASR output, and determine, based on processing the ASR output, whether a user intended the assistant command to be performed. Additional or alternative implementations can process the stream of audio data using a speaker identification (SID) model to determine whether the audio data is sufficient to identify the user that provided a spoken utterance captured in the stream of audio data, and determine if that user is authorized to cause performance of the assistant command.
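The preamble/postamble check can be illustrated on text alone. In this sketch a whitespace-tokenized transcript stands in for the audio stream, a word list stands in for the warm word model, and a negation heuristic stands in for the processing of ASR output; all three are assumptions, as are the specific word lists.

```python
WARM_WORDS = {"stop", "pause", "next"}              # hypothetical warm words
NEGATION_CUES = {"don't", "not", "never", "won't"}  # hypothetical cue list


def split_around_warm_word(tokens):
    """Return (preamble, warm_word, postamble) for the first warm word, or None."""
    for index, token in enumerate(tokens):
        if token in WARM_WORDS:
            return tokens[:index], token, tokens[index + 1:]
    return None


def command_intended(transcript):
    """Guess whether the speaker meant the warm word as an assistant command."""
    parts = split_around_warm_word(transcript.lower().split())
    if parts is None:
        return False
    preamble, _warm_word, _postamble = parts
    # Assumed heuristic: a negation before the warm word signals ordinary speech.
    return not (NEGATION_CUES & set(preamble))
```

Under these assumptions, "please stop the music" is treated as a command, while "I told him don't stop" is not.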
-
146.
Publication No.: US20240321277A1
Publication Date: 2024-09-26
Application No.: US18677629
Application Date: 2024-05-29
Applicant: GOOGLE LLC
Inventor: Victor Carbune , Krishna Sapkota , Behshad Behzadi , Julia Proskurnia , Jacopo Sannazzaro Natta , Justin Lu , Magali Boizot-Roche , Marius Sajgalik , Nicolo D'Ercole , Zaheed Sabur , Luv Kothari
CPC classification number: G10L15/26 , G10L15/22 , G10L2015/223
Abstract: Implementations described herein relate to an application and/or automated assistant that can identify arrangement operations to perform for arranging text during speech-to-text operations—without a user having to expressly identify the arrangement operations. In some instances, a user that is dictating a document (e.g., an email, a text message, etc.) can provide a spoken utterance to an application in order to incorporate textual content. However, in some of these instances, certain corresponding arrangements are needed for the textual content in the document. The textual content that is derived from the spoken utterance can be arranged by the application based on an intent, vocalization features, and/or contextual features associated with the spoken utterance and/or a type of the application associated with the document, without the user expressly identifying the corresponding arrangements. In this way, the application can infer content arrangement operations from a spoken utterance that only specifies the textual content.
-
147.
Publication No.: US20240312455A1
Publication Date: 2024-09-19
Application No.: US18121394
Application Date: 2023-03-14
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi , Victor Carbune
CPC classification number: G10L15/22 , G06F3/167 , H04L63/0876 , G10L2015/223
Abstract: Implementations relate to transferring actions from a shared device to a personal device that is associated with an account of a user. Some implementations relate to determining that a request is associated with sensitive information, determining that one or more other users are co-present with the shared device, and transferring the request that is related to sensitive information to a personal device of the user. Some implementations relate to determining that a user is no longer co-present with a shared device that is currently performing one or more actions and transferring one or more of the actions to a personal device that is associated with an account of the user.
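The routing decision for sensitive requests can be sketched as a simple rule: transfer to the personal device only when the request is sensitive and someone other than the requester is co-present. The keyword list, the function names, and the device labels are hypothetical; the abstract does not say how sensitivity or co-presence is determined.

```python
SENSITIVE_TERMS = {"bank", "password", "medical", "salary"}  # hypothetical keyword list


def is_sensitive(request):
    """Assumed check: the request mentions one of the sensitive keywords."""
    words = set(request.lower().split())
    return bool(words & SENSITIVE_TERMS)


def route_request(request, requester, copresent_users):
    """Return which device should handle the request."""
    others = set(copresent_users) - {requester}
    if is_sensitive(request) and others:
        return "personal_device"
    return "shared_device"
```

A request such as "read my bank balance" would stay on the shared device when the requester is alone and move to the personal device when another user is detected nearby.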
-
148.
Publication No.: US12094454B2
Publication Date: 2024-09-17
Application No.: US17568920
Application Date: 2022-01-05
Applicant: GOOGLE LLC
Inventor: Victor Carbune , Matthew Sharifi
IPC: G10L15/08 , G06N20/00 , G10L15/22 , G10L21/0216
CPC classification number: G10L15/08 , G06N20/00 , G10L15/22 , G10L2021/02163
Abstract: Implementations described herein include detecting a stream of audio data that captures a spoken utterance of the user and that captures ambient noise occurring within a threshold time period of the spoken utterance being spoken by the user. Implementations further include processing a portion of the audio data that includes the ambient noise to determine ambient noise classification(s), processing a portion of the audio data that includes the spoken utterance to generate a transcription, processing both the transcription and the ambient noise classification(s) with a machine learning model to generate a user intent and parameter(s) for the user intent, and performing one or more automated assistant actions based on the user intent and using the parameter(s).
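To make the data flow concrete, the sketch below feeds a transcription and a set of ambient noise classifications into one function that returns an intent and its parameters. A hand-written rule replaces the machine learning model, and the class label "sizzling", the intent names, and the parameter keys are invented for illustration only.

```python
from typing import NamedTuple


class Interpretation(NamedTuple):
    intent: str
    parameters: dict


def interpret(transcription, noise_classes):
    """Rule-based stand-in for the model that consumes both inputs."""
    text = transcription.lower()
    if "timer" in text:
        # Assumed example: ambient cooking sounds suggest what the timer is for.
        label = "cooking" if "sizzling" in noise_classes else "general"
        return Interpretation("set_timer", {"label": label})
    return Interpretation("unknown", {})
```

The same utterance produces different parameters depending on the ambient noise, which is the behavior the abstract attributes to processing both inputs together.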
-
149.
Publication No.: US12087297B2
Publication Date: 2024-09-10
Application No.: US17930822
Application Date: 2022-09-09
Applicant: Google LLC
Inventor: Matthew Sharifi , Victor Carbune
IPC: G10L15/00 , G10L15/02 , G10L15/22 , G10L21/0208 , G10L21/0272 , G10L25/78 , G10L25/87
CPC classification number: G10L15/22 , G10L15/02 , G10L21/0208 , G10L21/0272 , G10L25/78 , G10L25/87
Abstract: A method includes receiving a first instance of raw audio data corresponding to a voice-based command and receiving a second instance of the raw audio data corresponding to an utterance of audible contents for an audio-based communication spoken by a user. When a voice filtering recognition routine determines to activate voice filtering for at least the voice of the user, the method also includes obtaining a respective speaker embedding of the user and processing, using the respective speaker embedding, the second instance of the raw audio data to generate enhanced audio data for the audio-based communication that isolates the utterance of the audible contents spoken by the user and excludes at least a portion of the one or more additional sounds that are not spoken by the user. The method also includes executing.
-
150.
Publication No.: US12080293B2
Publication Date: 2024-09-03
Application No.: US18378083
Application Date: 2023-10-09
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi , Victor Carbune
IPC: G10L15/22 , G06F16/245 , G06F16/248 , G10L15/26 , G10L15/30 , G10L15/32
CPC classification number: G10L15/22 , G06F16/245 , G06F16/248 , G10L15/26 , G10L15/30 , G10L15/32 , G10L2015/223
Abstract: Systems and methods for determining whether to combine responses from multiple automated assistants. An automated assistant may be invoked by a user utterance, followed by a query, which is provided to a plurality of automated assistants. A first response is received from a first automated assistant and a second response is received from a second automated assistant. Based on similarity between the responses, a primary automated assistant determines whether to combine the responses into a combined response. Once the combined response has been generated, one or more actions are performed in response to the combined response.
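The similarity-based decision can be sketched with a token-level Jaccard score. The abstract does not name a similarity measure or say in which direction similarity drives the decision, so both are assumptions here: this sketch combines the responses when they overlap enough to concern the same subject and otherwise returns only the first response.

```python
def jaccard(a, b):
    """Token-level Jaccard similarity between two response strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def choose_response(first, second, threshold=0.3):
    """Assumed policy: concatenate sufficiently similar responses, else keep the first."""
    if jaccard(first, second) >= threshold:
        return f"{first} {second}"
    return first
```

A production system would more likely compare embeddings or structured answers; the string overlap here only shows where the threshold decision sits in the flow.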
-