-
Publication No.: US20250014573A1
Publication date: 2025-01-09
Application No.: US18889063
Application date: 2024-09-18
Applicant: GOOGLE LLC
Inventor: Pu-sen Chao , Alex Fandrianto
IPC: G10L15/16 , G06N3/08 , G10L15/08 , G10L17/00 , G10L21/0208
Abstract: The presentation of an automated assistant response may be selectively pre-empted in response to a hot-word-free utterance that is received during the presentation and that is determined to be likely directed to the automated assistant. The determination that the utterance is likely directed to the automated assistant may be performed, for example, using an utterance classification operation that is performed on audio data received during presentation of the response, and based upon such a determination, the response may be pre-empted with another response associated with the later-received utterance. In addition, the duration that is used to determine when a session should be terminated at the conclusion of a conversation between a user and an automated assistant may be dynamically controlled based upon when the presentation of a response has completed.
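The two behaviors in this abstract (pre-empting a response, and timing the session from the end of presentation) can be sketched as follows. This is only an illustration: the class `AssistantSession`, the threshold `DIRECTED_THRESHOLD`, and the idea that the classifier yields a score between 0 and 1 are assumptions, not details taken from the publication.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cut-off for "likely directed to the assistant".
DIRECTED_THRESHOLD = 0.5


@dataclass
class AssistantSession:
    timeout_s: float                        # listening window after a response ends
    presenting: bool = False                # True while a response is being presented
    response_end_s: Optional[float] = None  # time at which presentation completed

    def should_preempt(self, directed_score: float) -> bool:
        """Pre-empt only during presentation and only for utterances the
        (assumed) classifier scores as likely directed to the assistant."""
        return self.presenting and directed_score >= DIRECTED_THRESHOLD

    def finish_response(self, now_s: float) -> None:
        """Record when presentation completed; the timeout starts here."""
        self.presenting = False
        self.response_end_s = now_s

    def is_expired(self, now_s: float) -> bool:
        """The session ends only once the timeout has elapsed since the
        response finished, never while a response is still playing."""
        if self.presenting or self.response_end_s is None:
            return False
        return now_s - self.response_end_s > self.timeout_s
```

Measuring the timeout from `response_end_s` rather than from the start of the response means a long response does not consume the user's window for a follow-up.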
-
Publication No.: US20240054997A1
Publication date: 2024-02-15
Application No.: US18382886
Application date: 2023-10-23
Applicant: GOOGLE LLC
Inventor: Pu-sen Chao , Diego Melendo Casado , Ignacio Lopez Moreno
CPC classification number: G10L15/197 , G10L15/005 , G10L15/22 , G10L15/30 , G10L15/08 , G10L15/14 , G10L15/1822 , G10L13/00 , G10L2015/088 , G10L2015/223 , G10L2015/228
Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user. Some implementations perform speech recognition in each of multiple languages assigned to the user profile, and utilize criteria to select only one of the speech recognitions as appropriate for generating and providing content that is responsive to the spoken utterance.
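A minimal sketch of the two strategies described here: narrowing to a subset of the profile's languages, and choosing among several recognition results. The function names, the `max_languages` parameter, and the rule of multiplying recognizer confidence by the profile probability are assumptions made for illustration; the abstract only says that "criteria" are used.

```python
def candidate_languages(profile: dict[str, float], max_languages: int = 2) -> list[str]:
    """Return the most probable languages assigned to a user profile.

    `profile` maps a language code to the probability assigned to it.
    """
    ranked = sorted(profile, key=profile.get, reverse=True)
    return ranked[:max_languages]


def pick_recognition(
    results: dict[str, tuple[str, float]], profile: dict[str, float]
) -> tuple[str, str]:
    """Select one recognition out of several run in parallel.

    `results` maps a language code to (transcript, recognizer confidence).
    Assumed criterion: confidence weighted by the profile probability.
    """
    best = max(results, key=lambda lang: results[lang][1] * profile.get(lang, 0.0))
    return best, results[best][0]


profile = {"en-US": 0.6, "es-ES": 0.3, "fr-FR": 0.1}
subset = candidate_languages(profile)
chosen = pick_recognition({"en-US": ("ola", 0.2), "es-ES": ("hola", 0.9)}, profile)
```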
-
Publication No.: US20240038246A1
Publication date: 2024-02-01
Application No.: US17876156
Application date: 2022-07-28
Applicant: GOOGLE LLC
Inventor: Pu-sen Chao , Alex Fandrianto , Muhammad Umair
IPC: G10L17/22 , G06F3/16 , G06F3/0481 , G10L17/06 , G10L13/02
CPC classification number: G10L17/22 , G06F3/167 , G06F3/0481 , G10L17/06 , G10L13/02
Abstract: Implementations relate to an automated assistant that is responsive, without requiring an invocation phrase or other invocation input(s), to certain spoken utterances when certain display content is being accessed by a user. The display content can be processed to identify certain inputs and/or other intents and parameters that are associated with assistant operations and are relevant to the display content. Thereafter, the automated assistant can determine whether any spoken utterances from the user correspond to those certain inputs, intents, and/or parameters. In response to receiving such a spoken utterance, the automated assistant can initialize performance of the relevant operation without necessitating that the user provides a preceding invocation phrase or other invocation input(s). When other display content is being accessed, the automated assistant can repeat the process for other inputs and operations.
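The flow in this abstract can be sketched roughly as below, assuming the display content has already been reduced to a list of elements with a visible label and an associated operation. The element shape (`label`, `operation`) and the exact-match rule are hypothetical simplifications; the publication describes intents and parameters more generally.

```python
from typing import Optional


def actionable_phrases(display_elements: list[dict]) -> dict[str, str]:
    """Map the label of each actionable on-screen element to its operation.

    Elements without an `operation` key (plain text, images) are ignored.
    """
    return {
        e["label"].lower(): e["operation"]
        for e in display_elements
        if e.get("operation")
    }


def match_utterance(transcript: str, phrases: dict[str, str]) -> Optional[str]:
    """Return the operation to run without an invocation phrase, or None
    when the utterance does not correspond to the current display content."""
    return phrases.get(transcript.strip().lower())


screen = [
    {"label": "Play", "operation": "media.play"},
    {"label": "Now showing"},
    {"label": "Next episode", "operation": "media.next"},
]
phrases = actionable_phrases(screen)
```

When the user navigates to other display content, `actionable_phrases` would simply be recomputed for the new screen.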
-
Publication No.: US11721326B2
Publication date: 2023-08-08
Application No.: US17584866
Application date: 2022-01-26
Applicant: GOOGLE LLC
Inventor: Meltem Oktem , Taral Pradeep Joglekar , Fnu Heryandi , Pu-sen Chao , Ignacio Lopez Moreno , Salil Rajadhyaksha , Alexander H. Gruenstein , Diego Melendo Casado
IPC: G10L15/08 , G06F21/32 , G10L17/06 , G06F16/635 , G10L15/22 , G10L17/00 , G06V40/10 , G10L15/07 , G10L15/26
CPC classification number: G10L15/08 , G06F16/636 , G06F21/32 , G06V40/10 , G10L15/07 , G10L15/22 , G10L17/00 , G10L17/06 , G10L15/26 , G10L2015/088
Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or non-personal request. Moreover, in response to determining that the speaker is not a known user of the user device and that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device, or can cause an action to be performed by the user device responsive to the utterance.
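The decision described above amounts to a small table over two booleans. The sketch below encodes it; only the unknown-speaker, non-personal case is stated in the abstract, and the other two branches (serving known users, declining personal requests from unknown speakers) are assumptions added to make the function total.

```python
from enum import Enum


class Outcome(Enum):
    RESPOND = "respond"
    DECLINE = "decline"


def handle_request(is_known_user: bool, is_personal: bool) -> Outcome:
    """Decide whether a shared device should serve an utterance."""
    if not is_known_user and not is_personal:
        # Case stated in the abstract: guests may make non-personal requests.
        return Outcome.RESPOND
    if is_known_user:
        # Assumption: a known user is served for either kind of request.
        return Outcome.RESPOND
    # Assumption: personal requests from unknown speakers are declined.
    return Outcome.DECLINE
```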
-
Publication No.: US20210280177A1
Publication date: 2021-09-09
Application No.: US17328400
Application date: 2021-05-24
Applicant: Google LLC
Inventor: Pu-sen Chao , Diego Melendo Casado , Ignacio Lopez Moreno , William Zhang
Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user. Some implementations perform speech recognition in each of multiple languages assigned to the user profile, and utilize criteria to select only one of the speech recognitions as appropriate for generating and providing content that is responsive to the spoken utterance.
-
Publication No.: US20190318724A1
Publication date: 2019-10-17
Application No.: US15973466
Application date: 2018-05-07
Applicant: Google LLC
Inventor: Pu-sen Chao , Diego Melendo Casado , Ignacio Lopez Moreno
Abstract: The present disclosure relates generally to determining a language for speech recognition of a spoken utterance, received via an automated assistant interface, for interacting with an automated assistant. The system can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Selection of a speech recognition model for a particular language can be based on one or more interaction characteristics exhibited during a dialog session between a user and an automated assistant. Such interaction characteristics can include anticipated user input types, anticipated user input durations, a duration for monitoring for a user response, and/or an actual duration of a provided user response.
-
Publication No.: US12154574B2
Publication date: 2024-11-26
Application No.: US18506105
Application date: 2023-11-09
Applicant: Google LLC
Inventor: Jason Pelecanos , Pu-sen Chao , Yiling Huang , Quan Wang
Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
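The comparison in this abstract can be illustrated with a short sketch. Several details are assumptions: each verification result is modeled as a dict, the performance metric is a hypothetical error flag named `false_reject`, the count is normalized by the size of the set, and a lower score is treated as better. The abstract itself does not say which metric is used or in which direction the scores are compared.

```python
def score(results: list[dict], metric: str) -> float:
    """Fraction of verification results that include the given metric."""
    if not results:
        return 0.0
    flagged = sum(1 for r in results if r.get(metric))
    return flagged / len(results)


def alternative_is_better(
    primary_results: list[dict],
    alternative_results: list[dict],
    metric: str = "false_reject",  # hypothetical metric name
) -> bool:
    """Assumed rule: the alternative model wins if it shows the error
    metric in a smaller share of its verification results."""
    return score(alternative_results, metric) < score(primary_results, metric)


primary = [{"false_reject": True}, {"false_reject": False}, {}, {"false_reject": True}]
alternative = [{"false_reject": True}, {}, {}, {}]
```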
-
Publication No.: US12046233B2
Publication date: 2024-07-23
Application No.: US18361408
Application date: 2023-07-28
Applicant: GOOGLE LLC
Inventor: Pu-sen Chao , Diego Melendo Casado , Ignacio Lopez Moreno , William Zhang
CPC classification number: G10L15/197 , G10L13/00 , G10L15/005 , G10L15/08 , G10L15/14 , G10L15/1822 , G10L15/22 , G10L15/30 , G10L2015/088 , G10L2015/223 , G10L2015/228
Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user. Some implementations perform speech recognition in each of multiple languages assigned to the user profile, and utilize criteria to select only one of the speech recognitions as appropriate for generating and providing content that is responsive to the spoken utterance.
-
Publication No.: US11837238B2
Publication date: 2023-12-05
Application No.: US17076743
Application date: 2020-10-21
Applicant: Google LLC
Inventor: Jason Pelecanos , Pu-sen Chao , Yiling Huang , Quan Wang
Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
-
Publication No.: US20230368784A1
Publication date: 2023-11-16
Application No.: US18361408
Application date: 2023-07-28
Applicant: GOOGLE LLC
Inventor: Pu-sen Chao , Diego Melendo Casado , Ignacio Lopez Moreno , William Zhang
CPC classification number: G10L15/197 , G10L15/1822 , G10L15/14 , G10L13/00 , G10L15/005 , G10L15/30 , G10L15/08 , G10L15/22 , G10L2015/228 , G10L2015/223 , G10L2015/088
Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user. Some implementations perform speech recognition in each of multiple languages assigned to the user profile, and utilize criteria to select only one of the speech recognitions as appropriate for generating and providing content that is responsive to the spoken utterance.