-
1.
Publication number: US20230230583A1
Publication date: 2023-07-20
Application number: US17579131
Application date: 2022-01-19
Applicant: GOOGLE LLC
Inventor: Tuan Nguyen , Gabriel Leblanc , Tzu-Chan Chuang , Qiong Huang , William A. Truong , Yixing Cai , Alexey Galata , Yuan Yuan
IPC: G10L15/08 , G10L15/065 , G10L15/22 , G06V40/16 , G06F21/32
CPC classification number: G10L15/083 , G10L15/065 , G10L15/22 , G06V40/161 , G06F21/32 , G10L2015/088 , G10L2015/0636
Abstract: Hot word free adaptation, of one or more function(s) of an automated assistant, responsive to determining, based on gaze measure(s) and/or active speech measure(s), that a user is engaging with the automated assistant. Implementations relate to various techniques for mitigating false positive occurrences of and/or false negative occurrences, of hot word free adaptation, through utilization of personalized parameter(s) for at least some user(s) of an assistant device. The personalized parameter(s) are utilized in determining whether condition(s) are satisfied, where those condition(s), if satisfied, indicate that the user is engaging in hot word free interaction with the automated assistant and result in adaptation of function(s) of the automated assistant.
-
2.
Publication number: US20250078484A1
Publication date: 2025-03-06
Application number: US18242213
Application date: 2023-09-05
Applicant: GOOGLE LLC
Inventor: Tuan Nguyen , Sergei Volnov , Yunfan Ye , Alexey Galata , William A. Truong , Tzu-Chan Chuang , Liang-yu Chen , Qiong Huang , Krunal Shah , Sai Aditya Chitturu , Sana Mithani
IPC: G06V10/80 , G06V40/16 , G10L15/183 , G10L15/30
Abstract: Implementations relate to generating and using multimodal embeddings. In various implementations, first modality data may be obtained and encoded into first modality embedding(s) using a trained first modality encoder that is stored in memory of edge-based client device(s). Second modality data may be obtained and encoded into second modality embedding(s) using a trained second modality encoder that is also stored in the memory of the edge-based client device(s). The first and second modality embeddings may be processed using an edge-based multimodal LLM that is also stored locally in memory of the edge-based client device(s) to generate a multimodal contextual embedding, which may be provided to a remote server that hosts a central LLM, e.g., in conjunction with a natural language input provided by the user. Information generated using the central LLM, responsive to the natural language input, may be received from the remote server.
-
3.
Publication number: US20240347062A1
Publication date: 2024-10-17
Application number: US18751972
Application date: 2024-06-24
Applicant: GOOGLE LLC
Inventor: Tuan Nguyen , Gabriel Leblanc , Qiong Huang , Alexey Galata , Tzu-Chan Chuang , William A. Truong , Yixing Cai , Yuan Yuan
CPC classification number: G10L15/22 , G06V40/103 , G10L15/08 , G10L15/24 , G10L25/78 , G06F3/013 , G06F3/167 , G10L2015/088 , G10L2015/227 , G10L17/22
Abstract: Hot word free adaptation, of function(s) of an automated assistant, responsive to determining, based on gaze measure(s) and/or active speech measure(s), that a user is engaging with the automated assistant. Implementations relate to techniques for mitigating false positive occurrences of and/or false negative occurrences, of hot word free adaptation, through utilization of a permissive parameter set in some situation(s) and a restrictive parameter set in other situation(s). For example, utilizing the restrictive parameter set when it is determined that a user is engaged in conversation with additional user(s). The permissive parameter set includes permissive parameter(s) that are more permissive than counterpart(s) in the restrictive parameter set. A parameter set is utilized in determining whether condition(s) are satisfied, where those condition(s), if satisfied, indicate that the user is engaging in hot word free interaction with the automated assistant and result in adaptation of function(s) of the automated assistant.
-
4.
Publication number: US12020704B2
Publication date: 2024-06-25
Application number: US17579110
Application date: 2022-01-19
Applicant: GOOGLE LLC
Inventor: Tuan Nguyen , Gabriel Leblanc , Qiong Huang , Alexey Galata , Tzu-Chan Chuang , William A. Truong , Yixing Cai , Yuan Yuan
CPC classification number: G10L15/22 , G06V40/103 , G10L15/08 , G10L15/24 , G10L25/78 , G06F3/013 , G06F3/167 , G10L2015/088 , G10L2015/227 , G10L17/22
Abstract: Hot word free adaptation, of function(s) of an automated assistant, responsive to determining, based on gaze measure(s) and/or active speech measure(s), that a user is engaging with the automated assistant. Implementations relate to techniques for mitigating false positive occurrences of and/or false negative occurrences, of hot word free adaptation, through utilization of a permissive parameter set in some situation(s) and a restrictive parameter set in other situation(s). For example, utilizing the restrictive parameter set when it is determined that a user is engaged in conversation with additional user(s). The permissive parameter set includes permissive parameter(s) that are more permissive than counterpart(s) in the restrictive parameter set. A parameter set is utilized in determining whether condition(s) are satisfied, where those condition(s), if satisfied, indicate that the user is engaging in hot word free interaction with the automated assistant and result in adaptation of function(s) of the automated assistant.
-
5.
Publication number: US20250005293A1
Publication date: 2025-01-02
Application number: US18217313
Application date: 2023-06-30
Applicant: GOOGLE LLC
Inventor: Tuan Nguyen , Sergei Volnov , William A. Truong , Yunfan Ye , Sana Mithani , Neel Joshi , Alexey Galata , Tzu-Chan Chuang , Liang-yu Chen , Qiong Huang , Krunal Shah , Sai Aditya Chitturu
Abstract: Implementations relate to leveraging large language model(s) (LLMs) and vision language model(s) (VLMs) to facilitate human-to-computer dialogs. In various implementations, one or more digital images may be processed using one or more VLMs to generate VLM output indicative of a state of an environment. An LLM prompt may be assembled based on the VLM output and a natural language input. The LLM prompt may be processed using one or more LLMs to generate content that is responsive to the natural language input. The content that is responsive to the natural language input may subsequently be rendered at one or more output devices.
-
6.
Publication number: US20230230587A1
Publication date: 2023-07-20
Application number: US17579110
Application date: 2022-01-19
Applicant: GOOGLE LLC
Inventor: Tuan Nguyen , Gabriel Leblanc , Qiong Huang , Alexey Galata , Tzu-Chan Chuang , William A. Truong , Yixing Cai , Yuan Yuan
CPC classification number: G10L15/22 , G06V40/103 , G10L15/24 , G10L15/08 , G10L25/78 , G10L2015/088 , G10L2015/227
Abstract: Hot word free adaptation, of function(s) of an automated assistant, responsive to determining, based on gaze measure(s) and/or active speech measure(s), that a user is engaging with the automated assistant. Implementations relate to techniques for mitigating false positive occurrences of and/or false negative occurrences, of hot word free adaptation, through utilization of a permissive parameter set in some situation(s) and a restrictive parameter set in other situation(s). For example, utilizing the restrictive parameter set when it is determined that a user is engaged in conversation with additional user(s). The permissive parameter set includes permissive parameter(s) that are more permissive than counterpart(s) in the restrictive parameter set. A parameter set is utilized in determining whether condition(s) are satisfied, where those condition(s), if satisfied, indicate that the user is engaging in hot word free interaction with the automated assistant and result in adaptation of function(s) of the automated assistant.