-
Publication No.: US12020703B2
Publication date: 2024-06-25
Application No.: US17532819
Filing date: 2021-11-22
Applicant: GOOGLE LLC
Inventor: Jaclyn Konzelmann , Trevor Strohman , Jonathan Bloom , Johan Schalkwyk , Joseph Smarr
CPC classification number: G10L15/22 , G06N20/00 , G08B5/36 , G10L15/18 , G10L2015/088 , G10L2015/223
Abstract: As part of a dialog session between a user and an automated assistant, implementations can process, using a streaming ASR model, a stream of audio data that captures a portion of a spoken utterance to generate ASR output, process, using an NLU model, the ASR output to generate NLU output, and cause, based on the NLU output, a stream of fulfillment data to be generated. Further, implementations can determine, based on processing the stream of audio data, audio-based characteristics associated with the portion of the spoken utterance captured in the stream of audio data. Based on the audio-based characteristics and/or the stream of NLU output, implementations can determine whether the user has paused in providing the spoken utterance or has completed providing the spoken utterance. If the user has paused, implementations can cause natural conversation output to be provided for presentation to the user.
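The pause-versus-completion decision described in this abstract can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in rather than the patent's method: the `TurnSignals` fields, the 600 ms silence threshold, and the "Mmhmm" filler are assumptions chosen only to show how audio-based characteristics and NLU output might be combined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnSignals:
    """Hypothetical per-utterance signals; the abstract does not name specific features."""
    silence_ms: int        # trailing silence measured from the audio stream (assumed feature)
    ended_on_filler: bool  # utterance trails off, e.g. "uh" or an elongated syllable (assumed)
    nlu_complete: bool     # NLU output already forms a complete request (assumed)


def classify_turn(signals: TurnSignals, pause_threshold_ms: int = 600) -> str:
    """Return 'speaking', 'paused', or 'completed'. The threshold is an assumption."""
    if signals.silence_ms < pause_threshold_ms:
        return "speaking"
    if signals.nlu_complete and not signals.ended_on_filler:
        return "completed"
    return "paused"


def respond(signals: TurnSignals) -> Optional[str]:
    """On a pause, emit natural conversation output instead of fulfilling the request."""
    state = classify_turn(signals)
    if state == "paused":
        return "Mmhmm"  # placeholder for natural conversation output
    if state == "completed":
        return "<fulfillment output>"  # placeholder for the fulfilled response
    return None  # user is still speaking; stay silent
```

In this sketch a long silence alone is not treated as the end of the turn: the NLU output must also look complete, which is one plausible reading of combining the two signal sources.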
-
Publication No.: US11763813B2
Publication date: 2023-09-19
Application No.: US17243232
Filing date: 2021-04-28
Applicant: Google LLC
Inventor: Lior Alon , Rafael Goldfarb , Dekel Auster , Dan Rasin , Michael Andrew Goodman , Trevor Strohman , Nino Tasca , Valerie Nygaard , Jaclyn Konzelmann
CPC classification number: G10L15/22 , G06F3/167 , G10L15/083 , G10L15/1815 , G10L15/285 , G10L2015/223
Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.
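The latency-gated rendering described in this abstract can be sketched as follows. The lookup table standing in for the latency prediction model, the 1000 ms threshold, the command names, and the pre-cached phrases are all assumptions made for illustration; the abstract does not specify any of them.

```python
# Hypothetical pre-cached content, keyed by assistant command.
PRE_CACHED = {
    "weather": "Okay, checking the weather...",
    "default": "One moment...",
}


def predict_latency_ms(command: str) -> int:
    """Stand-in for a latency prediction model: a fixed lookup table (assumption)."""
    table = {"weather": 1500, "timer": 100}
    return table.get(command, 800)


def plan_rendering(command: str, threshold_ms: int = 1000) -> list:
    """Return the ordered list of items to render audibly for a command.

    If the predicted latency meets the (assumed) threshold, content tailored to
    the command is rendered first, followed by the responsive content.
    """
    steps = []
    if predict_latency_ms(command) >= threshold_ms:
        steps.append(PRE_CACHED.get(command, PRE_CACHED["default"]))
    steps.append(f"<response to {command}>")
    return steps
```

Under these assumptions, `plan_rendering("weather")` yields the pre-cached phrase followed by the response placeholder, while `plan_rendering("timer")` yields only the response placeholder.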
-
23.
Publication No.: US11756546B2
Publication date: 2023-09-12
Application No.: US17346797
Filing date: 2021-06-14
Applicant: Google LLC
Inventor: Raunaq Shah , Jaclyn Konzelmann , Lisa Takehana , Ruxandra Davies , Adrian Diaconu
CPC classification number: G10L15/22 , G06F3/167 , G10L2015/221 , G10L2015/223
Abstract: Implementations set forth herein relate to employing dynamic regulations for governing responsiveness of multiple automated assistant devices, and specifically the responsiveness of an automated assistant to a given spoken utterance that has been acknowledged by two or more of the assistant devices. The dynamic regulations can be context-dependent and adapted over time so that the automated assistant can accommodate assistant interaction preferences that may vary from user to user. For instance, a spoken utterance such as “stop” may be intended to affect different assistant actions based on a context in which the user provided the spoken utterance. The context can refer to a location of the user relative to other rooms in a home, a time of day, a user providing the spoken utterance, an arrangement of the assistant devices within a home, and/or a state of each device in the home.
-
Publication No.: US11238294B2
Publication date: 2022-02-01
Application No.: US16787581
Filing date: 2020-02-11
Applicant: Google LLC
Inventor: Diego Melendo Casado , Tuan Nguyen , Jaclyn Konzelmann , Gustavo Moura , Tanya Kraljic
Abstract: Techniques are described herein for dialog-based enrollment of individual users for single- and/or multi-modal recognition by an automated assistant, as well as determining how to respond to a particular user's request based on the particular user being enrolled and/or recognized. Rather than requiring operation of a graphical user interface for individual enrollment, dialog-based enrollment enables users to enroll themselves (or others) by way of a human-to-computer dialog with the automated assistant.
-
25.
Publication No.: US11037562B2
Publication date: 2021-06-15
Application No.: US16343934
Filing date: 2018-08-23
Applicant: Google LLC
Inventor: Raunaq Shah , Jaclyn Konzelmann , Lisa Takehana , Ruxandra Davies , Adrian Diaconu
Abstract: Implementations set forth herein relate to employing dynamic regulations for governing responsiveness of multiple automated assistant devices, and specifically the responsiveness of an automated assistant to a given spoken utterance that has been acknowledged by two or more of the assistant devices. The dynamic regulations can be context-dependent and adapted over time so that the automated assistant can accommodate assistant interaction preferences that may vary from user to user. For instance, a spoken utterance such as “stop” may be intended to affect different assistant actions based on a context in which the user provided the spoken utterance. The context can refer to a location of the user relative to other rooms in a home, a time of day, a user providing the spoken utterance, an arrangement of the assistant devices within a home, and/or a state of each device in the home.
-
26.
Publication No.: US20240428796A1
Publication date: 2024-12-26
Application No.: US18828932
Filing date: 2024-09-09
Applicant: GOOGLE LLC
Inventor: Raunaq Shah , Jaclyn Konzelmann , Lisa Takehana , Ruxandra Davies , Adrian Diaconu
Abstract: Implementations set forth herein relate to employing dynamic regulations for governing responsiveness of multiple automated assistant devices, and specifically the responsiveness of an automated assistant to a given spoken utterance that has been acknowledged by two or more of the assistant devices. The dynamic regulations can be context-dependent and adapted over time so that the automated assistant can accommodate assistant interaction preferences that may vary from user to user. For instance, a spoken utterance such as “stop” may be intended to affect different assistant actions based on a context in which the user provided the spoken utterance. The context can refer to a location of the user relative to other rooms in a home, a time of day, a user providing the spoken utterance, an arrangement of the assistant devices within a home, and/or a state of each device in the home.
-
Publication No.: US20240331699A1
Publication date: 2024-10-03
Application No.: US18742612
Filing date: 2024-06-13
Applicant: GOOGLE LLC
Inventor: Lior Alon , Rafael Goldfarb , Dekel Auster , Dan Rasin , Michael Andrew Goodman , Trevor Strohman , Nino Tasca , Valerie Nygaard , Jaclyn Konzelmann
CPC classification number: G10L15/22 , G06F3/167 , G10L15/083 , G10L15/1815 , G10L15/285 , G10L2015/223
Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.
-
Publication No.: US20240312460A1
Publication date: 2024-09-19
Application No.: US18674479
Filing date: 2024-05-24
Applicant: GOOGLE LLC
Inventor: Jaclyn Konzelmann , Trevor Strohman , Jonathan Bloom , Johan Schalkwyk , Joseph Smarr
CPC classification number: G10L15/22 , G06N20/00 , G08B5/36 , G10L15/18 , G10L2015/088 , G10L2015/223
Abstract: As part of a dialog session between a user and an automated assistant, implementations can process, using a streaming ASR model, a stream of audio data that captures a portion of a spoken utterance to generate ASR output, process, using an NLU model, the ASR output to generate NLU output, and cause, based on the NLU output, a stream of fulfillment data to be generated. Further, implementations can determine, based on processing the stream of audio data, audio-based characteristics associated with the portion of the spoken utterance captured in the stream of audio data. Based on the audio-based characteristics and/or the stream of NLU output, implementations can determine whether the user has paused in providing the spoken utterance or has completed providing the spoken utterance. If the user has paused, implementations can cause natural conversation output to be provided for presentation to the user.
-
29.
Publication No.: US20230396841A1
Publication date: 2023-12-07
Application No.: US18234771
Filing date: 2023-08-16
Applicant: GOOGLE LLC
Inventor: Jaclyn Konzelmann , Tuan Nguyen , Vinay Bettadapura , Andrew Gallagher , Utsav Prabhu , Caroline Pantofaru
IPC: H04N21/442 , G06T7/70 , H04N21/258 , H04N21/41 , H04W12/64
CPC classification number: H04N21/44218 , G06T7/70 , H04N21/25875 , H04N21/25891 , H04N21/4126 , H04W12/64 , G06T2207/30196
Abstract: Implementations relate to an automated assistant that provides and manages output from one or more elements of output hardware of a computing device. The automated assistant manages dynamic adjustment of access permissions to the computing device according to, for example, a detected presence of one or more users. An active-user queue can be established each time a unique user enters a viewing window of a camera of the computing device when, up to that point, no user was considered active. Multiple image frames can be captured via the camera and processed to determine whether an initial user remains in the viewing window and/or whether another user has entered the viewing window. The initial user can be considered active as long as they are exclusively detected in the viewing window. Restricted content associated with the user may be rendered by the computing device whilst the user is active.
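The presence-based gating of restricted content described in this abstract can be sketched in simplified form. This is an illustration under assumptions, not the patented mechanism: each camera frame is reduced to a set of detected user identifiers, the active-user queue is collapsed to a single `active` slot, and the function name is hypothetical.

```python
def restricted_content_allowed(frames):
    """For each frame, report whether restricted content may be rendered.

    `frames` is a sequence of sets of user identifiers detected in the camera's
    viewing window (assumed input format). A user becomes active when they are
    the only one detected while no user is active, and stays active only while
    exclusively detected.
    """
    active = None
    decisions = []
    for detected in frames:
        if active is None:
            if len(detected) == 1:
                active = next(iter(detected))  # a single user entered: mark them active
        elif detected != {active}:
            active = None  # the active user left, or another user entered the window
        decisions.append(active is not None)
    return decisions
```

For the frame sequence `[set(), {"alice"}, {"alice"}, {"alice", "bob"}, {"bob"}]`, this sketch allows restricted content only in the frames where exactly one user is active on their own.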
-
Publication No.: US20230253009A1
Publication date: 2023-08-10
Application No.: US18135611
Filing date: 2023-04-17
Applicant: GOOGLE LLC
Inventor: Jaclyn Konzelmann , Kenneth Mixter , Sourish Chaudhuri , Tuan Nguyen , Hideaki Matsui , Caroline Pantofaru , Vinay Bettadapura
Abstract: Hot-word free adaptation of one or more function(s) of an automated assistant. Sensor data, from one or more sensor components of an assistant device that provides an automated assistant interface (graphical and/or audible), is processed to determine occurrence and/or confidence metric(s) of various attributes of a user that is proximal to the assistant device. Whether to adapt each of one or more of the function(s) of the automated assistant is based on the occurrence and/or the confidence of one or more of the various attributes. For example, certain processing of at least some of the sensor data can be initiated, such as initiating previously dormant local processing of at least some of the sensor data and/or initiating transmission of at least some of the audio data to remote automated assistant component(s).