-
公开(公告)号:US12051416B2
公开(公告)日:2024-07-30
申请号:US18228948
申请日:2023-08-01
Applicant: GOOGLE LLC
Inventor: Lior Alon , Rafael Goldfarb , Dekel Auster , Dan Rasin , Michael Andrew Goodman , Trevor Strohman , Nino Tasca , Valerie Nygaard , Jaclyn Konzelmann
CPC classification number: G10L15/22 , G06F3/167 , G10L15/083 , G10L15/1815 , G10L15/285 , G10L2015/223
Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.
-
公开(公告)号:US20220351720A1
公开(公告)日:2022-11-03
申请号:US17243232
申请日:2021-04-28
Applicant: Google LLC
Inventor: Lior Alon , Rafael Goldfarb , Dekel Auster , Dan Rasin , Michael Andrew Goodman , Trevor Strohman , Nino Tasca , Valerie Nygaard , Jaclyn Konzelmann
Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.
-
公开(公告)号:US20220238105A1
公开(公告)日:2022-07-28
申请号:US17157207
申请日:2021-01-25
Applicant: GOOGLE LLC
Inventor: Rafael Goldfarb , Or Guz , Lior Alon , Assaf Hurwitz Michaely , Golan Pundak , Shmuel Leibtag , Tomer Amiaz , Dan Rasin , Asaf Aharoni
Abstract: Implementations are directed to causing a voice bot to utilize a plurality of ML layers in resolving unique personal identifier(s) for a human while the voice bot is engaged in a corresponding conversation with the human. The unique personal identifier(s) can include a unique sequence of alphanumeric characters that is personal to the human. In some implementations, ASR speech hypothes(es) corresponding to spoken utterance(s) that include the unique personal identifier(s) can be processed to generate candidate unique personal identifier(s), given alphanumeric character(s) of the candidate unique personal identifier(s) can be selected, and the voice bot can prompt the human with clarification request(s) to clarify the given alphanumeric character(s) until it is predicted to correspond to the an actual unique personal identifier(s) for the human(s). The unique personal identifier(s) can then be utilized in performance of further action(s) by the voice bot and/or other systems.
-
公开(公告)号:US20230410803A1
公开(公告)日:2023-12-21
申请号:US18228948
申请日:2023-08-01
Applicant: GOOGLE LLC
Inventor: Lior Alon , Rafael Goldfarb , Dekel Auster , Dan Rasin , Michael Andrew Goodman , Trevor Strohman , Nino Tasca , Valerie Nygaard , Jaclyn Konzelmann
CPC classification number: G10L15/22 , G06F3/167 , G10L2015/223 , G10L15/1815 , G10L15/285 , G10L15/083
Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.
-
公开(公告)号:US20240331699A1
公开(公告)日:2024-10-03
申请号:US18742612
申请日:2024-06-13
Applicant: GOOGLE LLC
Inventor: Lior Alon , Rafael Goldfarb , Dekel Auster , Dan Rasin , Michael Andrew Goodman , Trevor Strohman , Nino Tasca , Valerie Nygaard , Jaclyn Konzelmann
CPC classification number: G10L15/22 , G06F3/167 , G10L15/083 , G10L15/1815 , G10L15/285 , G10L2015/223
Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.
-
6.
公开(公告)号:US20230419964A1
公开(公告)日:2023-12-28
申请号:US18462787
申请日:2023-09-07
Applicant: GOOGLE LLC
Inventor: Rafael Goldfarb , Or Guz , Lior Alon , Assaf Hurwitz Michaely , Golan Pundak , Shmuel Leibtag , Tomer Amiaz , Dan Rasin , Asaf Aharoni
CPC classification number: G10L15/22 , G10L15/063 , G10L2015/0635
Abstract: Implementations are directed to causing a voice bot to utilize a plurality of ML layers in resolving unique personal identifier(s) for a human while the voice bot is engaged in a corresponding conversation with the human. The unique personal identifier(s) can include a unique sequence of alphanumeric characters that is personal to the human. In some implementations, ASR speech hypothes(es) corresponding to spoken utterance(s) that include the unique personal identifier(s) can be processed to generate candidate unique personal identifier(s), given alphanumeric character(s) of the candidate unique personal identifier(s) can be selected, and the voice bot can prompt the human with clarification request(s) to clarify the given alphanumeric character(s) until it is predicted to correspond to the an actual unique personal identifier(s) for the human(s). The unique personal identifier(s) can then be utilized in performance of further action(s) by the voice bot and/or other systems.
-
公开(公告)号:US11790906B2
公开(公告)日:2023-10-17
申请号:US17157207
申请日:2021-01-25
Applicant: GOOGLE LLC
Inventor: Rafael Goldfarb , Or Guz , Lior Alon , Assaf Hurwitz Michaely , Golan Pundak , Shmuel Leibtag , Tomer Amiaz , Dan Rasin , Asaf Aharoni
CPC classification number: G10L15/22 , G10L15/063 , G10L2015/0635
Abstract: Implementations are directed to causing a voice bot to utilize a plurality of ML layers in resolving unique personal identifier(s) for a human while the voice bot is engaged in a corresponding conversation with the human. The unique personal identifier(s) can include a unique sequence of alphanumeric characters that is personal to the human. In some implementations, ASR speech hypothes(es) corresponding to spoken utterance(s) that include the unique personal identifier(s) can be processed to generate candidate unique personal identifier(s), given alphanumeric character(s) of the candidate unique personal identifier(s) can be selected, and the voice bot can prompt the human with clarification request(s) to clarify the given alphanumeric character(s) until it is predicted to correspond to the an actual unique personal identifier(s) for the human(s). The unique personal identifier(s) can then be utilized in performance of further action(s) by the voice bot and/or other systems.
-
公开(公告)号:US11763813B2
公开(公告)日:2023-09-19
申请号:US17243232
申请日:2021-04-28
Applicant: Google LLC
Inventor: Lior Alon , Rafael Goldfarb , Dekel Auster , Dan Rasin , Michael Andrew Goodman , Trevor Strohman , Nino Tasca , Valerie Nygaard , Jaclyn Konzelmann
CPC classification number: G10L15/22 , G06F3/167 , G10L15/083 , G10L15/1815 , G10L15/285 , G10L2015/223
Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.
-
-
-
-
-
-
-