-
Publication No.: US12223944B2
Publication date: 2025-02-11
Application No.: US17744440
Application date: 2022-05-13
Applicant: GOOGLE LLC
Inventor: Martin Baeuml, Thushan Amarasiriwardena, Roberto Pieraccini, Gianluca Martini
IPC: G06F40/56, G06F3/16, G06F16/332, G06F40/169, G06T7/20, G06V20/40, G06V40/20, G10L13/02, G10L13/033, G10L13/08, G10L13/10, G10L15/06, G10L15/18, G10L15/183, G10L15/22, G10L25/57, H04N5/04
Abstract: Implementations relate to dynamically adapting a given assistant output based on a given persona, from among a plurality of disparate personas, assigned to an automated assistant. In some implementations, the given assistant output can be generated and subsequently adapted based on the given persona assigned to the automated assistant. In other implementations, the given assistant output can be generated specific to the given persona and without having to subsequently adapt the given assistant output to the given persona. Notably, the given assistant output can include a stream of textual content to be synthesized for audible presentation to the user, and a stream of visual cues utilized in controlling a display of a client device and/or in controlling a visualized representation of the automated assistant. Various implementations utilize large language models (LLMs), or output previously generated utilizing LLMs, to reflect the given persona in the given assistant output.
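The "generate, then adapt" path in this abstract can be pictured with a minimal sketch. All names here (`Persona`, `AssistantOutput`, `adapt_to_persona`) are hypothetical and do not come from the patent, and the adaptation is a simple rule-based stand-in for whatever LLM-based step an actual implementation would use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Persona:
    """Hypothetical persona: a name plus phrasing and visual defaults."""
    name: str
    opening_phrase: str
    visual_cues: List[str] = field(default_factory=list)


@dataclass
class AssistantOutput:
    """The two parallel streams mentioned in the abstract."""
    text_stream: List[str]   # textual content to be synthesized as speech
    visual_cues: List[str]   # cues for the display or visualized assistant


def adapt_to_persona(output: AssistantOutput, persona: Persona) -> AssistantOutput:
    """Rule-based stand-in (assumption) for the persona adaptation step."""
    # Build a new output so the persona-neutral original stays unchanged.
    return AssistantOutput(
        text_stream=[persona.opening_phrase] + output.text_stream,
        visual_cues=output.visual_cues + persona.visual_cues,
    )


butler = Persona("butler", "Certainly.", ["bow"])
base = AssistantOutput(["It is 72 degrees."], ["show_weather_card"])
adapted = adapt_to_persona(base, butler)
```

Keeping the persona-neutral output intact means the same base output could be adapted again if a different persona is assigned later.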
-
Publication No.: US20240304184A1
Publication date: 2024-09-12
Application No.: US18120216
Application date: 2023-03-10
Applicant: GOOGLE LLC
Inventor: Roberto Pieraccini, Wangqing Yuan, Martin Baeuml
CPC classification number: G10L15/197, G06F40/35, G10L15/063, G10L15/1807, G10L15/1815, G10L15/22, G10L15/30
Abstract: As part of an ongoing dialog between a user and an automated assistant, processor(s) can receive a natural language (NL) based input from the user during a turn of the ongoing dialog, obtain style signal(s) for the turn, and determine, based on the style signal(s), a NL based response style that is not specified in the NL based input. Further, the processor(s) can process, using a large language model (LLM), the NL based input and a NL based response style tag for the NL based response style to generate LLM output, determine, based on the LLM output, a NL based response in the NL based response style, and cause the NL based response to be rendered. In some implementations, a LLM behavior controller is utilized to determine the NL based response style, whereas in other implementations, the LLM is fine-tuned to determine the NL based response style.
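A rough sketch of the flow this abstract describes: style signals select a response style that the user never specified, and a tag for that style is passed to the LLM together with the input. The signal names, the style names, and the `<style:...>` tag format are all assumptions made for illustration, and `llm` is any callable supplied by the caller rather than a real model API.

```python
from typing import Callable, Dict


def determine_response_style(style_signals: Dict[str, bool]) -> str:
    """Toy behavior controller: map per-turn signals to a style name."""
    if style_signals.get("user_is_driving"):
        return "concise"
    if style_signals.get("device_has_display"):
        return "detailed"
    return "conversational"


def build_llm_input(nl_input: str, style: str) -> str:
    """Prefix the input with a style tag (tag format is hypothetical)."""
    return f"<style:{style}> {nl_input}"


def respond(nl_input: str, style_signals: Dict[str, bool],
            llm: Callable[[str], str]) -> str:
    """Pick a style for this turn, tag the input, and query the LLM."""
    style = determine_response_style(style_signals)
    return llm(build_llm_input(nl_input, style))


def echo_llm(prompt: str) -> str:
    """Stand-in model that returns its prompt, to make the tag visible."""
    return prompt


tagged = respond("What's the weather?", {"user_is_driving": True}, echo_llm)
```

This corresponds to the separate-controller variant; in the fine-tuned variant the abstract mentions, the style decision would sit inside the model rather than in a function like `determine_response_style`.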
-
Publication No.: US20230343324A1
Publication date: 2023-10-26
Application No.: US17744440
Application date: 2022-05-13
Applicant: GOOGLE LLC
Inventor: Martin Baeuml, Thushan Amarasiriwardena, Roberto Pieraccini, Gianluca Martini
IPC: G06V40/20, G10L25/57, G10L15/06, H04N5/04, G06F40/169, G10L15/183, G06T7/20, G10L13/08, G10L15/22, G06V20/40, G10L13/02
CPC classification number: G10L15/22, G06F40/169, G06T7/20, G06V20/40, G06V40/20, G10L13/02, G10L13/08, G10L15/063, G10L15/183, G10L25/57, H04N5/04, G06T2207/10016, G06T2207/30196
Abstract: Implementations relate to dynamically adapting a given assistant output based on a given persona, from among a plurality of disparate personas, assigned to an automated assistant. In some implementations, the given assistant output can be generated and subsequently adapted based on the given persona assigned to the automated assistant. In other implementations, the given assistant output can be generated specific to the given persona and without having to subsequently adapt the given assistant output to the given persona. Notably, the given assistant output can include a stream of textual content to be synthesized for audible presentation to the user, and a stream of visual cues utilized in controlling a display of a client device and/or in controlling a visualized representation of the automated assistant. Various implementations utilize large language models (LLMs), or output previously generated utilizing LLMs, to reflect the given persona in the given assistant output.
-
Publication No.: US20230074406A1
Publication date: 2023-03-09
Application No.: US17532794
Application date: 2021-11-22
Applicant: GOOGLE LLC
Inventor: Martin Baeuml, Thushan Amarasiriwardena, Roberto Pieraccini, Vikram Sridar, Daniel De Freitas Adiwardana, Noam M. Shazeer, Quoc Le
IPC: G10L15/183, G10L15/22
Abstract: As part of a dialog session between a user and an automated assistant, implementations can receive a stream of audio data that captures a spoken utterance including an assistant query, determine, based on processing the stream of audio data, a set of assistant outputs that are each predicted to be responsive to the assistant query, process, using large language model (LLM) output(s), the assistant outputs and context of the dialog session to generate a set of modified assistant outputs, and cause given modified assistant output, from among the set of modified assistant outputs, to be provided for presentation to the user in response to the spoken utterance. In some implementations, the LLM output(s) can be generated in an offline manner for subsequent use in an online manner. In additional or alternative implementations, the LLM output(s) can be generated in an online manner when the spoken utterance is received.
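The offline and online variants in this abstract can be sketched as a lookup with a fallback. Everything below is a simplified assumption: the function names, the dictionary used as the offline index, the string concatenation used to "modify" outputs, and the longest-string selection rule are placeholders, not the method claimed in the patent.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical online model signature: (query, context) -> LLM outputs.
LlmFn = Callable[[str, str], List[str]]


def get_llm_outputs(query: str, context: str,
                    offline_index: Dict[str, List[str]],
                    online_llm: Optional[LlmFn] = None) -> List[str]:
    """Prefer LLM outputs generated offline; otherwise call the model online."""
    if query in offline_index:
        return offline_index[query]
    if online_llm is not None:
        return online_llm(query, context)
    return []


def modify_outputs(candidates: List[str], llm_outputs: List[str]) -> List[str]:
    """Toy merge: extend every candidate with every LLM output."""
    if not llm_outputs:
        return list(candidates)
    return [f"{c} {extra}" for c in candidates for extra in llm_outputs]


def select_output(modified: List[str]) -> str:
    """Placeholder ranking (assumption): pick the longest modified output."""
    return max(modified, key=len)


offline_index = {"weather": ["Perfect for the beach trip you mentioned."]}
llm_outputs = get_llm_outputs("weather", "user plans a beach trip", offline_index)
chosen = select_output(modify_outputs(["It is sunny today."], llm_outputs))
```

The offline path trades freshness for latency, since the index must be built before the spoken utterance arrives; the online path calls the model only when no precomputed entry exists.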
-
Publication No.: US20230343323A1
Publication date: 2023-10-26
Application No.: US17726244
Application date: 2022-04-21
Applicant: GOOGLE LLC
Inventor: Martin Baeuml, Thushan Amarasiriwardena, Roberto Pieraccini, Gianluca Martini
CPC classification number: G10L13/10, G10L15/22, G10L15/1815, G10L2015/223
Abstract: Implementations relate to dynamically adapting a given assistant output based on a given persona, from among a plurality of disparate personas, assigned to an automated assistant. In some implementations, the given assistant output can be generated and subsequently adapted based on the given persona assigned to the automated assistant. In other implementations, the given assistant output can be generated specific to the given persona and without having to subsequently adapt the given assistant output to the given persona. Notably, the given assistant output can include a stream of textual content to be synthesized for audible presentation to the user, and a stream of visual cues utilized in controlling a display of a client device and/or in controlling a visualized representation of the automated assistant. Various implementations utilize large language models (LLMs), or output previously generated utilizing LLMs, to reflect the given persona in the given assistant output.
-
Publication No.: US20250037711A1
Publication date: 2025-01-30
Application No.: US18912175
Application date: 2024-10-10
Applicant: GOOGLE LLC
Inventor: Martin Baeuml, Thushan Amarasiriwardena, Roberto Pieraccini, Vikram Sridar, Daniel De Freitas Adiwardana, Noam M. Shazeer, Quoc Le
IPC: G10L15/183, G06F16/9032, G10L15/22
Abstract: As part of a dialog session between a user and an automated assistant, implementations can receive a stream of audio data that captures a spoken utterance including an assistant query, determine, based on processing the stream of audio data, a set of assistant outputs that are each predicted to be responsive to the assistant query, process, using large language model (LLM) output(s), the assistant outputs and context of the dialog session to generate a set of modified assistant outputs, and cause given modified assistant output, from among the set of modified assistant outputs, to be provided for presentation to the user in response to the spoken utterance. In some implementations, the LLM output(s) can be generated in an offline manner for subsequent use in an online manner. In additional or alternative implementations, the LLM output(s) can be generated in an online manner when the spoken utterance is received.
-
Publication No.: US12148421B2
Publication date: 2024-11-19
Application No.: US17532794
Application date: 2021-11-22
Applicant: GOOGLE LLC
Inventor: Martin Baeuml, Thushan Amarasiriwardena, Roberto Pieraccini, Vikram Sridar, Daniel De Freitas Adiwardana, Noam M. Shazeer, Quoc Le
IPC: G10L15/22, G06F16/9032, G10L15/183
Abstract: As part of a dialog session between a user and an automated assistant, implementations can receive a stream of audio data that captures a spoken utterance including an assistant query, determine, based on processing the stream of audio data, a set of assistant outputs that are each predicted to be responsive to the assistant query, process, using large language model (LLM) output(s), the assistant outputs and context of the dialog session to generate a set of modified assistant outputs, and cause given modified assistant output, from among the set of modified assistant outputs, to be provided for presentation to the user in response to the spoken utterance. In some implementations, the LLM output(s) can be generated in an offline manner for subsequent use in an online manner. In additional or alternative implementations, the LLM output(s) can be generated in an online manner when the spoken utterance is received.