-
Publication number: US12283270B2
Publication date: 2025-04-22
Application number: US17541098
Application date: 2021-12-02
Applicant: GOOGLE LLC
Inventor: Asaf Aharoni , Yaniv Leviathan , Eyal Segalis , Gal Elidan , Sasha Goldshtein , Tomer Amiaz , Deborah Cohen
Abstract: Implementations are directed to providing a voice bot development platform that enables a third-party developer to train a voice bot based on training instance(s). The training instance(s) can each include training input and training output. The training input can include a portion of a corresponding conversation and a prior context of the corresponding conversation. The training output can include a corresponding ground truth response to the portion of the corresponding conversation. Subsequent to training, the voice bot can be deployed for conducting conversations on behalf of a third-party. In some implementations, the voice bot is further trained based on a corresponding feature emphasis input that attentions the voice bot to a particular feature of the portion of the corresponding conversation. In some additional or alternative implementations, the voice bot is further trained to interact with third-party system(s) via remote procedure calls (RPCs).
-
Publication number: US20230419964A1
Publication date: 2023-12-28
Application number: US18462787
Application date: 2023-09-07
Applicant: GOOGLE LLC
Inventor: Rafael Goldfarb , Or Guz , Lior Alon , Assaf Hurwitz Michaely , Golan Pundak , Shmuel Leibtag , Tomer Amiaz , Dan Rasin , Asaf Aharoni
CPC classification number: G10L15/22 , G10L15/063 , G10L2015/0635
Abstract: Implementations are directed to causing a voice bot to utilize a plurality of ML layers in resolving unique personal identifier(s) for a human while the voice bot is engaged in a corresponding conversation with the human. The unique personal identifier(s) can include a unique sequence of alphanumeric characters that is personal to the human. In some implementations, ASR speech hypothes(es) corresponding to spoken utterance(s) that include the unique personal identifier(s) can be processed to generate candidate unique personal identifier(s), given alphanumeric character(s) of the candidate unique personal identifier(s) can be selected, and the voice bot can prompt the human with clarification request(s) to clarify the given alphanumeric character(s) until it is predicted to correspond to an actual unique personal identifier(s) for the human(s). The unique personal identifier(s) can then be utilized in performance of further action(s) by the voice bot and/or other systems.
-
Publication number: US11790906B2
Publication date: 2023-10-17
Application number: US17157207
Application date: 2021-01-25
Applicant: GOOGLE LLC
Inventor: Rafael Goldfarb , Or Guz , Lior Alon , Assaf Hurwitz Michaely , Golan Pundak , Shmuel Leibtag , Tomer Amiaz , Dan Rasin , Asaf Aharoni
CPC classification number: G10L15/22 , G10L15/063 , G10L2015/0635
Abstract: Implementations are directed to causing a voice bot to utilize a plurality of ML layers in resolving unique personal identifier(s) for a human while the voice bot is engaged in a corresponding conversation with the human. The unique personal identifier(s) can include a unique sequence of alphanumeric characters that is personal to the human. In some implementations, ASR speech hypothes(es) corresponding to spoken utterance(s) that include the unique personal identifier(s) can be processed to generate candidate unique personal identifier(s), given alphanumeric character(s) of the candidate unique personal identifier(s) can be selected, and the voice bot can prompt the human with clarification request(s) to clarify the given alphanumeric character(s) until it is predicted to correspond to an actual unique personal identifier(s) for the human(s). The unique personal identifier(s) can then be utilized in performance of further action(s) by the voice bot and/or other systems.
-
Publication number: US20240146668A1
Publication date: 2024-05-02
Application number: US18403401
Application date: 2024-01-03
Applicant: GOOGLE LLC
Inventor: Asaf Aharoni , Eyal Segalis , Ofer Ron , Sasha Goldshtein , Tomer Amiaz , Razvan Mathias , Yaniv Leviathan
CPC classification number: H04L51/02 , G06N20/00 , G10L15/063 , G10L15/10 , G10L15/22
Abstract: Implementations are directed to updating a trained voice bot that is deployed for conducting conversations on behalf of a third-party. A third-party developer can interact with a voice bot development system that enables the third-party developer to train, update, validate, and monitor performance of the trained voice bot. In various implementations, the trained voice bot can be updated by updating a corpus of training instances that was initially utilized to train the voice bot, and updating the trained voice bot based on the updated corpus. In some implementations, the corpus of training instances may be updated in response to identifying occurrence(s) of behavioral error(s) of the trained voice bot while the conversations are being conducted on behalf of the third-party. In additional or alternative implementations, the corpus of training instances may be updated in response to determining the trained voice bot does not include a desired behavior.
-
Publication number: US11804211B2
Publication date: 2023-10-31
Application number: US17112418
Application date: 2020-12-04
Applicant: Google LLC
Inventor: Asaf Aharoni , Yaniv Leviathan , Eyal Segalis , Gal Elidan , Sasha Goldshtein , Tomer Amiaz , Deborah Cohen
CPC classification number: G10L15/063 , G06N20/00 , G10L15/02 , G10L15/04 , G10L15/22 , H04L67/133 , H04M3/493 , G10L2015/0635
Abstract: Implementations are directed to providing a voice bot development platform that enables a third-party developer to train a voice bot based on training instance(s). The training instance(s) can each include training input and training output. The training input can include a portion of a corresponding conversation and a prior context of the corresponding conversation. The training output can include a corresponding ground truth response to the portion of the corresponding conversation. Subsequent to training, the voice bot can be deployed for conducting conversations on behalf of a third-party. In some implementations, the voice bot is further trained based on a corresponding feature emphasis input that attentions the voice bot to a particular feature of the portion of the corresponding conversation. In some additional or alternative implementations, the voice bot is further trained to interact with third-party system(s) via remote procedure calls (RPCs).
-
Publication number: US20220238105A1
Publication date: 2022-07-28
Application number: US17157207
Application date: 2021-01-25
Applicant: GOOGLE LLC
Inventor: Rafael Goldfarb , Or Guz , Lior Alon , Assaf Hurwitz Michaely , Golan Pundak , Shmuel Leibtag , Tomer Amiaz , Dan Rasin , Asaf Aharoni
Abstract: Implementations are directed to causing a voice bot to utilize a plurality of ML layers in resolving unique personal identifier(s) for a human while the voice bot is engaged in a corresponding conversation with the human. The unique personal identifier(s) can include a unique sequence of alphanumeric characters that is personal to the human. In some implementations, ASR speech hypothes(es) corresponding to spoken utterance(s) that include the unique personal identifier(s) can be processed to generate candidate unique personal identifier(s), given alphanumeric character(s) of the candidate unique personal identifier(s) can be selected, and the voice bot can prompt the human with clarification request(s) to clarify the given alphanumeric character(s) until it is predicted to correspond to an actual unique personal identifier(s) for the human(s). The unique personal identifier(s) can then be utilized in performance of further action(s) by the voice bot and/or other systems.
-
Publication number: US20220180858A1
Publication date: 2022-06-09
Application number: US17541098
Application date: 2021-12-02
Applicant: GOOGLE LLC
Inventor: Asaf Aharoni , Yaniv Leviathan , Eyal Segalis , Gal Elidan , Sasha Goldshtein , Tomer Amiaz , Deborah Cohen
Abstract: Implementations are directed to providing a voice bot development platform that enables a third-party developer to train a voice bot based on training instance(s). The training instance(s) can each include training input and training output. The training input can include a portion of a corresponding conversation and a prior context of the corresponding conversation. The training output can include a corresponding ground truth response to the portion of the corresponding conversation. Subsequent to training, the voice bot can be deployed for conducting conversations on behalf of a third-party. In some implementations, the voice bot is further trained based on a corresponding feature emphasis input that attentions the voice bot to a particular feature of the portion of the corresponding conversation. In some additional or alternative implementations, the voice bot is further trained to interact with third-party system(s) via remote procedure calls (RPCs).
-
Publication number: US20210090570A1
Publication date: 2021-03-25
Application number: US16580726
Application date: 2019-09-24
Applicant: Google LLC
Inventor: Asaf Aharoni , Arun Narayanan , Nir Shabat , Parisa Haghani , Galen Tsai Chuang , Yaniv Leviathan , Neeraj Gaur , Pedro J. Moreno Mengibar , Rohit Prakash Prabhavalkar , Zhongdi Qu , Austin Severn Waters , Tomer Amiaz , Michiel A.U. Bacchiani
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot. The actions further include determining a context of the telephone conversation. The actions further include determining a user intent of a first previous portion of the telephone conversation spoken by the user and a bot intent of a second previous portion of the telephone conversation outputted by a speech synthesizer of the bot. The actions further include, based on the audio data of the utterance, the context of the telephone conversation, the user intent, and the bot intent, generating synthesized speech of a reply by the bot to the utterance. The actions further include, providing, for output, the synthesized speech.
-
Publication number: US12254883B2
Publication date: 2025-03-18
Application number: US18635974
Application date: 2024-04-15
Applicant: GOOGLE LLC
Inventor: Asaf Aharoni , Arun Narayanan , Nir Shabat , Parisa Haghani , Galen Tsai Chuang , Yaniv Leviathan , Neeraj Gaur , Pedro J. Moreno Mengibar , Rohit Prakash Prabhavalkar , Zhongdi Qu , Austin Severn Waters , Tomer Amiaz , Michiel A. U. Bacchiani
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot. The actions further include determining a context of the telephone conversation. The actions further include determining a user intent of a first previous portion of the telephone conversation spoken by the user and a bot intent of a second previous portion of the telephone conversation outputted by a speech synthesizer of the bot. The actions further include, based on the audio data of the utterance, the context of the telephone conversation, the user intent, and the bot intent, generating synthesized speech of a reply by the bot to the utterance. The actions further include, providing, for output, the synthesized speech.
-
Publication number: US20240265923A1
Publication date: 2024-08-08
Application number: US18635974
Application date: 2024-04-15
Applicant: GOOGLE LLC
Inventor: Asaf Aharoni , Arun Narayanan , Nir Shabat , Parisa Haghani , Galen Tsai Chuang , Yaniv Leviathan , Neeraj Gaur , Pedro J. Moreno Mengibar , Rohit Prakash Prabhavalkar , Zhongdi Qu , Austin Severn Waters , Tomer Amiaz , Michiel A.U. Bacchiani
CPC classification number: G10L15/26 , G10L15/32 , H04M1/02 , H04M1/663 , H04M3/4286 , H04M3/5191
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot. The actions further include determining a context of the telephone conversation. The actions further include determining a user intent of a first previous portion of the telephone conversation spoken by the user and a bot intent of a second previous portion of the telephone conversation outputted by a speech synthesizer of the bot. The actions further include, based on the audio data of the utterance, the context of the telephone conversation, the user intent, and the bot intent, generating synthesized speech of a reply by the bot to the utterance. The actions further include, providing, for output, the synthesized speech.