-
Publication No.: US20240428816A1
Publication date: 2024-12-26
Application No.: US18797400
Filing date: 2024-08-07
Applicant: Google LLC
Inventor: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias
IPC: G10L21/0208, G10L17/00, G10L21/0272, G10L25/57
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.
-
Publication No.: US12073844B2
Publication date: 2024-08-27
Application No.: US17601042
Filing date: 2020-10-01
Applicant: Google LLC
Inventor: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias
IPC: G10L21/0208, G10L17/00, G10L21/0272, G10L25/57
CPC classification number: G10L21/0208, G10L17/00, G10L21/0272, G10L25/57, G10L2021/02087
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.
-
Publication No.: US11593712B2
Publication date: 2023-02-28
Application No.: US16865787
Filing date: 2020-05-04
Applicant: Google LLC
Inventor: Barron Webster, Irene Alvarado, Kyle Phillips, Alexander Chen, Jonas Pieter Halfdan Jongejan, Jordan Griffith, Amit Pitaru
IPC: G06N20/00, G06F3/0485, G06F16/28
Abstract: One or more processors may output for display, an interface including a data classification section including two or more class nodes, a training section including a training node, and an evaluation section including an evaluation node. At a first class node a first set of training data may be captured and at a second class node a second set of training data may be captured. In response to an input received at the training node, a classification model based on the first set of training data and the second set of training data may be trained. Evaluation data may be captured in an evaluation node, and using the trained classification model, classifications for each piece of the evaluation data may be determined. A visual representation of the classification for each piece of the evaluation data may be output for display within the evaluation node.
-
Publication No.: US20230267942A1
Publication date: 2023-08-24
Application No.: US17601042
Filing date: 2020-10-01
Applicant: Google LLC
Inventor: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias
IPC: G10L21/0208, G10L25/57
CPC classification number: G10L21/0208, G10L25/57, G10L2021/02087
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.
-
Publication No.: US20210342739A1
Publication date: 2021-11-04
Application No.: US16865787
Filing date: 2020-05-04
Applicant: Google LLC
Inventor: Barron Webster, Irene Alvarado, Kyle Phillips, Alexander Chen, Jonas Pieter Halfdan Jongejan, Jordan Griffith, Amit Pitaru
IPC: G06N20/00, G06F16/28, G06F3/0485
Abstract: One or more processors may output for display, an interface including a data classification section including two or more class nodes, a training section including a training node, and an evaluation section including an evaluation node. At a first class node a first set of training data may be captured and at a second class node a second set of training data may be captured. In response to an input received at the training node, a classification model based on the first set of training data and the second set of training data may be trained. Evaluation data may be captured in an evaluation node, and using the trained classification model, classifications for each piece of the evaluation data may be determined. A visual representation of the classification for each piece of the evaluation data may be output for display within the evaluation node.