-
Publication Number: US20210256707A1
Publication Date: 2021-08-19
Application Number: US17252663
Application Date: 2019-07-10
Applicant: Google LLC
Inventor: Matthew Alun Brown, Jonathan Chung-Kuan Huang, Tal Remez
Abstract: Example aspects of the present disclosure are directed to systems and methods that enable weakly-supervised learning of instance segmentation by applying a cut-and-paste technique to training of a generator model included in a generative adversarial network. In particular, the present disclosure provides a weakly-supervised approach to object instance segmentation. In some implementations, starting with known or predicted object bounding boxes, a generator model can learn to generate object masks by playing a game of cut-and-paste in an adversarial learning setup.
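As a rough illustration of the adversarial cut-and-paste game the abstract describes, the sketch below pairs a mask generator with a real-versus-pasted discriminator. It is a minimal sketch in plain PyTorch, assuming simple image tensors; the function and module names (cut_and_paste, adversarial_step, generator, discriminator) are illustrative stand-ins, not from the patent.

```python
import torch
import torch.nn.functional as F

def cut_and_paste(crop, mask, background):
    """Composite the object cut from `crop` (via `mask`) onto `background`."""
    return mask * crop + (1.0 - mask) * background

def adversarial_step(generator, discriminator, crop, background, g_opt, d_opt):
    # The generator predicts a soft object mask for the box-cropped image.
    mask = torch.sigmoid(generator(crop))            # (B, 1, H, W), values in [0, 1]
    pasted = cut_and_paste(crop, mask, background)   # fake composite

    # Discriminator: real image crops vs. cut-and-paste composites.
    real_logits = discriminator(crop)
    fake_logits = discriminator(pasted.detach())
    d_loss = (
        F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
        + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    )
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: a mask is "good" when its composite fools the discriminator.
    g_logits = discriminator(pasted)
    g_loss = F.binary_cross_entropy_with_logits(g_logits, torch.ones_like(g_logits))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return g_loss.item(), d_loss.item()
```

The only supervision signal is whether the composite fools the discriminator, which is what makes the approach weakly supervised: no ground-truth masks are required, only bounding boxes to define the crops.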
-
Publication Number: US12073844B2
Publication Date: 2024-08-27
Application Number: US17601042
Application Date: 2020-10-01
Applicant: Google LLC
Inventor: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias
IPC: G10L21/0208, G10L17/00, G10L21/0272, G10L25/57
CPC classification number: G10L21/0208, G10L17/00, G10L21/0272, G10L25/57, G10L2021/02087
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device; in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of that speaker in the current view, and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device; receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device; and, in response, generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.
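The claimed method is essentially a device-side control flow: an indication of visible speakers arrives, a per-speaker isolated signal is generated, and each signal is forwarded to a coupled listening device. The sketch below illustrates that flow under assumed interfaces; VisibleSpeaker, separate, and send are hypothetical stand-ins, not APIs from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class VisibleSpeaker:
    speaker_id: str
    face_track: object  # visual features for this speaker from the camera's current view

def on_speaker_indication(
    speakers: Iterable[VisibleSpeaker],
    mixed_audio: object,                            # microphone mixture with overlapping speech
    separate: Callable[[object, object], object],   # audio-visual separation model
    send: Callable[[str, object], None],            # link to the coupled listening device
) -> None:
    """Handle an indication that `speakers` are visible in the current view:
    isolate each speaker's speech from the mixture and forward it onward."""
    for speaker in speakers:
        # Condition the separation model on the speaker's visual track so the
        # isolated signal contains only that speaker's speech.
        isolated = separate(mixed_audio, speaker.face_track)
        send(speaker.speaker_id, isolated)
```

The same handler would run for both the first and second indications, as the set of speakers visible to the camera changes over time.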
-
Publication Number: US11853892B2
Publication Date: 2023-12-26
Application Number: US17252663
Application Date: 2019-07-10
Applicant: Google LLC
Inventor: Matthew Alun Brown, Jonathan Chung-Kuan Huang, Tal Remez
CPC classification number: G06N3/084, G06N3/045, G06T7/11, G06T7/194, G06T11/20, G06V10/764, G06V10/82, G06T2207/20081, G06T2207/20084, G06T2210/12
Abstract: Example aspects of the present disclosure are directed to systems and methods that enable weakly-supervised learning of instance segmentation by applying a cut-and-paste technique to training of a generator model included in a generative adversarial network. In particular, the present disclosure provides a weakly-supervised approach to object instance segmentation. In some implementations, starting with known or predicted object bounding boxes, a generator model can learn to generate object masks by playing a game of cut-and-paste in an adversarial learning setup.
-
Publication Number: US20230267942A1
Publication Date: 2023-08-24
Application Number: US17601042
Application Date: 2020-10-01
Applicant: Google LLC
Inventor: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias
IPC: G10L21/0208, G10L25/57
CPC classification number: G10L21/0208, G10L25/57, G10L2021/02087
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device; in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of that speaker in the current view, and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device; receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device; and, in response, generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.
-
Publication Number: US20240428816A1
Publication Date: 2024-12-26
Application Number: US18797400
Application Date: 2024-08-07
Applicant: Google LLC
Inventor: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias
IPC: G10L21/0208, G10L17/00, G10L21/0272, G10L25/57
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device; in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of that speaker in the current view, and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device; receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device; and, in response, generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.
-
Publication Number: US20240273311A1
Publication Date: 2024-08-15
Application Number: US18626745
Application Date: 2024-04-04
Applicant: Google LLC
Inventor: Ye Jia, Michelle Tadmor Ramanovich, Tal Remez, Roi Pomerantz
Abstract: A direct speech-to-speech translation (S2ST) model includes an encoder configured to receive an input speech representation that corresponds to an utterance spoken by a source speaker in a first language and encode the input speech representation into a hidden feature representation. The S2ST model also includes an attention module configured to generate a context vector that attends to the hidden feature representation encoded by the encoder. The S2ST model also includes a decoder configured to receive the context vector generated by the attention module and predict a phoneme representation that corresponds to a translation of the utterance in a second, different language. The S2ST model also includes a synthesizer configured to receive the context vector and the phoneme representation and generate a translated synthesized speech representation that corresponds to a translation of the utterance spoken in the second language.
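The abstract's encoder/attention/decoder/synthesizer composition can be sketched as a single module. The following is a minimal, hypothetical sketch in PyTorch: the layer choices and sizes are assumptions, and the decoder queries (which in practice would come from an autoregressive decoding loop) are passed in as a tensor for simplicity.

```python
import torch
import torch.nn as nn

class DirectS2ST(nn.Module):
    """Encoder -> attention -> phoneme decoder, with a synthesizer that
    consumes both the context vectors and the predicted phonemes."""

    def __init__(self, feat_dim: int = 80, hidden: int = 256, n_phonemes: int = 128):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.attention = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.decoder = nn.Linear(hidden, n_phonemes)
        self.synthesizer = nn.LSTM(hidden + n_phonemes, feat_dim, batch_first=True)

    def forward(self, src_speech: torch.Tensor, queries: torch.Tensor):
        # Encode source-language speech features into a hidden representation.
        hidden, _ = self.encoder(src_speech)                  # (B, T, H)
        # Context vectors attend over the encoder's hidden representation.
        context, _ = self.attention(queries, hidden, hidden)  # (B, U, H)
        # Predict target-language phoneme logits from the context.
        phoneme_logits = self.decoder(context)                # (B, U, P)
        # Synthesize translated speech features from context + phonemes.
        translated, _ = self.synthesizer(
            torch.cat([context, phoneme_logits], dim=-1))     # (B, U, feat_dim)
        return translated, phoneme_logits
```

The key structural point matches the abstract: the synthesizer receives both the context vector and the predicted phoneme representation, so translated speech is generated directly rather than from an intermediate text transcript.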
-
Publication Number: US11960852B2
Publication Date: 2024-04-16
Application Number: US17644351
Application Date: 2021-12-15
Applicant: Google LLC
Inventor: Ye Jia, Michelle Tadmor Ramanovich, Tal Remez, Roi Pomerantz
Abstract: A direct speech-to-speech translation (S2ST) model includes an encoder configured to receive an input speech representation that corresponds to an utterance spoken by a source speaker in a first language and encode the input speech representation into a hidden feature representation. The S2ST model also includes an attention module configured to generate a context vector that attends to the hidden feature representation encoded by the encoder. The S2ST model also includes a decoder configured to receive the context vector generated by the attention module and predict a phoneme representation that corresponds to a translation of the utterance in a second, different language. The S2ST model also includes a synthesizer configured to receive the context vector and the phoneme representation and generate a translated synthesized speech representation that corresponds to a translation of the utterance spoken in the second language.
-
Publication Number: US20230013777A1
Publication Date: 2023-01-19
Application Number: US17644351
Application Date: 2021-12-15
Applicant: Google LLC
Inventor: Ye Jia, Michelle Tadmor Ramanovich, Tal Remez, Roi Pomerantz
Abstract: A direct speech-to-speech translation (S2ST) model includes an encoder configured to receive an input speech representation that corresponds to an utterance spoken by a source speaker in a first language and encode the input speech representation into a hidden feature representation. The S2ST model also includes an attention module configured to generate a context vector that attends to the hidden feature representation encoded by the encoder. The S2ST model also includes a decoder configured to receive the context vector generated by the attention module and predict a phoneme representation that corresponds to a translation of the utterance in a second, different language. The S2ST model also includes a synthesizer configured to receive the context vector and the phoneme representation and generate a translated synthesized speech representation that corresponds to a translation of the utterance spoken in the second language.