Patent search ap:("GOOGLE LLC") AND inv:"Neil Zeghidour" Page 1

1.

Invention publication
GENERATING CODED DATA REPRESENTATIONS USING NEURAL NETWORKS AND VECTOR QUANTIZERS
Status: Under examination (published)

Publication (announcement) number: US20240185870A1

Publication (announcement) date: 2024-06-06

Application number: US18400992

Application date: 2023-12-29

Applicant: Google LLC

Inventor: Neil Zeghidour, Marco Tagliasacchi, Dominik Roblek

IPC: G10L19/038, G06N3/045, G06N3/08, G10L25/30

CPC classification number: G10L19/038, G06N3/045, G06N3/08, G10L25/30, G10L2019/0002

Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media. According to one aspect, there is provided a method comprising: receiving a new input; processing the new input using an encoder neural network to generate a feature vector representing the new input; and generating a coded representation of the feature vector using a sequence of vector quantizers that are each associated with a respective codebook of code vectors, wherein the coded representation of the feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector.
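The abstract describes coding a feature vector with a sequence of vector quantizers, each contributing one code vector from its own codebook. One plausible reading, which the abstract itself does not confirm, is residual quantization: each quantizer codes whatever the previous ones left over, and the quantized representation is the sum of the selected code vectors. The sketch below illustrates that reading only. The function names (`nearest_index`, `encode`, `decode`) and the toy codebooks are hypothetical and do not come from the patent; in the patent's setting the feature vector would come from an encoder neural network, which is omitted here.

```python
from typing import List, Sequence

Vector = Sequence[float]
Codebook = Sequence[Vector]


def nearest_index(x: Vector, codebook: Codebook) -> int:
    """Index of the code vector closest to x (squared Euclidean distance)."""
    def sq_dist(c: Vector) -> float:
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def encode(feature: Vector, codebooks: Sequence[Codebook]) -> List[int]:
    """Coded representation: one code index per quantizer in the sequence.

    Assumption: residual quantization, i.e. each quantizer sees the
    remainder left after subtracting the earlier quantizers' code vectors.
    """
    residual = list(feature)
    indices = []
    for codebook in codebooks:
        idx = nearest_index(residual, codebook)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def decode(indices: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Quantized representation: sum of the identified code vectors."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy, hand-picked codebooks (hypothetical): coarse first, finer second.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
feature = [1.2, 0.8]  # stand-in for an encoder output
indices = encode(feature, codebooks)
quantized = decode(indices, codebooks)
print(indices, quantized)
```

Under this reading, only the list of indices needs to be stored or transmitted; a receiver holding the same codebooks can rebuild the quantized vector with `decode`.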

2.

Invention application
GENERATING CODED DATA REPRESENTATIONS USING NEURAL NETWORKS AND VECTOR QUANTIZERS
Status: In force

Publication (announcement) number: US20250131932A1

Publication (announcement) date: 2025-04-24

Application number: US18972483

Application date: 2024-12-06

Applicant: Google LLC

Inventor: Neil Zeghidour, Marco Tagliasacchi, Dominik Roblek

IPC: G10L19/038, G06N3/045, G06N3/08, G10L19/00, G10L25/30

Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media. According to one aspect, there is provided a method comprising: receiving a new input; processing the new input using an encoder neural network to generate a feature vector representing the new input; and generating a coded representation of the feature vector using a sequence of vector quantizers that are each associated with a respective codebook of code vectors, wherein the coded representation of the feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector.

3.

Invention application
GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS
Status: In force

Publication (announcement) number: US20240371366A1

Publication (announcement) date: 2024-11-07

Application number: US18663899

Application date: 2024-05-14

Applicant: Google LLC

Inventor: Neil Zeghidour, David Grangier, Marco Tagliasacchi, Raphaël Marinier, Olivier Teboul, Zalán Borsos

IPC: G10L15/16, G06N3/0455, G06N3/0475, G10H1/00, G10L15/06, G10L15/18

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal; obtaining a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.

4.

Invention publication
GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS
Status: Under examination (published)

Publication (announcement) number: US20240233713A1

Publication (announcement) date: 2024-07-11

Application number: US18412394

Application date: 2024-01-12

Applicant: Google LLC

Inventor: Andrea Agostinelli, Timo Immanuel Denk, Antoine Caillon, Neil Zeghidour, Jesse Engel, Mauro Verzetti, Christian Frank, Zalán Borsos, Matthew Sharifi, Adam Joseph Roberts, Marco Tagliasacchi

IPC: G10L15/16, G06N3/0455, G06N3/0475, G10H1/00, G10L15/06, G10L15/18

CPC classification number: G10L15/16, G06N3/0455, G06N3/0475, G10H1/0008, G10L15/063, G10L15/1815, G10H2210/056, G10H2250/311

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal conditioned on an input; processing the input using an embedding neural network to map the input to one or more embedding tokens; generating a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation and the embedding tokens, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.

5.

Granted invention patent
Generating audio using auto-regressive generative neural networks
Status: In force

Publication (announcement) number: US12020138B2

Publication (announcement) date: 2024-06-25

Application number: US18463092

Application date: 2023-09-07

Applicant: Google LLC

Inventor: Neil Zeghidour, David Grangier, Marco Tagliasacchi, Raphaël Marinier, Olivier Teboul, Zalán Borsos

IPC: G06N3/0455, G06N3/0475

CPC classification number: G06N3/0455, G06N3/0475

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal; obtaining a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.

6.

Granted invention patent
Separating speech by source in audio recordings by predicting isolated audio signals conditioned on speaker representations
Status: In force

Publication (announcement) number: US11475909B2

Publication (announcement) date: 2022-10-18

Application number: US17170657

Application date: 2021-02-08

Applicant: Google LLC

Inventor: Neil Zeghidour, David Grangier

IPC: G10L21/028, G10L21/0316, G10L17/04, G10L17/18, G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.

7.

Granted invention patent
Separating speech by source in audio recordings by predicting isolated audio signals conditioned on speaker representations
Status: In force

Publication (announcement) number: US12236970B2

Publication (announcement) date: 2025-02-25

Application number: US17967726

Application date: 2022-10-17

Applicant: Google LLC

Inventor: Neil Zeghidour, David Grangier

IPC: G10L21/028, G06N3/045, G06N3/08, G10L17/04, G10L17/18, G10L21/0208, G10L21/0272, G10L21/0316, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.

8.

Invention application
USING MACHINE LEARNING AND DISCRETE TOKENS TO ESTIMATE DIFFERENT SOUND SOURCES FROM AUDIO MIXTURES
Status: In force

Publication (announcement) number: US20250054500A1

Publication (announcement) date: 2025-02-13

Application number: US18233323

Application date: 2023-08-13

Applicant: Google LLC

Inventor: Hakan Erdogan, Scott Thomas Wisdom, John Hershey, Zalán Borsos, Marco Tagliasacchi, Neil Zeghidour, Xuankai Chang

IPC: G10L17/20, G10L17/02, G10L17/04, G10L17/06, G10L17/18

Abstract: A system and method are disclosed. Audio input comprising the mixed audio signals is received by one or more client devices. The audio input is converted into a plurality of discrete tokens. A plurality of sound sources, each corresponding to a subset of discrete tokens of a plurality of subsets of discrete tokens, is determined using a trained machine learning model.

9.

Granted invention patent
Spatial audio recording from home assistant devices
Status: In force

Publication (announcement) number: US12200465B2

Publication (announcement) date: 2025-01-14

Application number: US17748356

Application date: 2022-05-19

Applicant: Google LLC

Inventor: Rajeev Conrad Nongpiur, Qian Zhang, Andrew James Sutter, Kung-Wei Liu, Jihan Li, Hélène Bahu, Leonardo Kusumo, Sze Chie Lim, Marco Tagliasacchi, Neil Zeghidour, Michael Takezo Chinen

IPC: H04S3/00, G06N20/00, G10L19/008, H04R3/00, H04R5/027, H04S7/00

Abstract: The technology generally relates to spatial audio communication between devices. For example, a first device and a second device may be connected via a communication link. The first device may capture audio signals in an environment through two or more microphones. The first device may encode the captured audio with spatial configuration data. The first device may transmit the encoded audio via the communication link to the second device. The second device may decode the encoded audio into binaural or ambisonic audio to be output by one or more speakers of the second device. The binaural or ambisonic audio may be converted into spatial audio to be output. The second device may output the binaural or spatial audio to create an immersive listening experience.

10.

Invention application
AUDIO-FOCUS FOR AMBIENT NOISE CANCELLATION
Status: In force

Publication (announcement) number: US20240428818A1

Publication (announcement) date: 2024-12-26

Application number: US18751015

Application date: 2024-06-21

Applicant: GOOGLE LLC

Inventor: Rajeev Nongpiur, Neil Zeghidour, Marco Tagliasacchi

IPC: G10L21/0364, G10L15/06, G10L21/0208, G10L21/0232, G10L21/034, G10L25/30

Abstract: A method including identifying an audio capture device and a target direction associated with the audio capture device, detecting first audio associated with the target direction, enhancing the first audio using a machine learning model configured to detect audio associated with the target direction, optionally, detecting second audio associated with a direction different from the target direction, and optionally, diminishing the second audio using the machine learning model.
