Patent search ap:("Google LLC") AND inv:"Marco Tagliasacchi" Page 3

21.

发明授权
Generating audio using auto-regressive generative neural networks 有权

公开(公告)号：US12020138B2

公开(公告)日：2024-06-25

申请号：US18463092

申请日：2023-09-07

Applicant: Google LLC

Inventor： Neil Zeghidour , David Grangier , Marco Tagliasacchi , Raphaël Marinier , Olivier Teboul , Zalán Borsos

IPC: G06N3/0455 , G06N3/0475

CPC classification number: G06N3/0455 , G06N3/0475

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal; obtaining a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.

22.

发明公开
Conditioned Separation of Arbitrary Sounds based on Machine Learning Models 审中-公开

公开(公告)号：US20230419989A1

公开(公告)日：2023-12-28

申请号：US17808653

申请日：2022-06-24

Applicant: Google LLC

Inventor： Beat Gfeller , Kevin Ian Kilgour , Marco Tagliasacchi , Aren Jansen , Scott Thomas Wisdom , Qingqing Huang

IPC: G10L25/84 , G10L15/16 , G10L15/06 , G06N3/04

CPC classification number: G10L25/84 , G10L15/16 , G10L15/063 , G06N3/0454

Abstract: Example methods include receiving training data comprising a plurality of audio clips and a plurality of textual descriptions of audio. The methods include generating a shared representation comprising a joint embedding. An audio embedding of a given audio clip is within a threshold distance of a text embedding of a textual description of the given audio clip. The methods include generating, based on the joint embedding, a conditioning vector and training, based on the conditioning vector, a neural network to: receive (i) an input audio waveform, and (ii) an input comprising one or more of an input textual description of a target audio source in the input audio waveform, or an audio sample of the target audio source, separate audio corresponding to the target audio source from the input audio waveform, and output the separated audio corresponding to the target audio source in response to the receiving of the input.

23.

发明申请
Self-Supervised Audio Representation Learning for Mobile Devices 有权

公开(公告)号：US20230085596A1

公开(公告)日：2023-03-16

申请号：US17986477

申请日：2022-11-14

Applicant: Google LLC

Inventor： Beat Gfeller , Dominik Roblek , Félix de Chaumont Quitry , Marco Tagliasacchi

IPC: G10L19/035 , G06N20/00 , G10L19/038 , G10L25/18

Abstract: Systems and methods for training a machine-learned model are provided. A method can include can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.

24.

发明申请
GENERATING AUDIO WAVEFORMS USING ENCODER AND DECODER NEURAL NETWORKS 有权

公开(公告)号：US20230013370A1

公开(公告)日：2023-01-19

申请号：US17856292

申请日：2022-07-01

Applicant: Google LLC

Inventor： Yunpeng Li , Marco Tagliasacchi , Dominik Roblek , Félix de Chaumont Quitry , Beat Gfeller , Hannah Raphaelle Muckenhirn , Victor Ungureanu , Oleg Rybakov , Karolis Misiunas , Zalán Borsos

IPC: G10L19/022 , G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing an input audio waveform using a generator neural network to generate an output audio waveform. In one aspect, a method comprises: receiving an input audio waveform; processing the input audio waveform using an encoder neural network to generate a set of feature vectors representing the input audio waveform; and processing the set of feature vectors representing the input audio waveform using a decoder neural network to generate an output audio waveform that comprises a respective output audio sample for each of a plurality of output time steps.

25.

发明申请
Methods and Systems for Implementing On-Device Non-Semantic Representation Fine-Tuning for Speech Classification 有权

公开(公告)号：US20220059117A1

公开(公告)日：2022-02-24

申请号：US17000583

申请日：2020-08-24

Applicant: Google LLC

Inventor： Joel Shor , Ronnie Maor , Oran Lang , Omry Tuval , Marco Tagliasacchi , Ira Shavitt , Felix de Chaumont Quitry , Dotan Emanuel , Aren Jansen

IPC: G10L25/30 , G10L25/48 , G06N3/08 , G06N5/04 , G06K9/62

Abstract: Examples relate to on-device non-semantic representation fine-tuning for speech classification. A computing system may obtain audio data having a speech portion and train a neural network to learn a non-semantic speech representation based on the speech portion of the audio data. The computing system may evaluate performance of the non-semantic speech representation based on a set of benchmark tasks corresponding to a speech domain and perform a fine-tuning process on the non-semantic speech representation based on one or more downstream tasks. The computing system may further generate a model based on the non-semantic representation and provide the model to a mobile computing device. The model is configured to operate locally on the mobile computing device.

26.

发明申请
SEMI-SUPERVISED TEXT-TO-SPEECH BY GENERATING SEMANTIC AND ACOUSTIC REPRESENTATIONS 有权

公开(公告)号：US20250157456A1

公开(公告)日：2025-05-15

申请号：US18832325

申请日：2024-01-26

Applicant: Google LLC

Inventor： Evgeny Kharitonov , Damien Vincent , Zalán Borsos , Raphaël Marinier , Olivier Claude Pietquin , Matthew Sharifi , Marco Tagliasacchi , Neil Zeghidour

IPC: G10L13/027 , G06F40/284 , G06F40/30 , G10L13/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an audio signal from input text. In one aspect, a method comprises receiving a request to convert input text into an audio signal, wherein the input text comprises multiple tokenized text inputs, generating, using a first generative neural network, a semantic representation of the tokenized text inputs comprising semantic tokens representing semantic content of the tokenized text inputs, each semantic token being selected from a vocabulary of semantic tokens, generating, using a second generative neural network and conditioned on at least the semantic representation, an acoustic representation of the semantic representation comprising one or more respective acoustic tokens representing acoustic properties of the audio signal, and processing the acoustic representation using a decoder neural network to generate the audio signal.

27.

发明申请
GENERATING AUDIO WAVEFORMS USING ENCODER AND DECODER NEURAL NETWORKS 有权

公开(公告)号：US20250078848A1

公开(公告)日：2025-03-06

申请号：US18952607

申请日：2024-11-19

Applicant: Google LLC

Inventor： Yunpeng Li , Marco Tagliasacchi , Dominik Roblek , Félix de Chaumont Quitry , Beat Gfeller , Hannah Raphaelle Muckenhirn , Victor Ungureanu , Oleg Rybakov , Karolis Misiunas , Zalán Borsos

IPC: G10L19/022 , G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing an input audio waveform using a generator neural network to generate an output audio waveform. In one aspect, a method comprises: receiving an input audio waveform; processing the input audio waveform using an encoder neural network to generate a set of feature vectors representing the input audio waveform; and processing the set of feature vectors representing the input audio waveform using a decoder neural network to generate an output audio waveform that comprises a respective output audio sample for each of a plurality of output time steps.

28.

发明申请
COMPRESSING AUDIO WAVEFORMS USING A STRUCTURED LATENT SPACE 有权

公开(公告)号：US20250022477A1

公开(公告)日：2025-01-16

申请号：US18278746

申请日：2023-03-16

Applicant: Google LLC

Inventor： Ahmed Omran , Neil Zeghidour , Zalán Borsos , Félix de Chaumont Quitry , Marco Tagliasacchi

IPC: G10L19/038 , G10L25/30 , G10L25/60

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network and a decoder neural network. In one aspect, a method includes obtaining a first initial audio waveform and a first noisy audio waveform, obtaining a second initial audio waveform and a second noisy audio waveform, processing the first noisy audio waveform and the second noisy audio waveform using an encoder neural network, generating a blended embedding by concatenating: (i) clean feature dimensions from an embedding of the first noisy audio waveform, and (ii) noise feature dimensions from an embedding of the second noisy audio waveform, processing the blended embedding using a decoder neural network to generate a reconstructed audio waveform, determining gradients of an objective function; and updating parameter values of the encoder neural network and the decoder neural network using the gradients.

29.

发明授权
Generating coded data representations using neural networks and vector quantizers 有权

公开(公告)号：US12198710B2

公开(公告)日：2025-01-14

申请号：US18400992

申请日：2023-12-29

Applicant: Google LLC

Inventor： Neil Zeghidour , Marco Tagliasacchi , Dominik Roblek

IPC: G10L19/038 , G06N3/045 , G06N3/08 , G10L19/00 , G10L25/30

Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media. According to one aspect, there is provided a method comprising: receiving a new input; processing the new input using an encoder neural network to generate a feature vector representing the new input; and generating a coded representation of the feature vector using a sequence of vector quantizers that are each associated with a respective codebook of code vectors, wherein the coded representation of the feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector.

30.

发明申请
PERFORMING TASKS USING GENERATIVE NEURAL NETWORKS 有权

公开(公告)号：US20240428056A1

公开(公告)日：2024-12-26

申请号：US18750973

申请日：2024-06-21

Applicant: Google LLC

Inventor： Paul Kishan Rubenstein , Matthew Sharifi , Alexandru Tudor , Chulayuth Asawaroengchai , Duc Dung Nguyen , Marco Tagliasacchi , Neil Zeghidour , Zalán Borsos , Christian Frank , Dalia Salem Hassan Fahmy Elbadawy , Hannah Raphaelle Muckenhirn , Dirk Ryan Padfield , Damien Vincent , Evgeny Kharitonov , Michelle Dana Tadmor , Mihajlo Velimirovic , Feifan Chen , Victoria Zayats

IPC: G06N3/0475 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing tasks. One of the methods includes obtaining a sequence of input tokens, where each token is selected from a vocabulary of tokens that includes text tokens and audio tokens, and wherein the sequence of input tokens includes tokens that describe a task to be performed and data for performing the task; generating a sequence of embeddings by embedding each token in the sequence of input tokens in an embedding space; and processing the sequence of embeddings using a language model neural network to generate a sequence of output tokens for the task, where each token is selected from the vocabulary.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification