Patent search ap:("Google LLC") AND inv:"Marco Tagliasacchi" Page 4

31.

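Invention publication
Machine Learning Based Enhancement of Audio for a Voice Call [Under examination, published]

Publication (announcement) number: US20240153514A1

Publication (announcement) date: 2024-05-09

Application number: US18548949

Application date: 2021-03-05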

Applicant: Google LLC

Inventor: Omer Ahmed Siddig Osman, Dominik Roblek, Yunpeng Li, Marco Tagliasacchi, Oleg Rybakov, Victor Ungureanu, Eric Giguere

IPC: G10L19/06, G10L19/16, G10L25/30, G10L25/69

CPC classification number: G10L19/06, G10L19/167, G10L25/30, G10L25/69

Abstract: Apparatus and methods related to enhancement of audio content are provided. An example method includes receiving, by a computing device and via a communications network interface, a compressed audio data frame, wherein the compressed audio data frame is received after transmission over a communications network. The method further includes decompressing the compressed audio data frame to extract an audio waveform. The method also includes predicting, by applying a neural network to the audio waveform, an enhanced version of the audio waveform, wherein the neural network has been trained on (i) a ground truth sample comprising unencoded audio waveforms prior to compression by an audio encoder, and (ii) a training dataset comprising decoded audio waveforms after compression of the unencoded audio waveforms by the audio encoder. The method additionally includes providing, by an audio output component of the computing device, the enhanced version of the audio waveform.
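
The training setup described in this abstract pairs each decoded (post-compression) waveform with its unencoded original. The following is a minimal sketch of how such a pair might be assembled. It is an illustration only, not the patent's method: `toy_codec_roundtrip` is a hypothetical stand-in that imitates a lossy encoder and decoder with uniform quantization, and the sample rate, tone frequency, and number of quantization levels are arbitrary assumptions.

```python
import math


def toy_codec_roundtrip(waveform, levels=16):
    """Hypothetical stand-in for 'compress then decompress'.

    Clips samples to [-1, 1] and snaps them to `levels` uniformly
    spaced values, which loses detail the way a lossy codec would.
    """
    step = 2.0 / (levels - 1)
    decoded = []
    for x in waveform:
        clipped = min(1.0, max(-1.0, x))
        decoded.append(round((clipped + 1.0) / step) * step - 1.0)
    return decoded


def make_training_pair(clean_waveform):
    """Return (network input, target): decoded audio and its unencoded original."""
    return toy_codec_roundtrip(clean_waveform), clean_waveform


# Assumed example signal: 10 ms of a 440 Hz tone sampled at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
net_input, target = make_training_pair(clean)

# Largest per-sample distortion introduced by the toy codec.
max_err = max(abs(a - b) for a, b in zip(net_input, target))
```

A neural network trained on many such pairs would learn to map the degraded input back toward the target; the network itself is outside the scope of this sketch.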

32.

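Invention publication
GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS [Under examination, published]

Publication (announcement) number: US20240078412A1

Publication (announcement) date: 2024-03-07

Application number: US18463092

Application date: 2023-09-07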

Applicant: Google LLC

Inventor: Neil Zeghidour, David Grangier, Marco Tagliasacchi, Raphaël Marinier, Olivier Teboul, Zalán Borsos

IPC: G06N3/0455, G06N3/0475

CPC classification number: G06N3/0455, G06N3/0475

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal; obtaining a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.

33.

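Invention grant
Generating audio using auto-regressive generative neural networks [Granted, in force]

Publication (announcement) number: US11915689B1

Publication (announcement) date: 2024-02-27

Application number: US18463196

Application date: 2023-09-07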

Applicant: Google LLC

Inventor: Andrea Agostinelli, Timo Immanuel Denk, Antoine Caillon, Neil Zeghidour, Jesse Engel, Mauro Verzetti, Christian Frank, Zalán Borsos, Matthew Sharifi, Adam Joseph Roberts, Marco Tagliasacchi

IPC: G06F40/30, G10L15/16, G10L15/18, G10H1/00, G10L15/06

CPC classification number: G10L15/16, G10H1/0008, G10L15/063, G10L15/1815, G10H2210/056, G10H2250/311

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal conditioned on an input; processing the input using an embedding neural network to map the input to one or more embedding tokens; generating a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation and the embedding tokens, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.

34.

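Invention publication
Spatial Audio Recording from Home Assistant Devices [Under examination, published]

Publication (announcement) number: US20230379645A1

Publication (announcement) date: 2023-11-23

Application number: US17748356

Application date: 2022-05-19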

Applicant: Google LLC

Inventor: Rajeev Conrad Nongpiur, Qian Zhang, Andrew James Sutter, Kung-Wei Liu, Jihan Li, Hélène Bahu, Leonardo Kusumo, Sze Chie Lim, Marco Tagliasacchi, Neil Zeghidour, Michael Takezo Chinen

IPC: H04S7/00, G10L19/008, H04R5/027, H04R3/00, H04S3/00, G06N20/00

CPC classification number: H04S7/30, G10L19/008, H04R5/027, H04R3/005, H04S3/008, G06N20/00, H04S2420/11, H04S2400/11, H04S2400/15, H04S2420/03, H04S2400/01, H04R2420/07

Abstract: The technology generally relates to spatial audio communication between devices. For example, a first device and a second device may be connected via a communication link. The first device may capture audio signals in an environment through two or more microphones. The first device may encode the captured audio with spatial configuration data. The first device may transmit the encoded audio via the communication link to the second device. The second device may decode the encoded audio into binaural or ambisonic audio to be output by one or more speakers of the second device. The binaural or ambisonic audio may be converted into spatial audio to be output. The second device may output the binaural or spatial audio to create an immersive listening experience.

35.

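Invention publication
COMPRESSING AUDIO WAVEFORMS USING NEURAL NETWORKS AND VECTOR QUANTIZERS [Under examination, published]

Publication (announcement) number: US20230186927A1

Publication (announcement) date: 2023-06-15

Application number: US18106094

Application date: 2023-02-06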

Applicant: Google LLC

Inventor: Neil Zeghidour, Marco Tagliasacchi, Dominik Roblek

IPC: G10L19/038, G06N3/045, G06N3/08, G10L25/30

CPC classification number: G10L19/038, G06N3/045, G06N3/08, G10L25/30, G10L2019/0002

Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media. One of the methods includes receiving an audio waveform that includes a respective audio sample for each of a plurality of time steps, processing the audio waveform using an encoder neural network to generate a plurality of feature vectors representing the audio waveform, generating a respective coded representation of each of the plurality of feature vectors using a plurality of vector quantizers that are each associated with a respective codebook of code vectors, wherein the respective coded representation of each feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector, and generating a compressed representation of the audio waveform by compressing the respective coded representation of each of the plurality of feature vectors.
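
The coding step in this abstract, where each feature vector is represented by one code vector from each of several codebooks, can be illustrated with a small sketch. It assumes the quantizers are applied as a residual cascade (each quantizer codes what the previous ones left over); the abstract itself does not state how the quantizers are combined, so that choice, the tiny hand-written codebooks, and the function names are all hypothetical. The encoder neural network and the final compression of the indices are not shown.

```python
def nearest(codebook, vec):
    """Index of the code vector closest to `vec` (squared Euclidean distance)."""
    def sq_dist(code):
        return sum((c - v) ** 2 for c, v in zip(code, vec))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def quantize(feature, codebooks):
    """Code one feature vector with one index per codebook.

    Assumption: residual cascade. Each stage picks the nearest code
    vector to the current residual, then subtracts it.
    Returns (indices, quantized vector = sum of chosen code vectors).
    """
    residual = list(feature)
    quantized = [0.0] * len(feature)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        quantized = [q + c for q, c in zip(quantized, codebook[i])]
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices, quantized


# Two toy codebooks: a coarse one and a finer one for the leftover error.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
indices, quantized = quantize([1.2, 0.8], codebooks)
print(indices, quantized)  # [1, 1] [1.25, 0.75]
```

Only the list of indices needs to be stored or transmitted per feature vector; a decoder holding the same codebooks can rebuild the quantized vector by summing the selected code vectors.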
