Self-supervised audio representation learning for mobile devices

    Publication Number: US11501787B2

    Publication Date: 2022-11-15

    Application Number: US16548146

    Filing Date: 2019-08-22

    Applicant: Google LLC

    Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
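
    The loss described in this abstract can be made concrete with a small training step: slices are sampled from an unlabeled waveform, the model predicts a characteristic of the signal, and the difference against a ground truth known from the sampling step drives an end-to-end update. The PyTorch sketch below illustrates the variant that estimates the distance between two sampled slices; SliceEncoder, sample_slice_pair, and all architecture and hyperparameter choices are illustrative assumptions, not details taken from the patent.

```python
# Hypothetical sketch of the self-supervised objective described in the
# abstract: sample slices from an unlabeled waveform, predict a
# characteristic known from the sampling step (here the temporal distance
# between two slices), and train the whole model end to end on the loss.
import torch
import torch.nn as nn

class SliceEncoder(nn.Module):
    """Illustrative slice encoder; the patent does not specify this network."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=9, stride=4), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=9, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, emb_dim),
        )

    def forward(self, x):                      # x: (batch, slice_len)
        return self.net(x.unsqueeze(1))        # -> (batch, emb_dim)

def sample_slice_pair(signal, slice_len=1600):
    """Sample two slices from an unlabeled waveform and return them with the
    normalized temporal distance between their start positions."""
    max_start = signal.shape[-1] - slice_len
    starts = torch.randint(0, max_start, (2,))
    slices = torch.stack([signal[s:s + slice_len] for s in starts])
    distance = (starts[1] - starts[0]).abs().float() / max_start
    return slices, distance

encoder = SliceEncoder()
distance_head = nn.Linear(2 * 128, 1)          # predicts slice-to-slice gap
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(distance_head.parameters()), lr=1e-4)

signal = torch.randn(16000)                    # stand-in for unlabeled audio
slices, true_distance = sample_slice_pair(signal)
embeddings = encoder(slices)                   # (2, emb_dim)
predicted = distance_head(torch.cat([embeddings[0], embeddings[1]]))
loss = nn.functional.mse_loss(predicted.squeeze(), true_distance)

optimizer.zero_grad()
loss.backward()                                # end-to-end gradient step
optimizer.step()
```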

    Self-supervised pitch estimation
    Invention Grant

    Publication Number: US11756530B2

    Publication Date: 2023-09-12

    Application Number: US17640579

    Filing Date: 2020-09-25

    Applicant: GOOGLE LLC

    Abstract: Example embodiments relate to techniques for training artificial neural networks or other machine-learning encoders to accurately predict the pitch of input audio samples in a semitone or otherwise logarithmically-scaled pitch space. An example method may include generating, from a sample of audio data, two training samples by applying two different pitch shifts to the sample of audio data. This can be done by converting the sample of audio data into the frequency domain and then shifting the transformed data. These known shifts are then compared to the predicted pitches generated by applying the two training samples to the encoder. The encoder is then updated based on the comparison, such that the relative pitch output by the encoder is improved with respect to accuracy. One or more audio samples, labeled with absolute pitch values, can then be used to calibrate the relative pitch values generated by the trained encoder.
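
    A minimal sketch of this relative-pitch training step follows, assuming the frequency-domain representation is log-scaled so that a pitch shift corresponds to a shift of bins; the encoder, the bin count, and the loss function are illustrative assumptions rather than details from the patent.

```python
# Hypothetical sketch of the relative-pitch training step described in the
# abstract: shift the same frame by two different known amounts in a
# log-frequency (semitone-scaled) representation, ask the encoder for a
# pitch estimate of each shifted copy, and penalize the mismatch between
# the predicted pitch difference and the known shift difference.
import torch
import torch.nn as nn

NUM_BINS = 128                       # log-frequency bins; illustrative value

encoder = nn.Sequential(             # illustrative pitch encoder, not the patent's
    nn.Linear(NUM_BINS, 256), nn.ReLU(),
    nn.Linear(256, 1),               # scalar pitch estimate in semitone-like units
)
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)

def shift_bins(frame, k):
    """Shift a log-frequency frame by k bins; in a semitone-scaled
    representation this approximates a pitch shift of k semitones."""
    return torch.roll(frame, shifts=k, dims=-1)

frame = torch.rand(NUM_BINS)                   # stand-in for one audio frame
k1 = int(torch.randint(-12, 13, (1,)))         # two different known shifts
k2 = int(torch.randint(-12, 13, (1,)))
x1, x2 = shift_bins(frame, k1), shift_bins(frame, k2)

p1, p2 = encoder(x1), encoder(x2)              # predicted (relative) pitches
loss = nn.functional.huber_loss(p1 - p2, torch.tensor([float(k1 - k2)]))
optimizer.zero_grad()
loss.backward()
optimizer.step()

# A handful of samples labeled with absolute pitch can later fix the constant
# offset, turning the encoder's relative estimates into absolute ones.
```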

    Self-Supervised Audio Representation Learning for Mobile Devices

    Publication Number: US20210056980A1

    Publication Date: 2021-02-25

    Application Number: US16548146

    Filing Date: 2019-08-22

    Applicant: Google LLC

    Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.

    Conditioned Separation of Arbitrary Sounds based on Machine Learning Models

    Publication Number: US20230419989A1

    Publication Date: 2023-12-28

    Application Number: US17808653

    Filing Date: 2022-06-24

    Applicant: Google LLC

    Abstract: Example methods include receiving training data comprising a plurality of audio clips and a plurality of textual descriptions of audio. The methods include generating a shared representation comprising a joint embedding, wherein an audio embedding of a given audio clip is within a threshold distance of a text embedding of a textual description of the given audio clip. The methods include generating, based on the joint embedding, a conditioning vector, and training, based on the conditioning vector, a neural network to: receive (i) an input audio waveform and (ii) an input comprising one or more of an input textual description of a target audio source in the input audio waveform or an audio sample of the target audio source; separate audio corresponding to the target audio source from the input audio waveform; and output the separated audio corresponding to the target audio source in response to receiving the input.
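
    The sketch below illustrates, under assumed and heavily simplified architectures, the data flow this abstract describes: a text encoder and an audio encoder map descriptions and clips into a shared embedding space, an alignment loss keeps paired embeddings close, and the resulting conditioning vector is fed to a separation network together with the mixture waveform. Every module name and dimension here is a hypothetical stand-in, not the patent's model.

```python
# Hypothetical sketch of the conditioning path described in the abstract:
# audio clips and their textual descriptions are mapped into a shared
# embedding space, the joint embedding yields a conditioning vector, and a
# separation network consumes the mixture waveform plus that vector.
# All modules below are illustrative stand-ins, not the patent's models.
import torch
import torch.nn as nn

EMB = 64

text_encoder  = nn.Sequential(nn.Linear(300, EMB))   # e.g. pooled word vectors
audio_encoder = nn.Sequential(nn.Linear(1024, EMB))  # e.g. pooled spectrogram

class ConditionedSeparator(nn.Module):
    """Separates the target source from a mixture, conditioned on the shared
    embedding of either a text query or an audio example."""
    def __init__(self, wav_len=16000, emb_dim=EMB):
        super().__init__()
        self.cond_proj = nn.Linear(emb_dim, 128)
        self.net = nn.Sequential(nn.Linear(wav_len + 128, 512), nn.ReLU(),
                                 nn.Linear(512, wav_len))

    def forward(self, mixture, cond):
        c = self.cond_proj(cond)
        return self.net(torch.cat([mixture, c], dim=-1))

# Aligning the two modalities: pull the audio embedding of a clip toward the
# text embedding of its description (the abstract's "within a threshold
# distance" constraint), expressed here as a simple distance loss.
text_feat, audio_feat = torch.rand(300), torch.rand(1024)
t, a = text_encoder(text_feat), audio_encoder(audio_feat)
align_loss = nn.functional.mse_loss(a, t)

# The conditioning vector derived from the joint embedding drives the separator.
separator = ConditionedSeparator()
mixture = torch.rand(16000)
separated = separator(mixture, t.detach())            # text-conditioned separation
```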

    Self-Supervised Audio Representation Learning for Mobile Devices

    Publication Number: US20230085596A1

    Publication Date: 2023-03-16

    Application Number: US17986477

    Filing Date: 2022-11-14

    Applicant: Google LLC

    Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.