Patent search ap:("Google LLC") AND inv:"Joel Shor" Page 1

1.

Granted invention patent
Image compression with recurrent neural networks [In force]

Publication (announcement) number: US10192327B1

Publication (announcement) date: 2019-01-29

Application number: US15424711

Application date: 2017-02-03

Applicant: Google LLC

Inventor: George Dan Toderici , Sean O'Malley , Rahul Sukthankar , Sung Jin Hwang , Damien Vincent , Nicholas Johnston , David Charles Minnen , Joel Shor , Michele Covell

IPC: G06T9/00 , G06K9/66 , G06K9/62

Abstract: Methods, and systems, including computer programs encoded on computer storage media for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.
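
Illustrative note (not part of the patent record): the abstract describes an encoder that runs over several time steps and feeds a binarizing layer, which is what allows a variable compression rate. The sketch below is only a toy stand-in for that idea. It replaces the LSTM stacks with plain sign-bit coding of a residual, and it assumes that the per-step input is the remaining residual and that the step size halves each time. The names `binarize`, `progressive_encode`, and `decode` are hypothetical.

```python
from typing import List, Sequence


def binarize(values: Sequence[float]) -> List[int]:
    """Map each value to +1 or -1 (a stand-in for a binarizing layer)."""
    return [1 if v >= 0 else -1 for v in values]


def progressive_encode(pixels: Sequence[float], steps: int) -> List[List[int]]:
    """Emit one bit per pixel per time step; more steps means more bits.

    Assumes pixel values are scaled to [-1, 1].
    """
    residual = list(pixels)
    scale = 0.5  # assumed step size, halved at every time step
    codes = []
    for _ in range(steps):
        bits = binarize(residual)
        codes.append(bits)
        # The next step sees what the bits so far failed to explain.
        residual = [r - scale * b for r, b in zip(residual, bits)]
        scale /= 2
    return codes


def decode(codes: Sequence[Sequence[int]]) -> List[float]:
    """Rebuild an approximation from however many bit planes are available."""
    if not codes:
        return []
    recon = [0.0] * len(codes[0])
    scale = 0.5
    for bits in codes:
        recon = [x + scale * b for x, b in zip(recon, bits)]
        scale /= 2
    return recon
```

In this toy scheme the reconstruction error per pixel is bounded by 2^-steps, so truncating the list of bit planes trades quality for size.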

2.

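Granted invention patent
Self-supervised speech representations for fake audio detection [In force]

Publication (announcement) number: US12198718B2

Publication (announcement) date: 2025-01-14

Application number: US18446623

Application date: 2023-08-09

Applicant: Google LLC

Inventor: Joel Shor , Alanna Foster Slocum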

IPC: G10L25/69 , G10L15/02 , G10L15/06 , G10L15/22

Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio features vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.
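
Illustrative note (not part of the patent record): the decision flow in the abstract (feature vectors, a shallow discriminator score, then a threshold test) can be sketched as below. This is a minimal sketch under stated assumptions: the per-vector scores are averaged, "satisfies the threshold" is read as `score >= threshold`, and `toy_discriminator` with its weights is a made-up logistic scorer, not the patented model.

```python
import math
from statistics import fmean
from typing import Callable, Sequence, Tuple

FeatureVector = Sequence[float]


def toy_discriminator(vector: FeatureVector) -> float:
    """Hypothetical shallow scorer: logistic regression with made-up weights."""
    weights = (0.8, -0.5)
    z = sum(w * x for w, x in zip(weights, vector))
    return 1.0 / (1.0 + math.exp(-z))


def detect_synthetic_speech(
    feature_vectors: Sequence[FeatureVector],
    discriminator: Callable[[FeatureVector], float],
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Return (score, is_synthetic) for one audio clip.

    Assumption: the clip-level score is the mean of the per-vector scores.
    """
    if not feature_vectors:
        raise ValueError("need at least one feature vector")
    score = fmean(discriminator(v) for v in feature_vectors)
    return score, score >= threshold
```

A call such as `detect_synthetic_speech(vectors, toy_discriminator, threshold=0.6)` returns both the score and the decision, so the threshold can be tuned without rescoring.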

3.

Granted invention patent
Method for detecting and classifying coughs or other non-semantic sounds using audio feature set learned from speech [In force]

Publication (announcement) number: US11862188B2

Publication (announcement) date: 2024-01-02

Application number: US17507461

Application date: 2021-10-21

Applicant: Google LLC

Inventor: Jacob Garrison , Jacob Scott Peplinski , Joel Shor

IPC: G10L25/66 , G10L15/02 , G10L15/06 , G10L15/04 , A61B5/00 , G16H40/67 , A61B5/08 , G10L25/78 , G10L25/51 , G10L25/30

CPC classification number: G10L25/66 , A61B5/0823 , A61B5/4803 , A61B5/7267 , A61B5/7282 , G10L15/02 , G10L15/04 , G10L15/063 , G10L25/30 , G10L25/51 , G10L25/78 , G16H40/67

Abstract: A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.
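
Illustrative note (not part of the patent record): the last two steps of the abstract (per-segment cough probabilities, then cough metrics per episode) can be sketched as below. This is a rough sketch under assumptions of my own: an episode is taken to be a run of consecutive segments whose probability is at or above a threshold, and the metrics shown (count, total and longest duration) are examples only. `find_cough_episodes` and `cough_metrics` are hypothetical names.

```python
from typing import Dict, List, Sequence, Tuple


def find_cough_episodes(
    probs: Sequence[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Group consecutive above-threshold segments into (start, end) ranges.

    Indices refer to segments; `end` is exclusive.
    """
    episodes = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # an episode begins
        elif p < threshold and start is not None:
            episodes.append((start, i))  # the episode ended at the previous segment
            start = None
    if start is not None:
        episodes.append((start, len(probs)))
    return episodes


def cough_metrics(
    probs: Sequence[float], segment_seconds: float, threshold: float = 0.5
) -> Dict[str, float]:
    """Summarize detected episodes; the chosen metrics are illustrative."""
    episodes = find_cough_episodes(probs, threshold)
    durations = [(end - start) * segment_seconds for start, end in episodes]
    return {
        "episode_count": len(episodes),
        "total_cough_seconds": sum(durations),
        "longest_episode_seconds": max(durations, default=0.0),
    }
```

For example, the probabilities `[0.1, 0.9, 0.8, 0.2, 0.7]` with one-second segments give two episodes, covering segments 1 to 2 and segment 4.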

4.

Granted invention patent
Image compression with recurrent neural networks [In force]

Publication (announcement) number: US10713818B1

Publication (announcement) date: 2020-07-14

Application number: US16259207

Application date: 2019-01-28

Applicant: Google LLC

Inventor: George Dan Toderici , Sean O'Malley , Rahul Sukthankar , Sung Jin Hwang , Damien Vincent , Nicholas Johnston , David Charles Minnen , Joel Shor , Michele Covell

IPC: G06T9/00 , G06K9/62 , G06K9/66

Abstract: Methods, and systems, including computer programs encoded on computer storage media for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.

5.

Granted invention patent
Method for detecting and classifying coughs or other non-semantic sounds using audio feature set learned from speech [In force]

Publication (announcement) number: US12249346B2

Publication (announcement) date: 2025-03-11

Application number: US18509722

Application date: 2023-11-15

Applicant: Google LLC

Inventor: Jacob Garrison , Jacob Scott Peplinski , Joel Shor

IPC: G10L25/66 , A61B5/00 , A61B5/08 , G10L15/02 , G10L15/04 , G10L15/06 , G10L25/30 , G10L25/51 , G10L25/78 , G16H40/67

Abstract: A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.

6.

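Invention patent application
Self-Supervised Speech Representations for Fake Audio Detection [In force]

Publication (announcement) number: US20220172739A1

Publication (announcement) date: 2022-06-02

Application number: US17110278

Application date: 2020-12-02

Applicant: Google LLC

Inventor: Joel Shor , Joshua Foster Slocum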

IPC: G10L25/69 , G10L15/06 , G10L15/02 , G10L15/22

Abstract: A method for determining synthetic speech includes receiving audio data characterizing speech in audio data obtained by a user device. The method also includes generating, using a trained self-supervised model, a plurality of audio features vectors each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.

7.

Invention patent application
Method for Detecting and Classifying Coughs or Other Non-Semantic Sounds Using Audio Feature Set Learned from Speech [In force]

Publication (announcement) number: US20220130415A1

Publication (announcement) date: 2022-04-28

Application number: US17507461

Application date: 2021-10-21

Applicant: Google LLC

Inventor: Jacob Garrison , Jacob Scott Peplinski , Joel Shor

IPC: G10L25/66 , G10L15/02 , G10L15/06 , G10L15/04 , G10L25/78 , G16H40/67 , A61B5/08 , A61B5/00

Abstract: A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.

8.

Invention patent application
Methods and Systems for Implementing On-Device Non-Semantic Representation Fine-Tuning for Speech Classification [In force]

Publication (announcement) number: US20220059117A1

Publication (announcement) date: 2022-02-24

Application number: US17000583

Application date: 2020-08-24

Applicant: Google LLC

Inventor: Joel Shor , Ronnie Maor , Oran Lang , Omry Tuval , Marco Tagliasacchi , Ira Shavitt , Felix de Chaumont Quitry , Dotan Emanuel , Aren Jansen

IPC: G10L25/30 , G10L25/48 , G06N3/08 , G06N5/04 , G06K9/62

Abstract: Examples relate to on-device non-semantic representation fine-tuning for speech classification. A computing system may obtain audio data having a speech portion and train a neural network to learn a non-semantic speech representation based on the speech portion of the audio data. The computing system may evaluate performance of the non-semantic speech representation based on a set of benchmark tasks corresponding to a speech domain and perform a fine-tuning process on the non-semantic speech representation based on one or more downstream tasks. The computing system may further generate a model based on the non-semantic representation and provide the model to a mobile computing device. The model is configured to operate locally on the mobile computing device.

9.

Granted invention patent
Methods and systems for implementing on-device non-semantic representation fine-tuning for speech classification [In force]

Publication (announcement) number: US11996116B2

Publication (announcement) date: 2024-05-28

Application number: US17000583

Application date: 2020-08-24

Applicant: Google LLC

Inventor: Joel Shor , Ronnie Maor , Oran Lang , Omry Tuval , Marco Tagliasacchi , Ira Shavitt , Felix de Chaumont Quitry , Dotan Emanuel , Aren Jansen

IPC: G10L25/30 , G06F18/21 , G06N3/084 , G06N3/088 , G06N5/046 , G10L25/48

CPC classification number: G10L25/30 , G06F18/217 , G06N3/084 , G06N3/088 , G06N5/046 , G10L25/48

Abstract: Examples relate to on-device non-semantic representation fine-tuning for speech classification. A computing system may obtain audio data having a speech portion and train a neural network to learn a non-semantic speech representation based on the speech portion of the audio data. The computing system may evaluate performance of the non-semantic speech representation based on a set of benchmark tasks corresponding to a speech domain and perform a fine-tuning process on the non-semantic speech representation based on one or more downstream tasks. The computing system may further generate a model based on the non-semantic representation and provide the model to a mobile computing device. The model is configured to operate locally on the mobile computing device.

10.

Published invention application
Method for Detecting and Classifying Coughs or Other Non-Semantic Sounds Using Audio Feature Set Learned from Speech [Under examination, published]

Publication (announcement) number: US20240161769A1

Publication (announcement) date: 2024-05-16

Application number: US18509722

Application date: 2023-11-15

Applicant: Google LLC

Inventor: Jacob Garrison , Jacob Scott Peplinski , Joel Shor

IPC: G10L25/66 , A61B5/00 , A61B5/08 , G10L15/02 , G10L15/04 , G10L15/06 , G10L25/30 , G10L25/51 , G10L25/78 , G16H40/67

CPC classification number: G10L25/66 , A61B5/0823 , A61B5/4803 , A61B5/7267 , A61B5/7282 , G10L15/02 , G10L15/04 , G10L15/063 , G10L25/30 , G10L25/51 , G10L25/78 , G16H40/67

Abstract: A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.
