Patent search ap:("Google LLC") AND inv:"Chong Wang" Page 1

1.

Granted invention patent
Fully supervised speaker diarization (status: active)

Publication number: US11688404B2

Publication date: 2023-06-27

Application number: US17303283

Filing date: 2021-05-26

Applicant: Google LLC

Inventor: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu

IPC: G10L17/04, G10L15/04, G10L15/07, G10L17/02, G10L17/18, G10L15/26, G10L17/00

CPC classification number: G10L17/04, G10L15/04, G10L15/075, G10L15/26, G10L17/00, G10L17/02, G10L17/18

Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances, each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.

2.

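Invention patent application
Fully Supervised Speaker Diarization (status: under examination, published)

Publication number: US20200219517A1

Publication date: 2020-07-09

Application number: US16242541

Filing date: 2019-01-08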

Applicant: Google LLC

Inventor: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu

IPC: G10L17/04, G10L17/00, G10L17/18, G10L17/02, G10L15/07, G10L15/26, G10L15/04

Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances, each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.

3.

Invention patent application
Fully Supervised Speaker Diarization (status: active)

Publication number: US20210280197A1

Publication date: 2021-09-09

Application number: US17303283

Filing date: 2021-05-26

Applicant: Google LLC

Inventor: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu

IPC: G10L17/04, G10L15/04, G10L15/07, G10L17/02, G10L17/18, G10L15/26, G10L17/00

Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances, each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.

4.

Granted invention patent
Fully supervised speaker diarization (status: active)

Publication number: US11031017B2

Publication date: 2021-06-08

Application number: US16242541

Filing date: 2019-01-08

Applicant: Google LLC

Inventor: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu

IPC: G10L17/04, G10L15/04, G10L15/07, G10L17/02, G10L17/18, G10L15/26, G10L17/00

Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances, each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.
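
Illustrative note: the abstract shared by the four records above describes a pipeline of segmenting, embedding, predicting a distribution over speakers, and labeling. The following Python sketch is one toy reading of those steps, not the patented method. Everything in it is an assumption made for illustration: the function names (segment, embed, fit, predict_proba, diarize) are hypothetical, the "embedding" is just the mean and mean absolute value of a segment, and equal-prior isotropic Gaussians with one mean per speaker stand in for the probabilistic generative model.

```python
import math
from collections import defaultdict


def segment(samples, size):
    """Split a sequence of samples into consecutive fixed-size segments."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def embed(seg):
    """Toy stand-in for a speaker-discriminative embedding: (mean, mean |x|)."""
    n = len(seg)
    return (sum(seg) / n, sum(abs(x) for x in seg) / n)


def fit(train_embeddings, train_labels):
    """Estimate one mean vector per speaker from labeled training embeddings."""
    groups = defaultdict(list)
    for emb, label in zip(train_embeddings, train_labels):
        groups[label].append(emb)
    return {
        label: tuple(sum(component) / len(embs) for component in zip(*embs))
        for label, embs in groups.items()
    }


def predict_proba(means, emb, sigma=1.0):
    """Posterior over speakers, assuming equal priors and isotropic Gaussians."""
    log_p = {
        label: -sum((a - b) ** 2 for a, b in zip(emb, mu)) / (2 * sigma ** 2)
        for label, mu in means.items()
    }
    top = max(log_p.values())  # subtract the max for numerical stability
    unnorm = {label: math.exp(v - top) for label, v in log_p.items()}
    total = sum(unnorm.values())
    return {label: v / total for label, v in unnorm.items()}


def diarize(samples, size, means):
    """Assign each segment the speaker with the highest posterior probability."""
    labels = []
    for seg in segment(samples, size):
        probs = predict_proba(means, embed(seg))
        labels.append(max(probs, key=probs.get))
    return labels
```

To use the sketch, fit would be called once on labeled training embeddings, and diarize would then be called on each new utterance with the resulting per-speaker means. A real system would replace embed with a learned speaker encoder and the Gaussian stand-in with whatever generative model the patent claims describe.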
