-
Publication number: US11688404B2
Publication date: 2023-06-27
Application number: US17303283
Application date: 2021-05-26
Applicant: Google LLC
Inventor: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu
CPC classification number: G10L17/04, G10L15/04, G10L15/075, G10L15/26, G10L17/00, G10L17/02, G10L17/18
Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances, each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.
-
Publication number: US20200219517A1
Publication date: 2020-07-09
Application number: US16242541
Application date: 2019-01-08
Applicant: Google LLC
Inventor: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu
Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances, each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.
-
Publication number: US20210280197A1
Publication date: 2021-09-09
Application number: US17303283
Application date: 2021-05-26
Applicant: Google LLC
Inventor: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu
Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances, each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.
-
Publication number: US11031017B2
Publication date: 2021-06-08
Application number: US16242541
Application date: 2019-01-08
Applicant: Google LLC
Inventor: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu
Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances, each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.
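Note: The pipeline these abstracts describe (segment the utterance, embed each segment, predict a distribution over speakers, assign a label) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the patented method: `toy_embedding` is a hypothetical stand-in for a learned speaker-discriminative embedding, and the generative model is replaced by a simple isotropic Gaussian per speaker with a uniform prior, fitted from labeled training embeddings. All function names and the `sigma` parameter are hypothetical.

```python
import math
from typing import Dict, Hashable, List, Sequence


def segment(samples: Sequence[float], segment_len: int) -> List[Sequence[float]]:
    """Split an utterance into fixed-length, non-overlapping segments."""
    return [samples[i:i + segment_len] for i in range(0, len(samples), segment_len)]


def toy_embedding(seg: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a speaker-discriminative embedding:
    (mean, mean absolute value) of the segment's samples."""
    n = len(seg)
    return [sum(seg) / n, sum(abs(x) for x in seg) / n]


def fit_speaker_means(
    embeddings: Sequence[Sequence[float]], labels: Sequence[Hashable]
) -> Dict[Hashable, List[float]]:
    """'Train' the toy model: average the labeled training embeddings per speaker."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = {}
    for emb, lab in zip(embeddings, labels):
        acc = sums.setdefault(lab, [0.0] * len(emb))
        for i, value in enumerate(emb):
            acc[i] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def speaker_posterior(
    embedding: Sequence[float],
    speaker_means: Dict[Hashable, List[float]],
    sigma: float = 1.0,
) -> Dict[Hashable, float]:
    """Probability distribution over known speakers for one embedding,
    assuming an isotropic Gaussian per speaker and a uniform prior."""
    log_liks = {
        lab: -sum((e - m) ** 2 for e, m in zip(embedding, mean)) / (2 * sigma ** 2)
        for lab, mean in speaker_means.items()
    }
    top = max(log_liks.values())  # subtract the max for numerical stability
    weights = {lab: math.exp(ll - top) for lab, ll in log_liks.items()}
    total = sum(weights.values())
    return {lab: w / total for lab, w in weights.items()}


def diarize(
    samples: Sequence[float],
    segment_len: int,
    speaker_means: Dict[Hashable, List[float]],
) -> List[Hashable]:
    """Assign each segment the speaker with the highest posterior probability."""
    labels = []
    for seg in segment(samples, segment_len):
        probs = speaker_posterior(toy_embedding(seg), speaker_means)
        labels.append(max(probs, key=probs.get))
    return labels
```

For example, with `means = fit_speaker_means([[1.0, 1.0], [-1.0, 1.0]], ["A", "B"])`, calling `diarize([1.0] * 4 + [-1.0] * 4, 4, means)` returns `["A", "B"]`. Unlike this sketch, which scores each segment independently against a fixed set of speakers, the abstracts leave the form of the generative model open, so a sequence-aware model would also fit the description.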