-
Publication No.: US20230015169A1
Publication date: 2023-01-19
Application No.: US17933164
Filing date: 2022-09-19
Applicant: Google LLC
Inventors: Yeming Fang, Quan Wang, Pedro Jose Moreno Mengibar, Ignacio Lopez Moreno, Gang Feng, Fang Chu, Jin Shi, Jason William Pelecanos
IPC classification: G10L17/06
Abstract: A method of generating an accurate speaker representation for an audio sample includes receiving a first audio sample from a first speaker and a second audio sample from a second speaker. The method includes dividing a respective audio sample into a plurality of audio slices. The method also includes, based on the plurality of slices, generating a set of candidate acoustic embeddings where each candidate acoustic embedding includes a vector representation of acoustic features. The method further includes removing a subset of the candidate acoustic embeddings from the set of candidate acoustic embeddings. The method additionally includes generating an aggregate acoustic embedding from the remaining candidate acoustic embeddings in the set of candidate acoustic embeddings after removing the subset of the candidate acoustic embeddings.
-
Publication No.: US11468900B2
Publication date: 2022-10-11
Application No.: US17071223
Filing date: 2020-10-15
Applicant: Google LLC
Inventors: Yeming Fang, Quan Wang, Pedro Jose Moreno Mengibar, Ignacio Lopez Moreno, Gang Feng, Fang Chu, Jin Shi, Jason William Pelecanos
Abstract: A method of generating an accurate speaker representation for an audio sample includes receiving a first audio sample from a first speaker and a second audio sample from a second speaker. The method includes dividing a respective audio sample into a plurality of audio slices. The method also includes, based on the plurality of slices, generating a set of candidate acoustic embeddings where each candidate acoustic embedding includes a vector representation of acoustic features. The method further includes removing a subset of the candidate acoustic embeddings from the set of candidate acoustic embeddings. The method additionally includes generating an aggregate acoustic embedding from the remaining candidate acoustic embeddings in the set of candidate acoustic embeddings after removing the subset of the candidate acoustic embeddings.
-
Publication No.: US20220122612A1
Publication date: 2022-04-21
Application No.: US17071223
Filing date: 2020-10-15
Applicant: Google LLC
Inventors: Yeming Fang, Quan Wang, Pedro Jose Moreno Mengibar, Ignacio Lopez Moreno, Gang Feng, Fang Chu, Jin Shi, Jason William Pelecanos
IPC classification: G10L17/06
Abstract: A method of generating an accurate speaker representation for an audio sample includes receiving a first audio sample from a first speaker and a second audio sample from a second speaker. The method includes dividing a respective audio sample into a plurality of audio slices. The method also includes, based on the plurality of slices, generating a set of candidate acoustic embeddings where each candidate acoustic embedding includes a vector representation of acoustic features. The method further includes removing a subset of the candidate acoustic embeddings from the set of candidate acoustic embeddings. The method additionally includes generating an aggregate acoustic embedding from the remaining candidate acoustic embeddings in the set of candidate acoustic embeddings after removing the subset of the candidate acoustic embeddings.
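
The pipeline described in the abstract (slice the audio, embed each slice, discard some candidates, aggregate the rest) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: `toy_embedding` is a hypothetical stand-in for a trained speaker encoder, the removal rule (drop the candidates with the lowest mean cosine similarity to the others) is only one plausible criterion, and `drop_fraction` and the mean-based aggregation are likewise assumed.

```python
import numpy as np


def slice_audio(samples, slice_len):
    """Split a 1-D signal into non-overlapping slices, dropping any ragged tail."""
    n = len(samples) // slice_len
    return samples[: n * slice_len].reshape(n, slice_len)


def toy_embedding(audio_slice, dim=8):
    """Hypothetical stand-in for an acoustic embedding model.

    Returns the unit-normalized magnitudes of the first `dim` FFT bins.
    """
    spectrum = np.abs(np.fft.rfft(audio_slice))[:dim]
    norm = np.linalg.norm(spectrum)
    return spectrum / norm if norm > 0 else spectrum


def aggregate_embedding(samples, slice_len, drop_fraction=0.25):
    """Embed each slice, drop the least typical candidates, average the rest."""
    slices = slice_audio(samples, slice_len)
    candidates = np.stack([toy_embedding(s) for s in slices])

    # Assumed removal rule: score each candidate by its mean cosine
    # similarity to the other candidates and drop the lowest scorers.
    sims = candidates @ candidates.T
    n = len(candidates)
    mean_sim = (sims.sum(axis=1) - np.diag(sims)) / max(n - 1, 1)
    n_drop = int(n * drop_fraction)
    kept = np.sort(np.argsort(mean_sim)[n_drop:])

    # Assumed aggregation: plain mean of the remaining candidates.
    return candidates[kept].mean(axis=0), kept


if __name__ == "__main__":
    t = np.arange(64)
    typical = np.sin(2 * np.pi * 2 * t / 64)
    atypical = np.sin(2 * np.pi * 5 * t / 64)
    signal = np.concatenate([typical] * 3 + [atypical] + [typical] * 4)
    embedding, kept = aggregate_embedding(signal, slice_len=64)
    print("kept slice indices:", kept)
```

In the usage example, the fourth slice carries a different frequency from the others, so it scores lowest on mean similarity and is excluded before averaging. A real system would replace `toy_embedding` with a trained model and might use a different removal criterion.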