Patent search ap:("GOOGLE LLC") AND inv:"Neil Zeghidour" Page 4

31.

Invention application
End-To-End Speech Diarization Via Iterative Speaker Embedding [In force]

Publication No.: US20220375492A1

Publication date: 2022-11-24

Application No.: US17304514

Filing date: 2021-06-22

Applicant: Google LLC

Inventor: David Grangier, Neil Zeghidour, Olivier Teboul

IPC: G10L25/78, G10L19/008, G06N3/04, G10L15/07, G10L15/06, G10L17/18

Abstract: A method includes receiving an input audio signal corresponding to utterances spoken by multiple speakers. The method also includes encoding the input audio signal into a sequence of T temporal embeddings. During each of a plurality of iterations each corresponding to a respective speaker of the multiple speakers, the method includes selecting a respective speaker embedding for the respective speaker by determining a probability that the corresponding temporal embedding includes a presence of voice activity by a single new speaker for which a speaker embedding was not previously selected during a previous iteration and selecting the respective speaker embedding for the respective speaker as the temporal embedding. The method also includes, at each time step, predicting a respective voice activity indicator for each respective speaker of the multiple speakers based on the respective speaker embeddings selected during the plurality of iterations and the temporal embedding.
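Illustration: the selection loop in the abstract can be sketched in a few lines of Python. This is a toy sketch under stated assumptions, not the patented implementation. The encoder is skipped and the T temporal embeddings are taken as given. The scoring function new_speaker_probability, the detector in predict_voice_activity, and the choice of the highest-probability frame as the speaker embedding are all hypothetical stand-ins; the abstract does not specify them.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def new_speaker_probability(frame, selected):
    """Hypothetical scorer: chance that `frame` holds voice activity from a
    single speaker whose embedding has not been selected yet."""
    energy = dot(frame, frame)  # silent frames score low
    # Penalize frames that resemble an already selected speaker.
    overlap = sum(max(0.0, dot(frame, s)) for s in selected)
    return sigmoid(energy - 2.0 * overlap - 1.0)


def select_speaker_embeddings(temporal, num_speakers):
    """One iteration per speaker: score every temporal embedding, then take
    the best-scoring one as that speaker's embedding (argmax is an assumption)."""
    selected = []
    for _ in range(num_speakers):
        probs = [new_speaker_probability(h, selected) for h in temporal]
        best = max(range(len(temporal)), key=lambda t: probs[t])
        selected.append(temporal[best])
    return selected


def predict_voice_activity(temporal, selected, threshold=0.5):
    """Hypothetical detector: one indicator per (time step, speaker) pair."""
    return [[sigmoid(dot(h, s) - 1.0) > threshold for s in selected]
            for h in temporal]


# Toy input: silence, two frames of speaker A, two of speaker B, silence.
temporal = [[0.0, 0.0], [3.0, 0.0], [3.0, 0.0],
            [0.0, 2.0], [0.0, 2.0], [0.0, 0.0]]
speakers = select_speaker_embeddings(temporal, num_speakers=2)
activity = predict_voice_activity(temporal, speakers)
```

Here speakers holds one embedding per iteration, and activity[t][k] indicates whether speaker k is predicted active at time step t. The toy scorer cannot tell overlapped speech from a single loud speaker, so the example input avoids overlapping frames.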
