- Patent title: Separating speech by source in audio recordings by predicting isolated audio signals conditioned on speaker representations
- Application number: US17170657
- Filing date: 2021-02-08
- Publication number: US11475909B2
- Publication date: 2022-10-18
- Inventors: Neil Zeghidour, David Grangier
- Applicant: Google LLC
- Applicant address: US CA Mountain View
- Patentee: Google LLC
- Current patentee: Google LLC
- Current patentee address: US CA Mountain View
- Agency: Fish & Richardson P.C.
- Main classification: G10L21/028
- IPC classifications: G10L21/028; G10L21/0316; G10L17/04; G10L17/18; G06N3/04; G06N3/08
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.
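The two-stage data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented architecture: the "networks" here are untrained linear stand-ins with random weights, and every name (`speaker_network`, `separation_network`, `frame_len`, `emb_dim`, the per-speaker projection matrices) is a hypothetical choice made for illustration. The sketch only shows the shape of the pipeline: the recording goes in, one representation per speaker comes out, and the separation step produces one predicted signal per representation.

```python
import math
import random


def make_matrix(rows, cols, rng):
    # Random weights standing in for learned parameter values (assumption).
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def frame_signal(recording, frame_len):
    # Split into non-overlapping frames, dropping any trailing remainder.
    n_frames = len(recording) // frame_len
    return [recording[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def speaker_network(recording, speaker_params, frame_len):
    """Stand-in speaker network: one per-recording representation per speaker slot.

    Each slot projects every frame to an embedding and mean-pools over time.
    """
    frames = frame_signal(recording, frame_len)
    representations = []
    for projection in speaker_params:
        per_frame = [matvec(projection, frame) for frame in frames]
        dim = len(projection)
        pooled = [sum(e[d] for e in per_frame) / len(per_frame) for d in range(dim)]
        representations.append(pooled)
    return representations


def separation_network(recording, speaker_reps, separation_params, frame_len):
    """Stand-in separation network: one predicted signal per speaker representation.

    The representation is mapped to sigmoid gains in (0, 1) that scale each frame,
    so the output is conditioned on both the recording and the representation.
    """
    frames = frame_signal(recording, frame_len)
    isolated_signals = []
    for rep in speaker_reps:
        gains = [1.0 / (1.0 + math.exp(-g)) for g in matvec(separation_params, rep)]
        signal = []
        for frame in frames:
            signal.extend(g * x for g, x in zip(gains, frame))
        isolated_signals.append(signal)
    return isolated_signals


# Toy usage: a synthetic 64-sample "recording" and two speaker slots.
rng = random.Random(0)
frame_len, emb_dim, num_speakers = 8, 4, 2
recording = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(64)]
speaker_params = [make_matrix(emb_dim, frame_len, rng) for _ in range(num_speakers)]
separation_params = make_matrix(frame_len, emb_dim, rng)

reps = speaker_network(recording, speaker_params, frame_len)
isolated = separation_network(recording, reps, separation_params, frame_len)
```

With random weights the outputs carry no meaningful separation; in a real system both sets of parameter values would be learned, and the sketch makes no claim about how the patent trains or structures either network.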