- Patent title: Multi-tap minimum variance distortionless response beamformer with neural networks for target speech separation
- Application number: US16926138
- Filing date: 2020-07-10
- Publication number: US11423906B2
- Publication date: 2022-08-23
- Inventors: Yong Xu, Meng Yu, Shi-Xiong Zhang, Chao Weng, Jianming Liu, Dong Yu
- Applicant: TENCENT AMERICA LLC
- Applicant address: Palo Alto, CA, US
- Assignee: TENCENT AMERICA LLC
- Current assignee: TENCENT AMERICA LLC
- Current assignee address: Palo Alto, CA, US
- Agent: Sughrue Mion, PLLC
- Main classification: G10L15/25
- IPC classification: G10L15/25
Abstract:
A method, computer system, and computer readable medium are provided for automatic speech recognition. Video data and audio data corresponding to one or more speakers is received. A minimum variance distortionless response function is applied to the received audio and video data. A predicted target waveform corresponding to a target speaker from among the one or more speakers is generated based on back-propagating the output of the applied minimum variance distortionless response function.