Patent search: ap:("TENCENT AMERICA LLC") AND inv:"Chih-Kuan Yeh" (page 1)

1.

Granted invention patent
Unsupervised automatic speech recognition (status: in force)

Publication (announcement) number: US11138966B2

Publication (announcement) date: 2021-10-05

Application number: US16269951

Filing date: 2019-02-07

Applicant: TENCENT AMERICA LLC

Inventors: Jianshu Chen, Chengzhu Yu, Dong Yu, Chih-Kuan Yeh

IPC classification: G10L15/00, G10L15/06, G10L15/02, G10L15/22, G10L15/16, G10L15/30, G06N3/04, G06N3/08, G06F40/20, G10L15/187

Abstract: A method for generating an automatic speech recognition (ASR) model using unsupervised learning includes obtaining, by a device, text information. The method includes determining, by the device, a set of phoneme sequences associated with the text information. The method includes obtaining, by the device, speech waveform data. The method includes determining, by the device, a set of phoneme boundaries associated with the speech waveform data. The method includes generating, by the device, the ASR model using an output distribution matching (ODM) technique based on determining the set of phoneme sequences associated with the text information and based on determining the set of phoneme boundaries associated with the speech waveform data.

2.

Invention application
UNSUPERVISED AUTOMATIC SPEECH RECOGNITION (status: under examination, published)

Publication (announcement) number: US20200258497A1

Publication (announcement) date: 2020-08-13

Application number: US16269951

Filing date: 2019-02-07

Applicant: TENCENT AMERICA LLC

Inventors: Jianshu Chen, Chengzhu Yu, Dong Yu, Chih-Kuan Yeh

IPC classification: G10L15/06, G10L15/02, G10L15/22, G10L15/16, G06F17/27, G10L15/30, G06N3/04, G06N3/08

Abstract: A method for generating an automatic speech recognition (ASR) model using unsupervised learning includes obtaining, by a device, text information. The method includes determining, by the device, a set of phoneme sequences associated with the text information. The method includes obtaining, by the device, speech waveform data. The method includes determining, by the device, a set of phoneme boundaries associated with the speech waveform data. The method includes generating, by the device, the ASR model using an output distribution matching (ODM) technique based on determining the set of phoneme sequences associated with the text information and based on determining the set of phoneme boundaries associated with the speech waveform data.
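
Note on the abstract: it names an output distribution matching (ODM) technique but does not state its form. The Python sketch below is only one plausible reading, not the patented method. It compares the phoneme bigram distribution derived from text with the expected bigram frequencies a model would produce over boundary-delimited speech segments. The bigram order, the cross-entropy cost, the per-segment posterior format, and all function names are assumptions introduced for illustration.

```python
import math
from collections import Counter


def bigram_distribution(phoneme_sequences):
    """Empirical bigram distribution over phoneme sequences derived from text."""
    counts = Counter()
    for seq in phoneme_sequences:
        counts.update(zip(seq, seq[1:]))
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()} if total else {}


def expected_bigram_frequency(segment_posteriors):
    """Expected bigram frequency under per-segment phoneme posteriors.

    segment_posteriors: list of utterances; each utterance is a list of
    dicts mapping phoneme -> probability, one dict per segment, where the
    segments are assumed to come from the estimated phoneme boundaries.
    """
    freq = Counter()
    n_pairs = 0
    for utterance in segment_posteriors:
        for left, right in zip(utterance, utterance[1:]):
            n_pairs += 1
            for a, pa in left.items():
                for b, pb in right.items():
                    freq[(a, b)] += pa * pb
    return {bg: v / n_pairs for bg, v in freq.items()} if n_pairs else {}


def odm_cost(text_dist, predicted_dist, eps=1e-12):
    """Cross-entropy of the predicted bigram frequencies under the text
    bigram distribution; lower means the two distributions agree better."""
    return -sum(
        p * math.log(predicted_dist.get(bg, 0.0) + eps)
        for bg, p in text_dist.items()
    )


# Toy usage with hypothetical data: one text phoneme sequence and one
# utterance whose four segments are each confidently labeled.
text_dist = bigram_distribution([["HH", "AH", "L", "OW"]])
posteriors = [[{"HH": 1.0}, {"AH": 1.0}, {"L": 1.0}, {"OW": 1.0}]]
cost = odm_cost(text_dist, expected_bigram_frequency(posteriors))
```

In an actual training setup, a cost of this kind would be minimized with respect to the parameters of the model that produces the segment posteriors; that step is omitted here because the listing gives no details about it.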