On-device speech synthesis of textual segments for training of on-device speech recognition model

Granted invention patent

US11127392B2 On-device speech synthesis of textual segments for training of on-device speech recognition model (in force)


Patent title: On-device speech synthesis of textual segments for training of on-device speech recognition model
Application number: US16959546

Filing date: 2019-10-02
Publication (announcement) number: US11127392B2

Publication (announcement) date: 2021-09-21
Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
Applicant: Google LLC
Applicant address: Mountain View, CA, US
Patentee: Google LLC
Current patentee: Google LLC
Current patentee address: Mountain View, CA, US
Agency: Middleton Reutlinger
International application: PCT/US2019/054314 WO 20191002
International publication: WO2021/006920 WO 20210114
Main classification: G10L13/00
IPC classification: G10L13/00; G10L13/047; G10L15/06


Abstract:

Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using a speech synthesis model stored locally at the client device, to generate synthesized speech audio data that includes synthesized speech of the identified textual segment; process the synthesized speech, using an on-device speech recognition model that is stored locally at the client device, to generate predicted output; and generate a gradient based on comparing the predicted output to ground truth output that corresponds to the textual segment. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
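
The loop described in the abstract (text, then synthesized speech, then predicted output, then a gradient against the ground truth) can be sketched roughly as follows. This is only an illustrative toy, not the patented implementation: the character-level "synthesizer", the per-frame softmax "recognizer", and every name here (VOCAB, synthesize, recognize, loss_and_gradient, train_on_segment) are assumptions made for the example. A real system would use neural speech synthesis and speech recognition models operating on audio.

```python
import math

# Hypothetical toy vocabulary; the record does not specify one.
VOCAB = "abcdefghijklmnopqrstuvwxyz "


def synthesize(text):
    """Toy stand-in for the local speech synthesis model.

    Emits one feature frame per character (a shifted one-hot vector)
    in place of real synthesized speech audio data.
    """
    frames = []
    for ch in text:
        frame = [0.0] * len(VOCAB)
        frame[(VOCAB.index(ch) + 3) % len(VOCAB)] = 1.0
        frames.append(frame)
    return frames


def recognize(weights, frames):
    """Toy stand-in for the on-device recognizer: per-frame softmax."""
    outputs = []
    for frame in frames:
        logits = [sum(w * x for w, x in zip(row, frame)) for row in weights]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        outputs.append([e / total for e in exps])
    return outputs


def loss_and_gradient(weights, frames, text):
    """Compare predicted output to the ground truth (the text itself)."""
    n = len(VOCAB)
    grad = [[0.0] * n for _ in range(n)]
    loss = 0.0
    for frame, probs, ch in zip(frames, recognize(weights, frames), text):
        target = VOCAB.index(ch)
        loss -= math.log(probs[target])
        for k in range(n):
            err = probs[k] - (1.0 if k == target else 0.0)
            for j in range(n):
                grad[k][j] += err * frame[j]
    return loss, grad


def decode(weights, frames):
    """Pick the most probable character for each frame."""
    return "".join(VOCAB[p.index(max(p))] for p in recognize(weights, frames))


def train_on_segment(weights, text, steps=20, lr=0.5):
    """Use the gradient locally to update the recognizer's weights.

    The gradient could instead (or also) be sent to a remote system
    that updates global weights; that path is not modeled here.
    """
    frames = synthesize(text)
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_gradient(weights, frames, text)
        losses.append(loss)
        weights = [[w - lr * g for w, g in zip(wr, gr)]
                   for wr, gr in zip(weights, grad)]
    return weights, losses


if __name__ == "__main__":
    segment = "call francoise"  # e.g. a textual segment stored locally
    initial = [[0.0] * len(VOCAB) for _ in VOCAB]
    trained, losses = train_on_segment(initial, segment)
    print(losses[0], losses[-1])
    print(decode(trained, synthesize(segment)))
```

Because the ground truth is the textual segment itself, no human transcription is needed, and neither the text nor the synthesized audio has to leave the device in this sketch; only a gradient would be transmitted in the remote-update variant.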

Publication/grant documents

US20210104223A1 ON-DEVICE SPEECH SYNTHESIS OF TEXTUAL SEGMENTS FOR TRAINING OF ON-DEVICE SPEECH RECOGNITION MODEL Publication/grant date: 2021-04-08


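IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems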