ON-DEVICE SPEECH SYNTHESIS OF TEXTUAL SEGMENTS FOR TRAINING OF ON-DEVICE SPEECH RECOGNITION MODEL

Invention Application

US20220005458A1 ON-DEVICE SPEECH SYNTHESIS OF TEXTUAL SEGMENTS FOR TRAINING OF ON-DEVICE SPEECH RECOGNITION MODEL (Granted)

Patent Title: ON-DEVICE SPEECH SYNTHESIS OF TEXTUAL SEGMENTS FOR TRAINING OF ON-DEVICE SPEECH RECOGNITION MODEL
Application No.: US17479285

Application Date: 2021-09-20
Publication No.: US20220005458A1

Publication Date: 2022-01-06
Inventor: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Main IPC: G10L13/047
IPC: G10L13/047; G10L15/06

Abstract:

Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using a speech synthesis model stored locally at the client device, to generate synthesized speech audio data that includes synthesized speech of the identified textual segment; process the synthesized speech, using an on-device speech recognition model that is stored locally at the client device, to generate predicted output; and generate a gradient based on comparing the predicted output to ground truth output that corresponds to the textual segment. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
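The steps in the abstract can be sketched as a toy training loop. This is a minimal illustration under loud assumptions, not the method claimed in the application: `synthesize` is a hypothetical stand-in for the on-device speech synthesis model (it emits one made-up feature frame per character instead of real audio), `recognize` is a hypothetical per-frame softmax classifier standing in for the on-device speech recognition model, and the names `VOCAB`, `DIM`, `loss_and_gradient`, and `apply_gradient` are invented for this example.

```python
import math

VOCAB = "abcdefghijklmnopqrstuvwxyz "  # toy output alphabet (assumption)
DIM = 8                                 # toy feature size per frame (assumption)

def synthesize(text):
    """Hypothetical stand-in for the speech synthesis model:
    maps each character to one deterministic feature frame."""
    frames = []
    for ch in text:
        k = VOCAB.index(ch)
        frames.append([math.sin((k + 1) * (d + 1)) for d in range(DIM)])
    return frames

def recognize(weights, frames):
    """Hypothetical stand-in for the speech recognition model:
    a linear layer plus softmax over VOCAB for every frame."""
    outputs = []
    for x in frames:
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        outputs.append([e / total for e in exps])
    return outputs

def loss_and_gradient(weights, text):
    """Compare predicted output with ground truth derived from the
    textual segment; return mean cross-entropy and its gradient."""
    frames = synthesize(text)
    probs = recognize(weights, frames)
    targets = [VOCAB.index(ch) for ch in text]  # ground truth output
    loss = 0.0
    grad = [[0.0] * DIM for _ in VOCAB]
    for x, p, t in zip(frames, probs, targets):
        loss -= math.log(p[t])
        for k in range(len(VOCAB)):
            err = p[k] - (1.0 if k == t else 0.0)
            for d in range(DIM):
                grad[k][d] += err * x[d]
    n = len(text)
    return loss / n, [[g / n for g in row] for row in grad]

def apply_gradient(weights, grad, lr=0.1):
    """One gradient-descent step on the recognizer weights."""
    return [[w - lr * g for w, g in zip(wr, gr)]
            for wr, gr in zip(weights, grad)]

# Textual segment stored locally (made-up example text).
segment = "hello world"
weights = [[0.0] * DIM for _ in VOCAB]

loss_before, grad = loss_and_gradient(weights, segment)
weights = apply_gradient(weights, grad)   # local update of on-device weights
loss_after, _ = loss_and_gradient(weights, segment)
```

In this sketch the gradient is applied locally. The same `grad` value could instead, or additionally, be sent to a remote system and passed to `apply_gradient` there against global weights, which mirrors the alternative described in the abstract.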

Public/Granted literature

US11705106B2 On-device speech synthesis of textual segments for training of on-device speech recognition model Public/Granted day: 2023-07-18

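IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems
G10L13/02	.Methods for producing synthetic speech; speech synthesisers
G10L13/04	..Details of speech synthesis systems, e.g. synthesiser structure or memory management
G10L13/047	...Architecture of speech synthesisers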