Publication No.: US20250053738A1
Publication Date: 2025-02-13
Application No.: US18720376
Filing Date: 2021-12-20
Applicant: Google LLC
Inventor: Ryan Dingler, John Rivlin, Christopher Salvarani, Yuanlei Zhang, Nazarii Kukhar, Russell John Wyatt Skerry-Ryan, Daisy Stanton, Judy Chang, Md Enzam Hossain
IPC: G06F40/253, G06F3/0484, G06F3/16, G06F40/169, G06F40/289, G10L13/08
Abstract: Aspects of this disclosure are directed to techniques that enable efficient automated text-to-speech pronunciation editing for long form text documents. A computing device comprising a memory and a processor may be configured to perform the techniques. The memory may store a text document. The processor may process words in the text document to identify first candidate words that are predicted to be mispronounced during automated text-to-speech processing of the text document. The processor may next filter the first candidate words to remove one or more candidate words of the first candidate words and obtain second candidate words that have fewer candidate words than the first candidate words. The processor may then annotate the text document to obtain an annotated text document that identifies the second candidate words, and output at least a portion of the annotated text document that identifies at least one candidate word of the second candidate words.
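
Illustrative sketch (not part of the patent record): the abstract describes a pipeline that identifies candidate words, filters them down to a smaller set, annotates the document, and outputs the result. The Python below is one minimal way such a pipeline could look. Everything in it is an assumption made for illustration: the abstract does not say how mispronunciation is predicted or how candidates are filtered, so the lexicon lookup, the "already approved" filter, the `[[...]]` markers, and all function names are hypothetical.

```python
import re

WORD_RE = re.compile(r"[A-Za-z']+")


def identify_candidates(text, lexicon):
    """First candidates: every word occurrence missing from the lexicon.

    Assumption: an out-of-lexicon word stands in for a word that is
    "predicted to be mispronounced"; the abstract names no predictor.
    """
    return [w for w in WORD_RE.findall(text) if w.lower() not in lexicon]


def filter_candidates(candidates, approved):
    """Second candidates: drop repeats and words already approved.

    Assumption: this filter rule is hypothetical. The abstract only
    requires that the filtered set be smaller than the first set.
    """
    seen = set()
    kept = []
    for word in candidates:
        key = word.lower()
        if key in seen or key in approved:
            continue
        seen.add(key)
        kept.append(word)
    return kept


def annotate(text, candidates, open_mark="[[", close_mark="]]"):
    """Wrap each occurrence of a remaining candidate in visible markers."""
    if not candidates:
        return text
    alternatives = "|".join(re.escape(w) for w in candidates)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(
        lambda m: open_mark + m.group(1) + close_mark, text
    )


# Hypothetical inputs, invented for this example.
text = "Siobhan met Nguyen at the cafe. Siobhan met Nguyen again."
lexicon = {"met", "at", "the", "cafe", "again"}
approved = {"nguyen"}  # pronunciation already reviewed

first = identify_candidates(text, lexicon)
second = filter_candidates(first, approved)
annotated = annotate(text, second)
print(annotated)
```

In this sketch the first pass flags four word occurrences and the filter reduces them to a single word to review, which mirrors the abstract's requirement that the second candidate set be smaller than the first. Outputting "at least a portion" of the annotated document could be as simple as slicing `annotated` to the passage currently shown to the editor.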