-
Publication No.: US09865247B2
Publication Date: 2018-01-09
Application No.: US14631583
Application Date: 2015-02-25
Applicant: Google LLC
Inventor: Ioannis Agiomyrgiannakis, Byung Ha Chun
Abstract: A device may receive a speech signal. The device may determine acoustic feature parameters for the speech signal. The acoustic feature parameters may include phase data. The device may determine circular space representations for the phase data based on an alignment of the phase data with given axes of the circular space representations. The device may map the phase data to linguistic features based on the circular space representations. The linguistic features may be associated with linguistic content that includes phonemic content or text content. The device may provide a synthetic audio pronunciation of the linguistic content based on the mapping.
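Illustrative note: the abstract does not say how the circular space representations are computed. A minimal sketch, assuming the common choice of projecting each phase angle onto two orthogonal axes (cosine and sine) of the unit circle, is shown here. The function names are hypothetical and do not come from the patent. The point of the sketch is that phases near +pi and -pi, which are far apart as raw numbers, land close together in circular space.

```python
import math

def phase_to_circular(phase):
    """Map a phase angle in radians to a point (x, y) on the unit circle.

    Assumption: the two axes of the circular space are the cosine and
    sine axes; the patent abstract does not specify this.
    """
    return (math.cos(phase), math.sin(phase))

def circular_to_phase(x, y):
    """Recover a phase angle in [-pi, pi] from a point in circular space."""
    return math.atan2(y, x)

def circular_distance(p, q):
    """Euclidean distance between two phases after mapping to the circle."""
    px, py = phase_to_circular(p)
    qx, qy = phase_to_circular(q)
    return math.hypot(px - qx, py - qy)

# Two phases on opposite sides of the wrap-around point.
a = math.pi - 0.01
b = -math.pi + 0.01
raw_gap = abs(a - b)                  # large: close to 2*pi
circular_gap = circular_distance(a, b)  # small: the angles nearly coincide
```

A model trained on the (x, y) pairs rather than on raw angles avoids the artificial discontinuity at the wrap-around point, which is one plausible motivation for a circular representation.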
-
Publication No.: US10249289B2
Publication Date: 2019-04-02
Application No.: US15649311
Application Date: 2017-07-13
Applicant: Google LLC
Inventor: Byung Ha Chun, Javier Gonzalvo, Chun-an Chan, Ioannis Agiomyrgiannakis, Vincent Ping Leung Wan, Robert Andrew James Clark, Jakub Vit
IPC: G10L13/06, G10L19/00, G10L25/30, G10L13/027, G10L13/047
Abstract: Methods, systems, and computer-readable media for text-to-speech synthesis using an autoencoder. In some implementations, data indicating a text for text-to-speech synthesis is obtained. Data indicating a linguistic unit of the text is provided as input to an encoder. The encoder is configured to output speech unit representations indicative of acoustic characteristics based on linguistic information. A speech unit representation that the encoder outputs is received. A speech unit is selected to represent the linguistic unit, the speech unit being selected from among a collection of speech units based on the speech unit representation output by the encoder. Audio data for a synthesized utterance of the text that includes the selected speech unit is provided.
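Illustrative note: the selection step in this abstract (choosing a speech unit from a collection based on the encoder's output) can be sketched as a nearest-neighbor lookup. This is a sketch under stated assumptions, not the patented method: the abstract does not name a distance measure, so Euclidean distance is assumed, and the unit names, embedding values, and the `select_speech_unit` function are hypothetical. The encoder itself is not modeled; its output is represented by a fixed vector.

```python
import math

def select_speech_unit(target, units):
    """Return the id of the stored unit whose embedding is nearest to target.

    Assumption: nearness is Euclidean distance between fixed-length vectors.
    """
    return min(units, key=lambda unit_id: math.dist(target, units[unit_id]))

# Hypothetical collection of speech units, keyed by id, each with an
# embedding of its acoustic characteristics.
units = {
    "unit_a": (0.0, 1.0),
    "unit_b": (1.0, 0.0),
    "unit_c": (0.7, 0.7),
}

# Stand-in for the representation the encoder would output for one
# linguistic unit of the input text.
encoder_output = (0.9, 0.1)

selected = select_speech_unit(encoder_output, units)
```

In a full system the selected units would then be concatenated to produce the audio for the synthesized utterance; that stage is outside the scope of this sketch.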
-