Multi-speaker neural text-to-speech synthesis

    Publication No.: US12266342B2

    Publication date: 2025-04-01

    Application No.: US17293640

    Filing date: 2018-12-11

    Inventors: Yan Deng, Lei He

    Abstract: A method for generating speech through multi-speaker neural text-to-speech (TTS) synthesis is provided. A text input may be received (1410). Speaker latent space information of a target speaker may be provided through at least one speaker model (1420). At least one acoustic feature may be predicted through an acoustic feature predictor based on the text input and the speaker latent space information (1430). A speech waveform corresponding to the text input may be generated through a neural vocoder based on the at least one acoustic feature and the speaker latent space information (1440).
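
The data flow in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch: the three functions are toy stand-ins for trained neural components, and every name and constant (`speaker_model`, `acoustic_feature_predictor`, `neural_vocoder`, `EMBED_DIM`, `HOP`) is an assumption made for the example, not something taken from the patent. The point it shows is that the same speaker latent vector conditions both the feature predictor and the vocoder.

```python
from typing import List

EMBED_DIM = 4  # assumed size of the speaker latent vector
HOP = 8        # assumed number of waveform samples per acoustic frame


def speaker_model(speaker_id: int) -> List[float]:
    """Toy stand-in for a speaker model: maps a speaker ID to a latent vector."""
    return [(((speaker_id + 1) * (i + 1)) % 7) / 7.0 for i in range(EMBED_DIM)]


def acoustic_feature_predictor(text: str, speaker: List[float]) -> List[List[float]]:
    """Toy predictor: one frame per character, conditioned on the speaker vector."""
    return [[(ord(ch) % 32) / 32.0 + s for s in speaker] for ch in text]


def neural_vocoder(features: List[List[float]], speaker: List[float]) -> List[float]:
    """Toy vocoder: expands each frame to HOP samples, also conditioned on the speaker."""
    bias = sum(speaker) / len(speaker)
    waveform: List[float] = []
    for frame in features:
        level = sum(frame) / len(frame)
        waveform.extend([level + bias] * HOP)
    return waveform


def synthesize(text: str, speaker_id: int) -> List[float]:
    speaker = speaker_model(speaker_id)                    # step 1420
    features = acoustic_feature_predictor(text, speaker)   # step 1430
    return neural_vocoder(features, speaker)               # step 1440
```

Calling `synthesize("hello", 0)` and `synthesize("hello", 1)` yields waveforms of the same length but different values, which is the behaviour the speaker conditioning is meant to illustrate.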

    Multilingual neural text-to-speech synthesis

    Publication No.: US11922924B2

    Publication date: 2024-03-05

    Application No.: US17617547

    Filing date: 2020-05-21

    CPC classification number: G10L13/10 G10L13/033 G10L13/047

    Abstract: Method and apparatus for generating speech through multilingual neural text-to-speech (TTS) synthesis are provided in the present disclosure. A text input in at least a first language may be received. Speaker latent space information of a target speaker may be provided through a speaker encoder. Language latent space information of a second language may be provided through a language encoder. At least one acoustic feature may be generated, through an acoustic feature predictor, based on the text input, the speaker latent space information and the language latent space information of the second language. A speech waveform corresponding to the text input may be generated, through a neural vocoder, based on the at least one acoustic feature.
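
One way to picture how the predictor can depend on the text, the speaker, and the second language at once is to concatenate the two latent vectors onto each text frame. The sketch below does that with hypothetical lookup tables in place of trained encoders. Concatenation is an assumption chosen for illustration (the abstract says only that the prediction is "based on" these inputs), and all names and values here are invented for the example.

```python
from typing import Dict, List

# Hypothetical lookup tables standing in for trained encoders.
SPEAKER_TABLE: Dict[str, List[float]] = {"spk_a": [0.1, 0.9], "spk_b": [0.7, 0.2]}
LANGUAGE_TABLE: Dict[str, List[float]] = {"en": [1.0, 0.0], "zh": [0.0, 1.0]}


def speaker_encoder(speaker: str) -> List[float]:
    """Return the speaker latent vector for a known speaker key."""
    return SPEAKER_TABLE[speaker]


def language_encoder(language: str) -> List[float]:
    """Return the language latent vector for a known language key."""
    return LANGUAGE_TABLE[language]


def build_predictor_inputs(text: str, speaker: str, language: str) -> List[List[float]]:
    """Concatenate a per-character text feature with both latent vectors."""
    spk = speaker_encoder(speaker)
    lang = language_encoder(language)
    return [[ord(ch) / 128.0] + spk + lang for ch in text]
```

With these tables, `build_predictor_inputs("hi", "spk_a", "zh")` pairs text that could be in a first language with the latent vector of a second language, while keeping the target speaker's vector unchanged.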

    Automatic recovery engine with continuous recovery state machine and remote workflows

    Publication No.: US10652119B2

    Publication date: 2020-05-12

    Application No.: US15636929

    Filing date: 2017-06-29

    Abstract: Various embodiments of the present technology generally relate to systems and methods for self-healing services and automatic recovery of distributed systems. Some embodiments leverage all the available synthetic, customer, client, server, and support signals from various sources to detect outages intelligently and in real time, trace outages to their root cause in recoverable targets (e.g., for auto recovery actions), identify the right engineering teams (e.g., for faster manual mitigation), and perform the appropriate recovery action (such as recycling a service, rebooting a server, or switching out a faulty rack) or other mitigation actions such as routing, collecting debug information, alerting the right team, or suppressing alerts. Some embodiments separate signal monitoring and workflow coordination.
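
The separation between signal monitoring and workflow coordination can be illustrated with a very small recovery loop. The sketch below is a hypothetical simplification, not the patented state machine: `is_healthy` plays the monitoring role, `execute` plays the workflow role, and the ordering of actions from least to most disruptive is an assumption made for the example.

```python
from enum import Enum
from typing import Callable, List, Tuple


class State(Enum):
    HEALTHY = "healthy"
    ESCALATED = "escalated"  # hand off to an engineering team


# Actions named in the abstract, in an assumed least-to-most-disruptive order.
ACTIONS = ["recycle service", "reboot server", "switch out faulty rack"]


def run_recovery(is_healthy: Callable[[], bool],
                 execute: Callable[[str], None]) -> Tuple[State, List[str]]:
    """Try each action in turn until the monitored signal reports healthy."""
    attempted: List[str] = []
    if is_healthy():
        return State.HEALTHY, attempted
    for action in ACTIONS:
        execute(action)            # workflow side: perform the action
        attempted.append(action)
        if is_healthy():           # monitoring side: re-check the signal
            return State.HEALTHY, attempted
    return State.ESCALATED, attempted
```

Because the two callables are passed in, the loop does not need to know where signals come from or how an action is carried out, which is the kind of decoupling the abstract describes.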

    Speech waveform generation
    Granted invention patent

    Publication No.: US11869482B2

    Publication date: 2024-01-09

    Application No.: US17272325

    Filing date: 2018-09-30

    CPC classification number: G10L13/047

    Abstract: A method and apparatus for generating a speech waveform are provided. Fundamental frequency information, glottal features and vocal tract features associated with an input may be received, wherein the glottal features include a phase feature, a shape feature, and an energy feature (1310). A glottal waveform is generated based on the fundamental frequency information and the glottal features through a first neural network model (1320). A speech waveform is generated based on the glottal waveform and the vocal tract features through a second neural network model (1330).
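
The two-stage structure (glottal excitation first, vocal tract shaping second) can be sketched with classical signal-processing stand-ins. To be clear, the patent describes neural network models for both stages. The code below swaps in a shaped sinusoid and an FIR filter purely to show how the inputs flow between stages, and every name, parameter, and constant in it is an assumption made for the example.

```python
import math
from typing import List

SAMPLE_RATE = 16000  # assumed sampling rate in Hz


def glottal_model(f0_hz: float, phase: float, shape: float, energy: float,
                  n_samples: int) -> List[float]:
    """Stand-in for the first model: a shaped periodic excitation at F0."""
    out: List[float] = []
    for n in range(n_samples):
        s = math.sin(2 * math.pi * f0_hz * n / SAMPLE_RATE + phase)
        # A `shape` value above 1 sharpens the pulse; the sign is preserved.
        out.append(energy * math.copysign(abs(s) ** shape, s))
    return out


def vocal_tract_model(glottal: List[float], tract: List[float]) -> List[float]:
    """Stand-in for the second model: FIR filtering with vocal tract coefficients."""
    return [
        sum(c * glottal[n - k] for k, c in enumerate(tract) if n - k >= 0)
        for n in range(len(glottal))
    ]


def generate_speech(f0_hz: float, phase: float, shape: float, energy: float,
                    tract: List[float], n_samples: int) -> List[float]:
    glottal = glottal_model(f0_hz, phase, shape, energy, n_samples)  # step 1320
    return vocal_tract_model(glottal, tract)                         # step 1330
```

In this sketch the fundamental frequency and the phase, shape, and energy features affect only the first stage, while the vocal tract features affect only the second, mirroring the split described in the abstract.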
