Invention Grant
- Patent Title: High-quality non-parallel many-to-many voice conversion
- Application No.: US16411614
- Application Date: 2019-05-14
- Publication No.: US11854562B2
- Publication Date: 2023-12-26
- Inventor: Yang Zhang, Shiyu Chang
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: McGinn I.P. Law Group, PLLC
- Agent: Peter Edwards
- Main IPC: G10L21/003
- IPC: G10L21/003; G10L21/013; G10L19/00; G06N20/20; G06N3/08; G06N3/045

Abstract:
A method (and structure and computer product) to permit zero-shot voice conversion with non-parallel data includes receiving source speaker speech data as input data into a content encoder of a style transfer autoencoder system, and receiving target speaker input speech as input data into a target speaker encoder. The content encoder provides a source speaker disentanglement of the source speaker speech data by reducing speaker style information of the input source speech data while retaining content information. The outputs of the content encoder and the target speaker encoder are combined in a decoder of the style transfer autoencoder, and the output of the decoder provides the content information of the input source speech data in a style of the target speaker speech information.
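
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `ToyStyleTransferAutoencoder`, the feature dimensions, the linear projections, and the random weights standing in for trained parameters are all assumptions made for illustration. The narrow content code is only one plausible way to reduce speaker style information while retaining content; the abstract does not specify the mechanism.

```python
import random


def make_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def project(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class ToyStyleTransferAutoencoder:
    """Hypothetical toy model: content encoder + speaker encoder + decoder."""

    def __init__(self, feat_dim=8, content_dim=2, speaker_dim=3, seed=0):
        rng = random.Random(seed)
        # Content code is much narrower than the input frame (assumed bottleneck).
        self.w_content = make_matrix(content_dim, feat_dim, rng)
        self.w_speaker = make_matrix(speaker_dim, feat_dim, rng)
        self.w_decoder = make_matrix(feat_dim, content_dim + speaker_dim, rng)

    def encode_content(self, frames):
        """Map each source frame to a low-dimensional content code."""
        return [project(self.w_content, frame) for frame in frames]

    def encode_speaker(self, frames):
        """Summarize a whole utterance as one fixed-size speaker embedding."""
        mean_frame = [sum(col) / len(frames) for col in zip(*frames)]
        return project(self.w_speaker, mean_frame)

    def decode(self, content_codes, speaker_embedding):
        """Combine each content code with the speaker embedding and decode."""
        return [project(self.w_decoder, code + speaker_embedding)
                for code in content_codes]

    def convert(self, source_frames, target_frames):
        """Source content rendered with the target speaker's embedding."""
        content = self.encode_content(source_frames)
        speaker = self.encode_speaker(target_frames)
        return self.decode(content, speaker)


# Usage: random frames stand in for real spectral features.
rng = random.Random(1)
source = [[rng.random() for _ in range(8)] for _ in range(5)]
target = [[rng.random() for _ in range(8)] for _ in range(7)]
converted = ToyStyleTransferAutoencoder().convert(source, target)
print(len(converted), len(converted[0]))  # one output frame per source frame
```

Because the speaker embedding is computed from the target utterance at conversion time rather than looked up by speaker identity, the same call works for a target speaker the model has never seen, which is the sense of "zero-shot" in the abstract. The source and target utterances also need not have the same length or content, matching the non-parallel setting.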