Global prosody style transfer without text transcriptions

Invention Grant

US11996083B2 Global prosody style transfer without text transcriptions (Active)

Patent Title: Global prosody style transfer without text transcriptions
Application No.: US17337518

Application Date: 2021-06-03
Publication No.: US11996083B2

Publication Date: 2024-05-28
Inventor: Kaizhi Qian, Yang Zhang, Shiyu Chang, Jinjun Xiong, Chuang Gan, David Cox
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Applicant Address: US NY Armonk
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Current Assignee Address: US NY Armonk
Agency: Tutunjian & Bitetto, P.C.
Agent: Stosch Sabo
Main IPC: G10L13/10
IPC: G10L13/10; G06N20/00; G10L17/04; G10L21/013; G10L25/63

Abstract:

A computer-implemented method is provided of using a machine learning model for disentanglement of prosody in spoken natural language. The method includes encoding, by a computing device, the spoken natural language to produce content code. The method further includes resampling, by the computing device without text transcriptions, the content code to obscure the prosody by applying an unsupervised technique to the machine learning model to generate prosody-obscured content code. The method additionally includes decoding, by the computing device, the prosody-obscured content code to synthesize speech indirectly based upon the content code.
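
The encode, resample, decode flow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the function names (encode, random_resample, decode, synthesize), the identity encoder and decoder, and the segment-wise random time-stretching used as the resampling step are hypothetical stand-ins, not the claimed method or the patent's trained model. The sketch only shows how resampling a content code can scramble timing (one component of prosody) while preserving the order of the content.

```python
import random


def encode(frames):
    """Hypothetical stand-in encoder: each input frame becomes one content-code entry."""
    return list(frames)


def random_resample(code, segment_len=4, min_rate=0.5, max_rate=1.5, rng=None):
    """Stretch or compress each fixed-length segment by a random rate.

    Assumption: nearest-neighbour resampling of fixed-length segments is used
    here only to illustrate obscuring timing; it is not taken from the patent.
    """
    rng = rng or random.Random()
    out = []
    for start in range(0, len(code), segment_len):
        segment = code[start:start + segment_len]
        rate = rng.uniform(min_rate, max_rate)
        new_len = max(1, round(len(segment) * rate))
        for i in range(new_len):
            # Map output position i back to a source index within the segment.
            src = min(len(segment) - 1, int(i * len(segment) / new_len))
            out.append(segment[src])
    return out


def decode(code):
    """Hypothetical stand-in decoder: returns the code unchanged."""
    return list(code)


def synthesize(frames, rng=None):
    """Run the three steps in order: encode, resample, decode."""
    content_code = encode(frames)
    obscured_code = random_resample(content_code, rng=rng)
    return decode(obscured_code)


if __name__ == "__main__":
    frames = list(range(12))  # placeholder for per-frame speech features
    print(synthesize(frames, rng=random.Random(0)))
```

In this toy version the output keeps the original ordering of the content entries, but the duration of each segment changes at random, so the original rhythm cannot be read off the resampled code.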

Public/Granted literature

US20220392429A1 GLOBAL PROSODY STYLE TRANSFER WITHOUT TEXT TRANSCRIPTIONS Publication date: 2022-12-08

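IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems
G10L13/08	. Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L13/10	.. Prosody rules derived from text; stress or intonation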