Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Wenfu WANG"

1.

Invention application
METHOD AND APPARATUS OF SYNTHESIZING SPEECH, METHOD AND APPARATUS OF TRAINING SPEECH SYNTHESIS MODEL, ELECTRONIC DEVICE, AND STORAGE MEDIUM [Status: Granted]

Publication No.: US20220020356A1

Publication date: 2022-01-20

Application No.: US17489616

Filing date: 2021-09-29

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Wenfu WANG, Tao SUN, Xilei WANG, Junteng ZHANG, Zhengkun GAO, Lei JIA

IPC: G10L13/10, G10L25/30

Abstract: The present disclosure provides a method and apparatus of synthesizing speech, a method and apparatus of training a speech synthesis model, an electronic device, and a storage medium. The method of synthesizing speech includes: acquiring style information of the speech to be synthesized, tone information of the speech to be synthesized, and content information of a text to be processed; generating acoustic feature information of the text to be processed, by using a pre-trained speech synthesis model, based on the style information, the tone information, and the content information of the text to be processed; and synthesizing the speech for the text to be processed, based on the acoustic feature information of the text to be processed.

2.

Invention publication
METHOD OF TRAINING SPEECH SYNTHESIS MODEL AND METHOD OF SYNTHESIZING SPEECH [Status: Under examination, published]

Publication No.: US20230178067A1

Publication date: 2023-06-08

Application No.: US18074023

Filing date: 2022-12-02

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Wenfu WANG, Tao SUN, Xilei WANG, Lei JIA

IPC: G10L13/047, G10L25/30

CPC classification number: G10L13/047, G10L25/30

Abstract: A method of training a speech synthesis model, a method of synthesizing a speech, a device and a storage medium are provided, which relate to a field of artificial intelligence technology, in particular to a field of speech synthesis technology. The specific implementation scheme includes: processing training data by using the speech synthesis model, so as to determine a content encoding sequence, a style encoding sequence, a timbre encoding vector, a noise environment vector and a target Mel spectrum sequence corresponding to the training data; determining a total loss value according to the content encoding sequence, the style encoding sequence, the timbre encoding vector, the noise environment vector and the target Mel spectrum sequence; and adjusting a parameter of the speech synthesis model according to the total loss value.

3.

Invention application
METHOD OF REGISTERING ATTRIBUTE IN SPEECH SYNTHESIS MODEL, APPARATUS OF REGISTERING ATTRIBUTE IN SPEECH SYNTHESIS MODEL, ELECTRONIC DEVICE, AND MEDIUM [Status: Granted]

Publication No.: US20220076657A1

Publication date: 2022-03-10

Application No.: US17455156

Filing date: 2021-11-16

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Wenfu WANG, Xilei WANG, Tao SUN, Han YUAN, Zhengkun GAO, Lei JIA

IPC: G10L13/02

Abstract: A method of registering an attribute in a speech synthesis model, an apparatus of registering an attribute in a speech synthesis model, an electronic device, and a medium are provided, which relate to a field of artificial intelligence technology such as deep learning and intelligent speech technology. The method includes: acquiring a plurality of data associated with an attribute to be registered; and registering the attribute in the speech synthesis model by using the plurality of data associated with the attribute, wherein the speech synthesis model is trained in advance by using training data in a training data set.
