Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Tao Sun"

1.

Granted invention patent
Voice generating method and apparatus, electronic device and storage medium (status: active)

Publication No.: US12073822B2

Publication date: 2024-08-27

Application No.: US18086004

Application date: 2022-12-21

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Xinyong Zhou , Junteng Zhang , Tao Sun , Lei Jia

IPC: G10L13/00 , G06F40/30 , G10L13/06 , G10L13/10 , G10L25/18 , G10L13/047

CPC classification number: G10L13/10 , G06F40/30 , G10L13/06 , G10L25/18 , G10L13/047 , G10L2013/105

Abstract: A voice generating method and apparatus, an electronic device and a storage medium. The specific implementation solution includes: acquiring a text to be processed, and determining an associated text of the text to be processed; acquiring an associated prosodic feature of the associated text; determining an associated text feature of the associated text based on the text to be processed; determining a spectrum feature to be processed of the text to be processed based on the associated prosodic feature and the associated text feature; and generating a target voice corresponding to the text to be processed based on the spectrum feature to be processed.

2.

Granted invention patent
Method of registering attribute in speech synthesis model, apparatus of registering attribute in speech synthesis model, electronic device, and medium (status: active)

Publication No.: US12062357B2

Publication date: 2024-08-13

Application No.: US17455156

Application date: 2021-11-16

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Wenfu Wang , Xilei Wang , Tao Sun , Han Yuan , Zhengkun Gao , Lei Jia

IPC: G10L13/02

CPC classification number: G10L13/02

Abstract: A method of registering an attribute in a speech synthesis model, an apparatus of registering an attribute in a speech synthesis model, an electronic device, and a medium are provided, which relate to a field of an artificial intelligence technology such as a deep learning and intelligent speech technology. The method includes: acquiring a plurality of data associated with an attribute to be registered; and registering the attribute in the speech synthesis model by using the plurality of data associated with the attribute, wherein the speech synthesis model is trained in advance by using a training data in a training data set.

3.

Granted invention patent
Method of processing audio data, electronic device and storage medium (status: active)

Publication No.: US11984134B2

Publication date: 2024-05-14

Application No.: US18071187

Application date: 2022-11-29

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Jiankang Hou , Zhipeng Nie , Liqiang Zhang , Tao Sun , Lei Jia

IPC: G10L25/18 , G10L25/30

CPC classification number: G10L25/18 , G10L25/30

Abstract: A method of processing audio data, an electronic device, and a storage medium, which relates to a field of artificial intelligence, in particular to a field of speech processing technology. The method includes: processing spectral data of the audio data to obtain a first feature information; obtaining a fundamental frequency indication information according to the first feature information, wherein the fundamental frequency indication information indicates valid audio data of the first feature information and invalid audio data of the first feature information; obtaining a fundamental frequency information and a spectral energy information according to the first feature information and the fundamental frequency indication information; and obtaining a harmonic structure information of the audio data according to the fundamental frequency information and the spectral energy information.

4.

Granted invention patent
Method and apparatus of synthesizing speech, method and apparatus of training speech synthesis model, electronic device, and storage medium (status: active)

Publication No.: US11769482B2

Publication date: 2023-09-26

Application No.: US17489616

Application date: 2021-09-29

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Wenfu Wang , Tao Sun , Xilei Wang , Junteng Zhang , Zhengkun Gao , Lei Jia

IPC: G10L13/10 , G10L25/30

CPC classification number: G10L13/10 , G10L25/30

Abstract: The present disclosure provides a method and apparatus of synthesizing a speech, a method and apparatus of training a speech synthesis model, an electronic device, and a storage medium. The method of synthesizing a speech includes acquiring a style information of a speech to be synthesized, a tone information of the speech to be synthesized, and a content information of a text to be processed; generating an acoustic feature information of the text to be processed, by using a pre-trained speech synthesis model, based on the style information, the tone information, and the content information of the text to be processed; and synthesizing the speech for the text to be processed, based on the acoustic feature information of the text to be processed.

5.

Granted invention patent
Speech synthesis method and apparatus, device and computer storage medium (status: active)

Publication No.: US11996084B2

Publication date: 2024-05-28

Application No.: US17738186

Application date: 2022-05-06

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Liqiang Zhang , Jiankang Hou , Tao Sun , Lei Jia

IPC: G10L13/02 , G06F40/20 , G10L13/04 , G10L13/047 , G10L13/10

CPC classification number: G10L13/10 , G06F40/20 , G10L13/047

Abstract: The present disclosure discloses a speech synthesis method and apparatus, a device and a computer storage medium, and relates to speech and deep learning technologies in the field of artificial intelligence technologies. A specific implementation solution involves: acquiring to-be-synthesized text; acquiring a prosody feature extracted from the text; inputting the text and the prosody feature into a speech synthesis model to obtain a vocoder feature; and inputting the vocoder feature into a vocoder to obtain synthesized speech.

6.

Invention application
VOICE GENERATING METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM (status: active)

Publication No.: US20230131494A1

Publication date: 2023-04-27

Application No.: US18086004

Application date: 2022-12-21

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Xinyong Zhou , Junteng Zhang , Tao Sun , Lei Jia

IPC: G10L13/10 , G10L25/18 , G10L13/06 , G06F40/30

Abstract: A voice generating method and apparatus, an electronic device and a storage medium. The specific implementation solution includes: acquiring a text to be processed, and determining an associated text of the text to be processed; acquiring an associated prosodic feature of the associated text; determining an associated text feature of the associated text based on the text to be processed; determining a spectrum feature to be processed of the text to be processed based on the associated prosodic feature and the associated text feature; and generating a target voice corresponding to the text to be processed based on the spectrum feature to be processed.

7.

Invention application
METHOD AND APPARATUS FOR CONVERTING VOICE TIMBRE, METHOD AND APPARATUS FOR TRAINING MODEL, DEVICE AND MEDIUM (status: active)

Publication No.: US20230127787A1

Publication date: 2023-04-27

Application No.: US18145326

Application date: 2022-12-22

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Junchao Wang , Yixiang Chen , Tao Sun

IPC: G10L15/02 , G10L15/06

Abstract: A method and an apparatus for converting a voice timbre, and a method for training a model. The solution includes: obtaining a target acoustic feature by encoding a sample audio using an encoding branch in a voice timbre conversion model; obtaining a target text feature by performing feature extraction on a real text sequence labeled by the sample audio; training the encoding branch based on a difference between the target acoustic feature and the target text feature; obtaining a first spectrum feature having an original timbre by decoding the target text feature using a decoding branch in the voice timbre conversion model based on the original timbre corresponding to the identification information carried in the sample audio; obtaining a second spectrum feature by performing spectrum feature extraction on the sample audio; and training the decoding branch based on a difference between the first spectrum feature and the second spectrum feature.

8.

Invention application
METHOD AND APPARATUS FOR SPEECH SYNTHESIS, AND STORAGE MEDIUM (status: active)

Publication No.: US20220375453A1

Publication date: 2022-11-24

Application No.: US17875529

Application date: 2022-07-28

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Junteng Zhang , Jianmin Wu , Tao Sun , Lei Jia

IPC: G10L13/10 , G10L13/06 , G10L13/047 , G10L25/18 , G10L13/08

Abstract: A method for speech synthesis includes obtaining text to be synthesized and an identifier of a speaker, the text being written in a first language; obtaining pronunciation information of each character in the text; generating linguistic features of the text by performing feature extraction on the pronunciation information of each character in the text based on the first language; and obtaining a target speech in a second language other than the first language, by performing speech synthesis based on the linguistic features and the identifier of the speaker.

9.

Granted invention patent
Speech synthesis method, and electronic device (status: active)

Publication No.: US12211485B2

Publication date: 2025-01-28

Application No.: US17820339

Application date: 2022-08-17

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Zhengkun Gao , Junteng Zhang , Tao Sun , Lei Jia

IPC: G10L13/00 , G10L13/047 , G10L13/08

Abstract: The disclosure provides a speech synthesis method, and an electronic device. The technical solution is described as follows. A text to be synthesized and speech features of a target user are obtained. Predicted first acoustic features based on the text to be synthesized and the speech features are obtained. A target template audio is obtained from a template audio library based on the text to be synthesized. Second acoustic features of the target template audio are extracted. Target acoustic features are generated by splicing the first acoustic features and the second acoustic features. Speech synthesis is performed on the text to be synthesized based on the target acoustic features and the speech features, to generate a target speech of the text to be synthesized.

10.

Invention application
METHOD AND APPARATUS FOR PROCESSING SPEECH, ELECTRONIC DEVICE AND STORAGE MEDIUM (status: active)

Publication No.: US20230015112A1

Publication date: 2023-01-19

Application No.: US17933152

Application date: 2022-09-19

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Jiankang Hou , Tao Sun , Zhipeng Nie , Liqiang Zhang , Lei Jia , Haifeng Wang

IPC: G10L21/10 , G10L13/02 , G10L21/0208 , G10L25/51

Abstract: A method for processing a speech includes: acquiring an original speech; extracting a spectrogram from the original speech; acquiring a speech synthesis model, where the speech synthesis model comprises a first generation sub-model and a second generation sub-model; generating a harmonic structure of the spectrogram, by invoking the first generation sub-model to process the spectrogram; and generating a target speech, by invoking the second generation sub-model to process the harmonic structure and the spectrogram.
