Patent search: ap:"NEOSAPIENCE, INC." (page 1)

1.

Granted invention
Multilingual text-to-speech synthesis (status: in force)

Publication No.: US11769483B2

Publication date: 2023-09-26

Application No.: US17533459

Filing date: 2021-11-23

Applicant: NEOSAPIENCE, INC.

Inventors: Taesu Kim, Younggun Lee

IPC classification: G06F17/00, G10L13/10, G10L13/033, G10L13/047, G06N3/04, G06N3/08, G06F40/40, G10L13/08, G10L25/30, G06N3/044, G06N3/045

CPC classification: G10L13/10, G06F40/40, G06N3/04, G06N3/044, G06N3/045, G06N3/08, G10L13/033, G10L13/047, G10L13/086, G10L25/30

Abstract: A multilingual text-to-speech synthesis method and system are disclosed. The method includes receiving an articulatory feature of a speaker regarding a first language, receiving an input text of a second language, and generating output speech data for the input text of the second language that simulates the speaker's speech by inputting the input text of the second language and the articulatory feature of the speaker regarding the first language to a single artificial neural network multilingual text-to-speech synthesis model. The single artificial neural network multilingual text-to-speech synthesis model is generated by learning similarity information between phonemes of the first language and phonemes of the second language based on a first learning data of the first language and a second learning data of the second language.

2.

Granted invention
Method for searching for contents having same voice as voice of target speaker, and apparatus for executing same (status: in force)

Publication No.: US11664015B2

Publication date: 2023-05-30

Application No.: US17319566

Filing date: 2021-05-13

Applicant: NEOSAPIENCE, INC.

Inventors: Suwon Shon, Younggun Lee, Taesu Kim

IPC classification: G10L15/10, G06F16/901, G06F16/683, G06N3/04, G10L15/02, G10L25/03

CPC classification: G10L15/10, G06F16/683, G06F16/9014, G06N3/04, G10L15/02, G10L25/03

Abstract: A method for searching content having same voice as a voice of a target speaker from among a plurality of contents includes extracting a feature vector corresponding to the voice of the target speaker, selecting any subset of speakers from a training dataset repeatedly by a predetermined number of times, generating linear discriminant analysis (LDA) transformation matrices using each of the selected any subsets of speakers repeatedly by a predetermined number of times, projecting the extracted speaker feature vector to the selected corresponding subsets of speakers using each of the generated LDA transformation matrices, assigning a value corresponding to nearby speaker class among corresponding subsets of speakers, to each of projection regions of the extracted speaker feature vector, generating a hash value corresponding to the extracted feature vector based on the assigned values, and searching content having a similar hash value to the generated hash value among the contents.
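
Illustrative note (not part of the patent record): the hashing idea in this abstract can be sketched roughly as follows. This is a simplified sketch under stated assumptions, not the patented method: the LDA projection step is deliberately omitted (vectors are compared in their original space), and each hash digit is the index of the nearest speaker centroid within one randomly selected speaker subset. All function names, parameters, and the Hamming-distance search criterion are hypothetical.

```python
import random


def centroid(vectors):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_tables(train, num_tables, subset_size, seed=0):
    """Select a random speaker subset `num_tables` times.

    `train` maps speaker id -> list of feature vectors. Each table holds the
    centroids of one subset. ASSUMPTION: no LDA transform is fitted here; a
    fuller version would fit one LDA matrix per subset and project first.
    """
    rng = random.Random(seed)
    speakers = sorted(train)
    tables = []
    for _ in range(num_tables):
        subset = rng.sample(speakers, subset_size)
        tables.append([centroid(train[s]) for s in subset])
    return tables


def hash_vector(vec, tables):
    """One hash digit per table: index of the nearest speaker class."""
    return tuple(
        min(range(len(table)), key=lambda i: sq_dist(vec, table[i]))
        for table in tables
    )


def hamming(h1, h2):
    """Number of positions where two hashes differ."""
    return sum(a != b for a, b in zip(h1, h2))


def search(target_vec, contents, tables, max_distance=0):
    """Return names of contents whose hash is within `max_distance` digits."""
    target_hash = hash_vector(target_vec, tables)
    return [
        name
        for name, vec in contents.items()
        if hamming(target_hash, hash_vector(vec, tables)) <= max_distance
    ]
```

Comparing short hash tuples instead of full feature vectors is what makes the search cheap; `max_distance` is an assumed knob for how "similar" two hash values must be.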

3.

Granted invention
Multilingual text-to-speech synthesis (status: in force)

Publication No.: US11217224B2

Publication date: 2022-01-04

Application No.: US16682390

Filing date: 2019-11-13

Applicant: NEOSAPIENCE, INC.

Inventors: Taesu Kim, Younggun Lee

IPC classification: G10L13/00, G10L13/02, G10L13/06, G10L15/26, G10L13/10, G10L13/033, G10L13/047, G06N3/04, G06N3/08, G06F40/40, G10L13/08, G10L25/30

Abstract: A multilingual text-to-speech synthesis method and system are disclosed. The method includes receiving first learning data including a learning text of a first language and learning speech data of the first language corresponding to the learning text of the first language, receiving second learning data including a learning text of a second language and learning speech data of the second language corresponding to the learning text of the second language, and generating a single artificial neural network text-to-speech synthesis model by learning similarity information between phonemes of the first language and phonemes of the second language based on the first learning data and the second learning data.

4.

Published invention application
TRANSLATION METHOD AND SYSTEM USING MULTILINGUAL TEXT-TO-SPEECH SYNTHESIS MODEL (status: under examination, published)

Publication No.: US20240013771A1

Publication date: 2024-01-11

Application No.: US18371704

Filing date: 2023-09-22

Applicant: NEOSAPIENCE, INC.

Inventors: Taesu KIM, Younggun LEE

IPC classification: G10L13/10, G10L15/02, G10L13/047, G06F40/47, G10L25/57, G10L13/08, G10L15/16, G10L13/033

CPC classification: G10L13/10, G10L15/02, G10L13/047, G06F40/47, G10L25/57, G10L13/086, G10L15/16, G10L13/033

Abstract: A speech translation method using a multilingual text-to-speech synthesis model includes receiving input speech data of the first language and an articulatory feature of a speaker regarding the first language, converting the input speech data of the first language into a text of the first language, converting the text of the first language into a text of the second language, and generating output speech data for the text of the second language that simulates the speaker's speech by inputting the text of the second language and the articulatory feature of the speaker to a single artificial neural network text-to-speech synthesis model.
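
Illustrative note (not part of the patent record): the three stages named in this abstract (speech recognition, text translation, speaker-conditioned synthesis) can be sketched as a simple pipeline. This is a minimal sketch assuming each stage is supplied as a callable; the class name, field names, and type signatures are hypothetical, and no real model is implied.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SpeechTranslator:
    """Chains the three stages; every stage is a caller-supplied placeholder."""

    # Hypothetical stage: first-language speech -> first-language text.
    recognize: Callable[[bytes], str]
    # Hypothetical stage: first-language text -> second-language text.
    translate: Callable[[str], str]
    # Hypothetical stage: second-language text + speaker feature -> speech.
    synthesize: Callable[[str, Sequence[float]], bytes]

    def run(self, speech: bytes, speaker_feature: Sequence[float]) -> bytes:
        source_text = self.recognize(speech)
        target_text = self.translate(source_text)
        # The same speaker feature, obtained for the first language, is used
        # to condition synthesis in the second language.
        return self.synthesize(target_text, speaker_feature)
```

Passing the speaker feature straight through to the last stage mirrors the abstract's point that the output speech in the second language is meant to simulate the original speaker.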

5.

Published invention application
METHOD AND SYSTEM FOR APPLYING SYNTHETIC SPEECH TO SPEAKER IMAGE (status: under examination, published)

Publication No.: US20230206896A1

Publication date: 2023-06-29

Application No.: US18113671

Filing date: 2023-02-24

Applicant: NEOSAPIENCE, INC.

Inventors: Taesu KIM, Younggun LEE, Yookyung SHIN

IPC classification: G10L13/08, G10L25/30, G10L15/02

CPC classification: G10L13/08, G10L25/30, G10L15/02, G10L2015/025

Abstract: The present disclosure relates to a method for applying synthesis voice to a speaker image, in which the method includes receiving an input text, inputting the input text to an artificial neural network text-to-speech synthesis model and outputting voice data for the input text, generating a synthesis voice corresponding to the output voice data, and generating information on a plurality of phonemes included in the output voice data, in which the information on the plurality of phonemes may include timing information for each of the plurality of phonemes included in the output voice data.
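
Illustrative note (not part of the patent record): per-phoneme timing information of the kind this abstract mentions could be represented and queried as below, for example to decide which phoneme is active at a given playback time when animating a speaker image. This is only a sketch; the record layout, the use of seconds, and the lookup function are assumptions rather than anything stated in the abstract.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PhonemeTiming:
    phoneme: str
    start: float  # assumed unit: seconds from the start of the audio
    end: float    # exclusive end time, same unit


def phoneme_at(timings: List[PhonemeTiming], t: float) -> Optional[str]:
    """Return the phoneme active at time `t`, or None if there is none.

    Assumes `timings` is sorted by start time and non-overlapping.
    """
    starts = [p.start for p in timings]
    i = bisect_right(starts, t) - 1  # last entry starting at or before t
    if i >= 0 and t < timings[i].end:
        return timings[i].phoneme
    return None
```

A renderer could call `phoneme_at` once per video frame to pick a matching mouth shape; that use is an assumption about how timing data might be consumed.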

6.

Invention application
METHOD FOR SEARCHING FOR CONTENTS HAVING SAME VOICE AS VOICE OF TARGET SPEAKER, AND APPARATUS FOR EXECUTING SAME (status: in force)

Publication No.: US20210280173A1

Publication date: 2021-09-09

Application No.: US17319566

Filing date: 2021-05-13

Applicant: NEOSAPIENCE, INC.

Inventors: Suwon SHON, Younggun LEE, Taesu KIM

IPC classification: G10L15/10, G10L15/02, G10L25/03, G06F16/683, G06F16/901, G06N3/04

Abstract: A method for searching content having same voice as a voice of a target speaker from among a plurality of contents includes extracting a feature vector corresponding to the voice of the target speaker, selecting any subset of speakers from a training dataset repeatedly by a predetermined number of times, generating linear discriminant analysis (LDA) transformation matrices using each of the selected any subsets of speakers repeatedly by a predetermined number of times, projecting the extracted speaker feature vector to the selected corresponding subsets of speakers using each of the generated LDA transformation matrices, assigning a value corresponding to nearby speaker class among corresponding subsets of speakers, to each of projection regions of the extracted speaker feature vector, generating a hash value corresponding to the extracted feature vector based on the assigned values, and searching content having a similar hash value to the generated hash value among the contents.

7.

Published invention application
SPEECH TRANSLATION METHOD AND SYSTEM USING MULTILINGUAL TEXT-TO-SPEECH SYNTHESIS MODEL (status: under examination, published)

Publication No.: US20240363098A1

Publication date: 2024-10-31

Application No.: US18770736

Filing date: 2024-07-12

Applicant: NEOSAPIENCE, INC.

Inventors: Taesu KIM, Younggun LEE

IPC classification: G10L13/10, G06F40/40, G06N3/04, G06N3/044, G06N3/045, G06N3/08, G10L13/033, G10L13/047, G10L13/08, G10L25/30

CPC classification: G10L13/10, G06F40/40, G06N3/04, G06N3/044, G06N3/045, G06N3/08, G10L13/033, G10L13/047, G10L13/086, G10L25/30

Abstract: A speech translation method using a multilingual text-to-speech synthesis model includes receiving input speech data of the first language and an articulatory feature of a speaker regarding the first language, converting the input speech data of the first language into a text of the first language, converting the text of the first language into a text of the second language, and generating output speech data for the text of the second language that simulates the speaker's speech by inputting the text of the second language and the articulatory feature of the speaker to a single artificial neural network text-to-speech synthesis model.

8.

Published invention application
METHOD AND SYSTEM FOR GENERATING SYNTHESIS VOICE USING STYLE TAG REPRESENTED BY NATURAL LANGUAGE (status: under examination, published)

Publication No.: US20240105160A1

Publication date: 2024-03-28

Application No.: US18533507

Filing date: 2023-12-08

Applicant: NEOSAPIENCE, INC.

Inventors: Taesu KIM, Younggun LEE, Yookyung SHIN, Hyeongju KIM

IPC classification: G10L13/10, G06F40/253

CPC classification: G10L13/10, G06F40/253

Abstract: A method for generating a synthesis voice is provided, which is performed by one or more processors, and includes acquiring a text-to-speech synthesis model trained to generate a synthesis voice for a training text, based on reference voice data and a training style tag represented by natural language, receiving a target text, acquiring a style tag represented by natural language, and inputting the style tag and the target text into the text-to-speech synthesis model and acquiring a synthesis voice for the target text reflecting voice style features related to the style tag.

9.

Published invention application
METHOD FOR PERFORMING SYNTHETIC SPEECH GENERATION OPERATION ON TEXT (status: under examination, published)

Publication No.: US20230186895A1

Publication date: 2023-06-15

Application No.: US18108080

Filing date: 2023-02-10

Applicant: NEOSAPIENCE, INC.

Inventors: Taesu KIM, Younggun LEE, Suhee JO, Yookyung SHIN

IPC classification: G10L13/10, G10L13/027

CPC classification: G10L13/10, G10L13/027

Abstract: A method for performing the synthetic speech generation operation on text is provided, including receiving a plurality of sentences, receiving a plurality of speech style characteristics for the plurality of sentences, inputting the plurality of sentences and the plurality of speech style characteristics into an artificial neural network text-to-speech synthesis model, so as to generate a plurality of synthetic speeches for the plurality of sentences that reflect the plurality of speech style characteristics, and receiving a response to at least one of the plurality of synthetic speeches.

10.

Invention application
MULTILINGUAL TEXT-TO-SPEECH SYNTHESIS (status: in force)

Publication No.: US20220084500A1

Publication date: 2022-03-17

Application No.: US17533459

Filing date: 2021-11-23

Applicant: NEOSAPIENCE, INC.

Inventors: Taesu Kim, Younggun Lee

IPC classification: G10L13/10, G10L13/033, G10L13/047, G06N3/04, G06N3/08, G06F40/40, G10L13/08, G10L25/30

Abstract: A multilingual text-to-speech synthesis method and system are disclosed. The method includes receiving an articulatory feature of a speaker regarding a first language, receiving an input text of a second language, and generating output speech data for the input text of the second language that simulates the speaker's speech by inputting the input text of the second language and the articulatory feature of the speaker regarding the first language to a single artificial neural network multilingual text-to-speech synthesis model. The single artificial neural network multilingual text-to-speech synthesis model is generated by learning similarity information between phonemes of the first language and phonemes of the second language based on a first learning data of the first language and a second learning data of the second language.
