Patent search: cpc:"G10L13/086", page 6

51.

Invention application
Text-to-speech process capable of interspersing recorded words and phrases (status: pending, published)

Publication No.: US20180114523A1

Publication date: 2018-04-26

Application No.: US15792861

Filing date: 2017-10-25

Applicant: Cepstral, LLC

Inventors: Patrick Dexter, Kevin Jeffries

IPC classification: G10L13/08

CPC classification: G10L13/08, G10L13/086

Abstract: Creating and deploying a voice from text-to-speech, with such voice being a new language derived from the original phoneset of a known language, and thus being audio of the new language outputted using a single TTS synthesizer. An end product message is determined in an original language n to be outputted as audio n by a text-to-speech engine, wherein the original language n includes an existing phoneset n including one or more phonemes n. Words and phrases of a new language n+1 are recorded, thereby forming audio file n+1. This new audio file is labeled into unique units, thereby defining one or more phonemes n+1. The new phonemes of the new language are added to the phoneset, thereby forming new phoneset n+1, as a result outputting the end product message as an audio n+1 language different from the original language n.

52.

Invention application
PROCESSING SEQUENCES USING CONVOLUTIONAL NEURAL NETWORKS (status: pending, published)

Publication No.: US20180075343A1

Publication date: 2018-03-15

Application No.: US15697407

Filing date: 2017-09-06

Applicant: Google Inc.

Inventors: Aaron Gerard Antonius van den Oord, Sander Etienne Lea Dieleman, Nal Emmerich Kalchbrenner, Karen Simonyan, Oriol Vinyals, Lasse Espeholt

IPC classification: G06N3/04, G06F17/27, G10L25/30, G10L15/16, G10L13/08

CPC classification: G06N3/0472, G06F17/18, G06F17/2765, G06F17/2818, G06N3/0445, G06N3/0454, G06N3/084, G10H2250/311, G10L13/04, G10L13/086, G10L15/16, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing sequences using convolutional neural networks. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
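The per-time-step procedure in the abstract above can be illustrated with a toy sketch. This is not the patented implementation: the single causal dilated convolution, the tanh nonlinearity, the linear output layer, and every function and parameter name below are assumptions chosen for illustration only.

```python
import math

def causal_dilated_conv(seq, weights, dilation):
    """1-D causal convolution: output[t] uses seq[t], seq[t-d], seq[t-2d], ... only."""
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start act as zero padding
                acc += w * seq[idx]
        out.append(acc)
    return out

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_sample_distribution(previous_samples, conv_weights, dilation,
                             out_weights, out_bias):
    """Score distribution over candidate sample values, given earlier samples only."""
    hidden = causal_dilated_conv(previous_samples, conv_weights, dilation)
    # Stand-in for the "alternative representation" of the current time step.
    rep = math.tanh(hidden[-1]) if hidden else 0.0
    logits = [w * rep + b for w, b in zip(out_weights, out_bias)]
    return softmax(logits)

if __name__ == "__main__":
    dist = next_sample_distribution(
        [0.1, -0.2, 0.3, 0.05], [0.5, 0.25], 2,
        [1.0, 0.0, -1.0, 2.0], [0.0, 0.0, 0.0, 0.0],
    )
    print(dist)
```

Each entry of the returned list is the score for one candidate sample value; a generator would pick or sample one value, append it to the sequence, and repeat for the next time step.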

53.

Granted invention patent
Multilingual prosody generation (status: in force)

Publication No.: US09905220B2

Publication date: 2018-02-27

Application No.: US14942300

Filing date: 2015-11-16

Applicant: Google Inc.

Inventors: Javier Gonzalvo Fructuoso, Andrew W. Senior, Byungha Chun

IPC classification: G10L13/08, G10L13/10, G06F17/28, G10L13/07, G10L25/30

CPC classification: G10L13/10, G06F17/289, G10L13/07, G10L13/08, G10L13/086, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multilingual prosody generation. In some implementations, data indicating a set of linguistic features corresponding to a text is obtained. Data indicating the linguistic features and data indicating the language of the text are provided as input to a neural network that has been trained to provide output indicating prosody information for multiple languages. The neural network can be a neural network having been trained using speech in multiple languages. Output indicating prosody information for the linguistic features is received from the neural network. Audio data representing the text is generated using the output of the neural network.

54.

Granted invention patent
System and method for using prior frame data for OCR processing of frames in video sources (status: in force)

Publication No.: US09792895B2

Publication date: 2017-10-17

Application No.: US14863512

Filing date: 2015-09-24

Applicant: ABBYY Development LLC

Inventors: Ivan Khintsitskiy, Andrey Isaev, Sergey Fedorov

IPC classification: G06K9/32, G06K9/00, G06T7/00, G10L13/04, G10L13/08

CPC classification: G10L13/043, G06K9/3233, G06K9/3258, G06K2209/01, G10L13/04, G10L13/086

Abstract: Disclosed are systems, methods and computer program products for using prior frame data for OCR processing of frames in video sources to detect natural language text therein. An example includes receiving a frame from a video source and retrieving prior frame data associated with the video source. The OCR-processing includes using prior frame data to detect blobs similar to blobs described in the prior frame data; using detected similar blobs to detect in the frame character candidates similar to character candidates described in the prior frame data; using detected similar character candidates to detect in the frame text candidates similar to text candidates described in the prior frame data; and using detected similar text candidates to detect in the frame text strings similar to text strings described in the prior frame data.

55.

Invention application
SYSTEM AND METHOD FOR INTELLIGENT LANGUAGE SWITCHING IN AUTOMATED TEXT-TO-SPEECH SYSTEMS (status: pending, published)

Publication No.: US20170236509A1

Publication date: 2017-08-17

Application No.: US15583068

Filing date: 2017-05-01

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Gregory Pulz, Harry E. Blanchard, Lan Zhang

IPC classification: G10L13/08, G10L13/047

CPC classification: G10L13/086, G06F17/289, G10L13/047

Abstract: Systems, methods, and computer-readable storage media for providing for intelligent switching of languages and/or pronunciations in a text-to-speech system. As the system receives text, the text is analyzed to identify portions which should have speech constructed using a pronunciation distinct from the remaining portions of the text. The text-to-speech system uses multiple pronunciation dictionaries to generate and produce speech corresponding to the text, where the identified portions of the text are in a different language or have a different accent from the remainder of the text. Having generated speech corresponding to the text in multiple languages, accents, or dialects, the system combines the portions, then communicates the speech to the text recipient.
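A rough sketch of the flow in the abstract above: tag each portion of the text with a language, look it up in that language's pronunciation dictionary, then combine the portions. The dictionaries, the phone strings, the word-list detection rule, and all names below are hypothetical placeholders, not taken from the patent.

```python
import re

# Hypothetical toy pronunciation dictionaries; the phone strings are illustrative only.
DICTIONARIES = {
    "en": {"hello": "HH AH L OW", "paris": "P EH R IH S"},
    "fr": {"paris": "P A R I", "bonjour": "B O~ Z U R"},
}

def tag_portions(words, foreign_words, default_lang="en", foreign_lang="fr"):
    """Label each word with the dictionary that should pronounce it.

    `foreign_words` is an assumed set of lowercase words flagged as foreign.
    """
    return [
        (w, foreign_lang if w.lower() in foreign_words else default_lang)
        for w in words
    ]

def synthesize(text, foreign_words):
    """Return a combined pronunciation string, one tagged portion per word."""
    words = re.findall(r"[A-Za-z']+", text)
    parts = []
    for word, lang in tag_portions(words, foreign_words):
        pron = DICTIONARIES[lang].get(word.lower(), "<unk>")
        parts.append(f"{lang}:{pron}")
    return " | ".join(parts)

if __name__ == "__main__":
    print(synthesize("Hello Paris", {"paris"}))
```

A real system would produce audio rather than phone strings and would detect the foreign portions automatically; the sketch only shows the per-portion dictionary switch followed by recombination.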

56.

Invention application
TEXT-TO-SPEECH METHOD AND MULTI-LINGUAL SPEECH SYNTHESIZER USING THE METHOD (status: in force)

Publication No.: US20170047060A1

Publication date: 2017-02-16

Application No.: US14956405

Filing date: 2015-12-02

Applicant: ASUSTeK COMPUTER INC.

Inventors: Hsun-Fu LIU, Abhishek Pandey, Chin-Cheng HSU

IPC classification: G10L13/10, G10L13/027, G10L15/06, G10L13/08

CPC classification: G10L13/10, G10L13/00, G10L13/02, G10L13/04, G10L13/06, G10L13/07, G10L13/08, G10L13/086

Abstract: A text-to-speech method and a multi-lingual speech synthesizer using the method are disclosed. The multi-lingual speech synthesizer and the method executed by a processor are applied for processing a multi-lingual text message in a mixture of a first language and a second language into a multi-lingual voice message. The multi-lingual speech synthesizer comprises a storage device configured to store a first language model database and a second language model database, a broadcasting device configured to broadcast the multi-lingual voice message, and a processor, connected to the storage device and the broadcasting device, configured to execute the method disclosed herein.

57.

Invention application
QUANTITATIVE F0 CONTOUR GENERATING DEVICE AND METHOD, AND MODEL LEARNING DEVICE AND METHOD FOR F0 CONTOUR GENERATION (status: pending, published)

Publication No.: US20160189705A1

Publication date: 2016-06-30

Application No.: US14911189

Filing date: 2014-08-13

Applicant: NATIONAL INSTITUTE OF INFORMATION AND COMMUNICATIONS TECHNOLOGY

Inventors: Jinfu NI, Yoshinori SHIGA

IPC classification: G10L13/10, G10L21/02, G10L25/18, G10L13/027, G10L13/08

CPC classification: G10L13/10, G10L13/027, G10L13/086, G10L21/0205, G10L25/18

Abstract: [Object] An object is to provide an F0 contour synthesizing device based on a statistical model, to clarify correspondence between linguistic information and F0 contour while maintaining accuracy. [Solution] An HMM learning device includes: a parameter estimating unit representing an F0 contour 133 fitting a continuous F0 contour 132 as a sum of phrase components and accent components and estimating target points of these; and an HMM learning means conducting learning of HMM 139 using the fitted F0 contour as training data. The continuous F0 contour may be decomposed to accent components 134, phrase components 136 and micro-prosody components 138, and separate HMMs 140, 142 and 144 may be trained. Using results of text analysis, accent components, phrase components and micro-prosody components are separately synthesized from HMMs 140, 142 and 144 and the results are synthesized to obtain an F0 contour.

58.

Granted invention patent
Methods and systems for automated generation of nativized multi-lingual lexicons (status: in force)

Publication No.: US09263028B2

Publication date: 2016-02-16

Application No.: US14283586

Filing date: 2014-05-21

Applicant: Google Inc.

Inventors: Javier Gonzalvo Fructuoso, Ioannis Agiomyrgiannakis

IPC classification: G10L13/00, G10L13/08, G10L15/06, G06F17/27, G10L15/187

CPC classification: G10L13/086, G06F17/277, G10L13/08, G10L15/063, G10L15/187, G10L2015/0633

Abstract: An input signal that includes linguistic content in a first language may be received by a computing device. The linguistic content may include text or speech. The computing device may associate the linguistic content in the first language with one or more phonemes from a second language. The computing device may also determine a phonemic representation of the linguistic content in the first language based on use of the one or more phonemes from the second language. The phonemic representation may be indicative of a pronunciation of the linguistic content in the first language according to speech sounds of the second language.

59.

Granted invention patent
Method, device and system for providing language service (status: in force)

Publication No.: US09128930B2

Publication date: 2015-09-08

Application No.: US14563939

Filing date: 2014-12-08

Applicant: Tencent Technology (Shenzhen) Company Limited

Inventors: Yang Song, Bo Chen, Li Lu, Hao Ye

IPC classification: G06F17/28, G06F17/27, G06F17/21, G10L21/00, G10L15/26

CPC classification: G06F17/289, G10L13/086, G10L15/005, G10L15/26

Abstract: A method, device and system for providing a language service are disclosed. In some embodiments, the method is performed at a computer system having one or more processors and memory for storing programs to be executed by the one or more processors. The method includes receiving a first message from a client device. The method includes determining if the first message is in a first language or a second language different than the first language. The method includes translating the first message into a second message in the second language if the first message is in the first language. The method includes, alternatively, generating a third message in the second language if the first message is in the second language, where the third message includes a conversational response to the first message. The method further includes returning one of the second message and the third message to the client device.
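The branching described in the abstract above can be sketched as a small dispatcher. The language detector, translator, and reply generator are passed in as callables because the abstract does not specify them; the language tags "L1" and "L2" and all names below are hypothetical.

```python
def handle_message(message, detect_language, translate, reply,
                   first_lang="L1", second_lang="L2"):
    """Return the translation or a conversational reply, per the detected language.

    All three callables are assumed, caller-supplied components:
      detect_language(text) -> language tag
      translate(text)       -> text rendered in the second language
      reply(text)           -> conversational response in the second language
    """
    lang = detect_language(message)
    if lang == first_lang:
        # First-language input: return the "second message" (its translation).
        return translate(message)
    if lang == second_lang:
        # Second-language input: return the "third message" (a conversational reply).
        return reply(message)
    raise ValueError(f"unsupported language: {lang!r}")

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end.
    detect = lambda text: "L1" if text.startswith("[L1]") else "L2"
    to_l2 = lambda text: "translated: " + text[4:].strip()
    chat = lambda text: "reply to: " + text
    print(handle_message("[L1] some text", detect, to_l2, chat))
    print(handle_message("hello", detect, to_l2, chat))
```

Exactly one of the two branches produces the message returned to the client, which mirrors the abstract's "one of the second message and the third message".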

60.

Invention application
METHOD AND SYSTEM FOR VOICE SYNTHESIS (status: pending, published)

Publication No.: US20150149181A1

Publication date: 2015-05-28

Application No.: US14411952

Filing date: 2013-07-02

Applicants: CONTINENTAL AUTOMOTIVE FRANCE, CONTINENTAL AUTOMOTIVE GmbH

Inventor: Vincent Delahaye

IPC classification: G10L13/08, G10L13/06

CPC classification: G10L13/08, G10L13/06, G10L13/086

Abstract: Method and system for generating audio signals (9) representative of a text (3) to be converted, the method includes the steps of: providing a database (1) of acoustic units, identifying a list of pre-calculated expressions (10), and recording, for each pre-calculated expression, an acoustic frame (7) corresponding to it being pronounced, decomposing, by virtue of correlation calculations, each recorded acoustic frame into a sequenced table (5) including a series of acoustic unit references modulated by amplitude (α(i)A) and temporal (α(i)T) form factors, identifying in the text the pre-calculated expressions and decomposing the rest (12) into phonemes, inserting in place of each pre-calculated expression the corresponding sequenced table, and preparing a concatenation of acoustic units (19) according to the text to be converted.
