Patent search: cpc:"G10L13/07" (page 1)

1.

Patent application
NATURAL LANGUAGE PROCESSING TO MERGE RELATED ALERT MESSAGES FOR ACCESSIBILITY (status: under examination, published)

Publication No.: US20190198009A1

Publication date: 2019-06-27

Application No.: US16289263

Filing date: 2019-02-28

Applicant: International Business Machines Corporation

Inventors: Stephen A. Boxwell, Kyle M. Brake, Keith G. Frost, Stanley J. Vernier

IPC classes: G10L13/07, G06F17/27

CPC classes: G10L13/07, G06F17/271, G06Q30/0601

Abstract: A method for merging incoming alerts for accessibility is described. A first input alert and a second input alert intended for presentation by a screen reader are received. If the first input alert and the second input alert have arrived within a specified time interval, the first input alert and the second input alert are combined into an output alert. The output alert is sent to a screen reader for presentation.
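
The time-window merging described in this abstract can be illustrated with a minimal Python sketch. The names `Alert`, `merge_alerts`, and `window` are hypothetical and do not come from the patent, and plain string joining stands in for the natural language processing that the title refers to.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alert:
    text: str
    arrival: float  # arrival time in seconds (hypothetical field)


def merge_alerts(alerts: List[Alert], window: float = 2.0) -> List[str]:
    """Combine alerts that arrive within `window` seconds of the first
    alert in the current group into a single output alert."""
    merged: List[str] = []
    group: List[Alert] = []
    for alert in sorted(alerts, key=lambda a: a.arrival):
        # Close the current group once an alert falls outside the window.
        if group and alert.arrival - group[0].arrival > window:
            merged.append(" ".join(a.text for a in group))
            group = []
        group.append(alert)
    if group:
        merged.append(" ".join(a.text for a in group))
    return merged
```

Each string in the returned list would then be handed to the screen reader as one output alert; an alert that has no neighbor inside the window passes through unchanged.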

2.

Patent application
TEXT-BASED INSERTION AND REPLACEMENT IN AUDIO NARRATION (status: under examination, published)

Publication No.: US20190130894A1

Publication date: 2019-05-02

Application No.: US15796292

Filing date: 2017-10-27

Applicants: Adobe Inc., The Trustees of Princeton University

Inventors: Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein

IPC classes: G10L13/08, G10L13/07, G10L13/04, G10L15/02

CPC classes: G10L13/08, G06F17/24, G10L13/00, G10L13/04, G10L13/06, G10L13/07, G10L15/02, G10L21/00, G10L2021/0135, G11B27/022

Abstract: Systems and techniques are disclosed for synthesizing a new word or short phrase such that it blends seamlessly in the context of insertion or replacement in an existing narration. In one such embodiment, a text-to-speech synthesizer is utilized to say the word or phrase in a generic voice. Voice conversion is then performed on the generic voice to convert it into a voice that matches the narration. An editor and interface are described that support fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and guidance by the editor's own voice.

3.

Granted patent
System and method for distributed voice models across cloud and device for embedded text-to-speech (status: in force)

Publication No.: US09761218B2

Publication date: 2017-09-12

Application No.: US14953771

Filing date: 2015-11-30

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Benjamin J. Stern, Mark Charles Beutnagel, Alistair D. Conkie, Horst J. Schroeter, Amanda Joy Stent

IPC classes: G10L13/07, G10L13/04, G10L13/047

CPC classes: G10L13/04, G10L13/047, G10L13/07

Abstract: Systems, methods, and computer-readable storage media for intelligent caching of concatenative speech units for use in speech synthesis. A system configured to practice the method can identify, in a local cache of text-to-speech units for a text-to-speech voice, an absent text-to-speech unit which is not in the local cache. The system can request the absent text-to-speech unit from a server. The system can then synthesize speech using the text-to-speech units and a received text-to-speech unit from the server.
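
A minimal Python sketch of the cache-miss flow in this abstract follows. The function `synthesize`, the `fetch_from_server` callback, and the representation of units as byte strings are assumptions made for illustration; the patent does not specify an API, and real concatenative synthesis would smooth the joins rather than append raw bytes.

```python
from typing import Callable, Dict, List


def synthesize(
    unit_ids: List[str],
    local_cache: Dict[str, bytes],
    fetch_from_server: Callable[[str], bytes],
) -> bytes:
    """Concatenate speech units, requesting any unit that is absent
    from the local cache from the server (hypothetical callback)."""
    waveform = bytearray()
    for unit_id in unit_ids:
        if unit_id not in local_cache:
            # Absent unit: request it once and keep it for later use.
            local_cache[unit_id] = fetch_from_server(unit_id)
        waveform += local_cache[unit_id]
    return bytes(waveform)
```

Storing the fetched unit in `local_cache` is a design choice of this sketch: a repeated unit costs only one server request.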

4.

Patent application
SYSTEMS AND METHODS FOR GENERATING SPEECH OF MULTIPLE STYLES FROM TEXT (status: under examination, published)

Publication No.: US20170186418A1

Publication date: 2017-06-29

Application No.: US15308731

Filing date: 2014-06-05

Applicant: Nuance Communications, Inc.

Inventors: Paolo Mairano, Corinne Bos-Plachez, Sourav Nandy, Johan Wouters, Silvia Maria Antonella Quazza, Dong-Jian Yue

IPC classes: G10L13/10, G10L13/047, G10L13/07

CPC classes: G10L13/10, G10L13/047, G10L13/07, G10L13/08

Abstract: A text-to-speech (TTS) system includes components capable of supporting the generation of speech output in any of multiple styles, and may switch seamlessly from producing speech output in one style to producing speech output in another style. For example, a concatenative TTS system may include a speech base storing speech units associated with multiple speech styles, and a linguistic analysis component to generate a phonetic transcription specifying speech output in any of multiple styles. Text input may include a style indication associated with a particular segment of the input text. The linguistic analysis component may invoke encoded rules and/or components based upon the style indication, and generate a phonetic transcription specifying a speech style, which may be processed to generate output speech.

5.

Patent application
SYSTEMS AND METHODS FOR MULTI-STYLE SPEECH SYNTHESIS (status: in force)

Publication No.: US20160093289A1

Publication date: 2016-03-31

Application No.: US14499444

Filing date: 2014-09-29

Applicant: Nuance Communications, Inc.

Inventor: Vincent Pollet

IPC classes: G10L13/08, G10L13/047, G10L13/027

CPC classes: G10L13/027, G10L13/047, G10L13/07, G10L13/08

Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a first speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identified plurality of speech segments comprising a first speech segment having the first speaking style and a second speech segment having a second speaking style different from the first speaking style; and rendering the text as speech having the first speaking style, at least in part, by using the identified plurality of speech segments.

6.

Granted patent
Speech synthesis system, speech synthesis program product, and speech synthesis method (status: in force)

Publication No.: US09275631B2

Publication date: 2016-03-01

Application No.: US13731268

Filing date: 2012-12-31

Applicant: Nuance Communications, Inc.

Inventors: Ryuki Tachibana, Masafumi Nishimura

IPC classes: G10L13/10, G10L13/00, G10L13/07

CPC classes: G10L13/00, G10L13/07, G10L13/10

Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of two paths of the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that can increase the likelihood of absolute values or variations of the prosody to the statistical model as high as possible with minimum modification values.

7.

Granted patent
Syllable based speech processing method (status: in force)

Publication No.: US09147393B1

Publication date: 2015-09-29

Application No.: US13767987

Filing date: 2013-02-15

Applicant: Boris Fridman-Mintz

Inventor: Boris Fridman-Mintz

IPC classes: G10L13/08, G10L15/00, G10L13/00, G10L15/04, G10L13/07, G10L13/10, G10L13/06, G10L13/047, G10L13/04, G10L13/02, G10L13/033

CPC classes: G10L15/04, G10L13/00, G10L13/02, G10L13/033, G10L13/04, G10L13/047, G10L13/06, G10L13/07, G10L13/08, G10L13/10, G10L15/02, G10L25/18, G10L2015/027

Abstract: Speech is modeled as a cognitively-driven sensory-motor activity where the form of speech is the result of categorization processes that any given subject recreates by focusing on creating sound patterns that are represented by syllables. These syllables are then combined in characteristic patterns to form words, which are, in turn, combined in characteristic patterns to form utterances. A speech recognition process first identifies syllables in an electronic waveform representing ongoing speech. The pattern of syllables is then deconstructed into a standard form that is used to identify words. The words are then concatenated to identify an utterance. Similarly, a speech synthesis process converts written words into patterns of syllables. The pattern of syllables is then processed to produce the characteristic rhythmic sound of naturally spoken words. The words are then assembled into an utterance, which is also processed to produce natural-sounding speech.

8.

Granted patent
Pre-saved data compression for TTS concatenation cost (status: in force)

Publication No.: US08798998B2

Publication date: 2014-08-05

Application No.: US12754045

Filing date: 2010-04-05

Applicants: Huicheng Song, Guoliang Zhang, Zhiwei Weng

Inventors: Huicheng Song, Guoliang Zhang, Zhiwei Weng

IPC classes: G10L13/00, G10L13/08

CPC classes: G10L13/07

Abstract: Pre-saved concatenation cost data is compressed through speech segment grouping. Speech segments are assigned to a predefined number of groups based on their concatenation cost values with other speech segments. A representative segment is selected for each group. The concatenation cost between two segments in different groups may then be approximated by that between the representative segments of their respective groups, thereby reducing the amount of concatenation cost data to be pre-saved.
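
The approximation in this abstract can be sketched in Python as follows. The grouping itself is taken as given, and the names `build_group_table`, `approx_cost`, `group_of`, and `representative` are hypothetical. Computing the cost directly for two segments in the same group is an assumption of this sketch; the abstract only addresses segments in different groups.

```python
from typing import Callable, Dict, Hashable, Tuple

Segment = Hashable
CostFn = Callable[[Segment, Segment], float]


def build_group_table(
    representative: Dict[int, Segment], cost: CostFn
) -> Dict[Tuple[int, int], float]:
    """Pre-save costs only between the representative segments of
    distinct groups, instead of between every pair of segments."""
    return {
        (g, h): cost(representative[g], representative[h])
        for g in representative
        for h in representative
        if g != h
    }


def approx_cost(
    a: Segment,
    b: Segment,
    group_of: Dict[Segment, int],
    table: Dict[Tuple[int, int], float],
    cost: CostFn,
) -> float:
    """Approximate the concatenation cost of (a, b) by the pre-saved
    cost between the representatives of their groups."""
    ga, gb = group_of[a], group_of[b]
    if ga == gb:
        # Assumption: same-group pairs are computed directly.
        return cost(a, b)
    return table[(ga, gb)]
```

With N segments in K groups, the pre-saved table holds K(K-1) ordered entries rather than N(N-1), which is where the reduction in stored data comes from.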

9.

Patent application
ACCESSIBILITY TECHNIQUES FOR PRESENTATION OF SYMBOLIC EXPRESSIONS (status: in force)

Publication No.: US20140210828A1

Publication date: 2014-07-31

Application No.: US13750199

Filing date: 2013-01-25

Applicant: APPLE INC.

Inventors: Christopher B. Fleizach, Eric T. Seymour, Gregory F. Hughes, Mike Pedersen

IPC classes: G06T11/60, G06F3/0488

CPC classes: G06T11/60, G06F3/04842, G06F3/0488, G06F3/04883, G06F3/04886, G06T11/203, G06T2200/24, G10L13/07, G10L13/08

Abstract: Methods for presenting symbolic expressions such as mathematical, scientific, or chemical expressions, formulas, or equations are performed by a computing device. One method includes: displaying a first portion of a symbolic expression within a first area of a display screen; while in a first state in which the first area is selected for aural presentation, aurally presenting first information related to the first portion of the symbolic expression; while in the first state, detecting particular user input; in response to detecting the particular user input, performing the steps of: transitioning from the first state to a second state in which a second area, of the display, is selected for aural presentation; determining second information associated with a second portion, of the symbolic expression, that is displayed within the second area; and, in response to determining the second information, aurally presenting the second information.

10.

Granted patent
Speech synthesis apparatus and method wherein more than one speech unit is acquired from continuous memory region by one access (status: in force)

Publication No.: US08468020B2

Publication date: 2013-06-18

Application No.: US11745785

Filing date: 2007-05-08

Applicant: Takehiko Kagoshima

Inventor: Takehiko Kagoshima

IPC classes: G10L13/04, G10L13/06, G10L13/00

CPC classes: G10L13/02, G10L13/04, G10L13/047, G10L13/06, G10L13/07, G10L13/08, G10L15/063, G11C7/16

Abstract: An apparatus for synthesizing speech includes a waveform memory that stores a plurality of speech unit waveforms, an information memory that correspondingly stores speech unit information and an address of each of the speech unit waveforms, a selector that selects a speech unit sequence corresponding to an input phoneme sequence by referring to the speech unit information, a speech unit waveform acquisition unit that acquires a speech unit waveform corresponding to each speech unit of the speech unit sequence from the waveform memory by referring to the address, and a speech unit concatenation unit that generates the speech by concatenating the acquired speech unit waveforms.
