Speech synthesis method, speech synthesis system, and speech synthesis program
    41.
    Granted invention patent
    Status: Expired

    Publication (announcement) number: US07668717B2

    Publication (announcement) date: 2010-02-23

    Application number: US10996401

    Filing date: 2004-11-26

    IPC classification: G10L13/00

    CPC classification: G10L13/06 G10L13/04

    Abstract: A speech synthesis system stores a group of speech units in a memory. For each segment obtained by segmenting the phoneme string of the target speech, it selects a plurality of speech units from the group based on prosodic information of the target speech, choosing the units that minimize the distortion of the synthetic speech generated from them relative to the target speech. It then fuses the selected speech units to generate a new speech unit for each segment, and generates synthetic speech by concatenating the new speech units.
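
Illustrative sketch (not taken from the patent): the select, fuse, and concatenate flow described in the abstract can be outlined in a few lines of Python. Everything here is an assumption made for illustration: the dictionary layout of a unit (`f0`, `dur`, `wave`), the absolute-difference prosody cost, the per-segment (rather than sequence-wide) selection, and fusion by pointwise averaging after linear resampling.

```python
from statistics import fmean


def resample(wave, length):
    """Linearly interpolate `wave` to exactly `length` samples."""
    if length == 1 or len(wave) == 1:
        return [wave[0]] * length
    step = (len(wave) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(wave) - 1)
        frac = pos - lo
        out.append(wave[lo] * (1 - frac) + wave[hi] * frac)
    return out


def select_units(candidates, target_f0, target_dur, n=2):
    """Pick the n candidates whose prosody is closest to the target.

    The cost (sum of absolute f0 and duration differences) is a
    hypothetical stand-in for the distortion measure in the abstract.
    """
    def cost(unit):
        return abs(unit["f0"] - target_f0) + abs(unit["dur"] - target_dur)
    return sorted(candidates, key=cost)[:n]


def fuse(units, length):
    """Fuse units by resampling to a common length and averaging pointwise."""
    waves = [resample(u["wave"], length) for u in units]
    return [fmean(column) for column in zip(*waves)]


def synthesize(segments, inventory, n=2):
    """Select, fuse, and concatenate one new unit per segment."""
    speech = []
    for seg in segments:
        chosen = select_units(inventory[seg["phoneme"]], seg["f0"], seg["dur"], n)
        speech.extend(fuse(chosen, seg["dur"]))
    return speech


inventory = {
    "a": [
        {"f0": 100, "dur": 2, "wave": [0.0, 2.0]},
        {"f0": 110, "dur": 2, "wave": [2.0, 4.0]},
        {"f0": 300, "dur": 2, "wave": [9.0, 9.0]},
    ]
}
segments = [{"phoneme": "a", "f0": 105, "dur": 2}]
result = synthesize(segments, inventory)
```

In this toy run the two units nearest the target prosody are averaged and the outlier at 300 Hz is ignored. A real system would work on pitch-synchronous waveforms and a sequence-level cost.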

Apparatus and method for voice conversion using attribute information
    42.
    Granted invention patent
    Status: In force

    Publication (announcement) number: US07580839B2

    Publication (announcement) date: 2009-08-25

    Application number: US11533122

    Filing date: 2006-09-19

    IPC classification: G10L13/00

    CPC classification: G10L13/033 G10L2021/0135

    Abstract: A speech processing apparatus according to an embodiment of the invention includes a conversion-source-speaker speech-unit database; a voice-conversion-rule-learning-data generating means; and a voice-conversion-rule learning means, with which it creates voice conversion rules. The voice-conversion-rule-learning-data generating means includes a conversion-target-speaker speech-unit extracting means; an attribute-information generating means; a conversion-source-speaker speech-unit database; and a conversion-source-speaker speech-unit selection means. The conversion-source-speaker speech-unit selection means selects the conversion-source-speaker speech units corresponding to the conversion-target-speaker speech units based on the mismatch between the attribute information of the conversion-target-speaker speech units and that of the conversion-source-speaker speech units. The voice conversion rules are then created from the selected pairs of conversion-target-speaker speech units and conversion-source-speaker speech units.
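
Illustrative sketch (not taken from the patent): the pairing step, in which each target-speaker unit is matched to the source-speaker unit with the smallest attribute mismatch, might look like the following. The attribute names (`f0`, `context`), the weights, and the cost definition (absolute difference for numeric attributes, 0/1 for categorical ones) are all assumptions. The rule-learning step that would consume the pairs is not shown.

```python
def mismatch(target_attr, source_attr, weights):
    """Weighted attribute mismatch between a target and a source unit.

    Numeric attributes contribute their absolute difference,
    categorical attributes contribute 1 when they differ.
    """
    total = 0.0
    for key, weight in weights.items():
        t, s = target_attr[key], source_attr[key]
        if isinstance(t, (int, float)):
            total += weight * abs(t - s)
        else:
            total += weight * (t != s)
    return total


def pair_units(target_units, source_units, weights):
    """Pair every target unit with its lowest-mismatch source unit."""
    pairs = []
    for t in target_units:
        best = min(
            source_units,
            key=lambda s: mismatch(t["attr"], s["attr"], weights),
        )
        pairs.append((t["id"], best["id"]))
    return pairs


weights = {"f0": 0.1, "context": 1.0}  # hypothetical weights
targets = [{"id": "t1", "attr": {"f0": 120.0, "context": "a-k"}}]
sources = [
    {"id": "s1", "attr": {"f0": 118.0, "context": "a-t"}},
    {"id": "s2", "attr": {"f0": 125.0, "context": "a-k"}},
]
pairs = pair_units(targets, sources, weights)
```

Here `s2` wins despite the larger f0 gap because its phonetic context matches, which shows how the weights trade one attribute against another.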

Accent information extracting apparatus and method thereof
    43.
    Invention patent application
    Status: Under examination (published)

    Publication (announcement) number: US20090043568A1

    Publication (announcement) date: 2009-02-12

    Application number: US12071390

    Filing date: 2008-02-20

    IPC classification: G10L11/04

    CPC classification: G10L13/06 G10L13/04 G10L17/26

    Abstract: An accent type is determined by outputting mora-synchronized signals, extracting a pitch pattern (the variation pattern of voice height, i.e. fundamental frequency) from a speech signal entered by a user, generating a mora-synchronized pattern from the pitch pattern and the mora-synchronized signals, storing typical patterns for the respective accent types, collating the mora-synchronized pattern with the reference accent patterns, calculating the degree of matching of the mora-synchronized pattern with respect to each accent type, and determining the accent type by referring to the matching.
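
Illustrative sketch (not taken from the application): one simple way to realize "collate with reference patterns and pick the best match" is nearest-template classification. The frame-based f0 input, the mean-removal normalization, the squared-distance score, and the template names are assumptions made for this example.

```python
from statistics import fmean


def mora_synchronized_pattern(f0_frames, boundaries):
    """Reduce a frame-level pitch contour to one relative level per mora.

    `boundaries` holds the frame indices where each mora starts, plus
    the end index. The mean is removed so that the pattern describes
    pitch movement rather than the speaker's absolute voice height.
    """
    levels = [fmean(f0_frames[a:b]) for a, b in zip(boundaries, boundaries[1:])]
    mean = fmean(levels)
    return [level - mean for level in levels]


def classify_accent(pattern, templates):
    """Return the best-matching accent type and all distances (lower is better)."""
    scores = {
        name: sum((p - q) ** 2 for p, q in zip(pattern, template))
        for name, template in templates.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical reference patterns for three-mora words.
templates = {
    "type_A": [20, -10, -10],
    "type_B": [-20, 20, 0],
    "type_C": [-15, 7.5, 7.5],
}
f0_frames = [100, 100, 140, 140, 120, 120]
pattern = mora_synchronized_pattern(f0_frames, [0, 2, 4, 6])
accent, scores = classify_accent(pattern, templates)
```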

SPEECH SYNTHESIS APPARATUS AND METHOD
    44.
    Invention patent application
    Status: In force

    Publication (announcement) number: US20070271099A1

    Publication (announcement) date: 2007-11-22

    Application number: US11745785

    Filing date: 2007-05-08

    IPC classification: G10L13/00

    Abstract: A waveform memory stores a plurality of speech unit waveforms. An information memory stores, in correspondence with each other, speech unit information and an address for each of the plurality of speech unit waveforms. A selector selects a speech unit sequence corresponding to the input phoneme sequence by referring to the speech unit information. A speech unit waveform acquisition unit acquires a speech unit waveform corresponding to each speech unit of the speech unit sequence from the waveform memory by referring to the address. A speech unit concatenation unit generates the speech by concatenating the acquired speech unit waveforms. The speech unit waveform acquisition unit acquires at least two speech unit waveforms, corresponding to at least two speech units included in the speech unit sequence, from a continuous region of the waveform memory during one access.
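
Illustrative sketch (not taken from the application): the idea of fetching several unit waveforms from a continuous memory region in one access can be shown by coalescing reads. The `(address, length)` representation and the rule used here (merge only units that are exactly adjacent and consecutive in the sequence) are simplifying assumptions.

```python
def plan_reads(units):
    """Merge consecutive units that sit back to back in waveform memory.

    `units` is the selected sequence as (address, length) pairs.
    Returns the list of (address, length) reads actually issued.
    """
    reads = []
    for addr, length in units:
        if reads and reads[-1][0] + reads[-1][1] == addr:
            prev_addr, prev_len = reads[-1]
            reads[-1] = (prev_addr, prev_len + length)  # extend the last read
        else:
            reads.append((addr, length))
    return reads


def fetch(memory, units):
    """Read all unit waveforms; return the data and the number of accesses."""
    reads = plan_reads(units)
    data = bytearray()
    for addr, length in reads:
        data += memory[addr:addr + length]
    return bytes(data), len(reads)


memory = bytes(range(16))            # stand-in for the waveform memory
units = [(0, 4), (4, 4), (12, 2)]    # three selected units
data, accesses = fetch(memory, units)
```

The first two units are adjacent, so three unit waveforms are obtained with two accesses instead of three.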

Pitch pattern generation method and its apparatus
    45.
    Invention patent application
    Status: Under examination (published)

    Publication (announcement) number: US20060271367A1

    Publication (announcement) date: 2006-11-30

    Application number: US11233021

    Filing date: 2005-09-23

    IPC classification: G10L13/00

    CPC classification: G10L13/10

    Abstract: A pitch pattern generation method is provided which enables generation of a stable pitch pattern with high naturalness. A pattern selection part 10 selects N pitch patterns 101 and M pitch patterns 103 for each prosody control unit from the pitch patterns stored in a pitch pattern storage part 14, based on language attribute information 100 obtained by analyzing a text and on phoneme duration 111. A pattern shape generation part 11 fuses the N selected pitch patterns 101 based on the language attribute information 100 to generate a fused pitch pattern, and expands or contracts the fused pitch pattern in the time axis direction in accordance with the phoneme duration 111 to generate a new pitch pattern 102. An offset control part 12 calculates a statistic of the offset values of the M selected pitch patterns 103 and deforms the pitch pattern 102 in accordance with that statistic to output a pitch pattern 104. A pattern connection part 13 connects the pitch patterns 104 generated for the prosody control units, smooths them so that no discontinuity occurs at the connection boundaries, and outputs a sentence pattern 121.
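
Illustrative sketch (not taken from the application): the four stages named in the abstract (shape fusion, time-axis scaling, offset control, and smoothed connection) can be mimicked on plain lists of pitch values. The specific choices below are assumptions: linear interpolation for scaling, pointwise averaging for fusion, the mean height as the "offset value", the average of the M means as the statistic, and a boundary smoother that pulls the two samples at each join halfway toward their midpoint.

```python
from statistics import fmean


def stretch(pattern, length):
    """Expand or contract a pattern to `length` points by linear interpolation."""
    if length == 1 or len(pattern) == 1:
        return [pattern[0]] * length
    step = (len(pattern) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(pattern) - 1)
        frac = pos - lo
        out.append(pattern[lo] * (1 - frac) + pattern[hi] * frac)
    return out


def fuse_shape(n_patterns, length):
    """Fuse the N shape patterns at the target duration (pointwise mean)."""
    stretched = [stretch(p, length) for p in n_patterns]
    return [fmean(column) for column in zip(*stretched)]


def apply_offset(pattern, m_patterns):
    """Shift the pattern so its mean equals the average mean of the M patterns."""
    target = fmean(fmean(p) for p in m_patterns)
    shift = target - fmean(pattern)
    return [value + shift for value in pattern]


def connect(patterns):
    """Concatenate patterns, softening the jump at each boundary."""
    patterns = [list(p) for p in patterns]
    for left, right in zip(patterns, patterns[1:]):
        mid = (left[-1] + right[0]) / 2
        left[-1] = (left[-1] + mid) / 2
        right[0] = (right[0] + mid) / 2
    return [value for p in patterns for value in p]


shape = fuse_shape([[0.0, 2.0], [2.0, 4.0]], length=3)
unit_pattern = apply_offset(shape, [[10.0, 12.0], [14.0, 16.0]])
sentence = connect([unit_pattern, [20.0, 20.0]])
```

Separating shape (from the N patterns) and height (from the M patterns) means that a larger M can stabilize the overall pitch level without flattening the shape.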

Pitch pattern generating method and pitch pattern generating apparatus
    46.
    Invention patent application
    Status: Under examination (published)

    Publication (announcement) number: US20060224380A1

    Publication (announcement) date: 2006-10-05

    Application number: US11385822

    Filing date: 2006-03-22

    IPC classification: G10L11/04

    CPC classification: G10L25/90

    Abstract: A pitch pattern generating method includes: preparing a memory that stores a plurality of pitch patterns, each extracted from natural speech, together with pattern attribute information corresponding to the pitch patterns; inputting language attribute information obtained by analyzing a text including prosody control units; selecting, from the pitch patterns stored in the memory, a group of pitch patterns corresponding to each of the prosody control units based on the language attribute information, to obtain a plurality of groups corresponding to the prosody control units respectively; generating a new pitch pattern corresponding to each of the prosody control units by fusing the pitch patterns of the group, to obtain a plurality of new pitch patterns corresponding to the prosody control units respectively; and generating a pitch pattern corresponding to the text based on the new pitch patterns.

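Speech synthesis method
    47.
    Granted invention patent
    Status: In force

    Publication (announcement) number: US06332121B1

    Publication (announcement) date: 2001-12-18

    Application number: US09722047

    Filing date: 2000-11-27

    IPC classification: G10L13/00

    CPC classification: G10L13/07 G10L25/90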

    Abstract: In a synthesis unit generator, a plurality of synthesis speech segments are generated by synthesizing training speech segments labeled with phonetic contexts and input speech segments while altering the pitch/duration of the input speech segments in accordance with the pitch/duration of the training speech segments. Typical speech segments are selected from the input speech segments on the basis of a distance between the synthesis speech segments and the training speech segments, and are stored in a storage. In addition, a plurality of phonetic context clusters corresponding to the synthesis units are generated on the basis of the distance, and are stored in a storage. A synthesis speech signal is generated by reading out, from the storage, those of the synthesis units which correspond to the phonetic context clusters including phonetic contexts of input phonemes, and connecting the selected synthesis units in a speech synthesizer.
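
Illustrative sketch (not taken from the patent): once a distance has been computed between each training segment and the synthesis segment produced from each candidate input segment, choosing typical segments and grouping contexts can be treated as a small facility-location problem. The greedy strategy, the precomputed `dist` matrix, and the use of training-segment indices to stand for their labeled phonetic contexts are all assumptions.

```python
def select_typical_segments(dist, k):
    """Greedily choose k typical segments and the clusters they serve.

    dist[i][j] is the distance between training segment j and the
    synthesis segment obtained from candidate i after its
    pitch/duration was altered to match j (assumed precomputed).
    """
    n_train = len(dist[0])
    best = [float("inf")] * n_train  # best distance so far per training segment
    chosen = []
    for _ in range(k):
        remaining = [i for i in range(len(dist)) if i not in chosen]

        def total_if_added(i):
            return sum(min(b, d) for b, d in zip(best, dist[i]))

        pick = min(remaining, key=total_if_added)
        chosen.append(pick)
        best = [min(b, d) for b, d in zip(best, dist[pick])]

    # Each training segment (i.e. its phonetic context) joins the
    # cluster of the chosen segment that reproduces it best.
    clusters = {i: [] for i in chosen}
    for j in range(n_train):
        owner = min(chosen, key=lambda i: dist[i][j])
        clusters[owner].append(j)
    return chosen, clusters


dist = [
    [1, 2, 9, 8],   # candidate 0 reproduces training segments 0 and 1 well
    [9, 8, 1, 3],   # candidate 1 reproduces training segments 2 and 3 well
    [6, 6, 6, 6],   # candidate 2 is mediocre everywhere
]
chosen, clusters = select_typical_segments(dist, k=2)
```

At synthesis time, a lookup from an input phoneme's context to its cluster would then return the stored typical segment for that cluster.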

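Speech synthesis method
    48.
    Granted invention patent

    Publication (announcement) number: US06240384B1

    Publication (announcement) date: 2001-05-29

    Application number: US08758772

    Filing date: 1996-12-03

    IPC classification: G10L13/00

    CPC classification: G10L13/07 G10L25/90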

    Abstract: In a synthesis unit generator, a plurality of synthesis speech segments are generated by synthesizing training speech segments labeled with phonetic contexts and input speech segments while altering the pitch/duration of the input speech segments in accordance with the pitch/duration of the training speech segments. Typical speech segments are selected from the input speech segments on the basis of a distance between the synthesis speech segments and the training speech segments, and are stored in a storage. In addition, a plurality of phonetic context clusters corresponding to the synthesis units are generated on the basis of the distance, and are stored in a storage. A synthesis speech signal is generated by reading out, from the storage, those of the synthesis units which correspond to the phonetic context clusters including phonetic contexts of input phonemes, and connecting the selected synthesis units in a speech synthesizer.

Storing a representative speech unit waveform for speech synthesis based on searching for similar speech units
    50.
    Granted invention patent
    Status: In force

    Publication (announcement) number: US08868422B2

    Publication (announcement) date: 2014-10-21

    Application number: US12880796

    Filing date: 2010-09-13

    Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text. The speech information includes phonologic information and prosody information. The method can divide the speech information into a plurality of speech units, based on at least one of the phonologic information and the prosody information. The method can search the plurality of speech units for at least two speech units in which at least one of the phonologic information and the prosody information is identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units into a memory as a representative speech unit.
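
Illustrative sketch (not taken from the patent): storing one representative waveform for a set of similar speech units amounts to deduplication with a similarity test. The test used here (identical phoneme and f0 within a tolerance) and the field names are assumptions. The abstract only requires that at least one of the phonologic and prosody information be identical or similar.

```python
def store_representatives(units, max_f0_diff=5.0):
    """Store one waveform per group of similar units.

    Returns `memory` (the stored representative waveforms) and `index`
    (for each input unit, the memory slot it should be rendered from).
    """
    memory = []
    representatives = []  # (phoneme, f0, slot) of each stored unit
    index = []
    for unit in units:
        slot = next(
            (
                s
                for phoneme, f0, s in representatives
                if phoneme == unit["phoneme"]
                and abs(f0 - unit["f0"]) <= max_f0_diff
            ),
            None,
        )
        if slot is None:  # nothing similar stored yet: keep this waveform
            slot = len(memory)
            memory.append(unit["wave"])
            representatives.append((unit["phoneme"], unit["f0"], slot))
        index.append(slot)
    return memory, index


units = [
    {"phoneme": "a", "f0": 100.0, "wave": [0.1, 0.2]},
    {"phoneme": "a", "f0": 103.0, "wave": [0.1, 0.3]},  # similar to the first
    {"phoneme": "k", "f0": 100.0, "wave": [0.5, 0.5]},
    {"phoneme": "a", "f0": 120.0, "wave": [0.2, 0.4]},  # same phoneme, f0 too far
]
memory, index = store_representatives(units)
```

Four units are served from three stored waveforms. A looser tolerance saves more memory at the cost of fidelity.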
