WAVEFORM PROCESSING DEVICE, WAVEFORM PROCESSING METHOD, AND WAVEFORM PROCESSING PROGRAM
    1.
    Invention application (status: in force)

    Publication No.: US20140136192A1

    Publication date: 2014-05-15

    Application No.: US14131460

    Filing date: 2012-06-26

    IPC classification: G10L25/90

    Abstract: There is provided a waveform processing device that changes the power of each pitch waveform of a segment in order to obtain natural synthesized speech. A power calculation means 71 selects pitch waveforms one by one from a group of pitch waveforms corresponding to a segment, and calculates a scalar indicating the power of the selected pitch waveform. A normalization degree calculation means 72 calculates a degree of normalization, which is an index indicating the degree to which the selected pitch waveform is normalized, as the value of an increasing function that takes the scalar as its variable. A change coefficient calculation means 73 calculates, based on the scalar and the degree of normalization, a change coefficient for changing the amplitude values of the selected pitch waveform. An amplitude change means 74 multiplies the amplitude value at each sampling point of the selected pitch waveform by the change coefficient.
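The four means in this abstract (power, degree of normalization, change coefficient, amplitude change) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: power is taken as mean squared amplitude, the increasing function is `p / (p + knee)`, and the coefficient blends "leave unchanged" with "scale to `target_power`". The names `knee` and `target_power` are hypothetical.

```python
import math

def power(waveform):
    """Scalar power of one pitch waveform (assumed: mean squared amplitude)."""
    if not waveform:
        return 0.0
    return sum(x * x for x in waveform) / len(waveform)

def normalization_degree(p, knee=0.01):
    """Increasing function of the power scalar, ranging over [0, 1).

    The shape p / (p + knee) is an assumption; the abstract only requires
    that the function be increasing in the scalar.
    """
    return p / (p + knee)

def change_coefficient(p, degree, target_power=0.05):
    """Blend between no change (1.0) and full normalization to target_power."""
    if p == 0.0:
        return 1.0  # a silent waveform is left untouched
    full = math.sqrt(target_power / p)
    return (1.0 - degree) + degree * full

def process_segment(pitch_waveforms):
    """Apply the four steps to each pitch waveform of a segment, one by one."""
    processed = []
    for waveform in pitch_waveforms:
        p = power(waveform)
        degree = normalization_degree(p)
        coeff = change_coefficient(p, degree)
        processed.append([x * coeff for x in waveform])
    return processed
```

Under these assumptions a low-power pitch waveform receives a degree near 0 and is left almost unchanged, while a high-power one is pulled toward the target power.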

    SPEECH SYNTHESIZING APPARATUS, METHOD, AND PROGRAM
    2.
    Invention application (status: in force)

    Publication No.: US20100076768A1

    Publication date: 2010-03-25

    Application No.: US12527802

    Filing date: 2008-02-15

    IPC classification: G10L13/00

    CPC classification: G10L13/06 G10L13/04

    Abstract: Disclosed is a speech synthesizing apparatus including a segment selection unit that selects, from candidate segments, a segment suited to a target segment environment. The apparatus includes a prosody change amount calculation unit that calculates a prosody change amount for each candidate segment based on prosody information of the candidate segments and the target segment environment, a selection criterion calculation unit that calculates a selection criterion based on the prosody change amount, a candidate selection unit that narrows down the selection candidates based on the prosody change amount and the selection criterion, and an optimum segment search unit that searches for an optimum segment from among the narrowed-down candidate segments.
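The pipeline in this abstract (change amount, criterion, narrowing, search) could look roughly like the Python sketch below. The abstract leaves the measures unspecified, so the following are assumptions: the change amount is the summed relative pitch and duration difference, the criterion is a quantile threshold over those amounts, and the final search minimizes a hypothetical per-candidate `unit_cost`. The `Candidate` fields and the `keep_ratio` parameter are invented for the example, and a non-empty candidate list is assumed.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    pitch_hz: float
    duration_ms: float
    unit_cost: float  # hypothetical cost supplied by some other model

def prosody_change_amount(c, target_pitch_hz, target_duration_ms):
    """Relative pitch plus duration change needed to reach the target (assumed measure)."""
    return (abs(target_pitch_hz - c.pitch_hz) / target_pitch_hz
            + abs(target_duration_ms - c.duration_ms) / target_duration_ms)

def selection_criterion(amounts, keep_ratio=0.5):
    """Threshold derived from the change amounts (assumed: a quantile cut)."""
    ranked = sorted(amounts)
    k = max(1, int(len(ranked) * keep_ratio))
    return ranked[k - 1]

def select_segment(candidates, target_pitch_hz, target_duration_ms):
    """Narrow the candidates by change amount, then search the survivors."""
    amounts = [prosody_change_amount(c, target_pitch_hz, target_duration_ms)
               for c in candidates]
    threshold = selection_criterion(amounts)
    narrowed = [c for c, a in zip(candidates, amounts) if a <= threshold]
    return min(narrowed, key=lambda c: c.unit_cost)
```

Pruning before the search keeps the more expensive optimum-segment search away from candidates that would need heavy prosody modification.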

    Prosody generator, speech synthesizer, prosody generating method and prosody generating program
    3.
    Granted invention patent (status: in force)

    Publication No.: US09324316B2

    Publication date: 2016-04-26

    Application No.: US14004148

    Filing date: 2012-05-10

    CPC classification: G10L13/027 G10L13/10

    Abstract: There is provided a prosody generator that generates prosody information for implementing highly natural speech synthesis without unnecessarily collecting large quantities of learning data. A data dividing means 81 divides into subspaces the data space of a learning database, which is an assembly of learning data indicative of the feature quantities of speech waveforms. A density information extracting means 82 extracts density information indicative of the density state, in terms of information quantity, of the learning data in each of the subspaces produced by the data dividing means 81. A prosody information generating method selecting means 83 selects, based on the density information, either a first method or a second method as the prosody information generating method; the first method generates the prosody information using a statistical technique, and the second method generates it using rules based on heuristics.
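The density-driven choice between the two generating methods can be sketched in Python as follows. This is only an illustration under stated assumptions: the data space is one-dimensional and split into uniform bins, the density information is approximated by the number of learning samples per bin (the abstract speaks of information quantity, which may be defined differently), and `min_samples` is a hypothetical threshold.

```python
from collections import Counter

def subspace_index(feature, bin_width):
    """Map a feature value to its subspace (assumed: uniform 1-D bins)."""
    return int(feature // bin_width)

def density_info(features, bin_width):
    """Density per subspace, approximated here by the sample count."""
    return Counter(subspace_index(f, bin_width) for f in features)

def choose_method(query, density, bin_width, min_samples=3):
    """Pick the statistical method where data is dense, rules where it is sparse."""
    count = density.get(subspace_index(query, bin_width), 0)
    return "statistical" if count >= min_samples else "rule_based"
```

For example, with learning features `[1.0, 1.2, 1.5, 1.9, 7.3]` and `bin_width=2.0`, a query of `1.1` falls in a bin holding four samples and is handled statistically, while a query of `7.0` falls in a bin with a single sample and falls back to the rules.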

    Speech synthesis device, speech synthesis method, and speech synthesis program
    4.
    Granted invention patent (status: in force)

    Publication No.: US08407054B2

    Publication date: 2013-03-26

    Application No.: US12599317

    Filing date: 2008-04-28

    IPC classification: G10L13/06

    CPC classification: G10L13/10

    Abstract: A speech synthesis device is provided with: a central segment selection unit for selecting a central segment from among a plurality of speech segments; a prosody generation unit for generating prosody information based on the central segment; a non-central segment selection unit for selecting a non-central segment, which is a segment outside of a central segment section, based on the central segment and the prosody information; and a waveform generation unit for generating a synthesized speech waveform based on the prosody information, the central segment, and the non-central segment. The speech synthesis device first selects a central segment that forms a basis for prosody generation and generates prosody information based on the central segment so that it is possible to sufficiently reduce both concatenation distortion and sound quality degradation accompanying prosody control in the section of the central segment.

    SPEECH SYNTHESIS DEVICE, SPEECH SYNTHESIS METHOD, AND SPEECH SYNTHESIS PROGRAM
    5.
    Invention application (status: in force)

    Publication No.: US20100211393A1

    Publication date: 2010-08-19

    Application No.: US12599317

    Filing date: 2008-04-28

    IPC classification: G10L13/06 G10L13/08

    CPC classification: G10L13/10

    Abstract: A speech synthesis device is provided with: a central segment selection unit for selecting a central segment from among a plurality of speech segments; a prosody generation unit for generating prosody information based on the central segment; a non-central segment selection unit for selecting a non-central segment, which is a segment outside of a central segment section, based on the central segment and the prosody information; and a waveform generation unit for generating a synthesized speech waveform based on the prosody information, the central segment, and the non-central segment. The speech synthesis device first selects a central segment that forms a basis for prosody generation and generates prosody information based on the central segment so that it is possible to sufficiently reduce both concatenation distortion and sound quality degradation accompanying prosody control in the section of the central segment.

    Speech synthesizing apparatus, method, and program
    6.
    Granted invention patent (status: in force)

    Publication No.: US08630857B2

    Publication date: 2014-01-14

    Application No.: US12527802

    Filing date: 2008-02-15

    IPC classification: G10L13/06

    CPC classification: G10L13/06 G10L13/04

    Abstract: Disclosed is a speech synthesizing apparatus including a segment selection unit that selects, from candidate segments, a segment suited to a target segment environment. The apparatus includes a prosody change amount calculation unit that calculates a prosody change amount for each candidate segment based on prosody information of the candidate segments and the target segment environment, a selection criterion calculation unit that calculates a selection criterion based on the prosody change amount, a candidate selection unit that narrows down the selection candidates based on the prosody change amount and the selection criterion, and an optimum segment search unit that searches for an optimum segment from among the narrowed-down candidate segments.

    PROSODY GENERATOR, SPEECH SYNTHESIZER, PROSODY GENERATING METHOD AND PROSODY GENERATING PROGRAM
    7.
    Invention application (status: in force)

    Publication No.: US20140012584A1

    Publication date: 2014-01-09

    Application No.: US14004148

    Filing date: 2012-05-10

    IPC classification: G10L13/027

    CPC classification: G10L13/027 G10L13/10

    Abstract: There is provided a prosody generator that generates prosody information for implementing highly natural speech synthesis without unnecessarily collecting large quantities of learning data. A data dividing means 81 divides into subspaces the data space of a learning database, which is an assembly of learning data indicative of the feature quantities of speech waveforms. A density information extracting means 82 extracts density information indicative of the density state, in terms of information quantity, of the learning data in each of the subspaces produced by the data dividing means 81. A prosody information generating method selecting means 83 selects, based on the density information, either a first method or a second method as the prosody information generating method; the first method generates the prosody information using a statistical technique, and the second method generates it using rules based on heuristics.

    SPEECH SYNTHESIS SYSTEM
    8.
    Invention application (status: in force)

    Publication No.: US20110106538A1

    Publication date: 2011-05-05

    Application No.: US13000342

    Filing date: 2009-06-22

    IPC classification: G10L13/00

    CPC classification: G10L13/08 G10L15/30

    Abstract: This speech synthesis system includes a server device and a client device. The client device accepts text information representing text, and transmits a speech element request to the server device. The server device stores speech element information. The server device receives the speech element request transmitted by the client device and, in response to the received speech element request, transmits speech element information to the client device so that the speech element information is received by the client device in a different order from an order of arrangement of speech elements in speech corresponding to the text. The client device executes a speech synthesis process by rearranging the speech element information so that speech elements represented by the received speech element information are arranged in the same order as the order of arrangement of the speech elements in the speech corresponding to the text.
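The out-of-order transfer and client-side rearrangement in this abstract can be pictured with a small Python sketch. The abstract does not say how the client learns the intended order, so tagging each speech element with its position is an assumption, as are the function names and the use of a seeded shuffle to stand in for the server's reordering.

```python
import random

def server_respond(speech_elements, seed=0):
    """Return (position, element) pairs in an order that differs from speech order.

    Tagging each element with its position is an assumed mechanism; the
    seeded shuffle merely stands in for whatever order the server chooses.
    """
    tagged = list(enumerate(speech_elements))
    random.Random(seed).shuffle(tagged)
    return tagged

def client_rearrange(received):
    """Restore the order of arrangement that matches the text."""
    return [element for _, element in sorted(received, key=lambda pair: pair[0])]
```

A round trip such as `client_rearrange(server_respond(["ko", "n", "ni", "chi", "wa"]))` returns the elements in their original speech order, whatever order they were received in.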

    Waveform processing device, waveform processing method, and waveform processing program
    9.
    Granted invention patent (status: in force)

    Publication No.: US09443538B2

    Publication date: 2016-09-13

    Application No.: US14131460

    Filing date: 2012-06-26

    Abstract: There is provided a waveform processing device that changes the power of each pitch waveform of a segment in order to obtain natural synthesized speech. A power calculation means 71 selects pitch waveforms one by one from a group of pitch waveforms corresponding to a segment, and calculates a scalar indicating the power of the selected pitch waveform. A normalization degree calculation means 72 calculates a degree of normalization, which is an index indicating the degree to which the selected pitch waveform is normalized, as the value of an increasing function that takes the scalar as its variable. A change coefficient calculation means 73 calculates, based on the scalar and the degree of normalization, a change coefficient for changing the amplitude values of the selected pitch waveform. An amplitude change means 74 multiplies the amplitude value at each sampling point of the selected pitch waveform by the change coefficient.

    Speech synthesis system for generating speech information obtained by converting text into speech
    10.
    Granted invention patent (status: in force)

    Publication No.: US08606583B2

    Publication date: 2013-12-10

    Application No.: US13000342

    Filing date: 2009-06-22

    IPC classification: G10L13/08

    CPC classification: G10L13/08 G10L15/30

    Abstract: This speech synthesis system includes a server device and a client device. The client device accepts text information representing text, and transmits a speech element request to the server device. The server device stores speech element information. The server device receives the speech element request transmitted by the client device and, in response to the received speech element request, transmits speech element information to the client device so that the speech element information is received by the client device in a different order from an order of arrangement of speech elements in speech corresponding to the text. The client device executes a speech synthesis process by rearranging the speech element information so that speech elements represented by the received speech element information are arranged in the same order as the order of arrangement of the speech elements in the speech corresponding to the text.