Hybrid approach in voice conversion
    1.
    Granted patent
    Hybrid approach in voice conversion (status: no longer in force)

    Publication number: US08224648B2

    Publication date: 2012-07-17

    Application number: US11966255

    Filing date: 2007-12-28

    CPC classification number: G10L21/00 G10L2021/0135

    Abstract: A hybrid approach is described for combining frequency warping and Gaussian Mixture Modeling (GMM) to achieve better speaker identity and speech quality. To train the voice conversion GMM model, line spectral frequency and other features are extracted from a set of source sounds to generate a source feature vector and from a set of target sounds to generate a target feature vector. The GMM model is estimated based on the aligned source feature vector and the target feature vector. A mixture-specific warping function is generated for each set of mixture mean pairs of the GMM model, and a warping function is generated based on a weighting of each of the mixture-specific warping functions. The warping function can be used to convert sounds received from a source speaker to approximate speech of a target speaker.
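
The weighting step in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the patent's claimed method: each mixture-specific warping function is taken to be a piecewise-linear map from the mixture's source mean LSFs to its target mean LSFs (in radians, with fixed endpoints 0 and pi), the weights are the GMM posteriors of the current source frame, and covariances are diagonal. The names `piecewise_linear_warp`, `posteriors`, and `combined_warp` are hypothetical.

```python
import math


def piecewise_linear_warp(src_anchors, tgt_anchors, f):
    """Warp frequency f (radians) through ascending anchor pairs.

    Assumption: the mixture-specific warp is piecewise linear between
    (source mean LSF, target mean LSF) pairs, pinned at 0 and pi.
    """
    xs = [0.0] + list(src_anchors) + [math.pi]
    ys = [0.0] + list(tgt_anchors) + [math.pi]
    for i in range(len(xs) - 1):
        if xs[i] <= f <= xs[i + 1]:
            span = xs[i + 1] - xs[i]
            t = 0.0 if span == 0 else (f - xs[i]) / span
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("frequency outside [0, pi]")


def posteriors(x, weights, means, variances):
    """Posterior of each diagonal-covariance mixture given source frame x."""
    log_likes = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_likes.append(ll)
    top = max(log_likes)  # subtract the maximum for numerical stability
    exps = [math.exp(ll - top) for ll in log_likes]
    total = sum(exps)
    return [e / total for e in exps]


def combined_warp(f, x, weights, src_means, tgt_means, variances):
    """Posterior-weighted sum of the mixture-specific warping functions."""
    post = posteriors(x, weights, src_means, variances)
    return sum(
        p * piecewise_linear_warp(s, t, f)
        for p, s, t in zip(post, src_means, tgt_means)
    )
```

With a single mixture the combined warp reduces to that mixture's own warp, so a source mean LSF maps exactly onto its paired target mean LSF.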

    Hybrid Approach in Voice Conversion
    2.
    Patent application
    Hybrid Approach in Voice Conversion (status: no longer in force)

    Publication number: US20090171657A1

    Publication date: 2009-07-02

    Application number: US11966255

    Filing date: 2007-12-28

    CPC classification number: G10L21/00 G10L2021/0135

    Abstract: A hybrid approach is described for combining frequency warping and Gaussian Mixture Modeling (GMM) to achieve better speaker identity and speech quality. To train the voice conversion GMM model, line spectral frequency and other features are extracted from a set of source sounds to generate a source feature vector and from a set of target sounds to generate a target feature vector. The GMM model is estimated based on the aligned source feature vector and the target feature vector. A mixture-specific warping function is generated for each set of mixture mean pairs of the GMM model, and a warping function is generated based on a weighting of each of the mixture-specific warping functions. The warping function can be used to convert sounds received from a source speaker to approximate speech of a target speaker.

    METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR PROVIDING IMPROVED SPEECH SYNTHESIS
    4.
    Patent application
    METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR PROVIDING IMPROVED SPEECH SYNTHESIS (status: no longer in force)

    Publication number: US20090299747A1

    Publication date: 2009-12-03

    Application number: US12475011

    Filing date: 2009-05-29

    CPC classification number: G10L13/04

    Abstract: An apparatus for providing improved speech synthesis may include a processor and a memory storing executable instructions. In response to execution of the instructions by the processor, the apparatus may perform at least selecting a real glottal pulse from among one or more stored real glottal pulses based at least in part on a property associated with the real glottal pulse, utilizing the real glottal pulse selected as a basis for generation of an excitation signal, and modifying the excitation signal based on spectral parameters generated by a model to provide synthetic speech.

    VOICE CONVERSION IN RING TONES AND OTHER FEATURES FOR A COMMUNICATION DEVICE
    5.
    Patent application
    VOICE CONVERSION IN RING TONES AND OTHER FEATURES FOR A COMMUNICATION DEVICE (status: under examination, published)

    Publication number: US20080161057A1

    Publication date: 2008-07-03

    Application number: US11963159

    Filing date: 2007-12-21

    CPC classification number: G10L13/033 G10L19/0018 G10L2021/0135

    Abstract: A voice conversion processing framework is operatively associated with the central processing unit (CPU) and audio processor of a communication device to convert default voice presentations generated by, for example, text readers, ring tone applications and the like, into target voice presentations based on selected target voice files stored in memory.

    Methods and apparatuses for facilitating speech synthesis
    6.
    Granted patent
    Methods and apparatuses for facilitating speech synthesis (status: in force)

    Publication number: US08781835B2

    Publication date: 2014-07-15

    Application number: US13099158

    Filing date: 2011-05-02

    CPC classification number: G10L13/02 G10L13/06

    Abstract: Methods and apparatuses are provided for facilitating speech synthesis. A method may include generating a plurality of input models representing an input by using a statistical model synthesizer to statistically model the input. The method may further include determining a speech unit sequence representing at least a portion of the input by using the input models to influence selection of one or more pre-recorded speech units having parameter representations. The method may additionally include identifying one or more bad units in the unit sequence. The method may also include replacing the identified one or more bad units with one or more parameters generated by the statistical model synthesizer. Corresponding apparatuses are also provided.
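
The replace-bad-units step can be sketched as follows. The abstract does not say how a bad unit is identified, so this sketch assumes, purely for illustration, that a selected unit is bad when its parameter vector lies farther than a threshold (Euclidean distance) from the parameters the statistical model generated for the same position. The names `distance`, `hybrid_sequence`, and `bad_threshold` are hypothetical.

```python
def distance(a, b):
    """Euclidean distance between two equal-length parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def hybrid_sequence(model_params, selected_units, bad_threshold):
    """Keep each pre-recorded unit unless it is judged bad.

    Assumed criterion (not from the abstract): a unit is bad when it is
    farther than bad_threshold from the model-generated parameters.
    Bad units are replaced by the model-generated parameters.
    """
    output = []
    for generated, unit in zip(model_params, selected_units):
        if distance(generated, unit) > bad_threshold:
            output.append(("model", generated))
        else:
            output.append(("unit", unit))
    return output
```

Each output entry is tagged with its origin, so a caller can see which positions fell back to the statistical model.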

    METHODS AND APPARATUSES FOR FACILITATING SPEECH SYNTHESIS
    7.
    Patent application
    METHODS AND APPARATUSES FOR FACILITATING SPEECH SYNTHESIS (status: in force)

    Publication number: US20120109654A1

    Publication date: 2012-05-03

    Application number: US13099158

    Filing date: 2011-05-02

    CPC classification number: G10L13/02 G10L13/06

    Abstract: Methods and apparatuses are provided for facilitating speech synthesis. A method may include generating a plurality of input models representing an input by using a statistical model synthesizer to statistically model the input. The method may further include determining a speech unit sequence representing at least a portion of the input by using the input models to influence selection of one or more pre-recorded speech units having parameter representations. The method may additionally include identifying one or more bad units in the unit sequence. The method may also include replacing the identified one or more bad units with one or more parameters generated by the statistical model synthesizer. Corresponding apparatuses are also provided.

    Method and Apparatus for Text Input
    9.
    Patent application
    Method and Apparatus for Text Input (status: under examination, published)

    Publication number: US20110154193A1

    Publication date: 2011-06-23

    Application number: US12643301

    Filing date: 2009-12-21

    CPC classification number: G06F17/276

    Abstract: In accordance with an example embodiment of the present invention, there is provided a method comprising receiving a first text input at a first point in time, providing a first completion candidate for the first text input, receiving a second text input at a second point in time, determining a time difference between the second point in time and the first point in time and providing a second completion candidate for the second text input based on at least the first completion candidate and the time difference.
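
One way to read this abstract is sketched below. The heuristic is an assumption, not taken from the patent: if the user waited at least `read_threshold` seconds before typing more, they are presumed to have seen and rejected the first candidate, so it is excluded. If they typed faster than that, the first candidate stays eligible. The names `complete`, `second_candidate`, and `read_threshold`, and the first-match lookup over an ordered vocabulary, are all hypothetical.

```python
def complete(prefix, vocabulary, excluded=()):
    """Return the first vocabulary word extending prefix, skipping excluded words."""
    for word in vocabulary:
        if word.startswith(prefix) and word != prefix and word not in excluded:
            return word
    return None


def second_candidate(second_input, first_candidate, dt_seconds, vocabulary,
                     read_threshold=0.5):
    """Choose the second completion from the first candidate and the time gap.

    Assumed heuristic: a gap of at least read_threshold seconds means the
    user saw the first candidate and typed on, so it is treated as rejected.
    """
    if first_candidate is not None and dt_seconds >= read_threshold:
        excluded = (first_candidate,)
    else:
        excluded = ()
    return complete(second_input, vocabulary, excluded)
```

For example, with the vocabulary `["hello", "help", "helmet"]`, typing "he" and then "hel" after a long pause skips "hello" and offers "help", while a quick follow-up keystroke still offers "hello".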
