Voice converter with extraction and modification of attribute data

    Publication No.: US07149682B2

    Publication date: 2006-12-12

    Application No.: US10282754

    Filing date: 2002-10-29

    IPC classification: G10L21/00 G10H7/00

    Abstract: An apparatus is constructed for converting an input voice signal into an output voice signal according to a target voice signal. In the apparatus, an input device provides the input voice signal composed of original sinusoidal components and original residual components other than the original sinusoidal components. An extracting device extracts original attribute data from at least the sinusoidal components of the input voice signal. The original attribute data is characteristic of the input voice signal. A synthesizing device synthesizes new attribute data based on both the original attribute data derived from the input voice signal and target attribute data characteristic of the target voice signal, which is composed of target sinusoidal components and target residual components other than the target sinusoidal components. The target attribute data is derived from at least the target sinusoidal components. An output device operates based on the new attribute data and either the original residual components or the target residual components to produce the output voice signal.
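
The data flow in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which attributes are extracted or how the new attribute data is synthesized, so the `Attributes` fields (pitch and amplitude), the linear blend, and the names `blend_attributes` and `convert_frame` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Attributes:
    """Hypothetical attribute data taken from the sinusoidal components."""
    pitch_hz: float
    amplitude: float


def blend_attributes(original, target, weight):
    """Assumed synthesis rule: a linear blend of original and target attributes.

    weight = 0.0 keeps the original attributes, weight = 1.0 adopts the target's.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    def mix(a, b):
        return (1.0 - weight) * a + weight * b

    return Attributes(
        pitch_hz=mix(original.pitch_hz, target.pitch_hz),
        amplitude=mix(original.amplitude, target.amplitude),
    )


def convert_frame(original_attrs, target_attrs, original_residual,
                  target_residual, weight=0.5, use_target_residual=False):
    """Return the new attribute data plus whichever residual the output uses."""
    new_attrs = blend_attributes(original_attrs, target_attrs, weight)
    # The abstract allows either residual; here the caller chooses.
    residual = target_residual if use_target_residual else original_residual
    return new_attrs, residual
```

An output stage would then resynthesize the sinusoidal components from the new attribute data and add the selected residual; that step is omitted here.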

    2. Voice converter with extraction and modification of attribute data
    Granted patent (lapsed)

    Publication No.: US07606709B2

    Publication date: 2009-10-20

    Application No.: US10282536

    Filing date: 2002-10-29

    IPC classification: G10L13/00 G10L21/00

    Abstract: An apparatus is constructed for converting an input voice signal into an output voice signal according to a target voice signal. In the apparatus, an input device provides the input voice signal composed of original sinusoidal components and original residual components other than the original sinusoidal components. An extracting device extracts original attribute data from at least the sinusoidal components of the input voice signal. The original attribute data is characteristic of the input voice signal. A synthesizing device synthesizes new attribute data based on both the original attribute data derived from the input voice signal and target attribute data characteristic of the target voice signal, which is composed of target sinusoidal components and target residual components other than the target sinusoidal components. The target attribute data is derived from at least the target sinusoidal components. An output device operates based on the new attribute data and either the original residual components or the target residual components to produce the output voice signal.


    3. Voice converter for assimilation by frame synthesis with temporal alignment
    Patent application (lapsed)

    Publication No.: US20050049875A1

    Publication date: 2005-03-03

    Application No.: US10951328

    Filing date: 2004-09-27

    IPC classification: G10L13/02 G10L21/00 G10L13/00

    CPC classification: G10L13/033 G10L2021/0135

    Abstract: A voice converting apparatus is constructed for converting an input voice into an output voice according to a target voice. In the apparatus, a storage section provisionally stores source data, which is associated with and extracted from the target voice. An analyzing section analyzes the input voice to extract therefrom a series of input data frames representing the input voice. A producing section produces a series of target data frames representing the target voice based on the source data, while aligning the target data frames with the input data frames to secure synchronization between the target data frames and the input data frames. A synthesizing section synthesizes the output voice according to the target data frames and the input data frames. In the recognizing feature analysis, a characteristic analyzer extracts from the input voice a characteristic vector. A memory memorizes target behavior data representing a behavior of the target voice. An alignment processor determines a temporal relation between the input data frames and the target data frames according to the characteristic vector and the target behavior data so as to output alignment data. A target decoder produces the target data frames according to the alignment data, the input data frames and the source data containing phoneme of the target voice.
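
The abstract does not name the alignment algorithm. As one plausible illustration, the sketch below uses dynamic time warping (DTW), a swapped-in standard technique rather than the patent's stated method, to find a temporal relation between input frames and target frames. Each frame is reduced to a single scalar feature for brevity; the function names and the scalar features are assumptions.

```python
def dtw_align(input_feats, target_feats):
    """Return a DTW warping path as a list of (input_index, target_index) pairs."""
    n, m = len(input_feats), len(target_feats)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning the first i input
    # frames with the first j target frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(input_feats[i - 1] - target_feats[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def target_index_per_input_frame(path, num_input_frames):
    """Pick one target frame per input frame (the last one paired with it)."""
    mapping = [0] * num_input_frames
    for i, j in path:
        mapping[i] = j  # later pairs overwrite earlier ones
    return mapping
```

With such a mapping, a target frame can be produced in step with every input frame, which is the synchronization the abstract asks for.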


    5. Voice converter for assimilation by frame synthesis with temporal alignment
    Granted patent (lapsed)

    Publication No.: US06836761B1

    Publication date: 2004-12-28

    Application No.: US09693144

    Filing date: 2000-10-20

    IPC classification: G10L13/00

    CPC classification: G10L13/033 G10L2021/0135

    Abstract: A voice converting apparatus is constructed for converting an input voice into an output voice according to a target voice. In the apparatus, a storage section provisionally stores source data, which is associated with and extracted from the target voice. An analyzing section analyzes the input voice to extract therefrom a series of input data frames representing the input voice. A producing section produces a series of target data frames representing the target voice based on the source data, while aligning the target data frames with the input data frames to secure synchronization between the target data frames and the input data frames. A synthesizing section synthesizes the output voice according to the target data frames and the input data frames.


    Singing voice synthesizing apparatus, singing voice synthesizing method, and program for realizing singing voice synthesizing method

    Publication No.: US07016841B2

    Publication date: 2006-03-21

    Application No.: US10034359

    Filing date: 2001-12-27

    IPC classification: G10L13/00 G10H7/00

    CPC classification: G10L13/07

    Abstract: A singing voice synthesizing apparatus is provided, which enables achievement of a natural sounding synthesized singing voice with a good level of comprehensibility. A phoneme database stores a plurality of voice fragment data formed of voice fragments each being a single phoneme or a phoneme chain of at least two concatenated phonemes, each of the plurality of voice fragment data comprising data of a deterministic component and data of a stochastic component. A readout device reads out from the phoneme database the voice fragment data corresponding to inputted lyrics. A duration time adjusting device adjusts time duration of the read-out voice fragment data so as to match a desired tempo and manner of singing. An adjusting device adjusts the deterministic component and the stochastic component of the read-out voice fragment so as to match a desired pitch. A synthesizing device synthesizes a singing sound by sequentially concatenating the voice fragment data that have been adjusted by the duration time adjusting device and the adjusting device.

    7. Sound processing apparatus and method, and program therefor
    Granted patent (in force)

    Publication No.: US07945446B2

    Publication date: 2011-05-17

    Application No.: US11372812

    Filing date: 2006-03-09

    IPC classification: G10L21/00 G10L13/06 G10L13/00

    Abstract: A spectrum envelope of an input sound is detected. In the meantime, a converting spectrum is acquired which is a frequency spectrum of a converting sound comprising a plurality of sounds, such as unison sounds. An output spectrum is generated by imparting the detected spectrum envelope of the input sound to the acquired converting spectrum. A sound signal is synthesized on the basis of the generated output spectrum. Further, a pitch of the input sound may be detected, and frequencies of peaks in the acquired converting spectrum may be varied in accordance with the detected pitch of the input sound. In this manner, the output spectrum can have the pitch and spectrum envelope of the input sound and spectrum frequency components of the converting sound comprising a plurality of sounds, and thus, unison sounds can be readily generated with simple arrangements.
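
A minimal sketch of the envelope-imparting step, under assumptions the abstract does not confirm: spectra are plain lists of magnitudes per frequency bin, the envelope is estimated by linear interpolation between local peaks, and "imparting" is taken to mean flattening the converting spectrum by its own envelope and rescaling it by the input envelope. The optional pitch-dependent shifting of peak frequencies is left out, and `peak_envelope` and `impart_envelope` are hypothetical names.

```python
def peak_envelope(mag):
    """Estimate a spectrum envelope by linear interpolation between local peaks."""
    n = len(mag)
    if n == 0:
        return []
    if n == 1:
        return [mag[0]]
    peaks = [k for k in range(n)
             if (k == 0 or mag[k] >= mag[k - 1])
             and (k == n - 1 or mag[k] >= mag[k + 1])]
    # Anchor the envelope at both ends so every bin is covered.
    if peaks[0] != 0:
        peaks.insert(0, 0)
    if peaks[-1] != n - 1:
        peaks.append(n - 1)
    env = [0.0] * n
    for a, b in zip(peaks, peaks[1:]):
        for k in range(a, b + 1):
            t = (k - a) / (b - a)
            env[k] = (1.0 - t) * mag[a] + t * mag[b]
    return env


def impart_envelope(input_mag, converting_mag, floor=1e-12):
    """Give the converting spectrum the envelope detected from the input sound."""
    if len(input_mag) != len(converting_mag):
        raise ValueError("spectra must have the same number of bins")
    input_env = peak_envelope(input_mag)
    converting_env = peak_envelope(converting_mag)
    # Flatten the converting spectrum, then rescale by the input envelope.
    return [c / max(ce, floor) * ie
            for c, ce, ie in zip(converting_mag, converting_env, input_env)]
```

The result keeps the fine structure (peak positions) of the converting sound while its overall spectral shape follows the input sound.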


    8. Singing voice synthesizing apparatus, singing voice synthesizing method and program for singing voice synthesizing
    Granted patent (in force)

    Publication No.: US07135636B2

    Publication date: 2006-11-14

    Application No.: US10375272

    Filing date: 2003-02-27

    IPC classification: G10H1/06 G10H7/00

    Abstract: A method for synthesizing a natural-sounding singing voice divides performance data into a transition part and a long sound part. The transition part is represented by articulation (phonemic chain) data that is read from an articulation template database and is outputted without modification. For the long sound part, a new characteristic parameter is generated by linearly interpolating characteristic parameters of the transition parts positioned before and after the long sound part and adding thereto a changing component of stationary data that is read from a constant part (stationary) template database. An associated apparatus for carrying out the singing voice synthesizing method includes a phoneme database for storing articulation data for the transition part and stationary data for the long sound part, a first device for outputting the articulation data, and a second device for outputting the newly-generated characteristic parameter of the long sound part.
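
The rule for the long sound part can be sketched as below. The linear interpolation and the added changing component come from the abstract; the rest is assumed for illustration: a single scalar parameter per frame, the changing component defined as the deviation of the stationary data from its mean, looping of the stationary template when it is shorter than the long sound part, and the name `long_sound_frames`.

```python
def long_sound_frames(param_before, param_after, stationary, num_frames):
    """Generate per-frame parameters for a long sound part.

    param_before / param_after: parameter values of the transition parts
    positioned before and after the long sound part.
    stationary: stationary template data (one value per template frame).
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if not stationary:
        raise ValueError("stationary template must not be empty")
    mean = sum(stationary) / len(stationary)
    frames = []
    for n in range(num_frames):
        # Position within the long sound part, from 0.0 to 1.0.
        t = n / (num_frames - 1) if num_frames > 1 else 0.0
        base = (1.0 - t) * param_before + t * param_after
        # Assumed "changing component": fluctuation around the template mean.
        fluctuation = stationary[n % len(stationary)] - mean
        frames.append(base + fluctuation)
    return frames
```

For example, `long_sound_frames(100.0, 200.0, [1.0, -1.0, 0.0], 3)` moves from 100 to 200 while superimposing the template's fluctuation on each frame.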


    9. Sound processing apparatus and method, and program therefor
    Patent application (in force)

    Publication No.: US20060212298A1

    Publication date: 2006-09-21

    Application No.: US11372812

    Filing date: 2006-03-09

    IPC classification: G10L21/02

    Abstract: A spectrum envelope of an input sound is detected. In the meantime, a converting spectrum is acquired which is a frequency spectrum of a converting sound comprising a plurality of sounds, such as unison sounds. An output spectrum is generated by imparting the detected spectrum envelope of the input sound to the acquired converting spectrum. A sound signal is synthesized on the basis of the generated output spectrum. Further, a pitch of the input sound may be detected, and frequencies of peaks in the acquired converting spectrum may be varied in accordance with the detected pitch of the input sound. In this manner, the output spectrum can have the pitch and spectrum envelope of the input sound and spectrum frequency components of the converting sound comprising a plurality of sounds, and thus, unison sounds can be readily generated with simple arrangements.


    10. Tone processing apparatus and method
    Granted patent (in force)

    Publication No.: US07750228B2

    Publication date: 2010-07-06

    Application No.: US12006918

    Filing date: 2008-01-07

    IPC classification: G10H1/00 G10H1/18

    Abstract: For at least one music piece, a storage section stores tone data of each of a plurality of fragments segmented from the music piece and stores a first descriptor indicative of a musical character of each of the fragments in association with the fragment. A descriptor generation section receives input data based on operation by a user and generates a second descriptor, indicative of a musical character, on the basis of the received input data. A determination section determines similarity between the second descriptor and the first descriptor of each of the fragments. A selection section selects the tone data of at least one fragment on the basis of a result of the similarity determination by the determination section. On the basis of the tone data of the selected at least one fragment, a data generation section generates tone data to be outputted.
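
The selection step can be illustrated with a short sketch. The abstract does not define the descriptors or the similarity measure, so the numeric descriptor vectors, the use of Euclidean distance (smaller distance meaning greater similarity), the dictionary layout, and the name `select_fragments` are all assumptions.

```python
from math import dist


def select_fragments(fragments, second_descriptor, count=1):
    """Return the tone data of the `count` fragments most similar to the query.

    fragments: list of dicts with a "descriptor" (first descriptor, a numeric
    vector) and the associated "tone_data".
    second_descriptor: descriptor generated from the user's input data.
    """
    ranked = sorted(
        fragments,
        key=lambda frag: dist(frag["descriptor"], second_descriptor),
    )
    return [frag["tone_data"] for frag in ranked[:count]]
```

A data generation stage would then build the output tone data from the selected fragments, for instance by concatenating or mixing them; that stage is not shown.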
