Framework for voice conversion
    1.
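    Invention application
    Framework for voice conversion (status: under examination, published)

    Publication (announcement) No.: US20060235685A1

    Publication (announcement) date: 2006-10-19

    Application No.: US11107344

    Filing date: 2005-04-15

    IPC classification: G10L15/26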

    Abstract: This invention relates to a framework for converting a source speech signal associated with a source voice into a target speech signal that is a representation of the source speech signal associated with a target voice. The source speech signal is encoded into samples of encoding parameters, wherein the encoding comprises the step of segmenting the source speech signal into segments based on characteristics of the source speech signal. The samples of the encoding parameters, or a converted representation of the samples of the encoding parameters, are then decoded to obtain the target speech signal. Therein, in the encoding, in the decoding, or in a separate step, samples of parameters related to the source speech signal are converted into samples of parameters related to the target speech signal. At least one of the encoding and the converting depends on the segments of the source speech signal.

    Soft alignment based on a probability of time alignment
    2.
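    Granted invention patent
    Soft alignment based on a probability of time alignment (status: in force)

    Publication (announcement) No.: US07505950B2

    Publication (announcement) date: 2009-03-17

    Application No.: US11380289

    Filing date: 2006-04-26

    CPC classification: G10L13/033 G10L2021/0135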

    Abstract: Systems and methods are provided for performing soft alignment in Gaussian mixture model (GMM) based and other vector transformations. Soft alignment may assign alignment probabilities to source and target feature vector pairs. The vector pairs and associated probabilities may then be used to calculate a conversion function, for example, by computing GMM training parameters from the joint vectors and alignment probabilities to create a voice conversion function for converting speech sounds from a source speaker to a target speaker.
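
The idea of weighting joint vectors by alignment probabilities can be illustrated with a minimal sketch. It is not the patented method: it assumes the soft alignment is already available as (source vector, target vector, probability) triples, and it fits a single Gaussian rather than a full mixture. The names `weighted_joint_stats` and `convert_scalar` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def weighted_joint_stats(pairs: List[Tuple[Vector, Vector, float]]):
    """Probability-weighted mean and covariance of joint vectors z = [x; y].

    Each pair contributes in proportion to its alignment probability,
    instead of counting as a hard 0/1 match.
    """
    total = sum(p for _, _, p in pairs)
    if total <= 0:
        raise ValueError("alignment probabilities must sum to a positive value")
    joint = [(list(x) + list(y), p) for x, y, p in pairs]
    dim = len(joint[0][0])
    mean = [sum(p * z[i] for z, p in joint) / total for i in range(dim)]
    cov = [
        [sum(p * (z[i] - mean[i]) * (z[j] - mean[j]) for z, p in joint) / total
         for j in range(dim)]
        for i in range(dim)
    ]
    return mean, cov


def convert_scalar(x: float, mean, cov) -> float:
    """Single-Gaussian conversion for 1-D source and 1-D target features:
    y = mu_y + (C_yx / C_xx) * (x - mu_x). Assumes C_xx is non-zero."""
    return mean[1] + cov[1][0] / cov[0][0] * (x - mean[0])


# Hypothetical soft alignment: the first source frame is matched to two
# target frames with probabilities 0.9 and 0.1.
pairs = [([1.0], [2.0], 0.9), ([1.0], [4.0], 0.1), ([3.0], [6.0], 1.0)]
mean, cov = weighted_joint_stats(pairs)
```

A full GMM would apply the same weighting inside each expectation-maximization update, with one mean and covariance per mixture component.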

    Memory usage in a text-to-speech system
    3.
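    Invention application
    Memory usage in a text-to-speech system (status: under examination, published)

    Publication (announcement) No.: US20060229877A1

    Publication (announcement) date: 2006-10-12

    Application No.: US11100001

    Filing date: 2005-04-06

    IPC classification: G10L13/06

    CPC classification: G10L13/06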

    Abstract: In a concatenative text-to-speech system, a high compression rate of duration data in the prosodic template is achieved by extracting statistical parameters describing the behavior of the actual duration values of instances of each given syllable, phoneme, half-phoneme, diphone, triphone or any other basic speech unit employed, and storing only the extracted statistical parameters instead of the original duration values. Entries of each given basic unit in the prosodic template are sorted and indexed in order of increasing duration value. Consequently, the amount of duration data can be significantly reduced while keeping the error statistically within an acceptable range.
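
A minimal sketch of the storage idea follows. The abstract does not say which statistical parameters are stored or how a duration is recovered from them, so everything specific here is an assumption: the sketch keeps only the count, mean and standard deviation per unit, and estimates the duration of the i-th entry (entries being sorted by increasing duration) from a normal quantile. The names `compress` and `estimate` are hypothetical.

```python
from statistics import NormalDist, fmean, pstdev


def compress(durations):
    """Replace a unit's duration values with three statistical parameters."""
    ordered = sorted(durations)
    return {"count": len(ordered), "mean": fmean(ordered), "stdev": pstdev(ordered)}


def estimate(params, index):
    """Estimate the duration of the entry at `index` in the sorted order.

    Assumption: durations are roughly normal, so the i-th of n sorted
    entries sits near the quantile (i + 0.5) / n.
    """
    n = params["count"]
    if not 0 <= index < n:
        raise IndexError("entry index out of range")
    if params["stdev"] == 0:
        return params["mean"]  # all instances had the same duration
    quantile = (index + 0.5) / n
    return NormalDist(params["mean"], params["stdev"]).inv_cdf(quantile)


# Hypothetical durations (in milliseconds) of five instances of one unit.
params = compress([120, 80, 100, 110, 90])
estimates = [estimate(params, i) for i in range(params["count"])]
```

Because the entries are indexed in order of increasing duration, the index alone carries the rank information that the quantile lookup needs, which is why no per-entry duration has to be stored.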

    SOFT ALIGNMENT IN GAUSSIAN MIXTURE MODEL BASED TRANSFORMATION
    4.
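    Invention application
    SOFT ALIGNMENT IN GAUSSIAN MIXTURE MODEL BASED TRANSFORMATION (status: in force)

    Publication (announcement) No.: US20070256189A1

    Publication (announcement) date: 2007-11-01

    Application No.: US11380289

    Filing date: 2006-04-26

    IPC classification: C12N15/82 C12N15/87

    CPC classification: G10L13/033 G10L2021/0135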

    Abstract: Systems and methods are provided for performing soft alignment in Gaussian mixture model (GMM) based and other vector transformations. Soft alignment may assign alignment probabilities to source and target feature vector pairs. The vector pairs and associated probabilities may then be used to calculate a conversion function, for example, by computing GMM training parameters from the joint vectors and alignment probabilities to create a voice conversion function for converting speech sounds from a source speaker to a target speaker.

    Method, apparatus, mobile terminal and computer program product for providing data clustering and mode selection
    5.
    Invention application
    Method, apparatus, mobile terminal and computer program product for providing data clustering and mode selection (status: lapsed)

    Publication (announcement) No.: US20070233625A1

    Publication (announcement) date: 2007-10-04

    Application No.: US11396831

    Filing date: 2006-04-03

    IPC classification: G06F15/18

    Abstract: An apparatus for providing data clustering and mode selection includes a training element and a transformation element. The training element is configured to receive a first training data set, a second training data set and auxiliary data extracted from the same material as the first training data set. The training element is also configured to train a classifier to group the first training data set into M clusters based on the auxiliary data and the first training data set, and to train M processing schemes corresponding to the M clusters for transforming the first training data set into the second training data set. The transformation element is in communication with the training element and is configured to cluster the second training data set into M clusters based on features associated with the second training data set.

    System and method for optimizing run-time memory usage for a lexicon
    6.
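    Invention application
    System and method for optimizing run-time memory usage for a lexicon (status: under examination, published)

    Publication (announcement) No.: US20060167680A1

    Publication (announcement) date: 2006-07-27

    Application No.: US11042445

    Filing date: 2005-01-25

    IPC classification: G06F17/21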

    Abstract: A system and method of extracting information from a lexicon and using the information with a computer software program. Lexicon data is arranged for a particular language using Unicode values or other uniquely defined code values for each character of a word of the language. A location array is then created for the lexicon data arranged by Unicode value or other uniquely defined code value. Upon a request to search for a word, words that have the same initial character as the searched-for word are identified using the location array. The identified words are then searched for a word that matches the searched-for word. Therefore, the amount of data loaded into run-time memory is minimized, and searches for a given word are completed more quickly than in conventional systems.
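
The lookup described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the lexicon is a list of non-empty words sorted by code point, the location array is modeled as a dictionary from initial character to an index range, and the whole list stays in memory (a real system would presumably load only the selected range from storage). The names `build_location_array` and `contains` are hypothetical.

```python
from bisect import bisect_left


def build_location_array(sorted_words):
    """Map each initial character to the [start, end) range of its words."""
    locations = {}
    for i, word in enumerate(sorted_words):
        first = word[0]
        if first not in locations:
            locations[first] = [i, i + 1]
        else:
            locations[first][1] = i + 1
    return locations


def contains(sorted_words, locations, query):
    """Search only the words that share the query's initial character."""
    if not query or query[0] not in locations:
        return False
    start, end = locations[query[0]]
    pos = bisect_left(sorted_words, query, start, end)
    return pos < end and sorted_words[pos] == query


# Python compares strings by Unicode code point, so sorted() gives the
# ordering the abstract describes.
lexicon = sorted(["apple", "ant", "bee", "zebra", "émile"])
locations = build_location_array(lexicon)
```

Here `contains(lexicon, locations, "apple")` inspects only the two words that start with "a", and a query whose initial character has no entry in the location array is rejected without touching the word list at all.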

    MEMORY-EFFICIENT METHOD FOR HIGH-QUALITY CODEBOOK BASED VOICE CONVERSION
    7.
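    Invention application
    MEMORY-EFFICIENT METHOD FOR HIGH-QUALITY CODEBOOK BASED VOICE CONVERSION (status: under examination, published)

    Publication (announcement) No.: US20080147385A1

    Publication (announcement) date: 2008-06-19

    Application No.: US11611798

    Filing date: 2006-12-15

    IPC classification: G10L19/12

    CPC classification: G10L21/00 G10L2021/0135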

    Abstract: An improved system and method for enabling and implementing codebook-based voice conversion that both significantly reduces the memory footprint and improves the continuity of the output. In various embodiments, the paired source-target codebook is implemented as a multi-stage vector quantizer. During the conversion, N best candidates in a tree search are taken as the output from the quantizer. The N candidates for each vector to be converted are used in a dynamic programming-based approach that finds a smooth but accurate output sequence.
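
The dynamic programming step can be sketched as a Viterbi-style search over the N candidates of each frame. The cost function is an assumption made for illustration: each candidate carries a scalar output value and a quantization error, and the path cost is the sum of the errors (accuracy) plus a weighted squared jump between consecutive outputs (smoothness). The abstract does not specify the actual costs, and `smooth_path` is a hypothetical name.

```python
def smooth_path(candidates, weight=1.0):
    """Pick one candidate per frame, trading accuracy against continuity.

    candidates[t] is a list of (output_value, quantization_error) pairs
    for frame t. Returns the chosen output value for every frame.
    """
    if not candidates:
        return []
    # cost[k]: best total cost of any path ending in candidate k of the
    # current frame.
    cost = [err for _, err in candidates[0]]
    back = []  # back[t - 1][k]: best predecessor of candidate k in frame t
    for t in range(1, len(candidates)):
        prev = candidates[t - 1]
        new_cost, pointers = [], []
        for value, err in candidates[t]:
            totals = [cost[j] + weight * (value - prev[j][0]) ** 2
                      for j in range(len(prev))]
            best = min(range(len(totals)), key=totals.__getitem__)
            new_cost.append(totals[best] + err)
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest final candidate.
    k = min(range(len(cost)), key=cost.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [candidates[t][k][0] for t, k in enumerate(path)]


# Hypothetical 2-best candidates for three frames.
frames = [
    [(1.0, 0.0), (5.0, 0.1)],
    [(5.2, 0.0), (1.1, 0.2)],
    [(1.0, 0.0), (5.0, 0.3)],
]
smooth = smooth_path(frames, weight=1.0)
greedy = smooth_path(frames, weight=0.0)
```

With `weight=0.0` the search degenerates to taking the single best candidate per frame, which jumps from 1.0 to 5.2 and back. With `weight=1.0` the second frame falls back to its slightly less accurate candidate, 1.1, and the sequence stays continuous.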

    Method, apparatus, mobile terminal and computer program product for providing efficient evaluation of feature transformation
    8.
    Invention application
    Method, apparatus, mobile terminal and computer program product for providing efficient evaluation of feature transformation (status: in force)

    Publication (announcement) No.: US20070239634A1

    Publication (announcement) date: 2007-10-11

    Application No.: US11400629

    Filing date: 2006-04-07

    IPC classification: G06N3/02

    Abstract: An apparatus for providing efficient evaluation of feature transformation includes a training module and a transformation module. The training module is configured to train a Gaussian mixture model (GMM) using training source data and training target data. The transformation module is in communication with the training module. The transformation module is configured to produce a conversion function in response to the training of the GMM. The training module is further configured to determine a quality of the conversion function prior to use of the conversion function by calculating a trace measurement of the GMM.

    Correcting a pronunciation of a synthetically generated speech object
    9.
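    Invention application
    Correcting a pronunciation of a synthetically generated speech object (status: under examination, published)

    Publication (announcement) No.: US20070016421A1

    Publication (announcement) date: 2007-01-18

    Application No.: US11180316

    Filing date: 2005-07-12

    IPC classification: G10L13/08

    CPC classification: G10L13/08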

    Abstract: This invention relates to a method, a device and a software application product for correcting a pronunciation of a speech object. The speech object is synthetically generated from a text object in dependence on a segmented representation of the text object. It is determined whether an initial pronunciation of the speech object, which is associated with an initial segmented representation of the text object, is incorrect. If the initial pronunciation of the speech object is determined to be incorrect, a new segmented representation of the text object is determined, which is associated with a new pronunciation of the speech object.
