Voice signal conversion method and system
    101.
    Type: Granted patent
    Status: Lapsed

    Publication number: US07765101B2

    Publication date: 2010-07-27

    Application number: US10594396

    Filing date: 2005-03-09

    CPC classification number: G10L21/00 G10L13/033 G10L2021/0135

    Abstract: A method of converting a voice signal spoken by a source speaker into a converted voice signal having acoustic characteristics that resemble those of a target speaker. The method includes the following steps: determining (1) at least one function for the transformation of the acoustic characteristics of the source speaker into acoustic characteristics similar to those of the target speaker; and transforming the acoustic characteristics of the voice signal to be converted using the at least one transformation function. The method is characterized in that: (i) the aforementioned transformation function-determining step (1) consists in determining (1) a function for the joint transformation of characteristics relating to the spectral envelope and characteristics relating to the fundamental frequency of the source speaker; and (ii) the transformation includes the application of the joint transformation function.
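The idea of one joint function acting on both the spectral envelope and the fundamental frequency can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not say what form the function takes, so a single affine map over the stacked vector `[envelope..., log f0]` stands in for it, and the names `joint_transform`, `matrix`, and `offset` are hypothetical.

```python
import math


def joint_transform(envelope, f0_hz, matrix, offset):
    """Apply one affine map to the stacked vector [envelope..., log(f0)].

    Assumption: an affine map is only a stand-in for the joint
    transformation function, whose form the abstract does not specify.
    """
    x = list(envelope) + [math.log(f0_hz)]
    size = len(x)
    if len(matrix) != size or len(offset) != size:
        raise ValueError("matrix and offset must match the stacked vector size")
    if any(len(row) != size for row in matrix):
        raise ValueError("matrix must be square")
    # Envelope and pitch are transformed together, so off-diagonal terms
    # let each set of characteristics influence the other.
    y = [sum(a * b for a, b in zip(row, x)) + o for row, o in zip(matrix, offset)]
    return y[:-1], math.exp(y[-1])
```

With an identity matrix and an offset of `log(1.5)` on the last component, the envelope passes through unchanged and the fundamental frequency is scaled by 1.5.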

    Providing personalized voice font for text-to-speech applications
    102.
    Type: Granted patent
    Status: Lapsed

    Publication number: US07693719B2

    Publication date: 2010-04-06

    Application number: US10977178

    Filing date: 2004-10-29

    CPC classification number: G10L13/033 G10L2021/0135

    Abstract: A method for synthesizing speech from text includes receiving one or more waveforms characteristic of a voice of a person selected by a user, generating a personalized voice font based on the one or more waveforms, and delivering the personalized voice font to the user's computer, whereby speech can be synthesized from text, the speech being in the voice of the selected person, the speech being synthesized using the personalized voice font. A system includes a text-to-speech (TTS) application operable to generate a voice font based on speech waveforms transmitted from a client computer remotely accessing the TTS application.

    Voice converter with extraction and modification of attribute data
    103.
    Type: Granted patent
    Status: Lapsed

    Publication number: US07606709B2

    Publication date: 2009-10-20

    Application number: US10282536

    Filing date: 2002-10-29

    Abstract: An apparatus is constructed for converting an input voice signal into an output voice signal according to a target voice signal. In the apparatus, an input device provides the input voice signal composed of original sinusoidal components and original residual components other than the original sinusoidal components. An extracting device extracts original attribute data from at least the sinusoidal components of the input voice signal. The original attribute data is characteristic of the input voice signal. A synthesizing device synthesizes new attribute data based on both of the original attribute data derived from the input voice signal and target attribute data being characteristic of the target voice signal composed of target sinusoidal components and target residual components other than the sinusoidal components. The target attribute data is derived from at least the target sinusoidal components. An output device operates based on the new attribute data and either of the original residual component and the target residual component for producing the output voice signal.

    METHOD AND APPARATUS FOR POLYMORPHING A PLURALITY OF SETS OF DATA
    104.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20090244098A1

    Publication date: 2009-10-01

    Application number: US12411658

    Filing date: 2009-03-26

    Abstract: 2^M sets of model data strings (M is a positive integer and M ≥ 2) are polymorphed. The model data strings are acquired by defining at least 2^M coordinates to be morphed in an M-dimensional model-data mapping space and making the defined model data strings correspond to the coordinates being morphed, respectively. A unit cell is set in the space. The unit cell consists of a hyper-rectangular parallelepiped having 2^M vertices, each located at one of the coordinates being morphed. A desired coordinate is set, as a morphing-destination coordinate, within the unit cell. The 2^M sets of model data strings corresponding, set by set, to the coordinates being morphed are polymorphed using weighting factors that depend on the distances from the respective coordinates being morphed to the morphing-destination coordinate in the unit cell. Accordingly, a string of synthesized data corresponding to the morphing-destination coordinate is produced. The string of synthesized data is output using an outputting device.
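The weighting scheme described above can be sketched as a blend over the 2^M corners of a unit cell (a hyper-rectangle in M dimensions has 2^M vertices). This is a minimal sketch under stated assumptions: the cell is normalized to the unit hypercube, and multilinear weights are used as one plausible distance-dependent choice, since the abstract does not give the weighting formula. The names `polymorph` and `vertex_data` are hypothetical.

```python
from itertools import product


def polymorph(vertex_data, point):
    """Blend the data strings stored at the 2**M corners of the unit hypercube.

    vertex_data maps each corner (a tuple of 0/1 values of length M) to a
    list of floats; point is the morphing-destination coordinate in [0, 1]**M.
    """
    m = len(point)
    corners = list(product((0, 1), repeat=m))
    if set(vertex_data) != set(corners):
        raise ValueError("need exactly one data string per corner (2**M in total)")
    length = len(vertex_data[corners[0]])
    blended = [0.0] * length
    for corner in corners:
        # Multilinear weight (an assumption): falls off linearly with the
        # per-axis distance between this corner and the destination point.
        weight = 1.0
        for c, p in zip(corner, point):
            weight *= 1.0 - abs(p - c)
        for i, value in enumerate(vertex_data[corner]):
            blended[i] += weight * value
    return blended
```

With this choice the weights sum to 1 inside the cell, and a destination placed exactly on a corner reproduces that corner's data string unchanged.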

    Normalization of speech accent
    105.
    Type: Granted patent
    Status: In force

    Publication number: US07593849B2

    Publication date: 2009-09-22

    Application number: US10352720

    Filing date: 2003-01-28

    CPC classification number: G10L15/07 G10L21/00 G10L2021/0135

    Abstract: A normalizer (100, 300) of the accent of accented speech modifies (210, 410) the characteristics of input signals that represent the speech spoken in an individual voice with an accent to form output signals that represent the speech spoken in the same voice but with less or no accent.

    STRAINED-ROUGH-VOICE CONVERSION DEVICE, VOICE CONVERSION DEVICE, VOICE SYNTHESIS DEVICE, VOICE CONVERSION METHOD, VOICE SYNTHESIS METHOD, AND PROGRAM
    106.
    Type: Patent application
    Status: In force

    Publication number: US20090204395A1

    Publication date: 2009-08-13

    Application number: US12438860

    Filing date: 2008-01-22

    CPC classification number: G10L13/033 G10L2021/0135

    Abstract: A strained-rough-voice conversion unit (10) is included in a voice conversion device that can generate a “strained rough” voice produced in a part of a speech when speaking forcefully with excitement, nervousness, anger, or emphasis and thereby richly express vocal expression such as anger, excitement, or an animated or lively way of speaking, using voice quality change. The strained-rough-voice conversion unit (10) includes: a strained phoneme position designation unit (11) designating a phoneme to be uttered as a “strained rough” voice in a speech; and an amplitude modulation unit (14) performing modulation including periodic amplitude fluctuation on a speech waveform. The amplitude modulation unit (14) generates, according to the designation of the strained phoneme position designation unit (11), the “strained rough” voice by performing the modulation including periodic amplitude fluctuation on the part to be uttered as the “strained rough” voice, in order to generate a speech having realistic and rich expression uttering forcefully with excitement, nervousness, anger, or emphasis.
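The combination of a designated phoneme position and periodic amplitude fluctuation can be sketched as follows. This is a minimal sketch and not the device's actual processing: the sinusoidal gain, the 80 Hz modulation frequency, and the 0.4 depth are illustrative assumptions, and the function name `apply_strained_modulation` is hypothetical. The sample range `[start, end)` stands in for the output of the phoneme position designation unit.

```python
import math


def apply_strained_modulation(samples, sample_rate, start, end,
                              mod_freq_hz=80.0, depth=0.4):
    """Return a copy of samples with periodic amplitude fluctuation on [start, end).

    mod_freq_hz and depth are illustrative defaults, not values from the patent.
    """
    out = list(samples)
    for n in range(max(start, 0), min(end, len(out))):
        t = (n - start) / sample_rate
        # Periodic gain around 1.0; samples outside the range are untouched.
        gain = 1.0 + depth * math.sin(2.0 * math.pi * mod_freq_hz * t)
        out[n] *= gain
    return out
```

Only the designated segment is modulated, which mirrors the abstract's point that the effect applies to the part of the speech to be uttered as a "strained rough" voice.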

    VOICE SYNTHESIS METHOD AND INTERPERSONAL COMMUNICATION METHOD, PARTICULARLY FOR MULTIPLAYER ONLINE GAMES
    107.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20090063156A1

    Publication date: 2009-03-05

    Application number: US12198391

    Filing date: 2008-08-26

    Abstract: A voice synthesis method, said method comprising a step of choosing a synthetic voice from among a set of voices having predetermined spectral signatures and a step of recording the natural voice of a first person, the method comprising a step of transforming the natural recorded voice so as to conform with the spectral signature of the chosen synthetic voice, the natural voice thereby transformed being recorded, said method comprising a step of determining at least one situation parameter for a first character from among a set of predefined parameters, each predefined parameter being associated with a spectral alteration of the emitted voice, the determined situation parameter particularly characterizing the environment or the physical or psychological state of the character, the method comprising a step of spectrally altering the transformed natural voice so as to conform with the spectral alteration associated with the character's situation parameter.

    Device and Method for Capturing Vocal Sound and Mouth Region Images
    108.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20080317264A1

    Publication date: 2008-12-25

    Application number: US12158445

    Filing date: 2006-12-18

    Inventor: Jordan Wynnychuk

    Abstract: A device suitable for use in various applications, including, for example, sound production applications and video game applications. In one non-limiting embodiment, the device comprises a sound capturing unit for generating a first signal indicative of vocal sound produced by a user and an image capturing unit for generating a second signal indicative of images of a mouth region of the user. The device also comprises a processing unit communicatively coupled to the sound capturing unit and the image capturing unit for processing the first signal and the second signal. In an example in which the device is used for sound production, the processing unit is operative for processing the first signal and the second signal to cause a sound production unit to emit sound audibly perceivable as being a modified version of the vocal sound produced by the user. In an example in which the device is used for playing a video game, the processing unit is operative for processing the second signal to generate a video game feature control signal for controlling a feature associated with the video game. The feature associated with the video game may be a virtual character of the video game. The processing unit is further operative for processing the first signal for causing a sound production unit to emit sound associated with the video game.

    Personality-Based Device
    109.
    Type: Patent application
    Status: In force

    Publication number: US20080291325A1

    Publication date: 2008-11-27

    Application number: US11752989

    Filing date: 2007-05-24

    CPC classification number: G10L13/033 G10L2021/0135

    Abstract: A personality-based theme may be provided. An application program may query a personality resource file for a prompt corresponding to a personality. Then the prompt may be received at a speech synthesis engine. Next, the speech synthesis engine may query a personality voice font database for a voice font corresponding to the personality. Then the speech synthesis engine may apply the voice font to the prompt. The voice font applied prompt may then be produced at an output device.

    Method and device for modifying an audio signal
    110.
    Type: Patent application
    Status: In force

    Publication number: US20080255830A1

    Publication date: 2008-10-16

    Application number: US12075759

    Filing date: 2008-03-12

    CPC classification number: G10L21/04 G10L13/033 G10L2021/0135

    Abstract: A method of modifying acoustic characteristics of an original audio signal as a function of modification instructions relating at least to the fundamental frequency and the spectral envelope of the original signal. The method comprises a first modification operation applied to the original signal to deliver an intermediate audio signal, the first modification operation being intended to deform the spectral envelope of the original signal in application of said spectral envelope modification instruction; and a second modification operation applied to the intermediate signal to deliver a final audio signal, the second modification operation being intended to modify at least the fundamental frequency of the intermediate signal, in application of a modification factor that is determined so as to take account of the effects of the first modification operation on the fundamental frequency of the original audio signal, so that the fundamental frequency obtained for the final signal conforms to said instruction relating to fundamental frequency.
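The compensation in the second operation comes down to simple arithmetic if one assumes that both effects on the fundamental frequency are multiplicative. That assumption, and the names `second_stage_f0_factor`, `target_f0_factor`, and `first_stage_f0_shift`, are hypothetical and not taken from the patent; the sketch only shows why the second factor has to account for the first operation.

```python
def second_stage_f0_factor(target_f0_factor, first_stage_f0_shift):
    """Factor the second operation must apply so the final F0 meets the instruction.

    target_f0_factor: requested final F0 divided by the original F0.
    first_stage_f0_shift: F0 after the envelope operation divided by the
    original F0 (the side effect of the first operation).
    Assumes both effects are multiplicative, which is an illustrative model.
    """
    if target_f0_factor <= 0 or first_stage_f0_shift <= 0:
        raise ValueError("factors must be positive")
    return target_f0_factor / first_stage_f0_shift
```

For example, if the instruction asks to raise a 200 Hz voice to 300 Hz (factor 1.5) and the envelope deformation has already moved the fundamental to 240 Hz (factor 1.2), the second operation needs a factor of about 1.25 so that 240 Hz becomes 300 Hz.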
