Method and system for text-to-speech synthesis with personalized voice
    1.
    Granted patent
    Method and system for text-to-speech synthesis with personalized voice (legal status: in force)

    Publication No.: US08886537B2

    Publication date: 2014-11-11

    Application No.: US11688264

    Application date: 2007-03-20

    Abstract: A method and system are provided for text-to-speech synthesis with personalized voice. The method includes receiving an incidental audio input (403) of speech in the form of an audio communication from an input speaker (401) and generating a voice dataset (404) for the input speaker (401). The method includes receiving a text input (411) at the same device as the audio input (403) and synthesizing (312) the text from the text input (411) into synthesized speech, including using the voice dataset (404) to personalize the synthesized speech to sound like the input speaker (401). In addition, the method includes analyzing (316) the text for expression and adding the expression (315) to the synthesized speech. The audio communication may be part of a video communication (453) and the audio input (403) may have an associated visual input (455) of an image of the input speaker. The synthesis from text may include providing a synthesized image personalized to look like the image of the input speaker with expressions added from the visual input (455).

    METHOD AND SYSTEM FOR TEXT-TO-SPEECH SYNTHESIS WITH PERSONALIZED VOICE
    2.
    Patent application
    METHOD AND SYSTEM FOR TEXT-TO-SPEECH SYNTHESIS WITH PERSONALIZED VOICE (legal status: in force)

    Publication No.: US20080235024A1

    Publication date: 2008-09-25

    Application No.: US11688264

    Application date: 2007-03-20

    IPC classification: G10L13/00

    Abstract: A method and system are provided for text-to-speech synthesis with personalized voice. The method includes receiving an incidental audio input (403) of speech in the form of an audio communication from an input speaker (401) and generating a voice dataset (404) for the input speaker (401). The method includes receiving a text input (411) at the same device as the audio input (403) and synthesizing (312) the text from the text input (411) into synthesized speech, including using the voice dataset (404) to personalize the synthesized speech to sound like the input speaker (401). In addition, the method includes analyzing (316) the text for expression and adding the expression (315) to the synthesized speech. The audio communication may be part of a video communication (453) and the audio input (403) may have an associated visual input (455) of an image of the input speaker. The synthesis from text may include providing a synthesized image personalized to look like the image of the input speaker with expressions added from the visual input (455).

    Parallel visual radio station selection
    3.
    Granted patent
    Parallel visual radio station selection (legal status: lapsed)

    Publication No.: US08196046B2

    Publication date: 2012-06-05

    Application No.: US12184945

    Application date: 2008-08-01

    IPC classification: G06F3/16

    Abstract: A computer-implemented method in a data processing system and a computer program product enable visual selection of a media signal. A set of media signals is received from a set of media providers. A subject matter and a performer of the subject matter are then identified for at least one of the set of media signals. A set of icons is then identified. Each of the set of icons corresponds to at least one of the media signals. The set of icons and the set of media providers are then forwarded to a client media player.

    Parallel Visual Radio Station Selection
    4.
    Patent application
    Parallel Visual Radio Station Selection (legal status: lapsed)

    Publication No.: US20100031146A1

    Publication date: 2010-02-04

    Application No.: US12184945

    Application date: 2008-08-01

    IPC classification: G06F3/00

    Abstract: A computer-implemented method in a data processing system and a computer program product enable visual selection of a media signal. A set of media signals is received from a set of media providers. A subject matter and a performer of the subject matter are then identified for at least one of the set of media signals. A set of icons is then identified. Each of the set of icons corresponds to at least one of the media signals. The set of icons and the set of media providers are then forwarded to a client media player.

    VOCAL SOURCE EXTRACTION BY MAXIMUM PHASE DETECTION
    5.
    Patent application
    VOCAL SOURCE EXTRACTION BY MAXIMUM PHASE DETECTION (legal status: in force)

    Publication No.: US20130325455A1

    Publication date: 2013-12-05

    Application No.: US13487275

    Application date: 2012-06-04

    IPC classification: G10L11/04

    CPC classification: G10L25/75 G10L25/03 G10L25/45

    Abstract: Methods, apparatus and computer program products implement embodiments of the present invention that include receiving a time domain voice signal, and extracting a single pitch cycle from the received signal. The extracted single pitch cycle is transformed to a frequency domain, and the misclassified roots of the frequency domain are identified and corrected. Using the corrected roots, an indication of a maximum phase of the frequency domain is generated.
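
    One common way to expose the maximum-phase part of a short frame is to read its samples as polynomial coefficients and split the zeros of that polynomial by their position relative to the unit circle. The sketch below illustrates only that idea. It is an assumption about how such an indication could be computed, not the procedure claimed in this application, and it leaves out the correction of misclassified roots, which the abstract does not specify. The function name `split_roots` and the `tol` parameter are hypothetical.

```python
import numpy as np

def split_roots(pitch_cycle, tol=1e-9):
    """Split the zeros of a frame's z-transform by unit-circle position.

    The samples x[0..N-1] are read as the coefficients of
    x[0]*z^(N-1) + ... + x[N-1]. Zeros with |z| > 1 are treated as
    maximum-phase, zeros with |z| < 1 as minimum-phase. `tol` is a
    hypothetical guard band around the unit circle.
    """
    roots = np.roots(np.asarray(pitch_cycle, dtype=float))
    magnitude = np.abs(roots)
    max_phase = roots[magnitude > 1.0 + tol]
    min_phase = roots[magnitude < 1.0 - tol]
    return max_phase, min_phase

# Toy frame built from known zeros: 0.5 and -0.25 (inside), 2.0 (outside).
frame = np.poly([0.5, -0.25, 2.0])
max_phase, min_phase = split_roots(frame)
print("maximum-phase zeros:", max_phase)
print("minimum-phase zeros:", min_phase)
```

    A real pitch cycle has hundreds of samples, so the polynomial degree is high and zeros close to the unit circle are easily assigned to the wrong side. That is presumably where a correction step such as the one the abstract mentions becomes necessary.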

    Vocal source extraction by maximum phase detection
    6.
    Granted patent
    Vocal source extraction by maximum phase detection (legal status: in force)

    Publication No.: US09105272B2

    Publication date: 2015-08-11

    Application No.: US13487275

    Application date: 2012-06-04

    IPC classification: G10L25/75 G10L25/03 G10L25/45

    CPC classification: G10L25/75 G10L25/03 G10L25/45

    Abstract: Methods, apparatus and computer program products implement embodiments of the present invention that include receiving a time domain voice signal, and extracting a single pitch cycle from the received signal. The extracted single pitch cycle is transformed to a frequency domain, and the misclassified roots of the frequency domain are identified and corrected. Using the corrected roots, an indication of a maximum phase of the frequency domain is generated.

    Voice transformation with encoded information
    7.
    Granted patent
    Voice transformation with encoded information (legal status: in force)

    Publication No.: US08930182B2

    Publication date: 2015-01-06

    Application No.: US13049924

    Application date: 2011-03-17

    CPC classification: G10L21/003 G10L19/018

    Abstract: A method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.
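
    The round trip described in the abstract (transform, hide the parameters in the output, extract them, invert) can be illustrated with a deliberately small toy. In this sketch a single gain factor stands in for the transformation parameters and least-significant-bit embedding stands in for the steganography. Both are assumptions chosen for brevity, not the techniques claimed in the patent, and the names `transform_and_embed`, `extract_and_invert`, `gain_q8`, and `PARAM_BITS` are hypothetical.

```python
import numpy as np

PARAM_BITS = 16  # hypothetical width of the hidden parameter field

def transform_and_embed(source, gain_q8):
    """Scale int16 samples by gain_q8/256 and hide gain_q8 in the output LSBs."""
    if not 0 < gain_q8 < 2 ** PARAM_BITS:
        raise ValueError("gain_q8 must be positive and fit in PARAM_BITS bits")
    if len(source) < PARAM_BITS:
        raise ValueError("signal too short to carry the parameter")
    scaled = np.round(source.astype(np.float64) * gain_q8 / 256.0)
    output = np.clip(scaled, -32768, 32767).astype(np.int16)
    bits = np.array([(gain_q8 >> i) & 1 for i in range(PARAM_BITS)],
                    dtype=np.int16)
    # Overwrite the least significant bit of the first PARAM_BITS samples.
    output[:PARAM_BITS] = (output[:PARAM_BITS] & np.int16(-2)) | bits
    return output

def extract_and_invert(output):
    """Read the hidden gain back and undo the scaling (approximately)."""
    bits = output[:PARAM_BITS] & np.int16(1)
    gain_q8 = sum(int(b) << i for i, b in enumerate(bits))
    restored = np.round(output.astype(np.float64) * 256.0 / gain_q8)
    return gain_q8, np.clip(restored, -32768, 32767).astype(np.int16)

# Toy "speech": a sine wave, transformed with a gain of 1.5 (384/256).
t = np.arange(400)
source = np.round(1000 * np.sin(2 * np.pi * t / 80)).astype(np.int16)
output = transform_and_embed(source, gain_q8=384)
gain, approx = extract_and_invert(output)
max_error = int(np.max(np.abs(approx.astype(int) - source.astype(int))))
print("recovered gain_q8:", gain, "max sample error:", max_error)
```

    The reconstruction is only an approximation of the source, as the abstract states for the real system: rounding and the overwritten low bits each cost a fraction of a sample step. The receiver needs nothing but the output signal, because the parameter travels inside it.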

    VOICE TRANSFORMATION WITH ENCODED INFORMATION
    10.
    Patent application
    VOICE TRANSFORMATION WITH ENCODED INFORMATION (legal status: in force)

    Publication No.: US20120239387A1

    Publication date: 2012-09-20

    Application No.: US13049924

    Application date: 2011-03-17

    IPC classification: G10L19/02

    CPC classification: G10L21/003 G10L19/018

    Abstract: A method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.
