1. Translingual visual speech synthesis
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US06813607B1

    Publication date: 2004-11-02

    Application number: US09494582

    Filing date: 2000-01-31

    IPC classification: G10L11/00

    Abstract: A computer-implemented method in a language-independent system generates audio-driven facial animation given a speech recognition system for just one language. The method is based on the recognition that, once the alignment is generated, the mapping and the animation have hardly any language dependency. Translingual visual speech synthesis can be achieved if the first step of alignment generation can be made speech independent. Given a speech recognition system for a base language, the method synthesizes video with speech of any novel language as the input.
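    As an illustration of why the mapping step carries little language dependency, the sketch below assumes a phone-level alignment has already been produced by a base-language recognizer and turns it into a viseme timeline. The `PHONE_TO_VISEME` table, the `Segment` type, and the viseme names are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass

# Hypothetical, deliberately tiny phone-to-viseme table for the base language.
PHONE_TO_VISEME = {
    "p": "lips_closed", "b": "lips_closed", "m": "lips_closed",
    "f": "lip_teeth", "v": "lip_teeth",
    "aa": "open", "ae": "open",
    "uw": "rounded", "ow": "rounded",
    "sil": "neutral",
}


@dataclass(frozen=True)
class Segment:
    label: str
    start: float  # seconds
    end: float    # seconds


def alignment_to_visemes(alignment):
    """Map a phone alignment to a viseme timeline, merging adjacent repeats."""
    timeline = []
    for seg in alignment:
        # Unknown phones fall back to a neutral mouth shape (an assumption).
        viseme = PHONE_TO_VISEME.get(seg.label, "neutral")
        if timeline and timeline[-1].label == viseme:
            prev = timeline[-1]
            timeline[-1] = Segment(viseme, prev.start, seg.end)
        else:
            timeline.append(Segment(viseme, seg.start, seg.end))
    return timeline


# Alignment as it might come out of the base-language recognizer.
alignment = [
    Segment("sil", 0.0, 0.1),
    Segment("m", 0.1, 0.2),
    Segment("b", 0.2, 0.25),
    Segment("aa", 0.25, 0.5),
]
timeline = alignment_to_visemes(alignment)
```

    Nothing in `alignment_to_visemes` depends on the language of the input speech; only the alignment step that produces `alignment` does.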

    2. Method and System for Hybrid Call Handling
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20080086690A1

    Publication date: 2008-04-10

    Application number: US11534000

    Filing date: 2006-09-21

    IPC classification: G06F15/177

    CPC classification: H04L12/66

    Abstract: The present invention provides a hybrid call handling method and system. The method comprises navigating a plurality of received calls from a plurality of callers. The method further comprises monitoring a call health status for each of the calls being navigated for the entire call duration, and notifying a human agent of a bad call health status so that the agent can employ at least one rectification action. The call health status is determined by monitoring and measuring one or more call parameters. The invention provides a system for call handling and navigation by an automated system, with a human agent assisting the automated system in rectifying calls with bad call health status. Once a call with bad health is transferred to the human agent, the agent assists the automated system either by communicating directly with the caller or by communicating through a machine interface.
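    The monitoring step can be pictured with a minimal sketch. The abstract does not name the call parameters, so the two counters and the thresholds below (`reprompts`, `low_confidence_turns`, `max_reprompts`, `max_low_conf`) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CallStats:
    call_id: str
    reprompts: int = 0             # hypothetical call parameter
    low_confidence_turns: int = 0  # hypothetical call parameter


def call_health(stats, max_reprompts=2, max_low_conf=3):
    """Classify a call as 'good' or 'bad' from its measured parameters."""
    if stats.reprompts > max_reprompts:
        return "bad"
    if stats.low_confidence_turns > max_low_conf:
        return "bad"
    return "good"


def calls_needing_agent(calls):
    """Return the ids of calls whose health is bad, in input order."""
    return [c.call_id for c in calls if call_health(c) == "bad"]


calls = [
    CallStats("a", reprompts=0, low_confidence_turns=1),
    CallStats("b", reprompts=3, low_confidence_turns=0),
    CallStats("c", reprompts=1, low_confidence_turns=4),
]
to_notify = calls_needing_agent(calls)
```

    In a running system this check would be repeated throughout each call, and the resulting list would drive the notification to the human agent.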

    3. Language context dependent data labeling
    Type: Granted invention patent
    Status: In force

    Publication number: US07295979B2

    Publication date: 2007-11-13

    Application number: US09790296

    Filing date: 2001-02-22

    IPC classification: G10L15/06 G10L15/00

    CPC classification: G10L15/06 G10L15/187

    Abstract: Bootstrapping of a system from one language to another often works well when the two languages share a similar acoustic space. However, when the new language has sounds that do not occur in the language from which the bootstrapping is to be done, bootstrapping does not produce good initial models, and the new language data is not properly aligned to these models. The present invention provides techniques to generate context-dependent labeling of the new language data using the recognition system of another language. This labeled data is then used to generate models for the new language phones.

    4. Speech driven lip synthesis using viseme based hidden markov models
    Type: Granted invention patent
    Status: In force

    Publication number: US06366885B1

    Publication date: 2002-04-02

    Application number: US09384763

    Filing date: 1999-08-27

    IPC classification: G10L21/06

    Abstract: A method of speech-driven lip synthesis that applies viseme-based training models to units of visual speech. The audio data is grouped into a smaller number of visually distinct visemes rather than the larger number of phonemes. These visemes then form the basis for a Hidden Markov Model (HMM) state sequence or the output nodes of a neural network. During the training phase, audio and visual features are extracted from input speech, which is then aligned according to the apparent viseme sequence, with the corresponding audio features being used to calculate the HMM state output probabilities or the output of the neural network. During the synthesis phase, the acoustic input is aligned with the most likely viseme HMM sequence (in the case of an HMM-based model) or with the nodes of the network (in the case of a neural-network-based system), which is then used for animation.
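    The synthesis-phase alignment "with the most likely viseme HMM sequence" is commonly computed with the Viterbi algorithm. The sketch below is a generic discrete-observation Viterbi decoder, not the patent's implementation: it assumes the audio features have been quantized into two symbols ("low", "high"), and the two viseme states and all probabilities are invented toy values.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for a discrete-observation HMM."""
    # best[t][s]: best log-probability of any path ending in state s at time t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Toy model: two viseme states, two quantized acoustic symbols (assumptions).
states = ["closed", "open"]
log_start = logs({"closed": 0.5, "open": 0.5})
log_trans = {
    "closed": logs({"closed": 0.8, "open": 0.2}),
    "open": logs({"closed": 0.2, "open": 0.8}),
}
log_emit = {
    "closed": logs({"low": 0.9, "high": 0.1}),
    "open": logs({"low": 0.2, "high": 0.8}),
}

path = viterbi(["low", "low", "high", "high"], states,
               log_start, log_trans, log_emit)
```

    Because the states are visemes rather than phonemes, the decoded path can be handed to the animation stage without a further phoneme-to-viseme lookup.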

    5. SPEAKER ADAPTATION OF VOCABULARY FOR SPEECH RECOGNITION
    Type: Invention patent application
    Status: In force

    Publication number: US20120035928A1

    Publication date: 2012-02-09

    Application number: US13273020

    Filing date: 2011-10-13

    IPC classification: G10L15/06

    CPC classification: G10L17/02 G10L15/07

    Abstract: A phonetic vocabulary for a speech recognition system is adapted to a particular speaker's pronunciation. A speaker can be attributed specific pronunciation styles, which can be identified from specific pronunciation examples. Consequently, a phonetic vocabulary can be reduced in size, which can improve recognition accuracy and recognition speed.
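    A minimal sketch of the idea follows, assuming each word's alternative pronunciations are tagged with a style label and the speaker's style is inferred by voting over a few example pronunciations. The `VOCAB` contents, the style labels "A" and "B", and the voting rule are illustrative assumptions, not details from the patent.

```python
from collections import Counter

# Hypothetical vocabulary: word -> {style label: phonetic spelling}.
VOCAB = {
    "tomato": {"A": "t ah m ey t ow", "B": "t ah m aa t ow"},
    "either": {"A": "iy dh er", "B": "ay dh er"},
    "data": {"A": "d ey t ah", "B": "d aa t ah"},
}


def infer_style(vocab, examples):
    """Vote for the style whose pronunciations match the speaker's examples."""
    votes = Counter()
    for word, spoken in examples.items():
        for style, pron in vocab.get(word, {}).items():
            if pron == spoken:
                votes[style] += 1
    return votes.most_common(1)[0][0] if votes else None


def adapt_vocabulary(vocab, style):
    """Keep only the speaker's style where a word offers it."""
    adapted = {}
    for word, prons in vocab.items():
        if style in prons:
            adapted[word] = {style: prons[style]}
        else:
            adapted[word] = dict(prons)  # no variant in this style: keep all
    return adapted


examples = {"tomato": "t ah m aa t ow", "either": "ay dh er"}
style = infer_style(VOCAB, examples)
adapted = adapt_vocabulary(VOCAB, style)
size_before = sum(len(p) for p in VOCAB.values())
size_after = sum(len(p) for p in adapted.values())
```

    Note that "data" is pruned even though the speaker never pronounced it, which is where the reduction in vocabulary size comes from.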

    8. Speaker adaptation of vocabulary for speech recognition
    Type: Granted invention patent
    Status: In force

    Publication number: US07389228B2

    Publication date: 2008-06-17

    Application number: US10320020

    Filing date: 2002-12-16

    IPC classification: G10L15/00

    CPC classification: G10L17/02 G10L15/07

    Abstract: A phonetic vocabulary for a speech recognition system is adapted to a particular speaker's pronunciation. A speaker can be attributed specific pronunciation styles, which can be identified from specific pronunciation examples. Consequently, a phonetic vocabulary can be reduced in size, which can improve recognition accuracy and recognition speed.

    9. Hybrid baseform generation
    Type: Granted invention patent
    Status: In force

    Publication number: US07206738B2

    Publication date: 2007-04-17

    Application number: US10219686

    Filing date: 2002-08-14

    IPC classification: G06F17/21

    CPC classification: G06F17/2735

    Abstract: A method, a computer system and a computer program product for generating baseforms or phonetic spellings from input text are disclosed. The baseforms are initially generated using rules defined for a particular language. Then, phones that are exceptions to the defined rules are identified in the language, and an action is associated with each identified phone. A statistical technique is applied to determine whether the identified phones can be modified. Finally, baseforms containing the identified phones that can be modified are corrected according to the associated actions. Preferably, the statistical technique is only applied to baseforms containing phones that are exceptions to the defined rules. The defined rules can comprise spelling-to-sound rules for a particular phonetic language that incorporate all possible alternative pronunciations of each baseform.
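    The two-stage flow (rules first, then a statistical check on exception phones) can be sketched as follows. Everything concrete here is an assumption for illustration: the letter-to-phone `RULES`, the single exception phone "AX" with a delete action, and the `APPLY_PROB` table standing in for whatever statistical technique the patent uses.

```python
# Hypothetical spelling-to-sound rules: one phone per letter.
RULES = {"k": "K", "m": "M", "l": "L", "a": "AX"}

# Exception phones and their associated action: a replacement phone,
# or None to delete the phone.
ACTIONS = {"AX": None}

# Stand-in statistic: estimated probability that the action applies,
# keyed by (phone, following phone); "</w>" marks the end of the word.
APPLY_PROB = {("AX", "</w>"): 0.9, ("AX", "M"): 0.1, ("AX", "L"): 0.3}


def rule_baseform(word, rules):
    """Stage 1: generate a baseform purely from the defined rules."""
    return [rules[ch] for ch in word]


def correct_baseform(phones, actions, apply_prob, threshold=0.5):
    """Stage 2: apply each exception phone's action when the statistic allows."""
    corrected = []
    for i, phone in enumerate(phones):
        if phone in actions:
            nxt = phones[i + 1] if i + 1 < len(phones) else "</w>"
            if apply_prob.get((phone, nxt), 0.0) > threshold:
                replacement = actions[phone]
                if replacement is not None:
                    corrected.append(replacement)
                continue  # phone was replaced or deleted
        corrected.append(phone)
    return corrected


initial = rule_baseform("kamala", RULES)
final = correct_baseform(initial, ACTIONS, APPLY_PROB)
```

    Only baseforms that contain an exception phone ever reach the probability lookup, which mirrors the abstract's preference for applying the statistical technique selectively.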

    10. Speaker adaptation of vocabulary for speech recognition
    Type: Granted invention patent
    Status: In force

    Publication number: US08046224B2

    Publication date: 2011-10-25

    Application number: US12105390

    Filing date: 2008-04-18

    IPC classification: G10L15/06 G10L15/04 G10L15/28

    CPC classification: G10L17/02 G10L15/07

    Abstract: A phonetic vocabulary for a speech recognition system is adapted to a particular speaker's pronunciation. A speaker can be attributed specific pronunciation styles, which can be identified from specific pronunciation examples. Consequently, a phonetic vocabulary can be reduced in size, which can improve recognition accuracy and recognition speed.
