Speech recognition based on a multilingual acoustic model
    1.
    Invention publication
    Speech recognition based on a multilingual acoustic model (in force)
    Spracherkennung auf Grundlage eines mehrsprachigen akustischen Modells

    Publication number: EP2192575A1

    Publication date: 2010-06-02

    Application number: EP08020639.4

    Filing date: 2008-11-27

    IPC classification: G10L15/18

    CPC classification: G10L15/144 G10L15/187

    Abstract: The present invention relates to a method for generating a multilingual speech recognizer comprising a multilingual acoustic model, comprising the steps of providing a first speech recognizer comprising a first codebook consisting of first Gaussians and a first Hidden Markov Model, HMM, comprising first states; providing at least one second speech recognizer comprising a second codebook consisting of second Gaussians and a second Hidden Markov Model, HMM, comprising second states; replacing each of the second Gaussians of the at least one second speech recognizer by the respective closest one of the first Gaussians and/or each of the second states of the second HMM of the at least one second speech recognizer with the respective closest state of the first HMM of the first speech recognizer to obtain at least one modified second speech recognizer; and combining the first speech recognizer and the at least one modified second speech recognizer to obtain the multilingual speech recognizer.
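
The Gaussian replacement step can be illustrated with a minimal sketch. The abstract does not name a distance measure, so the symmetric Kullback-Leibler divergence between diagonal-covariance Gaussians used here is an assumption, as are the function names and the toy codebooks. Each Gaussian is represented as a (means, variances) pair.

```python
import math

def kl_diag(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total

def symmetric_kl(p, q):
    """Symmetrized KL divergence; p and q are (means, variances) pairs."""
    return kl_diag(p[0], p[1], q[0], q[1]) + kl_diag(q[0], q[1], p[0], p[1])

def map_to_closest(second_codebook, first_codebook):
    """For every second Gaussian, return the index of the closest first Gaussian."""
    return [
        min(range(len(first_codebook)),
            key=lambda i: symmetric_kl(g, first_codebook[i]))
        for g in second_codebook
    ]

# Toy codebooks (hypothetical values, two dimensions each).
first = [([0.0, 0.0], [1.0, 1.0]), ([5.0, 5.0], [1.0, 1.0])]
second = [([0.4, -0.2], [1.2, 0.9]), ([4.5, 5.5], [0.8, 1.1])]

# mapping[k] is the first-codebook Gaussian that stands in for second[k].
mapping = map_to_closest(second, first)
print(mapping)
```

After the mapping, the second recognizer would refer only to Gaussians of the first codebook, which is what allows the two recognizers to share one codebook when combined.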

    Multilingual codebooks for speech recognition
    2.
    Invention publication
    Multilingual codebooks for speech recognition (in force)
    Erzeugung von mehrsprachigen Codebüchern zur Spracherkennung

    Publication number: EP2107554A1

    Publication date: 2009-10-07

    Application number: EP08006690.5

    Filing date: 2008-04-01

    IPC classification: G10L15/06 G10L15/00

    Abstract: The present invention relates to a method for generating a multilingual codebook, comprising the steps of providing a main language codebook, providing at least one additional codebook corresponding to a language different from the main language, and generating a multilingual codebook from the main language codebook and the at least one additional codebook by adding a sub-set of the code vectors of the at least one additional codebook to the main codebook based on distances between the code vectors of the at least one additional codebook and the code vectors of the main language codebook.
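
A minimal sketch of one way such a distance-based selection could work. The abstract says only that the sub-set is chosen "based on distances". The Euclidean metric, the fixed threshold, and the rule "add a vector only if it is far from every main-language vector" are assumptions made for illustration, as are all names and values.

```python
import math

def build_multilingual_codebook(main, additional, threshold):
    """Return the main codebook plus those additional code vectors whose
    distance to the nearest main-language vector exceeds the threshold."""
    multilingual = list(main)
    for vec in additional:
        nearest = min(math.dist(vec, m) for m in main)
        if nearest > threshold:
            # Not well covered by the main language, so keep it.
            multilingual.append(vec)
    return multilingual

# Toy two-dimensional code vectors (hypothetical values).
main = [(0.0, 0.0), (1.0, 1.0)]
additional = [(0.1, 0.0), (3.0, 3.0), (1.0, 1.2)]

result = build_multilingual_codebook(main, additional, threshold=0.5)
print(result)
```

Under this rule the main-language vectors are never altered, and the codebook grows only by vectors that the main language does not already cover.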

    Exploitation of language identification of media file data in speech dialog systems
    4.
    Invention publication
    Exploitation of language identification of media file data in speech dialog systems (in force)
    Nutzung von Sprachidentifizierung von Mediendateidaten in Sprachdialogsystemen

    Publication number: EP1909263A1

    Publication date: 2008-04-09

    Application number: EP06020732.1

    Filing date: 2006-10-02

    IPC classification: G10L15/22 G10L15/26 G06F17/30

    Abstract: The present invention relates to a method for outputting a synthesized speech signal corresponding to an orthographic string stored in a media file comprising audio data, comprising the steps of analyzing the audio data to determine at least one candidate for a language of the orthographic string, estimating a phonetic representation of the orthographic string based on the determined at least one candidate for a language and synthesizing a speech signal based on the estimated phonetic representation of the orthographic string. The invention also relates to a media player incorporating such a method for estimating a phonetic representation of song and album titles as well as artists' names for speech recognition. Furthermore, the invention relates to the choice of an appropriate speech recognizer for automatically transcribing the lyrics of songs by using audio-based language estimates.

    System for a speech-driven selection of an audio file and method therefor
    5.
    Invention publication
    System for a speech-driven selection of an audio file and method therefor (in force)
    System für sprachgesteuerte Auswahl einer Audiodatei und Verfahren dafür

    Publication number: EP1818837A1

    Publication date: 2007-08-15

    Application number: EP06002752.1

    Filing date: 2006-02-10

    IPC classification: G06F17/30 G10L11/00

    Abstract: The present invention relates to a method for detecting a refrain in an audio file, the audio file comprising vocal components, with the following steps:
    - generating a phonetic transcription of a major part of the audio file,
    - analysing the phonetic transcription and identifying a vocal segment in the generated phonetic transcription which is repeated frequently, the identified frequently repeated vocal segment representing the refrain.

    Furthermore, it relates to the speech-driven selection based on similarity of detected refrain and user input.
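
The refrain-detection idea can be sketched as a search for the most frequently repeated phoneme segment. This is a simplified illustration only: it assumes the phonetic transcription is already available as a list of phoneme symbols, that the segment length is known in advance, and that repetitions match exactly. A real recognizer output is noisy and would need approximate matching. The function name and the toy transcription are hypothetical.

```python
from collections import Counter

def most_repeated_segment(phonemes, length):
    """Return the phoneme segment of the given length that occurs most
    often in the transcription, together with its occurrence count."""
    counts = Counter(
        tuple(phonemes[i:i + length])
        for i in range(len(phonemes) - length + 1)
    )
    if not counts:
        return None, 0
    return counts.most_common(1)[0]

# Toy transcription: refrain, verse, refrain, verse, refrain.
refrain = "l a l a l u v j u".split()
transcription = (refrain + "b e b i".split()
                 + refrain + "o j e".split()
                 + refrain)

segment, count = most_repeated_segment(transcription, len(refrain))
print(" ".join(segment), count)
```

A speech-driven selection could then compare the phonetic transcription of the user's utterance against the detected segment of each file and pick the most similar one.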

    Speech recognition
    6.
    Invention grant
    Speech recognition (in force)

    Publication number: EP2161718B1

    Publication date: 2011-08-31

    Application number: EP08015561.7

    Filing date: 2008-09-03

    IPC classification: G10L15/06 G10L15/02

    Abstract: The present invention relates to a method for speech recognition of a speech signal comprising the steps of providing at least one codebook comprising codebook entries, in particular, multivariate Gaussians of feature vectors, that are frequency weighted such that higher weights are assigned to entries corresponding to frequencies below a predetermined level than to entries corresponding to frequencies above the predetermined level, and processing the speech signal for speech recognition comprising extracting at least one feature vector from the speech signal and matching the feature vector with the entries of the codebook.
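
One loose reading of the frequency weighting is sketched below. The abstract does not say how the weights enter the matching, so everything here is an assumption: each feature dimension is taken to correspond to one frequency band, bands below a cutoff get a larger weight, and matching uses a weighted squared Euclidean distance to the entry means. All names and numbers are hypothetical.

```python
def band_weights(band_centers_hz, cutoff_hz, low_weight=2.0, high_weight=1.0):
    """Assign a higher weight to bands below the cutoff frequency."""
    return [low_weight if f < cutoff_hz else high_weight
            for f in band_centers_hz]

def weighted_sq_distance(x, mean, weights):
    """Squared Euclidean distance with one weight per feature dimension."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, mean))

def best_match(feature, codebook, weights):
    """Index of the codebook entry closest to the feature vector."""
    return min(range(len(codebook)),
               key=lambda i: weighted_sq_distance(feature, codebook[i], weights))

bands = [250.0, 1000.0, 3000.0, 6000.0]      # band centers in Hz
weights = band_weights(bands, cutoff_hz=2000.0)

codebook = [[1.0, 1.0, 0.0, 0.0],            # energy in the low bands
            [0.0, 0.0, 1.0, 1.0]]            # energy in the high bands
feature = [0.6, 0.6, 0.6, 0.6]

index = best_match(feature, codebook, weights)
print(index)
```

In this toy case the feature vector is about equally far from both entries without weighting. With the low bands weighted more heavily, the match goes to the entry that agrees better in the low-frequency dimensions.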

    Generation of multilingual codebooks for speech recognition
    7.
    Invention grant
    Generation of multilingual codebooks for speech recognition (in force)

    Publication number: EP2107554B1

    Publication date: 2011-08-10

    Application number: EP08006690.5

    Filing date: 2008-04-01

    IPC classification: G10L15/06 G10L15/00

    Abstract: The present invention relates to a method for generating a multilingual codebook, comprising the steps of providing a main language codebook, providing at least one additional codebook corresponding to a language different from the main language, and generating a multilingual codebook from the main language codebook and the at least one additional codebook by adding a sub-set of the code vectors of the at least one additional codebook to the main codebook based on distances between the code vectors of the at least one additional codebook and the code vectors of the main language codebook.

    Speech recognition
    8.
    Invention publication
    Speech recognition (in force)
    Spracherkennung

    Publication number: EP2161718A1

    Publication date: 2010-03-10

    Application number: EP08015561.7

    Filing date: 2008-09-03

    IPC classification: G10L15/06 G10L15/02

    Abstract: The present invention relates to a method for speech recognition of a speech signal comprising the steps of providing at least one codebook comprising codebook entries, in particular, multivariate Gaussians of feature vectors, that are frequency weighted such that higher weights are assigned to entries corresponding to frequencies below a predetermined level than to entries corresponding to frequencies above the predetermined level, and processing the speech signal for speech recognition comprising extracting at least one feature vector from the speech signal and matching the feature vector with the entries of the codebook.
