A VOCODER-BASED VOICE RECOGNIZER
    91.
    Published invention application
    A VOCODER-BASED VOICE RECOGNIZER (status: lapsed)

    Publication No.: EP1046154A1

    Publication date: 2000-10-25

    Application No.: EP98933871.0

    Filing date: 1998-07-22

    IPC classification: G10L5/04

    Abstract: A vocoder-based voice recognizer recognizes a spoken word using linear prediction coding based vocoder data without completely reconstructing the voice data. The recognizer generates at least one energy estimate per frame of the vocoder data (60) and searches for word boundaries in the vocoder data (64) using the associated energy estimates. If a word is found (66), the linear prediction coding word parameters are extracted (68) from the vocoder data associated with the word and recognition features are calculated (70) from the extracted linear prediction coding word parameters. Finally, the recognition features are matched with previously stored recognition features of other words (40), thereby recognizing the spoken word.
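
The boundary search described in this abstract can be illustrated with a minimal sketch. It assumes one energy estimate per frame is already available; the fixed `threshold` and the `min_frames` parameter are illustrative assumptions, not details taken from the patent.

```python
from typing import List, Tuple


def find_word_boundaries(
    energies: List[float],
    threshold: float,      # assumed fixed energy threshold (hypothetical)
    min_frames: int = 3,   # assumed minimum word length in frames (hypothetical)
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, for each run of
    frames whose energy estimate is at or above the threshold."""
    words = []
    start = None
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i  # a candidate word begins here
        elif start is not None:
            if i - start >= min_frames:
                words.append((start, i))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(energies) - start >= min_frames:
        words.append((start, len(energies)))
    return words


frame_energies = [0.1, 0.2, 2.5, 3.0, 2.8, 0.1, 0.1, 2.9, 0.2]
print(find_word_boundaries(frame_energies, threshold=1.0))  # [(2, 5)]
```

Only the frames inside a detected span would then be passed on for parameter extraction and feature calculation.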

    PARAMETER SHARING SPEECH RECOGNITION SYSTEM
    92.
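    Published invention application
    PARAMETER SHARING SPEECH RECOGNITION SYSTEM (status: under examination, published)

    Publication No.: EP1034533A1

    Publication date: 2000-09-13

    Application No.: EP98952239.6

    Filing date: 1998-10-09

    IPC classification: G10L5/04

    CPC classification: G10L15/142 G10L15/148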

    Abstract: A method and apparatus for a parameter sharing speech recognition system are provided. A model device (410), coupled to receive the output of a signal segmenter, hosts a shared hidden Markov model produced by generating a number of phoneme models (600-603), some of which are shared. The phoneme models (600-603) are generated by retaining as a separate phoneme model any triphone model whose number of available trained frames exceeds a pre-specified threshold. The generated phoneme models are trained, and shared phoneme model states (604-609) are generated that are shared among the phoneme models (600-603). Shared probability distribution functions (610-616) are generated that are shared among the phoneme models (600-609). Shared probability sub-distribution functions (617-627) are generated that are shared among the phoneme model probability distribution functions (610-616). The shared phoneme model hierarchy is reevaluated for further sharing in response to the shared probability sub-distribution functions.
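
The retention rule in this abstract (keep a triphone as its own model only when it has enough trained frames) can be sketched as follows. The `left-center+right` triphone notation and the fallback to a shared model named after the center phone are assumptions made for illustration; the abstract does not say how the remaining triphones are grouped.

```python
from typing import Dict


def assign_models(frame_counts: Dict[str, int], threshold: int) -> Dict[str, str]:
    """Map each triphone to the name of the model it will use."""
    assignment = {}
    for triphone, count in frame_counts.items():
        if count > threshold:
            # Enough trained frames: retain as a separate phoneme model.
            assignment[triphone] = triphone
        else:
            # Hypothetical fallback: share one model per center phone.
            center = triphone.split("-")[1].split("+")[0]
            assignment[triphone] = center
    return assignment


counts = {"k-ae+t": 500, "b-ae+t": 20, "m-ae+n": 35}
print(assign_models(counts, threshold=100))
```

With these counts, `k-ae+t` keeps its own model while the two sparse triphones share the `ae` model.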

    VERFAHREN ZUR BESTIMMUNG EINES REPRÄSENTANTEN FÜR EINEN SPRACHBAUSTEIN EINER SPRACHE AUS EINEM LAUTABSCHNITTE UMFASSENDEN SPRACHSIGNAL
    94.
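    Published invention application
    VERFAHREN ZUR BESTIMMUNG EINES REPRÄSENTANTEN FÜR EINEN SPRACHBAUSTEIN EINER SPRACHE AUS EINEM LAUTABSCHNITTE UMFASSENDEN SPRACHSIGNAL (status: lapsed)
    Method for determining a representative of a speech unit of a language from a speech signal comprising sound segments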

    Publication No.: EP1005694A1

    Publication date: 2000-06-07

    Application No.: EP98948677.4

    Filing date: 1998-07-27

    Inventor: HOLZAPFEL, Martin

    IPC classification: G10L5/04

    CPC classification: G10L13/06

    Abstract: After a voice signal is segmented into individual speech units, the units that represent a given sound block are assembled into a group. The multiple speech units in a group describe the sound block distinctively well. Different selection criteria are provided to evaluate the usability of individual speech units. One advantage of combining the selection criteria is that different criteria can be taken into account when selecting a representative speech unit. Each selection criterion includes a membership function which indicates the 'usability' of an individual speech unit as a representative of the group. Preferably, the speech unit that attains the maximum among the speech units of the group, according to the selection criteria as indicated by the membership functions, is selected as the representative of the corresponding sound block.
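
As a rough sketch of the selection step in this abstract, each criterion below is a membership function returning a usability value in [0, 1], and the unit with the highest combined value is chosen. Combining the criteria by multiplication, and the two example criteria (`duration_fit`, `energy_fit`) with their feature names, are assumptions for illustration only.

```python
from typing import Callable, Dict, Sequence

Unit = Dict[str, float]
Membership = Callable[[Unit], float]


def select_representative(units: Sequence[Unit], criteria: Sequence[Membership]) -> int:
    """Return the index of the unit with the highest combined membership."""
    def score(unit: Unit) -> float:
        total = 1.0
        for membership in criteria:
            total *= membership(unit)  # assumed combination rule: product
        return total

    return max(range(len(units)), key=lambda i: score(units[i]))


def duration_fit(unit: Unit) -> float:
    # Hypothetical criterion: closeness to an 80 ms target duration.
    return max(0.0, 1.0 - abs(unit["duration_ms"] - 80.0) / 80.0)


def energy_fit(unit: Unit) -> float:
    # Hypothetical criterion: normalized energy, capped at 1.
    return min(1.0, unit["energy"])


group = [
    {"duration_ms": 40.0, "energy": 0.9},
    {"duration_ms": 80.0, "energy": 0.8},
    {"duration_ms": 120.0, "energy": 1.0},
]
print(select_representative(group, [duration_fit, energy_fit]))  # 1
```

Other combination rules, such as taking the minimum across criteria, would fit the same interface.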

    IMPROVEMENTS IN, OR RELATING TO, SPEECH-TO-SPEECH CONVERSION
    95.
    Published invention application
    IMPROVEMENTS IN, OR RELATING TO, SPEECH-TO-SPEECH CONVERSION (status: lapsed)

    Publication No.: EP0976026A1

    Publication date: 2000-02-02

    Application No.: EP97919841.3

    Filing date: 1997-04-08

    Applicant: TELIA AB

    Inventor: LYBERG, Bertil

    IPC classification: G06F3/16 G10L5/04

    Abstract: A system and method for speech-to-speech conversion provide spoken responses to speech inputs in at least two natural languages, wherein the speech inputs are recognised and interpreted in said at least two languages. The recognised speech inputs are evaluated to determine the language of the speech inputs, and a dialogue is undertaken with a database containing speech information data, in said at least two natural languages, to obtain data for the formulation of spoken responses to the speech inputs. The speech information data obtained from the database is then converted into spoken responses which exhibit the language characteristics of the respective speech inputs.

    Pitch marks management for speech synthesis
    96.
    Published invention application
    Pitch marks management for speech synthesis (status: in force)
    Verwaltung der Grundfrequenzmarkierungen für Sprachsynthese

    Publication No.: EP0942408A2

    Publication date: 1999-09-15

    Application No.: EP99301669.0

    Filing date: 1999-03-05

    IPC classification: G10L5/04

    Abstract: The distance between the first two pitch marks of a voiced portion of speech data to be processed is calculated. The difference between the adjacent inter-pitch-mark distances is calculated. The respective calculation results are stored and managed in a file.
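
The two calculations in this abstract amount to a compact difference encoding of the pitch marks. The sketch below is one possible reading: it keeps the first inter-pitch-mark distance and the differences between adjacent distances. Storing the position of the first pitch mark (`start`) so the marks can be rebuilt is an added assumption, and writing the result to a file is omitted.

```python
from typing import List, Tuple


def encode_pitch_marks(marks: List[int]) -> Tuple[int, int, List[int]]:
    """Return (start, first_distance, deltas) for one voiced portion."""
    if len(marks) < 2:
        raise ValueError("need at least two pitch marks")
    distances = [b - a for a, b in zip(marks, marks[1:])]
    # Differences between adjacent inter-pitch-mark distances.
    deltas = [b - a for a, b in zip(distances, distances[1:])]
    return marks[0], distances[0], deltas


def decode_pitch_marks(start: int, first_distance: int, deltas: List[int]) -> List[int]:
    """Rebuild the pitch mark positions from the encoded values."""
    marks = [start, start + first_distance]
    distance = first_distance
    for delta in deltas:
        distance += delta
        marks.append(marks[-1] + distance)
    return marks


pitch_marks = [100, 180, 262, 343, 425]  # sample positions, illustrative values
encoded = encode_pitch_marks(pitch_marks)
print(encoded)  # (100, 80, [2, -1, 1])
```

Because the pitch period changes slowly within a voiced portion, the stored differences tend to be small numbers.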

    EP0932896A4 -
    97.
    Published invention application
    EP0932896A4 - (status: lapsed)

    Publication No.: EP0932896A4

    Publication date: 1999-09-08

    Application No.: EP97946261

    Filing date: 1997-10-15

    Abstract: A method (500, 600), device (201 and 206) and system (203) provide, in response to text/linguistic information, efficient generation of a parametric representation of speech. A coder parameter generating system provides a principal set and a supplementary set of speech parameters, the principal set of speech parameters being the parametric representation of speech. Feedback is then provided to the coder parameter generating system using the supplementary set of speech parameters to modify the principal set of speech parameters.

    Prosodic databases holding fundamental frequency templates for use in speech synthesis
    100.
    Published invention application
    Prosodic databases holding fundamental frequency templates for use in speech synthesis (status: lapsed)

    Publication No.: EP0833304A3

    Publication date: 1999-03-24

    Application No.: EP97114208.8

    Filing date: 1997-08-18

    IPC classification: G10L5/04

    Abstract: Prosodic databases hold fundamental frequency templates for use in a speech synthesis system. Prosodic database templates may hold fundamental frequency values for syllables in a given sentence. These fundamental frequency values may be applied in synthesizing a sentence of speech. The templates are indexed by tonal pattern markings. A predicted tonal marking pattern is generated for each sentence of text that is to be synthesized, and this predicted pattern of tonal markings is used to locate a best matching template. The templates are derived by calculating fundamental frequencies on a per-syllable basis for sentences that are spoken by a human trainer for a given unlabeled corpus.
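
The template lookup in this abstract can be sketched as a nearest-match search over tonal marking patterns. The marking symbols (`H*`, `L*`, `-`), the mismatch count used as the distance, and the sample fundamental frequency values are all illustrative assumptions; the abstract does not specify how the best match is scored.

```python
from typing import List, Sequence, Tuple

# A template pairs per-syllable tonal markings with per-syllable F0 values (Hz).
Template = Tuple[Tuple[str, ...], Tuple[float, ...]]


def mismatches(a: Sequence[str], b: Sequence[str]) -> int:
    """Count differing positions, plus the difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def best_template(predicted: Sequence[str], templates: List[Template]) -> Template:
    """Return the template whose markings are closest to the predicted pattern."""
    return min(templates, key=lambda t: mismatches(predicted, t[0]))


database: List[Template] = [
    (("H*", "-", "L*"), (210.0, 180.0, 150.0)),
    (("-", "H*", "-"), (170.0, 220.0, 160.0)),
]
markings, f0_values = best_template(("H*", "-", "-"), database)
print(markings, f0_values)
```

The fundamental frequency values of the selected template would then be applied, syllable by syllable, to the sentence being synthesized.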