Formant emphasis method and formant emphasis filter device
    1.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US6064962A

    Publication date: 2000-05-16

    Application No.: US713356

    Filing date: 1996-09-13

    IPC classification: G10L19/26 G10L25/15 G10L5/02

    CPC classification: G10L19/26 G10L25/15

    Abstract: In a formant emphasis method of emphasizing the formant as the spectral peak of an input speech signal and attenuating the spectral valley of the input speech signal, a spectrum emphasis filter performs processing for emphasizing the formant of the input speech signal and attenuating the valley of the input speech signal. A first-order variable characteristic filter whose characteristic adaptively changes in accordance with the characteristic of the input speech signal and a first-order fixed characteristic filter compensate for a spectral tilt included in an output signal from the spectrum emphasis filter.

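As a rough illustration of the filter structure this abstract describes, the sketch below evaluates the gain of a cascade made of a pole-zero spectrum emphasis filter followed by two first-order tilt compensation filters. The abstract gives no filter equations, so the form A(z/beta)/A(z/alpha), the first-order sections (1 - c z^-1), the function names, and every coefficient value are assumptions chosen for illustration, not the patented design.

```python
import cmath
import math

def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): the k-th coefficient is scaled by gamma**k."""
    return [c * gamma ** k for k, c in enumerate(a)]

def response(coeffs, omega):
    """Evaluate sum_k coeffs[k] * exp(-j*omega*k), a polynomial in z^-1 on the unit circle."""
    return sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(coeffs))

def emphasis_gain(a, omega, alpha=0.8, beta=0.5, variable_coeff=0.15, fixed_coeff=0.1):
    """Gain of the assumed cascade at frequency omega (radians per sample).

    Assumed structure (not taken from the patent text):
      spectrum emphasis:  A(z/beta) / A(z/alpha), with 0 < beta < alpha < 1
      tilt compensation:  (1 - variable_coeff z^-1) * (1 - fixed_coeff z^-1)
    """
    emphasis = (response(bandwidth_expand(a, beta), omega)
                / response(bandwidth_expand(a, alpha), omega))
    tilt = (response([1.0, -variable_coeff], omega)
            * response([1.0, -fixed_coeff], omega))
    return abs(emphasis * tilt)

# Hypothetical LPC polynomial A(z) with a single formant: pole radius 0.95, angle pi/4.
theta = math.pi / 4
a = [1.0, -2 * 0.95 * math.cos(theta), 0.95 ** 2]

print("gain at the formant:", emphasis_gain(a, theta))
print("gain at Nyquist:    ", emphasis_gain(a, math.pi))
```

In the method as described, the variable filter's characteristic would be updated from the input speech signal; here it is reduced to a plain parameter so that the effect of the two first-order sections on the overall tilt can be inspected.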

    Musical sound synthesizer and storage medium therefor
    2.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5998725A

    Publication date: 1999-12-07

    Application No.: US902424

    Filing date: 1997-07-29

    Applicant: Shinichi Ohta

    Inventor: Shinichi Ohta

    IPC classification: G10L5/00 G10L5/02 G10L5/04

    Abstract: A musical sound synthesizer generates a predetermined singing sound based on performance data. A compression device determines whether each of a plurality of phonemes forming the predetermined singing sound is a first phoneme to be sounded in accordance with a note-on signal indicative of a note-on of the performance data, and compresses a rise time of the first phoneme when the first phoneme is sounded in accordance with occurrence of the note-on signal of the performance data.


Method and apparatus for continuous spelling speech recognition with early identification
    3.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5995928A

    Publication date: 1999-11-30

    Application No.: US720554

    Filing date: 1996-10-02

    IPC classification: G10L15/18 G10L5/02

    Abstract: A speech recognition system capable of recognizing a word or a plurality of words based on a continuous spelling of the word(s) by a user. The system includes a speech recognition engine with a decoder running in forward mode such that the recognition engine continuously outputs an updated string of hypothesized letters based on the letters uttered by the user. The system further includes a spelling engine for comparing each string of hypothesized letters to a vocabulary list of words. The spelling engine returns a best match for the string of hypothesized letters. The system may also include an early identification unit for presenting the user with the best matching word(s) possibly before the user has completed spelling the desired word(s).

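The early identification idea can be sketched as a loop that compares the growing letter string against the vocabulary after every letter and stops once a single word remains. This is a simplified illustration only: it assumes error-free letter hypotheses and exact prefix matching, whereas the abstract speaks of a "best match"; the function names and the sample vocabulary are hypothetical.

```python
def matching_words(hypothesis, vocabulary):
    """Vocabulary words that begin with the letters hypothesized so far."""
    prefix = hypothesis.lower()
    return sorted(w for w in vocabulary if w.lower().startswith(prefix))

def early_identify(letters, vocabulary):
    """Consume letters one at a time and stop as soon as exactly one word matches.

    Returns (word, letters_used); word is None if no unique match was found.
    """
    hypothesis = ""
    for letter in letters:
        hypothesis += letter
        candidates = matching_words(hypothesis, vocabulary)
        if len(candidates) == 1:
            return candidates[0], len(hypothesis)
    return None, len(hypothesis)

# Hypothetical vocabulary list.
vocabulary = ["boston", "berlin", "bombay"]
word, used = early_identify("boston", vocabulary)
print(word, used)
```

A practical system would have to tolerate misrecognized letters, for example by scoring candidates with an edit distance rather than requiring an exact prefix.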

    Synthesising speech by converting phonemes to digital waveforms
    4.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5987412A

    Publication date: 1999-11-16

    Application No.: US796818

    Filing date: 1997-02-06

    Applicant: Andrew Paul Breen

    Inventor: Andrew Paul Breen

    IPC classification: G10L13/06 G10L13/08 G10L5/02

    CPC classification: G10L13/07 G10L13/04

    Abstract: Synthetic speech is generated by production of a digital waveform from a text in phonemes. A linked database is used which comprises an extended text in phonemes and its equivalent in the form of a digital waveform. The two portions of the database are linked by a parameter which establishes equivalent points in both the phoneme text and the digital waveform. The input text (in phonemes) is analyzed to locate a matching portion in the phoneme portion of the database. This matching utilizes exact equivalence of phonemes where this is possible; otherwise relation between phonemes is utilized. The selection process identifies input phonemes in context whereby improved conversions are obtained. Having analyzed the input text into matching strings in the input form of the database, beginning and ending parameters for the sections are established. The output text is produced by abutting sections of the digital waveform and defined by the beginning and ending parameters.


Signal extraction system, system and method for speech restoration, learning method for neural network model, constructing method of neural network model, and signal processing system
    5.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5960391A

    Publication date: 1999-09-28

    Application No.: US766633

    Filing date: 1996-12-13

    CPC classification: G10L21/0272 G10L25/30

    Abstract: A signal extraction system for extracting one or more signal components from an input signal including a plurality of signal components. This system is equipped with a neural network arithmetic section designed to process information through the use of a recurrent neural network. The neural network arithmetic section extracts one or more signal components, for example, a speech signal component and a noise signal component, from an input signal including a plurality of signal components such as speech and noise, and outputs the extracted signal components. Owing to the presence of this neural network arithmetic section, signal extraction becomes possible with high accuracy.


Voice recording apparatus capable of displaying remaining recording capacity of memory according to encoding bit rates
    6.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5950164A

    Publication date: 1999-09-07

    Application No.: US721937

    Filing date: 1996-09-27

    Abstract: A voice recording apparatus includes an encoding unit capable of encoding an input voice signal at different encoding bit rates. A system controller records the input voice signal encoded by the encoding unit on a memory and acquires information on at least one of a used recordable capacity and remaining recordable capacity of the memory for any of the encoding bit rates. A display unit displays the information in a single way or a plurality of different ways of representation.

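The relationship between remaining memory and remaining recording time at each bit rate is simple arithmetic, sketched below. The function names, the example memory size, and the example bit rates are hypothetical; the abstract does not say how the apparatus computes or formats these values.

```python
def remaining_seconds(remaining_bytes, bit_rate_bps):
    """Recording time left, in seconds, at a given encoding bit rate (bits per second)."""
    return remaining_bytes * 8 / bit_rate_bps

def remaining_by_rate(remaining_bytes, bit_rates_bps):
    """Remaining recording time for each supported bit rate."""
    return {rate: remaining_seconds(remaining_bytes, rate) for rate in bit_rates_bps}

def format_mmss(seconds):
    """One possible display representation: minutes and seconds."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

# Hypothetical example: 1,000,000 bytes free, two encoding bit rates.
for rate, secs in remaining_by_rate(1_000_000, [8000, 16000]).items():
    print(f"{rate} bit/s: {format_mmss(secs)} remaining")
```

The same free memory yields half the recording time when the bit rate doubles, which is why the remaining capacity has to be reported per bit rate.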

    Word syllabification in speech synthesis system
    7.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5949961A

    Publication date: 1999-09-07

    Application No.: US503960

    Filing date: 1995-07-19

    IPC classification: G10L5/02

    CPC classification: G10L13/08

    Abstract: The present invention relates to a system and method of word syllabification. The present invention receives a word to be syllabified and determines therefrom all possible substrings capable of forming part of the word. Sequences matching at least part of or the whole of the word are determined from the substrings together with respective probabilities of occurrence and the sequence having the greatest probability of occurrence is selected as being the most probable syllabification of the word. The most probable sequence can be determined in many different ways. For example, the sequence can be determined by commencing with the substring having the greatest probability of forming the beginning of a given word and subsequently traversing in a step-by-step manner a table comprising all possible substrings of the word and at each step selecting the next substring of the sequence according to which of the possible next substrings has the highest probability of occurrence. A further method of determining the most probable sequence would be to adopt the above step-by-step approach for all possible substrings capable of forming the beginning of the given word. Alternatively, all possible sequences of substrings capable of constituting the word can be determined together with respective probabilities of occurrence thereof and the sequence having the highest respective probability of occurrence is selected as being the most probable syllabification of the given word.

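The last variant in the abstract, scoring every sequence of substrings that spells the whole word and keeping the most probable one, can be sketched with dynamic programming. The sketch assumes each substring has an independent probability and that a sequence's probability is the product of its parts; the abstract does not state how sequence probabilities are formed, and the probability table below is invented for illustration.

```python
import math

def syllabify(word, substring_prob):
    """Most probable split of `word` into known substrings, or [] if none exists.

    best[i] holds (log-probability, pieces) for the best split of word[:i].
    """
    n = len(word)
    best = [(-math.inf, []) for _ in range(n + 1)]
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(end):
            piece = word[start:end]
            p = substring_prob.get(piece, 0.0)
            if p <= 0.0 or best[start][0] == -math.inf:
                continue  # unknown substring, or the prefix before it cannot be split
            score = best[start][0] + math.log(p)
            if score > best[end][0]:
                best[end] = (score, best[start][1] + [piece])
    return best[n][1]

# Hypothetical substring probabilities.
probs = {"syl": 0.4, "la": 0.3, "ble": 0.5, "sy": 0.1, "lla": 0.05}
print(syllabify("syllable", probs))
```

The step-by-step variants in the abstract are greedy: they pick the most probable next substring at each step, which is cheaper but not guaranteed to find the same sequence as this exhaustive search.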

    Speech recognition based on HMMs
    8.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5943647A

    Publication date: 1999-08-24

    Application No.: US869408

    Filing date: 1997-06-05

    Applicant: Jari Ranta

    Inventor: Jari Ranta

    Abstract: A speech recognition method that combines HMMs and vector quantization to model the speech signal and adds spectral derivative information in the speech parameters. Each state of an HMM is modeled by two different VQ-codebooks. One is trained by using the spectral parameters and the second is trained by using the spectral derivative parameters.

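One way to picture a state modeled by two codebooks is to score a frame by its quantization error against each codebook. The abstract does not say how the two codebooks are combined, so summing the two errors, approximating the spectral derivative by a frame-to-frame difference, and the data layout below are all assumptions made purely for illustration.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def quantization_error(vector, codebook):
    """Distance from `vector` to its nearest codeword."""
    return min(sq_dist(vector, codeword) for codeword in codebook)

def state_cost(frame, prev_frame, state):
    """Assumed cost of a frame in one HMM state.

    `state` is a hypothetical dict with two codebooks:
      state["spectral"]   - codewords for the spectral parameters
      state["derivative"] - codewords for the spectral derivative parameters
    """
    derivative = [c - p for c, p in zip(frame, prev_frame)]
    return (quantization_error(frame, state["spectral"])
            + quantization_error(derivative, state["derivative"]))

# Hypothetical two-dimensional codebooks for a single state.
state = {
    "spectral": [[0.0, 0.0], [1.0, 1.0]],
    "derivative": [[0.0, 0.0], [0.5, 0.5]],
}
print(state_cost([2.0, 1.0], [1.0, 1.0], state))
```

A full recognizer would feed such per-state costs into a Viterbi search over the HMM; that part is omitted here.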

    Method and system of runtime acoustic unit selection for speech synthesis
    9.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5913193A

    Publication date: 1999-06-15

    Application No.: US648808

    Filing date: 1996-04-30

    CPC classification: G10L13/07

    Abstract: The present invention pertains to a concatenative speech synthesis system and method which produces a more natural sounding speech. The system provides for multiple instances of each acoustic unit which can be used to generate a speech waveform representing a linguistic expression. The multiple instances are formed during an analysis or training phase of the synthesis process and are limited to a robust representation of the highest probability instances. The provision of multiple instances enables the synthesizer to select the instance which closely resembles the desired instance thereby eliminating the need to alter the stored instance to match the desired instance. This in essence minimizes the spectral distortion between the boundaries of adjacent instances thereby producing more natural sounding speech.

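Choosing one stored instance per acoustic unit so that the boundaries between adjacent instances match as closely as possible can be sketched as a shortest-path search. The abstract does not describe the selection procedure, so the squared-distance join cost, the dict layout with "start" and "end" spectral vectors, and the function names are assumptions; the sketch also assumes every unit has at least one instance.

```python
def join_cost(left, right):
    """Squared spectral distance between the end of one instance and the start of the next."""
    return sum((a - b) ** 2 for a, b in zip(left["end"], right["start"]))

def select_instances(candidates):
    """Pick one instance index per unit, minimizing the total join cost.

    `candidates` is a list with one entry per unit; each entry is a non-empty
    list of instances, and each instance is a dict with "start" and "end" vectors.
    """
    if not candidates:
        return []
    cost = [0.0] * len(candidates[0])  # best path cost ending at each instance
    back = []                          # back-pointers, one list per unit transition
    for prev, cur in zip(candidates, candidates[1:]):
        new_cost, pointers = [], []
        for inst in cur:
            options = [cost[i] + join_cost(p, inst) for i, p in enumerate(prev)]
            best_i = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[best_i])
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):  # trace the best path back to the first unit
        j = pointers[j]
        path.append(j)
    return path[::-1]

# Hypothetical one-dimensional boundary spectra for three units.
candidates = [
    [{"start": [0.0], "end": [1.0]}, {"start": [0.0], "end": [5.0]}],
    [{"start": [4.9], "end": [2.0]}, {"start": [1.1], "end": [9.0]}],
    [{"start": [2.0], "end": [0.0]}],
]
print(select_instances(candidates))
```

Because several instances are available per unit, the search can pick ones whose boundaries already fit together instead of modifying a single stored instance.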

    Method and apparatus for an improved language recognition system
    10.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US5870706A

    Publication date: 1999-02-09

    Application No.: US631874

    Filing date: 1996-04-10

    Applicant: Hiyan Alshawi

    Inventor: Hiyan Alshawi

    IPC classification: G10L15/00 G10L15/18 G10L5/02

    Abstract: Methods and apparatus for a language model and language recognition systems are disclosed. The method utilizes a plurality of probabilistic finite state machines having the ability to recognize a pair of sequences, one sequence scanned leftwards, the other scanned rightwards. Each word in the lexicon of the language model is associated with one or more such machines which model the semantic relations between the word and other words. Machine transitions create phrases from a set of word string hypotheses, and incrementally calculate costs related to the probability that such phrases represent the language to be recognized. The cascading lexical head machines utilized in the methods and apparatus capture the structural associations implicit in the hierarchical organization of a sentence, resulting in a language model and language recognition systems that combine the lexical sensitivity of N-gram models with the structural properties of dependency grammar.
