Method and apparatus for processing speech information using a phoneme
environment
    1.
    Granted invention patent
    Method and apparatus for processing speech information using a phoneme environment (expired)

    Publication No.: US5845047A

    Publication date: 1998-12-01

    Application No.: US406487

    Filing date: 1995-03-20

    CPC classification: G10L15/02 G10L13/04

    Abstract: A speech information processing apparatus includes a statistical processing unit and a pitch pattern forming unit. The statistical processing unit extracts features by statistically processing two inputs: a feature file, formed by extracting speech features from a speech file (such as the fundamental frequency and its variations, and the power and its variations), and a label file that takes the phoneme environment into account (accent type, number of moras, mora position, phonemes, and the like). The pitch pattern forming unit forms a pitch pattern that reflects the phoneme environment, based on the result of the statistical processing.
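As a rough illustration of the statistical processing described in this abstract, the sketch below groups fundamental-frequency measurements by phoneme environment and uses the per-group mean to form a pitch pattern. The record layout, the function name `pitch_statistics`, and the sample values are hypothetical; the patent may use different statistics and features.

```python
from collections import defaultdict
from statistics import mean, pstdev

def pitch_statistics(records):
    """Group F0 values by phoneme environment; return (mean, std) per group."""
    groups = defaultdict(list)
    for env, f0_hz in records:
        groups[env].append(f0_hz)
    return {env: (mean(vals), pstdev(vals)) for env, vals in groups.items()}

# Hypothetical label/feature pairs.
# env = (accent_type, mora_count, mora_position, phoneme)
records = [
    ((1, 3, 1, "a"), 120.0),
    ((1, 3, 1, "a"), 130.0),
    ((1, 3, 2, "i"), 110.0),
]
stats = pitch_statistics(records)

# Form a pitch pattern for a target sequence of environments from the means.
target = [(1, 3, 1, "a"), (1, 3, 2, "i")]
pattern = [stats[env][0] for env in target]
```

Keying the statistics on the whole environment tuple is one simple way to let accent type and mora position influence the generated pitch; a real system would also need a fallback for environments that never occur in the training data.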


    Document inputting method and apparatus and speech outputting apparatus
    2.
    Granted invention patent
    Document inputting method and apparatus and speech outputting apparatus (expired)

    Publication No.: US5809467A

    Publication date: 1998-09-15

    Application No.: US923939

    Filing date: 1997-09-05

    CPC classification: G10L13/08 G10L13/04

    Abstract: A document inputting apparatus or speech outputting apparatus inputs and displays document data, and specifies accent information, pronunciation information, and syllable-length information for words or characters of the document data. The apparatus displays the document data in accordance with the specified information so that information such as accent positions or accent intensities can be recognized. The document data thus formed is stored in a memory together with the accent information, the pronunciation information, or the syllable-length information. When the document data is read from the memory and output as speech, the specified information is referred to during speech synthesis, so that the output speech has the correct pronunciation.


    Speech synthesis apparatus and method for causing a computer to perform
speech synthesis by calculating product of parameters for a speech
waveform and a read waveform generation matrix
    3.
    Granted invention patent
    Speech synthesis apparatus and method for causing a computer to perform speech synthesis by calculating product of parameters for a speech waveform and a read waveform generation matrix (expired)

    Publication No.: US5745651A

    Publication date: 1998-04-28

    Application No.: US452545

    Filing date: 1995-05-30

    CPC classification: G10L13/033 G10L13/08

    Abstract: A speech synthesis method and a speech synthesis apparatus include a system for synthesis by rule that prevents the quality of synthesized speech from deteriorating and reduces the number of calculations required to generate a speech waveform. The speech synthesis apparatus includes a character series input section for inputting a character series as phonetic text; a pitch waveform generator for generating a pitch waveform by calculating a product of a matrix, which has been acquired for each pitch, and the character series input by the character series input section; and a device for connecting the pitch waveforms generated by the pitch waveform generator to provide a speech waveform. This calculation method greatly reduces the number of calculations required to generate a pitch waveform. In addition, in the calculation for generating a pitch waveform, a function that determines a frequency response is employed to convert a spectral envelope obtained from a parameter, so that the timbre of synthesized speech can be changed without parameter operations.
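The core idea of generating each pitch waveform as a matrix product can be sketched as follows. This follows the patent title in multiplying a per-pitch matrix by a parameter vector; the matrices, the parameter values, and the name `pitch_waveform` are made up for illustration and are not taken from the patent.

```python
def pitch_waveform(matrix, params):
    """Return the matrix-vector product: one output sample per matrix row."""
    if any(len(row) != len(params) for row in matrix):
        raise ValueError("each matrix row must match the parameter count")
    return [sum(a * p for a, p in zip(row, params)) for row in matrix]

# Hypothetical precomputed matrices, keyed by pitch period in samples.
matrices = {
    2: [[1.0, 0.0], [0.0, 1.0]],
    3: [[1.0, 0.5], [0.5, 1.0], [0.0, 0.5]],
}
params = [2.0, 4.0]

# Generate one pitch waveform per period and connect them end to end.
speech = []
for period in (2, 3):
    speech.extend(pitch_waveform(matrices[period], params))
```

Because the matrix for a given pitch is computed once and reused, each pitch waveform costs only one matrix-vector product, which is the kind of saving the abstract refers to.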


    Speech synthesis apparatus and method for synthesizing speech from a
character series comprising a text and pitch information
    5.
    Granted invention patent
    Speech synthesis apparatus and method for synthesizing speech from a character series comprising a text and pitch information (expired)

    Publication No.: US5745650A

    Publication date: 1998-04-28

    Application No.: US448982

    Filing date: 1995-05-24

    CPC classification: G10L13/10 G10L13/04 G10L25/93

    Abstract: A speech synthesis method and apparatus for synthesizing speech from a character series comprising a text and pitch information. The apparatus includes a parameter generator for generating power spectrum envelopes as parameters of a speech waveform to be synthesized representing the input text in accordance with the input character series. The apparatus also includes a pitch waveform generator for generating pitch waveforms whose period equals the pitch specified by the pitch information. The pitch waveform generator generates the pitch waveforms from the input pitch information and the power spectrum envelopes generated by the parameter generator. Also provided is a speech waveform output device for outputting the speech waveform obtained by connecting the generated pitch waveforms.


    Syllable-beat-point synchronized rule-based speech synthesis from coded
utterance-speed-independent phoneme combination parameters
    6.
    Granted invention patent
    Syllable-beat-point synchronized rule-based speech synthesis from coded utterance-speed-independent phoneme combination parameters (expired)

    Publication No.: US5682502A

    Publication date: 1997-10-28

    Application No.: US490140

    Filing date: 1995-06-14

    CPC classification: G10L13/06 G10L21/04

    Abstract: In a speech synthesizer, each frame used to generate a speech waveform has an expansion degree that specifies how far the frame is expanded or compressed in accordance with the production speed of the synthetic speech. The time interval between beat synchronization points is determined from the set speech production speed, and the time length of each frame between the beat synchronization points is determined from the expansion degree of that frame. Parameters for producing the speech waveform in each frame are generated according to the time length determined for the frame. In a speech synthesizer that outputs a speech signal by coupling phonemes constituted by one or more frames having phoneme vowel-consonant combination parameters (VcV, cV, or V) of the speech waveform, the number of frames can thus be held constant regardless of a change in the speech production speed. This prevents degradation in tone quality and variation in the amount of processing that would otherwise result from a change in the speech production speed.
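One way to picture the frame-length rule in this abstract is the sketch below, which keeps the number of frames fixed and spreads the difference between the beat interval and the nominal frame lengths across frames in proportion to their expansion degrees. The proportional rule, the function name `frame_lengths`, and all numbers are assumptions made for illustration; the patent does not necessarily use this formula.

```python
def frame_lengths(beat_interval_ms, base_ms, expansion):
    """Distribute the slack of one beat interval across frames by expansion degree."""
    slack = beat_interval_ms - sum(base_ms)
    total = sum(expansion)
    if total == 0:
        raise ValueError("at least one frame must have a nonzero expansion degree")
    return [b + slack * e / total for b, e in zip(base_ms, expansion)]

base_ms = [10.0, 10.0, 10.0]   # hypothetical nominal frame lengths
expansion = [0.0, 1.0, 3.0]    # hypothetical expansion degrees (0 = rigid frame)

slow = frame_lengths(50.0, base_ms, expansion)  # longer beat interval: slower speech
fast = frame_lengths(26.0, base_ms, expansion)  # shorter beat interval: faster speech
```

In both cases the result has three frames whose lengths sum to the beat interval, so a change in speed changes frame durations rather than the frame count.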


    Speech recognition method
    7.
    Granted invention patent
    Speech recognition method (expired)

    Publication No.: US5787396A

    Publication date: 1998-07-28

    Application No.: US529436

    Filing date: 1995-09-18

    CPC classification: G10L15/144 G10L15/187

    Abstract: A speech recognition method uses continuous mixture Hidden Markov Models (HMMs) for probability processing, including a first type of HMM having a small number of mixtures and a second type of HMM having a larger number of mixtures. First output probabilities are formed for the input speech using the first type of HMM. Second output probabilities are formed for the input speech using the second type of HMM, for selected states corresponding to the highest output probabilities of the first type of HMM. The input speech is recognized from both the first and second output probabilities.
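The two-pass scoring in this abstract can be sketched with scalar Gaussian mixtures: every state is scored with a cheap coarse mixture, and only the best-scoring states are rescored with a larger mixture. The state names, mixture parameters, `top_k` selection, and the way the two score sets are merged at the end are all illustrative assumptions, not details from the patent.

```python
import math

def gmm_likelihood(x, mixture):
    """Likelihood of scalar x under a list of (weight, mean, variance) Gaussians."""
    return sum(
        w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in mixture
    )

def two_pass_probabilities(x, coarse, fine, top_k):
    """Score all states with the coarse model, then rescore the top_k with the fine model."""
    first = {s: gmm_likelihood(x, mix) for s, mix in coarse.items()}
    selected = sorted(first, key=first.get, reverse=True)[:top_k]
    second = {s: gmm_likelihood(x, fine[s]) for s in selected}
    return first, second

# Hypothetical models: one mixture component per state (coarse), two (fine).
coarse = {
    "s1": [(1.0, 0.0, 1.0)],
    "s2": [(1.0, 1.0, 1.0)],
    "s3": [(1.0, 5.0, 1.0)],
}
fine = {
    "s1": [(0.5, -0.2, 0.5), (0.5, 0.2, 0.5)],
    "s2": [(0.5, 0.8, 0.5), (0.5, 1.2, 0.5)],
    "s3": [(0.5, 4.8, 0.5), (0.5, 5.2, 0.5)],
}
first, second = two_pass_probabilities(0.2, coarse, fine, top_k=2)

# Simplified merge: use the fine score where available, else the coarse score.
combined = {**first, **second}
best = max(combined, key=combined.get)
```

The expensive mixtures are evaluated for only `top_k` states per observation, while the remaining states still receive a coarse score, so both sets of probabilities contribute to the final decision.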


    Method and apparatus for recognizing previously unrecognized speech by
requesting a predicted-category-related domain-dictionary-linking word
    8.
    Granted invention patent
    Method and apparatus for recognizing previously unrecognized speech by requesting a predicted-category-related domain-dictionary-linking word (expired)

    Publication No.: US5797116A

    Publication date: 1998-08-18

    Application No.: US785840

    Filing date: 1997-01-21

    Abstract: A voice communication method includes the steps of: inputting speech into an apparatus; recognizing the input speech using a first dictionary; predicting the category of an unrecognized word included in the input speech, based on the recognition result of the recognition step; outputting, based on the predicted category, a question requesting the operator to input a word that is included in the first dictionary and that can specify a second dictionary for recognizing the unrecognized word; and re-recognizing the unrecognized word with the second dictionary specified in response to the word input by the operator. The invention also relates to an apparatus performing these functions and to a computer program product instructing a computer to perform these functions.


    Recognizing speech data using a state transition model
    9.
    Granted invention patent
    Recognizing speech data using a state transition model (expired)

    Publication No.: US06662159B2

    Publication date: 2003-12-09

    Application No.: US08739013

    Filing date: 1996-10-28

    IPC classification: G10L15/14

    CPC classification: G10L15/142 G10L2015/085

    Abstract: The search space and memory capacity needed to detect an unknown word in input speech data are reduced. For this purpose, an HMM data memory stores data describing a state transition model for the unknown word, defined by a number of states and the transition probabilities between the states. An output probability calculation unit acquires, at each time point of the speech data, the maximum-likelihood state among the plural states of the state transition model used for recognizing known words. The obtained result is applied to the state transition model for the unknown word, stored in the HMM data memory, to obtain the likelihood of the state transition model of the unknown word. A different output probability calculation unit determines the likelihood of the state transition model for the known word. A language search unit then performs the language search process, using the likelihoods determined by these two output probability calculation units, in the portions where the dictionary permits the presence of an unknown word.


    Speech recognition apparatus and method and a computer usable medium for
selecting an application in accordance with the viewpoint of a user
    10.
    Granted invention patent
    Speech recognition apparatus and method and a computer usable medium for selecting an application in accordance with the viewpoint of a user (expired)

    Publication No.: US6076061A

    Publication date: 2000-06-13

    Application No.: US524949

    Filing date: 1995-09-08

    CPC classification: G06F3/167 G06F3/165 G10L15/24

    Abstract: A viewpoint of a user is detected in a viewpoint detecting process, and how long the detected viewpoint has stayed in an area is determined. The obtained viewpoint and its trace are displayed on a display unit. In a recognition information controlling process, the relationship between the viewpoint (in an area) and/or its movement and the recognition information (words, sentences, grammar, etc.) is obtained as a weight P(). When the user pronounces a word (or sentence), the speech is input and A/D converted via a speech input unit. Next, in a speech recognition process, a speech recognition probability PS() is obtained. Finally, speech recognition is performed on the basis of the product of the weight P() and the speech recognition probability PS(). Accordingly, classes of the recognition information are controlled in accordance with the movement of the user's viewpoint, thereby improving the speech recognition probability and the speed of recognition.
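The final decision rule in this abstract, the product of the gaze-derived weight P() and the speech recognition probability PS(), can be illustrated with a short sketch. The vocabulary, the numeric values, and the function name `recognize` are hypothetical; how P() is actually derived from the viewpoint is not modeled here.

```python
def recognize(gaze_weight, speech_prob):
    """Pick the candidate maximizing P(word) * PS(word); unseen words get weight 0."""
    scores = {w: gaze_weight.get(w, 0.0) * p for w, p in speech_prob.items()}
    return max(scores, key=scores.get), scores

# Hypothetical values: the user is looking at the area associated with "open".
gaze_weight = {"open": 0.7, "copy": 0.2, "close": 0.1}     # P()
speech_prob = {"open": 0.40, "copy": 0.45, "close": 0.15}  # PS()

best, scores = recognize(gaze_weight, speech_prob)
```

Here the acoustic score alone slightly favors "copy", but the gaze weight shifts the decision to "open", which is the kind of disambiguation the abstract describes.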
