Speech recognition system
    1.
    Type: Granted invention patent
    Legal status: No longer in force

    Publication No.: US4624011A

    Publication date: 1986-11-18

    Application No.: US462042

    Filing date: 1983-01-28

    CPC classification: G10L15/00

    Abstract: An acoustic signal processing circuit extracts input speech pattern data and subsidiary feature data from an input speech signal. The input speech pattern data comprise frequency spectra, whereas the subsidiary feature data comprise phoneme and acoustic features. These data are then stored in a data buffer memory. The similarity measures between the input speech pattern data stored in the data buffer memory and reference speech pattern data stored in a dictionary memory are computed by a similarity computation circuit. When the largest similarity measure exceeds a first threshold value and the difference between the largest and the second largest similarity measures exceeds a second threshold value, a control circuit outputs the category data of the reference pattern that gives the largest similarity measure as corresponding to the input speech. When recognition cannot be performed in this way, the categories of the reference speech patterns that give the largest to mth similarity measures are each compared with the subsidiary feature data. In this manner, subsidiary feature recognition of the input speech is performed by a subsidiary feature recognition section.
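
The two-threshold decision described in this abstract can be sketched as follows. This is a minimal illustration, not the patented circuit: the function names `decide` and `top_m`, the parameter names `t1` and `t2`, and the example scores are all assumptions introduced here.

```python
from typing import List, Optional, Sequence, Tuple

Scored = Tuple[str, float]  # (category, similarity measure)


def decide(similarities: Sequence[Scored], t1: float, t2: float) -> Optional[str]:
    """Return the best category, or None when the main decision is rejected.

    Accept only if the largest similarity exceeds t1 AND its margin over
    the second largest similarity exceeds t2.
    """
    if not similarities:
        return None
    ranked = sorted(similarities, key=lambda cs: cs[1], reverse=True)
    best_cat, best = ranked[0]
    # Assumption: with a single candidate the margin test passes trivially.
    second = ranked[1][1] if len(ranked) > 1 else float("-inf")
    if best > t1 and (best - second) > t2:
        return best_cat
    return None


def top_m(similarities: Sequence[Scored], m: int) -> List[str]:
    """Categories with the largest to m-th similarity, i.e. the candidates
    that would be handed to the subsidiary feature check on rejection."""
    ranked = sorted(similarities, key=lambda cs: cs[1], reverse=True)
    return [cat for cat, _ in ranked[:m]]
```

For example, `decide([("one", 0.92), ("two", 0.90)], t1=0.8, t2=0.1)` rejects the input because the margin is too small, and `top_m(...)` then supplies the candidates for the fallback comparison against the subsidiary features.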

    Speech recognition system
    3.
    Type: Granted invention patent
    Legal status: No longer in force

    Publication No.: US4881266A

    Publication date: 1989-11-14

    Application No.: US19781

    Filing date: 1987-02-27

    CPC classification: G10L25/87 G10L15/04 G10L15/20

    Abstract: In a speech recognition system for recognizing speech uttered by non-specific speakers, the start and end points of a word or speech interval are determined by a novel preprocessor that searches the sound power level to obtain speech boundary candidates and determines the likelihoods of speech or word intervals on the basis of those candidates. Since likelihoods (probabilities) are determined for the speech interval candidates, the similarity between the feature parameters of a speech signal and the reference pattern set is calculated only for the higher-likelihood candidates, thus improving the accuracy and speed of speech recognition. The percentage of erroneous boundary decisions is about 0.5% when the two speech interval candidates with the first and second highest likelihoods are adopted.
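
The idea of proposing several boundary candidates from the power level and keeping only the most likely intervals can be sketched as below. The abstract does not say how the likelihood is computed, so the score used here (coverage of active frames times the active fraction of the interval) is purely an assumption, as are the names `boundary_candidates` and `rank_intervals`.

```python
from typing import List, Sequence, Tuple


def boundary_candidates(power: Sequence[float],
                        threshold: float) -> Tuple[List[int], List[int]]:
    """Frame indices where the power rises above (starts) or drops
    below (ends) the threshold."""
    starts, ends = [], []
    prev_active = False
    for i, p in enumerate(power):
        active = p >= threshold
        if active and not prev_active:
            starts.append(i)
        if prev_active and not active:
            ends.append(i - 1)
        prev_active = active
    if prev_active:  # signal still active at the last frame
        ends.append(len(power) - 1)
    return starts, ends


def rank_intervals(power: Sequence[float], threshold: float,
                   keep: int = 2) -> List[Tuple[int, int, float]]:
    """Pair every start with every later end, score each interval, and
    return the `keep` highest-scoring (start, end, score) candidates."""
    starts, ends = boundary_candidates(power, threshold)
    total_active = sum(1 for p in power if p >= threshold)
    scored = []
    for s in starts:
        for e in ends:
            if e < s:
                continue
            active = sum(1 for p in power[s:e + 1] if p >= threshold)
            coverage = active / total_active   # share of all active frames
            purity = active / (e - s + 1)      # active share of the interval
            scored.append((s, e, coverage * purity))  # hypothetical likelihood
    scored.sort(key=lambda c: c[2], reverse=True)
    return scored[:keep]
```

Only the intervals returned by `rank_intervals` would then go through the more expensive similarity computation against the reference patterns.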

    Continuous speech recognition apparatus
    4.
    Type: Granted invention patent
    Legal status: No longer in force

    Publication No.: US4677673A

    Publication date: 1987-06-30

    Application No.: US563755

    Filing date: 1983-12-21

    CPC classification: G10L15/00

    Abstract: A continuous speech signal is recognized using "rough" and "detail" parameters derived from prestored reference speech and the current unknown speech. The detail parameters are 16 spectral coefficients; the rough parameters are 2 or 4 spectral coefficients representing the signal. A word interval detector decides segmentation based on rough-parameter similarity.
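
A rough-versus-detail split of this kind can be illustrated with the sketch below. The abstract does not state how the rough coefficients are derived or which similarity is used, so averaging adjacent bands, cosine similarity, the threshold value, and the names `to_rough`, `cosine`, and `word_frames` are all assumptions.

```python
import math
from typing import List, Sequence


def to_rough(detail: Sequence[float], n_rough: int = 4) -> List[float]:
    """Collapse detail coefficients (e.g. 16) into n_rough band averages."""
    if len(detail) % n_rough:
        raise ValueError("detail length must be a multiple of n_rough")
    width = len(detail) // n_rough
    return [sum(detail[i:i + width]) / width
            for i in range(0, len(detail), width)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def word_frames(frames: Sequence[Sequence[float]],
                reference: Sequence[float],
                threshold: float = 0.9) -> List[bool]:
    """Flag frames whose rough parameters resemble the reference's.

    Only the cheap rough comparison is used for segmentation; the full
    detail comparison would run afterwards on the flagged frames.
    """
    ref_rough = to_rough(reference)
    return [cosine(to_rough(f), ref_rough) >= threshold for f in frames]
```

The design point is cost: comparing 2 or 4 coefficients per frame is far cheaper than comparing 16, so the rough pass can scan the whole utterance before the detail pass is applied.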

    Phoneme information extracting apparatus
    5.
    Type: Granted invention patent
    Legal status: No longer in force

    Publication No.: US4405838A

    Publication date: 1983-09-20

    Application No.: US273400

    Filing date: 1981-06-15

    CPC classification: G10L15/00

    Abstract: A phoneme information extracting apparatus includes correlation data generators for successively generating correlation data representing the correlation between the acoustic power spectrum data corresponding to input voice and power spectrum data of various reference phonemes, selection circuits for successively transferring these correlation data when they detect that three or more successive correlation data have values greater than a predetermined value, maximum data hold circuits for holding the maximum correlation data among the correlation data transferred from the respective selection circuits, and a phoneme determination circuit for determining the optimum phoneme by detecting one of the data hold circuits that is holding the maximum correlation data among the correlation data held in the data hold circuits.
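
The select, hold, and determine chain in this abstract can be sketched in software as follows. This is an illustration only: the abstract does not say whether the first values of a qualifying run are transferred, so the sketch assumes the whole run is; the names `held_maximum` and `determine_phoneme` are hypothetical.

```python
from typing import Dict, Optional, Sequence


def held_maximum(corr: Sequence[float], threshold: float,
                 min_run: int = 3) -> Optional[float]:
    """Largest value found inside any run of at least `min_run` successive
    correlation values above `threshold`; None if no such run exists."""
    best: Optional[float] = None
    run = []
    for v in list(corr) + [None]:  # None is a sentinel that closes the last run
        if v is not None and v > threshold:
            run.append(v)
            continue
        if len(run) >= min_run:
            peak = max(run)
            best = peak if best is None else max(best, peak)
        run = []
    return best


def determine_phoneme(correlations: Dict[str, Sequence[float]],
                      threshold: float) -> Optional[str]:
    """Pick the reference phoneme whose hold value is the largest."""
    held = {ph: held_maximum(seq, threshold) for ph, seq in correlations.items()}
    held = {ph: v for ph, v in held.items() if v is not None}
    if not held:
        return None
    return max(held, key=held.get)
```

The run-length condition acts as a noise gate: a phoneme with a brief, high correlation spike of one or two frames is ignored in favour of one whose correlation stays high for three or more frames.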

    Text-to-speech synthesis with controllable processing time and speech quality
    6.
    Type: Granted invention patent
    Legal status: No longer in force

    Publication No.: US5615300A

    Publication date: 1997-03-25

    Application No.: US67079

    Filing date: 1993-05-26

    CPC classification: G10L13/047 G10L13/04

    Abstract: Synthesized speech is generated by a software-implemented system with a programmed central processing unit. Phonetic parameters are generated from a series of phonetic symbols of an input text to be converted into synthesized speech, and prosodic parameters are also generated from prosodic information of the input text. The activity ratio of the central processing unit is determined, and the order of phonetic parameters or the arrangement of a synthesis unit or filter for speech synthesis is determined depending on the determined activity ratio of the central processing unit. Synthesized speech sounds are generated and filtered based on the phonetic and prosodic parameters according to the determined order of phonetic parameters or the determined arrangement of the filter.
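
Trading speech quality for processing time according to CPU load might look like the sketch below. The load bands, the orders 16, 12, and 8, and the names `ORDER_BY_LOAD`, `choose_order`, and `truncate_parameters` are hypothetical; the abstract only states that the parameter order or filter arrangement depends on the measured activity ratio. Dropping the higher-order coefficients is likewise an assumption about how a lower order would be realized.

```python
from typing import List, Sequence

# (upper bound of CPU activity ratio, parameter order to use). Hypothetical values.
ORDER_BY_LOAD = [(0.50, 16), (0.80, 12), (1.00, 8)]


def choose_order(activity_ratio: float) -> int:
    """Map a CPU activity ratio in [0, 1] to a phonetic-parameter order:
    the busier the CPU, the lower the order (cheaper, lower quality)."""
    if not 0.0 <= activity_ratio <= 1.0:
        raise ValueError("activity_ratio must lie in [0, 1]")
    for upper, order in ORDER_BY_LOAD:
        if activity_ratio <= upper:
            return order
    return ORDER_BY_LOAD[-1][1]


def truncate_parameters(frame: Sequence[float],
                        activity_ratio: float) -> List[float]:
    """Keep only the leading coefficients allowed at the current load."""
    return list(frame[:choose_order(activity_ratio)])
```

A synthesizer loop would sample the activity ratio periodically and call `truncate_parameters` on each frame before filtering, so that quality degrades gradually under load instead of the audio stalling.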

    Orthogonalized dictionary speech recognition apparatus and method thereof
    8.
    Type: Granted invention patent
    Legal status: No longer in force

    Publication No.: US4979213A

    Publication date: 1990-12-18

    Application No.: US378780

    Filing date: 1989-07-12

    Applicant: Tsuneo Nitta

    Inventor: Tsuneo Nitta

    IPC classification: G10L11/00 G10L15/06

    CPC classification: G10L15/063

    Abstract: Speech pattern data representing speech of a plurality of speakers are stored in a pattern storage section in advance. Averaged pattern data obtained by averaging a plurality of speech pattern data of the first of the plurality of speakers are obtained. Data obtained by blurring and differentiating the averaged pattern data are stored in an orthogonalized dictionary as basic orthogonalized dictionary data of first and second axes, respectively. Blurred data and differentiated data obtained with respect to the second and subsequent of the plurality of speakers are selectively stored in the orthogonalized dictionary as additional dictionary data having new axes. Speech of the plurality of speakers is recognized by computing a similarity between the orthogonalized dictionary formed in this manner and input speech.
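
One way to picture this dictionary construction is the sketch below, which treats each pattern as a one-dimensional vector. Several details are assumptions not given in the abstract: the [1, 2, 1]/4 blurring kernel, the central-difference derivative, Gram-Schmidt as the orthogonalization step, the residual-norm test used to decide "selectively" whether a new axis is stored, and the squared-projection similarity.

```python
import math
from typing import List, Sequence

Vector = List[float]


def average(patterns: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of several patterns from one speaker."""
    return [sum(col) / len(patterns) for col in zip(*patterns)]


def blur(x: Sequence[float]) -> Vector:
    """Smooth with a [1, 2, 1]/4 kernel, replicating the edge values."""
    p = [x[0]] + list(x) + [x[-1]]
    return [(p[i] + 2 * p[i + 1] + p[i + 2]) / 4 for i in range(len(x))]


def differentiate(x: Sequence[float]) -> Vector:
    """Central difference, replicating the edge values."""
    p = [x[0]] + list(x) + [x[-1]]
    return [(p[i + 2] - p[i]) / 2 for i in range(len(x))]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(u * v for u, v in zip(a, b))


def add_axis(axes: List[Vector], v: Sequence[float], eps: float = 1e-6) -> bool:
    """Gram-Schmidt step: remove the components of v along existing axes and
    store the normalized residual only if it is not negligible."""
    r = list(v)
    for a in axes:
        c = dot(r, a)
        r = [ri - c * ai for ri, ai in zip(r, a)]
    norm = math.sqrt(dot(r, r))
    if norm <= eps:
        return False  # v adds nothing new, so it is not stored
    axes.append([ri / norm for ri in r])
    return True


def build_dictionary(speakers: Sequence[Sequence[Sequence[float]]]) -> List[Vector]:
    """First speaker supplies axes 1 and 2; later speakers may add new axes."""
    axes: List[Vector] = []
    for patterns in speakers:
        avg = average(patterns)
        add_axis(axes, blur(avg))
        add_axis(axes, differentiate(avg))
    return axes


def similarity(axes: Sequence[Vector], x: Sequence[float]) -> float:
    """Fraction of the input's energy captured by the dictionary axes."""
    energy = dot(x, x)
    if energy == 0.0:
        return 0.0
    return sum(dot(x, a) ** 2 for a in axes) / energy
```

In use, one dictionary would be built per word category, and the category whose dictionary yields the highest `similarity` for the input would be chosen.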

    Speech search device and speech search method
    9.
    Type: Granted invention patent
    Legal status: No longer in force

    Publication No.: US08626508B2

    Publication date: 2014-01-07

    Application No.: US13203371

    Filing date: 2010-02-10

    CPC classification: G10L15/12 G10L2015/025

    Abstract: Provided are a speech search device and a speech search method that perform fuzzy search with very fast search speed and excellent search performance. In addition to the fuzzy search, the distance between phoneme discrimination features included in the speech data is calculated to determine similarity with respect to the speech, using both a suffix array and dynamic programming. The object to be searched for is narrowed by dividing the search keyword on a phoneme basis and applying search thresholds to the resulting divided search keywords; the object is searched for repeatedly while the search thresholds are increased in order; and whether or not the keyword is divided is determined according to the length of the search keyword. Speech search that is both very fast and excellent in search performance is thereby implemented.
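
The combination of dynamic programming with a stepwise-increasing threshold can be sketched as below. This is a simplified illustration: it uses a 0/1 substitution cost in place of the distance between phoneme discrimination features, it scans every document instead of using a suffix array, and it omits keyword division. The names `best_match_distance` and `search` and the threshold schedule are assumptions.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def best_match_distance(keyword: Sequence[str], text: Sequence[str]) -> int:
    """Smallest edit distance between `keyword` and any contiguous part of
    `text`, computed by dynamic programming over phoneme symbols."""
    # Row 0 is all zeros: a match may start at any position in the text.
    prev = [0] * (len(text) + 1)
    for i, k in enumerate(keyword, 1):
        cur = [i] + [0] * len(text)
        for j, t in enumerate(text, 1):
            cur[j] = min(prev[j] + 1,             # keyword phoneme missing in text
                         cur[j - 1] + 1,          # extra phoneme in text
                         prev[j - 1] + (k != t))  # match or substitution
        prev = cur
    return min(prev)  # a match may end at any position in the text


def search(keyword: Sequence[str],
           documents: Dict[str, Sequence[str]],
           thresholds: Sequence[int] = (0, 1, 2)) -> Tuple[Optional[int], List[str]]:
    """Try each threshold in increasing order and stop at the first one
    that yields hits. Returns (threshold used, matching document names)."""
    for limit in thresholds:
        hits = [name for name, phonemes in documents.items()
                if best_match_distance(keyword, phonemes) <= limit]
        if hits:
            return limit, hits
    return None, []
```

Starting with a strict threshold keeps the common case cheap and precise; the threshold is relaxed only when nothing matches, which is the behaviour the abstract describes as searching repeatedly while increasing the search thresholds in order.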

    Speech recognition using continuous density hidden Markov models and the orthogonalizing Karhunen-Loeve transformation
    10.
    Type: Granted invention patent
    Legal status: No longer in force

    Publication No.: US5506933A

    Publication date: 1996-04-09

    Application No.: US30618

    Filing date: 1993-03-12

    Applicant: Tsuneo Nitta

    Inventor: Tsuneo Nitta

    CPC classification: G10L15/144

    Abstract: A recognition system comprises a feature extractor for extracting a feature vector x from an input speech signal, and a recognizing section for defining continuous density Hidden Markov Models (CDHMMs) of predetermined categories k as transition network models each having parameters of transition probabilities p(k,i,j) that a state Si transits to a next state Sj and output probabilities g(k,s) that a feature vector x is output in transition from the state Si to one of the states Si and Sj, and recognizing the input signal on the basis of similarity between a sequence X of feature vectors extracted by the feature extractor and the continuous density HMMs. Particularly, the recognizing section includes a memory section for storing a set of orthogonal vectors φ_m(k,s) provided for the continuous density HMMs, and a modified CDHMM processor for obtaining each of the output probabilities g(k,s) for the continuous density HMMs in accordance with the corresponding orthogonal vectors φ_m(k,s).
