Publication number: WO1984003983A1
Publication date: 1984-10-11
Application number: PCT/US1983000464
Filing date: 1983-03-28
Applicant: EXXON RESEARCH AND ENGINEERING COMPANY
Inventor: EXXON RESEARCH AND ENGINEERING COMPANY; BAKER, James, K.; MacALLISTER, Jeffrey, G.; KLOVSTAD, John, W.; SIDELL, Mark, F.; BROWN, Peter, F.; GANESAN, Kalyan; HATTON, Terence, J.; LEE, Chin-Hui; ROSS, Steven; ROTH, Robert, S.
IPC: G10L01/00
CPC classification number: G10L15/00
Abstract: A speech recognition method and apparatus employ speech processing circuitry (26) for repetitively deriving from a speech input (100), at a frame repetition rate, a plurality of acoustic parameters. The acoustic parameters represent the speech input signal for a frame time. A plurality of template matching and cost processing circuitries (28, 30) are connected to a system bus (24), along with the speech processing circuitry, for determining, or identifying, the speech units in the input speech by comparing the acoustic parameters with stored template patterns. The apparatus can be expanded by adding more template matching and cost processing circuitry to the bus, thereby increasing the speech recognition capacity of the apparatus. The speech processing circuitry establishes overlapping time durations for generating the acoustic parameters and further employs a sinc-Kaiser smoothing function in combination with a folding technique (113) for providing a discrete Fourier transform (112). The Fourier spectra are transformed using a principal component analysis (122) which optimizes the across-class variance. The template matching and cost processing circuitries (28, 30) provide distributed processing, on demand, of the acoustic parameters, generating the recognition decision through a dynamic programming technique. Grammar and word model syntax structures reduce the computational load. Template pattern generation is aided by using a "joker" word to specify the time boundaries of utterances spoken in isolation.
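Illustrative sketch (not taken from the patent): the abstract names three signal-level steps, namely overlapping analysis frames, a sinc-Kaiser smoothing function combined with folding ahead of the discrete Fourier transform, and dynamic-programming template matching. The Python below shows one plausible reading of each step. Every function name, the frame length, hop size, fold factor, Kaiser `beta` value, and the Euclidean frame distance are assumptions chosen for illustration; the abstract does not specify them, and the principal component step is omitted.

```python
import numpy as np


def overlapping_frames(signal, frame_len, hop):
    """Cut a 1-D signal into frames that overlap when hop < frame_len."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, signal.size - frame_len + 1, hop)
    return np.stack([signal[s:s + frame_len] for s in starts])


def sinc_kaiser_window(length, n_fold, beta=6.0):
    """Sinc function tapered by a Kaiser window (beta is an assumed value)."""
    # Centered index, scaled so the sinc zero crossings fall
    # length / n_fold samples apart.
    t = (np.arange(length) - (length - 1) / 2.0) / (length / n_fold)
    return np.sinc(t) * np.kaiser(length, beta)


def folded_dft(frame, n_fold, beta=6.0):
    """Window a long frame, fold it into n_fold-times fewer samples, then DFT.

    Folding (summing the n_fold consecutive segments) before a short DFT
    gives the same values as every n_fold-th bin of the full-length DFT.
    """
    frame = np.asarray(frame, dtype=float)
    if frame.size % n_fold != 0:
        raise ValueError("frame length must be a multiple of n_fold")
    n_out = frame.size // n_fold
    windowed = frame * sinc_kaiser_window(frame.size, n_fold, beta)
    folded = windowed.reshape(n_fold, n_out).sum(axis=0)
    return np.fft.fft(folded)


def dtw_cost(template, observed):
    """Dynamic-programming alignment cost between two parameter sequences.

    Rows of each array are per-frame acoustic parameter vectors. The local
    cost is Euclidean distance (an assumption), and each step may advance
    the template, the observation, or both.
    """
    n, m = len(template), len(observed)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(template[i - 1] - observed[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]
```

In this sketch a recognizer would call `overlapping_frames` on the sampled input, apply `folded_dft` to each frame, reduce the spectra to a few parameters, and pick the stored template with the lowest `dtw_cost`. The folding step is what lets a long smoothing window be used with a short transform.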