Connected word recognition enrollment method
    2.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US4783808A

    Publication date: 1988-11-08

    Application No.: US856722

    Filing date: 1986-04-25

    CPC classification: G10L15/063 G10L15/22

    Abstract: A method for generating connected word templates begins with generating isolated word templates of selected words. The isolated word templates are used to extract a continuous word template from a segment of continuous speech containing the selected words. Both the isolated word templates and the connected word templates can be used to generate speech to determine the quality of the generated templates through aural judgment.


    Temporal decorrelation method for robust speaker verification
    6.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US5167004A

    Publication date: 1992-11-24

    Application No.: US662086

    Filing date: 1991-02-28

    IPC classification: G10L15/02 G10L17/00 G10L19/00

    CPC classification: G10L17/02 G10L17/20

    Abstract: A speaker voice verification system uses temporal decorrelation linear transformation and includes a collector for receiving speech inputs from an unknown speaker claiming a specific identity, a word-level speech features calculator operable to use a temporal decorrelation linear transformation for generating word-level speech feature vectors from such speech inputs, word-level speech feature storage for storing word-level speech feature vectors known to belong to a speaker with the specific identity, a word-level vector scorer to calculate a similarity score between the word-level speech feature vectors received from the unknown speaker and those received from the word-level speech feature storage, and speaker verification decision circuitry for determining, based on the similarity score, whether the unknown speaker's identity is the same as that claimed. The word-level vector scorer further includes concatenation circuitry as well as a word-specific orthogonalizing linear transformer. Other systems and methods are also disclosed.

    Speaker-dependent connected speech word recognition method
    7.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US4989248A

    Publication date: 1991-01-29

    Application No.: US319384

    Filing date: 1989-03-03

    IPC classification: G10L15/00 G10L15/12

    CPC classification: G10L15/12

    Abstract: A cost-effective word recognizer. Each frame of spoken input is compared to a set of reference frames. The comparison is equivalent to embodying the reference frame as an LPC inverse filter, and is preferably done in the autocorrelation domain. To avoid the instability and computational difficulties which can be caused by a high-gain LPC inverse filter, a noise floor is introduced into each reference frame sample. Thus, for each input speech frame, a scalar measures its similarity to each reference frame in the vocabulary. To achieve connected word recognition based on this similarity measurement, a dynamic programming algorithm is used in which time warping to match a sample to a reference is in effect permitted, and in which matching is performed with unconstrained endpoints. Thus, the word boundary decisions are made on the basis of a local maximum in similarity, and, since no separate word division decision is required, the error which can be introduced by even the best preliminary decision as to word boundaries is avoided.

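The frame-level comparison described in the abstract above (a reference frame treated as an LPC inverse filter, evaluated in the autocorrelation domain, with a noise floor on the reference) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the function names, the LPC order, the way the noise floor is applied (inflating lag 0 of the reference autocorrelation) and the normalisation of the score are all hypothetical choices. The dynamic programming stage is not shown.

```python
import math

def autocorr(frame, order):
    """Autocorrelation lags 0..order of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion: inverse-filter coefficients a (a[0] = 1) and residual energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def make_reference(frame, order, noise_floor=0.01):
    """Build a reference template: autocorrelation of the inverse-filter coefficients.

    Assumption: the noise floor is modelled as white noise, which only raises lag 0.
    This keeps the inverse filter from developing very high gain.
    """
    r = autocorr(frame, order)
    r[0] = r[0] * (1.0 + noise_floor) + 1e-9
    a, _ = levinson(r, order)
    return [sum(a[i] * a[i + k] for i in range(order + 1 - k))
            for k in range(order + 1)]

def residual_energy(frame, ref, order):
    """Energy of the frame after the reference inverse filter, computed from autocorrelations."""
    rx = autocorr(frame, order)
    return rx[0] * ref[0] + 2.0 * sum(rx[k] * ref[k] for k in range(1, order + 1))

def dissimilarity(frame, ref, order):
    """Scalar score per (input frame, reference frame) pair; lower means more similar."""
    energy = sum(s * s for s in frame)
    return residual_energy(frame, ref, order) / energy if energy > 0.0 else 0.0

# Usage: two synthetic "frames" with different spectral content.
ORDER = 2
x1 = [math.sin(0.2 * n) for n in range(80)]
x2 = [math.sin(1.5 * n) for n in range(80)]
ref1 = make_reference(x1, ORDER)
ref2 = make_reference(x2, ORDER)
scores = {
    "x1 vs ref1": dissimilarity(x1, ref1, ORDER),
    "x1 vs ref2": dissimilarity(x1, ref2, ORDER),
}
```

Working in the autocorrelation domain means each comparison costs only `order + 1` multiply-adds once the input frame's autocorrelation is known, which is one plausible reading of why the abstract calls the recognizer cost-effective.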

    System and method for time aligning speech
    8.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US5333275A

    Publication date: 1994-07-26

    Application No.: US903033

    Filing date: 1992-06-23

    Abstract: A method and system are provided for time aligning speech. Speech data is input representing speech signals from a speaker. An orthographic transcription is input including a plurality of words transcribed from the speech signals. A sentence model is generated indicating a selected order of the words in response to the orthographic transcription. In response to the orthographic transcription, word models are generated associated with respective ones of the words. The orthographic transcription is aligned with the speech data in response to the sentence model, to the word models and to the speech data.


    Very low rate speech encoder and decoder
    9.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US4815134A

    Publication date: 1989-03-21

    Application No.: US094162

    Filing date: 1987-09-08

    IPC classification: G10L19/02 G10L19/06 G10L3/02

    Abstract: A speech encoder is disclosed that quantizes speech information with respect to energy, voicing and pitch parameters to provide a fixed number of bits per block of frames. Coding of the parameters takes place for every N frames, which make up a block, irrespective of phonemic boundaries. Certain frames of speech information are discarded during transmission if such information is substantially duplicated in an adjacent frame. A very low data rate transmission system is thus provided which exhibits a high degree of fidelity and throughput.


    Voice messaging system with pitch tracking based on adaptively filtered LPC residual signal
    10.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US4731846A

    Publication date: 1988-03-15

    Application No.: US484711

    Filing date: 1983-04-13

    CPC classification: G10L25/90 G10L19/06

    Abstract: A voice messaging system, wherein linear predictive coding (LPC) parameters, pitch, and preferably other excitation information are derived from a human voice input, encoded, and transmitted and/or stored, to be called up later to provide a speech output which is nearly identical to the original speech input. The invention features adaptive filtering of the residual signal. The residual signal derived from LPC estimation is adaptively filtered, and then is used as the input to a conventional pitch estimation procedure. The adaptive filtering step uses the first reflection coefficient (k_1) to realize a simple filter (e.g., A(z) = (1 - k_1 z^{-1})^{-1}). This filter removes high frequency noise from the residual signal during voiced periods, but does not remove the high frequency energy which contains important information during the unvoiced periods of speech. Preferably the above preprocessing technique is also combined with a postprocessing technique, wherein dynamic programming is used to optimally track pitch and voicing information through successive frames.

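The one-pole filter named in the abstract above, with transfer function (1 - k_1 z^{-1})^{-1}, corresponds to the recursion y[n] = x[n] + k_1 y[n-1]. A minimal sketch follows, assuming the sign convention k_1 = r(1)/r(0), so that k_1 is close to +1 for strongly low-pass (typically voiced) frames. The function names and the per-frame estimation of k_1 are illustrative assumptions; the abstract does not specify them, and the pitch estimator and the dynamic programming tracker are not shown.

```python
def first_reflection_coefficient(frame):
    """Estimate k1 as the normalised lag-1 autocorrelation r(1)/r(0).

    Assumption: this sign convention is chosen for illustration; some LPC
    formulations define the reflection coefficient with the opposite sign.
    """
    r0 = sum(s * s for s in frame)
    r1 = sum(frame[i] * frame[i + 1] for i in range(len(frame) - 1))
    return r1 / r0 if r0 > 0.0 else 0.0

def adaptive_filter(residual, k1):
    """Apply y[n] = x[n] + k1 * y[n-1], i.e. the filter 1 / (1 - k1 z^-1).

    The recursion is stable only for |k1| < 1.
    """
    out = []
    prev = 0.0
    for x in residual:
        prev = x + k1 * prev
        out.append(prev)
    return out

# Usage: the impulse response decays geometrically with ratio k1.
impulse_response = adaptive_filter([1.0, 0.0, 0.0], 0.5)
```

With k_1 near +1 the recursion acts as a strong low-pass filter, which matches the abstract's description of removing high frequency noise in voiced periods. With k_1 near zero or negative, as is plausible for unvoiced frames, it passes or emphasises high frequencies instead.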