Method and apparatus for normalizing channel specific speech feature elements
    1.
    Type: granted invention patent
    Status: in force

    Publication No.: US06502070B1

    Publication date: 2002-12-31

    Application No.: US09559459

    Filing date: 2000-04-28

    IPC classification: G10L1914

    CPC classification: G10L15/065

    Abstract: An apparatus for normalizing speech feature elements in a signal derived from a spoken utterance. The apparatus includes an input, a processing unit and an output. The input receives speech feature elements transmitted over a channel that induces a channel-specific distortion in the speech feature elements. The processing unit is coupled to the input and is operative for altering the speech feature elements to generate normalized speech feature elements. The normalized speech feature elements simulate a transmission of the speech feature elements over a reference channel other than the channel over which the transmission actually takes place. The apparatus can be used as a speech recognition pre-processing unit to reduce channel-related variability in the signal on which speech recognition is to be performed.
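As a rough illustration of mapping features onto a reference channel, the sketch below shifts cepstral feature frames so that their per-dimension mean matches a reference channel's mean. This is not the patented method, which the abstract does not detail. It assumes the features are cepstral (where a fixed convolutional channel distortion appears as an additive offset), and the function name `normalize_to_reference` and the `reference_mean` vector are hypothetical.

```python
from statistics import fmean


def normalize_to_reference(frames, reference_mean):
    """Shift frames so their per-dimension mean equals reference_mean.

    Assumption: frames are cepstral vectors, so a fixed channel shows up
    as a constant offset that can be swapped for the reference offset.
    """
    dims = len(reference_mean)
    # Estimate the actual channel's offset from the utterance itself.
    channel_mean = [fmean(f[d] for f in frames) for d in range(dims)]
    # Replace the actual channel offset with the reference channel offset.
    offset = [reference_mean[d] - channel_mean[d] for d in range(dims)]
    return [[f[d] + offset[d] for d in range(dims)] for f in frames]


# Two 2-dimensional frames received over some unknown channel.
frames = [[1.0, 2.0], [3.0, 4.0]]
normalized = normalize_to_reference(frames, [0.0, 10.0])
```

After the shift, every utterance has the same long-term mean regardless of the channel it arrived on, which is one simple way to reduce channel-related variability before recognition.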


    Method and apparatus providing hypothesis driven speech modelling for use in speech recognition
    2.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US06868381B1

    Publication date: 2005-03-15

    Application No.: US09468138

    Filing date: 1999-12-21

    Abstract: A speech recognition system having an input for receiving an input signal indicative of a spoken utterance that is indicative of at least one speech element. The system further includes a first processing unit operative for processing the input signal to derive from a speech recognition dictionary a speech model associated with a given speech element that constitutes a potential match to the at least one speech element. The system further comprises a second processing unit for generating a modified version of the speech model on the basis of the input signal. The system further provides a third processing unit for processing the input signal on the basis of the modified version of the speech model to generate a recognition result indicative of whether the modified version of the at least one speech model constitutes a match to the input signal. The second processing unit allows the speech model to be modified on the basis of the recognition attempt, thereby allowing speech recognition to be effected on the basis of the modified speech model. This permits adaptation of the speech models during the recognition process. The invention further provides an apparatus, method and computer readable medium for implementing the second processing unit.
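The three processing units described in this abstract can be pictured as a hypothesize, adapt, re-score loop. The sketch below is only a toy under strong assumptions: each speech model is a single mean vector, adaptation is a linear interpolation toward the input mean, and the names `distance`, `recognize`, `weight` and `threshold` are hypothetical. The abstract does not say how the patent actually modifies or scores models.

```python
def distance(model, frames):
    """Mean squared distance between a model mean vector and the frames."""
    total = 0.0
    for frame in frames:
        total += sum((x - m) ** 2 for x, m in zip(frame, model))
    return total / len(frames)


def recognize(frames, dictionary, weight=0.5, threshold=1.0):
    # First unit: pick the dictionary entry that best matches the input.
    word = min(dictionary, key=lambda w: distance(dictionary[w], frames))
    model = dictionary[word]
    # Second unit: modify the hypothesized model using the input signal.
    dims = len(model)
    input_mean = [sum(f[d] for f in frames) / len(frames) for d in range(dims)]
    adapted = [(1 - weight) * m + weight * x for m, x in zip(model, input_mean)]
    # Third unit: re-score the input against the modified model.
    matched = distance(adapted, frames) <= threshold
    return word, adapted, matched


dictionary = {"yes": [0.0, 0.0], "no": [4.0, 4.0]}
frames = [[1.0, 1.0], [1.0, 1.0]]
word, adapted, matched = recognize(frames, dictionary)
```

In this toy run the unadapted "yes" model scores 2.0 and would miss the threshold of 1.0, while the adapted model scores 0.5 and is accepted, which shows why the adaptation step sits between hypothesis and final scoring.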


    Method and apparatus for hierarchical training of speech models for use in speaker verification
    3.
    Type: granted invention patent
    Status: in force

    Publication No.: US06499012B1

    Publication date: 2002-12-24

    Application No.: US09470995

    Filing date: 1999-12-23

    IPC classification: G10L1514

    CPC classification: G10L17/04

    Abstract: A method and apparatus are provided for generating a pair of data elements suitable for use in a speaker verification system. The pair includes a first element representative of a speaker-independent template and a second element representative of an extended speaker-specific speech pattern. An audio signal forming enrollment data associated with a given speaker is received and processed to derive a speaker-independent template and a speaker-specific speech pattern. The speaker-specific speech pattern is then processed to derive an extended speaker-specific speech pattern. The extended speaker-specific speech pattern includes a set of expanded speech models, each expanded speech model including a plurality of groups of states, the groups of states being linked to one another by inter-group transitions. Optionally, the expanded speech models are processed on the basis of the enrollment data to condition at least one of the plurality of inter-group transitions.
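One way to picture an expanded speech model is as a small graph of state groups with probabilities on the inter-group transitions. The sketch below assumes that "conditioning" a transition means re-estimating its probability from how often the enrollment data follows it, and that the enrollment utterances have already been aligned to group-level paths. Both are assumptions for illustration, and `ExpandedModel` and `condition_transitions` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ExpandedModel:
    groups: dict  # group name -> list of state names
    transitions: dict = field(default_factory=dict)  # (src, dst) -> probability


def condition_transitions(model, enrollment_paths):
    """Re-estimate inter-group transition probabilities from enrollment paths."""
    counts = Counter()
    for path in enrollment_paths:
        for src, dst in zip(path, path[1:]):
            counts[(src, dst)] += 1
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    for src, dst in model.transitions:
        # Leave transitions untouched when their source group was never observed.
        if totals[src]:
            model.transitions[(src, dst)] = counts[(src, dst)] / totals[src]
    return model


model = ExpandedModel(
    groups={"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1"]},
    transitions={("A", "B"): 0.5, ("A", "C"): 0.5, ("B", "C"): 1.0},
)
# Hypothetical group-level alignments of four enrollment utterances.
paths = [["A", "B", "C"], ["A", "B", "C"], ["A", "C"], ["A", "B", "C"]]
condition_transitions(model, paths)
```

Here the speaker takes the A to B route three times out of four, so that transition is weighted at 0.75 and the A to C shortcut at 0.25.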


    Method and apparatus to detect and delimit foreground speech
    4.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US6134524A

    Publication date: 2000-10-17

    Application No.: US950417

    Filing date: 1997-10-24

    IPC classification: G10L11/02, G10L15/20

    CPC classification: G10L25/87

    Abstract: The present invention provides improved foreground-speech signal endpointing by computing a spectral stationarity statistic. This statistic is used by a finite state machine to endpoint speech. Endpointing using the spectral stationarity statistic is less susceptible to background noise than endpointing using conventional measures. The present invention uses frame-synchronous quantile estimation to generate a mask signal for signal-to-noise ratio normalization.
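The idea of driving a finite state machine with a stationarity measure can be sketched as follows. The abstract does not define the statistic, so this example assumes a simple stand-in: the mean squared change between consecutive spectral frames, with steady background treated as stationary and speech as non-stationary. The names `spectral_change` and `endpoint`, the `threshold` and the `min_run` debounce length are all hypothetical.

```python
def spectral_change(prev, cur):
    """Mean squared difference between two consecutive spectral frames."""
    return sum((c - p) ** 2 for p, c in zip(prev, cur)) / len(cur)


def endpoint(spectra, threshold=1.0, min_run=2):
    """Return (start, end) frame index pairs using a two-state machine."""
    segments = []
    state = "SILENCE"
    run = 0       # consecutive frames voting to leave the current state
    start = None
    for i in range(1, len(spectra)):
        active = spectral_change(spectra[i - 1], spectra[i]) > threshold
        if state == "SILENCE":
            run = run + 1 if active else 0
            if run >= min_run:
                state, start, run = "SPEECH", i - min_run + 1, 0
        else:
            run = run + 1 if not active else 0
            if run >= min_run:
                segments.append((start, i - min_run))
                state, run = "SILENCE", 0
    if state == "SPEECH":
        # Utterance ran to the end of the input without a closing pause.
        segments.append((start, len(spectra) - 1))
    return segments


# Steady background, four rapidly changing frames, then steady again.
spectra = [[1, 1], [1, 1], [1, 1], [3, 1], [1, 4],
           [4, 1], [1, 3], [1, 3], [1, 3], [1, 3]]
segments = endpoint(spectra)
```

Requiring `min_run` consecutive frames before switching state keeps a single noisy frame from opening or closing a segment. The quantile-based mask for signal-to-noise ratio normalization mentioned in the abstract is not covered by this sketch.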
