Methods for creating and searching a database of speakers
    1.
    Granted patent
    Status: In force

    Publication No.: US08442823B2

    Publication date: 2013-05-14

    Application No.: US12907729

    Filing date: 2010-10-19

    IPC classification: G10L15/00

    CPC classification: G10L15/1822

    Abstract: A method of performing a search of a database of speakers includes: receiving a query speech sample spoken by a query speaker; deriving a query utterance from the query speech sample; extracting query utterance statistics from the query utterance; performing Kernelized Locality-Sensitive Hashing (KLSH) using a kernel function, the KLSH using as input the query utterance statistics and utterance statistics extracted from a plurality of utterances included in a database of speakers in order to select a subset of the plurality of utterances; and comparing, using an utterance comparison equation, the query utterance statistics to the utterance statistics for each utterance in the subset to generate a list of speakers from the database of utterances having a highest similarity to the query speaker.
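The two-stage search described in this abstract (hash-based selection of a subset, then exact comparison within it) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: plain random-hyperplane LSH stands in for KLSH, cosine similarity stands in for the unspecified utterance comparison equation, and all function names, the toy statistics vectors, and the parameters `n_bits` and `max_hamming` are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity; a stand-in for the utterance comparison equation."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def make_hyperplanes(dim, n_bits, seed=0):
    """Random hyperplanes for ordinary (non-kernelized) LSH; an assumption."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_bits(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def search(query, database, planes, max_hamming=1, top_k=3):
    """database is a list of (speaker, statistics_vector) pairs."""
    q_bits = hash_bits(query, planes)
    # Stage 1: keep only utterances whose hash code is close to the query's.
    subset = [
        (spk, vec) for spk, vec in database
        if hamming(hash_bits(vec, planes), q_bits) <= max_hamming
    ]
    # Stage 2: exact comparison, restricted to the selected subset.
    scored = sorted(((cosine(query, vec), spk) for spk, vec in subset),
                    reverse=True)
    ranked = []
    for _, spk in scored:
        if spk not in ranked:  # report each speaker once, best match first
            ranked.append(spk)
    return ranked[:top_k]


planes = make_hyperplanes(dim=4, n_bits=8)
database = [
    ("alice", [1.0, 0.2, -0.3, 0.5]),
    ("bob", [-1.0, -0.2, 0.3, -0.5]),
    ("carol", [0.1, 0.9, 0.8, -0.2]),
]
result = search([2.0, 0.4, -0.6, 1.0], database, planes)
```

The point of the hashing stage is that the exact comparison runs only on the selected subset rather than on every utterance in the database.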

    Noise reduced speech recognition parameters
    2.
    Granted patent
    Status: In force

    Publication No.: US06678656B2

    Publication date: 2004-01-13

    Application No.: US10061048

    Filing date: 2002-01-30

    IPC classification: G10L15/20

    Abstract: A voice sample characterization front-end suitable for use in a distributed speech recognition context. A digitized voice sample 31 is split between a low frequency path 32 and a high frequency path 33. Both paths are used to determine spectral content suitable for use when determining speech recognition parameters (such as cepstral coefficients) that characterize the speech sample for recognition purposes. The low frequency path 32 has a thorough noise reduction capability. In one embodiment, the results of this noise reduction are used by the high frequency path 33 to aid in de-noising without requiring the same level of resource capacity as used by the low frequency path 32.

    Methods and apparatus for reducing noise associated with an electrical speech signal
    3.
    Granted patent
    Status: In force

    Publication No.: US06480821B2

    Publication date: 2002-11-12

    Application No.: US09774840

    Filing date: 2001-01-31

    IPC classification: G10L21/02

    Abstract: A system for enhancing the signal-to-noise ratio of a speech signal is provided. A plurality of local energy maximums associated with a speech signal are determined. Presumably, each of these local energy maximums defines a speech pitch period. Typically, human pitch frequencies are approximately 100-400 Hz, depending on the sex and age of the speaker. Because human speech typically includes more energy near the beginning of a pitch period than at the end of the pitch period, and background noise tends to remain relatively constant throughout the pitch period, the speech signal may be enhanced by increasing the energy associated with the beginning of the pitch period and/or by decreasing the energy associated with the end of the pitch period. Preferably, the amount of energy increase in the earlier portion of the pitch period is approximately equal to the amount of energy reduction in the later portion of the pitch period. In this manner, the total energy remains constant.
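The energy bookkeeping in this abstract can be illustrated with a small sketch. It assumes the signal has already been segmented into pitch periods (the abstract derives them from local energy maximums, which is not shown here) and that the "earlier" and "later" portions are simply the two halves of one period. The function name and the `shift` parameter are hypothetical.

```python
import math


def reshape_pitch_period(samples, shift=0.5):
    """Move a fraction `shift` of the late-half energy into the early half.

    `samples` holds one pitch period. The early half is amplified and the
    late half attenuated so that the total energy is unchanged.
    """
    half = len(samples) // 2
    early, late = samples[:half], samples[half:]
    e_early = sum(s * s for s in early)
    e_late = sum(s * s for s in late)
    if e_early == 0.0:
        return list(samples)  # nothing to amplify; leave the period as is
    moved = shift * e_late  # energy removed late equals energy added early
    g_early = math.sqrt((e_early + moved) / e_early)
    g_late = math.sqrt(1.0 - shift)
    return [s * g_early for s in early] + [s * g_late for s in late]


def energy(samples):
    return sum(s * s for s in samples)


period = [1.0, 0.8, 0.6, 0.4, 0.3, 0.3, 0.2, 0.2]
enhanced = reshape_pitch_period(period, shift=0.5)
```

Because the gains are derived from the two energies, the increase in the early half matches the reduction in the late half, which is the condition the abstract gives for the total energy staying constant.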

    Method and Apparatus for Robust Speech Activity Detection
    4.
    Patent application
    Status: Under examination (published)

    Publication No.: US20080147389A1

    Publication date: 2008-06-19

    Application No.: US11611469

    Filing date: 2006-12-15

    Applicant: Dusan Macho

    Inventor: Dusan Macho

    IPC classification: G10L21/00

    CPC classification: G10L25/78

    Abstract: A method and apparatus for robust speech activity detection is disclosed. The method may include calculating autocorrelations by filtering input signals using order statistic filtering, averaging the autocorrelations over a time period, obtaining a voiced speech feature from the averaged autocorrelations, classifying the input signal as one of speech and non-speech based on the obtained voiced speech feature, and outputting only the classified speech signals or the input signals along with the speech/non-speech classification information, to an automated speech recognizer.
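One possible reading of the steps in this abstract is sketched below; it is not the application's actual procedure. The assumptions are: a 3-point median filter serves as the order statistic filter, the sample rate is 8 kHz, the voiced speech feature is the peak of the frame-averaged normalized autocorrelation over pitch lags, and the decision is a fixed threshold. All names and parameter values are hypothetical.

```python
import math
import random
from statistics import median


def median_filter(signal, width=3):
    """Median filter (one kind of order statistic filter), edges replicated."""
    half = width // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + width]) for i in range(len(signal))]


def normalized_autocorr(frame, lag):
    """Correlation between the frame and itself shifted by `lag` samples."""
    a, b = frame[:-lag], frame[lag:]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


def voiced_feature(signal, frame_len=240, min_lag=20, max_lag=160):
    """Peak of the autocorrelation averaged over all full frames.

    At an assumed 8 kHz sample rate, lags 20..160 cover 400 Hz down to 50 Hz.
    """
    filtered = median_filter(signal)
    frames = [filtered[i:i + frame_len]
              for i in range(0, len(filtered) - frame_len + 1, frame_len)]
    if not frames:
        return 0.0
    averaged = [
        sum(normalized_autocorr(f, lag) for f in frames) / len(frames)
        for lag in range(min_lag, max_lag + 1)
    ]
    return max(averaged)


def is_speech(signal, threshold=0.5):
    """Classify as speech when the voiced feature exceeds the threshold."""
    return voiced_feature(signal) > threshold


# A 100 Hz tone stands in for voiced speech; uniform noise for non-speech.
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(1200)]
rng = random.Random(1)
noise = [rng.uniform(-1.0, 1.0) for _ in range(1200)]
```

Averaging the autocorrelation across frames before taking the peak suppresses chance correlations in noise, while a periodic signal keeps a strong peak at its pitch lag.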