Signal bias removal for robust telephone speech recognition
    1.
    Granted patent
    Signal bias removal for robust telephone speech recognition [Expired]

    Publication number: US5590242A

    Publication date: 1996-12-31

    Application number: US217035

    Filing date: 1994-03-24

    Abstract: A signal bias removal (SBR) method based on the maximum likelihood estimation of the bias for minimizing undesirable effects in speech recognition systems is described. The technique is readily applicable in various architectures including discrete (vector-quantization based), semicontinuous and continuous-density Hidden Markov Model (HMM) systems. For example, the SBR method can be integrated into a discrete density HMM and applied to telephone speech recognition where the contamination due to extraneous signal components is unknown. To enable real-time implementation, a sequential method for the estimation of the bias (SSBR) is disclosed.
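The abstract does not spell out the estimator, so the following is only a minimal sketch of one plausible reading. It assumes the bias is additive in the feature domain and that a vector-quantization codebook stands in for the clean-speech model. The names `nearest`, `estimate_bias`, and `remove_bias` are hypothetical. Each pass assigns every bias-compensated frame to its closest codeword, then re-estimates the bias as the mean residual between the raw frames and their assigned codewords.

```python
def nearest(codebook, v):
    """Return the codeword closest to v in squared Euclidean distance."""
    return min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(c, v)))


def estimate_bias(frames, codebook, iterations=5):
    """Iteratively estimate an additive bias (assumed model: y = x + bias)."""
    dim = len(frames[0])
    bias = [0.0] * dim
    for _ in range(iterations):
        residual_sum = [0.0] * dim
        for y in frames:
            # Compensate with the current bias estimate before quantizing.
            cleaned = [a - b for a, b in zip(y, bias)]
            c = nearest(codebook, cleaned)
            for d in range(dim):
                residual_sum[d] += y[d] - c[d]
        # With fixed assignments, the mean residual is the ML bias estimate
        # under an equal-variance Gaussian assumption.
        bias = [s / len(frames) for s in residual_sum]
    return bias


def remove_bias(frames, bias):
    """Subtract the estimated bias from every frame."""
    return [[a - b for a, b in zip(y, bias)] for y in frames]


codebook = [[0.0, 0.0], [10.0, 10.0]]
frames = [[1.0, -1.0], [11.0, 9.0], [1.0, -1.0], [11.0, 9.0]]
bias = estimate_bias(frames, codebook)
cleaned = remove_bias(frames, bias)
```

In this toy case every frame is a codeword shifted by the same offset, so the estimate settles after the first pass. A sequential variant would update the bias frame by frame instead of over the whole utterance.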

    Discriminative utterance verification for connected digits recognition
    2.
    Granted patent
    Discriminative utterance verification for connected digits recognition [Expired]

    Publication number: US5737489A

    Publication date: 1998-04-07

    Application number: US528902

    Filing date: 1995-09-15

    Abstract: In a speech recognition system, a recognition processor receives an unknown utterance signal as input. The recognition processor in response to the unknown utterance signal input accesses a recognition database and scores the utterance signal against recognition models in the recognition database to classify the unknown utterance and to generate a hypothesis speech signal. A verification processor receives the hypothesis speech signal as input to be verified. The verification processor accesses a verification database to test the hypothesis speech signal against verification models reflecting a preselected type of training stored in the verification database. Based on the verification test, the verification processor generates a confidence measure signal. The confidence measure signal can be compared against a verification threshold to determine the accuracy of the recognition decision made by the recognition processor.
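The final comparison step can be illustrated with a small sketch. The abstract does not define the confidence measure, so this example assumes a per-frame log-likelihood ratio between the hypothesized model and an alternative (verification) model. The function names and the numbers are hypothetical.

```python
def confidence(target_loglik, alternative_loglik, num_frames):
    """Per-frame log-likelihood ratio (an assumed form of the confidence measure)."""
    return (target_loglik - alternative_loglik) / num_frames


def accept(conf, threshold):
    """Accept the recognizer's hypothesis when the confidence reaches the threshold."""
    return conf >= threshold


# Hypothetical scores for a 30-frame digit hypothesis.
conf = confidence(target_loglik=-120.0, alternative_loglik=-150.0, num_frames=30)
decision = accept(conf, threshold=0.5)
```

Raising the threshold rejects more hypotheses, trading false acceptances for false rejections.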

    Adaptive decision directed speech recognition bias equalization method and apparatus
    3.
    Granted patent
    Adaptive decision directed speech recognition bias equalization method and apparatus [Expired]

    Publication number: US5812972A

    Publication date: 1998-09-22

    Application number: US366657

    Filing date: 1994-12-30

    Abstract: The present invention provides a speech recognizer that creates and updates the equalization vector as input speech is provided to the recognizer. The present invention includes a speech analyzer which transforms an input speech signal into a series of feature vectors or observation sequence. Each feature vector is then provided to a speech recognizer which modifies the feature vector by subtracting a previously determined equalization vector therefrom. The recognizer then performs segmentation and matches the modified feature vector to a stored model vector which is defined as the segmentation vector. The recognizer then, from time to time, determines a new equalization vector, the new equalization vector being defined based on the difference between one or more input feature vectors and their respective segmentation vectors. The new equalization vector may then be used either for performing another segmentation iteration on the same observation sequence or for performing segmentation on subsequent feature vectors.
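The subtract, match, and update loop described in this abstract can be sketched as follows. This is an illustration under assumptions, not the patented procedure: matching is reduced to a nearest-neighbor lookup over stored model vectors, the equalization vector is updated after every frame, and the smoothing factor `rate` is a hypothetical parameter.

```python
def nearest(model_vectors, v):
    """Return the stored model vector closest to v (squared Euclidean distance)."""
    return min(model_vectors, key=lambda m: sum((a - b) ** 2 for a, b in zip(m, v)))


def equalize_stream(frames, model_vectors, rate=0.5):
    """Subtract a running equalization vector and update it from each match."""
    eq = [0.0] * len(frames[0])
    modified_frames = []
    for y in frames:
        # Modify the feature vector with the previously determined equalization vector.
        modified = [a - e for a, e in zip(y, eq)]
        # The matched model vector plays the role of the segmentation vector.
        seg = nearest(model_vectors, modified)
        modified_frames.append(modified)
        # Move the equalization vector toward the difference between the
        # input feature vector and its segmentation vector.
        eq = [(1 - rate) * e + rate * (a - s) for e, a, s in zip(eq, y, seg)]
    return modified_frames, eq


model_vectors = [[0.0], [10.0]]
frames = [[1.0], [11.0], [1.0], [11.0]]
modified_frames, eq = equalize_stream(frames, model_vectors)
```

With a constant offset of 1.0 on every frame, the equalization vector climbs toward 1.0 (0.5, 0.75, 0.875, 0.9375), so later frames are compensated better than earlier ones.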

    Speaker verification with cohort normalized scoring
    4.
    Granted patent
    Speaker verification with cohort normalized scoring [Expired]

    Publication number: US5675704A

    Publication date: 1997-10-07

    Application number: US638401

    Filing date: 1996-04-26

    Abstract: A facility is provided for allowing a caller to place a telephone call by merely uttering a label identifying a desired called destination and to charge the telephone call to a particular billing account by merely uttering a label identifying that account. Alternatively, the caller may place the call by dialing or uttering the telephone number of the called destination or by entering a speed dial code associated with that telephone number. The facility includes a speaker verification system which employs cohort normalized scoring. Cohort normalized scoring provides a dynamic threshold for the verification process, making the process more robust to variation in training and verification utterances. Such variation may be caused by, e.g., changes in communication channel characteristics or speaker loudness level.
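A small sketch shows why cohort normalization acts as a dynamic threshold. The abstract does not say how the cohort scores are combined, so taking the maximum cohort log-likelihood is an assumption here (a mean is another common choice). The function names and scores are hypothetical.

```python
def cohort_normalized_score(claimed_loglik, cohort_logliks):
    """Claimed speaker's log-likelihood minus the best cohort log-likelihood."""
    return claimed_loglik - max(cohort_logliks)


def verify_speaker(claimed_loglik, cohort_logliks, threshold=0.0):
    """Accept the identity claim when the normalized score exceeds the threshold."""
    return cohort_normalized_score(claimed_loglik, cohort_logliks) > threshold


# Hypothetical scores on a clean channel.
clean = cohort_normalized_score(-40.0, [-55.0, -48.0, -60.0])

# A channel change that lowers every model's score by the same amount
# leaves the normalized score unchanged.
degraded = cohort_normalized_score(-50.0, [-65.0, -58.0, -70.0])
```

Because the claimed and cohort models are scored on the same utterance, a shift that affects all of them equally cancels out, so a fixed threshold on the normalized score behaves like a moving threshold on the raw score.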

    Method and apparatus for combined wired/wireless pop-out speakerphone microphone
    6.
    Granted patent
    Method and apparatus for combined wired/wireless pop-out speakerphone microphone [In force]

    Publication number: US08064969B2

    Publication date: 2011-11-22

    Application number: US10641449

    Filing date: 2003-08-15

    IPC classification: H04M1/00

    Abstract: The present invention is a desktop speakerphone having a base-station and a detachable microphone pod. The base-station includes standard telephone components, as well as a wireless receiver and a housing for a detachable microphone pod. The detachable pod contains at least one microphone and a wireless transmitter. When the pod is attached to the base-station, and the conference mode of operation is activated, the pod microphone's audio signal goes directly to base-station audio circuitry via a wired connection. When the pod is detached and the conference mode activated, the pod microphone's audio signal now goes via the pod's wireless transmitter to the base-station's wireless receiver. This detached, wireless mode allows the microphone to be positioned anywhere in the room, thereby improving the quality of transmitted speech by increasing the speech-signal-to-room-noise ratio, and lessening the potential for room echo by reducing the acoustic coupling between base-station loudspeaker and pod microphone.

    Methods and apparatus for discriminative training and adaptation of pronunciation networks
    7.
    Granted patent
    Methods and apparatus for discriminative training and adaptation of pronunciation networks [Expired]

    Publication number: US6076053A

    Publication date: 2000-06-13

    Application number: US82854

    Filing date: 1998-05-21

    IPC classification: G01L5/06

    Abstract: A speech recognition method comprises the steps of using given speech data and the N-best algorithm to generate alternative pronunciations and then merging the obtained pronunciations into a pronunciation network structure; using additional parameters to characterize a pronunciation network for a particular word; optimizing the parameters of the pronunciation networks using a minimum classification error criterion that maximizes the discrimination between different pronunciation networks; and adapting parameters of the pronunciation networks by, first, adjusting probabilities of the possible pronunciations that may be generated by the pronunciation network for a word claimed to be a true one and, second, correcting weights for all of the pronunciation networks by using the adjusted probabilities.
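The abstract names a minimum classification error (MCE) criterion without giving its form. The sketch below uses the textbook MCE formulation, a misclassification measure passed through a sigmoid, as an assumption. It is not taken from the patent, and the word scores, `eta`, and `gamma` are hypothetical.

```python
import math


def misclassification_measure(scores, correct, eta=1.0):
    """d = -g_correct + (1/eta) * log(mean(exp(eta * g_j))) over competing words."""
    competitors = [s for word, s in scores.items() if word != correct]
    m = max(competitors)  # shift by the maximum for numerical stability
    soft_max = m + math.log(
        sum(math.exp(eta * (s - m)) for s in competitors) / len(competitors)
    ) / eta
    return -scores[correct] + soft_max


def mce_loss(scores, correct, eta=1.0, gamma=1.0):
    """Smooth 0/1 loss: near 0 when the correct word wins, near 1 when it loses."""
    d = misclassification_measure(scores, correct, eta)
    return 1.0 / (1.0 + math.exp(-gamma * d))


# Hypothetical pronunciation-network scores for two candidate words.
scores = {"a": 2.0, "b": 0.0}
d_correct = misclassification_measure(scores, "a")
d_wrong = misclassification_measure(scores, "b")
```

A negative measure means the correct word's network outscores its competitors. Training would adjust network parameters to push the smoothed loss down, typically by gradient descent.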

    Speech recognition method with error reset commands
    8.
    Granted patent
    Speech recognition method with error reset commands [Expired]

    Publication number: US5781887A

    Publication date: 1998-07-14

    Application number: US728012

    Filing date: 1996-10-09

    Applicant: Biing-Hwang Juang

    Inventor: Biing-Hwang Juang

    IPC classification: G10L15/22 G10L7/08 G10L9/00

    CPC classification: G10L15/22

    Abstract: A method for revising at least a portion of a sequence of speech data segments recognized by an automated speech recognition system. A user is prompted to vocalize the speech data segments sequentially, one speech data segment at a time. When each speech data segment is recognized it is stored as a data element and a confirmation of recognition is issued to the user. The user may then issue a verbal command to delete the last recognized data element if the confirmation indicates that a recognition error has occurred, and then repeat the last speech data element for a second recognition attempt. The user may also issue another verbal command to delete all thus-far recognized data elements in the sequence and to restart the recognition process from the beginning. If no such verbal commands are issued by the user, then the user may continue to vocalize the next sequential speech data segment.
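The editing behavior described in this abstract amounts to a small state machine over recognized tokens. The sketch below assumes the recognizer's output is already available as a list of strings. The command words `"correction"` and `"start over"` are hypothetical placeholders, since the abstract does not name them.

```python
def collect_sequence(recognized_tokens, undo_word="correction", reset_word="start over"):
    """Build the stored data elements, honoring the two verbal reset commands."""
    elements = []
    for token in recognized_tokens:
        if token == undo_word:
            # Delete only the last recognized data element, if there is one.
            if elements:
                elements.pop()
        elif token == reset_word:
            # Delete everything recognized so far and restart from the beginning.
            elements.clear()
        else:
            # Ordinary segment: store it as the next data element.
            elements.append(token)
    return elements


undo_result = collect_sequence(["5", "5", "7", "correction", "1", "2"])
reset_result = collect_sequence(["1", "2", "start over", "9"])
```

In the first call the misrecognized "7" is dropped and the user carries on with "1" and "2". In the second, everything before "start over" is discarded.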

    Automatic pattern recognition using category dependent feature selection
    10.
    Patent application
    Automatic pattern recognition using category dependent feature selection [Expired]

    Publication number: US20080147402A1

    Publication date: 2008-06-19

    Application number: US11998262

    Filing date: 2007-11-29

    IPC classification: G10L15/00

    CPC classification: G06K9/623 G10L15/142

    Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category-dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension-expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
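As a loose illustration of category-dependent feature selection, the sketch below gives each sound category its own subset of feature indices and scores an input against that category's templates using only those features. The auditory model, the category names, the index sets, and the templates are not given in the abstract, so all of them are hypothetical.

```python
def select_features(vector, indices):
    """Keep only the features a given category treats as distinguishing."""
    return [vector[i] for i in indices]


def classify(vector, categories):
    """Return the label whose template is closest within its category's features.

    categories maps a category name to (feature indices, {label: template}).
    """
    best_label, best_dist = None, None
    for indices, templates in categories.values():
        x = select_features(vector, indices)
        for label, template in templates.items():
            # Normalize by feature count so categories of different size compare fairly.
            dist = sum((a - b) ** 2 for a, b in zip(x, template)) / len(indices)
            if best_dist is None or dist < best_dist:
                best_label, best_dist = label, dist
    return best_label


categories = {
    "vowel": ([0, 1], {"a": [1.0, 0.0], "i": [0.0, 1.0]}),
    "fricative": ([2, 3], {"s": [1.0, 1.0]}),
}
first = classify([0.9, 0.1, 0.0, 0.0], categories)
second = classify([0.5, 0.5, 1.0, 0.9], categories)
```

Each candidate is judged only on the region where its category is most distinctive, so the second input is labeled "s" even though its first two features are ambiguous between the vowels.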
