Combined engine system and method for voice recognition
    2.
    Granted patent
    Status: in force

    Publication No.: US06671669B1

    Publication date: 2003-12-30

    Application No.: US09618177

    Filing date: 2000-07-18

    IPC classification: G10L15/28

    CPC classification: G10L15/32

    Abstract: A method and system that combines voice recognition engines and resolves any differences between the results of the individual engines. A speaker-independent (SI) Hidden Markov Model (HMM) engine, a speaker-independent Dynamic Time Warping (DTW-SI) engine, and a speaker-dependent Dynamic Time Warping (DTW-SD) engine are combined. Combining the engines and resolving their results yields a system with better recognition accuracy and lower rejection rates than using the results of only one engine.
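
The abstract does not say how disagreements between the engines are resolved. As a minimal sketch only, the example below substitutes a simple majority vote for the patent's resolution logic. The function name `combine`, the engine labels used as dictionary keys, and the two-vote rule are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Optional


def combine(hypotheses: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the word at least two engines agree on, else None (reject).

    `hypotheses` maps a hypothetical engine label to that engine's best
    word, or to None when the engine itself rejected the utterance.
    Majority voting is a stand-in here, not the patented resolution rule.
    """
    votes = Counter(word for word in hypotheses.values() if word is not None)
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= 2 else None


agreed = combine({"HMM-SI": "call", "DTW-SI": "call", "DTW-SD": "dial"})
split = combine({"HMM-SI": "call", "DTW-SI": "dial", "DTW-SD": None})
```

Here `agreed` is the word two engines share, while `split` is `None` because no two engines agree, which this sketch treats as a rejection.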

    Voice recognition rejection scheme
    3.
    Granted patent
    Status: in force

    Publication No.: US06574596B2

    Publication date: 2003-06-03

    Application No.: US09248513

    Filing date: 1999-02-08

    IPC classification: G10L15/04

    CPC classification: G10L15/10 G10L15/22

    Abstract: A voice recognition rejection scheme for capturing an utterance includes the steps of accepting the utterance, applying an N-best algorithm to the utterance, or rejecting the utterance. The utterance is accepted if a first predefined relationship exists between one or more closest comparison results for the utterance with respect to a stored word and one or more differences between the one or more closest comparison results and one or more other comparison results between the utterance and one or more other stored words. An N-best algorithm is applied to the utterance if a second predefined relationship exists between the one or more closest comparison results and the one or more differences between the one or more closest comparison results and the one or more other comparison results. The utterance is rejected if a third predefined relationship exists between the one or more closest comparison results and the one or more differences between the one or more closest comparison results and the one or more other comparison results. One of the one or more other comparison results may advantageously be a next-closest comparison result for the utterance and another stored word. The first, second, and third predefined relationships may advantageously be linear relationships.
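
The three-way decision can be sketched as two linear boundaries in the plane spanned by the closest comparison result and its difference from the next-closest result. The slopes, intercepts, the names `classify` and `Decision`, and the treatment of comparison results as distances (smaller means closer) are illustrative assumptions, not values from the patent.

```python
from enum import Enum
from typing import Tuple


class Decision(Enum):
    ACCEPT = "accept"
    N_BEST = "n_best"
    REJECT = "reject"


def classify(
    best: float,
    next_best: float,
    accept_line: Tuple[float, float] = (0.5, 10.0),  # assumed (slope, intercept)
    reject_line: Tuple[float, float] = (0.5, 2.0),   # assumed (slope, intercept)
) -> Decision:
    """Classify an utterance from its two closest comparison results.

    `best` and `next_best` are distances to the closest and next-closest
    stored words. `delta` is the margin between them.
    """
    delta = next_best - best
    accept_threshold = accept_line[0] * best + accept_line[1]
    reject_threshold = reject_line[0] * best + reject_line[1]
    if delta >= accept_threshold:
        return Decision.ACCEPT   # clear winner: large margin for this distance
    if delta < reject_threshold:
        return Decision.REJECT   # margin too small to trust any candidate
    return Decision.N_BEST       # ambiguous: fall back to an N-best pass


clear = classify(best=10.0, next_best=40.0)
ambiguous = classify(best=10.0, next_best=20.0)
poor = classify(best=10.0, next_best=12.0)
```

With these assumed boundaries, a poorer best match (larger `best`) needs a larger margin over the runner-up before it is accepted outright.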

    System and method for automatic voice recognition using mapping
    4.
    Granted patent
    Status: in force

    Publication No.: US06754629B1

    Publication date: 2004-06-22

    Application No.: US09657760

    Filing date: 2000-09-08

    IPC classification: G10L17/00

    Abstract: A method and system that combines voice recognition engines and resolves differences between the results of the individual engines using a mapping function. Speaker-independent voice recognition engines and speaker-dependent voice recognition engines are combined. Hidden Markov Model (HMM) engines and Dynamic Time Warping (DTW) engines are combined.
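
The abstract does not specify the form of the mapping function. As one possible reading, the sketch below assumes a linear mapping that turns each candidate word's per-engine distances into a single score and picks the lowest. The names `map_distances` and `recognize`, the engine labels, and the weights are hypothetical.

```python
from typing import Dict, Tuple


def map_distances(
    distances: Dict[str, float], weights: Dict[str, float], bias: float = 0.0
) -> float:
    """Assumed linear mapping: bias plus a weighted sum of engine distances."""
    return bias + sum(weights[engine] * d for engine, d in distances.items())


def recognize(
    candidates: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> Tuple[str, float]:
    """Return the candidate word with the lowest mapped score."""
    scores = {word: map_distances(d, weights) for word, d in candidates.items()}
    best_word = min(scores, key=scores.get)
    return best_word, scores[best_word]


# Hypothetical distances reported by two engines for two candidate words.
candidates = {
    "call": {"HMM": 10.0, "DTW": 20.0},
    "dial": {"HMM": 12.0, "DTW": 14.0},
}
word, score = recognize(candidates, weights={"HMM": 0.5, "DTW": 0.5})
```

In this example the HMM engine alone would prefer "call", but the combined score favors "dial" because the DTW engine ranks it much closer.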

    Voice recognition system method and apparatus
    5.
    Granted patent
    Status: in force

    Publication No.: US06941265B2

    Publication date: 2005-09-06

    Application No.: US10017270

    Filing date: 2001-12-14

    IPC classification: G10L15/28 G10L15/00

    CPC classification: G10L15/28

    Abstract: Generally stated, a method and an accompanying apparatus provide for a voice recognition system (300) with a programmable front end processing unit (400). The front end processing unit (400) requests and receives different configuration files at different times for processing voice data in the voice recognition system (300). The configuration files are communicated to the front end unit via a communication link (310) for configuring the front end processing unit (400). A microprocessor may provide the front end configuration files on the communication link at different times.

    Voice recognition user interface for telephone handsets
    6.
    Granted patent
    Status: in force

    Publication No.: US06449496B1

    Publication date: 2002-09-10

    Application No.: US09246499

    Filing date: 1999-02-08

    IPC classification: H04B1/38

    CPC classification: H04M1/271

    Abstract: A method and apparatus providing a user interface within a phone that responds to a limited vocabulary of user-trained voice commands. The interface allows users to perform all phone handset dialing functions using voice commands. Additionally, users can create and modify entries within a voice recognition phonebook, so that a number in the phonebook can be called by saying the name associated with it. The user interface provides a combination of voice and LCD-displayed user prompts and responses to voice input. The interface responds to user voice commands and performs the command functions based upon matches to previously user-trained voice command vocabulary words stored in memory.

    Method and apparatus for accurate endpointing of speech in the presence of noise
    7.
    Granted patent
    Status: in force

    Publication No.: US06324509B1

    Publication date: 2001-11-27

    Application No.: US09246414

    Filing date: 1999-02-08

    IPC classification: G10L15/04

    CPC classification: G10L25/87 G10L2025/786

    Abstract: An apparatus for accurate endpointing of speech in the presence of noise includes a processor and a software module. The processor executes the instructions of the software module to compare an utterance with a first signal-to-noise-ratio (SNR) threshold value to determine a first starting point and a first ending point of the utterance. The processor then compares a part of the utterance that predates the first starting point with a second SNR threshold value to determine a second starting point of the utterance. The processor likewise compares a part of the utterance that postdates the first ending point with the second SNR threshold value to determine a second ending point of the utterance. The first and second SNR threshold values are recalculated periodically to reflect changing SNR conditions. The first SNR threshold value advantageously exceeds the second SNR threshold value.
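
The two-threshold search can be sketched on a list of per-frame SNR values. This is a simplified illustration, assuming that the second pass walks outward frame by frame until the SNR drops to the lower threshold. The function name `find_endpoints`, the frame-level representation, and the threshold values are assumptions, and the periodic recalculation of the thresholds is left out.

```python
from typing import Optional, Sequence, Tuple


def find_endpoints(
    snr_db: Sequence[float], high_db: float, low_db: float
) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices of the utterance, or None.

    Pass 1 uses the higher (first) threshold to find coarse endpoints.
    Pass 2 uses the lower (second) threshold to widen them, so that
    weak speech onsets and tails next to the coarse endpoints are kept.
    """
    above = [i for i, snr in enumerate(snr_db) if snr > high_db]
    if not above:
        return None  # nothing exceeded the first threshold
    start, end = above[0], above[-1]

    # Look at frames that predate the first starting point.
    while start > 0 and snr_db[start - 1] > low_db:
        start -= 1
    # Look at frames that postdate the first ending point.
    while end < len(snr_db) - 1 and snr_db[end + 1] > low_db:
        end += 1
    return start, end


frames = [0, 1, 6, 7, 15, 20, 18, 8, 6, 2, 0]  # made-up SNR values in dB
endpoints = find_endpoints(frames, high_db=12.0, low_db=5.0)
```

With these made-up values the first pass finds frames 4 to 6 and the second pass widens the utterance to frames 2 to 8.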

    Zero disparity plane for feedback-based three-dimensional video
    8.
    Granted patent
    Status: in force

    Publication No.: US09049423B2

    Publication date: 2015-06-02

    Application No.: US12958107

    Filing date: 2010-12-01

    Abstract: The techniques of this disclosure are directed to the feedback-based stereoscopic display of three-dimensional images, such as may be used for video telephony (VT) and human-machine interface (HMI) applications. According to one example, a region of interest (ROI) of stereoscopically captured images may be automatically determined based on determining disparity for at least one pixel of the captured images. According to another example, a zero disparity plane (ZDP) for the presentation of a 3D representation of stereoscopically captured images may be determined based on an identified ROI. According to this example, the ROI may be automatically identified, or identified based on receipt of user input identifying the ROI.
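
One way to picture a ZDP derived from an ROI: pick a representative disparity for the ROI and subtract it everywhere, so the ROI ends up at roughly zero disparity. The sketch below assumes a precomputed per-pixel disparity map, a rectangular ROI, and the median as the representative value. The names `zdp_disparity` and `recenter` and all numbers are hypothetical.

```python
from statistics import median
from typing import List, Tuple

DisparityMap = List[List[float]]


def zdp_disparity(disparity: DisparityMap, roi: Tuple[int, int, int, int]) -> float:
    """Representative disparity of the ROI (median is an assumed choice).

    `roi` is (top, left, bottom, right) with bottom and right exclusive.
    """
    top, left, bottom, right = roi
    values = [disparity[r][c] for r in range(top, bottom) for c in range(left, right)]
    return median(values)


def recenter(disparity: DisparityMap, zdp: float) -> DisparityMap:
    """Shift all disparities so that the ZDP value maps to zero."""
    return [[d - zdp for d in row] for row in disparity]


# Made-up 4x4 disparity map: a near object in the centre, far background.
disparity = [
    [2, 2, 2, 2],
    [2, 8, 9, 2],
    [2, 8, 8, 2],
    [2, 2, 2, 2],
]
zdp = zdp_disparity(disparity, roi=(1, 1, 3, 3))
shifted = recenter(disparity, zdp)
```

After the shift, the ROI pixels sit at or near zero disparity and the background takes negative values, which places it on the opposite side of the display plane from where it started.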

    ZERO DISPARITY PLANE FOR FEEDBACK-BASED THREE-DIMENSIONAL VIDEO
    9.
    Patent application
    Status: in force

    Publication No.: US20120140038A1

    Publication date: 2012-06-07

    Application No.: US12958107

    Filing date: 2010-12-01

    IPC classification: H04N13/02 G06K9/00

    Abstract: The techniques of this disclosure are directed to the feedback-based stereoscopic display of three-dimensional images, such as may be used for video telephony (VT) and human-machine interface (HMI) applications. According to one example, a region of interest (ROI) of stereoscopically captured images may be automatically determined based on determining disparity for at least one pixel of the captured images. According to another example, a zero disparity plane (ZDP) for the presentation of a 3D representation of stereoscopically captured images may be determined based on an identified ROI. According to this example, the ROI may be automatically identified, or identified based on receipt of user input identifying the ROI.

    Method for searching an excitation codebook in a code excited linear prediction (CELP) coder
    10.
    Granted patent
    Status: no longer in force

    Publication No.: US5751901A

    Publication date: 1998-05-12

    Application No.: US690709

    Filing date: 1996-07-31

    CPC classification: G10L19/12 G10L25/06

    Abstract: A method for selecting a code vector in an algebraic codebook wherein the analysis window for the coder is extended beyond the length of the target speech frame. By extending the analysis window, the two-dimensional impulse response matrix can be stored as a one-dimensional autocorrelation matrix, which greatly reduces the computational complexity and memory required for the search.
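
The saving described in the abstract can be illustrated numerically. If the filtered code vector is allowed to run past the end of the frame, its energy depends only on the distances between pulses, so a single autocorrelation array r(k) replaces the full matrix. The sketch below checks this against a direct convolution. The impulse response, pulse positions, and function names are made up for illustration and are not taken from the patent.

```python
from typing import List, Sequence, Tuple

Pulse = Tuple[int, int]  # (position in frame, sign: +1 or -1)


def autocorrelation(h: Sequence[float], frame_len: int) -> List[float]:
    """r(k) = sum over n of h[n] * h[n + k], for k = 0 .. frame_len - 1."""
    return [
        sum(h[n] * h[n + k] for n in range(len(h) - k))  # empty sum (0) if k >= len(h)
        for k in range(frame_len)
    ]


def energy_from_autocorrelation(pulses: Sequence[Pulse], r: Sequence[float]) -> float:
    """Energy of the filtered code vector using only the 1-D array r."""
    return sum(si * sj * r[abs(pi - pj)] for pi, si in pulses for pj, sj in pulses)


def energy_direct(pulses: Sequence[Pulse], h: Sequence[float]) -> float:
    """Reference: convolve the pulses with h without truncating at the frame end."""
    y = [0.0] * (max(p for p, _ in pulses) + len(h))
    for p, s in pulses:
        for n, hn in enumerate(h):
            y[p + n] += s * hn
    return sum(v * v for v in y)


h = [1.0, 0.5, 0.25]       # made-up impulse response
pulses = [(0, 1), (2, -1)]  # made-up sparse algebraic code vector
r = autocorrelation(h, frame_len=3)
fast = energy_from_autocorrelation(pulses, r)
reference = energy_direct(pulses, h)
```

Both computations give the same energy. If the convolution were cut off at the frame boundary instead, the correlation would depend on the absolute pulse positions and the full two-dimensional matrix would be needed.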
