SYSTEM AND METHOD FOR OPTIMIZING PATTERN RECOGNITION OF NON-GAUSSIAN PARAMETERS
    11.
    Patent application
    Status: in force

    Publication No.: US20090254496A1

    Publication date: 2009-10-08

    Application No.: US12061023

    Filing date: 2008-04-02

    IPC classification: G06F15/18 G06F17/10

    CPC classification: G06K9/6226

    Abstract: A method of optimizing a function of a parameter includes associating, with an objective function for an initial value of the parameters, an auxiliary function of the parameters that can be optimized computationally more efficiently than the original objective function; obtaining parameters that are optimum for the auxiliary function; and obtaining updated parameters by taking a weighted sum of the optimum of the auxiliary function and the initial model parameters.
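
    The update rule in this abstract can be illustrated with a minimal Python sketch. The objective log(cosh(x - 3)), the quadratic auxiliary function, and the names auxiliary_optimum, weighted_update, and optimize are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
import math


def objective(x):
    """Illustrative non-quadratic objective (assumed); its minimum is at x = 3."""
    return math.log(math.cosh(x - 3.0))


def auxiliary_optimum(x0):
    """Minimizer of an assumed quadratic auxiliary function built at x0.

    Q(x; x0) = f(x0) + f'(x0) * (x - x0) + 0.5 * (x - x0) ** 2,
    where f'(x0) = tanh(x0 - 3). Setting dQ/dx = 0 gives the closed form below.
    """
    return x0 - math.tanh(x0 - 3.0)


def weighted_update(x0, weight):
    """Weighted sum of the auxiliary optimum and the initial parameter."""
    x_aux = auxiliary_optimum(x0)
    return weight * x_aux + (1.0 - weight) * x0


def optimize(x_init, weight=0.8, steps=50):
    """Repeat the weighted update, using each result as the next initial value."""
    x = x_init
    for _ in range(steps):
        x = weighted_update(x, weight)
    return x
```

    With weight = 1 the update jumps straight to the auxiliary optimum; smaller weights keep the result closer to the initial parameters.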

    METHODS AND APPARATUS FOR CORRECTING RECOGNITION ERRORS
    14.
    Patent application
    Status: under examination (published)

    Publication No.: US20120304057A1

    Publication date: 2012-11-29

    Application No.: US13479010

    Filing date: 2012-05-23

    IPC classification: G06F17/00

    Abstract: Techniques for error correction using a history list comprising at least one misrecognition and correction information associated with each of the at least one misrecognitions indicating how a user corrected the associated misrecognition. The techniques include converting data input from a user to generate a text segment, determining whether at least a portion of the text segment appears in the history list as one of the at least one misrecognitions, if the at least a portion of the text segment appears in the history list as one of the at least one misrecognitions, obtaining the correction information associated with the at least one misrecognition, and correcting the at least a portion of the text segment based, at least in part, on the correction information.
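
    The history-list lookup described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the history list is modeled as a dict from misrecognized text to the user's correction, matching is plain substring search, and the function name correct_with_history is hypothetical.

```python
def correct_with_history(text_segment, history):
    """Apply stored corrections to any misrecognition found in the segment.

    `history` maps a previously misrecognized string to the correction the
    user supplied for it (an assumed representation of the history list).
    """
    corrected = text_segment
    for misrecognition, correction in history.items():
        # Only touch the segment when the misrecognition actually appears.
        if misrecognition in corrected:
            corrected = corrected.replace(misrecognition, correction)
    return corrected
```

    A segment that contains no entry from the history list is returned unchanged.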

    CONVERSATIONAL COMPUTING VIA CONVERSATIONAL VIRTUAL MACHINE
    15.
    Patent application
    Status: in force

    Publication No.: US20090313026A1

    Publication date: 2009-12-17

    Application No.: US12544473

    Filing date: 2009-08-20

    IPC classification: G10L15/22

    Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.

    Hierarchical labeler in a speech recognition system
    16.
    Granted patent
    Status: expired

    Publication No.: US6023673A

    Publication date: 2000-02-08

    Application No.: US869061

    Filing date: 1997-06-04

    IPC classification: G10L5/06 G10L9/00

    CPC classification: G10L15/083

    Abstract: A speech coding apparatus and method uses a hierarchy of prototype sets to code an utterance while consuming fewer computing resources. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. A plurality of level subsets of prototype vector signals is computed, wherein each prototype vector signal in a higher level subset is associated with at least one prototype vector signal in a lower level subset. Each level subset contains a plurality of prototype vector signals, with lower level subsets containing more prototypes than higher level subsets. The closeness of the feature value of the first feature vector signal is compared to the parameter values of prototype vector signals in the first level subset of prototype vector signals to obtain a ranked list of prototype match scores for the first feature vector signal and each prototype vector signal in the first level subset. The closeness of the feature value of the first feature vector signal is compared to the parameter values of each prototype vector signal in a second (lower) level subset that is associated with the highest ranking prototype vectors in the first level subset, to obtain a second ranked list of prototype match scores. The identification value of the prototype vector signal in the second ranked list having the best prototype match score is output as a coded utterance representation signal of the first feature vector signal.
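
    The two-level search in this abstract can be sketched as follows. The sketch assumes squared Euclidean distance as the match score (lower is better), a dict-based prototype store, and the hypothetical names label_frame, children, and top_k; the abstract does not fix any of these.

```python
def squared_distance(a, b):
    """Assumed match score: squared Euclidean distance (lower is better)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def label_frame(feature, level1, level2, children, top_k=1):
    """Label one feature vector using a two-level prototype hierarchy.

    level1, level2: dicts mapping prototype id -> prototype vector.
    children: dict mapping a level-1 id -> list of associated level-2 ids.
    """
    # Rank the small first-level set by closeness to the feature vector.
    ranked = sorted(level1, key=lambda pid: squared_distance(feature, level1[pid]))
    # Score only the second-level prototypes tied to the top-ranked parents.
    candidates = set()
    for pid in ranked[:top_k]:
        candidates.update(children[pid])
    # The best-scoring second-level prototype id is the output label.
    return min(sorted(candidates),
               key=lambda cid: squared_distance(feature, level2[cid]))
```

    The saving comes from skipping every second-level prototype whose parent did not rank in the top top_k.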

    Method and apparatus for suppressing background music or noise from the speech input of a speech recognizer
    17.
    Granted patent
    Status: expired

    Publication No.: US5848163A

    Publication date: 1998-12-08

    Application No.: US594679

    Filing date: 1996-02-02

    CPC classification: G10L21/0208

    Abstract: A method and apparatus for removing the effect of background music or noise from speech input to a speech recognizer so as to improve recognition accuracy has been devised. Samples of pure music or noise related to the background music or noise that corrupts the speech input are utilized to reduce the effect of the background in speech recognition. The pure music and noise samples can be obtained in a variety of ways. The music or noise corrupted speech input is segmented in overlapping segments and is then processed in two phases: first, the best matching pure music or noise segment is aligned with each speech segment; then a linear filter is built for each segment to remove the effect of background music or noise from the speech input and the overlapping segments are averaged to improve the signal to noise ratio. The resulting acoustic output can then be fed to a speech recognizer.
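
    The two phases in this abstract (alignment, then per-segment filtering) can be sketched for a single segment. This is a deliberately simplified illustration: alignment uses normalized correlation, the "linear filter" is reduced to a one-tap least-squares gain, overlap averaging is omitted, and the names best_alignment and suppress_background are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def best_alignment(segment, reference):
    """Return the offset in `reference` whose window best matches `segment`."""
    n = len(segment)
    seg_energy = dot(segment, segment)
    best_offset, best_score = 0, -1.0
    for offset in range(len(reference) - n + 1):
        window = reference[offset:offset + n]
        energy = dot(window, window)
        if energy == 0.0 or seg_energy == 0.0:
            continue  # correlation is undefined for a silent window
        score = abs(dot(segment, window)) / math.sqrt(energy * seg_energy)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


def suppress_background(segment, reference):
    """Phase 1: align the pure background. Phase 2: subtract a scaled copy."""
    offset = best_alignment(segment, reference)
    window = reference[offset:offset + len(segment)]
    energy = dot(window, window)
    # One-tap least-squares filter (assumed simplification of the linear filter).
    gain = dot(segment, window) / energy if energy > 0.0 else 0.0
    cleaned = [s - gain * w for s, w in zip(segment, window)]
    return cleaned, offset
```

    A fuller version would apply this to overlapping segments and average the overlapping outputs, as the abstract describes.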

    State-dependent speaker clustering for speaker adaptation
    18.
    Granted patent
    Status: expired

    Publication No.: US5787394A

    Publication date: 1998-07-28

    Application No.: US572223

    Filing date: 1995-12-13

    IPC classification: G10L15/06 G10L5/06

    CPC classification: G10L15/07 G10L2015/0631

    Abstract: A system and method for adaptation of a speaker independent speech recognition system for use by a particular user. The system and method gather acoustic characterization data from a test speaker and compare the data with acoustic characterization data generated for a plurality of training speakers. A match score is computed between the test speaker's acoustic characterization for a particular acoustic subspace and each training speaker's acoustic characterization for the same acoustic subspace. The training speakers are ranked for the subspace according to their scores and a new acoustic model is generated for the test speaker based upon the test speaker's acoustic characterization data and the acoustic characterization data of the closest matching training speakers. The process is repeated for each acoustic subspace.
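
    The per-subspace ranking in this abstract can be sketched as below. The sketch assumes each speaker's acoustic characterization for a subspace is a single mean vector, the match score is squared Euclidean distance, and the new model is a plain average; the names closest_speakers, adapt_subspace, and adapt_model are hypothetical.

```python
def squared_distance(a, b):
    """Assumed match score: squared Euclidean distance (lower is better)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_speakers(test_mean, training_means, n_closest):
    """Rank training speakers for one subspace and keep the closest ones."""
    ranked = sorted(training_means,
                    key=lambda spk: squared_distance(test_mean, training_means[spk]))
    return ranked[:n_closest]


def adapt_subspace(test_mean, training_means, n_closest=2):
    """Average the test speaker's vector with its closest training speakers."""
    chosen = closest_speakers(test_mean, training_means, n_closest)
    vectors = [test_mean] + [training_means[spk] for spk in chosen]
    return tuple(sum(v[d] for v in vectors) / len(vectors)
                 for d in range(len(test_mean)))


def adapt_model(test_stats, training_stats, n_closest=2):
    """Repeat the ranking and averaging independently for every subspace."""
    return {subspace: adapt_subspace(test_stats[subspace],
                                     training_stats[subspace], n_closest)
            for subspace in test_stats}
```

    Because the ranking is redone per subspace, a different set of training speakers may be selected for each one.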

    System and method for automatic handwriting recognition with a writer-independent chirographic label alphabet
    19.
    Granted patent
    Status: expired

    Publication No.: US5644652A

    Publication date: 1997-07-01

    Application No.: US424236

    Filing date: 1995-04-19

    IPC classification: G06K9/70 G06K9/62 G06K9/18

    CPC classification: G06K9/6297

    Abstract: An automatic handwriting recognition system wherein each written (chirographic) manifestation of each character is represented by a statistical model (called a hidden Markov model). The system implements a method which entails sampling a pool of independent writers and deriving a hidden Markov model for each particular character (allograph) which is independent of a particular writer. The HMMs are used to derive a chirographic label alphabet which is independent of each writer. This is accomplished during what is described as the training phase of the system. The alphabet is constructed using supervised techniques. That is, the alphabet is constructed using information learned in the training phase to adjust the result according to a statistical algorithm (such as a Viterbi alignment) to arrive at a cost efficient recognition tool. Once such an alphabet is constructed, a new set of HMMs can be defined which more accurately reflects parameter typing across writers. The system recognizes handwriting by applying an efficient hierarchical decoding strategy which employs a fast match and a detailed match function, thereby making the recognition cost effective.

    Automatic handwriting recognition using both static and dynamic parameters

    Publication No.: US5550931A

    Publication date: 1996-08-27

    Application No.: US450557

    Filing date: 1995-05-25

    Abstract: Methods and apparatus are disclosed for recognizing handwritten characters in response to an input signal from a handwriting transducer. A feature extraction and reduction procedure is disclosed that relies on static or shape information, wherein the temporal order in which points are captured by an electronic tablet may be disregarded. A method of the invention generates and processes the tablet data with three independent sets of feature vectors which encode the shape information of the input character information. These feature vectors include horizontal (x-axis) and vertical (y-axis) slices of a bit-mapped image of the input character data, and an additional feature vector to encode an absolute y-axis displacement from a baseline of the bit-mapped image. It is shown that the recognition errors that result from the spatial or static processing are quite different from those resulting from temporal or dynamic processing. Furthermore, it is shown that these differences complement one another. As a result, a combination of these two sources of feature vector information provides a substantial reduction in an overall recognition error rate. Methods to combine probability scores from the dynamic and static character models are also disclosed.
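
    One plausible way to combine the two models' scores is a weighted sum of log probabilities, sketched below. The abstract does not say which combination rule is used, so the log-linear form, the static_weight parameter, and the name combine_scores are assumptions for illustration only.

```python
import math


def combine_scores(static_probs, dynamic_probs, static_weight=0.5):
    """Combine per-character probabilities from the static and dynamic models.

    Both inputs map a candidate character to a strictly positive probability.
    Returns the best candidate and the combined log scores.
    """
    combined = {}
    for char, p_static in static_probs.items():
        if char not in dynamic_probs:
            continue  # only score candidates proposed by both models
        combined[char] = (static_weight * math.log(p_static)
                          + (1.0 - static_weight) * math.log(dynamic_probs[char]))
    best = max(combined, key=combined.get)
    return best, combined
```

    Raising static_weight toward 1 makes the decision follow the shape-based model; lowering it favors the stroke-order model.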