Statistical Machine Translation Based Search Query Spelling Correction
    1.
    Patent application
    Statistical Machine Translation Based Search Query Spelling Correction (status: under examination, published)

    Publication No.: US20130124492A1

    Publication date: 2013-05-16

    Application No.: US13296640

    Filing date: 2011-11-15

    IPC classification: G06F17/30

    Abstract: Statistical Machine Translation (SMT) based search query spelling correction techniques are described herein. In one or more implementations, search data regarding searches performed by clients may be logged. The logged data includes query correction pairs that may be used to ascertain error patterns indicating how misspelled substrings may be translated to corrected substrings. The error patterns may be used to determine suggestions for an input query and to develop query correction models used to translate the input query to a corrected query. In one or more implementations, probabilistic features from multiple query correction models are combined to score different correction candidates. One or more top-scoring correction candidates may then be exposed as suggestions for selection by a user and/or provided to a search engine to conduct a corresponding search using the corrected query version(s).
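
    The scoring step described in this abstract can be illustrated with a minimal sketch. It assumes two stand-in models (a translation model estimated from logged query correction pairs and a unigram language model) combined as a weighted sum of log-probabilities. The data, function names, and weights are hypothetical, and the sketch works on whole queries rather than on the substring error patterns the abstract mentions.

```python
import math
from collections import Counter

# Hypothetical logged (query, correction) pairs; illustrative data only.
LOGGED_PAIRS = [
    ("britny spears", "britney spears"),
    ("britny spears", "britney spears"),
    ("britny spears", "brittany spears"),
    ("recieve mail", "receive mail"),
]

# Hypothetical corpus of well-formed queries for a simple language-model feature.
CLEAN_QUERIES = [
    "britney spears",
    "britney spears songs",
    "receive mail",
    "brittany murphy",
]


def translation_log_prob(query, candidate, pairs):
    """log P(candidate | query) estimated from logged pairs, add-one smoothed."""
    counts = Counter(c for q, c in pairs if q == query)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen corrections
    return math.log((counts[candidate] + 1) / (total + vocab))


def language_log_prob(candidate, corpus):
    """Unigram log-probability of the candidate, add-one smoothed."""
    words = Counter(w for q in corpus for w in q.split())
    total = sum(words.values())
    vocab = len(words) + 1  # one extra slot for unseen words
    return sum(math.log((words[w] + 1) / (total + vocab)) for w in candidate.split())


def score(query, candidate, weights=(1.0, 0.5)):
    """Weighted sum of the two model features (weights are assumed, not tuned)."""
    features = (
        translation_log_prob(query, candidate, LOGGED_PAIRS),
        language_log_prob(candidate, CLEAN_QUERIES),
    )
    return sum(w * f for w, f in zip(weights, features))


def suggest(query, candidates, top_k=1):
    """Return the top_k highest-scoring correction candidates."""
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)[:top_k]
```

    Calling `suggest("britny spears", ["britney spears", "brittany spears"])` ranks the candidate that is both more frequent in the log and more likely under the language model first.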

    Statistical machine translation based search query spelling correction

    Publication No.: US10176168B2

    Publication date: 2019-01-08

    Application No.: US13296640

    Filing date: 2011-11-15

    IPC classification: G06F17/30 G06F17/28 G06F17/27

    Abstract: Statistical Machine Translation (SMT) based search query spelling correction techniques are described herein. In one or more implementations, search data regarding searches performed by clients may be logged. The logged data includes query correction pairs that may be used to ascertain error patterns indicating how misspelled substrings may be translated to corrected substrings. The error patterns may be used to determine suggestions for an input query and to develop query correction models used to translate the input query to a corrected query. In one or more implementations, probabilistic features from multiple query correction models are combined to score different correction candidates. One or more top-scoring correction candidates may then be exposed as suggestions for selection by a user and/or provided to a search engine to conduct a corresponding search using the corrected query version(s).

    Method and system for dynamically adjusted training for speech recognition
    4.
    Granted patent
    Method and system for dynamically adjusted training for speech recognition (status: lapsed)

    Publication No.: US5963903A

    Publication date: 1999-10-05

    Application No.: US673435

    Filing date: 1996-06-28

    CPC classification: G10L15/063 G10L2015/0635

    Abstract: A method and system for dynamically selecting words for training a speech recognition system. The speech recognition system models each phoneme using a hidden Markov model and represents each word as a sequence of phonemes. The training system ranks each phoneme for each frame according to the probability that the corresponding codeword will be spoken as part of the phoneme. The training system collects spoken utterances for which the corresponding word is known. The training system then aligns the codewords of each utterance with the phoneme that it is recognized to be part of. The training system then calculates an average rank for each phoneme using the aligned codewords for the aligned frames. Finally, the training system selects words for training that contain phonemes with a low rank.
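
    The last two steps of this abstract (averaging the rank of each phoneme over its aligned frames, then selecting words that contain poorly ranked phonemes) can be sketched as follows. The frame data, lexicon, and threshold are hypothetical. The sketch also assumes that rank 1 means "most probable", so that a "low rank" in the abstract corresponds to a numerically large average rank.

```python
from collections import defaultdict

# Hypothetical aligned frames: (phoneme the frame was aligned to,
# rank of that phoneme for the frame's codeword; 1 = most probable).
ALIGNED_FRAMES = [
    ("k", 1), ("k", 2), ("ae", 5), ("ae", 7), ("t", 1), ("t", 1), ("s", 3),
]

# Hypothetical pronunciation lexicon: word -> phoneme sequence.
LEXICON = {
    "cat": ["k", "ae", "t"],
    "kit": ["k", "ih", "t"],
    "sat": ["s", "ae", "t"],
    "tuck": ["t", "ah", "k"],
}


def average_ranks(frames):
    """Average rank of each phoneme over the frames aligned to it."""
    sums = defaultdict(lambda: [0, 0])  # phoneme -> [rank total, frame count]
    for phoneme, rank in frames:
        sums[phoneme][0] += rank
        sums[phoneme][1] += 1
    return {p: total / n for p, (total, n) in sums.items()}


def select_training_words(lexicon, avg_rank, threshold=3.0):
    """Words with at least one phoneme whose average rank is worse than threshold."""
    weak = {p for p, r in avg_rank.items() if r > threshold}
    return sorted(w for w, phones in lexicon.items() if weak & set(phones))
```

    With the sample data, only the phoneme "ae" has an average rank above the assumed threshold, so the words selected for further training are those that contain it.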

    FORCE-FEEDBACK WITHIN TELEPRESENCE
    7.
    Patent application
    FORCE-FEEDBACK WITHIN TELEPRESENCE (status: in force)

    Publication No.: US20100306647A1

    Publication date: 2010-12-02

    Application No.: US12472579

    Filing date: 2009-05-27

    IPC classification: G06F3/01 G06F3/048

    CPC classification: G06F3/016

    Abstract: The claimed subject matter provides a system and/or a method that facilitates replicating a telepresence session with a real world physical meeting. A telepresence session can be initiated within a communication framework that includes two or more virtually represented users that communicate therein. A trigger component can monitor the telepresence session in real time to identify a participant interaction with an object, wherein the object is at least one of a real world physical object or a virtually represented object within the telepresence session. A feedback component can implement a force feedback to at least one participant within the telepresence session based upon the identified participant interaction with the object, wherein the force feedback is employed via a device associated with at least one participant.

    Use of a unified language model
    9.
    Granted patent
    Use of a unified language model (status: lapsed)

    Publication No.: US07013265B2

    Publication date: 2006-03-14

    Application No.: US11003121

    Filing date: 2004-12-03

    IPC classification: G06F17/27 G10L15/18 G10L11/00

    CPC classification: G10L15/193 G10L15/197

    Abstract: A language processing system includes a unified language model. The unified language model comprises a plurality of context-free grammars having non-terminal tokens representing semantic or syntactic concepts and terminals, and an N-gram language model having non-terminal tokens. A language processing module capable of receiving an input signal indicative of language accesses the unified language model to recognize the language. The language processing module generates hypotheses for the received language as a function of words of the unified language model and/or provides an output signal indicative of the language and at least some of the semantic or syntactic concepts contained therein.
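
    One way to picture an N-gram model whose vocabulary includes grammar non-terminals is the heavily simplified sketch below. It assumes each context-free grammar has been flattened to a list of phrases with expansion probabilities, uses a bigram model, and picks a single greedy longest-match segmentation. All of these are simplifications chosen for illustration; the tables, token names, and floor probability are hypothetical.

```python
import math

# Hypothetical grammars, flattened to the phrases each non-terminal can
# produce, with an assumed probability for each expansion.
GRAMMARS = {
    "<CITY>": {("boston",): 0.5, ("new", "york"): 0.5},
    "<DAY>": {("monday",): 0.5, ("tuesday",): 0.5},
}

# Hypothetical bigram probabilities over words and non-terminal tokens.
BIGRAMS = {
    ("<s>", "fly"): 0.5, ("fly", "to"): 0.8, ("to", "<CITY>"): 0.6,
    ("<CITY>", "on"): 0.4, ("on", "<DAY>"): 0.7, ("<DAY>", "</s>"): 0.9,
}
FLOOR = 1e-6  # assumed probability for bigrams missing from the table


def tokenize(words):
    """Replace grammar phrases with their non-terminal (greedy longest match).

    Returns the token sequence and the log-probability of the expansions used.
    """
    tokens, log_p, i = [], 0.0, 0
    while i < len(words):
        best = None
        for nonterminal, expansions in GRAMMARS.items():
            for phrase, p in expansions.items():
                if tuple(words[i:i + len(phrase)]) == phrase:
                    if best is None or len(phrase) > len(best[1]):
                        best = (nonterminal, phrase, p)
        if best:
            tokens.append(best[0])
            log_p += math.log(best[2])
            i += len(best[1])
        else:
            tokens.append(words[i])
            i += 1
    return tokens, log_p


def sentence_log_prob(sentence):
    """Bigram log-probability over tokens plus grammar expansion log-probability."""
    tokens, log_p = tokenize(sentence.lower().split())
    seq = ["<s>"] + tokens + ["</s>"]
    for prev, cur in zip(seq, seq[1:]):
        log_p += math.log(BIGRAMS.get((prev, cur), FLOOR))
    return tokens, log_p
```

    For the input "fly to new york on monday", the returned token sequence carries the concepts `<CITY>` and `<DAY>` alongside the plain words, which mirrors the abstract's idea of an output that indicates both the language and some of the concepts it contains.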

    Information retrieval and speech recognition based on language models
    10.
    Granted patent
    Information retrieval and speech recognition based on language models (status: lapsed)

    Publication No.: US06418431B1

    Publication date: 2002-07-09

    Application No.: US09050286

    Filing date: 1998-03-30

    IPC classification: G06F17/30

    Abstract: A language model is used in a speech recognition system which has access to a first, smaller data store and a second, larger data store. The language model is adapted by formulating an information retrieval query based on information contained in the first data store and querying the second data store. Information retrieved from the second data store is used in adapting the language model. Also, language models are used in retrieving information from the second data store. Language models are built based on information in the first data store, and based on information in the second data store. The perplexity of a document in the second data store is determined, given the first language model, and given the second language model. Relevancy of the document is determined based upon the first and second perplexities. Documents are retrieved which have a relevancy measure that exceeds a threshold level.
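
    The perplexity-based retrieval step can be sketched as below. The abstract does not say how the two perplexities are combined, so the ratio used here (perplexity under the second-store model divided by perplexity under the first-store model) is an assumption, as are the add-one-smoothed unigram models, the function names, and the threshold.

```python
import math
from collections import Counter


def train_unigram(texts):
    """Build an add-one-smoothed unigram probability function from texts."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab)

    return prob


def perplexity(prob, text):
    """Perplexity of a non-empty text under a unigram model."""
    words = text.lower().split()
    log_sum = sum(math.log(prob(w)) for w in words)
    return math.exp(-log_sum / len(words))


def retrieve(small_store, large_store, threshold=1.0):
    """Return large-store documents whose relevancy exceeds the threshold.

    Relevancy is assumed to be the ratio of the document's perplexity under
    the large-store model to its perplexity under the small-store model.
    """
    first_lm = train_unigram(small_store)
    second_lm = train_unigram(large_store)
    hits = []
    for doc in large_store:
        relevancy = perplexity(second_lm, doc) / perplexity(first_lm, doc)
        if relevancy > threshold:
            hits.append(doc)
    return hits
```

    A document that the small-store model predicts better than the general model gets a ratio above 1, so with a small store of flight queries the sketch retrieves flight-related documents and skips unrelated ones.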
