Spoken language understanding that incorporates prior knowledge into boosting
    1.
    Granted patent
    Status: in force

    Publication number: US07328146B1

    Publication date: 2008-02-05

    Application number: US11484120

    Filing date: 2006-07-11

    IPC classification: G06F17/20

    CPC classification: G06F17/28

    Abstract: A system for understanding entries, such as speech, develops a classifier by employing prior knowledge with which a given corpus of training entries is enlarged threefold. A rule is created for each of the labels employed in the classifier, and the created rules are applied to the given corpus to create a corpus of attachments by appending a weight of ηp(x), or 1−ηp(x), to labels of entries that meet, or fail to meet, respectively, conditions of the labels' rules, and to also create a corpus of non-attachments by appending a weight of 1−ηp(x), or ηp(x), to labels of entries that meet, or fail to meet, respectively, conditions of the labels' rules.
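The weighting scheme in this abstract can be illustrated with a small sketch. It is not the patented implementation: it assumes that ηp(x) means the product of a constant η and a prior probability p(x), that each rule is a simple predicate on the entry text, and that the prior is supplied by the caller. The names `build_weighted_corpora`, `rules`, and `prior` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Rule = Callable[[str], bool]
Weighted = Tuple[str, Dict[str, float]]  # (entry, {label: weight})


def build_weighted_corpora(
    entries: List[str],
    rules: Dict[str, Rule],
    prior: Callable[[str, str], float],
    eta: float,
) -> Tuple[List[Weighted], List[Weighted]]:
    """Build the attachment and non-attachment corpora described in the abstract.

    Assumption: "ηp(x)" is read as eta * prior(entry, label).
    """
    attachments: List[Weighted] = []
    non_attachments: List[Weighted] = []
    for x in entries:
        attach: Dict[str, float] = {}
        non_attach: Dict[str, float] = {}
        for label, rule in rules.items():
            w = eta * prior(x, label)
            if rule(x):
                # Entry meets the label's rule.
                attach[label], non_attach[label] = w, 1.0 - w
            else:
                # Entry fails the label's rule.
                attach[label], non_attach[label] = 1.0 - w, w
        attachments.append((x, attach))
        non_attachments.append((x, non_attach))
    return attachments, non_attachments


# Toy data: one label with a keyword rule and a constant prior (both made up).
entries = ["my bill is wrong", "hello there"]
rules = {"billing": lambda x: "bill" in x}
attachments, non_attachments = build_weighted_corpora(
    entries, rules, lambda x, label: 0.9, eta=0.5
)
```

Together with the original corpus, the two returned corpora give three corpora of equal size, which matches the "enlarged threefold" wording.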

    Language-understanding systems employing machine translation components
    2.
    Granted patent
    Status: in force

    Publication number: US07212964B1

    Publication date: 2007-05-01

    Application number: US10103049

    Filing date: 2002-03-22

    IPC classification: G06F17/21 G06F17/28

    Abstract: Embodiments of the present invention relate to a method and system for augmenting a training database of an automated language-understanding system. In one embodiment, a training example in a first language may be received from the training database. The first language-training example may be translated to a second language output. The second language output may be translated to a first variant of the first language-training example. An action pair including the first variant of the first language-training example and an action command associated with the first language-training example may be stored in an augmented training database.
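The round-trip translation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the two translators are passed in as plain functions (`to_second` and `to_first` are hypothetical names), the toy dictionaries stand in for real machine translation components, and skipping variants that already exist is a choice made here, not something the abstract states.

```python
from typing import Callable, List, Tuple

ActionPair = Tuple[str, str]  # (training example, action command)


def augment_training_db(
    training_db: List[ActionPair],
    to_second: Callable[[str], str],
    to_first: Callable[[str], str],
) -> List[ActionPair]:
    """Return the database extended with round-trip translation variants."""
    augmented = list(training_db)
    for example, action in training_db:
        second_language_output = to_second(example)
        variant = to_first(second_language_output)
        pair = (variant, action)
        # Assumption: avoid storing a pair that is already present.
        if pair not in augmented:
            augmented.append(pair)
    return augmented


# Toy translators backed by lookup tables (invented for this example).
EN_TO_FR = {"show my balance": "montrer mon solde"}
FR_TO_EN = {"montrer mon solde": "display my balance"}

db = [("show my balance", "GET_BALANCE")]
augmented_db = augment_training_db(db, EN_TO_FR.__getitem__, FR_TO_EN.__getitem__)
```

The variant inherits the action command of the example it was derived from, so the augmented database gains a new phrasing without any manual labeling.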

    Method and apparatus for language translation
    3.
    Granted patent
    Status: lapsed

    Publication number: US06233544B1

    Publication date: 2001-05-15

    Application number: US08665182

    Filing date: 1996-06-14

    Applicant: Hiyan Alshawi

    Inventor: Hiyan Alshawi

    IPC classification: G06F17/28

    CPC classification: G06F17/2872 G06F17/2818

    Abstract: Methods and systems for language translation are disclosed. The translator is based on finite state machines that can convert a pair of input symbol sequences to a pair of output symbol sequences. The translator includes a lexicon associating a finite state machine with a pair of head words with corresponding meanings in the source and target languages. The state machine for a source language head word w and a target language head word ν reads the dependent words of w to its left and right in a source sentence and proposes corresponding dependents to the left and right of ν in a target language sentence being constructed, taking account of the required word order for the target language. The state machines are used by a transduction search engine to generate a plurality of candidate translations via a recursive process wherein a source language head word is first translated as described above, and then the heads of each of the dependent phrases are similarly translated, and then their dependents and so on. Only the state machines corresponding to the words in the source language string are activated and used by the search engine. The translator also includes a parameter table that provides costs for actions taken by each finite state machine in converting between the source language and the target language. The costs for machine transitions are indicative of the likelihood of co-occurrence of pairs of words in the source language, and between corresponding pairs of words in the target language. The transduction search engine provides a total cost, using the parameter table, for each of the candidate translations. The total cost of a translation is the sum of the cost for all actions taken by each machine involved in the translation.
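The cost rule in the final sentences of this abstract (total cost equals the sum of the costs of all actions taken) can be shown with a short sketch. The table layout, machine names, action names, and cost values below are all invented for illustration; the abstract does not specify them.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str]  # (machine identifier, action name), both hypothetical


def total_cost(actions: List[Action], parameter_table: Dict[Action, float]) -> float:
    """Sum the parameter-table cost of every action used in one candidate translation."""
    return sum(parameter_table[action] for action in actions)


# Made-up costs; lower means more likely in this toy setup.
parameter_table: Dict[Action, float] = {
    ("show", "left:me"): 1.5,
    ("show", "right:flights"): 0.5,
    ("show", "right:airplanes"): 2.0,
    ("flights", "stop"): 0.25,
    ("airplanes", "stop"): 0.25,
}

# Each candidate translation is represented by the actions its machines took.
candidates: Dict[str, List[Action]] = {
    "candidate A": [("show", "left:me"), ("show", "right:flights"), ("flights", "stop")],
    "candidate B": [("show", "left:me"), ("show", "right:airplanes"), ("airplanes", "stop")],
}

costs = {name: total_cost(actions, parameter_table) for name, actions in candidates.items()}
```

A search engine built on such costs would presumably prefer the candidate with the lowest total, though the abstract only says that a total cost is provided for each candidate.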

    Language-understanding training database action pair augmentation using bidirectional translation
    4.
    Granted patent
    Status: in force

    Publication number: US08073683B2

    Publication date: 2011-12-06

    Application number: US12336429

    Filing date: 2008-12-16

    IPC classification: G06F17/21 G06F17/26

    Abstract: Embodiments of the present invention relate to a method and system for augmenting a training database of an automated language-understanding system. In one embodiment, a training example in a first language is received from the training database. The first language-training example is translated to a second language output. The second language output is translated to a first variant of the first language-training example. An action pair including the first variant of the first language-training example and an action command associated with the first language-training example is stored in an augmented training database.

    SUGGESTING ALTERNATIVE QUERIES IN QUERY RESULTS
    5.
    Patent application
    Status: in force

    Publication number: US20090077037A1

    Publication date: 2009-03-19

    Application number: US12209890

    Filing date: 2008-09-12

    IPC classification: G06F17/30

    CPC classification: G06F17/3097 G06F17/30867

    Abstract: Methods, systems, and apparatus, including computer program products, for suggesting alternative queries based on original query search results. In one aspect, a method includes receiving search results for a first query, where each search result refers to a respective resource and includes a snippet of content from the respective resource, receiving one or more suggested second queries, for each of the suggested second queries: selecting a set of words in one of the snippets to represent the suggested second query, associating the suggested second query with the set so that a user can interact with a word in the set to invoke the suggested second query, and marking the set so as to indicate that the user can interact with a word in the set to invoke the suggested second query, and transmitting the search results including each marked set to a client device for presentation to the user.

    Method and apparatus for automatic construction of hierarchical transduction models for language translation
    6.
    Granted patent
    Status: in force

    Publication number: US06195631B1

    Publication date: 2001-02-27

    Application number: US09198109

    Filing date: 1998-11-23

    IPC classification: G06F17/28

    CPC classification: G06F17/2827

    Abstract: A method and apparatus for automatically constructing hierarchical transduction models for language translation is presented. The input to the construction process may be a database of examples each consisting of a transcribed speech utterance and its translation into another language. A translation pairing score is assigned (or computed) for translating a word in the source language into each of the possible translations it has in the target language. For each instance of the resulting training dataset, a head transducer may be constructed that translates the source string into the target string by splitting the source string into a source head word, the words preceding the source head word, and the words following the source head word. This process may be performed recursively to generate a set of transducer fragments. The transducer fragments may form a statistical head transducer model. The head transducer translation model may then be input into a transduction search module.
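The recursive split into a head word, the words before it, and the words after it can be sketched as below. This covers only the source side and leaves out the target string and the pairing scores. The abstract does not say how the head word is chosen, so the selector is passed in by the caller; `longest_word` is a placeholder heuristic invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class HeadNode:
    head: str
    left: Optional["HeadNode"]   # built from the words preceding the head
    right: Optional["HeadNode"]  # built from the words following the head


def split_on_heads(
    words: List[str], choose_head: Callable[[List[str]], int]
) -> Optional[HeadNode]:
    """Recursively split a word sequence around a chosen head word."""
    if not words:
        return None
    i = choose_head(words)
    return HeadNode(
        head=words[i],
        left=split_on_heads(words[:i], choose_head),
        right=split_on_heads(words[i + 1:], choose_head),
    )


def longest_word(words: List[str]) -> int:
    """Placeholder head selector: index of the first longest word."""
    return max(range(len(words)), key=lambda i: len(words[i]))


tree = split_on_heads("the flight leaves tomorrow".split(), longest_word)
```

Each node of the resulting tree corresponds loosely to one fragment in the sense of the abstract; a real system would pick heads using the translation pairing scores rather than word length.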

    Search engine with fill-the-blanks capability
    7.
    Granted patent
    Status: in force

    Publication number: US07693829B1

    Publication date: 2010-04-06

    Application number: US11114971

    Filing date: 2005-04-25

    Applicant: Hiyan Alshawi

    Inventor: Hiyan Alshawi

    IPC classification: G06F7/00 G06F17/30

    Abstract: A method of searching for information is described. A sequence of terms, including one or more term segments and one or more identifiers corresponding to one or more missing terms, is received. The sequence of terms is converted into a corresponding search pattern, including a set of one or more query expressions and one or more ordering constraints. The search pattern is compared with a plurality of documents to identify a set of documents. Match scores for one or more matches between the search pattern and documents in the set of documents are determined. Content in the set of documents corresponding to the one or more missing terms in the search pattern is identified, and a ranked set of information items containing the identified content is provided in accordance with the match scores.
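A toy version of the fill-the-blanks flow described here might look like the sketch below. It makes several assumptions that the abstract does not state: the missing-term identifier is written as `*`, each blank stands for a single word, a regular expression stands in for the search pattern and its ordering constraints, and the match score is simply how often an answer occurs.

```python
import re
from collections import Counter
from typing import List, Tuple

BLANK = "*"  # assumed marker for a missing term


def fill_blanks(query: str, documents: List[str]) -> List[Tuple[Tuple[str, ...], int]]:
    """Rank candidate fillers for the blanks in `query` by how often they occur."""
    # Term segments are the pieces of the query between the blanks.
    segments = [segment.strip() for segment in query.split(BLANK)]
    # The segments must appear in order, with one captured word per blank.
    pattern = re.compile(
        r"\s*\b(\w+)\b\s*".join(re.escape(segment) for segment in segments),
        re.IGNORECASE,
    )
    answers: Counter = Counter()
    for document in documents:
        for match in pattern.finditer(document):
            answers[match.groups()] += 1
    return answers.most_common()


documents = [
    "The capital of France is Paris.",
    "Everyone knows the capital of France is Paris",
    "Some say the capital of France is Lyon",
]
ranked = fill_blanks("the capital of France is *", documents)
```

Here `ranked` lists the word tuple `("Paris",)` ahead of `("Lyon",)` because it appears in more of the sample documents.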

    Language-understanding training database action pair augmentation using bidirectional translation
    8.
    Granted patent
    Status: in force

    Publication number: US07467081B2

    Publication date: 2008-12-16

    Application number: US11656155

    Filing date: 2007-01-22

    IPC classification: G06F17/21 G06F17/28

    Abstract: Embodiments of the present invention relate to a method and system for augmenting a training database of an automated language-understanding system. In one embodiment, a training example in a first language is received from the training database. The first language-training example is translated to a second language output. The second language output is translated to a first variant of the first language-training example. An action pair including the first variant of the first language-training example and an action command associated with the first language-training example is stored in an augmented training database.

    Method and apparatus for an improved language recognition system
    9.
    Granted patent
    Status: lapsed

    Publication number: US5870706A

    Publication date: 1999-02-09

    Application number: US631874

    Filing date: 1996-04-10

    Applicant: Hiyan Alshawi

    Inventor: Hiyan Alshawi

    IPC classification: G10L15/00 G10L15/18 G10L5/02

    Abstract: Methods and apparatus for a language model and language recognition systems are disclosed. The method utilizes a plurality of probabilistic finite state machines having the ability to recognize a pair of sequences, one sequence scanned leftwards, the other scanned rightwards. Each word in the lexicon of the language model is associated with one or more such machines which model the semantic relations between the word and other words. Machine transitions create phrases from a set of word string hypotheses, and incrementally calculate costs related to the probability that such phrases represent the language to be recognized. The cascading lexical head machines utilized in the methods and apparatus capture the structural associations implicit in the hierarchical organization of a sentence, resulting in a language model and language recognition systems that combine the lexical sensitivity of N-gram models with the structural properties of dependency grammar.

    Search engine with fill-the-blanks capability
    10.
    Granted patent
    Status: in force

    Publication number: US08209315B2

    Publication date: 2012-06-26

    Application number: US12507009

    Filing date: 2009-07-21

    Applicant: Hiyan Alshawi

    Inventor: Hiyan Alshawi

    IPC classification: G06F7/00 G06F17/30

    Abstract: A client system provides to a server system a fill-the-blank query comprising one or more term segments and one or more missing term identifiers signifying missing information sought by a user. The client system receives from the server system a response to the query, the response including at least one or more potential answers corresponding to the one or more missing term identifiers in the fill-the-blank query, and then displays the response to the query, including displaying the one or more potential answers. Optionally, the client system displays a ranked list of documents containing the one or more potential answers. Optionally, the response to the query further includes snippets of text from one or more documents containing the one or more potential answers. Optionally, the fill-the-blank query includes a respective missing term identifier located between two respective term segments.
