Machine translation with side information
    1.
    Granted patent (status: active)

    Publication No.: US08768686B2

    Publication date: 2014-07-01

    Application No.: US12779751

    Filing date: 2010-05-13

    IPC classification: G06F17/28

    CPC classification: G06F17/2818

    Abstract: A method of identifying and using side information available to statistical machine translation systems within an enterprise setting, the method including extracting user-specific interaction and non-interaction-based information from at least one corresponding database within the enterprise for each of a plurality of users, aggregating the user-specific interaction and non-interaction based information from a plurality of users, by using a processor on a computer, to tune and adapt background translation and language models, and updating all relevant models within the enterprise after user activity based on the tuned and adapted translation and language models.

    Machine translation in continuous space
    2.
    Granted patent (status: lapsed)

    Publication No.: US08229729B2

    Publication date: 2012-07-24

    Application No.: US12054636

    Filing date: 2008-03-25

    CPC classification: G06F17/2818

    Abstract: A system and method for training a statistical machine translation model and decoding or translating using the same is disclosed. A source word versus target word co-occurrence matrix is created to define word pairs. Dimensionality of the matrix may be reduced. Word pairs are mapped as vectors into continuous space where the word pairs are vectors of continuous real numbers and not discrete entities in the continuous space. A machine translation parametric model is trained using an acoustic model training method based on word pair vectors in the continuous space.
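
As a rough illustration of the first step named in this abstract, the sketch below builds a source-word versus target-word co-occurrence matrix from a toy sentence-aligned corpus. The corpus, the function name `cooccurrence_matrix`, and the choice to count each source/target word pair once per aligned sentence pair are assumptions made for illustration; the dimensionality reduction and the continuous-space training steps are not shown.

```python
from collections import Counter


def cooccurrence_matrix(parallel_corpus):
    """Count how often each source word appears alongside each target word.

    parallel_corpus: list of (source_sentence, target_sentence) string pairs.
    Returns (source_vocab, target_vocab, matrix), where matrix[i][j] is the
    number of sentence pairs containing source_vocab[i] and target_vocab[j].
    """
    counts = Counter()
    for src, tgt in parallel_corpus:
        # Use sets so a word repeated within one sentence is counted once.
        for s in set(src.split()):
            for t in set(tgt.split()):
                counts[(s, t)] += 1
    source_vocab = sorted({s for s, _ in counts})
    target_vocab = sorted({t for _, t in counts})
    # Counter returns 0 for pairs that never co-occurred.
    matrix = [[counts[(s, t)] for t in target_vocab] for s in source_vocab]
    return source_vocab, target_vocab, matrix


# Hypothetical toy corpus (English -> French), for illustration only.
corpus = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
]
src_vocab, tgt_vocab, M = cooccurrence_matrix(corpus)
```

Each row of `M` can be read as a sparse vector for one source word; a reduction step such as a truncated SVD would then map those rows into a lower-dimensional continuous space.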

    Semantic language modeling and confidence measurement
    3.
    Patent application (status: active)

    Publication No.: US20050055209A1

    Publication date: 2005-03-10

    Application No.: US10655838

    Filing date: 2003-09-05

    IPC classification: G10L15/18 G10L15/28 G10L15/00

    CPC classification: G10L15/1815

    Abstract: A system and method for speech recognition includes generating a set of likely hypotheses in recognizing speech, rescoring the likely hypotheses by using semantic content by employing semantic structured language models, and scoring parse trees to identify a best sentence according to the sentence's parse tree by employing the semantic structured language models to clarify the recognized speech.
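
The rescoring step in this abstract can be pictured as N-best list rescoring. The sketch below is a minimal version under stated assumptions: the linear interpolation, the `weight` parameter, and the toy `toy_semantic_score` function (a stand-in for a semantic structured language model) are all hypothetical, and parse-tree scoring is not modeled.

```python
def rescore_nbest(hypotheses, semantic_score, weight=0.5):
    """Return the hypothesis sentence with the best combined score.

    hypotheses: list of (sentence, recognizer_log_score) pairs.
    semantic_score: callable mapping a sentence to a log-domain score.
    weight: interpolation weight given to the semantic score (assumed).
    """
    def combined(item):
        sentence, base = item
        return (1.0 - weight) * base + weight * semantic_score(sentence)

    return max(hypotheses, key=combined)[0]


# Hypothetical stand-in for a semantic model: it penalizes each word that
# falls outside a small travel-domain lexicon.
DOMAIN_WORDS = {"flight", "to", "boston", "book"}


def toy_semantic_score(sentence):
    words = sentence.split()
    hits = sum(1 for w in words if w in DOMAIN_WORDS)
    return -float(len(words) - hits)


# Toy N-best list with made-up recognizer scores (higher is better).
nbest = [
    ("book a fight to boston", -4.0),
    ("book a flight to boston", -4.2),
    ("look a flight to austin", -4.1),
]
best = rescore_nbest(nbest, toy_semantic_score)
```

With `weight=0.0` the recognizer's own top hypothesis is returned unchanged, which makes it easy to compare output with and without the semantic score.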

    Word classing for language modeling
    4.
    Granted patent (status: active)

    Publication No.: US09367526B1

    Publication date: 2016-06-14

    Application No.: US13190891

    Filing date: 2011-07-26

    Abstract: A language processing application employs a classing function optimized for the underlying production application context for which it is expected to process speech. A combination of class based and word based features generates a classing function optimized for a particular production application, meaning that a language model employing the classing function uses word classes having a high likelihood of accurately predicting word sequences encountered by a language model invoked by the production application. The classing function optimizes word classes by aligning the objective of word classing with the underlying language processing task to be performed by the production application. The classing function is optimized to correspond to usage in the production application context using class-based and word-based features by computing a likelihood of a word in an n-gram and a frequency of a word within a class of the n-gram.
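
The two quantities mentioned at the end of this abstract (an n-gram likelihood and the frequency of a word within its class) also appear in the textbook class-based bigram factorization P(w | w_prev) = P(c | c_prev) * P(w | c). The sketch below implements that standard factorization only; it is not the patent's optimized classing function, and the fixed `word_to_class` map, the toy sentences, and all function names are assumptions.

```python
from collections import Counter


def train_class_bigram(sentences, word_to_class):
    """Collect the counts needed for a class-based bigram model."""
    class_bigrams = Counter()  # count(c_prev, c)
    prev_counts = Counter()    # count(c_prev) as a bigram history
    class_counts = Counter()   # count(c) over all tokens
    word_counts = Counter()    # count(w) over all tokens
    for sentence in sentences:
        words = sentence.split()
        classes = [word_to_class[w] for w in words]
        for w, c in zip(words, classes):
            word_counts[w] += 1
            class_counts[c] += 1
        for c_prev, c in zip(classes, classes[1:]):
            class_bigrams[(c_prev, c)] += 1
            prev_counts[c_prev] += 1
    return class_bigrams, prev_counts, class_counts, word_counts


def bigram_prob(prev_word, word, word_to_class, model):
    """P(word | prev_word) = P(class | prev_class) * P(word | class)."""
    class_bigrams, prev_counts, class_counts, word_counts = model
    c_prev, c = word_to_class[prev_word], word_to_class[word]
    if prev_counts[c_prev] == 0 or class_counts[c] == 0:
        return 0.0  # unseen history or empty class; no smoothing in this sketch
    p_class = class_bigrams[(c_prev, c)] / prev_counts[c_prev]
    p_word = word_counts[word] / class_counts[c]
    return p_class * p_word


# Hypothetical hand-written classes and training sentences.
word_to_class = {
    "fly": "VERB", "drive": "VERB", "to": "PREP",
    "boston": "CITY", "denver": "CITY", "austin": "CITY",
}
sentences = ["fly to boston", "drive to denver", "fly to austin"]
model = train_class_bigram(sentences, word_to_class)
p = bigram_prob("to", "denver", word_to_class, model)
```

Because the probability is shared through the class, a word pair never seen together in training can still receive non-zero probability as long as its class sequence was observed.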

    Machine Translation with Side Information
    6.
    Patent application (status: active)

    Publication No.: US20110282648A1

    Publication date: 2011-11-17

    Application No.: US12779751

    Filing date: 2010-05-13

    IPC classification: G06F17/28 G06F17/30 G06F7/00

    CPC classification: G06F17/2818

    Abstract: A method of identifying and using side information available to statistical machine translation systems within an enterprise setting, the method including extracting user-specific interaction and non-interaction-based information from at least one corresponding database within the enterprise for each of a plurality of users, aggregating the user-specific interaction and non-interaction based information from a plurality of users, by using a processor on a computer, to tune and adapt background translation and language models, and updating all relevant models within the enterprise after user activity based on the tuned and adapted translation and language models.

    Method for fast semi-automatic semantic annotation
    7.
    Granted patent (status: active)

    Publication No.: US07610191B2

    Publication date: 2009-10-27

    Application No.: US10959523

    Filing date: 2004-10-06

    IPC classification: G06F17/27

    CPC classification: G06F17/271 G06F17/2755

    Abstract: A method, apparatus and computer instructions are provided for fast semi-automatic semantic annotation. Given a limited annotated corpus, the present invention assigns a tag and a label to each word of the next limited annotated corpus using a parser engine, a similarity engine, and a SVM engine. A rover then combines the parse trees from the three engines and annotates the next chunk of limited annotated corpus with confidence, such that the effort required for human annotation is reduced.
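
One simple way to picture the combining step in this abstract is per-word majority voting over the outputs of the three engines, with the level of agreement used as a confidence value. The sketch below is a simplification under stated assumptions: it votes on flat per-word labels rather than parse trees, and the engine outputs, label names, and `combine_annotations` function are hypothetical.

```python
from collections import Counter


def combine_annotations(engine_outputs):
    """Majority-vote the per-word labels proposed by several engines.

    engine_outputs: list of label sequences, one per engine, all of the
    same length. Returns a list of (label, confidence) pairs, where
    confidence is the fraction of engines that agreed on the winning label.
    On a tie, the label proposed by the earliest engine in the list wins.
    """
    combined = []
    for labels in zip(*engine_outputs):
        label, votes = Counter(labels).most_common(1)[0]
        combined.append((label, votes / len(labels)))
    return combined


# Hypothetical outputs of the three engines for a three-word chunk.
parser_labels = ["O", "CITY", "O"]
similarity_labels = ["O", "CITY", "DATE"]
svm_labels = ["O", "AIRPORT", "DATE"]

result = combine_annotations([parser_labels, similarity_labels, svm_labels])

# Words without unanimous agreement are the ones a human would review.
needs_review = [i for i, (_, conf) in enumerate(result) if conf < 1.0]
```

Routing only the low-agreement words to a human annotator is what would reduce annotation effort in a setup like this.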

    Building multi-language processes from existing single-language processes
    9.
    Granted patent (status: active)

    Publication No.: US09098494B2

    Publication date: 2015-08-04

    Application No.: US13469078

    Filing date: 2012-05-10

    CPC classification: G06F17/289

    Abstract: Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single language process is created over time.
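
The flow described in this abstract (identify the language, translate into the anchor language, run the existing anchor-language component, optionally translate the output back) can be sketched as a small pipeline. Everything concrete below is an assumption for illustration: English as the anchor language, the word-lookup tables standing in for machine translation components, the lexicon-overlap language identifier, and the trivial `anchor_process` component.

```python
def identify_language(text, lexicons):
    """Pick the language whose lexicon covers the most input words (toy heuristic)."""
    words = text.lower().split()
    return max(lexicons, key=lambda lang: sum(w in lexicons[lang] for w in words))


def translate(text, table):
    """Word-by-word dictionary lookup standing in for an MT component."""
    return " ".join(table.get(w, w) for w in text.lower().split())


def anchor_process(text):
    """Existing single-language (English) component: a trivial intent matcher."""
    return "sunny today" if "weather" in text else "unknown request"


# Hypothetical translation tables; "en" is the anchor language, so its
# tables are empty and text passes through unchanged.
TO_ANCHOR = {
    "es": {"como": "how", "esta": "is", "el": "the", "clima": "weather"},
    "en": {},
}
FROM_ANCHOR = {
    "es": {"sunny": "soleado", "today": "hoy"},
    "en": {},
}
LEXICONS = {
    "es": set(TO_ANCHOR["es"]),
    "en": {"what", "is", "the", "weather"},
}


def multilingual_process(text):
    lang = identify_language(text, LEXICONS)          # select the MT component
    anchor_input = translate(text, TO_ANCHOR[lang])   # input -> anchor language
    anchor_output = anchor_process(anchor_input)      # reuse the existing component
    return translate(anchor_output, FROM_ANCHOR[lang])  # optional back-translation


reply_es = multilingual_process("como esta el clima")
reply_en = multilingual_process("what is the weather")
```

The anchor-language component is never modified; supporting another language in this sketch only means adding entries to the three tables.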

    BUILDING MULTI-LANGUAGE PROCESSES FROM EXISTING SINGLE-LANGUAGE PROCESSES
    10.
    Patent application (status: active)

    Publication No.: US20130304451A1

    Publication date: 2013-11-14

    Application No.: US13469078

    Filing date: 2012-05-10

    IPC classification: G06F17/28 G10L15/26

    CPC classification: G06F17/289

    Abstract: Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single language process is created over time.
