OPTIMIZING PARAMETERS FOR MACHINE TRANSLATION
    1.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20100004919A1

    Publication date: 2010-01-07

    Application No.: US12497169

    Filing date: 2009-07-02

    IPC classification: G06F17/28 G06F17/20

    CPC classification: G06F17/2818

    Abstract: Methods, systems, and apparatus, including computer program products, for language translation are disclosed. In one implementation, a method is provided. The method includes determining, for a plurality of feature functions in a translation lattice, a corresponding plurality of error surfaces for each of one or more candidate translations represented in the translation lattice; adjusting weights for the feature functions by traversing a combination of the plurality of error surfaces for phrases in a training set; selecting weighting values that minimize error counts for the traversed combination; and applying the selected weighting values to convert a sample of text from a first language to a second language.

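The error-surface idea in the abstract above can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not taken from the patent: each training sentence has a short list of candidates rather than a full lattice, the search runs along one direction in weight space (new weights = old weights + gamma * direction), and each candidate is summarized as a hypothetical `(slope, intercept, error_count)` triple. The names `error_surface` and `best_gamma` are illustrative only.

```python
NEG_INF, POS_INF = float("-inf"), float("inf")

def error_surface(cands):
    """Piecewise-constant error of the top-scoring candidate as gamma grows.

    Each candidate is (slope, intercept, error_count); its model score along
    the search direction is slope * gamma + intercept.
    Returns [(gamma_start, error_count), ...] for the upper envelope.
    """
    hull = []  # entries: (gamma_start, slope, intercept, error_count)
    for m, b, e in sorted(cands, key=lambda c: (c[0], c[1])):
        if hull and hull[-1][1] == m:
            hull.pop()  # same slope, lower or equal intercept: never on top
        start = NEG_INF
        while hull:
            s0, m0, b0, _ = hull[-1]
            x = (b0 - b) / (m - m0)  # where the new line overtakes the old one
            if x <= s0:
                hull.pop()  # the old line is never the best candidate
            else:
                start = x
                break
        hull.append((start, m, b, e))
    return [(s, e) for s, _, _, e in hull]

def best_gamma(surfaces):
    """Sum per-sentence surfaces and return (gamma, total_error) at the minimum."""
    total = sum(s[0][1] for s in surfaces)  # total error as gamma -> -inf
    events = sorted(
        (x, e_next - e_prev)
        for s in surfaces
        for (_, e_prev), (x, e_next) in zip(s, s[1:])
    )
    best = (total, NEG_INF, events[0][0] if events else POS_INF)
    i = 0
    while i < len(events):
        x = events[i][0]
        while i < len(events) and events[i][0] == x:
            total += events[i][1]  # apply every change at this breakpoint
            i += 1
        hi = events[i][0] if i < len(events) else POS_INF
        if total < best[0]:
            best = (total, x, hi)
    err, lo, hi = best
    if lo == NEG_INF and hi == POS_INF:
        gamma = 0.0
    elif lo == NEG_INF:
        gamma = hi - 1.0
    elif hi == POS_INF:
        gamma = lo + 1.0
    else:
        gamma = (lo + hi) / 2.0  # any point inside the best interval works
    return gamma, err

if __name__ == "__main__":
    s1 = error_surface([(-1.0, 0.0, 2), (0.0, 1.0, 0), (1.0, 0.0, 3)])
    s2 = error_surface([(0.0, 0.0, 1), (1.0, -2.0, 0)])
    print(best_gamma([s1, s2]))
```

In this toy setting the selected gamma would then be applied to the weight vector; extending the envelope computation from candidate lists to lattice paths is the part the abstract describes and this sketch does not attempt.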

    Linguistic key normalization
    2.
    Type: Granted patent
    Status: In force

    Publication No.: US08521516B2

    Publication date: 2013-08-27

    Application No.: US12411224

    Filing date: 2009-03-25

    IPC classification: G06F17/21

    CPC classification: G06F17/27

    Abstract: Systems, methods, and apparatuses including computer program products are provided for training machine learning systems. In some implementations, a method is provided. The method includes receiving a collection of phrases, normalizing a plurality of phrases of the collection of phrases, the normalizing being based at least in part on lexicographic normalizing rules, and generating a normalized phrase table including a plurality of key-value pairs, each key-value pair including a key corresponding to a normalized phrase and a value corresponding to one or more un-normalized phrases associated with the normalized key, each un-normalized phrase having one or more parameters.

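As a rough illustration of the normalized phrase table described above, the sketch below groups un-normalized phrases under a normalized key. The specific rules used here (case folding, accent stripping, whitespace collapsing) are assumptions chosen for the example; the abstract only says the rules are lexicographic. The function names and the `prob` parameter are hypothetical.

```python
import unicodedata
from collections import defaultdict

def normalize(phrase):
    """Apply illustrative normalization rules: strip accents, fold case,
    and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", phrase)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())

def build_normalized_table(phrases):
    """Map each normalized key to the un-normalized phrases that share it.

    `phrases` is an iterable of (phrase, params) pairs, where `params` is a
    dict of per-phrase parameters (for example a probability).
    """
    table = defaultdict(list)
    for phrase, params in phrases:
        table[normalize(phrase)].append((phrase, params))
    return dict(table)

if __name__ == "__main__":
    table = build_normalized_table([
        ("Café", {"prob": 0.7}),
        ("cafe", {"prob": 0.2}),
        ("CAFE", {"prob": 0.1}),
        ("tea", {"prob": 1.0}),
    ])
    for key, variants in table.items():
        print(key, variants)
```

Keeping the original surface forms and their parameters in the value lets a lookup on a normalized key still recover the most likely un-normalized variant.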

    Minimum error rate training with a large number of features for machine learning
    4.
    Type: Granted patent
    Status: In force

    Publication No.: US08645119B2

    Publication date: 2014-02-04

    Application No.: US12056083

    Filing date: 2008-03-26

    IPC classification: G06F17/28

    CPC classification: G06F17/2845

    Abstract: Systems, methods, and apparatuses, including computer program products, are provided for machine learning. A method is provided that includes determining model parameters for a plurality of feature functions for a linear machine learning model, ranking the plurality of feature functions according to a quality criterion, and selecting, using the ranking, a group of feature functions from the plurality of feature functions to update with the determined model parameters.

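The rank-then-select step in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the quality criterion is taken to be a per-feature "gain" (for example, the error reduction a proposed weight achieved on training data), and the group is simply the top `k` features. The abstract does not specify either choice, and `select_and_update` is a hypothetical name.

```python
def select_and_update(weights, proposals, k):
    """Update only the k best-ranked features.

    weights:   {feature_name: current_weight}
    proposals: {feature_name: (proposed_weight, gain)}; `gain` is the assumed
               quality criterion, larger meaning better.
    Returns (updated_weights, chosen_feature_names).
    """
    ranked = sorted(proposals, key=lambda f: proposals[f][1], reverse=True)
    chosen = ranked[:k]
    updated = dict(weights)  # leave the caller's weights untouched
    for feature in chosen:
        updated[feature] = proposals[feature][0]
    return updated, chosen

if __name__ == "__main__":
    weights = {"lm": 1.0, "tm": 0.5, "len": 0.0}
    proposals = {"lm": (1.2, 3.0), "tm": (0.4, 10.0), "len": (0.1, 1.0)}
    print(select_and_update(weights, proposals, k=2))
```

Restricting the update to a ranked subset is one way to keep training stable when the number of feature functions is large, since low-quality proposals are never applied.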

    LINGUISTIC KEY NORMALIZATION
    5.
    Type: Patent application
    Status: In force

    Publication No.: US20130151235A1

    Publication date: 2013-06-13

    Application No.: US12411224

    Filing date: 2009-03-25

    IPC classification: G06F17/27

    CPC classification: G06F17/27

    Abstract: Systems, methods, and apparatuses including computer program products are provided for training machine learning systems. In some implementations, a method is provided. The method includes receiving a collection of phrases, normalizing a plurality of phrases of the collection of phrases, the normalizing being based at least in part on lexicographic normalizing rules, and generating a normalized phrase table including a plurality of key-value pairs, each key-value pair including a key corresponding to a normalized phrase and a value corresponding to one or more un-normalized phrases associated with the normalized key, each un-normalized phrase having one or more parameters.

    MINIMUM ERROR RATE TRAINING WITH A LARGE NUMBER OF FEATURES FOR MACHINE LEARNING
    6.
    Type: Patent application
    Status: In force

    Publication No.: US20130144593A1

    Publication date: 2013-06-06

    Application No.: US12056083

    Filing date: 2008-03-26

    IPC classification: G06F17/28

    CPC classification: G06F17/2845

    Abstract: Systems, methods, and apparatuses, including computer program products, are provided for machine learning. A method is provided that includes determining model parameters for a plurality of feature functions for a linear machine learning model, ranking the plurality of feature functions according to a quality criterion, and selecting, using the ranking, a group of feature functions from the plurality of feature functions to update with the determined model parameters.

    Providing alternative translations
    8.
    Type: Granted patent
    Status: In force

    Publication No.: US08635059B2

    Publication date: 2014-01-21

    Application No.: US13046468

    Filing date: 2011-03-11

    IPC classification: G06F17/28

    CPC classification: G06F17/277 G06F17/2818

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are provided for presenting alternative translations. In one aspect, a method includes receiving source language text; receiving translated text corresponding to the source language text from a machine translation system; receiving segmentation data for the translated text, wherein the segmentation data includes a first segmentation of the translated text, the first segmentation dividing the translated text into two or more segments; receiving one or more alternative translations for each of the two or more segments; presenting the source text and the translated text to a user in a user interface; and in response to a user selection of a first portion of the translated text, displaying, in the user interface, one or more alternative translations for a first segment to which the first portion of translated text corresponds according to the first segmentation.

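The selection-to-segment lookup described above can be sketched in a few lines. The data layout is an assumption made for illustration: the segmentation is a sorted list of non-overlapping `(start, end)` character offsets into the translated text (end exclusive), and the alternatives are a parallel list. The function name and the sample sentence are hypothetical, and no user-interface code is shown.

```python
from bisect import bisect_right

def alternatives_for_selection(segments, alternatives, offset):
    """Return (segment_index, alternatives) for the segment containing
    `offset`, or None if the offset falls between segments."""
    starts = [start for start, _ in segments]
    i = bisect_right(starts, offset) - 1  # last segment starting at or before offset
    if i < 0 or offset >= segments[i][1]:
        return None
    return i, alternatives[i]

if __name__ == "__main__":
    translated = "the black cat sleeps"
    segments = [(0, 13), (14, 20)]  # "the black cat", "sleeps"
    alternatives = [
        ["a black cat", "the dark cat"],
        ["is sleeping", "naps"],
    ]
    hit = alternatives_for_selection(segments, alternatives, 5)
    if hit is not None:
        index, options = hit
        start, end = segments[index]
        print(translated[start:end], "->", options)
```

A click or selection anywhere inside a segment resolves to the whole segment, which matches the abstract's requirement that alternatives are shown for the segment the selected portion corresponds to.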

    Optimizing parameters for machine translation
    9.
    Type: Granted patent
    Status: In force

    Publication No.: US08401836B1

    Publication date: 2013-03-19

    Application No.: US13528426

    Filing date: 2012-06-20

    IPC classification: G06F17/28

    CPC classification: G06F17/2809

    Abstract: Methods, systems, and apparatus, including computer program products, for language translation are disclosed. In one aspect, a method includes accessing a translation hypergraph that represents a plurality of candidate translations, the translation hypergraph including a plurality of paths including nodes connected by edges; calculating first posterior probabilities for each edge in the translation hypergraph; calculating second posterior probabilities for each n-gram represented in the translation hypergraph based on the first posterior probabilities; and performing decoding on the translation hypergraph using the second posterior probabilities to convert a sample text from a first language to a second language.

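The three steps in the abstract above (edge posteriors, n-gram posteriors, decoding) can be illustrated on a toy example. The sketch below is deliberately simplified and is not the patented method: it uses a lattice (a hypergraph in which every edge has a single tail node), it assumes node ids are topologically ordered so that every edge goes from a lower id to a higher id, it handles only unigrams, it approximates each unigram's posterior by summing the posteriors of the edges that carry it (exact only when a word occurs at most once per path), and it decodes by maximizing the sum of those posteriors along a path. All names and the sample lattice are hypothetical.

```python
from collections import defaultdict

# Toy lattice edges: (tail_node, head_node, word, weight), with tail < head.
EDGES = [
    (0, 1, "the", 1.0),
    (1, 2, "cat", 0.6),
    (1, 2, "hat", 0.4),
    (2, 3, "sat", 0.7),
    (2, 3, "sits", 0.3),
]

def edge_posteriors(edges, start, goal):
    """First posteriors: probability mass of all paths through each edge."""
    order = sorted(edges, key=lambda e: e[0])
    alpha = defaultdict(float)  # forward scores
    alpha[start] = 1.0
    for u, v, _, w in order:
        alpha[v] += alpha[u] * w
    beta = defaultdict(float)  # backward scores
    beta[goal] = 1.0
    for u, v, _, w in reversed(order):
        beta[u] += w * beta[v]
    z = alpha[goal]  # total weight of all start-to-goal paths
    return [alpha[u] * w * beta[v] / z for u, v, _, w in edges]

def unigram_posteriors(edges, posteriors):
    """Second posteriors (unigram case), summed from the edge posteriors."""
    ngram = defaultdict(float)
    for (_, _, word, _), p in zip(edges, posteriors):
        ngram[word] += p
    return dict(ngram)

def decode(edges, ngram, start, goal):
    """Pick the path whose words have the largest summed posterior."""
    best = {start: (0.0, [])}
    for u, v, word, _ in sorted(edges, key=lambda e: e[0]):
        if u not in best:
            continue  # tail node is unreachable from the start node
        score = best[u][0] + ngram[word]
        if v not in best or score > best[v][0]:
            best[v] = (score, best[u][1] + [word])
    return best[goal][1]

if __name__ == "__main__":
    post = edge_posteriors(EDGES, start=0, goal=3)
    ngram = unigram_posteriors(EDGES, post)
    print(ngram)
    print(" ".join(decode(EDGES, ngram, start=0, goal=3)))
```

A full hypergraph version would replace the forward and backward passes with inside and outside scores over multi-tail edges and would track longer n-grams across edge boundaries; the sketch only shows how the second set of posteriors is derived from the first and then drives decoding.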

    Optimizing parameters for machine translation
    10.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US08285536B1

    Publication date: 2012-10-09

    Application No.: US12533519

    Filing date: 2009-07-31

    IPC classification: G06F17/28

    CPC classification: G06F17/2809

    Abstract: Methods, systems, and apparatus, including computer program products, for language translation are disclosed. In one aspect, a method includes accessing a translation hypergraph that represents a plurality of candidate translations, the translation hypergraph including a plurality of paths including nodes connected by edges; calculating first posterior probabilities for each edge in the translation hypergraph; calculating second posterior probabilities for each n-gram represented in the translation hypergraph based on the first posterior probabilities; and performing decoding on the translation hypergraph using the second posterior probabilities to convert a sample text from a first language to a second language.
