Watermarking of structured results and watermark generation
    1.
    Type: Granted patent
    Status: In force

    Publication number: US08788507B1

    Publication date: 2014-07-22

    Application number: US13296451

    Filing date: 2011-11-15

    CPC classification number: G06F17/30905

    Abstract: A way of generating a watermark for a structured result, such as a search result or a machine translation. A hash function is used to generate a bit sequence for each of a plurality of structured results. A ranking score is generated for each resulting bit sequence. The ranking score can be based on the detectability of the bit sequence compared to a randomly-generated bit sequence and the quality of each of the structured results. A structured result is selected as the watermarked structured result based upon the ranking score.
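The selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the use of SHA-256, the z-score of the count of one-bits as the detectability measure, the additive combination with a `weight` parameter, and all function names are assumptions introduced here.

```python
import hashlib
import math


def bit_sequence(text, n_bits=64):
    """Hash a candidate result into a fixed-length list of bits (assumed: SHA-256)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    bits = []
    for byte in digest:
        for k in range(8):
            bits.append((byte >> k) & 1)
    return bits[:n_bits]


def detectability(bits):
    """Z-score of the number of ones versus a fair-coin (random) bit sequence."""
    n = len(bits)
    ones = sum(bits)
    return (ones - n / 2) / math.sqrt(n / 4)


def select_watermarked(candidates, weight=1.0):
    """Pick the candidate with the best combined quality and detectability.

    candidates: list of (text, quality) pairs; higher quality is better.
    weight: hypothetical trade-off between quality and detectability.
    """
    def ranking_score(item):
        text, quality = item
        return quality + weight * detectability(bit_sequence(text))

    return max(candidates, key=ranking_score)[0]
```

With `weight=0.0` the selection falls back to the highest-quality candidate; larger weights trade quality for a bit sequence that is easier to distinguish from random.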

    Optimizing parameters for machine translation
    2.
    Type: Granted patent
    Status: In force

    Publication number: US08401836B1

    Publication date: 2013-03-19

    Application number: US13528426

    Filing date: 2012-06-20

    CPC classification number: G06F17/2809

    Abstract: Methods, systems, and apparatus, including computer program products, for language translation are disclosed. In one aspect, a method includes accessing a translation hypergraph that represents a plurality of candidate translations, the translation hypergraph including a plurality of paths including nodes connected by edges; calculating first posterior probabilities for each edge in the translation hypergraph; calculating second posterior probabilities for each n-gram represented in the translation hypergraph based on the first posterior probabilities; and performing decoding on the translation hypergraph using the second posterior probabilities to convert a sample text from a first language to a second language.

    Method and System for Translating Information with a Higher Probability of a Correct Translation
    3.
    Type: Patent application
    Status: In force

    Publication number: US20080270109A1

    Publication date: 2008-10-30

    Application number: US12132401

    Filing date: 2008-06-03

    Inventor: Franz Josef Och

    CPC classification number: G06F17/2818

    Abstract: A system with a nonstatistical translation component integrated with a statistical translation component engine. The same corpus may be used for training the statistical engine and also for determining when to use the statistical engine and when to use the translation component. This training may use probabilistic techniques. Both the statistical engine and the translation components may be capable of translating the same information; however, the system determines which component to use based on the training. Retraining can be carried out to add additional components, or after additional translator training.

    Minimum error rate training with a large number of features for machine learning
    4.
    Type: Granted patent
    Status: In force

    Publication number: US08645119B2

    Publication date: 2014-02-04

    Application number: US12056083

    Filing date: 2008-03-26

    CPC classification number: G06F17/2845

    Abstract: Systems, methods, and apparatuses including computer program products for machine learning. A method is provided that includes determining model parameters for a plurality of feature functions for a linear machine learning model, ranking the plurality of feature functions according to a quality criterion, and selecting, using the ranking, a group of feature functions from the plurality of feature functions to update with the determined model parameters.
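The rank-then-select step in this abstract might look like the sketch below. The abstract does not say what the quality criterion is, so the `gains` mapping (for example, the error reduction each proposed weight would achieve) is an assumption, as are the function name and the choice to keep the top `k` features.

```python
def select_updates(current, proposed, gains, k):
    """Update only the k best-ranked feature weights.

    current:  dict feature name -> current weight
    proposed: dict feature name -> newly determined weight
    gains:    dict feature name -> quality criterion (assumed: higher is better)
    k:        number of feature functions to update in this round
    """
    # Rank the feature functions by the quality criterion, best first.
    ranked = sorted(proposed, key=lambda name: gains[name], reverse=True)
    chosen = ranked[:k]

    # Apply the determined parameters only to the selected group.
    updated = dict(current)
    for name in chosen:
        updated[name] = proposed[name]
    return updated, chosen
```

Features outside the selected group keep their current weights, which is what makes the approach workable when the number of features is large.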

    LINGUISTIC KEY NORMALIZATION
    5.
    Type: Patent application
    Status: In force

    Publication number: US20130151235A1

    Publication date: 2013-06-13

    Application number: US12411224

    Filing date: 2009-03-25

    CPC classification number: G06F17/27

    Abstract: Systems, methods, and apparatuses including computer program products are provided for training machine learning systems. In some implementations, a method is provided. The method includes receiving a collection of phrases, normalizing a plurality of phrases of the collection of phrases, the normalizing being based at least in part on lexicographic normalizing rules, and generating a normalized phrase table including a plurality of key-value pairs, each key value pair includes a key corresponding to a normalized phrase and a value corresponding to one or more un-normalized phrases associated with the normalized key, each un-normalized phrase having one or more parameters.
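A normalized phrase table of the kind described here can be sketched as a mapping from normalized keys to lists of un-normalized phrases with their parameters. The specific normalization rules below (lowercasing, stripping diacritics, collapsing whitespace) are assumptions chosen for illustration; the abstract only says the rules are lexicographic.

```python
import unicodedata
from collections import defaultdict


def normalize(phrase):
    """Assumed normalization: strip diacritics, lowercase, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", phrase)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def build_table(phrases):
    """Build key-value pairs: normalized key -> [(un-normalized phrase, params), ...].

    phrases: iterable of (phrase, params) pairs, where params is any
    per-phrase data such as a dict of translation probabilities.
    """
    table = defaultdict(list)
    for phrase, params in phrases:
        table[normalize(phrase)].append((phrase, params))
    return dict(table)
```

Looking up `normalize(query)` in the table then returns every surface variant that shares the key, each with its own parameters.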

    MINIMUM ERROR RATE TRAINING WITH A LARGE NUMBER OF FEATURES FOR MACHINE LEARNING
    6.
    Type: Patent application
    Status: In force

    Publication number: US20130144593A1

    Publication date: 2013-06-06

    Application number: US12056083

    Filing date: 2008-03-26

    CPC classification number: G06F17/2845

    Abstract: Systems, methods, and apparatuses including computer program products for machine learning. A method is provided that includes determining model parameters for a plurality of feature functions for a linear machine learning model, ranking the plurality of feature functions according to a quality criterion, and selecting, using the ranking, a group of feature functions from the plurality of feature functions to update with the determined model parameters.

    OPTIMIZING PARAMETERS FOR MACHINE TRANSLATION
    8.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20100004919A1

    Publication date: 2010-01-07

    Application number: US12497169

    Filing date: 2009-07-02

    CPC classification number: G06F17/2818

    Abstract: Methods, systems, and apparatus, including computer program products, for language translation are disclosed. In one implementation, a method is provided. The method includes determining, for a plurality of feature functions in a translation lattice, a corresponding plurality of error surfaces for each of one or more candidate translations represented in the translation lattice; adjusting weights for the feature functions by traversing a combination of the plurality of error surfaces for phrases in a training set; selecting weighting values that minimize error counts for the traversed combination; and applying the selected weighting values to convert a sample of text from a first language to a second language.

    Method and system for translating information with a higher probability of a correct translation
    9.
    Type: Granted patent
    Status: In force

    Publication number: US08977536B2

    Publication date: 2015-03-10

    Application number: US12132401

    Filing date: 2008-06-03

    Inventor: Franz Josef Och

    CPC classification number: G06F17/2818

    Abstract: A system with a nonstatistical translation component integrated with a statistical translation component engine. The same corpus may be used for training the statistical engine and also for determining when to use the statistical engine and when to use the translation component. This training may use probabilistic techniques. Both the statistical engine and the translation components may be capable of translating the same information; however, the system determines which component to use based on the training. Retraining can be carried out to add additional components, or after additional translator training.

    Optical character recognition
    10.
    Type: Granted patent
    Status: In force

    Publication number: US08953885B1

    Publication date: 2015-02-10

    Application number: US13617710

    Filing date: 2012-09-14

    CPC classification number: G06K9/723 G06K9/50 G06K2209/01

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing optical character recognition. In one aspect, a method includes receiving a text image I. A set of feature functions are evaluated for a log-linear model to determine respective feature values for the text image I, wherein each feature function $h_i$ maps the text image I to a feature value, and wherein each feature function $h_i$ is associated with a respective feature weight $\lambda_i$. A transcription $\hat{T}$ is determined that minimizes a cost of the log-linear model.
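One way to read the log-linear selection in this abstract is sketched below. The abstract does not give the cost function, so the form used here (the negated weighted sum of feature values), the idea of scoring a fixed list of candidate transcriptions, and features that take both the image and a candidate are all assumptions made for illustration.

```python
def cost(image, transcription, features, weights):
    """Assumed log-linear cost: negated weighted sum of feature values."""
    return -sum(w * h(image, transcription) for h, w in zip(features, weights))


def best_transcription(image, candidates, features, weights):
    """Return the candidate transcription with the lowest cost.

    features: list of callables h_i(image, transcription) -> float
    weights:  list of feature weights lambda_i, aligned with features
    """
    return min(candidates, key=lambda t: cost(image, t, features, weights))
```

In practice the search would run over a much larger hypothesis space than an explicit candidate list, but the scoring of each hypothesis follows the same weighted-sum pattern.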
