LARGE LANGUAGE MODELS IN MACHINE TRANSLATION
    1.
    Invention application
    Status: under examination (published)

    Publication No.: WO2008118905A2

    Publication date: 2008-10-02

    Application No.: PCT/US2008/058116

    Filing date: 2008-03-25

    CPC classification number: G06F17/2818 G06F17/2827 G06F17/2845

    Abstract: Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n-1 and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus.
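The backoff scoring described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the use of count ratios as the "relative frequency", and the default backoff factor of 0.4 are all assumptions made for the example.

```python
from collections import Counter


def count_ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def backoff_score(ngram, counts, total_tokens, backoff_factor=0.4):
    """Score the last token of `ngram` given the preceding tokens.

    If the full n-gram was seen, return its relative frequency.
    Otherwise drop the leftmost token (the order n-1 backoff n-gram)
    and multiply by the backoff factor, repeating down to unigrams.
    The factor 0.4 is an illustrative assumption.
    """
    ngram = tuple(ngram)
    penalty = 1.0
    while len(ngram) > 1:
        if counts[ngram] > 0:
            # Relative frequency of the n-gram with respect to its context.
            return penalty * counts[ngram] / counts[ngram[:-1]]
        ngram = ngram[1:]          # back off to the order n-1 n-gram
        penalty *= backoff_factor  # one factor per backoff step
    return penalty * counts[ngram] / total_tokens


tokens = "the cat sat on the mat".split()
counts = count_ngrams(tokens, max_order=3)
seen = backoff_score(("the", "cat"), counts, len(tokens))
unseen = backoff_score(("on", "cat"), counts, len(tokens))
```

Here `seen` uses the bigram count directly, while `unseen` falls back to the unigram frequency of "cat" scaled by the backoff factor. The resulting values are scores rather than normalized probabilities.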

    TECHNIQUES FOR DISTRIBUTED OPTICAL CHARACTER RECOGNITION AND DISTRIBUTED MACHINE LANGUAGE TRANSLATION
    2.
    Invention application
    Status: under examination (published)

    Publication No.: WO2015168056A1

    Publication date: 2015-11-05

    Application No.: PCT/US2015/027884

    Filing date: 2015-04-28

    Applicant: GOOGLE INC.

    CPC classification number: G06K9/18 G06F17/289 G06K9/22 G06K9/325 G06K2209/01

    Abstract: Techniques for selectively distributing OCR and machine language translation tasks between a mobile computing device and servers include receiving an image of an object comprising a text. The mobile computing device can determine a degree of optical character recognition (OCR) complexity for obtaining the text from the image. Based on this degree of OCR complexity, the mobile computing device and/or the server(s) can perform OCR to obtain an OCR text. The mobile computing device can determine a degree of translation complexity for translating the OCR text from its source language to a target language. Based on this degree of translation complexity, the mobile computing device and/or the server(s) can perform machine language translation of the OCR text from the source language to a target language to obtain a translated OCR text. The mobile computing device can then output the translated OCR text.
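One way to picture the complexity-based routing in this abstract is a pair of threshold checks, one for OCR and one for translation. The sketch below is hypothetical: the abstract does not say how complexity is measured or compared, so the numeric scores, the `Thresholds` type, and the three-way "device" / "server" / "both" outcome are assumptions chosen to mirror the "and/or" wording.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    """Hypothetical cut-offs on a complexity score in [0, 1]."""
    low: float   # below this, the mobile device handles the task alone
    high: float  # above this, the server handles the task alone


def choose_executor(complexity, thresholds):
    """Decide where a task runs from its estimated complexity."""
    if complexity < thresholds.low:
        return "device"
    if complexity > thresholds.high:
        return "server"
    return "both"  # mid-range: device and server share the work


def plan_pipeline(ocr_complexity, translation_complexity,
                  ocr_thresholds, translation_thresholds):
    """Route OCR and translation independently, as two separate decisions."""
    return {
        "ocr": choose_executor(ocr_complexity, ocr_thresholds),
        "translation": choose_executor(translation_complexity,
                                       translation_thresholds),
    }


plan = plan_pipeline(
    ocr_complexity=0.1,          # e.g. clean, high-contrast text
    translation_complexity=0.9,  # e.g. long or idiomatic source text
    ocr_thresholds=Thresholds(low=0.3, high=0.7),
    translation_thresholds=Thresholds(low=0.3, high=0.7),
)
```

In this example the OCR step stays on the device while translation is sent to the server, which matches the two independent decisions the abstract describes.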

    PROVIDING ALTERNATIVE TRANSLATIONS
    3.
    Invention application
    Status: under examination (published)

    Publication No.: WO2012068074A1

    Publication date: 2012-05-24

    Application No.: PCT/US2011/060739

    Filing date: 2011-11-15

    CPC classification number: G06F17/277 G06F17/2818

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting alternative translations. In one aspect, a method includes receiving source language text; receiving translated text corresponding to the source language text from a machine translation system; receiving segmentation data for the translated text, wherein the segmentation data includes a first segmentation of the translated text, the first segmentation dividing the translated text into two or more segments; receiving one or more alternative translations for each of the two or more segments; presenting the source text and the translated text to a user in a user interface; and in response to a user selection of a first portion of the translated text, displaying, in the user interface, one or more alternative translations for a first segment to which the first portion of translated text corresponds according to the first segmentation.
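The lookup from a user selection to the alternatives for its segment could look roughly like this. It is a sketch under stated assumptions: the `Segment` type, character-offset spans, and the rule of matching on the start of the selection are illustrative choices, not details taken from the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """One segment of the translated text, as character offsets."""
    start: int  # inclusive offset into the translated text
    end: int    # exclusive offset into the translated text
    alternatives: List[str] = field(default_factory=list)


def alternatives_for_selection(segments, selection_start):
    """Return the alternatives of the segment containing the selection.

    Assumes the selection is identified by the offset where it begins;
    returns an empty list if that offset falls outside every segment.
    """
    for segment in segments:
        if segment.start <= selection_start < segment.end:
            return segment.alternatives
    return []


translated = "The cat sat on the mat"
segmentation = [
    Segment(0, 7, ["A cat", "The feline"]),        # "The cat"
    Segment(8, 14, ["sat upon", "was sitting on"]),  # "sat on"
    Segment(15, 22, ["the rug", "the carpet"]),    # "the mat"
]

# The user clicks inside "sat on" (offset 9 is the letter "a").
shown = alternatives_for_selection(segmentation, 9)
```

A real interface would also need to handle selections that span more than one segment, which this sketch does not attempt.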

    TECHNIQUES FOR DISTRIBUTED OPTICAL CHARACTER RECOGNITION AND DISTRIBUTED MACHINE LANGUAGE TRANSLATION
    4.
    Invention application
    Status: under examination (published)

    Publication No.: WO2015168051A1

    Publication date: 2015-11-05

    Application No.: PCT/US2015/027873

    Filing date: 2015-04-28

    Applicant: GOOGLE INC.

    CPC classification number: G06K9/00979 G06F17/289 G06K2209/01

    Abstract: Techniques for selectively distributing OCR and machine language translation tasks between a mobile computing device and servers include receiving an image of an object comprising a text. The mobile computing device can determine a degree of optical character recognition (OCR) complexity for obtaining the text from the image. Based on this degree of OCR complexity, the mobile computing device and/or the server(s) can perform OCR to obtain an OCR text. The mobile computing device can determine a degree of translation complexity for translating the OCR text from its source language to a target language. Based on this degree of translation complexity, the mobile computing device and/or the server(s) can perform machine language translation of the OCR text from the source language to a target language to obtain a translated OCR text. The mobile computing device can then output the translated OCR text.

    SPEECH RECOGNITION USING VARIABLE-LENGTH CONTEXT
    5.
    Invention application
    Status: under examination (published)

    Publication No.: WO2013003772A2

    Publication date: 2013-01-03

    Application No.: PCT/US2012/045039

    Filing date: 2012-06-29

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech using a variable length of context. Speech data and data identifying a candidate transcription for the speech data are received. A phonetic representation for the candidate transcription is accessed. Multiple test sequences are extracted for a particular phone in the phonetic representation. Each of the multiple test sequences includes a different set of contextual phones surrounding the particular phone. Data indicating that an acoustic model includes data corresponding to one or more of the multiple test sequences is received. From among the one or more test sequences, the test sequence that includes the highest number of contextual phones is selected. A score for the candidate transcription is generated based on the data from the acoustic model that corresponds to the selected test sequence.
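The selection step in this abstract, choosing the test sequence with the most contextual phones that the acoustic model knows about, can be sketched as below. The symmetric context window, the tuple representation of a test sequence, and the dictionary standing in for the acoustic model are assumptions made for illustration only.

```python
def extract_test_sequences(phones, index, max_context):
    """Build test sequences for phones[index] with 1..max_context
    contextual phones on each side (a symmetric window is assumed)."""
    sequences = []
    for k in range(1, max_context + 1):
        left = tuple(phones[max(0, index - k):index])
        right = tuple(phones[index + 1:index + 1 + k])
        sequences.append((left, phones[index], right))
    return sequences


def select_longest_available(sequences, acoustic_model):
    """Pick the sequence with the most contextual phones among those
    the model has data for, or None if the model has none of them."""
    available = [s for s in sequences if s in acoustic_model]
    if not available:
        return None
    return max(available, key=lambda s: len(s[0]) + len(s[2]))


# Phonetic representation of a candidate transcription ("action").
phones = ["sil", "ae", "k", "sh", "ih", "n", "sil"]
sequences = extract_test_sequences(phones, index=3, max_context=3)

# Hypothetical acoustic model: maps known sequences to a log score.
acoustic_model = {
    (("k",), "sh", ("ih",)): -1.2,
    (("ae", "k"), "sh", ("ih", "n")): -0.8,
}

best = select_longest_available(sequences, acoustic_model)
score = acoustic_model[best] if best is not None else None
```

In this example the model lacks the widest sequence (three phones per side), so the two-phone context is selected and its score contributes to the score of the candidate transcription.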
