Patent search ap:("Facebook Page Inc.") AND inv:"Matthias Gerhard Eck"

1.

Granted patent
Language model using reverse translations (in force)

Publication number: US10460040B2

Publication date: 2019-10-29

Application number: US15194249

Application date: 2016-06-27

Applicant: Facebook, Inc.

Inventor: Matthias Gerhard Eck

IPC: G06F17/28 , G06F17/27

Abstract: Exemplary embodiments relate to techniques for improving machine translation systems. The machine translation system may apply one or more models for translating material from a source language into a destination language. The models are initially trained using training data. According to exemplary embodiments, supplemental training data is used to train the models, where the supplemental training data uses in-domain material to improve the quality of output translations. In-domain data may include data that relates to the same or similar topics as those expected to be encountered in a translation of material from the source language into the destination language. In-domain data may include material previously translated from the source language into the destination language, material similar to previous translations, and destination language material that has previously been the subject of a request for translation into the source language.

2.

Granted patent
Data sorting for language processing such as POS tagging (in force)

Publication number: US09916299B2

Publication date: 2018-03-13

Application number: US15416186

Application date: 2017-01-26

Applicant: Facebook, Inc.

Inventor: Matthias Gerhard Eck

IPC: G06F17/27 , G06F17/28 , G06F17/20 , G06F17/21 , H04L12/58

CPC classification number: G06F17/274 , G06F17/218 , G06F17/2715 , G06F17/273 , G06F17/28 , G06F17/2818 , G06N5/022 , G06Q50/01 , H04L51/32

Abstract: Technology is disclosed that improves language coverage by selecting sentences to be used as training data for a language processing engine. The technology accomplishes the selection of a number of sentences by obtaining a group of sentences, computing a score for each sentence, sorting the sentences based on their scores, and selecting a number of sentences with the highest scores. The scores can be computed by dividing a sum of frequency values of unseen words (or n-grams) in the sentence by a length of the sentence. The frequency values can be based on posts in one or more particular domains, such as the public domain, the private domain, or other specialized domains.
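
The scoring rule in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the names `score_sentence`, `select_sentences`, `seen_words`, and `domain_freq` are hypothetical, tokenization is a plain lowercase whitespace split, and each distinct unseen word is assumed to count once per sentence.

```python
def score_sentence(sentence, seen_words, domain_freq):
    """Sum of domain frequencies of unseen words, divided by sentence length.

    seen_words: words already covered by existing training data (assumed input).
    domain_freq: word -> frequency in the domain of interest (assumed input).
    """
    words = sentence.lower().split()
    if not words:
        return 0.0
    # Assumption: a distinct unseen word contributes its frequency once.
    unseen = {w for w in words if w not in seen_words}
    return sum(domain_freq.get(w, 0) for w in unseen) / len(words)


def select_sentences(sentences, seen_words, domain_freq, n):
    """Sort sentences by score, highest first, and keep the top n."""
    ranked = sorted(
        sentences,
        key=lambda s: score_sentence(s, seen_words, domain_freq),
        reverse=True,
    )
    return ranked[:n]


# Toy data, invented for illustration only.
domain_freq = {"selfie": 50, "lol": 30, "the": 1000}
seen_words = {"the", "cat", "sat"}
candidates = ["the cat sat", "lol the selfie", "selfie"]
print(select_sentences(candidates, seen_words, domain_freq, 2))
```

Dividing by sentence length keeps long sentences from winning simply because they contain more words; the abstract also mentions n-grams, which this sketch does not cover.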

3.

Granted patent
Identifying risky translations (in force)

Publication number: US10318640B2

Publication date: 2019-06-11

Application number: US15192076

Application date: 2016-06-24

Applicant: Facebook, Inc.

Inventor: William Arthur Hughes , Matthias Gerhard Eck , Kay Rottmann

IPC: G06F17/28 , G06F17/27 , G10L15/26

Abstract: Exemplary embodiments provide techniques for evaluating when words or phrases of a translation were generated with a low degree of confidence, and conveying this information when the translation is presented. For example, if a source language word is encountered in source material for translation, but the source language word was only encountered a few times (or not at all) in the training data used to train the translation system, then the resulting translation may be flagged as being of low confidence. Other situations, such as the generation of two equally-likely translations, or translation system model disagreement, may also indicate a questionable translation. When the translation is displayed, questionable words and phrases may be flagged, and possible alternative translations may be presented. If one of the alternatives is selected, this information may be used to update the translation system's models in order to improve translation quality in the future.
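
One of the signals named in the abstract above, a source word seen only a few times (or never) in the training data, can be sketched as follows. This is an assumption-laden illustration: `build_counts`, `flag_rare_words`, and the `min_count` threshold are hypothetical names and values, and the other signals (equally likely alternatives, model disagreement) are not modeled.

```python
from collections import Counter


def build_counts(training_sentences):
    """Count how often each lowercase token appears in the training data."""
    counts = Counter()
    for sentence in training_sentences:
        counts.update(sentence.lower().split())
    return counts


def flag_rare_words(source_sentence, counts, min_count=2):
    """Pair each source word with True if it was seen fewer than min_count times."""
    return [
        (word, counts[word.lower()] < min_count)
        for word in source_sentence.split()
    ]


# Toy training data, invented for illustration only.
counts = build_counts(["the cat sat", "the cat ran", "the dog ran"])
for word, risky in flag_rare_words("the cat ate quinoa", counts):
    print(word, "LOW CONFIDENCE" if risky else "ok")
```

A real system would carry the flag from the source word to the corresponding span of the output translation before display; that mapping is left out here.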

4.

Patent application
MACHINE-TRANSLATION BASED CORRECTIONS (pending, published)

Publication number: US20190018837A1

Publication date: 2019-01-17

Application number: US15868970

Application date: 2018-01-11

Applicant: Facebook, Inc.

Inventor: Juan Miguel Pino , Matthias Gerhard Eck , Rui Andre Augusto Ferreira

IPC: G06F17/27

CPC classification number: G06F17/2775 , G06F17/273

Abstract: Technology is disclosed for building correction models that correct natural language snippets. Correction models can include rules comprising pairs of word sequences identified from viable correction snippet pairs, where a first sequence of words in the pair should be replaced with a second sequence of words in the pair. Viable correction snippet pairs can be identified from among pairs of language snippets, such as a post to a social media website and a subsequent update to that post. Viable corrections can be the snippet pairs that both have no more unaligned words than a word alignment threshold and have no aligned word pair with a character edit difference above an edit distance threshold. In some implementations, word alignments can be found by aligning all the characters between a pair of language snippets, and identifying aligned words as those that have at least one aligned letter in common.
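
The viability test in the abstract above (character alignment, then word alignment, then two thresholds) can be sketched as below. This is only an approximation under stated assumptions: Python's `difflib.SequenceMatcher` stands in for the character aligner, which the abstract does not specify, and the function names and the threshold defaults `max_unaligned` and `max_edit` are hypothetical.

```python
import re
from difflib import SequenceMatcher


def char_to_word(text):
    """Map each character index to the index of its word (None for whitespace)."""
    mapping = [None] * len(text)
    for idx, match in enumerate(re.finditer(r"\S+", text)):
        for i in range(match.start(), match.end()):
            mapping[i] = idx
    return mapping


def align_words(a, b):
    """Words are aligned if they share at least one aligned character."""
    map_a, map_b = char_to_word(a), char_to_word(b)
    pairs = set()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            wa, wb = map_a[block.a + k], map_b[block.b + k]
            if wa is not None and wb is not None:
                pairs.add((wa, wb))
    return pairs


def edit_distance(s, t):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]


def is_viable_correction(a, b, max_unaligned=1, max_edit=2):
    """Apply both thresholds: few unaligned words, small edits in aligned pairs."""
    words_a, words_b = a.split(), b.split()
    pairs = align_words(a, b)
    unaligned = (len(words_a) - len({i for i, _ in pairs})
                 + len(words_b) - len({j for _, j in pairs}))
    if unaligned > max_unaligned:
        return False
    return all(edit_distance(words_a[i], words_b[j]) <= max_edit for i, j in pairs)


print(is_viable_correction("I love teh beach", "I love the beach"))
print(is_viable_correction("call me", "call me when you land ok"))
```

The first pair is a typo fix in an otherwise unchanged post; the second is an update that adds new content, which leaves too many unaligned words to count as a correction.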

5.

Granted patent
Machine-translation based corrections (in force)

Publication number: US09904672B2

Publication date: 2018-02-27

Application number: US14788679

Application date: 2015-06-30

Applicant: Facebook, Inc.

Inventor: Juan Miguel Pino , Matthias Gerhard Eck , Rui Andre Augusto Ferreira

IPC: G06F17/27

CPC classification number: G06F17/2775 , G06F17/273

Abstract: Technology is disclosed for building correction models that correct natural language snippets. Correction models can include rules comprising pairs of word sequences identified from viable correction snippet pairs, where a first sequence of words in the pair should be replaced with a second sequence of words in the pair. Viable correction snippet pairs can be identified from among pairs of language snippets, such as a post to a social media website and a subsequent update to that post. Viable corrections can be the snippet pairs that both have no more unaligned words than a word alignment threshold and have no aligned word pair with a character edit difference above an edit distance threshold. In some implementations, word alignments can be found by aligning all the characters between a pair of language snippets, and identifying aligned words as those that have at least one aligned letter in common.

6.

Patent application
CORRECTIONS FOR NATURAL LANGUAGE PROCESSING (pending, published)

Publication number: US20170004120A1

Publication date: 2017-01-05

Application number: US14788578

Application date: 2015-06-30

Applicant: Facebook, Inc.

Inventor: Matthias Gerhard Eck , Fei Huang , Kay Rottmann

IPC: G06F17/24 , G06F17/28 , G06F17/27 , G06F17/22

CPC classification number: G06F17/2775 , G06F17/273

Abstract: Technology is disclosed for correcting items containing natural language words that match qualified corrections. Qualified corrections can be identified from language snippet sets, which can include, for example, a post to a social media website and one or more updates to that post. Qualified corrections can be word pairs identified in one of these language snippet sets by aligning words between the language snippets according to a minimum word edit distance and computing that the word edit distance is below a first threshold. Based on this word alignment, word pairs can be selected and analyzed to identify qualified corrections as the word pairs that have a minimum character edit distance below a second threshold. In some cases, such as where both words in the qualified correction word pair are known words, a context can be associated with the qualified correction to control when the qualified correction should be applied.
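
The two-threshold procedure in the abstract above can be sketched with a standard Levenshtein alignment applied first to words and then to characters. This is a minimal illustration under assumptions: `align`, `qualified_corrections`, and the threshold defaults are hypothetical, "below a threshold" is read as strictly less than, and the context handling for known-word pairs is not modeled.

```python
def align(seq_a, seq_b):
    """Minimum edit distance alignment; returns (distance, substituted pairs)."""
    n, m = len(seq_a), len(seq_b)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if seq_a[i - 1] == seq_b[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,          # deletion
                             dist[i][j - 1] + 1,          # insertion
                             dist[i - 1][j - 1] + cost)   # match or substitution
    # Walk back through the table to recover which items were substituted.
    subs, i, j = [], n, m
    while i > 0 and j > 0:
        cost = 0 if seq_a[i - 1] == seq_b[j - 1] else 1
        if dist[i][j] == dist[i - 1][j - 1] + cost:
            if cost:
                subs.append((seq_a[i - 1], seq_b[j - 1]))
            i, j = i - 1, j - 1
        elif dist[i][j] == dist[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    subs.reverse()
    return dist[n][m], subs


def qualified_corrections(original, updated, max_word_edits=2, max_char_edits=3):
    """Keep substituted word pairs when both edit distances are below threshold."""
    word_dist, word_pairs = align(original.split(), updated.split())
    if word_dist >= max_word_edits:
        return []
    return [(a, b) for a, b in word_pairs if align(a, b)[0] < max_char_edits]


print(qualified_corrections("see you tomorow at noon", "see you tomorrow at noon"))
print(qualified_corrections("I like cats", "I like dogs"))
```

The first call recovers the spelling fix; the second swaps one word for an unrelated one, so the character-level threshold rejects it.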


7.

Granted patent
Machine-translation based corrections (in force)

Publication number: US10474751B2

Publication date: 2019-11-12

Application number: US15868970

Application date: 2018-01-11

Applicant: Facebook, Inc.

Inventor: Juan Miguel Pino , Matthias Gerhard Eck , Rui Andre Augusto Ferreira

IPC: G06F17/27

Abstract: Technology is disclosed for building correction models that correct natural language snippets. Correction models can include rules comprising pairs of word sequences identified from viable correction snippet pairs, where a first sequence of words in the pair should be replaced with a second sequence of words in the pair. Viable correction snippet pairs can be identified from among pairs of language snippets, such as a post to a social media website and a subsequent update to that post. Viable corrections can be the snippet pairs that both have no more unaligned words than a word alignment threshold and have no aligned word pair with a character edit difference above an edit distance threshold. In some implementations, word alignments can be found by aligning all the characters between a pair of language snippets, and identifying aligned words as those that have at least one aligned letter in common.

8.

Granted patent
Machine translation system employing classifier (in force)

Publication number: US10268686B2

Publication date: 2019-04-23

Application number: US15192170

Application date: 2016-06-24

Applicant: Facebook, Inc.

Inventor: Matthias Gerhard Eck , Priya Goyal

IPC: G06F17/27 , G06F17/28

Abstract: Exemplary embodiments relate to detecting, removing, and/or replacing objectionable words and phrases in a machine-generated translation. A classifier identifies translations containing target words or phrases. The classifier may be applied to the output translation to remove target words and phrases from the translation, or to prevent target words and phrases from being automatically presented. Further, the classifier may be applied to a translation model to prevent the target words and phrases from appearing in the output translation. Still further, the classifier may be applied to training data so that the translation model is not trained using the target words or phrases. The classifier may remove target words or phrases only when the target words or phrases appear in the output translation but not the source language input data. The classifier may be provided as a standalone service, or may be employed in the context of a machine translation system.
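
The rule that a target word is removed only when it appears in the output but has no counterpart in the source can be sketched as follows. This is a simplification under stated assumptions: a hand-written lexicon stands in for the classifier, and `filter_translation`, `target_lexicon`, and the `***` mask are hypothetical. The lexicon maps each target word in the output language to the source-language words that would legitimately produce it.

```python
def filter_translation(source, translation, target_lexicon, mask="***"):
    """Mask target words that the translation introduced on its own.

    target_lexicon: output-language target word -> set of source-language
    words that justify it (an assumed, illustrative data structure).
    """
    source_words = set(source.lower().split())
    kept = []
    for word in translation.split():
        origins = target_lexicon.get(word.lower())
        if origins is not None and not (origins & source_words):
            # Target word with no counterpart in the source: replace it.
            kept.append(mask)
        else:
            kept.append(word)
    return " ".join(kept)


# Toy German-to-English example, invented for illustration only.
lexicon = {"damn": {"verdammt"}}
print(filter_translation("das ist verdammt gut", "that is damn good", lexicon))
print(filter_translation("das ist sehr gut", "that is damn good", lexicon))
```

In the first call the source itself contains the strong word, so the translation is left alone; in the second the word was introduced by the translation and is masked.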

9.

Patent application
MINING MULTI-LINGUAL DATA (pending, published)

Publication number: US20180089178A1

Publication date: 2018-03-29

Application number: US15823492

Application date: 2017-11-27

Applicant: Facebook, Inc.

Inventor: Matthias Gerhard Eck , Ying Zhang , Yury Andreyevich Zemlyanskiy , Alexander Waibel

IPC: G06F17/28 , G06F17/30

CPC classification number: G06F17/289 , G06F16/951 , G06F17/2818 , G06F17/2827

Abstract: Technology is disclosed for mining training data to create machine translation engines. Training data can be mined as translation pairs from single content items that contain multiple languages; multiple content items in different languages that are related to the same or similar target; or multiple content items that are generated by the same author in different languages. Locating content items can include identifying potential sources of translation pairs that fall into these categories and applying filtering techniques to quickly gather those that are good candidates for being actual translation pairs. When actual translation pairs are located, they can be used to retrain a machine translation engine as in-domain for social media content items.

10.

Patent application
IDENTIFYING RISKY TRANSLATIONS (pending, published)

Publication number: US20170371867A1

Publication date: 2017-12-28

Application number: US15192076

Application date: 2016-06-24

Applicant: Facebook, Inc.

Inventor: William Arthur Hughes , Matthias Gerhard Eck , Kay Rottmann

IPC: G06F17/28 , G06F17/27

CPC classification number: G06F17/2854 , G06F17/2818

Abstract: Exemplary embodiments provide techniques for evaluating when words or phrases of a translation were generated with a low degree of confidence, and conveying this information when the translation is presented. For example, if a source language word is encountered in source material for translation, but the source language word was only encountered a few times (or not at all) in the training data used to train the translation system, then the resulting translation may be flagged as being of low confidence. Other situations, such as the generation of two equally-likely translations, or translation system model disagreement, may also indicate a questionable translation. When the translation is displayed, questionable words and phrases may be flagged, and possible alternative translations may be presented. If one of the alternatives is selected, this information may be used to update the translation system's models in order to improve translation quality in the future.
