Systems and Methods for Training Translation Models Using Source-Augmented Training Examples

    Publication number: US20230419053A1

    Publication date: 2023-12-28

    Application number: US17988315

    Application date: 2022-11-16

    Applicant: Google LLC

    CPC classification number: G06F40/58 G06F40/51 G06F40/49

    Abstract: Systems and methods for training a translation model based on a first text sequence in a first language, a second text sequence in a second language different from the first language, and a label based on a source of the second text sequence. In some examples, the label may comprise an Internet domain, an Internet subdomain, a uniform resource locator, a website name, or an IP address. In some examples, the label may further indicate a source of the first text sequence. In some examples, each given training example may be automatically generated by sampling the first text sequence from a first page of a given Internet domain, sampling the second text sequence from a second page of the given Internet domain, and generating the label based on all or a portion of source data of the second page.
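The example-generation step described in this abstract can be illustrated with a minimal sketch. It assumes the label is the host name (domain or subdomain) of the page the second text sequence came from, which is only one of the label types the abstract lists. The names `TrainingExample`, `make_label`, and `build_example` are hypothetical and do not come from the patent, and the sketch does not show how the label is used during training.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class TrainingExample:
    """Hypothetical container for one source-augmented training example."""
    first_text: str   # text sequence in the first language
    second_text: str  # text sequence in the second language
    label: str        # label derived from the source of the second text


def make_label(page_url: str) -> str:
    """Derive a label from a page URL (assumption: use the host name)."""
    host = urlparse(page_url).hostname
    if host is None:
        raise ValueError(f"cannot derive a label from {page_url!r}")
    return host


def build_example(first_text: str, second_text: str,
                  second_page_url: str) -> TrainingExample:
    """Pair two text sequences and attach a label for the second page."""
    return TrainingExample(first_text, second_text,
                           make_label(second_page_url))


example = build_example("Hello", "Bonjour", "https://fr.example.com/accueil")
```

A URL, website name, or IP address could be substituted for the host name by changing `make_label` alone, since the rest of the sketch treats the label as an opaque string.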

    Systems and Methods for Contextual Post-Editing of Sentence-Level Translations

    Publication number: US20230252245A1

    Publication date: 2023-08-10

    Application number: US18015551

    Application date: 2020-08-07

    Applicant: Google LLC

    Inventor: Melvin Johnson

    CPC classification number: G06F40/51 G06F40/47

    Abstract: Generally, the present disclosure is directed to systems and methods that leverage machine learning to perform post-editing of sentence-level translations that takes into account contextual information from the language source. As an example, the proposed post-editing system can run as a second pass to a sentence-level translation system and the goal of the post-editing system may be to refine translations which are affected by the larger context.
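The second-pass arrangement described in this abstract can be sketched as a loop that hands each draft sentence translation to a post-editor together with nearby source sentences. This is only an illustration of the control flow: the function name `post_edit_document`, the `context_window` parameter, and the idea of using a fixed window of neighbouring source sentences as the context are all assumptions, and `post_editor` stands in for a learned model that the abstract does not specify.

```python
from typing import Callable, List


def post_edit_document(
    source_sentences: List[str],
    draft_translations: List[str],
    post_editor: Callable[[str, List[str]], str],
    context_window: int = 1,
) -> List[str]:
    """Refine sentence-level drafts using surrounding source sentences.

    post_editor(draft, context) is a hypothetical stand-in for a trained
    post-editing model; context is the slice of source sentences within
    context_window positions of the current sentence.
    """
    if len(source_sentences) != len(draft_translations):
        raise ValueError("sources and drafts must be aligned one-to-one")

    refined = []
    for i, draft in enumerate(draft_translations):
        start = max(0, i - context_window)
        end = min(len(source_sentences), i + context_window + 1)
        refined.append(post_editor(draft, source_sentences[start:end]))
    return refined
```

Passing the post-editor in as a callable keeps the first-pass translation system and the refinement step independent, which matches the abstract's description of post-editing as a separate second pass.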
