Modeling Ambiguity in Neural Machine Translation

    Publication No.: US20230351125A1

    Publication date: 2023-11-02

    Application No.: US18089684

    Filing date: 2022-12-28

    Applicant: Google LLC

    CPC classification number: G06F40/58 G06F40/284

    Abstract: The technology addresses ambiguity in neural machine translation. An encoder module receives a given text exemplar and generates an encoded representation of it. A decoder module receives the encoded representation and a set of translation prefixes. The decoder module outputs an unbounded function corresponding to a set of tokens associated with each pair of the given text exemplar and translation prefix from the set of translation prefixes. Each token is assigned a probability between 0 and 1 in a vocabulary of the exemplar at each time step. A logits module generates, based on the unbounded function, a corresponding bounded conditional probability for each token, wherein the probabilities are not normalized over the vocabulary at each time step. A loss function module having a positive loss component and a scaled negative loss component identifies whether each target text of a set of target texts is a valid translation of the exemplar.
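
The scoring scheme in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the following are assumptions made for illustration only: a sigmoid is used to bound each token score, the positive and negative components are binary cross-entropy terms, and the function names, the `negative_scale` parameter, and the toy scores are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Map an unbounded score to a bounded value in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def token_probabilities(scores):
    """Bound each token score independently.

    Unlike a softmax, the results are not normalized over the vocabulary,
    so several tokens can receive a high probability at the same time step.
    """
    return [sigmoid(s) for s in scores]


def ambiguity_loss(scores, valid_token_ids, negative_scale=0.5):
    """Positive loss on valid tokens plus a scaled negative loss on the rest.

    Assumed form (not taken from the patent): binary cross-entropy per token,
    with the negative component multiplied by `negative_scale`.
    """
    eps = 1e-12  # guards against log(0)
    positive = 0.0
    negative = 0.0
    for token_id, p in enumerate(token_probabilities(scores)):
        if token_id in valid_token_ids:
            positive += -math.log(p + eps)
        else:
            negative += -math.log(1.0 - p + eps)
    return positive + negative_scale * negative


# Toy vocabulary of four tokens; tokens 0 and 1 are both valid continuations.
scores = [2.0, 1.5, -3.0, -4.0]
probs = token_probabilities(scores)
loss = ambiguity_loss(scores, valid_token_ids={0, 1})
```

In this toy case both valid tokens score above 0.8, which a softmax over the same vocabulary could not express because its outputs must sum to 1.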
