-
Publication No.: US20220237435A1
Publication Date: 2022-07-28
Application No.: US17159437
Filing Date: 2021-01-27
Applicant: Google LLC
Inventor: Yanping Huang , Dmitry Lepikhin , Maxim Krikun , Orhan Firat , Ankur Bapna , Thang Luong , Sneha Kudugunta
Abstract: Systems and methods for routing in Mixture-of-Experts models. In some aspects of the technology, a transformer may have at least one Mixture-of-Experts ("MoE") layer in each of its encoder and decoder. The at least one MoE layer of the encoder has a learned gating function configured to route each token of a task to two or more selected expert feed-forward networks, and the at least one MoE layer of the decoder likewise has a learned gating function configured to route each token to two or more selected expert feed-forward networks.
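The top-2 routing described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation; the function names, the softmax gate, and the renormalization of the two selected gate values are assumptions made for illustration.

```python
import numpy as np

def top2_gate(tokens, gate_weights, experts):
    """Route each token to its top-2 experts and combine their outputs.

    tokens: (n_tokens, d_model) array of token representations
    gate_weights: (d_model, n_experts) learned gating matrix (assumed linear gate)
    experts: list of callables, each mapping a (d_model,) vector to (d_model,)
    """
    logits = tokens @ gate_weights                       # (n_tokens, n_experts)
    # Softmax over experts to obtain gate probabilities per token.
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)

    outputs = np.zeros_like(tokens)
    for i, (tok, p) in enumerate(zip(tokens, probs)):
        top2 = np.argsort(p)[-2:]                        # indices of the 2 largest gates
        norm = p[top2].sum()                             # renormalize the selected gates
        for e in top2:
            outputs[i] += (p[e] / norm) * experts[e](tok)
    return outputs
```

Each token is dispatched to exactly two expert feed-forward networks, and the expert outputs are mixed with the renormalized gate probabilities; all other experts are skipped for that token, which is what makes the layer sparsely activated.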
-
Publication No.: US20210097144A1
Publication Date: 2021-04-01
Application No.: US16590309
Filing Date: 2019-10-01
Applicant: Google LLC
Inventor: Ankur Bapna , Ye Tian , Orhan Firat
Abstract: Adapters for neural machine translation systems. A method includes determining a set of similar n-grams that are similar to a source n-gram, where each similar n-gram and the source n-gram are in a first language; determining, for each n-gram in the set of similar n-grams, a target n-gram that is a translation of the similar n-gram from the first language into a second language; generating a source encoding of the source n-gram and, for each target n-gram determined from the set of similar n-grams, a target encoding of the target n-gram, along with a conditional source target memory that is an encoding of each of the target encodings; providing, as input to a first prediction model, the source encoding and the conditional source target memory; and generating a predicted translation of the source n-gram from the first language to the second language.
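The retrieval-and-memory steps above can be sketched as follows. This is an illustrative outline, not the claimed method: the edit-distance similarity measure, the `encode` callable, and the `build_memory` helper are all assumptions standing in for the learned encoders described in the abstract.

```python
import numpy as np

def edit_distance(a, b):
    """Levenshtein distance between two token sequences (rolling-row variant)."""
    m, n = len(a), len(b)
    d = np.arange(n + 1)
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            # deletion, insertion, substitution/match
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1,
                                   prev + (a[i - 1] != b[j - 1]))
    return int(d[n])

def build_memory(source, translation_memory, encode, k=2):
    """Retrieve the k stored source n-grams most similar to `source`, then
    stack the encodings of their stored target translations into a
    conditional source target memory.

    translation_memory: list of (source_ngram, target_ngram) pairs
    encode: callable mapping a token sequence to a (d,) vector (hypothetical)
    """
    ranked = sorted(translation_memory,
                    key=lambda pair: edit_distance(source, pair[0]))
    targets = [tgt for _, tgt in ranked[:k]]
    memory = np.stack([encode(t) for t in targets])      # (k, d) target encodings
    return encode(source), memory
```

A prediction model would then consume the source encoding together with the memory of similar-translation encodings to produce the final translation, conditioning its output on examples of how near-identical phrases were previously translated.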
-