-
Publication No.: US20220108212A1
Publication date: 2022-04-07
Application No.: US17308033
Filing date: 2021-05-04
Applicant: Apple Inc.
Inventor: Shuangfei ZHAI, Walter A. TALBOTT, Nitish SRIVASTAVA, Chen HUANG, Hanlin GOH, Joshua M. SUSSKIND
Abstract: Attention-free transformers are disclosed. Various implementations of attention-free transformers include a gating and pooling operation that allows the attention-free transformers to provide results comparable to or better than those of a standard attention-based transformer, with improved efficiency and reduced computational complexity with respect to space and time.
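As a rough illustration of what a gating and pooling operation could look like, the sketch below pools values over sequence positions with softmax weights computed from keys, then gates the pooled result per position with a sigmoid of the query. This particular form, and the names `gated_pooling`, `q`, `k`, and `v`, are assumptions made for illustration; the abstract does not specify the operation in this detail. The cost grows linearly with sequence length because no pairwise position-by-position score matrix is built.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_pooling(q, k, v):
    """Hypothetical gating-and-pooling layer (an assumption, not the claimed method).

    q, k, v: lists of T rows, each row holding d floats.
    Returns T rows of d floats.
    """
    T, d = len(q), len(q[0])
    pooled = []
    for j in range(d):
        # Pooling: softmax over sequence positions for feature j,
        # shifted by the column maximum for numerical stability.
        col_max = max(k[t][j] for t in range(T))
        weights = [math.exp(k[t][j] - col_max) for t in range(T)]
        total = sum(weights)
        pooled.append(sum(w * v[t][j] for t, w in enumerate(weights)) / total)
    # Gating: each position scales the shared pooled vector elementwise.
    return [[sigmoid(q[t][j]) * pooled[j] for j in range(d)] for t in range(T)]
```

With equal keys the pooling reduces to a plain mean of the values, and a query of zero gives a gate of 0.5 at every position.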
-
Publication No.: US20200327450A1
Publication date: 2020-10-15
Application No.: US16384738
Filing date: 2019-04-15
Applicant: Apple Inc.
Inventor: Chen HUANG, Joshua M. SUSSKIND, Carlos GUESTRIN
IPC: G06N20/00
Abstract: The subject technology trains, for a first set of iterations, a first machine learning model using a loss function with a first set of parameters. The subject technology determines, by a second machine learning model, a state of the first machine learning model corresponding to the first set of iterations. The subject technology determines, by the second machine learning model, an action for updating the loss function based on the state of the first machine learning model. The subject technology updates, by the second machine learning model, the loss function based at least in part on the action, where the updated loss function includes a second set of parameters corresponding to a change in values of the first set of parameters. The subject technology trains, for a second set of iterations, the first machine learning model using the updated loss function with the second set of parameters.
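The alternating procedure in this abstract can be sketched as a loop: train, observe state, choose an action, update the loss parameters, train again. Everything concrete below is an assumption for illustration only: the first model is a one-weight linear regressor, the loss is a Huber-style loss whose single parameter is `delta`, and a hand-written rule in `choose_action` stands in for the second machine learning model, which the abstract does not describe further.

```python
def train(weight, data, loss_params, iterations, lr=0.05):
    """First model (hypothetical): y = weight * x, trained by gradient descent."""
    delta = loss_params["delta"]
    for _ in range(iterations):
        grad = 0.0
        for x, y in data:
            err = weight * x - y
            # Huber-style gradient: the error itself inside delta, clipped outside.
            g = err if abs(err) <= delta else delta * (1.0 if err > 0 else -1.0)
            grad += g * x
        weight -= lr * grad / len(data)
    return weight


def observe_state(weight, data):
    """State of the first model: here, its mean absolute error."""
    return sum(abs(weight * x - y) for x, y in data) / len(data)


def choose_action(state, loss_params):
    """Stand-in for the second model: returns a change to delta."""
    return 0.5 if state > loss_params["delta"] else -0.1


def update_loss(loss_params, action):
    """Second parameter set = first parameter set plus the chosen change."""
    return {"delta": max(0.1, loss_params["delta"] + action)}


def adaptive_training(data, rounds=5, iters_per_round=20):
    weight, loss_params = 0.0, {"delta": 0.5}
    for _ in range(rounds):
        weight = train(weight, data, loss_params, iters_per_round)
        state = observe_state(weight, data)
        action = choose_action(state, loss_params)
        loss_params = update_loss(loss_params, action)
    return weight, loss_params
```

Calling `adaptive_training([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])` drives the weight toward 2 while `delta` is adjusted between rounds. In a fuller treatment the rule in `choose_action` would be replaced by a learned policy.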
-