-
Publication number: US20240370717A1
Publication date: 2024-11-07
Application number: US18313189
Application date: 2023-05-05
Applicant: Google LLC
Inventor: Qifei Wang , Yicheng Fan , Wei Xu , Jiayu Ye , Lu Wang , Chuo-Ling Chang , Dana Alon , Erik Nathan Vee , Hongkun Yu , Matthias Grundmann , Shanmugasundaram Ravikumar , Andrew Stephen Tomkins
IPC: G06N3/08
Abstract: A method for a cross-platform distillation framework includes obtaining a plurality of training samples. The method includes generating, using a student neural network model executing on a first processing unit, a first output based on a first training sample. The method also includes generating, using a teacher neural network model executing on a second processing unit, a second output based on the first training sample. The method includes determining, based on the first output and the second output, a first loss. The method further includes adjusting, based on the first loss, one or more parameters of the student neural network model. The method includes repeating the above steps for each training sample of the plurality of training samples.
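The per-sample distillation loop described in this abstract can be pictured with a minimal sketch, assuming PyTorch, a CUDA device available for the teacher, and a KL-divergence loss over softened outputs; the tiny linear models, the CPU/GPU device split, and the hyperparameters are illustrative stand-ins, not the networks or processing units of the claimed method.

```python
# Minimal sketch of the distillation loop (PyTorch). Device assignments, model
# definitions, and the soft-label loss are assumptions, not taken from the patent.
import torch
import torch.nn.functional as F

student = torch.nn.Linear(128, 10).to("cpu")    # student executes on the first processing unit
teacher = torch.nn.Linear(128, 10).to("cuda")   # teacher executes on the second processing unit
teacher.eval()
optimizer = torch.optim.SGD(student.parameters(), lr=1e-3)

def distill_step(sample: torch.Tensor) -> None:
    # First output: student forward pass on the first processing unit.
    student_out = student(sample.to("cpu"))
    # Second output: teacher forward pass on the second processing unit.
    with torch.no_grad():
        teacher_out = teacher(sample.to("cuda")).to("cpu")
    # First loss: compare the two outputs (KL divergence over softened logits here).
    loss = F.kl_div(
        F.log_softmax(student_out, dim=-1),
        F.softmax(teacher_out, dim=-1),
        reduction="batchmean",
    )
    # Adjust the student's parameters based on the loss.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Repeat for each training sample of the plurality of training samples.
for sample in torch.randn(32, 128):
    distill_step(sample.unsqueeze(0))
```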
-
Publication number: US20240232637A9
Publication date: 2024-07-11
Application number: US18491877
Application date: 2023-10-23
Applicant: Google LLC
Inventor: Krishna Pragash Srinivasan , Michael Bendersky , Anupam Samanta , Lingrui Liao , Luca Bertelli , Ming-Wei Chang , Iftekhar Naim , Siddhartha Brahma , Siamak Shakeri , Hongkun Yu , John Nham , Karthik Raman , Raphael Dominik Hoffmann
IPC: G06N3/0895 , G06F16/903 , G06F16/93 , G06N3/0455
CPC classification number: G06N3/0895 , G06F16/90335 , G06F16/93 , G06N3/0455
Abstract: Provided are computing systems, methods, and platforms that train query processing models, such as large language models, to perform query intent classification tasks by using retrieval augmentation and multi-stage distillation. Unlabeled training examples of queries may be obtained, and a set of the training examples may be augmented with additional feature annotations to generate augmented training examples. A first query processing model may annotate the retrieval augmented queries to generate inferred labels for the augmented training examples. A second query processing model may be trained on the inferred labels, distilling the query processing model that was trained with retrieval augmentation into a non-retrieval augmented query processing model. The second query processing model may annotate the entire set of unlabeled training examples. Another stage of distillation may train a third query processing model using the entire set of unlabeled training examples without retrieval augmentation.
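The multi-stage pipeline in this abstract can be outlined with a short sketch; the retrieval function, the stand-in models, the toy trainer, and the half-split of the training examples are hypothetical placeholders, not the large language models or retrieval system of the claimed method.

```python
# Sketch of the multi-stage distillation pipeline: a retrieval-augmented first
# model labels a subset, a second model is distilled from those labels, and a
# third model is trained on the second model's labels for the full unlabeled set.
from typing import Callable, List, Tuple

QueryModel = Callable[[str], str]                 # maps a query to an intent label
Trainer = Callable[[List[Tuple[str, str]]], QueryModel]

def retrieve_annotations(query: str) -> str:
    # Hypothetical retrieval augmentation: append retrieved feature annotations.
    return f"{query} [retrieved annotations for '{query}']"

def run_pipeline(
    unlabeled_queries: List[str],
    first_model_with_retrieval: QueryModel,       # annotates retrieval-augmented queries
    train_model: Trainer,
) -> QueryModel:
    # Stage 1: augment a subset of the unlabeled queries and let the
    # retrieval-augmented first model infer intent labels for them.
    subset = unlabeled_queries[: max(1, len(unlabeled_queries) // 2)]
    inferred = [(q, first_model_with_retrieval(retrieve_annotations(q))) for q in subset]

    # Stage 2: distill into a second model trained on plain queries plus the
    # inferred labels, with no retrieval augmentation.
    second_model = train_model(inferred)

    # Stage 3: the second model annotates the entire unlabeled set, and a third
    # model is trained on those labels, again without retrieval augmentation.
    full_labels = [(q, second_model(q)) for q in unlabeled_queries]
    return train_model(full_labels)

def majority_trainer(examples: List[Tuple[str, str]]) -> QueryModel:
    # Toy "training": always predict the majority label seen in the examples.
    labels = [label for _, label in examples]
    majority = max(set(labels), key=labels.count)
    return lambda query: majority

if __name__ == "__main__":
    queries = ["how to bake bread", "example.com login", "weather tomorrow"]
    first_model = lambda q: "navigational" if "login" in q else "informational"
    third_model = run_pipeline(queries, first_model, majority_trainer)
    print(third_model("new query"))
```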
-
Publication number: US20240135187A1
Publication date: 2024-04-25
Application number: US18491877
Application date: 2023-10-22
Applicant: Google LLC
Inventor: Krishna Pragash Srinivasan , Michael Bendersky , Anupam Samanta , Lingrui Liao , Luca Bertelli , Ming-Wei Chang , Iftekhar Naim , Siddhartha Brahma , Siamak Shakeri , Hongkun Yu , John Nham , Karthik Raman , Raphael Dominik Hoffmann
IPC: G06N3/0895 , G06F16/903 , G06F16/93 , G06N3/0455
CPC classification number: G06N3/0895 , G06F16/90335 , G06F16/93 , G06N3/0455
Abstract: Provided are computing systems, methods, and platforms that train query processing models, such as large language models, to perform query intent classification tasks by using retrieval augmentation and multi-stage distillation. Unlabeled training examples of queries may be obtained, and a set of the training examples may be augmented with additional feature annotations to generate augmented training examples. A first query processing model may annotate the retrieval augmented queries to generate inferred labels for the augmented training examples. A second query processing model may be trained on the inferred labels, distilling the query processing model that was trained with retrieval augmentation into a non-retrieval augmented query processing model. The second query processing model may annotate the entire set of unlabeled training examples. Another stage of distillation may train a third query processing model using the entire set of unlabeled training examples without retrieval augmentation.
-