-
Publication number: US20240370717A1
Publication date: 2024-11-07
Application number: US18313189
Application date: 2023-05-05
Applicant: Google LLC
Inventor: Qifei Wang , Yicheng Fan , Wei Xu , Jiayu Ye , Lu Wang , Chuo-Ling Chang , Dana Alon , Erik Nathan Vee , Hongkun Yu , Matthias Grundmann , Shanmugasundaram Ravikumar , Andrew Stephen Tomkins
IPC: G06N3/08
Abstract: A method for a cross-platform distillation framework includes obtaining a plurality of training samples. The method includes generating, using a student neural network model executing on a first processing unit, a first output based on a first training sample. The method also includes generating, using a teacher neural network model executing on a second processing unit, a second output based on the first training sample. The method includes determining, based on the first output and the second output, a first loss. The method further includes adjusting, based on the first loss, one or more parameters of the student neural network model. The method includes repeating the above steps for each training sample of the plurality of training samples.
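The per-sample distillation loop described in this abstract can be pictured with a minimal sketch, assuming PyTorch, a CUDA device available for the teacher, and a KL-divergence loss over softened outputs; the tiny linear models, the CPU/GPU device split, and the hyperparameters are illustrative stand-ins, not the networks or processing units of the claimed method.

```python
# Minimal sketch of the distillation loop (PyTorch). Device assignments, model
# definitions, and the soft-label loss are assumptions, not taken from the patent.
import torch
import torch.nn.functional as F

student = torch.nn.Linear(128, 10).to("cpu")    # student executes on the first processing unit
teacher = torch.nn.Linear(128, 10).to("cuda")   # teacher executes on the second processing unit
teacher.eval()
optimizer = torch.optim.SGD(student.parameters(), lr=1e-3)

def distill_step(sample: torch.Tensor) -> None:
    # First output: student forward pass on the first processing unit.
    student_out = student(sample.to("cpu"))
    # Second output: teacher forward pass on the second processing unit.
    with torch.no_grad():
        teacher_out = teacher(sample.to("cuda")).to("cpu")
    # First loss: compare the two outputs (KL divergence over softened logits here).
    loss = F.kl_div(
        F.log_softmax(student_out, dim=-1),
        F.softmax(teacher_out, dim=-1),
        reduction="batchmean",
    )
    # Adjust the student's parameters based on the loss.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Repeat for each training sample of the plurality of training samples.
for sample in torch.randn(32, 128):
    distill_step(sample.unsqueeze(0))
```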
-
Publication number: US20240232637A9
Publication date: 2024-07-11
Application number: US18491877
Application date: 2023-10-23
Applicant: Google LLC
Inventor: Krishna Pragash Srinivasan , Michael Bendersky , Anupam Samanta , Lingrui Liao , Luca Bertelli , Ming-Wei Chang , Iftekhar Naim , Siddhartha Brahma , Siamak Shakeri , Hongkun Yu , John Nham , Karthik Raman , Raphael Dominik Hoffmann
IPC: G06N3/0895 , G06F16/903 , G06F16/93 , G06N3/0455
CPC classification number: G06N3/0895 , G06F16/90335 , G06F16/93 , G06N3/0455
Abstract: Provided are computing systems, methods, and platforms that train query processing models, such as large language models, to perform query intent classification tasks by using retrieval augmentation and multi-stage distillation. Unlabeled training examples of queries may be obtained, and a set of the training examples may be augmented with additional feature annotations to generate augmented training examples. A first query processing model may annotate the retrieval augmented queries to generate inferred labels for the augmented training examples. A second query processing model may be trained on the inferred labels, distilling the query processing model that was trained with retrieval augmentation into a non-retrieval augmented query processing model. The second query processing model may annotate the entire set of unlabeled training examples. Another stage of distillation may train a third query processing model using the entire set of unlabeled training examples without retrieval augmentation.
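The multi-stage pipeline in this abstract can be outlined with a short sketch; the retrieval function, the stand-in models, the toy trainer, and the half-split of the training examples are hypothetical placeholders, not the large language models or retrieval system of the claimed method.

```python
# Sketch of the multi-stage distillation pipeline: a retrieval-augmented first
# model labels a subset, a second model is distilled from those labels, and a
# third model is trained on the second model's labels for the full unlabeled set.
from typing import Callable, List, Tuple

QueryModel = Callable[[str], str]                 # maps a query to an intent label
Trainer = Callable[[List[Tuple[str, str]]], QueryModel]

def retrieve_annotations(query: str) -> str:
    # Hypothetical retrieval augmentation: append retrieved feature annotations.
    return f"{query} [retrieved annotations for '{query}']"

def run_pipeline(
    unlabeled_queries: List[str],
    first_model_with_retrieval: QueryModel,       # annotates retrieval-augmented queries
    train_model: Trainer,
) -> QueryModel:
    # Stage 1: augment a subset of the unlabeled queries and let the
    # retrieval-augmented first model infer intent labels for them.
    subset = unlabeled_queries[: max(1, len(unlabeled_queries) // 2)]
    inferred = [(q, first_model_with_retrieval(retrieve_annotations(q))) for q in subset]

    # Stage 2: distill into a second model trained on plain queries plus the
    # inferred labels, with no retrieval augmentation.
    second_model = train_model(inferred)

    # Stage 3: the second model annotates the entire unlabeled set, and a third
    # model is trained on those labels, again without retrieval augmentation.
    full_labels = [(q, second_model(q)) for q in unlabeled_queries]
    return train_model(full_labels)

def majority_trainer(examples: List[Tuple[str, str]]) -> QueryModel:
    # Toy "training": always predict the majority label seen in the examples.
    labels = [label for _, label in examples]
    majority = max(set(labels), key=labels.count)
    return lambda query: majority

if __name__ == "__main__":
    queries = ["how to bake bread", "example.com login", "weather tomorrow"]
    first_model = lambda q: "navigational" if "login" in q else "informational"
    third_model = run_pipeline(queries, first_model, majority_trainer)
    print(third_model("new query"))
```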
-
Publication number: US20240135187A1
Publication date: 2024-04-25
Application number: US18491877
Application date: 2023-10-22
Applicant: Google LLC
Inventor: Krishna Pragash Srinivasan , Michael Bendersky , Anupam Samanta , Lingrui Liao , Luca Bertelli , Ming-Wei Chang , Iftekhar Naim , Siddhartha Brahma , Siamak Shakeri , Hongkun Yu , John Nham , Karthik Raman , Raphael Dominik Hoffmann
IPC: G06N3/0895 , G06F16/903 , G06F16/93 , G06N3/0455
CPC classification number: G06N3/0895 , G06F16/90335 , G06F16/93 , G06N3/0455
Abstract: Provided are computing systems, methods, and platforms that train query processing models, such as large language models, to perform query intent classification tasks by using retrieval augmentation and multi-stage distillation. Unlabeled training examples of queries may be obtained, and a set of the training examples may be augmented with additional feature annotations to generate augmented training examples. A first query processing model may annotate the retrieval augmented queries to generate inferred labels for the augmented training examples. A second query processing model may be trained on the inferred labels, distilling the query processing model that was trained with retrieval augmentation into a non-retrieval augmented query processing model. The second query processing model may annotate the entire set of unlabeled training examples. Another stage of distillation may train a third query processing model using the entire set of unlabeled training examples without retrieval augmentation.
-