Patent search ap:("QUALCOMM Incorporated") AND inv:"James Esliger" Page 1

1.

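Granted invention patent
Concurrent optimization of machine learning model performance (in force)

Publication (announcement) number: US11907810B2

Publication (announcement) date: 2024-02-20

Application number: US16515711

Filing date: 2019-07-18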

Applicant: QUALCOMM Incorporated

Inventor: Serag Gadelrab, James Esliger, Meghal Varia, Kyle Ernewein, Alwyn Dos Remedios, George Lee

IPC: G06N20/00, G06F11/34, G06N5/04

CPC classification number: G06N20/00, G06F11/3466, G06N5/04

Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model and performance metric targets for performance of the inferences. At least a first inference is performed on the data set using the machine learning model to meet a latency specified for generation of the first inference from receipt of the request. While performing the at least the first inference, operational parameters resulting in inference performance approaching the performance metric targets are identified based on the machine learning model and operational properties of the computing device. The identified operational parameters are applied to performance of subsequent inferences using the machine learning model.
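
The flow described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the parameter names (`batch_size`, `num_threads`), the toy latency model, and the functions `find_params`, `run_inference`, and `serve` are all hypothetical, chosen only to show a first inference running with default settings while a background search looks for operational parameters that approach a latency target.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical operational parameters; a real system would use device-specific knobs.
DEFAULT_PARAMS = {"batch_size": 1, "num_threads": 1}
CANDIDATES = [
    {"batch_size": b, "num_threads": t}
    for b in (1, 2, 4)
    for t in (1, 2, 4)
]


def estimated_latency_ms(params):
    """Toy cost model standing in for model- and device-specific profiling."""
    return 40.0 / params["num_threads"] + 2.0 * params["batch_size"]


def find_params(target_latency_ms):
    """Pick the candidate whose estimated latency is closest to the target."""
    return min(
        CANDIDATES,
        key=lambda p: abs(estimated_latency_ms(p) - target_latency_ms),
    )


def run_inference(sample, params):
    """Placeholder model: doubles the input and records the parameters used."""
    return {"output": sample * 2, "params": dict(params)}


def serve(data_set, target_latency_ms):
    """Serve a non-empty data set, tuning parameters during the first inference."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the parameter search in the background...
        search = pool.submit(find_params, target_latency_ms)
        # ...while the first inference runs immediately with default parameters.
        results.append(run_inference(data_set[0], DEFAULT_PARAMS))
        tuned = search.result()
    # Subsequent inferences use the identified parameters.
    for sample in data_set[1:]:
        results.append(run_inference(sample, tuned))
    return results


if __name__ == "__main__":
    for result in serve([1, 2, 3], target_latency_ms=14.0):
        print(result)
```

In this sketch the first result always carries the default parameters, so the first inference is not delayed by the search, and every later result carries whatever the search selected.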

2.

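Granted invention patent
Concurrent optimization of machine learning model performance (in force)

Publication (announcement) number: US12182676B2

Publication (announcement) date: 2024-12-31

Application number: US18539022

Filing date: 2023-12-13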

Applicant: QUALCOMM Incorporated

Inventor: Serag Gadelrab, James Esliger, Meghal Varia, Kyle Ernewein, Alwyn Dos Remedios, George Lee

IPC: G06N20/00, G06F11/34, G06N5/04

Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model and performance metric targets for performance of the inferences. At least a first inference is performed on the data set using the machine learning model to meet a latency specified for generation of the first inference from receipt of the request. While performing the at least the first inference, operational parameters resulting in inference performance approaching the performance metric targets are identified based on the machine learning model and operational properties of the computing device. The identified operational parameters are applied to performance of subsequent inferences using the machine learning model.
