Concurrent optimization of machine learning model performance

    Publication Number: US12182676B2

    Publication Date: 2024-12-31

    Application Number: US18539022

    Application Date: 2023-12-13

    Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model, along with performance metric targets for those inferences. At least a first inference is performed on the data set using the machine learning model so as to meet a latency, measured from receipt of the request, specified for generating the first inference. While performing at least the first inference, operational parameters that bring inference performance toward the performance metric targets are identified based on the machine learning model and the operational properties of the computing device. The identified operational parameters are then applied to subsequent inferences using the machine learning model.
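
    As a rough, non-authoritative sketch of the idea in this abstract (the run_inference helper, the candidate parameter grid, and the ConcurrentOptimizer class are illustrative assumptions, not taken from the patent), a serving loop in Python might answer the first request with default operational parameters while a background thread benchmarks candidate parameters and promotes the best one for subsequent inferences:

        import threading
        import time
        from itertools import product

        # Hypothetical stand-in for executing the model; a real runtime would
        # dispatch work to hardware using the given operational parameters.
        def run_inference(model, sample, params):
            simulated_latency_ms = 1.0 * params["batch_size"] / params["num_threads"]
            time.sleep(simulated_latency_ms / 1000.0)
            return {"output": None, "latency_ms": simulated_latency_ms}

        class ConcurrentOptimizer:
            """Serve inferences with the current parameters while a background
            search looks for parameters that approach the performance targets."""

            def __init__(self, model, targets, default_params):
                self.model = model
                self.targets = targets            # e.g. {"latency_ms": 2.0}
                self.params = default_params      # used until better ones are found
                self._lock = threading.Lock()

            def infer(self, sample):
                with self._lock:
                    params = dict(self.params)
                return run_inference(self.model, sample, params)

            def _search(self, calibration_sample, candidates):
                best = None
                for cand in candidates:
                    latency = run_inference(self.model, calibration_sample, cand)["latency_ms"]
                    if latency <= self.targets["latency_ms"] and (best is None or latency < best[0]):
                        best = (latency, cand)
                if best is not None:
                    with self._lock:
                        self.params = best[1]     # applied to subsequent inferences

            def start_search(self, calibration_sample):
                candidates = [{"num_threads": t, "batch_size": b}
                              for t, b in product([1, 2, 4], [1, 4])]
                threading.Thread(target=self._search,
                                 args=(calibration_sample, candidates),
                                 daemon=True).start()

    In this sketch the lock keeps parameter updates atomic with respect to in-flight requests; a real runtime would also need to account for warm-up effects and hardware-specific constraints when choosing candidates.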

    Concurrent optimization of machine learning model performance

    Publication Number: US11907810B2

    Publication Date: 2024-02-20

    Application Number: US16515711

    Application Date: 2019-07-18

    CPC classification numbers: G06N20/00; G06F11/3466; G06N5/04

    Abstract: Certain aspects of the present disclosure provide techniques for concurrently performing inferences using a machine learning model and optimizing parameters used in executing the machine learning model. An example method generally includes receiving a request to perform inferences on a data set using the machine learning model, along with performance metric targets for those inferences. At least a first inference is performed on the data set using the machine learning model so as to meet a latency, measured from receipt of the request, specified for generating the first inference. While performing at least the first inference, operational parameters that bring inference performance toward the performance metric targets are identified based on the machine learning model and the operational properties of the computing device. The identified operational parameters are then applied to subsequent inferences using the machine learning model.

    System and method for dynamic control of shared memory management resources

    Publication Number: US10067691B1

    Publication Date: 2018-09-04

    Application Number: US15448095

    Application Date: 2017-03-02

    Abstract: A method and system for dynamic control of shared memory resources within a portable computing device (“PCD”) are disclosed. A limit request of an unacceptable deadline miss (“UDM”) engine of the portable computing device may be determined with a limit request sensor within the UDM engine. Next, a memory management unit modifies a shared memory resource arbitration policy in view of the limit request. By modifying the shared memory resource arbitration policy, the memory management unit may intelligently allocate resources to service translation requests that are queued separately according to whether they emanated from a flooding engine or a non-flooding engine.
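
    As a rough sketch of the arbitration idea (the MemoryManagementUnit class, the per-cycle slot model, and the severity scale are assumptions made for illustration, not details from the patent), translation requests can be queued separately by originating engine type, with a limit request from a UDM engine shifting service slots away from the flooding-engine queue:

        from collections import deque

        class MemoryManagementUnit:
            """Illustrative arbiter: translation requests are queued separately
            depending on whether they came from a flooding or non-flooding
            engine, and a limit request from a UDM engine throttles the
            flooding queue."""

            def __init__(self, slots_per_cycle=4):
                self.slots_per_cycle = slots_per_cycle
                self.flooding_q = deque()
                self.non_flooding_q = deque()
                self.flooding_share = 0.5   # default policy: split slots evenly

            def enqueue(self, request, from_flooding_engine):
                queue = self.flooding_q if from_flooding_engine else self.non_flooding_q
                queue.append(request)

            def apply_limit_request(self, severity):
                # severity in [0, 1]: the closer the UDM engine is to missing a
                # deadline, the more the flooding engines are throttled.
                self.flooding_share = max(0.0, 0.5 * (1.0 - severity))

            def arbitrate_cycle(self):
                flooding_slots = int(self.slots_per_cycle * self.flooding_share)
                non_flooding_slots = self.slots_per_cycle - flooding_slots
                served = []
                for _ in range(non_flooding_slots):
                    if self.non_flooding_q:
                        served.append(self.non_flooding_q.popleft())
                for _ in range(flooding_slots):
                    if self.flooding_q:
                        served.append(self.flooding_q.popleft())
                return served

    Driving apply_limit_request from the limit request sensor lets the arbitration policy tighten only while a deadline is actually at risk, instead of permanently starving the flooding engines.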
