-
Publication number: US20210096921A1
Publication date: 2021-04-01
Application number: US16688487
Filing date: 2019-11-19
Applicant: Apple Inc.
Inventor: Kutty Banerjee, Michael Imbrogno
Abstract: A first command is fetched for execution on a GPU. Dependency information for the first command, which indicates a number of parent commands that the first command depends on, is determined. The first command is inserted into an execution graph based on the dependency information. The execution graph defines an order of execution for plural commands including the first command. The number of parent commands are configured to be executed on the GPU before executing the first command. A wait count for the first command, which indicates the number of parent commands of the first command, is determined based on the execution graph. The first command is inserted into cache memory in response to determining that the wait count for the first command is zero or that each of the number of parent commands the first command depends on has already been inserted into the cache memory.
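The wait-count bookkeeping in this abstract can be sketched as a small dependency graph. This is a minimal illustration, not the patented implementation: the `ExecutionGraph` class, its method names, and the use of a plain list as the "cache" are all assumptions made for the example.

```python
from collections import deque


class ExecutionGraph:
    """Hypothetical model of an execution graph with per-command wait counts."""

    def __init__(self):
        self.children = {}    # command -> commands that depend on it
        self.wait_count = {}  # command -> number of parent commands

    def insert(self, command, parents=()):
        # Record the command and one edge per parent it depends on.
        self.children.setdefault(command, [])
        self.wait_count[command] = 0
        for parent in parents:
            self.children.setdefault(parent, []).append(command)
            self.wait_count.setdefault(parent, 0)
            self.wait_count[command] += 1

    def fill_cache(self):
        """Return commands in the order they become eligible for the cache."""
        pending = dict(self.wait_count)  # parents not yet placed in the cache
        ready = deque(c for c, n in pending.items() if n == 0)
        cache = []
        while ready:
            command = ready.popleft()
            cache.append(command)
            # Each cached parent brings its children one step closer to ready.
            for child in self.children[command]:
                pending[child] -= 1
                if pending[child] == 0:
                    ready.append(child)
        return cache


graph = ExecutionGraph()
graph.insert("A")
graph.insert("B", parents=["A"])
graph.insert("C", parents=["A", "B"])
print(graph.wait_count)
print(graph.fill_cache())
```

A command enters the cache only once its wait count is zero or all of its parents are already cached, which is why `C` follows both `A` and `B` here.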
-
Publication number: US10908962B1
Publication date: 2021-02-02
Application number: US15944062
Filing date: 2018-04-03
Applicant: Apple Inc.
Inventor: Francesco Rossi, Kutty Banerjee
Abstract: The embodiments disclosed herein relate to the field of graphics processing and, without limitation, to techniques to enable efficient sharing of a graphics processing unit (GPU) between user interface (UI) graphics operations and intense compute operations. In certain embodiments, intense compute operations, such as long accumulations, are divided into multiple pieces. A scheduler is added to force context switching if an intense compute operation is blocking timely execution of a UI graphics operation. The division of the intense compute operation is tuned so that the GPU compute queue can drain during approximately the same time required to perform a context switch on the GPU.
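The idea of dividing a long accumulation into pieces so that UI work can run at piece boundaries can be sketched as follows. The function names, the `chunk_size` tuning parameter, and the list-based UI queue are hypothetical; the abstract does not specify how the division is tuned beyond matching the context switch time.

```python
def chunked_sum(values, chunk_size):
    """Yield running totals so other work can be scheduled between chunks."""
    total = 0
    for start in range(0, len(values), chunk_size):
        total += sum(values[start:start + chunk_size])
        yield total


def run(compute, ui_pending):
    """Hypothetical scheduler: drain pending UI work at every chunk boundary."""
    log = []
    for partial in compute:
        log.append(("compute", partial))
        while ui_pending:
            log.append(("ui", ui_pending.pop(0)))
    return log


print(run(chunked_sum([1, 2, 3, 4, 5], chunk_size=2), ["draw"]))
```

A smaller `chunk_size` shortens the wait before UI work can run, at the cost of more boundaries to check.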
-
Publication number: US20200379815A1
Publication date: 2020-12-03
Application number: US16795814
Filing date: 2020-02-20
Applicant: Apple Inc.
Inventor: Kutty Banerjee, Michael Imbrogno
Abstract: In general, embodiments are disclosed herein for tracking and allocating graphics hardware resources. In one embodiment, a software and/or firmware process constructs a cross-application command queue utilization table based on one or more specified command queue quality of service (QoS) settings, in order to track the target and current utilization rates of each command queue on the graphics hardware over a given frame and to load work onto the graphics hardware in accordance with the utilization table. Based on the constructed utilization table for a given frame, any command queues that have exceeded their respective target utilization value may be moved to an “inactive” status for the duration of the current frame. For any command queues that remain in an “active” status for the current frame, work from those command queues may be loaded onto slots of the appropriate data masters of the graphics hardware in any desired order.
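A per-frame utilization table of the kind described here might look like the sketch below. The `QueueEntry` fields, the function names, and the example utilization values are assumptions for illustration; the abstract does not define the table layout.

```python
from dataclasses import dataclass


@dataclass
class QueueEntry:
    target: float          # target utilization for the frame, from the QoS setting
    current: float = 0.0   # utilization accumulated so far in this frame
    active: bool = True


def update_table(table):
    """Deactivate queues over their target; return the names still active."""
    for entry in table.values():
        entry.active = entry.current <= entry.target
    return [name for name, entry in table.items() if entry.active]


def start_frame(table):
    """Reset utilization and reactivate every queue at a frame boundary."""
    for entry in table.values():
        entry.current = 0.0
        entry.active = True


table = {"ui": QueueEntry(target=0.5, current=0.2),
         "ml": QueueEntry(target=0.3, current=0.4)}
print(update_table(table))
```

Work would then be loaded only from the queues returned by `update_table`, in any order, until `start_frame` resets the table.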
-
Publication number: US10853907B2
Publication date: 2020-12-01
Application number: US16511742
Filing date: 2019-07-15
Applicant: Apple Inc.
Inventor: Tatsuya Iwamoto, Kutty Banerjee, Rohan Sanjeev Patil
Abstract: Systems, methods, and computer readable media to improve task switching operations in a graphics processing unit (GPU) are described. As disclosed herein, the clock rate (and voltages) of a GPU's operating environment may be altered so that a low priority task may be rapidly run to a task switch boundary (or completion) so that a higher priority task may begin execution. In some embodiments, only the GPU's operating clock (and voltage) is increased during the task switch operation. In other embodiments, the clock rate (voltages) of supporting components may also be increased. For example, the operating clock for the GPU's supporting memory, memory controller or memory fabric may also be increased. Once the lower priority task has been swapped out, one or more of the clocks (and voltages) increased during the switch operation could be subsequently decreased, though not necessarily to their pre-switch rates.
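The effect of raising the clock while a low priority task runs to its switch boundary can be shown with simple arithmetic. The `GpuClock` class, the `switch_task` function, and the clock figures below are hypothetical; a clock of 1 MHz is one cycle per microsecond, so cycles divided by MHz gives microseconds.

```python
class GpuClock:
    """Hypothetical clock model with a base rate and a boosted rate."""

    def __init__(self, base_mhz, boost_mhz):
        self.base_mhz = base_mhz
        self.boost_mhz = boost_mhz
        self.current_mhz = base_mhz


def switch_task(clock, cycles_to_boundary, post_switch_mhz=None):
    """Boost the clock, run to the switch boundary, then lower the clock.

    Returns the time in microseconds spent reaching the boundary.
    """
    clock.current_mhz = clock.boost_mhz
    drain_us = cycles_to_boundary / clock.current_mhz
    # The post-switch rate need not equal the pre-switch rate.
    if post_switch_mhz is None:
        post_switch_mhz = clock.base_mhz
    clock.current_mhz = post_switch_mhz
    return drain_us


clock = GpuClock(base_mhz=400, boost_mhz=800)
print(switch_task(clock, cycles_to_boundary=400_000))
```

With these example numbers the boundary is reached in half the time it would take at the base rate, so the higher priority task starts sooner.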
-
Publication number: US10373287B2
Publication date: 2019-08-06
Application number: US15680885
Filing date: 2017-08-18
Applicant: Apple Inc.
Inventor: Tatsuya Iwamoto, Kutty Banerjee, Rohan Sanjeev Patil
Abstract: Systems, methods, and computer readable media to improve task switching operations in a graphics processing unit (GPU) are described. As disclosed herein, the clock rate (and voltages) of a GPU's operating environment may be altered so that a low priority task may be rapidly run to a task switch boundary (or completion) so that a higher priority task may begin execution. In some embodiments, only the GPU's operating clock (and voltage) is increased during the task switch operation. In other embodiments, the clock rate (voltages) of supporting components may also be increased. For example, the operating clock for the GPU's supporting memory, memory controller or memory fabric may also be increased. Once the lower priority task has been swapped out, one or more of the clocks (and voltages) increased during the switch operation could be subsequently decreased, though not necessarily to their pre-switch rates.
-
Publication number: US20190057484A1
Publication date: 2019-02-21
Application number: US15680885
Filing date: 2017-08-18
Applicant: Apple Inc.
Inventor: Tatsuya Iwamoto, Kutty Banerjee, Rohan Sanjeev Patil
CPC classification number: G06T1/20, G06F9/485, G06F9/4881
Abstract: Systems, methods, and computer readable media to improve task switching operations in a graphics processing unit (GPU) are described. As disclosed herein, the clock rate (and voltages) of a GPU's operating environment may be altered so that a low priority task may be rapidly run to a task switch boundary (or completion) so that a higher priority task may begin execution. In some embodiments, only the GPU's operating clock (and voltage) is increased during the task switch operation. In other embodiments, the clock rate (voltages) of supporting components may also be increased. For example, the operating clock for the GPU's supporting memory, memory controller or memory fabric may also be increased. Once the lower priority task has been swapped out, one or more of the clocks (and voltages) increased during the switch operation could be subsequently decreased, though not necessarily to their pre-switch rates.
-
Publication number: US20180173560A1
Publication date: 2018-06-21
Application number: US15386570
Filing date: 2016-12-21
Applicant: Apple Inc.
Inventor: Gokhan Avkarogullari, Terence M. Potter, Benjiman L. Goodman, Ralph C. Taylor, Kutty Banerjee
CPC classification number: G06F9/4818, G06F9/505, G06F2209/5021
Abstract: In various embodiments, hardware resources of a processing circuit may be allocated to a plurality of processes based on priorities of the processes. A hardware resource utilization sensor may detect a current utilization of the hardware resources by a process. A utilization accumulation circuit may determine a utilization of the hardware resources by the process over a particular amount of time. A target utilization of the hardware resources for the process may be determined based on the utilization of the hardware resources over the particular amount of time. A comparator circuit may compare the current utilization to the target utilization. A process priority adjustment circuit may adjust a priority of the process based on the comparison. Based on the adjusted priority, a different amount of hardware resources may be allocated to the processes.
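The abstract describes hardware circuits, but the feedback loop can be modeled in software as a rough sketch. Everything here is an assumption: the function names, the averaging window, the `fair_share` parameter, and in particular the rule used to derive the target, which the abstract leaves unspecified.

```python
def accumulated_utilization(samples):
    """Average utilization over a window of samples (the accumulation step)."""
    return sum(samples) / len(samples)


def target_utilization(accumulated, fair_share):
    """Assumed rule: aim above the fair share after under-use, below it after over-use."""
    return min(1.0, max(0.0, 2 * fair_share - accumulated))


def adjust_priority(priority, current, target, step=1):
    """Comparator plus adjustment: lower priority when over target, raise when under."""
    if current > target:
        return priority - step
    if current < target:
        return priority + step
    return priority


history = [0.5, 0.75, 0.25]
target = target_utilization(accumulated_utilization(history), fair_share=0.25)
print(adjust_priority(priority=5, current=0.5, target=target))
```

A process that has used more than its share over the window gets a lower target, so its priority drops and fewer hardware resources are allocated to it.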