Execution Graph Acceleration
    21.
    Invention Application

    Publication Number: US20210096921A1

    Publication Date: 2021-04-01

    Application Number: US16688487

    Application Date: 2019-11-19

    Applicant: Apple Inc.

    Abstract: A first command is fetched for execution on a GPU. Dependency information for the first command, which indicates a number of parent commands that the first command depends on, is determined. The first command is inserted into an execution graph based on the dependency information. The execution graph defines an order of execution for plural commands including the first command. The number of parent commands are configured to be executed on the GPU before executing the first command. A wait count for the first command, which indicates the number of parent commands of the first command, is determined based on the execution graph. The first command is inserted into cache memory in response to determining that the wait count for the first command is zero or that each of the number of parent commands the first command depends on has already been inserted into the cache memory.
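
    A minimal C++ sketch of the wait-count bookkeeping this abstract describes, assuming a simple in-memory graph: each inserted command records how many parents it has, commands with a zero wait count go straight to a ready cache, and completing a command decrements its children's counts. The Command, ExecutionGraph, and readyCache_ names are illustrative, not taken from the patent.

    // Illustrative sketch of wait-count based execution-graph scheduling.
    // Command, ExecutionGraph and the "ready cache" below are hypothetical names.
    #include <cstdint>
    #include <iostream>
    #include <queue>
    #include <unordered_map>
    #include <vector>

    struct Command {
        uint32_t id;
        uint32_t waitCount = 0;              // number of unfinished parent commands
        std::vector<uint32_t> children;      // commands that depend on this one
    };

    class ExecutionGraph {
    public:
        // Insert a command whose dependency information lists its parents.
        void insert(uint32_t id, const std::vector<uint32_t>& parents) {
            Command& cmd = commands_[id];
            cmd.id = id;
            cmd.waitCount = static_cast<uint32_t>(parents.size());
            for (uint32_t p : parents) commands_[p].children.push_back(id);
            if (cmd.waitCount == 0) readyCache_.push(id);   // zero wait count: ready immediately
        }

        // Called when the GPU finishes a command; children whose wait count
        // drops to zero become eligible for the ready cache.
        void complete(uint32_t id) {
            for (uint32_t c : commands_[id].children) {
                if (--commands_[c].waitCount == 0) readyCache_.push(c);
            }
        }

        bool popReady(uint32_t& id) {
            if (readyCache_.empty()) return false;
            id = readyCache_.front();
            readyCache_.pop();
            return true;
        }

    private:
        std::unordered_map<uint32_t, Command> commands_;
        std::queue<uint32_t> readyCache_;    // stands in for the cache memory in the abstract
    };

    int main() {
        ExecutionGraph g;
        g.insert(1, {});        // no parents: wait count 0, cached immediately
        g.insert(2, {1});       // depends on 1
        g.insert(3, {1, 2});    // depends on 1 and 2

        uint32_t id;
        while (g.popReady(id)) {
            std::cout << "execute command " << id << "\n";
            g.complete(id);     // finishing 1 releases 2, then 2 releases 3
        }
    }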

    System and method to share GPU resources
    22.
    Invention Grant

    Publication Number: US10908962B1

    Publication Date: 2021-02-02

    Application Number: US15944062

    Application Date: 2018-04-03

    Applicant: Apple Inc.

    Abstract: The embodiments disclosed herein relate to the field of graphics processing and, without limitation, to techniques to enable efficient sharing of a graphics processing unit (GPU) between user interface (UI) graphics operations and intense compute operations. In certain embodiments, intense compute operations, such as long accumulations, are divided into multiple pieces. A scheduler is added to force context switching if an intense compute operation is blocking timely execution of a UI graphics operation. The division of the intense compute operation is tuned so that the GPU compute queue can drain during approximately the same time required to perform a context switch on the GPU.
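
    A host-side C++ sketch of the chunking idea, assuming a fixed per-element cost and a fixed context-switch cost; the constants and the submitChunk stand-in are hypothetical. The point is only that the chunk size is chosen so one chunk drains in roughly the time a GPU context switch would take.

    // Illustrative sketch of dividing a long GPU accumulation into chunks
    // sized to drain in roughly one context-switch interval.
    // kContextSwitchUs, kItemCostNs, and submitChunk are hypothetical names.
    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <vector>

    constexpr uint64_t kContextSwitchUs = 500;   // assumed GPU context-switch cost
    constexpr uint64_t kItemCostNs      = 50;    // assumed cost per accumulated element

    // Pick a chunk size whose execution time is close to one context-switch cost,
    // so the compute queue can drain while a switch would be happening anyway.
    uint64_t chunkSize() {
        return (kContextSwitchUs * 1000) / kItemCostNs;
    }

    // Stand-in for enqueuing one chunk of the accumulation on the GPU.
    double submitChunk(const std::vector<double>& data, uint64_t begin, uint64_t end) {
        double partial = 0.0;
        for (uint64_t i = begin; i < end; ++i) partial += data[i];
        return partial;
    }

    int main() {
        std::vector<double> data(1'000'000, 1.0);
        const uint64_t chunk = chunkSize();
        double total = 0.0;

        for (uint64_t begin = 0; begin < data.size(); begin += chunk) {
            uint64_t end = std::min<uint64_t>(begin + chunk, data.size());
            total += submitChunk(data, begin, end);
            // A scheduler would check here whether UI work is pending and, if so,
            // let the GPU context-switch before the next chunk is enqueued.
        }
        std::cout << "sum = " << total << " using chunks of " << chunk << " items\n";
    }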

    Graphics Hardware Priority Scheduling
    23.
    Invention Application

    Publication Number: US20200379815A1

    Publication Date: 2020-12-03

    Application Number: US16795814

    Application Date: 2020-02-20

    Applicant: Apple Inc.

    Abstract: In general, embodiments are disclosed herein for tracking and allocating graphics hardware resources. In one embodiment, a software and/or firmware process constructs a cross-application command queue utilization table based on one or more specified command queue quality of service (QoS) settings, in order to track the target and current utilization rates of each command queue on the graphics hardware over a given frame and to load work onto the graphics hardware in accordance with the utilization table. Based on the constructed utilization table for a given frame, any command queues that have exceeded their respective target utilization values may be moved to an “inactive” status for the duration of the current frame. For any command queues that remain in an “active” status for the current frame, work from those command queues may be loaded onto slots of the appropriate data masters of the graphics hardware in any desired order.
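
    An illustrative C++ sketch of the per-frame utilization table, with made-up QoS target shares; accountWork parks any queue that exceeds its target until the next frame, mirroring the "inactive" status described above. All names and numbers are hypothetical.

    // Illustrative sketch of a per-frame command-queue utilization table.
    // QueueEntry, targetShare and the active/inactive flag are hypothetical names.
    #include <iostream>
    #include <vector>

    struct QueueEntry {
        const char* name;
        double targetShare;      // QoS-derived target fraction of the frame
        double usedShare = 0.0;  // utilization accumulated so far this frame
        bool   active    = true; // inactive queues are skipped until the next frame
    };

    void accountWork(QueueEntry& q, double share) {
        q.usedShare += share;
        if (q.usedShare > q.targetShare) q.active = false;  // over budget: park it
    }

    void beginFrame(std::vector<QueueEntry>& table) {
        for (auto& q : table) { q.usedShare = 0.0; q.active = true; }
    }

    int main() {
        std::vector<QueueEntry> table = {
            {"ui",      0.40},
            {"compute", 0.50},
            {"video",   0.10},
        };

        beginFrame(table);
        accountWork(table[1], 0.55);   // compute queue exceeds its 0.50 target

        for (const auto& q : table) {
            if (q.active)
                std::cout << q.name << " stays schedulable this frame\n";
            else
                std::cout << q.name << " is inactive for the rest of the frame\n";
        }
    }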

    Fast GPU context switch
    24.
    Invention Grant

    Publication Number: US10853907B2

    Publication Date: 2020-12-01

    Application Number: US16511742

    Application Date: 2019-07-15

    Applicant: Apple Inc.

    Abstract: Systems, methods, and computer readable media to improve task switching operations in a graphics processing unit (GPU) are described. As disclosed herein, the clock rate (and voltages) of a GPU's operating environment may be altered so that a low priority task may be rapidly run to a task switch boundary (or completion) so that a higher priority task may begin execution. In some embodiments, only the GPU's operating clock (and voltage) is increased during the task switch operation. In other embodiments, the clock rate (voltages) of supporting components may also be increased. For example, the operating clock for the GPU's supporting memory, memory controller or memory fabric may also be increased. Once the lower priority task has been swapped out, one or more of the clocks (and voltages) increased during the switch operation could be subsequently decreased, though not necessarily to their pre-switch rates.
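
    An illustrative C++ sketch of the clock-boost sequence shared by this patent family (this entry and the two that follow): raise the GPU and supporting memory clocks, run the low-priority task to its switch boundary, swap in the high-priority task, then relax the clocks, not necessarily back to the pre-switch rates. The setClocks call and the frequencies are placeholders, not a real driver API.

    // Illustrative sketch of boosting clocks to reach a task-switch boundary fast.
    // ClockState, setClocks, and the rates used are hypothetical.
    #include <iostream>

    struct ClockState { int gpuMHz; int memMHz; };

    // Stand-in for the driver/firmware call that reprograms clocks (and voltages).
    void setClocks(const ClockState& c) {
        std::cout << "clocks: gpu=" << c.gpuMHz << " MHz, mem=" << c.memMHz << " MHz\n";
    }

    void fastContextSwitch() {
        const ClockState boosted{1300, 2100};   // raise GPU and supporting memory clocks
        const ClockState relaxed{1000, 1600};   // post-switch rates, not necessarily the old ones

        setClocks(boosted);
        std::cout << "running low-priority task to its switch boundary...\n";
        std::cout << "swapping in high-priority task\n";
        setClocks(relaxed);
    }

    int main() { fastContextSwitch(); }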

    Fast GPU context switch
    25.
    Invention Grant

    Publication Number: US10373287B2

    Publication Date: 2019-08-06

    Application Number: US15680885

    Application Date: 2017-08-18

    Applicant: Apple Inc.

    Abstract: Systems, methods, and computer readable media to improve task switching operations in a graphics processing unit (GPU) are described. As disclosed herein, the clock rate (and voltages) of a GPU's operating environment may be altered so that a low priority task may be rapidly run to a task switch boundary (or completion) so that a higher priority task may begin execution. In some embodiments, only the GPU's operating clock (and voltage) is increased during the task switch operation. In other embodiments, the clock rate (voltages) of supporting components may also be increased. For example, the operating clock for the GPU's supporting memory, memory controller or memory fabric may also be increased. Once the lower priority task has been swapped out, one or more of the clocks (and voltages) increased during the switch operation could be subsequently decreased, though not necessarily to their pre-switch rates.

    Fast GPU Context Switch
    26.
    Invention Application

    Publication Number: US20190057484A1

    Publication Date: 2019-02-21

    Application Number: US15680885

    Application Date: 2017-08-18

    Applicant: Apple Inc.

    CPC classification number: G06T1/20 G06F9/485 G06F9/4881

    Abstract: Systems, methods, and computer readable media to improve task switching operations in a graphics processing unit (GPU) are described. As disclosed herein, the clock rate (and voltages) of a GPU's operating environment may be altered so that a low priority task may be rapidly run to a task switch boundary (or completion) so that a higher priority task may begin execution. In some embodiments, only the GPU's operating clock (and voltage) is increased during the task switch operation. In other embodiments, the clock rate (voltages) of supporting components may also be increased. For example, the operating clock for the GPU's supporting memory, memory controller or memory fabric may also be increased. Once the lower priority task has been swapped out, one or more of the clocks (and voltages) increased during the switch operation could be subsequently decreased, though not necessarily to their pre-switch rates.

    Processing circuit hardware resource allocation system
    27.
    Invention Application

    Publication Number: US20180173560A1

    Publication Date: 2018-06-21

    Application Number: US15386570

    Application Date: 2016-12-21

    Applicant: Apple Inc.

    CPC classification number: G06F9/4818 G06F9/505 G06F2209/5021

    Abstract: In various embodiments, hardware resources of a processing circuit may be allocated to a plurality of processes based on priorities of the processes. A hardware resource utilization sensor may detect a current utilization of the hardware resources by a process. A utilization accumulation circuit may determine a utilization of the hardware resources by the process over a particular amount of time. A target utilization of the hardware resources for the process may be determined based on the utilization of the hardware resources over the particular amount of time. A comparator circuit may compare the current utilization to the target utilization. A process priority adjustment circuit may adjust a priority of the process based on the comparison. Based on the adjusted priority, a different amount of hardware resources may be allocated to the processes.
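
    An illustrative C++ sketch of the feedback loop in this abstract: a sensed utilization sample is accumulated over a window, a target is derived from the accumulated history (simplified here to a running average), and the comparison of current versus target nudges the process priority up or down. The field names, window length, and adjustment step are all hypothetical.

    // Illustrative sketch of the priority-adjustment feedback loop described above.
    // ProcessState and the single-step priority adjustment are hypothetical.
    #include <array>
    #include <iostream>

    struct ProcessState {
        const char* name;
        double currentUtil = 0.0;  // latest reading from the utilization sensor
        double accumulated = 0.0;  // utilization accumulated over the window
        int    samples     = 0;    // readings accumulated so far
        double target      = 0.0;  // target derived from the accumulated history
        int    priority    = 10;   // adjusted up or down after each comparison
    };

    void updateProcess(ProcessState& p, double sample) {
        p.currentUtil = sample;                    // hardware utilization sensor
        p.accumulated += sample;                   // utilization accumulation circuit
        p.samples += 1;
        p.target = p.accumulated / p.samples;      // target from the history so far
        // Comparator + priority adjustment: running hot lowers priority, running cold raises it.
        if (p.currentUtil > p.target)      p.priority -= 1;
        else if (p.currentUtil < p.target) p.priority += 1;
    }

    int main() {
        std::array<ProcessState, 2> procs = {{ {"render"}, {"compute"} }};
        const double samples[2][4] = {{0.2, 0.3, 0.8, 0.9},   // render ramps up
                                      {0.7, 0.6, 0.2, 0.1}};  // compute winds down

        for (int t = 0; t < 4; ++t) {
            updateProcess(procs[0], samples[0][t]);
            updateProcess(procs[1], samples[1][t]);
        }
        for (const auto& p : procs)
            std::cout << p.name << " priority -> " << p.priority << "\n";
    }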
