Kickslot Manager Circuitry for Graphics Processors

    Publication Number: US20230048951A1

    Publication Date: 2023-02-16

    Application Number: US17399808

    Filing Date: 2021-08-11

    Applicant: Apple Inc.

    Abstract: Disclosed embodiments relate to controlling sets of graphics work (e.g., kicks) assigned to graphics processor circuitry. In some embodiments, tracking slot circuitry implements entries for multiple tracking slots. Slot manager circuitry may store, using an entry of the tracking slot circuitry, software-specified information for a set of graphics work, where the information includes: type of work, dependencies on other sets of graphics work, and location of data for the set of graphics work. The slot manager circuitry may prefetch, from the location and prior to allocating shader core resources for the set of graphics work, configuration register data for the set of graphics work. Control circuitry may program configuration registers for the set of graphics work using the prefetched data and initiate processing of the set of graphics work by the graphics processor circuitry according to the dependencies. Disclosed techniques may reduce kick-to-kick transition time, in some embodiments.
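
    The abstract above describes a slot-based flow: record a kick's software-specified metadata in a tracking slot, prefetch its configuration-register data from the indicated location before shader resources are held, then program the registers and launch once its dependencies have completed. The C++ sketch below models that flow at a very high level; the slot count, dependency bitmask, and class and field names are illustrative assumptions, not the disclosed circuitry.

```cpp
// Minimal sketch of tracking-slot bookkeeping, assuming each "kick" occupies
// one slot, dependencies are a bitmask over other slots, and configuration-
// register data is prefetched before launch. All names are illustrative.
#include <array>
#include <cstdint>
#include <iostream>
#include <optional>
#include <vector>

enum class WorkType { Vertex, Pixel, Compute };

struct TrackSlot {
    WorkType type;
    uint64_t depends_on;                    // bitmask of slot IDs this kick waits on
    uint64_t config_addr;                   // location of configuration-register data
    std::vector<uint32_t> prefetched_regs;  // filled by the prefetch step
    bool complete;
};

class SlotManager {
    std::array<std::optional<TrackSlot>, 64> slots_;
public:
    // Software stores per-kick information into a free tracking slot.
    int register_kick(WorkType t, uint64_t deps, uint64_t cfg_addr) {
        for (size_t i = 0; i < slots_.size(); ++i) {
            if (!slots_[i]) {
                slots_[i] = TrackSlot{t, deps, cfg_addr, {}, false};
                return static_cast<int>(i);
            }
        }
        return -1;  // no free slot
    }

    // Prefetch configuration-register data before shader resources are held,
    // so register programming does not sit on the kick-to-kick critical path.
    void prefetch(int slot, const std::vector<uint32_t>& memory_image) {
        slots_[slot]->prefetched_regs = memory_image;  // stand-in for a memory read
    }

    bool dependencies_met(int slot) const {
        uint64_t deps = slots_[slot]->depends_on;
        for (int i = 0; i < 64; ++i)
            if (((deps >> i) & 1) && (!slots_[i] || !slots_[i]->complete))
                return false;
        return true;
    }

    // Program registers from the prefetched copy and start the kick.
    void launch_if_ready(int slot) {
        if (!dependencies_met(slot)) return;
        for (uint32_t reg : slots_[slot]->prefetched_regs)
            std::cout << "program reg 0x" << std::hex << reg << std::dec << "\n";
        slots_[slot]->complete = true;  // placeholder for actual execution
    }
};

int main() {
    SlotManager mgr;
    int a = mgr.register_kick(WorkType::Compute, /*deps=*/0, 0x1000);
    int b = mgr.register_kick(WorkType::Pixel, /*deps=*/1ull << a, 0x2000);
    mgr.prefetch(a, {0xA0, 0xA1});
    mgr.prefetch(b, {0xB0});
    mgr.launch_if_ready(b);  // blocked: depends on slot a
    mgr.launch_if_ready(a);
    mgr.launch_if_ready(b);  // dependency satisfied, now runs
}
```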

    Affinity-based Graphics Scheduling

    Publication Number: US20230047481A1

    Publication Date: 2023-02-16

    Application Number: US17399784

    Filing Date: 2021-08-11

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to affinity-based scheduling of graphics work. In disclosed embodiments, first and second groups of graphics processor sub-units may share respective first and second caches. Distribution circuitry may receive a software-specified set of graphics work and a software-indicated mapping of portions of the set of graphics work to groups of graphics processor sub-units. The distribution circuitry may assign subsets of the set of graphics work based on the mapping. This may improve cache efficiency, in some embodiments, by allowing graphics work that accesses the same memory areas to be assigned to the same group of sub-units that share a cache.
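
    As a rough illustration of the affinity mapping described above, the sketch below assigns work portions to sub-unit groups according to a software-supplied portion-to-group map, so portions that touch the same memory region land on the same shared cache. The names, the fallback policy, and the data layout are assumptions for illustration, not the disclosed circuitry.

```cpp
// Minimal sketch of affinity-based distribution, assuming a set of graphics
// work split into portions and a software-indicated portion->group mapping.
// Each group of sub-units is modeled as sharing one cache.
#include <cstdint>
#include <iostream>
#include <map>
#include <vector>

struct WorkPortion { uint32_t id; uint64_t memory_region; };

struct SubUnitGroup {
    uint32_t group_id;                 // sub-units in this group share a cache
    std::vector<WorkPortion> assigned;
};

// Distribution logic: honor the software-indicated affinity map so that
// portions accessing the same memory region go to the same shared cache.
void distribute(const std::vector<WorkPortion>& work,
                const std::map<uint32_t, uint32_t>& affinity_map,  // portion id -> group id
                std::vector<SubUnitGroup>& groups) {
    for (const auto& p : work) {
        auto it = affinity_map.find(p.id);
        uint32_t g = (it != affinity_map.end())
                         ? it->second
                         : static_cast<uint32_t>(p.id % groups.size());  // fallback: round-robin
        groups[g].assigned.push_back(p);
    }
}

int main() {
    std::vector<SubUnitGroup> groups = {{0, {}}, {1, {}}};
    std::vector<WorkPortion> work = {{0, 0x1000}, {1, 0x1000}, {2, 0x8000}};
    // Portions 0 and 1 touch the same region, so software maps both to group 0.
    std::map<uint32_t, uint32_t> affinity = {{0, 0}, {1, 0}, {2, 1}};
    distribute(work, affinity, groups);
    for (const auto& g : groups)
        std::cout << "group " << g.group_id << " got "
                  << g.assigned.size() << " portions\n";
}
```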

    Graphics hardware driven pause for quality of service adjustment

    Publication Number: US10795730B2

    Publication Date: 2020-10-06

    Application Number: US16145573

    Filing Date: 2018-09-28

    Applicant: Apple Inc.

    Abstract: In general, embodiments are disclosed for tracking and allocating graphics processor hardware resources. More particularly, a graphics hardware resource allocation system is able to generate a priority list for a plurality of data masters for a graphics processor based on a comparison between current utilizations for the data masters and target utilizations for the data masters. The graphics hardware resource allocation system designates, based on the priority list, a first data master with a higher priority to submit work to the graphics processor compared to a second data master. The graphics hardware resource allocation system determines a stall counter value for the second data master and generates a notification to pause work for the second data master based on the stall counter value.
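
    A minimal sketch of the two decisions described above: ordering data masters by how far their current utilization falls below their target, and issuing a pause notification for a lower-priority master once its stall counter crosses a threshold. The data structures, the ordering rule, and the threshold value are illustrative assumptions, not the patented implementation.

```cpp
// Sketch of utilization-vs-target priority ordering and a stall-counter
// driven pause decision. Threshold and field names are assumptions.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct DataMaster {
    std::string name;
    double current_util;     // fraction of the GPU currently used by this master
    double target_util;      // quality-of-service target
    uint32_t stall_counter;  // how long this master has been stalled
};

// Masters running furthest below their target get priority to submit work.
std::vector<DataMaster*> build_priority_list(std::vector<DataMaster>& masters) {
    std::vector<DataMaster*> order;
    for (auto& m : masters) order.push_back(&m);
    std::sort(order.begin(), order.end(), [](const DataMaster* a, const DataMaster* b) {
        return (a->current_util - a->target_util) < (b->current_util - b->target_util);
    });
    return order;
}

// A lower-priority master whose stall counter crosses a threshold receives a
// pause notification (the threshold here is an arbitrary illustrative value).
bool should_pause(const DataMaster& lower_priority, uint32_t stall_threshold = 1000) {
    return lower_priority.stall_counter >= stall_threshold;
}

int main() {
    std::vector<DataMaster> masters = {
        {"render",  0.70, 0.50, 0},   // over its target
        {"compute", 0.10, 0.40, 0},   // under its target
    };
    auto order = build_priority_list(masters);
    std::cout << "highest priority: " << order.front()->name << "\n";  // compute
    masters[0].stall_counter = 1500;
    if (should_pause(masters[0]))
        std::cout << "pause notification sent to " << masters[0].name << "\n";
}
```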

    Distributed Compute Work Parser Circuitry using Communications Fabric

    Publication Number: US20200098160A1

    Publication Date: 2020-03-26

    Application Number: US16143412

    Filing Date: 2018-09-26

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to distributing work from compute kernels using a distributed hierarchical parser architecture. In some embodiments, an apparatus includes a plurality of shader units configured to perform operations for compute workgroups included in compute kernels processed by the apparatus, a plurality of distributed workload parser circuits, and a communications fabric connected to the plurality of distributed workload parser circuits and a master workload parser circuit. In some embodiments, the master workload parser circuit is configured to iteratively determine a next position in multiple dimensions for a next batch of workgroups from the kernel and send batch information to the distributed workload parser circuits via the communications fabric to assign the batch of workgroups. In some embodiments, the distributed parsers maintain coordinate information for the kernel and update the coordinate information in response to the batch information, even when the distributed parsers are not assigned to execute the batch.
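
    The sketch below models, in software, how a master parser might walk a three-dimensional kernel in fixed-size batches of workgroups, computing the next multi-dimensional starting position for each batch and tagging it with the distributed parser assigned to execute it. The batch size, iteration order, and round-robin assignment are assumptions for illustration, not the disclosed hardware.

```cpp
// Sketch of a master workload parser iterating over a 3-D kernel in batches
// of workgroups and emitting the batch info it would send over the fabric.
#include <algorithm>
#include <array>
#include <cstdint>
#include <iostream>
#include <vector>

struct BatchInfo {
    std::array<uint32_t, 3> start;  // starting workgroup coordinate of the batch
    uint32_t count;                 // number of workgroups in the batch
    uint32_t target_parser;         // distributed parser assigned to execute it
};

// Iteratively determine the next position in (x, y, z) for each batch.
std::vector<BatchInfo> parse_kernel(std::array<uint32_t, 3> kernel_dims,
                                    uint32_t batch_size,
                                    uint32_t num_parsers) {
    std::vector<BatchInfo> batches;
    std::array<uint32_t, 3> pos = {0, 0, 0};
    uint32_t next_parser = 0;
    uint64_t total = uint64_t(kernel_dims[0]) * kernel_dims[1] * kernel_dims[2];
    for (uint64_t issued = 0; issued < total; issued += batch_size) {
        uint32_t count = static_cast<uint32_t>(std::min<uint64_t>(batch_size, total - issued));
        batches.push_back({pos, count, next_parser});
        next_parser = (next_parser + 1) % num_parsers;  // simple round-robin assignment
        // Advance the multi-dimensional position by `count` workgroups (x fastest).
        uint64_t linear = issued + count;
        pos[0] = linear % kernel_dims[0];
        pos[1] = (linear / kernel_dims[0]) % kernel_dims[1];
        pos[2] = static_cast<uint32_t>(linear / (uint64_t(kernel_dims[0]) * kernel_dims[1]));
    }
    return batches;
}

int main() {
    for (const auto& b : parse_kernel({4, 2, 1}, /*batch_size=*/3, /*num_parsers=*/2))
        std::cout << "batch at (" << b.start[0] << "," << b.start[1] << "," << b.start[2]
                  << ") count=" << b.count << " -> parser " << b.target_parser << "\n";
}
```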

    Distributed compute work parser circuitry using communications fabric

    Publication Number: US10593094B1

    Publication Date: 2020-03-17

    Application Number: US16143412

    Filing Date: 2018-09-26

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to distributing work from compute kernels using a distributed hierarchical parser architecture. In some embodiments, an apparatus includes a plurality of shader units configured to perform operations for compute workgroups included in compute kernels processed by the apparatus, a plurality of distributed workload parser circuits, and a communications fabric connected to the plurality of distributed workload parser circuits and a master workload parser circuit. In some embodiments, the master workload parser circuit is configured to iteratively determine a next position in multiple dimensions for a next batch of workgroups from the kernel and send batch information to the distributed workload parser circuits via the communications fabric to assign the batch of workgroups. In some embodiments, the distributed parsers maintain coordinate information for the kernel and update the coordinate information in response to the batch information, even when the distributed parsers are not assigned to execute the batch.
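
    Complementing the master-parser sketch above, the following sketch models the distributed-parser side of the same abstract: every parser observes every batch message and advances its local coordinate state, but only the addressed parser executes the batch. Field and class names mirror the earlier sketch and are likewise illustrative assumptions.

```cpp
// Sketch of a distributed parser keeping kernel coordinates in sync with
// batch messages from the fabric, even for batches it does not execute.
#include <array>
#include <cstdint>
#include <iostream>

struct BatchMsg {
    std::array<uint32_t, 3> start;
    uint32_t count;
    uint32_t target_parser;
};

class DistributedParser {
    uint32_t id_;
    std::array<uint32_t, 3> dims_;
    std::array<uint32_t, 3> coord_ = {0, 0, 0};  // local view of kernel progress
public:
    DistributedParser(uint32_t id, std::array<uint32_t, 3> dims) : id_(id), dims_(dims) {}

    // Every parser sees every batch message, so coordinate state stays in sync
    // even when this parser is not assigned to execute the batch.
    void on_batch(const BatchMsg& msg) {
        advance(msg.count);
        if (msg.target_parser == id_)
            std::cout << "parser " << id_ << " executes batch at ("
                      << msg.start[0] << "," << msg.start[1] << "," << msg.start[2] << ")\n";
    }
private:
    void advance(uint32_t workgroups) {
        uint64_t linear = coord_[0] + uint64_t(dims_[0]) * (coord_[1] + uint64_t(dims_[1]) * coord_[2]);
        linear += workgroups;
        coord_[0] = linear % dims_[0];
        coord_[1] = (linear / dims_[0]) % dims_[1];
        coord_[2] = static_cast<uint32_t>(linear / (uint64_t(dims_[0]) * dims_[1]));
    }
};

int main() {
    DistributedParser p0(0, {4, 2, 1}), p1(1, {4, 2, 1});
    BatchMsg msgs[] = {{{0, 0, 0}, 3, 0}, {{3, 0, 0}, 3, 1}, {{2, 1, 0}, 2, 0}};
    for (const auto& m : msgs) { p0.on_batch(m); p1.on_batch(m); }
}
```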
