GPU Resource Tracking
    12.
    Invention application

    Publication number: US20180349146A1

    Publication date: 2018-12-06

    Application number: US15615412

    Filing date: 2017-06-06

    Applicant: Apple Inc.

    Abstract: In general, techniques are disclosed for tracking and allocating graphics processor hardware over specified periods of time. More particularly, hardware sensors may be used to determine the utilization of graphics processor hardware after each of a number of specified intervals (referred to as “sample intervals”). The utilization values so captured may be combined after a first number of sample intervals (the combined interval referred to as an “epoch interval”) and used to determine a normalized utilization of the graphics processor's hardware resources. Normalized epoch utilization values have been adjusted to account for resources used by concurrently executing processes. In some embodiments, a lower priority process that obtains and fails to release resources that should be allocated to one or more higher priority processes may be detected, paused, and its hardware resources given to the higher priority processes.
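
The sample-interval and epoch-interval bookkeeping described in this abstract can be illustrated with a minimal Python sketch. The abstract does not state how the values are combined or normalized, so both rules below are assumptions: the epoch value is taken as the mean of its samples, and normalization is taken as each process's share of the total epoch utilization. The function names are hypothetical.

```python
from statistics import mean


def epoch_utilization(samples):
    """Combine per-sample-interval utilization readings (0.0 to 1.0) into one epoch value.

    Assumption: the combination is a plain mean; the abstract does not specify it.
    """
    return mean(samples)


def normalized_epoch_utilization(samples_by_process):
    """Return each process's share of the total epoch utilization.

    Assumption: "normalized" means divided by the sum over all concurrently
    executing processes; the abstract only says the values are adjusted for them.
    """
    epoch = {pid: epoch_utilization(s) for pid, s in samples_by_process.items()}
    total = sum(epoch.values())
    if total == 0:
        # No process used the hardware during this epoch.
        return {pid: 0.0 for pid in epoch}
    return {pid: value / total for pid, value in epoch.items()}
```

For example, a process sampled at 0.2 and 0.4 alongside a process sampled at 0.1 and 0.1 would receive roughly three quarters of the normalized utilization under these assumptions.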

    Cache control to preserve register data

    Publication number: US12182037B2

    Publication date: 2024-12-31

    Application number: US18173500

    Filing date: 2023-02-23

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to eviction control for cache lines that store register data. In some embodiments, memory hierarchy circuitry is configured to provide memory backing for register operand data in one or more cache circuits. Lock circuitry may control a first set of lock indicators for a set of registers for a first thread, including to assert one or more lock indicators for registers that are indicated, by decode circuitry, as being utilized by decoded instructions of the first thread. The lock circuitry may preserve register operand data in the one or more cache circuits, including to prevent eviction of a given cache line from a cache circuit based on an asserted lock indicator. The lock circuitry may clear the first set of lock indicators in response to a reset event. Disclosed techniques may advantageously retain relevant register information in the cache with limited control circuit area.

    Multi-stage Thread Scheduling
    14.
    Invention publication

    Publication number: US20240095065A1

    Publication date: 2024-03-21

    Application number: US18054376

    Filing date: 2022-11-10

    Applicant: Apple Inc.

    CPC classification number: G06F9/4881 G06F9/485

    Abstract: Techniques are disclosed relating to multi-stage thread scheduling. In some embodiments, processor circuitry includes multiple channel pipelines for multiple channels and multiple execution pipelines shared by the channel pipelines and configured to perform different types of operations provided by the channel pipelines. First scheduler circuitry may arbitrate among threads to assign threads to channels. Second scheduler circuitry may arbitrate among channels to assign an operation from a given channel to a given execution pipeline. The execution pipelines may provide backpressure information to the first scheduler circuitry based on execution status and the first scheduler circuitry may adjust priority of a thread for assignment to a channel based on the backpressure information. Disclosed techniques may reduce channel conflicts and starvation for execution resources.

    Private Memory Management using Utility Thread

    Publication number: US20230385201A1

    Publication date: 2023-11-30

    Application number: US18362686

    Filing date: 2023-07-31

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to private memory management using a mapping thread, which may be persistent. In some embodiments, a graphics processor is configured to generate a pool of private memory pages for a set of graphics work that includes multiple threads. The processor may maintain a translation table configured to map private memory addresses to virtual addresses based on identifiers of the threads. The processor may execute a mapping thread to receive a request to allocate a private memory page for a requesting thread, select a private memory page from the pool in response to the request, and map the selected page in the translation table for the requesting thread. The processor may then execute one or more instructions of the requesting thread to access a private memory space, wherein the execution includes translation of a private memory address to a virtual address based on the mapped page in the translation table. The mapping thread may be a persistent thread for which resources are allocated for an entirety of a time interval over which the set of graphics work is executed.
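
As a rough software analogy for the allocate, map, and translate steps in this abstract, the Python sketch below models the page pool and the translation table as plain data structures. It is not the patented hardware design: the class name, the 4096-byte page size, the first-free selection policy, and the table key layout are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size; the abstract does not give one


class PrivateMemoryMapper:
    """Hypothetical model of the mapping thread's bookkeeping."""

    def __init__(self, pool_pages):
        # Virtual base addresses of the pre-generated private memory pages.
        self.free_pages = list(pool_pages)
        # (thread_id, private_page_index) -> virtual base address
        self.table = {}

    def allocate(self, thread_id, private_page_index):
        """Select a page from the pool and map it for the requesting thread."""
        if not self.free_pages:
            raise MemoryError("private memory pool exhausted")
        page = self.free_pages.pop(0)  # assumed policy: first free page
        self.table[(thread_id, private_page_index)] = page
        return page

    def translate(self, thread_id, private_address):
        """Translate a thread's private address to a virtual address."""
        index, offset = divmod(private_address, PAGE_SIZE)
        return self.table[(thread_id, index)] + offset
```

Keying the table on the thread identifier mirrors the abstract's statement that translation is based on identifiers of the threads, so two threads can use the same private address without colliding.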

    GPU task scheduling
    16.
    Granted invention

    Publication number: US10902545B2

    Publication date: 2021-01-26

    Application number: US14574041

    Filing date: 2014-12-17

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to scheduling tasks for graphics processing. In one embodiment, a graphics unit is configured to render a frame of graphics data using a plurality of pass groups and the frame of graphics data includes a plurality of frame portions. In this embodiment, the graphics unit includes scheduling circuitry configured to receive a plurality of tasks, maintain pass group information for each of the plurality of tasks, and maintain relative age information for the plurality of frame portions. In this embodiment, the scheduling circuitry is configured to select a task for execution based on the pass group information and the age information. In some embodiments, the scheduling circuitry is configured to select tasks from an oldest frame portion and current pass group before selecting other tasks. This scheduling approach may result in efficient execution of various different types of graphics workloads.
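
The selection rule in this abstract (tasks from the oldest frame portion and the current pass group first) can be sketched in Python as follows. The `Task` type, the `portion_age` mapping, and the fallback ordering used when no preferred task exists are assumptions; the abstract only states which tasks are selected before others.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    pass_group: int
    frame_portion: int


def select_task(tasks, current_pass_group, portion_age):
    """Pick the next task to execute, or None if there are no tasks.

    portion_age maps a frame portion to its relative age (larger means older).
    """
    if not tasks:
        return None
    oldest = max(portion_age, key=portion_age.get)

    def rank(task):
        preferred = (
            task.frame_portion == oldest
            and task.pass_group == current_pass_group
        )
        # Assumed fallback: older frame portions first, then lower pass groups.
        return (0 if preferred else 1, -portion_age[task.frame_portion], task.pass_group)

    return min(tasks, key=rank)
```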

    Power saving with dynamic pulse insertion

    Publication number: US10270434B2

    Publication date: 2019-04-23

    Application number: US15046926

    Filing date: 2016-02-18

    Applicant: Apple Inc.

    Abstract: A method and apparatus for saving power in integrated circuits is disclosed. An IC includes functional circuit blocks which are not placed into a sleep mode when idle. A power management circuit may monitor the activity levels of the functional circuit blocks not placed into a sleep mode. When the power management circuit detects that an activity level of one of the non-sleep functional circuit blocks is less than a predefined threshold, it reduces the frequency of a clock signal provided thereto by scheduling only one pulse of a clock signal for every N pulses of the full frequency clock signal. The remaining N−1 pulses of the clock signal may be inhibited. If a high priority transaction inbound for the functional circuit block is detected, an inserted pulse of the clock signal may be provided to the functional unit irrespective of when a most recent regular pulse was provided.
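
The one-pulse-in-N gating with inserted pulses can be simulated in a few lines of Python. This is a behavioral sketch only: the function name and parameters are hypothetical, and it assumes that an inserted pulse does not shift the schedule of regular pulses, which the abstract does not state.

```python
def gated_clock(num_pulses, n, high_priority_at=frozenset()):
    """Return, for each full-frequency pulse, whether it reaches the block.

    One regular pulse passes for every n full-frequency pulses; the other
    n - 1 are inhibited. A pulse index in high_priority_at models a
    high-priority inbound transaction, which forces an inserted pulse.
    """
    delivered = []
    for i in range(num_pulses):
        regular = i % n == 0
        inserted = i in high_priority_at
        delivered.append(regular or inserted)
    return delivered
```

With `n = 4`, eight full-frequency pulses yield two regular pulses (indices 0 and 4); a high-priority transaction at index 2 adds a third.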

    PROCESSING CIRCUIT HARDWARE RESOURCE ALLOCATION SYSTEM

    Publication number: US20180173560A1

    Publication date: 2018-06-21

    Application number: US15386570

    Filing date: 2016-12-21

    Applicant: Apple Inc.

    CPC classification number: G06F9/4818 G06F9/505 G06F2209/5021

    Abstract: In various embodiments, hardware resources of a processing circuit may be allocated to a plurality of processes based on priorities of the processes. A hardware resource utilization sensor may detect a current utilization of the hardware resources by a process. A utilization accumulation circuit may determine a utilization of the hardware resources by the process over a particular amount of time. A target utilization of the hardware resources for the process may be determined based on the utilization of the hardware resources over the particular amount of time. A comparator circuit may compare the current utilization to the target utilization. A process priority adjustment circuit may adjust a priority of the process based on the comparison. Based on the adjusted priority, a different amount of hardware resources may be allocated to the processes.
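
The sense, accumulate, compare, and adjust loop in this abstract can be sketched in Python. The abstract names the circuits but not their arithmetic, so every formula here is an assumption: accumulation is a mean over the window, the target compensates for past over-use or under-use relative to a hypothetical entitled share, and a larger number means a higher priority.

```python
def accumulated_utilization(samples):
    """Utilization of the process over the window (assumed: mean of samples)."""
    return sum(samples) / len(samples)


def target_utilization(entitled_share, accumulated):
    """Assumed rule: under-use in the window raises the target, over-use lowers it."""
    return min(1.0, max(0.0, 2 * entitled_share - accumulated))


def adjust_priority(priority, current, target, step=1):
    """Compare current to target utilization and nudge the priority.

    Assumption: larger numbers mean higher priority.
    """
    if current > target:
        return priority - step  # using more than the target: lower priority
    if current < target:
        return priority + step  # using less than the target: raise priority
    return priority
```

For example, a process entitled to half of the hardware that averaged 0.6 over the window gets a target near 0.4; if it is currently at 0.7, its priority drops by one step.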

    Texture state cache
    19.
    Granted invention

    Publication number: US09811875B2

    Publication date: 2017-11-07

    Application number: US14482828

    Filing date: 2014-09-10

    Applicant: Apple Inc.

    CPC classification number: G06T1/60 G06T15/04

    Abstract: Techniques are disclosed relating to a cache configured to store state information for texture mapping. In one embodiment, a texture state cache includes a plurality of entries configured to store state information relating to one or more stored textures. In this embodiment, the texture state cache also includes texture processing circuitry configured to retrieve state information for one of the stored textures from one of the entries in the texture state cache and determine pixel attributes based on the texture and the retrieved state information. The state information may include texture state information and sampler state information, in some embodiments. The texture state cache may allow for reduced rendering times and power consumption, in some embodiments.

    DATA ALIGNMENT AND FORMATTING FOR GRAPHICS PROCESSING UNIT
    20.
    Invention application (under examination, published)

    Publication number: US20160093014A1

    Publication date: 2016-03-31

    Application number: US14496934

    Filing date: 2014-09-25

    Applicant: Apple Inc.

    Abstract: A data queuing and format apparatus is disclosed. A first selection circuit may be configured to selectively couple a first subset of data to a first plurality of data lines dependent upon control information, and a second selection circuit may be configured to selectively couple a second subset of data to a second plurality of data lines dependent upon the control information. A storage array may include multiple storage units, and each storage unit may be configured to receive data from one or more data lines of either the first or second plurality of data lines dependent upon the control information.

