-
Publication No.: US20210287324A1
Publication Date: 2021-09-16
Application No.: US16818671
Application Date: 2020-03-13
Applicant: Apple Inc.
Inventor: Steven Fishwick
Abstract: Techniques are disclosed relating to using cost estimates for portions of a graphics frame to schedule graphics rendering tasks. In some embodiments, a processor generates a first set of cost estimates for respective different portions of a frame for a first render and a second set of cost estimates for respective different portions of a frame for a second render. In some embodiments, the processor compares the first set of cost estimates with the second set of cost estimates. In response to an output of the comparison meeting a first threshold level of similarity, the graphics processor may use one or more portions of the frame generated by the first render for the second render instead of performing the second render for the one or more portions.
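Illustrative sketch (not taken from the patent): one way to read the comparison step is as a per-portion similarity check between the two sets of cost estimates. The function name `portions_to_reuse`, the relative-difference metric, and the `tolerance` parameter are all assumptions made for this example; the abstract does not say how similarity is measured.

```python
from typing import List, Sequence


def portions_to_reuse(first_costs: Sequence[float],
                      second_costs: Sequence[float],
                      tolerance: float = 0.05) -> List[int]:
    """Return indices of frame portions whose two cost estimates are similar.

    Hypothetical metric: relative difference between the estimates for the
    first and second render, compared against an assumed tolerance.
    """
    if len(first_costs) != len(second_costs):
        raise ValueError("both renders must cover the same portions")

    reuse = []
    for index, (a, b) in enumerate(zip(first_costs, second_costs)):
        # Guard against division by zero when both estimates are zero.
        scale = max(abs(a), abs(b), 1e-9)
        if abs(a - b) / scale <= tolerance:
            # Similar cost: the first render's output for this portion is
            # a candidate for reuse instead of rendering it again.
            reuse.append(index)
    return reuse


reusable = portions_to_reuse([10.0, 20.0, 30.0], [10.0, 25.0, 30.0])
```

Here the middle portion differs by 20 percent and would be rendered again, while the other two would be candidates for reuse under the assumed tolerance.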
-
Publication No.: US12236130B2
Publication Date: 2025-02-25
Application No.: US18318672
Application Date: 2023-05-16
Applicant: Apple Inc.
Inventor: Steven Fishwick , Lior Zimet , Harshavardhan Kaushikkar
IPC: G06F3/06 , G06F12/02 , G06F12/06 , G06F12/0871 , G06F12/0882 , G06F12/1018 , G06F12/1045 , G06F13/16
Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
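Illustrative sketch (not taken from the patent): programmable address hashing is often modeled as one bit mask per output bit, where each bit of the controller index is the XOR (parity) of the address bits selected by its mask. The names `parity` and `select_controller`, the mask values, and the 64-byte block size are assumptions for this example. It covers only a single level of granularity and does not model bit dropping or memory folding.

```python
from typing import Sequence


def parity(value: int) -> int:
    """Return 1 if an odd number of bits are set in value, else 0."""
    return bin(value).count("1") & 1


def select_controller(address: int, masks: Sequence[int]) -> int:
    """Hash an address to a controller index.

    masks[i] is an assumed programmable register that selects which address
    bits are XORed together to form bit i of the controller index.
    """
    index = 0
    for bit, mask in enumerate(masks):
        index |= parity(address & mask) << bit
    return index


# Hypothetical programming for four controllers with 64-byte blocks:
# index bit 0 mixes address bits 6 and 12, index bit 1 mixes bits 7 and 13.
MASKS = [(1 << 6) | (1 << 12), (1 << 7) | (1 << 13)]

# Consecutive 64-byte blocks within a page land on different controllers.
spread = [select_controller(block * 64, MASKS) for block in range(4)]
```

With these assumed masks, the four consecutive blocks map to controllers 0, 1, 2, and 3, and reprogramming the masks changes the distribution without changing the logic.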
-
Publication No.: US12190164B2
Publication Date: 2025-01-07
Application No.: US17399808
Application Date: 2021-08-11
Applicant: Apple Inc.
Inventor: Steven Fishwick , Fergus W. MacGarry , Jonathan M. Redshaw , David A. Gotwalt , Ali Rabbani Rankouhi , Benjamin Bowman
Abstract: Disclosed embodiments relate to controlling sets of graphics work (e.g., kicks) assigned to graphics processor circuitry. In some embodiments, tracking slot circuitry implements entries for multiple tracking slots. Slot manager circuitry may store, using an entry of the tracking slot circuitry, software-specified information for a set of graphics work, where the information includes: type of work, dependencies on other sets of graphics work, and location of data for the set of graphics work. The slot manager circuitry may prefetch, from the location and prior to allocating shader core resources for the set of graphics work, configuration register data for the set of graphics work. Control circuitry may program configuration registers for the set of graphics work using the prefetched data and initiate processing of the set of graphics work by the graphics processor circuitry according to the dependencies. Disclosed techniques may reduce kick-to-kick transition time, in some embodiments.
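Illustrative sketch (not taken from the patent): a software model of tracking slots can make the ordering described in the abstract concrete. Each entry holds the type of work, its dependencies, and the location of its data; configuration data is fetched before any work starts, and work is started only once its dependencies have completed. The names `TrackingSlot`, `run_kicks`, and `fetch_config` are hypothetical, and the real circuitry operates concurrently rather than in a loop.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class TrackingSlot:
    work_type: str            # e.g. "compute" or "render"
    depends_on: Set[int]      # ids of slots that must finish first
    config_location: int      # where this kick's configuration data lives
    prefetched_config: Dict[str, int] = field(default_factory=dict)


def run_kicks(slots: Dict[int, TrackingSlot],
              fetch_config: Callable[[int], Dict[str, int]]) -> List[int]:
    """Prefetch configuration for every slot, then start kicks in an order
    that respects their dependencies. Returns the start order."""
    # Prefetch happens up front, before any kick is started.
    for slot in slots.values():
        slot.prefetched_config = fetch_config(slot.config_location)

    done: Set[int] = set()
    order: List[int] = []
    pending = set(slots)
    while pending:
        ready = sorted(s for s in pending if slots[s].depends_on <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        for slot_id in ready:
            # Configuration registers would be programmed here from
            # slots[slot_id].prefetched_config before processing starts.
            order.append(slot_id)
            done.add(slot_id)
            pending.discard(slot_id)
    return order


example_slots = {
    0: TrackingSlot("compute", set(), 0x100),
    1: TrackingSlot("render", {0}, 0x200),
    2: TrackingSlot("render", {0, 1}, 0x300),
}
start_order = run_kicks(example_slots, lambda location: {"base": location})
```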
-
Publication No.: US11972140B2
Publication Date: 2024-04-30
Application No.: US18069033
Application Date: 2022-12-20
Applicant: Apple Inc.
Inventor: Steven Fishwick , Jeffry E. Gonion , Per H. Hammarlund , Eran Tamari , Lior Zimet , Gerard R. Williams, III
IPC: G06F3/06 , G06F12/02 , G06F12/06 , G06F12/0871 , G06F12/0882 , G06F12/1018 , G06F12/1045 , G06F13/16
CPC classification number: G06F3/0655 , G06F3/0604 , G06F3/0683 , G06F12/0238 , G06F12/0646 , G06F12/0871 , G06F12/0882 , G06F12/1018 , G06F12/1054 , G06F12/1063 , G06F13/1668
Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
-
Publication No.: US20230058989A1
Publication Date: 2023-02-23
Application No.: US17821312
Application Date: 2022-08-22
Applicant: Apple Inc.
Inventor: Per H. Hammarlund , Lior Zimet , Sergio Kolor , Sagi Lahav , James Vash , Gaurav Garg , Tal Kuzi , Jeffry E. Gonion , Charles E. Tucker , Lital Levy-Rubin , Dany Davidov , Steven Fishwick , Nir Leshem , Mark Pilip , Gerard R. Williams, III , Harshavardhan Kaushikkar , Srinivasan Rangan Sridharan
IPC: G06F12/0831 , G06F12/0811 , G06F12/128
Abstract: An integrated circuit (IC) including a plurality of processor cores, a plurality of graphics processing units, a plurality of peripheral circuits, and a plurality of memory controllers is configured to support scaling of the system using a unified memory architecture. For example, the IC may include an interconnect fabric configured to provide communication between the one or more memory controller circuits and the processor cores, graphics processing units, and peripheral devices; and an off-chip interconnect coupled to the interconnect fabric and configured to couple the interconnect fabric to a corresponding interconnect fabric on another instance of the integrated circuit, wherein the interconnect fabric and the off-chip interconnect provide an interface that transparently connects the one or more memory controller circuits, the processor cores, graphics processing units, and peripheral devices in either a single instance of the integrated circuit or two or more instances of the integrated circuit.
-
Publication No.: US20230051906A1
Publication Date: 2023-02-16
Application No.: US17399759
Application Date: 2021-08-11
Applicant: Apple Inc.
Inventor: Andrew M. Havlir , Steven Fishwick , Melissa L. Velez
Abstract: Disclosed embodiments relate to software control of graphics hardware that supports logical slots. In some embodiments, a GPU includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. Control circuitry may determine mappings between logical slots and distributed hardware slots for different sets of graphics work. Various mapping aspects may be software-controlled. For example, software may specify one or more of the following: priority information for a set of graphics work, to retain the mapping after completion of the work, a distribution rule, a target group of sub-units, a sub-unit mask, a scheduling policy, to reclaim hardware slots from another logical slot, etc. Software may also query status of the work.
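Illustrative sketch (not taken from the patent): the mapping between a logical slot and distributed hardware slots can be pictured as picking one free hardware slot on each sub-unit selected by a software-provided sub-unit mask. The function `map_logical_slot`, the dictionary layout, and the "first free slot" policy are assumptions for this example; priorities, distribution rules, and slot reclaiming are not modeled.

```python
from typing import Dict, List


def map_logical_slot(free_slots: Dict[int, List[int]],
                     subunit_mask: int) -> Dict[int, int]:
    """Map one logical slot onto a hardware slot in each selected sub-unit.

    free_slots maps a sub-unit id to its list of free hardware slot ids.
    Bit i of subunit_mask selects sub-unit i. Returns {sub-unit: slot}.
    """
    mapping: Dict[int, int] = {}
    for subunit, free in free_slots.items():
        if (subunit_mask >> subunit) & 1:
            if not free:
                raise RuntimeError(f"no free hardware slot on sub-unit {subunit}")
            mapping[subunit] = free[0]

    # Commit only after every selected sub-unit had a slot available, so a
    # failed mapping leaves the free lists untouched.
    for subunit, hw_slot in mapping.items():
        free_slots[subunit].remove(hw_slot)
    return mapping


free = {0: [0, 1], 1: [1], 2: [0]}
assignment = map_logical_slot(free, 0b101)  # select sub-units 0 and 2
```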
-
Publication No.: US11257179B2
Publication Date: 2022-02-22
Application No.: US16818671
Application Date: 2020-03-13
Applicant: Apple Inc.
Inventor: Steven Fishwick
Abstract: Techniques are disclosed relating to using cost estimates for portions of a graphics frame to schedule graphics rendering tasks. In some embodiments, a processor generates a first set of cost estimates for respective different portions of a frame for a first render and a second set of cost estimates for respective different portions of a frame for a second render. In some embodiments, the processor compares the first set of cost estimates with the second set of cost estimates. In response to an output of the comparison meeting a first threshold level of similarity, the graphics processor may use one or more portions of the frame generated by the first render for the second render instead of performing the second render for the one or more portions.
-
Publication No.: US20210004331A1
Publication Date: 2021-01-07
Application No.: US17027271
Application Date: 2020-09-21
Applicant: Apple Inc.
Inventor: Karthik Ramani , Fang Liu , Steven Fishwick , Jonathan M. Redshaw
IPC: G06F12/0888 , G06F12/0815 , G06F12/0877
Abstract: Techniques are disclosed relating to filtering cache accesses. In some embodiments, a control unit is configured to, in response to a request to process a set of data, determine a size of a portion of the set of data to be handled using a cache. In some embodiments, the control unit is configured to determine filtering parameters indicative of a set of addresses corresponding to the determined size. In some embodiments, the control unit is configured to process one or more access requests for the set of data based on the determined filter parameters, including: using the cache to process one or more access requests having addresses in the set of addresses and bypassing the cache to access a backing memory directly, for access requests having addresses that are not in the set of addresses. The disclosed techniques may reduce average memory bandwidth or peak memory bandwidth.
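Illustrative sketch (not taken from the patent): the filtering idea can be modeled by choosing how many bytes of a data set the cache should handle, deriving an address filter from that size, and routing each access by whether its address passes the filter. Representing the filter as one contiguous address range, and the names `make_filter`, `route_access`, and `cache_budget`, are assumptions for this example; the abstract does not specify the form of the filtering parameters.

```python
def make_filter(base: int, data_size: int, cache_budget: int) -> range:
    """Return the address range to be handled using the cache.

    The cached portion is assumed to be the first min(data_size,
    cache_budget) bytes of the data set starting at base.
    """
    cached_bytes = min(data_size, cache_budget)
    return range(base, base + cached_bytes)


def route_access(address: int, cached_range: range) -> str:
    """Return "cache" for addresses inside the filter, else "bypass"."""
    # Membership tests on a range of ints are constant time in Python.
    return "cache" if address in cached_range else "bypass"


# A 4 KiB data set at 0x1000 with an assumed 1 KiB cache budget.
cached = make_filter(base=0x1000, data_size=4096, cache_budget=1024)
routes = [route_access(a, cached) for a in (0x1000, 0x13FF, 0x1400)]
```

Under these assumptions the first two accesses use the cache and the third goes directly to backing memory, which limits how much of a large data set competes for cache capacity.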
-
Publication No.: US10783085B1
Publication Date: 2020-09-22
Application No.: US16290646
Application Date: 2019-03-01
Applicant: Apple Inc.
Inventor: Karthik Ramani , Fang Liu , Steven Fishwick , Jonathan M. Redshaw
IPC: G06F12/0888 , G06F12/0815 , G06F12/0877
Abstract: Techniques are disclosed relating to filtering cache accesses. In some embodiments, a control unit is configured to, in response to a request to process a set of data, determine a size of a portion of the set of data to be handled using a cache. In some embodiments, the control unit is configured to determine filtering parameters indicative of a set of addresses corresponding to the determined size. In some embodiments, the control unit is configured to process one or more access requests for the set of data based on the determined filter parameters, including: using the cache to process one or more access requests having addresses in the set of addresses and bypassing the cache to access a backing memory directly, for access requests having addresses that are not in the set of addresses. The disclosed techniques may reduce average memory bandwidth or peak memory bandwidth.