-
Publication No.: US20220342806A1
Publication Date: 2022-10-27
Application No.: US17519284
Filing Date: 2021-11-04
Applicant: Apple Inc.
Inventor: Steven Fishwick , Jeffry E. Gonion , Per H. Hammarlund , Eran Tamari , Lior Zimet , Gerard R. Williams, III
IPC: G06F12/02 , G06F12/0871 , G06F12/0882 , G06F12/1045
Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
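The programmable hashing and bit dropping described in this abstract can be illustrated with a minimal Python sketch. The mask values, the 64-byte block size, the four-controller configuration, and the function names `select_controller` and `compact_address` are assumptions chosen for illustration; they are not taken from the patent.

```python
def parity(x: int) -> int:
    """Return the XOR of all bits of x (0 or 1)."""
    return bin(x).count("1") & 1


def select_controller(addr: int, masks: list[int]) -> int:
    """Hash an address to a controller ID.

    Each programmable mask selects the address bits that are XORed
    together to form one bit of the controller ID.
    """
    mc = 0
    for i, mask in enumerate(masks):
        mc |= parity(addr & mask) << i
    return mc


def compact_address(addr: int, drop_bits: set[int]) -> int:
    """Remove the given bit positions from addr and close the gaps."""
    out, out_pos = 0, 0
    for pos in range(addr.bit_length()):
        if pos in drop_bits:
            continue
        out |= ((addr >> pos) & 1) << out_pos
        out_pos += 1
    return out


# Hypothetical masks for 4 controllers: ID bit 0 hashes address bits
# 6, 8, 10 and ID bit 1 hashes address bits 7, 9, 11.
MASKS = [(1 << 6) | (1 << 8) | (1 << 10), (1 << 7) | (1 << 9) | (1 << 11)]

# Consecutive 64-byte blocks land on different controllers.
print([select_controller(a, MASKS) for a in (0, 64, 128, 192)])
```

In this sketch, one bit per mask could be dropped from the address forwarded to the selected controller, since it is recoverable from the controller ID and the remaining masked bits; that recovery step is assumed here and not shown.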
-
Publication No.: US20240272940A1
Publication Date: 2024-08-15
Application No.: US18450978
Filing Date: 2023-08-16
Applicant: Apple Inc.
Inventor: Benjamin Bowman , Ali Rabbani Rankouhi , Jonathan M. Redshaw , Steven Fishwick
CPC classification number: G06F9/4881 , G06F9/3887 , G06F9/485
Abstract: Disclosed techniques relate to scheduling sets of graphics work with dependencies. In some embodiments, a first set of graphics work depends on a second set of graphics work. Control circuitry may, in response to a release signal that indicates the second set reaching a first processing point, initiate processing of the first set. Control circuitry may, in response to reaching a kick gate point, stall processing of the first set. Control circuitry may, in response to an end signal for the second set, resume processing of the first set.
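The release, kick gate, and end signals in this abstract can be read as a small state machine. The sketch below is one possible interpretation in Python; the class name `DependentKick`, the state names, and the choice to skip the stall when the end signal has already arrived are assumptions, not details from the patent.

```python
from enum import Enum, auto


class State(Enum):
    WAITING = auto()   # not started; waiting for the release signal
    RUNNING = auto()   # processing the dependent (first) set
    STALLED = auto()   # stopped at the kick gate point


class DependentKick:
    """Models the first set of work, which depends on a second set."""

    def __init__(self) -> None:
        self.state = State.WAITING
        self.parent_ended = False

    def on_release(self) -> None:
        # The second set reached its first processing point: start early.
        if self.state is State.WAITING:
            self.state = State.RUNNING

    def on_kick_gate(self) -> None:
        # Assumption: stall only if the second set has not yet ended.
        if self.state is State.RUNNING and not self.parent_ended:
            self.state = State.STALLED

    def on_parent_end(self) -> None:
        # End signal for the second set: resume if stalled at the gate.
        self.parent_ended = True
        if self.state is State.STALLED:
            self.state = State.RUNNING


kick = DependentKick()
kick.on_release()
kick.on_kick_gate()
kick.on_parent_end()
print(kick.state)
```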
-
Publication No.: US11675722B2
Publication Date: 2023-06-13
Application No.: US17337805
Filing Date: 2021-06-03
Applicant: Apple Inc.
Inventor: Sergio Kolor , Sergio V. Tota , Tzach Zemer , Sagi Lahav , Jonathan M. Redshaw , Per H. Hammarlund , Eran Tamari , James Vash , Gaurav Garg , Lior Zimet , Harshavardhan Kaushikkar , Steven Fishwick , Steven R. Hutsell , Shawn M. Fukami
IPC: G06F13/40 , G06F15/173
CPC classification number: G06F13/4027 , G06F13/4022 , G06F15/17375 , G06F15/17381
Abstract: In an embodiment, a system on a chip (SOC) comprises a semiconductor die on which circuitry is formed, wherein the circuitry comprises a plurality of agents and a plurality of network switches coupled to the plurality of agents. The plurality of network switches are interconnected to form a plurality of physically and logically independent networks. A first network of the plurality of physically and logically independent networks is constructed according to a first topology and a second network of the plurality of physically and logically independent networks is constructed according to a second topology that is different from the first topology. For example, the first topology may be a ring topology and the second topology may be a mesh topology. In an embodiment, coherency may be enforced on the first network and the second network may be a relaxed order network.
-
Publication No.: US20230050061A1
Publication Date: 2023-02-16
Application No.: US17399711
Filing Date: 2021-08-11
Applicant: Apple Inc.
Inventor: Andrew M. Havlir , Steven Fishwick , David A. Gotwalt , Benjamin Bowman , Ralph C. Taylor , Melissa L. Velez , Mladen Wilder , Ali Rabbani Rankouhi , Fergus W. MacGarry
Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
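As a rough illustration of mapping logical slots to distributed hardware slots under different distribution rules, the Python sketch below picks a rule from the size of the work. The two rules (one sub-unit for small kicks, all sub-units for large ones), the threshold of 64, and the names `choose_rule` and `map_logical_slot` are hypothetical; the abstract does not specify how rules are chosen.

```python
from typing import Optional


def choose_rule(work_size: int, num_subunits: int, small_threshold: int = 64) -> int:
    """Return how many sub-units a set of work should be spread across."""
    return 1 if work_size <= small_threshold else num_subunits


def map_logical_slot(work_size: int, free_slots: list[int]) -> Optional[list[int]]:
    """Map one logical slot to hardware slots on the chosen sub-units.

    free_slots[i] is the number of free distributed hardware slots on
    sub-unit i; it is decremented for every sub-unit that is chosen.
    Returns the chosen sub-unit indices, or None if the rule cannot be met.
    """
    want = choose_rule(work_size, len(free_slots))
    # Prefer the sub-units with the most free hardware slots.
    by_free = sorted(range(len(free_slots)), key=lambda i: -free_slots[i])
    chosen = [i for i in by_free if free_slots[i] > 0][:want]
    if len(chosen) < want:
        return None
    for i in chosen:
        free_slots[i] -= 1
    return sorted(chosen)


free = [2, 1, 2, 0]
print(map_logical_slot(10, free))    # small kick: a single sub-unit
print(map_logical_slot(1000, free))  # large kick: needs every sub-unit
```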
-
Publication No.: US20220342805A1
Publication Date: 2022-10-27
Application No.: US17353371
Filing Date: 2021-06-21
Applicant: Apple Inc.
Inventor: Steven Fishwick
IPC: G06F12/02 , G06F12/1018 , G06F12/06 , G06F13/16
Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
-
Publication No.: US12086644B2
Publication Date: 2024-09-10
Application No.: US17399711
Filing Date: 2021-08-11
Applicant: Apple Inc.
Inventor: Andrew M. Havlir , Steven Fishwick , David A. Gotwalt , Benjamin Bowman , Ralph C. Taylor , Melissa L. Velez , Mladen Wilder , Ali Rabbani Rankouhi , Fergus W. MacGarry
CPC classification number: G06F9/5044 , G06F9/4881 , G06F9/505 , G06T1/20 , G06T1/60
Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
-
Publication No.: US20240273667A1
Publication Date: 2024-08-15
Application No.: US18450964
Filing Date: 2023-08-16
Applicant: Apple Inc.
Inventor: Arjun Thottappilly , Steven Fishwick , Jason D. Carroll
CPC classification number: G06T1/20 , G06F9/5061 , G06F2209/503
Abstract: Disclosed techniques relate to parsing and assigning sets of geometry work to distributed hardware slots. In some embodiments, graphics control circuitry implements a plurality of logical slots. Control circuitry may assign a parse version of a set of geometry work to distributed hardware slots of one or more of the graphics processor sub-units that each implement multiple distributed hardware slots. Control circuitry may determine a number of segments for the set of geometry work based on execution of the parse version and assign determined segments to distributed hardware slots of respective graphics processor sub-units for execution. Stitch circuitry may stitch results of the segments processed by the assigned distributed hardware slots.
-
Publication No.: US20230048951A1
Publication Date: 2023-02-16
Application No.: US17399808
Filing Date: 2021-08-11
Applicant: Apple Inc.
Inventor: Steven Fishwick , Fergus W. MacGarry , Jonathan M. Redshaw , David A. Gotwalt , Ali Rabbani Rankouhi , Benjamin Bowman
Abstract: Disclosed embodiments relate to controlling sets of graphics work (e.g., kicks) assigned to graphics processor circuitry. In some embodiments, tracking slot circuitry implements entries for multiple tracking slots. Slot manager circuitry may store, using an entry of the tracking slot circuitry, software-specified information for a set of graphics work, where the information includes: type of work, dependencies on other sets of graphics work, and location of data for the set of graphics work. The slot manager circuitry may prefetch, from the location and prior to allocating shader core resources for the set of graphics work, configuration register data for the set of graphics work. Control circuitry may program configuration registers for the set of graphics work using the prefetched data and initiate processing of the set of graphics work by the graphics processor circuitry according to the dependencies. Disclosed techniques may reduce kick-to-kick transition time, in some embodiments.
-
Publication No.: US20220342588A1
Publication Date: 2022-10-27
Application No.: US17353349
Filing Date: 2021-06-21
Applicant: Apple Inc.
Inventor: Steven Fishwick
IPC: G06F3/06
Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
-
Publication No.: US11256629B2
Publication Date: 2022-02-22
Application No.: US17027271
Filing Date: 2020-09-21
Applicant: Apple Inc.
Inventor: Karthik Ramani , Fang Liu , Steven Fishwick , Jonathan M. Redshaw
IPC: G06F12/08 , G06F12/0888 , G06F12/0815 , G06F12/0877
Abstract: Techniques are disclosed relating to filtering cache accesses. In some embodiments, a control unit is configured to, in response to a request to process a set of data, determine a size of a portion of the set of data to be handled using a cache. In some embodiments, the control unit is configured to determine filtering parameters indicative of a set of addresses corresponding to the determined size. In some embodiments, the control unit is configured to process one or more access requests for the set of data based on the determined filter parameters, including: using the cache to process one or more access requests having addresses in the set of addresses and bypassing the cache to access a backing memory directly, for access requests having addresses that are not in the set of addresses. The disclosed techniques may reduce average memory bandwidth or peak memory bandwidth.
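The cache filtering flow in this abstract (choose a portion size, derive filtering parameters, then cache or bypass each access) might look like the following Python sketch. Using a contiguous address range as the filter, a 64-byte line size, and the names `AddressFilter` and `make_filter` are assumptions for illustration; the patent may define the set of addresses differently.

```python
from dataclasses import dataclass


@dataclass
class AddressFilter:
    """Filtering parameters: a half-open address range [base, limit)."""
    base: int
    limit: int

    def use_cache(self, addr: int) -> bool:
        # Addresses inside the range go through the cache; all others
        # bypass it and access the backing memory directly.
        return self.base <= addr < self.limit


def make_filter(base: int, data_size: int, cache_capacity: int, line: int = 64) -> AddressFilter:
    """Size the cached portion to fit the cache, rounded down to whole lines."""
    portion = min(data_size, cache_capacity)
    portion -= portion % line
    return AddressFilter(base, base + portion)


f = make_filter(base=0x1000, data_size=1000, cache_capacity=256)
print(f.use_cache(0x10FF), f.use_cache(0x1100))
```

Limiting the cached portion to what fits keeps later accesses from evicting earlier lines of the same data set, which is one plausible way such filtering could reduce memory bandwidth.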