-
Publication No.: US20250103501A1
Publication Date: 2025-03-27
Application No.: US18795416
Filing Date: 2024-08-06
Applicant: Apple Inc.
Inventor: Mladen Wilder, Karthik Ramani, Tyson J. Bergland
IPC: G06F12/084, G06F12/0817, G06F12/0837
Abstract: Techniques are disclosed relating to data compression in graphics processors. In some embodiments, first and second graphics processor cores include respective shader processor circuitry configured to execute graphics shader programs. Cache circuitry may be configured to store surface data, including a compressed block of surface data and metadata for the compressed block of surface data. Lock control circuitry may lock metadata for the second graphics processor core for the compressed block of surface data based on an access to the metadata by the first graphics processor core and prevent read accesses to the compressed block by the second graphics processor core until the lock on the metadata is released. This may provide consistency across graphics cores for compressed data.
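The locking behavior in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `MetadataLockControl`, its method names, and the use of integer core and block identifiers are hypothetical and do not come from the application, which describes hardware circuitry rather than code.

```python
class MetadataLockControl:
    """Toy model: at most one core holds the metadata lock for a compressed block.

    All names here are illustrative assumptions, not terms from the application.
    """

    def __init__(self):
        # block_id -> id of the core currently holding the metadata lock
        self._owner = {}

    def access_metadata(self, core, block_id):
        """A core touches the metadata; the lock is taken if it was free.

        Returns True if `core` holds the lock after the call.
        """
        owner = self._owner.setdefault(block_id, core)
        return owner == core

    def can_read_block(self, core, block_id):
        """Reads of the compressed block are allowed only if no other core holds the lock."""
        owner = self._owner.get(block_id)
        return owner is None or owner == core

    def release(self, core, block_id):
        """Only the owning core can release the lock."""
        if self._owner.get(block_id) == core:
            del self._owner[block_id]


locks = MetadataLockControl()
locks.access_metadata(core=0, block_id=7)             # first core accesses the metadata
blocked = not locks.can_read_block(core=1, block_id=7)  # second core must wait
locks.release(core=0, block_id=7)
allowed = locks.can_read_block(core=1, block_id=7)      # read proceeds after release
```

In this model the second core simply polls `can_read_block`; how real hardware stalls or replays the blocked read is not stated in the abstract.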
-
Publication No.: US20250086116A1
Publication Date: 2025-03-13
Application No.: US18960603
Filing Date: 2024-11-26
Applicant: Apple Inc.
Inventor: Jedd O. Haberstro, Mladen Wilder
IPC: G06F12/0891, G06F12/0842
Abstract: Techniques are disclosed relating to smashing atomic operations. In some embodiments, cache control circuitry caches data values in cache storage circuitry and receives multiple requests to atomically update a cached data value according to one or more arithmetic operations. The control circuitry may perform updates to a cached data value based on the multiple requests, in response to determining that the one or more arithmetic operations meet one or more criteria, and store operation information that indicates a most-recent requested atomic arithmetic operation for the updated data value. The control circuitry may, in response to an event, flush, to a higher level in a memory hierarchy that includes the cache storage circuitry, both the updated data value and the operation information. This may advantageously smash atomic operations at the cache and reduce operations to the higher-level cache or memory (which may be the actual coherence point for atomic requests).
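One way to picture the "smashing" described in this abstract is a cache that merges successive atomic requests to the same address and forwards a single combined operation on flush. The sketch below is a software analogy only, under several assumptions not taken from the application: the criterion for merging is "same operation as the previous request", the supported operations are add, min, and max, the cache accumulates a combined operand rather than the final value, and the values atomics would normally return to the requester are ignored. The names `SmashingCache` and `COMBINE` are hypothetical.

```python
import operator

# Assumed set of mergeable operations: each is associative, so operands can be
# combined locally and applied once at the higher level.
COMBINE = {"add": operator.add, "min": min, "max": max}


class SmashingCache:
    def __init__(self, backing):
        self.backing = backing  # dict addr -> value; stands in for the higher level
        self.pending = {}       # addr -> (most recent op, combined operand)
        self.flushes = 0        # number of operations sent to the higher level

    def atomic(self, addr, op, operand):
        if op not in COMBINE:
            raise ValueError(f"unsupported op: {op}")
        if addr in self.pending:
            prev_op, acc = self.pending[addr]
            if prev_op == op:
                # Criterion met: merge locally, no traffic to the higher level.
                self.pending[addr] = (op, COMBINE[op](acc, operand))
                return
            # Different operation: flush what has been merged so far.
            self.flush(addr)
        self.pending[addr] = (op, operand)

    def flush(self, addr):
        """Send the merged operand plus its operation to the higher level."""
        if addr not in self.pending:
            return
        op, acc = self.pending.pop(addr)
        self.backing[addr] = COMBINE[op](self.backing.get(addr, 0), acc)
        self.flushes += 1


memory = {0x40: 10}
cache = SmashingCache(memory)
for _ in range(3):
    cache.atomic(0x40, "add", 1)  # three requests merged in the cache
cache.flush(0x40)                 # one operation reaches memory; memory[0x40] is 13
```

Flushing the operation together with the data is what lets the higher level, as the coherence point, apply the merged update correctly; in this sketch that is the `(op, acc)` pair.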
-
Publication No.: US20250004948A1
Publication Date: 2025-01-02
Application No.: US18342509
Filing Date: 2023-06-27
Applicant: Apple Inc.
Inventor: Jedd O. Haberstro, Mladen Wilder
IPC: G06F12/0891, G06F12/0842
Abstract: Techniques are disclosed relating to smashing atomic operations. In some embodiments, cache control circuitry caches data values in cache storage circuitry and receives multiple requests to atomically update a cached data value according to one or more arithmetic operations. The control circuitry may perform updates to a cached data value based on the multiple requests, in response to determining that the one or more arithmetic operations meet one or more criteria, and store operation information that indicates a most-recent requested atomic arithmetic operation for the updated data value. The control circuitry may, in response to an event, flush, to a higher level in a memory hierarchy that includes the cache storage circuitry, both the updated data value and the operation information. This may advantageously smash atomic operations at the cache and reduce operations to the higher-level cache or memory (which may be the actual coherence point for atomic requests).
-
Publication No.: US12182026B1
Publication Date: 2024-12-31
Application No.: US18342509
Filing Date: 2023-06-27
Applicant: Apple Inc.
Inventor: Jedd O. Haberstro, Mladen Wilder
IPC: G06F12/0891, G06F12/0842
Abstract: Techniques are disclosed relating to smashing atomic operations. In some embodiments, cache control circuitry caches data values in cache storage circuitry and receives multiple requests to atomically update a cached data value according to one or more arithmetic operations. The control circuitry may perform updates to a cached data value based on the multiple requests, in response to determining that the one or more arithmetic operations meet one or more criteria, and store operation information that indicates a most-recent requested atomic arithmetic operation for the updated data value. The control circuitry may, in response to an event, flush, to a higher level in a memory hierarchy that includes the cache storage circuitry, both the updated data value and the operation information. This may advantageously smash atomic operations at the cache and reduce operations to the higher-level cache or memory (which may be the actual coherence point for atomic requests).
-
Publication No.: US20230050061A1
Publication Date: 2023-02-16
Application No.: US17399711
Filing Date: 2021-08-11
Applicant: Apple Inc.
Inventor: Andrew M. Havlir, Steven Fishwick, David A. Gotwalt, Benjamin Bowman, Ralph C. Taylor, Melissa L. Velez, Mladen Wilder, Ali Rabbani Rankouhi, Fergus W. MacGarry
Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
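The mapping of logical slots to distributed hardware slots under different distribution rules might look roughly like the following model. Everything specific here is an assumption for illustration: the two rules (`SINGLE` for a small kick placed on one sub-unit, `ALL` for a large kick spread over every sub-unit), the "most free slots" selection heuristic, and the names `Rule` and `SlotMapper` are hypothetical and are not taken from the application.

```python
from enum import Enum


class Rule(Enum):
    SINGLE = 1  # assumed rule for small kicks: use one sub-unit
    ALL = 2     # assumed rule for large kicks: use every sub-unit


class SlotMapper:
    def __init__(self, num_subunits, slots_per_subunit):
        # free[i] lists the free distributed hardware slots of sub-unit i
        self.free = [list(range(slots_per_subunit)) for _ in range(num_subunits)]

    def map(self, rule):
        """Map one logical slot to (sub-unit, hardware slot) pairs.

        Returns None if any required sub-unit has no free hardware slot.
        """
        if rule is Rule.SINGLE:
            # Pick the sub-unit with the most free slots (lowest index on ties).
            target = max(range(len(self.free)), key=lambda i: len(self.free[i]))
            targets = [target]
        else:
            targets = list(range(len(self.free)))
        if any(not self.free[i] for i in targets):
            return None
        return [(i, self.free[i].pop(0)) for i in targets]


mapper = SlotMapper(num_subunits=4, slots_per_subunit=2)
small = mapper.map(Rule.SINGLE)  # one (sub-unit, slot) pair
large = mapper.map(Rule.ALL)     # one pair per sub-unit
```

Keeping the rule per set of work, rather than global, is what allows small and large kicks to coexist without either monopolizing every sub-unit.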
-
Publication No.: US12086644B2
Publication Date: 2024-09-10
Application No.: US17399711
Filing Date: 2021-08-11
Applicant: Apple Inc.
Inventor: Andrew M. Havlir, Steven Fishwick, David A. Gotwalt, Benjamin Bowman, Ralph C. Taylor, Melissa L. Velez, Mladen Wilder, Ali Rabbani Rankouhi, Fergus W. MacGarry
CPC classification number: G06F9/5044, G06F9/4881, G06F9/505, G06T1/20, G06T1/60
Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
-
Publication No.: US20250104181A1
Publication Date: 2025-03-27
Application No.: US18795437
Filing Date: 2024-08-06
Applicant: Apple Inc.
Inventor: Karthik Ramani, Tyson J. Bergland, Leela Kishore Kothamasu, Hongzhou Zhao, Winnie W. Yeung, Mladen Wilder
IPC: G06T1/60, G06F12/0891, G06T15/00
Abstract: Techniques are disclosed relating to data compression in graphics processors. In some embodiments, cache circuitry is coupled to shader processor circuitry and is configured to store graphics data that includes a compressed block of data associated with a surface and metadata for the compressed block of data. Metadata coherence circuitry may cache the metadata for the compressed block of data, receive an indication of a write command for non-compressed data associated with the surface, wherein the write command identifies the metadata and has a different address than the compressed block of data, and determine, based on the metadata and the indication, to invalidate the compressed block of data in the cache circuitry. This may maintain read/write coherence in a cache that stores both compressed and uncompressed data, in some embodiments.
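The invalidation decision in this abstract can be sketched as a lookup keyed by metadata rather than by data address. This is a simplified model under assumptions that are not in the application: the class `MetadataCoherence`, the dictionary layout, and the "compressed" flag are hypothetical, and the real decision logic may consider more state than shown.

```python
class MetadataCoherence:
    """Toy cache holding both compressed and uncompressed data for a surface."""

    def __init__(self):
        self.cache = {}     # address -> cached bytes
        self.metadata = {}  # metadata id -> {"compressed": bool, "block_addr": int}

    def fill_compressed(self, meta_id, block_addr, data):
        """Cache a compressed block and record its metadata."""
        self.cache[block_addr] = data
        self.metadata[meta_id] = {"compressed": True, "block_addr": block_addr}

    def write_uncompressed(self, meta_id, addr, data):
        """Handle a write of non-compressed data that identifies the same metadata.

        The write address differs from the compressed block's address, so an
        address match alone would miss the stale compressed copy.
        """
        meta = self.metadata.get(meta_id)
        if meta and meta["compressed"] and meta["block_addr"] != addr:
            self.cache.pop(meta["block_addr"], None)  # invalidate the stale block
            meta["compressed"] = False
        self.cache[addr] = data


mc = MetadataCoherence()
mc.fill_compressed(meta_id=3, block_addr=0x1000, data=b"compressed")
mc.write_uncompressed(meta_id=3, addr=0x2000, data=b"raw pixels")
stale_present = 0x1000 in mc.cache  # False: the compressed block was invalidated
```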
-