-
Publication No.: US11474924B2
Publication Date: 2022-10-18
Application No.: US16740140
Filing Date: 2020-01-10
Applicant: Apple Inc.
Inventor: Jedd O. Haberstro
Abstract: Systems, methods, and computer readable media to analyze and improve the performance of applications utilizing graphics hardware are described. In general, techniques are disclosed to monitor the run-time performance of various shader programs from multiple applications executing concurrently on a graphics processing unit (GPU) and present a visualization of such performance to a user. More particularly, the GPU performance profiling comprises sampling data from multiple hardware performance counters and shader programs during the execution of the shader programs on the GPU. The hardware counters may be indicative of the status of various performance and/or architectural limitations of the GPU at a given moment in time. By time-correlating the execution of the various shader programs and the responses of the multiple hardware counters, a more instructive visualization may be presented to the user, which may be used, e.g., as an aid in debugging and/or profiling the applications executing on the GPU.
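The time-correlation step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the record types `ShaderInterval` and `CounterSample`, the function `correlate`, and the choice to average samples per interval are all assumptions made for illustration. The sketch assigns each hardware-counter sample to every shader execution interval whose time span contains the sample's timestamp.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class ShaderInterval:
    """Hypothetical record: one shader execution on the GPU."""
    app: str
    shader: str
    start: float
    end: float


@dataclass
class CounterSample:
    """Hypothetical record: one reading of a hardware performance counter."""
    timestamp: float
    counter: str
    value: float


def correlate(intervals, samples):
    """Average each counter over the samples that fall inside each interval.

    Returns {(app, shader, start): {counter_name: mean_value}}.
    Averaging is an assumption; a real profiler may aggregate differently.
    """
    samples = sorted(samples, key=lambda s: s.timestamp)
    times = [s.timestamp for s in samples]
    result = {}
    for iv in intervals:
        # Binary search for the samples with start <= timestamp <= end.
        lo = bisect_left(times, iv.start)
        hi = bisect_right(times, iv.end)
        sums = {}
        for s in samples[lo:hi]:
            total, count = sums.get(s.counter, (0.0, 0))
            sums[s.counter] = (total + s.value, count + 1)
        result[(iv.app, iv.shader, iv.start)] = {
            name: total / count for name, (total, count) in sums.items()
        }
    return result
```

Because intervals from different applications may overlap in time, a single counter sample can contribute to more than one shader, which matches the concurrent-execution scenario the abstract describes.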
-
Publication No.: US20200379864A1
Publication Date: 2020-12-03
Application No.: US16740140
Filing Date: 2020-01-10
Applicant: Apple Inc.
Inventor: Jedd O. Haberstro
Abstract: Systems, methods, and computer readable media to analyze and improve the performance of applications utilizing graphics hardware are described. In general, techniques are disclosed to monitor the run-time performance of various shader programs from multiple applications executing concurrently on a graphics processing unit (GPU) and present a visualization of such performance to a user. More particularly, the GPU performance profiling comprises sampling data from multiple hardware performance counters and shader programs during the execution of the shader programs on the GPU. The hardware counters may be indicative of the status of various performance and/or architectural limitations of the GPU at a given moment in time. By time-correlating the execution of the various shader programs and the responses of the multiple hardware counters, a more instructive visualization may be presented to the user, which may be used, e.g., as an aid in debugging and/or profiling the applications executing on the GPU.
-
Publication No.: US20250086116A1
Publication Date: 2025-03-13
Application No.: US18960603
Filing Date: 2024-11-26
Applicant: Apple Inc.
Inventor: Jedd O. Haberstro, Mladen Wilder
IPC: G06F12/0891, G06F12/0842
Abstract: Techniques are disclosed relating to smashing atomic operations. In some embodiments, cache control circuitry caches data values in cache storage circuitry and receives multiple requests to atomically update a cached data value according to one or more arithmetic operations. The control circuitry may perform updates to a cached data value based on the multiple requests, in response to determining that the one or more arithmetic operations meet one or more criteria, and store operation information that indicates a most-recent requested atomic arithmetic operation for the updated data value. The control circuitry may, in response to an event, flush, to a higher level in a memory hierarchy that includes the cache storage circuitry, both the updated data value and the operation information. This may advantageously smash atomic operations at the cache and reduce operations to the higher-level cache or memory (which may be the actual coherence point for atomic requests).
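One possible reading of the smashing behavior described in the abstract is sketched below as a software model. It is not the claimed circuitry: the classes `Memory` and `SmashingCache`, the set of combinable operations, and the criterion used here (a new request must use the same operation as the pending one) are all assumptions made for illustration. The model merges compatible atomic requests in the cache and forwards a single combined operation, together with its operation type, to the higher level on a flush.

```python
import operator

# Assumption: only associative, commutative operations are treated as combinable.
COMBINABLE = {"add": operator.add, "min": min, "max": max}


class Memory:
    """Hypothetical higher level of the hierarchy (the coherence point)."""

    def __init__(self):
        self.data = {}
        self.atomic_ops = 0  # number of atomic operations that reached this level

    def atomic(self, addr, op, operand):
        self.atomic_ops += 1
        self.data[addr] = COMBINABLE[op](self.data.get(addr, 0), operand)


class SmashingCache:
    """Hypothetical cache model that merges atomic requests per address."""

    def __init__(self, memory):
        self.memory = memory
        self.pending = {}  # addr -> (operation name, combined operand)

    def atomic(self, addr, op, operand):
        if op not in COMBINABLE:
            raise ValueError(f"unsupported operation: {op}")
        entry = self.pending.get(addr)
        if entry is not None and entry[0] == op:
            # Criterion met (same operation): smash into the pending entry.
            self.pending[addr] = (op, COMBINABLE[op](entry[1], operand))
            return
        if entry is not None:
            # Criterion not met: flush the old entry before starting a new one.
            self.flush(addr)
        self.pending[addr] = (op, operand)

    def flush(self, addr):
        """Send the combined operand and its operation type to the higher level."""
        op, operand = self.pending.pop(addr)
        self.memory.atomic(addr, op, operand)

    def flush_all(self):
        for addr in list(self.pending):
            self.flush(addr)
```

In this model, four `add` requests to one address produce a single atomic operation at the higher level once the entry is flushed, which is the reduction in traffic the abstract refers to. What counts as a flush-triggering event, and the exact criteria for combining, are left open here because the abstract does not specify them.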
-
Publication No.: US20250004948A1
Publication Date: 2025-01-02
Application No.: US18342509
Filing Date: 2023-06-27
Applicant: Apple Inc.
Inventor: Jedd O. Haberstro, Mladen Wilder
IPC: G06F12/0891, G06F12/0842
Abstract: Techniques are disclosed relating to smashing atomic operations. In some embodiments, cache control circuitry caches data values in cache storage circuitry and receives multiple requests to atomically update a cached data value according to one or more arithmetic operations. The control circuitry may perform updates to a cached data value based on the multiple requests, in response to determining that the one or more arithmetic operations meet one or more criteria, and store operation information that indicates a most-recent requested atomic arithmetic operation for the updated data value. The control circuitry may, in response to an event, flush, to a higher level in a memory hierarchy that includes the cache storage circuitry, both the updated data value and the operation information. This may advantageously smash atomic operations at the cache and reduce operations to the higher-level cache or memory (which may be the actual coherence point for atomic requests).
-
Publication No.: US12182026B1
Publication Date: 2024-12-31
Application No.: US18342509
Filing Date: 2023-06-27
Applicant: Apple Inc.
Inventor: Jedd O. Haberstro, Mladen Wilder
IPC: G06F12/0891, G06F12/0842
Abstract: Techniques are disclosed relating to smashing atomic operations. In some embodiments, cache control circuitry caches data values in cache storage circuitry and receives multiple requests to atomically update a cached data value according to one or more arithmetic operations. The control circuitry may perform updates to a cached data value based on the multiple requests, in response to determining that the one or more arithmetic operations meet one or more criteria, and store operation information that indicates a most-recent requested atomic arithmetic operation for the updated data value. The control circuitry may, in response to an event, flush, to a higher level in a memory hierarchy that includes the cache storage circuitry, both the updated data value and the operation information. This may advantageously smash atomic operations at the cache and reduce operations to the higher-level cache or memory (which may be the actual coherence point for atomic requests).