-
Publication No.: US20250077232A1
Publication date: 2025-03-06
Application No.: US18882364
Filing date: 2024-09-11
Applicant: Intel Corporation
Inventor: JAMES VALERIO, VASANTH RANGANATHAN, JOYDEEP RAY, PRADEEP RAMANI
Abstract: A graphics processing device is provided that includes a set of compute units to execute a workload, a cache coupled with the set of compute units, and circuitry coupled with the cache and the set of compute units. The circuitry is configured to, in response to a cache miss for the read from a first cache, broadcast an event within the graphics processor device to identify data associated with the cache miss, receive the event at a second compute unit in the set of compute units, and prefetch the data identified by the event into a second cache that is local to the second compute unit before an attempt to read the instruction or data by the second thread.
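The mechanism in this abstract (a miss in one cache triggers a broadcast, and another compute unit prefetches the same data into its own local cache) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed circuitry: the `ComputeUnit` class, the `connect` helper, and the dictionary-based caches and memory are all hypothetical names introduced here.

```python
class ComputeUnit:
    """Toy model of a compute unit with a private local cache (hypothetical)."""

    def __init__(self, name, memory):
        self.name = name
        self.memory = memory        # shared backing store: address -> value
        self.local_cache = {}       # cache local to this compute unit
        self.peers = []             # other compute units that receive events

    def read(self, address):
        """Return (value, 'hit' or 'miss') for a read by a thread on this unit."""
        if address in self.local_cache:
            return self.local_cache[address], "hit"
        # Cache miss: broadcast an event that identifies the missing data.
        for peer in self.peers:
            peer.on_miss_event(address)
        value = self.memory[address]
        self.local_cache[address] = value
        return value, "miss"

    def on_miss_event(self, address):
        """Prefetch the identified data before a local thread asks for it."""
        if address not in self.local_cache:
            self.local_cache[address] = self.memory[address]


def connect(units):
    """Make every unit a broadcast peer of every other unit (assumed topology)."""
    for unit in units:
        unit.peers = [other for other in units if other is not unit]


memory = {0x40: "kernel constants"}
cu0, cu1 = ComputeUnit("cu0", memory), ComputeUnit("cu1", memory)
connect([cu0, cu1])

first = cu0.read(0x40)    # misses in cu0's cache and broadcasts the event
second = cu1.read(0x40)   # already prefetched into cu1's cache
```

In this model the second compute unit's read is served from its local cache because the event arrived before its thread attempted the read.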
-
Publication No.: US20200005516A1
Publication date: 2020-01-02
Application No.: US16024821
Filing date: 2018-06-30
Applicant: Intel Corporation
Inventor: MICHAEL APODACA, ANKUR SHAH, BEN ASHBAUGH, BRANDON FLIFLET, HEMA NALLURI, PATTABHIRAMAN K, PETER DOYLE, JOSEPH KOSTON, JAMES VALERIO, MURALI RAMADOSS, ALTUG KOKER, ADITYA NAVALE, PRASOONKUMAR SURTI, BALAJI VEMBU
IPC: G06T15/00
Abstract: Apparatus and method for simultaneous command streamers. For example, one embodiment of an apparatus comprises: a plurality of work element queues to store work elements for a plurality of thread contexts, each work element associated with a context descriptor identifying a context storage region in memory; a plurality of command streamers, each command streamer associated with one of the plurality of work element queues, the command streamers to independently submit instructions for execution as specified by the work elements; a thread dispatcher to evaluate the thread contexts including priority values, to tag each instruction with an execution identifier (ID), and to responsively dispatch each instruction including the execution ID in accordance with the thread context; and a plurality of graphics functional units to independently execute each instruction dispatched by the thread dispatcher and to associate each instruction with a thread context based on its execution ID.
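The flow in this abstract (per-streamer work queues, priority-aware dispatch, and execution IDs that let functional units recover the thread context) can be sketched in software. Everything below is an illustrative assumption rather than the patented design: `WorkElement`, `dispatch_all`, the "highest priority head first" policy, and allocating one execution ID per work element are hypothetical choices.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count


@dataclass
class WorkElement:
    context_id: int      # stands in for the context descriptor
    priority: int        # higher value is dispatched first (assumed policy)
    instructions: list


def dispatch_all(queues):
    """Drain one queue per command streamer, tagging instructions with an execution ID.

    Returns (dispatched, context_table), where dispatched is a list of
    (execution_id, instruction) pairs and context_table maps each
    execution ID back to its thread context.
    """
    next_id = count(1)
    dispatched, context_table = [], {}
    while any(queues):
        # Pick the streamer whose head work element has the highest priority.
        queue = max((q for q in queues if q), key=lambda q: q[0].priority)
        element = queue.popleft()
        execution_id = next(next_id)
        context_table[execution_id] = element.context_id
        for instruction in element.instructions:
            dispatched.append((execution_id, instruction))
    return dispatched, context_table


queues = [
    deque([WorkElement(context_id=0, priority=1, instructions=["draw"])]),
    deque([WorkElement(context_id=1, priority=5, instructions=["compute_a", "compute_b"])]),
]
dispatched, context_table = dispatch_all(queues)
```

A functional unit in this model would look up `context_table[execution_id]` to associate an incoming instruction with its thread context.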
-
Publication No.: US20200218539A1
Publication date: 2020-07-09
Application No.: US16243663
Filing date: 2019-01-09
Applicant: Intel Corporation
Inventor: JAMES VALERIO, VASANTH RANGANATHAN, JOYDEEP RAY, PRADEEP RAMANI
Abstract: A graphics processing device comprises a set of compute units to execute multiple threads of a workload, a cache coupled with the set of compute units, and a prefetcher to prefetch instructions associated with the workload. The prefetcher is configured to use a thread dispatch command that is used to dispatch threads to execute a kernel to prefetch instructions, parameters, and/or constants that will be used during execution of the kernel. Prefetch operations for the kernel can then occur concurrently with thread dispatch operations.
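The idea of reusing the thread dispatch command to drive prefetching, so that prefetch and dispatch overlap, can be modeled roughly as follows. This is a sketch under assumptions: the `DispatchCommand` fields, the dictionary-based `memory` and `cache`, and the use of a Python thread pool to stand in for concurrent hardware paths are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class DispatchCommand:
    kernel_start: int        # first address of the kernel's instructions
    kernel_lines: int        # number of instruction lines to prefetch
    param_addresses: tuple   # addresses of parameters and constants
    thread_count: int        # number of threads to dispatch


def prefetch(cmd, memory, cache):
    """Load the kernel's instructions and parameters into the cache."""
    addresses = list(range(cmd.kernel_start, cmd.kernel_start + cmd.kernel_lines))
    addresses += list(cmd.param_addresses)
    for address in addresses:
        cache[address] = memory[address]
    return len(addresses)


def dispatch(cmd):
    """Create the thread identifiers that would execute the kernel."""
    return [f"thread-{i}" for i in range(cmd.thread_count)]


def run(cmd, memory, cache):
    """Start prefetch and thread dispatch from the same command, concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        prefetch_job = pool.submit(prefetch, cmd, memory, cache)
        dispatch_job = pool.submit(dispatch, cmd)
        return prefetch_job.result(), dispatch_job.result()


memory = {address: f"line{address}" for address in range(64)}
cache = {}
cmd = DispatchCommand(kernel_start=8, kernel_lines=4, param_addresses=(40, 41), thread_count=3)
prefetched, threads = run(cmd, memory, cache)
```

Both jobs read the same command, which mirrors the abstract's point that no separate prefetch instruction is needed to know what the kernel will use.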
-
Publication No.: US20230401064A1
Publication date: 2023-12-14
Application No.: US18347964
Filing date: 2023-07-06
Applicant: Intel Corporation
Inventor: JAMES VALERIO, VASANTH RANGANATHAN, JOYDEEP RAY, PRADEEP RAMANI
CPC classification number: G06F9/3802, G06T1/20, G06F13/28
Abstract: A graphics processing device is provided that includes a set of compute units to execute a workload, a cache coupled with the set of compute units, and circuitry coupled with the cache and the set of compute units. The circuitry is configured to, in response to a cache miss for the read from a first cache, broadcast an event within the graphics processor device to identify data associated with the cache miss, receive the event at a second compute unit in the set of compute units, and prefetch the data identified by the event into a second cache that is local to the second compute unit before an attempt to read the instruction or data by the second thread.
-
Publication No.: US20220083339A1
Publication date: 2022-03-17
Application No.: US17509726
Filing date: 2021-10-25
Applicant: Intel Corporation
Inventor: JAMES VALERIO, VASANTH RANGANATHAN, JOYDEEP RAY, PRADEEP RAMANI
Abstract: A graphics processing device comprises a set of compute units to execute multiple threads of a workload, a cache coupled with the set of compute units, and a prefetcher to prefetch instructions associated with the workload. The prefetcher is configured to use a thread dispatch command that is used to dispatch threads to execute a kernel to prefetch instructions, parameters, and/or constants that will be used during execution of the kernel. Prefetch operations for the kernel can then occur concurrently with thread dispatch operations.
-
Publication No.: US20210191868A1
Publication date: 2021-06-24
Application No.: US16724813
Filing date: 2019-12-23
Applicant: Intel Corporation
Inventor: JOYDEEP RAY, VASANTH RANGANATHAN, BEN ASHBAUGH, JAMES VALERIO
IPC: G06F12/0846, G06F12/0837, G06F12/084, G06F9/50, G06F9/38, G06F9/30
Abstract: An apparatus to facilitate partitioning of local memory is disclosed. The apparatus includes a plurality of execution units to execute a plurality of execution threads, a memory coupled to share access between the plurality of execution units and partitioning hardware to partition the memory to be used as a cache and as shared local memory (SLM), wherein the partitioning hardware partitions the memory based on a quantity of the plurality of execution threads executing on the execution units that are active.
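The abstract states only that the split between cache and shared local memory (SLM) depends on the number of active execution threads; it does not give the policy. As one possible illustration, the sketch below assumes each active thread is granted a fixed SLM share and the remainder serves as cache, subject to a minimum cache size. The function name, parameters, and the policy itself are assumptions, not the patented method.

```python
def partition_local_memory(total_kib, active_threads, slm_per_thread_kib, min_cache_kib=0):
    """Split a local memory between SLM and cache based on active thread count.

    Assumed policy: SLM grows linearly with the number of active threads,
    capped so that at least min_cache_kib remains available as cache.
    """
    if active_threads < 0:
        raise ValueError("active_threads must be non-negative")
    if not 0 <= min_cache_kib <= total_kib:
        raise ValueError("min_cache_kib must be between 0 and total_kib")
    requested_slm = active_threads * slm_per_thread_kib
    slm_kib = min(requested_slm, total_kib - min_cache_kib)
    return {"slm_kib": slm_kib, "cache_kib": total_kib - slm_kib}


few_threads = partition_local_memory(128, active_threads=4, slm_per_thread_kib=8, min_cache_kib=16)
many_threads = partition_local_memory(128, active_threads=32, slm_per_thread_kib=8, min_cache_kib=16)
```

With few active threads most of the memory is left to the cache; with many, the SLM share grows until it reaches the cap.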