-
Publication No.: US12094048B2
Publication Date: 2024-09-17
Application No.: US17497618
Filing Date: 2021-10-08
Applicant: Intel Corporation
Inventors: Prasoonkumar Surti , Arthur Hunter , Kamal Sinha , Scott Janus , Brent Insko , Vasanth Ranganathan , Lakshminarayanan Striramassarma
CPC Classification: G06T15/005 , G06T1/20 , G06T1/60 , G06T17/20
Abstract: Embodiments are generally directed to multi-tile graphics processor rendering. An embodiment of an apparatus includes a memory for storage of data; and one or more processors including a graphics processing unit (GPU) to process data, wherein the GPU includes a plurality of GPU tiles, wherein, upon geometric data being assigned to each of a plurality of screen tiles, the apparatus is to transfer the geometric data to the plurality of GPU tiles.
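As a rough illustration of the flow this abstract describes (binning geometry to screen tiles, then handing each bin to a GPU tile), here is a minimal Python sketch; the bounding-box binning and the round-robin tile mapping are assumptions for illustration, not details from the patent:

```python
from collections import defaultdict

def assign_to_screen_tiles(triangles, tile_w, tile_h):
    """Bin each triangle into every screen tile its bounding box overlaps."""
    bins = defaultdict(list)
    for tri in triangles:
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]
        for tx in range(int(min(xs)) // tile_w, int(max(xs)) // tile_w + 1):
            for ty in range(int(min(ys)) // tile_h, int(max(ys)) // tile_h + 1):
                bins[(tx, ty)].append(tri)
    return bins

def transfer_to_gpu_tiles(screen_bins, num_gpu_tiles):
    """Map each screen tile's geometry to one of the GPU tiles (simple round-robin)."""
    gpu_work = defaultdict(list)
    for i, (screen_tile, tris) in enumerate(sorted(screen_bins.items())):
        gpu_work[i % num_gpu_tiles].append((screen_tile, tris))
    return gpu_work

# Example: two triangles binned over 64x64-pixel screen tiles, spread across 4 GPU tiles.
tris = [[(10, 10), (50, 10), (10, 50)], [(100, 100), (200, 120), (120, 200)]]
print(transfer_to_gpu_tiles(assign_to_screen_tiles(tris, 64, 64), num_gpu_tiles=4))
```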
-
Publication No.: US20220350751A1
Publication Date: 2022-11-03
Application No.: US17862739
Filing Date: 2022-07-12
Applicant: Intel Corporation
Inventors: Altug Koker , Joydeep Ray , Elmoustapha Ould-Ahmed-Vall , Abhishek Appu , Aravindh Anantaraman , Valentin Andrei , Durgaprasad Bilagi , Varghese George , Brent Insko , Sanjeev Jahagirdar , Scott Janus , Pattabhiraman K , SungYe Kim , Subramaniam Maiyuran , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Xinmin Tian
IPC Classification: G06F12/123 , G06F12/0875 , G06F12/0891 , G06T1/60
Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory is configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
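A minimal Python sketch of the age-based eviction idea in the abstract, assuming a simple model in which a received hint sets how fast untouched lines age; the class and field names (AgingCacheController, aging_step) are hypothetical, not from the patent:

```python
from dataclasses import dataclass

@dataclass
class CacheLine:
    tag: int
    age: int = 0  # aging field; grows each time the line goes untouched

class AgingCacheController:
    """Toy model of an age-based eviction policy whose aging rate can be adjusted by a hint."""

    def __init__(self, capacity, aging_step=1):
        self.capacity = capacity
        self.aging_step = aging_step  # initial aging policy
        self.lines = {}

    def apply_hint(self, level):
        """A received hint/instruction selects how aggressively non-referenced lines age."""
        self.aging_step = level

    def access(self, tag):
        hit = tag in self.lines
        for t, line in self.lines.items():
            line.age = 0 if t == tag else line.age + self.aging_step
        if not hit:
            if len(self.lines) >= self.capacity:
                oldest = max(self.lines.values(), key=lambda l: l.age)
                del self.lines[oldest.tag]
            self.lines[tag] = CacheLine(tag)
        return hit

ctrl = AgingCacheController(capacity=2)
ctrl.apply_hint(level=4)            # hint: age non-referenced lines faster
for t in (0x10, 0x20, 0x10, 0x30):
    ctrl.access(t)
print(sorted(ctrl.lines))           # 0x20 aged fastest and was evicted
```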
-
Publication No.: US20220114108A1
Publication Date: 2022-04-14
Application No.: US17428529
Filing Date: 2020-03-14
Applicant: Intel Corporation
Inventors: Altug Koker , Joydeep Ray , Elmoustapha Ould-Ahmed-Vall , Abhishek Appu , Aravindh Anantaraman , Valentin Andrei , Durgaprasad Bilagi , Varghese George , Brent Insko , Sanjeev Jahagirdar , Scott Janus , Pattabhiraman K. , SungYe Kim , Subramaniam Maiyuran , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Xinmin Tian
IPC Classification: G06F12/123 , G06F12/0891 , G06F12/0875 , G06T1/60
Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received.
-
Publication No.: US11145105B2
Publication Date: 2021-10-12
Application No.: US16355364
Filing Date: 2019-03-15
Applicant: Intel Corporation
Inventors: Prasoonkumar Surti , Arthur Hunter, Jr. , Kamal Sinha , Scott Janus , Brent Insko , Vasanth Ranganathan , Lakshminarayanan Striramassarma
Abstract: Embodiments are generally directed to multi-tile graphics processor rendering. An embodiment of an apparatus includes a memory for storage of data; and one or more processors including a graphics processing unit (GPU) to process data, wherein the GPU includes a plurality of GPU tiles, wherein, upon geometric data being assigned to each of a plurality of screen tiles, the apparatus is to transfer the geometric data to the plurality of GPU tiles.
-
Publication No.: US10388255B2
Publication Date: 2019-08-20
Application No.: US16024153
Filing Date: 2018-06-29
Applicant: Intel Corporation
Inventors: Anshuman Thakur , DongHo Hong , Karthik Veeramani , Arvind Tomar , Brent Insko , Atsuo Kuwahara , Zhengmin Li
Abstract: Computers for supporting multiple virtual reality (VR) display devices and related methods are described herein. An example computer includes a graphics processing unit (GPU) to render frames for a first VR display device and a second VR display device, a memory to store frames rendered by the GPU for the first VR display device and the second VR display device, and a vertical synchronization (VSYNC) scheduler to transmit alternating first and second VSYNC signals to the GPU such that a time period between each of the first or second VSYNC signals and a subsequent one of the first or second VSYNC signals is substantially the same. The GPU is to, based on the first and second VSYNC signals, alternate between rendering a frame for the first VR display device and a frame for the second VR display device.
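To make the scheduling idea concrete, a small Python sketch of evenly interleaved VSYNC events for two headsets follows; the 90 Hz refresh rate, the half-period spacing, and the display names are illustrative assumptions, not values from the patent:

```python
import itertools

def vsync_schedule(refresh_hz, num_events):
    """Emit alternating VSYNC events for two VR headsets, evenly spaced in time.

    With both headsets refreshing at `refresh_hz`, interleaving their VSYNCs half a
    frame apart gives the GPU an equal time slice to render each display's frame.
    """
    frame_period = 1.0 / refresh_hz          # e.g. ~11.1 ms at 90 Hz
    half_period = frame_period / 2.0         # spacing between alternating signals
    displays = itertools.cycle(("HMD-1", "HMD-2"))
    return [(round(i * half_period * 1e3, 2), next(displays)) for i in range(num_events)]

# Each entry: (time in ms, which display's VSYNC fires); the GPU renders for that
# display until the other display's VSYNC arrives.
for event in vsync_schedule(refresh_hz=90, num_events=6):
    print(event)
```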
-
Publication No.: US12124383B2
Publication Date: 2024-10-22
Application No.: US17862739
Filing Date: 2022-07-12
Applicant: Intel Corporation
Inventors: Altug Koker , Joydeep Ray , Elmoustapha Ould-Ahmed-Vall , Abhishek Appu , Aravindh Anantaraman , Valentin Andrei , Durgaprasad Bilagi , Varghese George , Brent Insko , Sanjeev Jahagirdar , Scott Janus , Pattabhiraman K , SungYe Kim , Subramaniam Maiyuran , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Xinmin Tian
IPC Classification: G06F12/00 , G06F12/0875 , G06F12/0891 , G06F12/123 , G06T1/60
CPC Classification: G06F12/123 , G06F12/0875 , G06F12/0891 , G06T1/60 , G06F2212/302
Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory is configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
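This abstract also covers partitioning the cache into regions, so here is a separate Python sketch of per-region eviction with a configurable persistence knob; the CacheRegion class and its "lives" counter are hypothetical stand-ins for whatever mechanism the claims actually cover:

```python
from collections import OrderedDict

class CacheRegion:
    """One region of a partitioned cache with its own eviction-policy knob.

    `persistence` is an assumed knob: higher values let a line survive more
    eviction rounds before it becomes a victim candidate.
    """

    def __init__(self, ways, persistence=0):
        self.ways = ways
        self.persistence = persistence
        self.lines = OrderedDict()  # tag -> remaining "lives"

    def access(self, tag):
        if tag in self.lines:
            self.lines.move_to_end(tag)           # LRU update on a hit
            return True
        if len(self.lines) >= self.ways:
            victim, lives = next(iter(self.lines.items()))
            if lives > 0:                          # persistent line gets another chance
                self.lines[victim] = lives - 1
                self.lines.move_to_end(victim)
                victim = next(iter(self.lines))    # evict the next-oldest line instead
            self.lines.pop(victim)
        self.lines[tag] = self.persistence
        return False

# Region 0 keeps data persistent under eviction pressure; region 1 behaves as plain LRU.
regions = [CacheRegion(ways=2, persistence=2), CacheRegion(ways=2, persistence=0)]
for tag in (1, 2, 3, 4):
    regions[0].access(tag)
print(list(regions[0].lines))  # the persistent policy kept line 1 resident longer
```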
-
Publication No.: US20240256456A1
Publication Date: 2024-08-01
Application No.: US18391346
Filing Date: 2023-12-20
Applicant: Intel Corporation
Inventors: Vikranth Vemulapalli , Lakshminarayanan Striramassarma , Mike MacPherson , Aravindh Anantaraman , Ben Ashbaugh , Murali Ramadoss , William B. Sadler , Jonathan Pearce , Scott Janus , Brent Insko , Vasanth Ranganathan , Kamal Sinha , Arthur Hunter, Jr. , Prasoonkumar Surti , Nicolas Galoppo von Borries , Joydeep Ray , Abhishek R. Appu , ElMoustapha Ould-Ahmed-Vall , Altug Koker , Sungye Kim , Subramaniam Maiyuran , Valentin Andrei
IPC Classification: G06F12/0862 , G06T1/20 , G06T1/60
CPC Classification: G06F12/0862 , G06T1/20 , G06T1/60 , G06F2212/602 , G06F2212/608
Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
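A minimal Python sketch of the hit-rate-gated prefetch decision described above, assuming a running hit counter and a single threshold; the helper names (prefetch_target, L1HitCounter) and the 0.9 threshold are illustrative only:

```python
def prefetch_target(l1_hit_rate, threshold=0.9):
    """Decide where the prefetcher should place prefetched lines.

    If the L1 hit rate is already at or above the threshold, further prefetches
    are limited to the L3 cache; otherwise prefetching into L1 is allowed.
    """
    return "L3" if l1_hit_rate >= threshold else "L1"

class L1HitCounter:
    """Tracks L1 accesses so the prefetcher can measure the running hit rate."""

    def __init__(self):
        self.hits = 0
        self.accesses = 0

    def record(self, hit):
        self.accesses += 1
        self.hits += int(hit)

    @property
    def hit_rate(self):
        return self.hits / self.accesses if self.accesses else 0.0

counter = L1HitCounter()
for hit in (True, True, True, False):      # 75% hit rate in this toy trace
    counter.record(hit)
print(counter.hit_rate, prefetch_target(counter.hit_rate))  # 0.75 L1
```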
-
Publication No.: US20220138895A1
Publication Date: 2022-05-05
Application No.: US17430041
Filing Date: 2020-03-14
Applicant: Intel Corporation
Inventors: Vasanth Ranganathan , Abhishek R. Appu , Ben Ashbaugh , Peter Doyle , Brandon Fliflet , Arthur Hunter , Brent Insko , Scott Janus , Altug Koker , Aditya Navale , Joydeep Ray , Kamal Sinha , Lakshminarayanan Striramassarma , Prasoonkumar Surti , James Valerio
Abstract: Embodiments are generally directed to compute optimization in graphics processing. An embodiment of an apparatus includes one or more processors including a multi-tile graphics processing unit (GPU) to process data, the multi-tile GPU including multiple processor tiles; and a memory for storage of data for processing, wherein the apparatus is to receive compute work for processing by the GPU, partition the compute work into multiple work units, assign each of multiple work units to one of the processor tiles, and process the compute work using the processor tiles assigned to the work units.
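The partitioning step lends itself to a short Python sketch; the even split via divmod and the dictionary work-unit format are assumptions made for illustration, not the patent's scheme:

```python
def partition_compute_work(total_items, num_tiles):
    """Split a compute dispatch into contiguous work units, one per processor tile."""
    base, extra = divmod(total_items, num_tiles)
    units, start = [], 0
    for tile in range(num_tiles):
        count = base + (1 if tile < extra else 0)
        units.append({"tile": tile, "range": (start, start + count)})
        start += count
    return units

def process(units):
    """Stand-in for each tile executing its assigned work unit."""
    return {u["tile"]: u["range"][1] - u["range"][0] for u in units}

# 1000 work items spread over a 4-tile GPU; each tile receives 250 items.
units = partition_compute_work(1000, num_tiles=4)
print(units)
print(process(units))
```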
-
Publication No.: US20210255957A1
Publication Date: 2021-08-19
Application No.: US17161465
Filing Date: 2021-01-28
Applicant: Intel Corporation
Inventors: Vikranth Vemulapalli , Lakshminarayanan Striramassarma , Mike MacPherson , Aravindh Anantaraman , Ben Ashbaugh , Murali Ramadoss , William B. Sadler , Jonathan Pearce , Scott Janus , Brent Insko , Vasanth Ranganathan , Kamal Sinha , Arthur Hunter, Jr. , Prasoonkumar Surti , Nicolas Galoppo von Borries , Joydeep Ray , Abhishek R. Appu , ElMoustapha Ould-Ahmed-Vall , Altug Koker , Sungye Kim , Subramaniam Maiyuran , Valentin Andrei
IPC Classification: G06F12/0862 , G06T1/20 , G06T1/60
Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
-
Publication No.: US20210150663A1
Publication Date: 2021-05-20
Application No.: US17095590
Filing Date: 2020-11-11
Applicant: Intel Corporation
Inventors: Subramaniam Maiyuran , Durgaprasad Bilagi , Joydeep Ray , Scott Janus , Sanjeev Jahagirdar , Brent Insko , Lidong Xu , Abhishek R. Appu , James Holland , Vasanth Ranganathan , Nikos Kaburlasos , Altug Koker , Xinmin Tian , Guei-Yuan Lueh , Changliang Wang
IPC Classification: G06T1/60 , G06T1/20 , G06N5/04 , G06F12/0802
Abstract: Embodiments described herein are generally directed to improvements relating to power, latency, bandwidth and/or performance issues relating to GPU processing/caching. According to one embodiment, a system includes a producer intellectual property (IP) (e.g., a media IP), a compute core (e.g., a GPU or an AI-specific core of the GPU), and a streaming buffer logically interposed between the producer IP and the compute core. The producer IP is operable to consume data from memory and output results to the streaming buffer. The compute core is operable to perform AI inference processing based on data consumed from the streaming buffer and output AI inference processing results to the memory.
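A toy Python sketch of the producer/consumer arrangement in the abstract, using a bounded queue as a stand-in for the on-chip streaming buffer; the thread-based model and the function names are illustrative assumptions, not the described hardware:

```python
import queue
import threading

def producer_ip(stream_buf, frames):
    """Media IP stand-in: consumes frames from memory and pushes results to the streaming buffer."""
    for frame in frames:
        stream_buf.put(f"decoded({frame})")
    stream_buf.put(None)  # end-of-stream marker

def compute_core(stream_buf, results):
    """Compute-core stand-in: runs placeholder AI inference on data pulled from the streaming buffer."""
    while (item := stream_buf.get()) is not None:
        results.append(f"inference({item})")

stream_buf = queue.Queue(maxsize=4)   # bounded buffer standing in for the streaming buffer
results = []                          # stands in for results written back to memory
t = threading.Thread(target=producer_ip, args=(stream_buf, ["frame0", "frame1", "frame2"]))
t.start()
compute_core(stream_buf, results)
t.join()
print(results)
```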