-
Publication No.: US20180095785A1
Publication date: 2018-04-05
Application No.: US15281260
Filing date: 2016-09-30
Applicants: Altug Koker , Prasoonkumar Surti , Guei-Yuan Lueh , Subramaniam Maiyuran , Tomas G. Akenine-Moller , David J. Cowperthwaite , Balaji Vembu
Inventors: Altug Koker , Prasoonkumar Surti , Guei-Yuan Lueh , Subramaniam Maiyuran , Tomas G. Akenine-Moller , David J. Cowperthwaite , Balaji Vembu
IPC classification: G06F9/48
CPC classification: G06F9/4831 , G06F9/4881
Abstract: A processing apparatus is described. The apparatus includes a graphics processing unit (GPU), including a thread dispatcher to assign a priority class to each of a plurality of processing threads prior to dispatching the one or more processing threads, a plurality of execution units to process the threads, a shared resource coupled to each of the plurality of execution units, and an arbitration unit to grant access to the shared resource to a first of the plurality of execution units based on the priority class of a thread being executed at the first execution unit.
-
Publication No.: US10649956B2
Publication date: 2020-05-12
Application No.: US15477027
Filing date: 2017-04-01
Applicants: Altug Koker , Prasoonkumar Surti , David Puffer , Subramaniam Maiyuran , Guei-Yuan Lueh , Abhishek R. Appu , Joydeep Ray , Balaji Vembu , Tomer Bar-On , Andrew T. Lauritzen , Hugues Labbe , John G. Gierach , Gabor Liktor
Inventors: Altug Koker , Prasoonkumar Surti , David Puffer , Subramaniam Maiyuran , Guei-Yuan Lueh , Abhishek R. Appu , Joydeep Ray , Balaji Vembu , Tomer Bar-On , Andrew T. Lauritzen , Hugues Labbe , John G. Gierach , Gabor Liktor
IPC classification: G06F16/13 , G06F9/38 , G06F9/30 , G06F16/11 , G06F16/172 , G06F9/46 , G06F12/1036 , G06F12/1045 , G06F12/0831
Abstract: In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively coupled to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units, and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.
-
Publication No.: US20180285374A1
Publication date: 2018-10-04
Application No.: US15477027
Filing date: 2017-04-01
Applicants: Altug Koker , Prasoonkumar Surti , David Puffer , Subramaniam Maiyuran , Guei-Yuan Lueh , Abhishek R. Appu , Joydeep Ray , Balaji Vembu , Tomer Bar-On , Andrew T. Lauritzen , Hugues Labbe , John G. Gierach , Gabor Liktor
Inventors: Altug Koker , Prasoonkumar Surti , David Puffer , Subramaniam Maiyuran , Guei-Yuan Lueh , Abhishek R. Appu , Joydeep Ray , Balaji Vembu , Tomer Bar-On , Andrew T. Lauritzen , Hugues Labbe , John G. Gierach , Gabor Liktor
Abstract: In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively coupled to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units, and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.
-
Publication No.: US20180285158A1
Publication date: 2018-10-04
Application No.: US15477026
Filing date: 2017-04-01
Applicants: Abhishek R. Appu , Altug Koker , Balaji Vembu , Joydeep Ray , Kamal Sinha , Prasoonkumar Surti , Kiran C. Veernapu , Subramaniam Maiyuran , Sanjeev S. Jahagirdar , Eric J. Asperheim , Guei-Yuan Lueh , David Puffer , Wenyin Fu , Nikos Kaburlasos , Bhushan M. Borole , Josh B. Mastronarde , Linda L. Hurd , Travis T. Schluessler , Tomasz Janczak , Abhishek Venkatesh , Kai Xiao , Slawomir Grajewski
Inventors: Abhishek R. Appu , Altug Koker , Balaji Vembu , Joydeep Ray , Kamal Sinha , Prasoonkumar Surti , Kiran C. Veernapu , Subramaniam Maiyuran , Sanjeev S. Jahagirdar , Eric J. Asperheim , Guei-Yuan Lueh , David Puffer , Wenyin Fu , Nikos Kaburlasos , Bhushan M. Borole , Josh B. Mastronarde , Linda L. Hurd , Travis T. Schluessler , Tomasz Janczak , Abhishek Venkatesh , Kai Xiao , Slawomir Grajewski
Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
-
Publication No.: US20180285106A1
Publication date: 2018-10-04
Application No.: US15477033
Filing date: 2017-04-01
Applicants: Abhishek R. Appu , Altug Koker , Joydeep Ray , Kamal Sinha , Kiran C. Veernapu , Subramaniam Maiyuran , Prasoonkumar Surti , Guei-Yuan Lueh , David Puffer , Supratim Pal , Eric J. Hoekstra , Travis T. Schluessler , Linda L. Hurd
Inventors: Abhishek R. Appu , Altug Koker , Joydeep Ray , Kamal Sinha , Kiran C. Veernapu , Subramaniam Maiyuran , Prasoonkumar Surti , Guei-Yuan Lueh , David Puffer , Supratim Pal , Eric J. Hoekstra , Travis T. Schluessler , Linda L. Hurd
Abstract: In an example, an apparatus comprises a plurality of execution units, and a first general register file (GRF) communicatively coupled to the plurality of execution units, wherein the first GRF is shared by the plurality of execution units. Other embodiments are also disclosed and claimed.
-
Publication No.: US20180285120A1
Publication date: 2018-10-04
Application No.: US15477030
Filing date: 2017-04-01
Applicants: Joydeep Ray , Altug Koker , Balaji Vembu , Murali Ramadoss , Guei-Yuan Lueh , James A. Valerio , Prasoonkumar Surti , Abhishek R. Appu , Vasanth Ranganathan , Kalyan Bhairavabhatla , Arthur D. Hunter, JR. , Wei-Yu Chen , Subramaniam M. Maiyuran
Inventors: Joydeep Ray , Altug Koker , Balaji Vembu , Murali Ramadoss , Guei-Yuan Lueh , James A. Valerio , Prasoonkumar Surti , Abhishek R. Appu , Vasanth Ranganathan , Kalyan Bhairavabhatla , Arthur D. Hunter, JR. , Wei-Yu Chen , Subramaniam M. Maiyuran
CPC classification: G09G5/363 , G06F9/461 , G06F12/0811 , G06F12/084 , G06F12/0875 , G06F2212/1024 , G06F2212/1028 , G06F2212/455 , G09G5/001 , G09G2340/02 , G09G2350/00 , G09G2352/00 , G09G2360/08 , G09G2360/121
Abstract: A mechanism is described for facilitating use of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.
-
Publication No.: US20180307487A1
Publication date: 2018-10-25
Application No.: US15493442
Filing date: 2017-04-21
Applicants: Subramaniam M. Maiyuran , Guei-Yuan Lueh , Supratim Pal , Gang Chen , Ananda V. Kommaraju , Joy Chandra , Altug Koker , Prasoonkumar Surti , David Puffer , Hong Bin Liao , Joydeep Ray , Abhishek R. Appu , Ankur N. Shah , Travis T. Schluessler , Jonathan Kennedy , Devan Burke
Inventors: Subramaniam M. Maiyuran , Guei-Yuan Lueh , Supratim Pal , Gang Chen , Ananda V. Kommaraju , Joy Chandra , Altug Koker , Prasoonkumar Surti , David Puffer , Hong Bin Liao , Joydeep Ray , Abhishek R. Appu , Ankur N. Shah , Travis T. Schluessler , Jonathan Kennedy , Devan Burke
CPC classification: G06T1/20
Abstract: An apparatus to facilitate control flow in a graphics processing system is disclosed. The apparatus includes a plurality of execution units to execute single instruction, multiple data (SIMD) instructions and flow control logic to detect a diverging control flow in a plurality of SIMD channels and reduce the execution of the control flow to a subset of the SIMD channels.
-
Publication No.: US20180096446A1
Publication date: 2018-04-05
Application No.: US15281276
Filing date: 2016-09-30
Applicants: Kaiyu Chen , Guei-Yuan Lueh , Subramaniam Maiyuran
Inventors: Kaiyu Chen , Guei-Yuan Lueh , Subramaniam Maiyuran
Abstract: A processing apparatus is described. The apparatus includes a graphics processing unit (GPU), including a plurality of execution units to process graphics context data and a register file having a plurality of registers to store the graphics context data; and register renaming logic to facilitate dynamic renaming of the plurality of registers by logically partitioning the plurality of registers in the register file into a set of fixed registers and a set of shared registers.
-
Publication No.: US20170178384A1
Publication date: 2017-06-22
Application No.: US14976122
Filing date: 2015-12-21
Abstract: Reducing SIMD fragmentation for SIMD execution widths of 32 or even 64 channels in a single hardware thread leads to better EU utilization. Increasing SIMD execution widths to 32 or 64 channels per thread enables handling more vertices, patches, primitives and triangles per EU hardware thread. Modified 3D pipeline shader payloads can handle multiple patches in the case of domain shaders, multiple primitives when the primitive object instance count is greater than one in the case of geometry shaders, and multiple triangles in the case of pixel shaders.
-
Publication No.: US20170178274A1
Publication date: 2017-06-22
Application No.: US14976306
Filing date: 2015-12-21
CPC classification: G06F9/46 , G06F8/41 , G06F9/50 , G06F12/0842 , G06F2209/507 , G06T1/60 , G06T15/005 , G06T17/20
Abstract: To use SIMD lanes efficiently for domain shader execution, domain point data from different domain shader patches may be packed together into a single SIMD thread. To generate an efficient code sequence, each domain point occupies one SIMD lane and all attributes for the domain point reside in their own partition of General Register File (GRF) space. This technique is called the multiple-patch SIMD dispatch mode.