Patent search ap:("Intel Corporation") AND inv:"Konrad Trifunovic" Page 1

1.

发明授权
Software scoreboard information and synchronization 有权

公开(公告)号：US10360654B1

公开(公告)日：2019-07-23

申请号：US15990328

申请日：2018-05-25

Applicant: Intel Corporation

Inventor： Subramaniam Maiyuran , Supratim Pal , Jorge E. Parra , Chandra S. Gurram , Ashwin J. Shivani , Ashutosh Garg , Brent A. Schwartz , Jorge F. Garcia Pabon , Darin M. Starkey , Shubh B. Shah , Guei-Yuan Lueh , Kaiyu Chen , Konrad Trifunovic , Buqi Cheng , Weiyu Chen

IPC: G06F9/38 , G06F8/41 , G06T1/20 , G06F9/30 , G06T1/60 , G09G5/36 , G06T15/00

Abstract: Embodiments described herein provide a graphics processor in which dependency tracking hardware is simplified via the use of compiler provided software scoreboard information. In one embodiment the shader compiler for shader programs is configured to encode software scoreboard information into each instruction. Dependencies can be evaluated by the shader compiler and provided as scoreboard information with each instruction. The hardware can then use the provided information when scheduling instructions. In one embodiment, a software scoreboard synchronization instruction is provided to facilitate software dependency handling within a shader program. Using software to facilitate software dependency handling and synchronization can simplify hardware design, reducing the area consumed by the hardware. In one embodiment, dependencies can be evaluated by the shader compiler instead of the GPU hardware. The compiler can then insert a software scoreboard sync immediate instruction into compiled program code to manage instruction dependencies and prevent data hazards from occurring.

2.

发明授权
Smart performance of spill fill data transfers in computing environments 有权

公开(公告)号：US10956359B2

公开(公告)日：2021-03-23

申请号：US15678592

申请日：2017-08-16

Applicant: Intel Corporation

Inventor： Marek Targowski , Konrad Trifunovic

IPC: G06F9/30 , G06F15/80 , G06T1/20 , G06N5/02 , G06F9/50

Abstract: A mechanism is described for facilitating smart spill/fill data transfers in computing environments. A method of embodiments, as described herein, includes facilitating dividing a kernel into regions including low pressure regions and high pressure regions, where the low pressure regions are associated with low use of registers hosted by a processor of a computing device, while the high pressure regions are associated with high use of the registers. The method may further include transferring of data between memory and the registers based on at least one of the low pressure regions and the high pressure regions.

3.

发明申请
ACCUMULATOR POOLING MECHANISM 审中-公开

公开(公告)号：US20200320662A1

公开(公告)日：2020-10-08

申请号：US16378047

申请日：2019-04-08

Applicant: Intel Corporation

Inventor： Guei-Yuan Lueh , Subramaniam Maiyuran , Wei-Yu Chen , Konrad Trifunovic , Supratim Pal , Chandra S. Gurram , Jorge E. Parra , Pratik J. Ashar , Tomasz Bujewski

IPC: G06T1/60 , G06F9/30 , G06T1/20

Abstract: A processor is disclosed. The processor includes an execution unit having a register file having one or more banks of registers to store operand values, an accumulator comprising a pool of registers to store operand values determined to cause a conflict at register banks within the register file and cache circuitry to control storage of the operand values determined to cause a conflict at the register banks from the register file to the pool of registers.

4.

发明授权
Software scoreboard information and synchronization 有权

公开(公告)号：US10692170B2

公开(公告)日：2020-06-23

申请号：US16437961

申请日：2019-06-11

Applicant: Intel Corporation

Inventor： Subramaniam Maiyuran , Supratim Pal , Jorge E. Parra , Chandra S. Gurram , Ashwin J. Shivani , Ashutosh Garg , Brent A. Schwartz , Jorge F. Garcia Pabon , Darin M. Starkey , Shubh B. Shah , Guei-Yuan Lueh , Kaiyu Chen , Konrad Trifunovic , Buqi Cheng , Weiyu Chen

IPC: G06F9/38 , G06F8/41 , G06T1/20 , G06F9/30 , G06T1/60 , G09G5/36 , G06T15/00

Abstract: Embodiments described herein provide a graphics processor in which dependency tracking hardware is simplified via the use of compiler provided software scoreboard information. In one embodiment the shader compiler for shader programs is configured to encode software scoreboard information into each instruction. Dependencies can be evaluated by the shader compiler and provided as scoreboard information with each instruction. The hardware can then use the provided information when scheduling instructions. In one embodiment, a software scoreboard synchronization instruction is provided to facilitate software dependency handling within a shader program. Using software to facilitate software dependency handling and synchronization can simplify hardware design, reducing the area consumed by the hardware. In one embodiment, dependencies can be evaluated by the shader compiler instead of the GPU hardware. The compiler can then insert a software scoreboard sync immediate instruction into compiled program code to manage instruction dependencies and prevent data hazards from occurring.

5.

发明申请
SOFTWARE SCOREBOARD INFORMATION AND SYNCHRONIZATION 审中-公开

公开(公告)号：US20190362460A1

公开(公告)日：2019-11-28

申请号：US16437961

申请日：2019-06-11

Applicant: Intel Corporation

Inventor： Subramaniam Maiyuran , Supratim Pal , Jorge E. Parra , Chandra S. Gurram , Ashwin J. Shivani , Ashutosh Garg , Brent A. Schwartz , Jorge F. Garcia Pabon , Darin M. Starkey , Shubh B. Shah , Guei-Yuan Lueh , Kaiyu Chen , Konrad Trifunovic , Buqi Cheng , Weiyu Chen

IPC: G06T1/20 , G06F9/30 , G06F9/38 , G06F8/41

Abstract: Embodiments described herein provide a graphics processor in which dependency tracking hardware is simplified via the use of compiler provided software scoreboard information. In one embodiment the shader compiler for shader programs is configured to encode software scoreboard information into each instruction. Dependencies can be evaluated by the shader compiler and provided as scoreboard information with each instruction. The hardware can then use the provided information when scheduling instructions. In one embodiment, a software scoreboard synchronization instruction is provided to facilitate software dependency handling within a shader program. Using software to facilitate software dependency handling and synchronization can simplify hardware design, reducing the area consumed by the hardware. In one embodiment, dependencies can be evaluated by the shader compiler instead of the GPU hardware. The compiler can then insert a software scoreboard sync immediate instruction into compiled program code to manage instruction dependencies and prevent data hazards from occurring.

6.

发明授权
Instruction and logic for systolic dot product with accumulate 有权

公开(公告)号：US11640297B2

公开(公告)日：2023-05-02

申请号：US17304153

申请日：2021-06-15

Applicant: Intel Corporation

Inventor： Subramaniam Maiyuran , Guei-Yuan Lueh , Supratim Pal , Ashutosh Garg , Chandra S. Gurram , Jorge E. Parra , Junjie Gu , Konrad Trifunovic , Hong Bin Liao , Mike B. MacPherson , Shubh B. Shah , Shubra Marwaha , Stephen Junkins , Timothy R. Bauer , Varghese George , Weiyu Chen

IPC: G06F9/30 , G06T1/20 , G06F9/38

Abstract: Embodiments described herein provided for an instruction and associated logic to enable GPGPU program code to access special purpose hardware logic to accelerate dot product operations. One embodiment provides for a graphics processing unit comprising a fetch unit to fetch an instruction for execution and a decode unit to decode the instruction into a decoded instruction. The decoded instruction is a matrix instruction to cause the graphics processing unit to perform a parallel dot product operation. The GPGPU also includes systolic dot product circuitry to execute the decoded instruction across one or more SIMD lanes using multiple systolic layers, wherein to execute the decoded instruction, a dot product computed at a first systolic layer is to be output to a second systolic layer, wherein each systolic layer includes one or more sets of interconnected multipliers and adders, each set of multipliers and adders to generate a dot product.

7.

发明授权
Instruction and logic for systolic dot product with accumulate 有权

公开(公告)号：US11042370B2

公开(公告)日：2021-06-22

申请号：US15957728

申请日：2018-04-19

Applicant: Intel Corporation

Inventor： Subramaniam Maiyuran , Guei-Yuan Lueh , Supratim Pal , Ashutosh Garg , Chandra S. Gurram , Jorge E. Parra , Junjie Gu , Konrad Trifunovic , Hong Bin Liao , Mike B. Macpherson , Shubh B. Shah , Shubra Marwaha , Stephen Junkins , Timothy R. Bauer , Varghese George , Weiyu Chen

IPC: G06F9/30 , G06T1/20 , G06F9/38

Abstract: Embodiments described herein provided for an instruction and associated logic to enable GPGPU program code to access special purpose hardware logic to accelerate dot product operations. One embodiment provides for a graphics processing unit comprising a fetch unit to fetch an instruction for execution and a decode unit to decode the instruction into a decoded instruction. The decoded instruction is a matrix instruction to cause the graphics processing unit to perform a parallel dot product operation. The GPGPU also includes a systolic dot product unit to execute the decoded instruction across one or more SIMD lanes using multiple systolic layers, wherein to execute the decoded instruction, a dot product computed at a first systolic layer is to be output to a second systolic layer, wherein each systolic layer includes one or more sets of interconnected multipliers and adders, each set of multipliers and adders to generate a dot product.

8.

发明授权
Register sharing mechanism 有权

公开(公告)号：US10983794B2

公开(公告)日：2021-04-20

申请号：US16443285

申请日：2019-06-17

Applicant: Intel Corporation

Inventor： Guei-Yuan Lueh , Subramaniam Maiyuran , Weiyu Chen , Konrad Trifunovic , Supratim Pal , Chandra S. Gurram , Jorge E. Parra , Pratik J. Ashar , Tomasz Bujewski

IPC: G06F9/30 , G06F9/54 , G06F9/48 , G06F12/1009 , G06F9/50

Abstract: An processor to facilitate register sharing is disclosed. The processor includes a plurality of execution units (EUs), each including a General Purpose Register File (GRF) having a plurality of registers; and register sharing hardware to divide the plurality of registers into a first set of registers dedicated for execution of a first set of threads and a second set of registers shared for execution of a second set of threads.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification