Patent search ap:("Intel Corporation") AND inv:"SUBRAMANIAM MAIYURAN" Page 3

21.

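Invention application
SCALAR CORE INTEGRATION (Granted)

Publication (announcement) number: US20210349848A1

Publication (announcement) date: 2021-11-11

Application number: US17321885

Application date: 2021-05-17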

Applicant: Intel Corporation

Inventor: JOYDEEP RAY, ARAVINDH ANANTARAMAN, ABHISHEK R. APPU, ALTUG KOKER, ELMOUSTAPHA OULD-AHMED-VALL, VALENTIN ANDREI, SUBRAMANIAM MAIYURAN, NICOLAS GALOPPO VON BORRIES, VARGHESE GEORGE, MIKE MACPHERSON, BEN ASHBAUGH, MURALI RAMADOSS, VIKRANTH VEMULAPALLI, WILLIAM SADLER, JONATHAN PEARCE, SUNGYE KIM

IPC: G06F15/80, G06F9/30, G06F9/38, G06T15/00

Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.

22.

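Invention application
POSITION-BASED RENDERING APPARATUS AND METHOD FOR MULTI-DIE/GPU GRAPHICS PROCESSING (Granted)

Publication (announcement) number: US20210272349A1

Publication (announcement) date: 2021-09-02

Application number: US17306769

Application date: 2021-05-03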

Applicant: Intel Corporation

Inventor: TRAVIS SCHLUESSLER, ZACK WATERS, MICHAEL APODACA, DANIEL JOHNSTON, JASON SURPRISE, PRASOONKUMAR SURTI, SUBRAMANIAM MAIYURAN, PETER DOYLE, SAURABH SHARMA, ANKUR SHAH, MURALI RAMADOSS

IPC: G06T15/00, G06T15/40, G06T15/80

Abstract: Position-based rendering apparatus and method for multi-die/GPU graphics processing. For example, one embodiment of a method comprises: distributing a plurality of graphics draws to a plurality of graphics processors; performing position-only shading using vertex data associated with tiles of a first draw on a first graphics processor, the first graphics processor responsively generating visibility data for each of the tiles; distributing subsets of the visibility data associated with different subsets of the tiles to different graphics processors; limiting geometry work to be performed on each tile by each graphics processor using the visibility data, each graphics processor to responsively generate rendered tiles; and wherein the rendered tiles are combined to generate a complete image frame.

23.

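Invention application
SINGLE INPUT MULTIPLE DATA PROCESSING MECHANISM (Under examination, published)

Publication (announcement) number: US20200043124A1

Publication (announcement) date: 2020-02-06

Application number: US16543849

Application date: 2019-08-19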

Applicant: Intel Corporation

Inventor: SUBRAMANIAM MAIYURAN, JORGE F. GARCIA PABON, VIKRANTH VEMULAPALLI, CHANDRA S. GURRAM, ADITYA NAVALE, SAURABH SHARMA

IPC: G06T1/20, G06F9/30, G06F9/38

Abstract: A processing apparatus is described. The apparatus includes a graphics processing unit (GPU), including a register file having a plurality of channels to store data and an execution unit to examine data at each of the plurality of channels, read a data value from a first of the plurality of channels upon a determination that each of the plurality of channels has the same data and execute a single input multi data (SIMD) instruction based on the data value.

24.

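Invention application
INSTRUCTION AND LOGIC FOR SYSTOLIC DOT PRODUCT WITH ACCUMULATE (Under examination, published)

Publication (announcement) number: US20190324746A1

Publication (announcement) date: 2019-10-24

Application number: US15957728

Application date: 2018-04-19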

Applicant: Intel Corporation

Inventor: SUBRAMANIAM MAIYURAN, GUEI-YUAN LUEH, SUPRATIM PAL, ASHUTOSH GARG, CHANDRA S. GURRAM, JORGE E. PARRA, JUNJIE GU, KONRAD TRIFUNOVIC, HONG BIN LIAO, MIKE B. MACPHERSON, SHUBH B. SHAH, SHUBRA MARWAHA, STEPHEN JUNKINS, TIMOTHY R. BAUER, VARGHESE GEORGE, WEIYU CHEN

IPC: G06F9/30, G06F9/38, G06T1/20

Abstract: Embodiments described herein provided for an instruction and associated logic to enable GPGPU program code to access special purpose hardware logic to accelerate dot product operations. One embodiment provides for a graphics processing unit comprising a fetch unit to fetch an instruction for execution and a decode unit to decode the instruction into a decoded instruction. The decoded instruction is a matrix instruction to cause the graphics processing unit to perform a parallel dot product operation. The GPGPU also includes a systolic dot product unit to execute the decoded instruction across one or more SIMD lanes using multiple systolic layers, wherein to execute the decoded instruction, a dot product computed at a first systolic layer is to be output to a second systolic layer, wherein each systolic layer includes one or more sets of interconnected multipliers and adders, each set of multipliers and adders to generate a dot product.

25.

Invention application
METHOD AND APPARATUS FOR A HIGH THROUGHPUT RASTERIZER (Under examination, published)

Publication (announcement) number: US20160180585A1

Publication (announcement) date: 2016-06-23

Application number: US14581701

Application date: 2014-12-23

Applicant: INTEL CORPORATION

Inventor: SUBRAMANIAM MAIYURAN, THOMAS A. PIAZZA, JORGE F. GARCIA PABON, SHUBH B. SHAH

IPC: G06T17/10, G06K9/46

CPC classification number: G06K9/4604, G06T15/005, G06T2210/12

Abstract: An apparatus and method are described for a high throughput rasterizer. For example, one embodiment of an apparatus comprises: block selection logic to select a plurality of pixel blocks associated with edges of a primitive, the plurality of pixel blocks selected based on the pixel blocks having samples which are both inside and outside of the primitive; and edge determination logic to analyze samples of the plurality of pixel blocks selected by the block selection logic and responsively generate data identifying each edge of the primitive; and final mask determination logic to combine the data identifying each edge and generate a final mask representing the primitive.

26.

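Invention application
SYSTEMS AND METHODS FOR REDUCING REGISTER BANK CONFLICTS BASED ON SOFTWARE HINT AND HARDWARE THREAD SWITCH (Granted)

Publication (announcement) number: US20220179655A1

Publication (announcement) date: 2022-06-09

Application number: US17502492

Application date: 2021-10-15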

Applicant: Intel Corporation

Inventor: BUQI CHENG, WEI-YU CHEN, GUEI-YUAN LUEH, CHANDRA GURRAM, SUBRAMANIAM MAIYURAN

IPC: G06F9/38, G06F9/30, G06F8/41

Abstract: Mechanisms for reducing register bank conflicts based on software hint and hardware thread switch are disclosed. In some embodiments, an apparatus for thread switching includes a graphics processing unit (GPU) that includes a plurality of register banks to store operands that are assigned at least partially to avoid register bank conflicts. A decoding circuitry checks a thread switching field of a first instruction to be executed by a first thread. The GPU performs a thread switch mechanism to cause a second instruction to be executed by a second thread when the thread switching field of the first instruction is set.

27.

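Invention application
SPARSE MATRIX MULTIPLICATION ACCELERATION MECHANISM (Granted)

Publication (announcement) number: US20220171827A1

Publication (announcement) date: 2022-06-02

Application number: US17527324

Application date: 2021-11-16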

Applicant: Intel Corporation

Inventor: SUBRAMANIAM MAIYURAN, MATHEW NEVIN, JORGE PARRA, ASHUTOSH GARG, SHUBRA MARWAHA, SHUBH SHAH

IPC: G06F17/16, G06F7/487, G06F9/30, G06F13/16

Abstract: An apparatus to facilitate acceleration of matrix multiplication operations. The apparatus comprises a systolic array including matrix multiplication hardware to perform multiply-add operations on received matrix data comprising data from a plurality of input matrices and sparse matrix acceleration hardware to detect zero values in the matrix data and perform one or more optimizations on the matrix data to reduce multiply-add operations to be performed by the matrix multiplication hardware.

28.

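Invention application
SHARING REGISTER FILE USAGE BETWEEN FUSED PROCESSING RESOURCES (Granted)

Publication (announcement) number: US20210089301A1

Publication (announcement) date: 2021-03-25

Application number: US16582406

Application date: 2019-09-25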

Applicant: Intel Corporation

Inventor: SUBRAMANIAM MAIYURAN, VARGHESE GEORGE, JOYDEEP RAY, ASHUTOSH GARG, JORGE PARRA, SHUBH SHAH, SHUBRA MARWAHA

IPC: G06F9/30, G06F17/16, G06F9/50

Abstract: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a shared local memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive an instruction to initiate a matrix multiplication operation, write a first set of matrix data into a first set of registers, and share the first set of matrix data between the first processing resource and the second processing resource for use in the matrix multiplication operation. Other embodiments may be described and claimed.

29.

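Invention application
SPARSE MATRIX MULTIPLICATION ACCELERATION MECHANISM (Granted)

Publication (announcement) number: US20210073318A1

Publication (announcement) date: 2021-03-11

Application number: US16561715

Application date: 2019-09-05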

Applicant: Intel Corporation

Inventor: SUBRAMANIAM MAIYURAN, MATHEW NEVIN, JORGE PARRA, ASHUTOSH GARG, SHUBRA MARWAHA, SHUBH SHAH

IPC: G06F17/16, G06F7/487, G06F9/30, G06F13/16

Abstract: An apparatus to facilitate acceleration of matrix multiplication operations. The apparatus comprises a systolic array including matrix multiplication hardware to perform multiply-add operations on received matrix data comprising data from a plurality of input matrices and sparse matrix acceleration hardware to detect zero values in the matrix data and perform one or more optimizations on the matrix data to reduce multiply-add operations to be performed by the matrix multiplication hardware.

30.

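Invention application
REGISTER SHARING MECHANISM (Under examination, published)

Publication (announcement) number: US20200285471A1

Publication (announcement) date: 2020-09-10

Application number: US16881920

Application date: 2020-05-22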

Applicant: Intel Corporation

Inventor: PRATIK J. ASHAR, SUPRATIM PAL, SUBRAMANIAM MAIYURAN, WEI-YU CHEN, GUEI-YUAN LUEH

IPC: G06F9/30, G06F9/50, G06F8/41

Abstract: An apparatus to facilitate register sharing is disclosed. The apparatus includes one or more processors to generate first machine code having a first General Purpose Register (GRF) per thread ratio, detect an occurrence of one or more spill/fill instructions in the first machine code, and generate second machine code having a second GRF per thread ratio upon a detection of one or more spill/fill instructions in the first machine code, wherein the second GRF per thread ratio is based on a disabling of a first of a plurality of hardware threads.
