Patent search ap:("Intel Corporation") AND inv:"Balaji Vembu" Page 26

251.

发明申请
PROGRAMMABLE COARSE GRAINED AND SPARSE MATRIX COMPUTE HARDWARE WITH ADVANCED SCHEDULING 审中-公开

公开(公告)号：US20190139182A1

公开(公告)日：2019-05-09

申请号：US16197783

申请日：2018-11-21

Applicant: Intel Corporation

Inventor： Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao

IPC: G06T1/20 , G06N3/08 , G06N3/04

CPC classification number: G06T1/20 , G06F9/3001 , G06F9/3017 , G06F9/3851 , G06F9/3887 , G06F9/3895 , G06N3/04 , G06N3/0445 , G06N3/0454 , G06N3/063 , G06N3/08 , G06N3/084

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex machine learning compute operation.

252.

发明授权
Apparatus and method for managing data bias in a graphics processing architecture 有权

公开(公告)号：US10282811B2

公开(公告)日：2019-05-07

申请号：US15482685

申请日：2017-04-07

Applicant: Intel Corporation

Inventor： Joydeep Ray , Abhishek R. Appu , Altug Koker , Balaji Vembu

IPC: G09G5/36 , G06T1/20 , G06F12/0811 , G06F12/0815 , G06F12/0831 , G06F12/0888 , G06F12/0875 , G06T1/60

Abstract: An apparatus and method are described for managing data which is biased towards a processor or a GPU. For example, one embodiment of an apparatus comprises: a processor comprising one or more cores to execute instructions and process data, one or more cache levels, and cache coherence controllers to maintain coherent data in the one or more cache levels; a graphics processing unit (GPU) to execute graphics instructions and process graphics data, wherein the GPU and processor cores are to share a virtual address space for accessing a system memory; a GPU memory coupled to the GPU, the GPU memory addressable through the virtual address space shared by the processor cores and GPU; and bias management circuitry to store an indication, for each of a plurality of blocks of data, whether the data has a processor bias or a GPU bias, wherein if the data has a GPU bias, then the data is to be accessed by the GPU from the GPU memory without necessarily accessing the processor's cache coherence controllers and wherein requests for the data from the processor cores are processed as uncached requests, preventing the data from being cached in the one or more cache levels of the processor.

253.

发明授权
High bandwidth connection between processor dies 有权

公开(公告)号：US10255109B2

公开(公告)日：2019-04-09

申请号：US15489062

申请日：2017-04-17

Applicant: Intel Corporation

Inventor： Altug Koker , Abhishek R. Appu , Kiran C. Veernapu , Joydeep Ray , Balaji Vembu

IPC: G06F9/50 , G06T1/20

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive a completion acknowledgment from the plurality of graphics processing units and in response to a determination that the workload is finished, to terminate one or more communication connections on the interconnect bridge. Other embodiments are also disclosed and claimed.

254.

发明授权
Position only shader context submission through a render command streamer 有权

公开(公告)号：US10210655B2

公开(公告)日：2019-02-19

申请号：US14865933

申请日：2015-09-25

Applicant: Intel Corporation

Inventor： Murali Ramadoss , Balaji Vembu , Hema C. Nalluri , Michael Apodaca , Jeffery S. Boles

IPC: G06T15/80 , G06T1/20 , G06F9/48

Abstract: By scheduling/managing workload submission to a position only shading pipe one can exploit parallelism with minimum impact to the software scheduler in some embodiments. An interface submits workloads to a slave engine running in one parallel pipe to assist a main engine running in another parallel pipe. Command sequences for each parallel pipe are separated to enable the slave engine to run ahead of the main engine. The slave engine is a position only shader and the main engine is a render engine.

255.

发明申请
COMPUTE CLUSTER PREEMPTION WITHIN A GENERAL-PURPOSE GRAPHICS PROCESSING UNIT 审中-公开

公开(公告)号：US20180308209A1

公开(公告)日：2018-10-25

申请号：US16010692

申请日：2018-06-18

Applicant: Intel Corporation

Inventor： Murali Ramadoss , Balaji Vembu , Eric C. Samson , Kun Tian , David J. Cowperthwaite , Altug Koker , Zhi Wang , Joydeep Ray , Subramaniam M. Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06F9/48 , G06F9/50

Abstract: Embodiments described herein provide techniques enable a compute unit to continue processing operations when all dispatched threads are blocked. One embodiment provides for an apparatus comprising a thread dispatcher to dispatch a thread for execution; a compute unit having a single instruction, multiple thread architecture, the compute unit to execute multiple concurrent threads; and a memory coupled with the compute unit, the memory to store thread state for a suspended thread, wherein the compute unit is to: detect that all threads on the compute unit are blocked from execution, select a victim thread from the multiple concurrent threads, suspend the victim thread, store thread state of the victim thread to the memory, and replace the victim thread with an additional thread to be executed.

256.

发明申请
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS 审中-公开

公开(公告)号：US20180308208A1

公开(公告)日：2018-10-25

申请号：US15819093

申请日：2017-11-21

Applicant: Intel Corporation

Inventor： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06T1/60 , G06F17/16

CPC classification number: G06T1/20 , G06F8/41 , G06F9/45533 , G06F9/5061 , G06F9/5094 , G06F17/16 , G06F2009/45583 , G06T1/60

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of execution units (EUs), wherein the plurality of EUs comprise a first EU type and a second EU type

257.

发明申请
REDUCE POWER BY FRAME SKIPPING 审中-公开

公开(公告)号：US20180308205A1

公开(公告)日：2018-10-25

申请号：US15495956

申请日：2017-04-24

Applicant: Intel Corporation

Inventor： Balaji Vembu , Nikos Kaburlasos , Josh B. Mastronarde

IPC: G06T1/20 , G06T1/60 , G06F1/32

CPC classification number: G06T1/20 , G06F1/3203 , G06F1/3231 , G06F1/3234 , G06F1/3265 , G06T1/60 , G09G2330/021 , G09G2340/0435 , G09G2354/00

Abstract: In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive an input from one or more detectors proximate a display to present an output from a graphics pipeline, determine that a user is not interacting with the display, and in response to a determination that the user is not interacting with the display, to reduce a frame rendering rate of the graphics pipeline. Other embodiments are also disclosed and claimed.

258.

发明申请
DYNAMIC PRECISION FOR NEURAL NETWORK COMPUTE OPERATIONS 审中-公开

公开(公告)号：US20180307971A1

公开(公告)日：2018-10-25

申请号：US15495020

申请日：2017-04-24

Applicant: Intel Corporation

Inventor： Kamal Sinha , Balaji Vembu , Eriko Nurvitadhi , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Farshad Akhbari , Narayan Srinivasa , Feng Chen , Dukhwan Kim , Nadathur Rajagopalan Satish , John C. Weast , Mike B. MacPherson , Linda L. Hurd , Vasanth Ranganathan , Sanjeev S. Jahagirdar

IPC: G06N3/063 , G06N3/08 , G06N3/04 , G06T1/20

CPC classification number: G06N3/063 , G06F1/3287 , G06F1/3293 , G06F9/30014 , G06F9/30036 , G06F15/78 , G06N3/04 , G06N3/08 , G06T1/20 , G06T1/60 , G06T15/005

Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.

259.

发明申请
COMPUTE OPTIMIZATIONS FOR NEURAL NETWORKS 审中-公开

公开(公告)号：US20180307950A1

公开(公告)日：2018-10-25

申请号：US15494710

申请日：2017-04-24

Applicant: Intel Corporation

Inventor： Kevin Nealis , Anbang Yao , Xiaoming Chen , Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha

IPC: G06K9/66 , G06N3/04 , G06N3/08 , G06K9/00

CPC classification number: G06K9/66 , G06F9/3001 , G06F9/3851 , G06F9/3887 , G06F9/3893 , G06F2207/4824 , G06K9/00973 , G06N3/0445 , G06N3/0454 , G06N3/063 , G06N3/084 , G06T1/20

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including an input value and a quantized weight value associated with a neural network and an arithmetic logic unit including a barrel shifter, an adder, and an accumulator register, wherein to execute the decoded instruction, the barrel shifter is to shift the input value by the quantized weight value to generate a shifted input value and the adder is to add the shifted input value to a value stored in the accumulator register and update the value stored in the accumulator register.

260.

发明申请
MEMORY-BASED SOFTWARE BARRIERS 审中-公开

公开(公告)号：US20180307529A1

公开(公告)日：2018-10-25

申请号：US15493670

申请日：2017-04-21

Applicant: Intel Corporation

Inventor： Altug Koker , Joydeep Ray , Balaji Vembu , James A. Valerio , Abhishek R. Appu

IPC: G06F9/48 , G06T1/20 , G06T1/60

CPC classification number: G06F9/4881 , G06T1/60 , G06T2210/52

Abstract: A mechanism is described for facilitating memory-based software barriers to emulate hardware barriers at graphics processors in computing devices. A method of embodiments, as described herein, includes facilitating converting thread scheduling at a processor from hardware barriers to software barriers, where the software barriers emulate the hardware barriers.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification