Patent search ap:("Intel Corporation") AND inv:"Balaji Vembu" Page 25

241.

Granted patent
Extend GPU/CPU coherency to multi-GPU cores (Active)

Publication No.: US10521349B2

Publication date: 2019-12-31

Application No.: US16277267

Filing date: 2019-02-15

Applicant: Intel Corporation

Inventor: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker

IPC: G06F12/0837, G06N3/08, G06N20/00, G06T1/20, G06F12/0815, G06N3/04, G06N3/063

Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.

242.

Granted patent
Graphics engine partitioning mechanism (Active)

Publication No.: US10482562B2

Publication date: 2019-11-19

Application No.: US15493522

Filing date: 2017-04-21

Applicant: Intel Corporation

Inventor: Abhishek R. Appu, Balaji Vembu, Altug Koker, Bryan R. White, David J. Cowperthwaite, Joydeep Ray, Murali Ramadoss

IPC: G06F15/80, G06T1/20, G06F9/50, G06F9/455

Abstract: An apparatus to facilitate partitioning of a graphics device is disclosed. The apparatus includes a plurality of engines and logic to partition the plurality of engines to facilitate independent access to each engine within the plurality of engines.

243.

Patent application
COMPUTE OPTIMIZATIONS FOR NEURAL NETWORKS (Pending, published)

Publication No.: US20190332903A1

Publication date: 2019-10-31

Application No.: US16505012

Filing date: 2019-07-08

Applicant: Intel Corporation

Inventor: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha

IPC: G06K9/66, G06N3/08, G06N3/063, G06N3/04, G06T1/20, G06K9/00, G06F9/38, G06F9/30

Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a bipolar binary weight associated with a neural network and an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a multiplication operation on the multi-bit input based on the bipolar binary weight to generate an intermediate product and the adder is to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.
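The multiply-accumulate step in this abstract can be illustrated with a minimal Python sketch. The names `bipolar_mac` and `bipolar_dot`, and the encoding of the weight as a single bit (1 for +1, 0 for -1), are assumptions made for illustration; the abstract does not specify an encoding.

```python
def bipolar_mac(acc: int, x: int, weight_bit: int) -> int:
    """One multiply-accumulate step with a bipolar binary weight.

    Assumed encoding (not stated in the abstract): weight_bit 1 means +1,
    weight_bit 0 means -1.
    """
    if weight_bit not in (0, 1):
        raise ValueError("weight_bit must be 0 or 1")
    # "Multiplying" by +1 or -1 reduces to keeping or negating the input.
    intermediate_product = x if weight_bit else -x
    # Add the intermediate product to the accumulator and return the update.
    return acc + intermediate_product


def bipolar_dot(inputs, weight_bits) -> int:
    """Accumulate a dot product of multi-bit inputs with bipolar weights."""
    acc = 0
    for x, w in zip(inputs, weight_bits):
        acc = bipolar_mac(acc, x, w)
    return acc
```

For example, `bipolar_dot([3, 5, 2], [1, 0, 1])` accumulates 3 - 5 + 2.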

244.

Granted patent
Memory-based dependency tracking and cache pre-fetch hardware for multi-resolution shading (Active)

Publication No.: US10452552B2

Publication date: 2019-10-22

Application No.: US15488988

Filing date: 2017-04-17

Applicant: Intel Corporation

Inventor: Andrew T. Lauritzen, Gabor Liktor, Tomer Bar-On, Hugues Labbe, John G. Gierach, Joydeep Ray, Travis T. Schluessler, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Abhishek R. Appu, Balaji Vembu, Altug Koker

IPC: G06F12/0862, G06F9/30, G06F12/0875, G06F12/0811, G06F12/0855, G06F9/38, G06T1/20

Abstract: Systems, apparatuses and methods may provide a way to track graphics pipeline operations. More particularly, the systems, apparatuses and methods may provide a way to track operation dependencies between graphics pipeline operations for blocks of pixel samples and stall one or more of the pipeline operations based on the operation dependencies. The systems, apparatuses and methods may further provide cache pre-fetch hardware to monitor processing of blocks of pixel samples and fetch a next block of the pixel samples from the memory into a cache before completion of processing a current block of pixel samples based on one or more of the pipeline operations or a surface state of one or more regions of a screen space.

245.

Granted patent
Efficient multi-context thread distribution (Active)

Publication No.: US10452397B2

Publication date: 2019-10-22

Application No.: US15477022

Filing date: 2017-04-01

Applicant: Intel Corporation

Inventor: Joydeep Ray, Altug Koker, Balaji Vembu, Abhishek R. Appu, Kamal Sinha, Prasoonkumar Surti, Kiran C. Veernapu

IPC: G06F9/30, G06F12/0842, G09G5/393, G06T1/60, G06T1/20, G06F9/50

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to determine a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system, allocate a second number of streaming multiprocessors (SMs) to the respective plurality of contexts, and dispatch threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts. Other embodiments are also disclosed and claimed.
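The scheduling steps in this abstract (count threads per context, allocate streaming multiprocessors to contexts, dispatch each context's threads only to its own SMs) might be sketched as follows. The proportional allocation policy, the round-robin dispatch, and the names `allocate_sms` and `dispatch` are assumptions for illustration; the abstract does not say how the SM count per context is chosen.

```python
def allocate_sms(thread_counts: dict, num_sms: int) -> dict:
    """Split SM ids 0..num_sms-1 among contexts, roughly by thread count.

    Assumed policy: every context gets one SM, then each remaining SM goes
    to the context with the most threads per SM already assigned.
    """
    contexts = list(thread_counts)
    if num_sms < len(contexts):
        raise ValueError("need at least one SM per context")
    counts = {c: 1 for c in contexts}
    for _ in range(num_sms - len(contexts)):
        neediest = max(contexts, key=lambda c: thread_counts[c] / counts[c])
        counts[neediest] += 1
    # Turn per-context SM counts into contiguous, non-overlapping SM id lists.
    allocation, next_sm = {}, 0
    for c in contexts:
        allocation[c] = list(range(next_sm, next_sm + counts[c]))
        next_sm += counts[c]
    return allocation


def dispatch(context, thread_ids, allocation: dict) -> dict:
    """Map each thread to an SM, using only the SMs allocated to its context."""
    sms = allocation[context]
    return {t: sms[i % len(sms)] for i, t in enumerate(thread_ids)}
```

With thread counts of 300 and 100 and four SMs, this policy gives the larger context three SMs and the smaller one a single SM, and no thread is ever dispatched outside its context's allocation.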

246.

Patent application
Graphics Processor With Encrypted Kernels (Pending, published)

Publication No.: US20190296909A1

Publication date: 2019-09-26

Application No.: US16435083

Filing date: 2019-06-07

Applicant: Intel Corporation

Inventor: Balaji Vembu, Vidhya Krishnan, Sandeep S. Sodhi, Scott Janus, Daniel Nemiroff

IPC: H04L9/14, G06F21/74, G06F21/75

Abstract: An embodiment of a graphics apparatus may include a graphics processor including a kernel executor, and a security engine communicatively coupled to the graphics processor. The security engine may be configured to create a kernel security key, encrypt an executable kernel for the kernel executor in accordance with the kernel security key, and share the kernel security key with the graphics processor.

247.

Granted patent
Compute optimizations for neural networks (Active)

Publication No.: US10410098B2

Publication date: 2019-09-10

Application No.: US15494710

Filing date: 2017-04-24

Applicant: Intel Corporation

Inventor: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha

IPC: G06F9/30, G06K9/66, G06K9/00, G06F9/38, G06F9/46, G06T1/20, G06N3/04, G06N3/063, G06N3/08

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including an input value and a quantized weight value associated with a neural network and an arithmetic logic unit including a barrel shifter, an adder, and an accumulator register, wherein to execute the decoded instruction, the barrel shifter is to shift the input value by the quantized weight value to generate a shifted input value and the adder is to add the shifted input value to a value stored in the accumulator register and update the value stored in the accumulator register.
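The shift-and-accumulate step in this abstract can be sketched in Python under one assumption that the abstract does not state: the quantized weight is a non-negative power-of-two exponent, so multiplying by the weight is a left shift by that exponent. The names `shift_accumulate` and `shift_dot` are hypothetical.

```python
def shift_accumulate(acc: int, x: int, shift: int) -> int:
    """One shift-accumulate step.

    Assumption: the quantized weight is the exponent k of a weight 2**k,
    so the barrel shifter computes x << k instead of a full multiply.
    """
    if shift < 0:
        raise ValueError("this sketch only models non-negative shifts")
    shifted_input = x << shift
    # Add the shifted input to the accumulator and return the update.
    return acc + shifted_input


def shift_dot(inputs, shifts) -> int:
    """Accumulate a dot product where every weight is a power of two."""
    acc = 0
    for x, k in zip(inputs, shifts):
        acc = shift_accumulate(acc, x, k)
    return acc
```

For example, `shift_dot([1, 2, 3], [0, 1, 2])` accumulates 1·1 + 2·2 + 3·4.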

248.

Patent application
AVOID CACHE LOOKUP FOR COLD CACHE (Pending, published)

Publication No.: US20190251033A1

Publication date: 2019-08-15

Application No.: US16277114

Filing date: 2019-02-15

Applicant: Intel Corporation

Inventor: Abhishek R. Appu, Altug Koker, Joydeep Ray, Prasoonkumar Surti, Kamal Sinha, Kiran C. Veernapu, Balaji Vembu

IPC: G06F12/0888, G06F13/40, G06T1/60, G06T1/20, G06F13/42

CPC classification number: G06F12/0888, G06F12/0895, G06F13/4022, G06F13/4282, G06F2212/1024, G06F2212/1028, G06F2212/6032, G06F2213/0026, G06T1/20, G06T1/60

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive, in a read/modify/write (RMW) pipeline, a cache access request from a requestor, wherein the cache request comprises a cache set identifier associated with requested data in the cache set, determine whether the cache set associated with the cache set identifier is in an inaccessible invalid state, and in response to a determination that the cache set is in an inaccessible state or an invalid state, to terminate the cache access request. Other embodiments are also disclosed and claimed.
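The check described in this abstract can be modeled as a small software sketch: each cache set carries a state, and a request to a set that is invalid or inaccessible is terminated before any tag lookup is performed. The class `SetStateFilter`, the `SetState` values, and the returned status strings are hypothetical names chosen for illustration, not the patent's terminology beyond the two states it names.

```python
from enum import Enum


class SetState(Enum):
    VALID = "valid"
    INVALID = "invalid"
    INACCESSIBLE = "inaccessible"


class SetStateFilter:
    """Toy cache that skips the lookup for sets known to hold nothing usable."""

    def __init__(self, num_sets: int):
        # A cold cache starts with every set invalid.
        self.state = [SetState.INVALID] * num_sets
        self.data = [dict() for _ in range(num_sets)]
        self.lookups = 0  # counts tag lookups actually performed

    def fill(self, set_id: int, tag: int, value) -> None:
        """Install a line and mark its set valid."""
        self.data[set_id][tag] = value
        self.state[set_id] = SetState.VALID

    def access(self, set_id: int, tag: int):
        """Return (status, value); terminate early for unusable sets."""
        if self.state[set_id] is not SetState.VALID:
            return ("terminated", None)  # no lookup performed
        self.lookups += 1
        if tag in self.data[set_id]:
            return ("hit", self.data[set_id][tag])
        return ("miss", None)
```

In this model, requests to a cold set never increment `lookups`, which is the saving the title refers to.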

249.

Granted patent
Handling pipeline submissions across many compute units (Active)

Publication No.: US10380713B2

Publication date: 2019-08-13

Application No.: US16150012

Filing date: 2018-10-02

Applicant: Intel Corporation

Inventor: Balaji Vembu, Altug Koker, Joydeep Ray

IPC: G06T1/20, G06T15/00

Abstract: One embodiment provides for a general-purpose graphics processing unit comprising multiple processing elements having a single instruction, multiple thread architecture, the multiple processing elements enabled to perform hardware multithreading, wherein execution context for threads to be executed is maintained on-chip during execution, a scheduler to schedule a warp to the multiple processing elements, wherein the warp is a group of parallel threads, the warp includes multiple sub-warps, and threads within the warp diverge at sub-warp granularity, and a logic unit including hardware or firmware logic, the logic unit to group active threads from the warp for execution on the multiple processing elements.
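The last element of this abstract, grouping the active threads of a diverged warp, can be illustrated with a simple compaction sketch. The abstract does not describe the grouping policy; packing active thread ids densely into sub-warp-sized groups, and the name `compact_active_threads`, are assumptions.

```python
def compact_active_threads(active_mask, sub_warp_size: int):
    """Pack the ids of active threads into dense sub-warp-sized groups.

    active_mask[i] is truthy when thread i of the warp is active.
    Assumed policy: keep thread order and fill each group before starting
    the next, so fewer groups need to be issued after divergence.
    """
    if sub_warp_size <= 0:
        raise ValueError("sub_warp_size must be positive")
    active = [tid for tid, is_active in enumerate(active_mask) if is_active]
    return [
        active[i:i + sub_warp_size]
        for i in range(0, len(active), sub_warp_size)
    ]
```

For an 8-thread warp with sub-warps of 4 and active threads 0, 3, and 6, the sketch yields a single group of three threads rather than two partially filled sub-warps.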

250.

Granted patent
Autonomous vehicle advanced sensing and response (Active)

Publication No.: US10332320B2

Publication date: 2019-06-25

Application No.: US15488914

Filing date: 2017-04-17

Applicant: Intel Corporation

Inventor: Barath Lakshamanan, Linda L. Hurd, Ben J. Ashbaugh, Elmoustapha Ould-Ahmed-Vall, Liwei Ma, Jingyi Jin, Justin E. Gottschlich, Chandrasekaran Sakthivel, Michael S. Strickland, Brian T. Lewis, Lindsey Kuper, Altug Koker, Abhishek R. Appu, Prasoonkumar Surti, Joydeep Ray, Balaji Vembu, Javier S. Turek, Naila Farooqui

IPC: G01C22/00, G07C5/00, G05D1/00, G01C21/34, G08G1/01, H04W28/08, G06N20/00, G06F9/50, G08G1/052, G01S19/13, G05D1/02, H04L29/08, H04L12/26

Abstract: One embodiment provides for a computing device within an autonomous vehicle, the compute device comprising a wireless network device to enable a wireless data connection with an autonomous vehicle network, a set of multiple processors including a general-purpose processor and a general-purpose graphics processor, the set of multiple processors to execute a compute manager to manage execution of compute workloads associated with the autonomous vehicle, the compute workload associated with autonomous operations of the autonomous vehicle, and offload logic configured to execute on the set of multiple processors, the offload logic to determine to offload one or more of the compute workloads to one or more autonomous vehicles within range of the wireless network device.
