Patent search ap:("Intel Corporation") AND inv:"Balaji Vembu" Page 19

181.

Granted invention patent
Scalable I/O virtualization interrupt and scheduling (in force)

Publication number: US12197358B2

Publication date: 2025-01-14

Application number: US18459311

Application date: 2023-08-31

Applicant: Intel Corporation

Inventor: David Puffer , Ankur Shah , Niranjan Cooray , Bryan White , Balaji Vembu , Hema Chand Nalluri , Kritika Bala

IPC: G06F9/48 , G06F13/16 , G06F13/24 , G06T1/20

Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.

182.

Granted invention patent
Handling pipeline submissions across many compute units (in force)

Publication number: US12073489B2

Publication date: 2024-08-27

Application number: US18300052

Application date: 2023-04-13

Applicant: Intel Corporation

Inventor: Balaji Vembu , Altug Koker , Joydeep Ray

IPC: G06T1/20 , G06T15/00

CPC classification number: G06T1/20 , G06T15/005 , G06T2200/04

Abstract: One embodiment provides an apparatus comprising an interconnect fabric comprising a processing cluster including an array of multiprocessors coupled to an interconnect fabric, scheduling circuitry to distribute a plurality of thread groups across the array of multiprocessors, each thread group comprising a plurality of threads. A first multiprocessor of the array of multiprocessors can be assigned to process a first thread group comprising a first plurality of threads including a first thread sub-group and a second thread sub-group. The second thread sub-group has a data dependency on the first thread sub-group and the first multiprocessor includes circuitry to cause threads of the second thread sub-group to sleep until the threads of the first thread sub-group have satisfied the data dependency.

183.

Granted invention patent
Barriers and synchronization for machine learning at autonomous machines (in force)

Publication number: US12001209B2

Publication date: 2024-06-04

Application number: US17750917

Application date: 2022-05-23

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , Balaji Vembu , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Sanjeev Jahagirdar , Vasanth Ranganathan

IPC: G06F9/48 , G05D1/00 , G06F9/52 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084 , G06F9/46 , G06T1/20

CPC classification number: G05D1/0088 , G06F9/4881 , G06F9/522 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06F9/46 , G06T1/20

Abstract: A method of embodiments, as described herein, includes detecting thread groups relating to machine learning associated with one or more processing devices. The method may further include facilitating barrier synchronization of the thread groups across multiple dies such that each thread in a thread group is scheduled across a set of compute elements associated with the multiple dies, where each die represents a processing device of the one or more processing devices, the processing device including a graphics processor.

184.

Granted invention patent
Apparatus and method for managing data bias in a graphics processing architecture (in force)

Publication number: US11847719B2

Publication date: 2023-12-19

Application number: US17695591

Application date: 2022-03-15

Applicant: Intel Corporation

Inventor: Joydeep Ray , Abhishek R. Appu , Altug Koker , Balaji Vembu

IPC: G06F15/16 , G06T1/20 , G06F12/0811 , G06F12/0815 , G06F12/0831 , G06F12/0888 , G06F12/0875 , G06T1/60

CPC classification number: G06T1/20 , G06F12/0811 , G06F12/0815 , G06F12/0831 , G06F12/0875 , G06F12/0888 , G06T1/60 , G06F2212/1024 , G06F2212/302 , G06F2212/455 , G06F2212/621

Abstract: An apparatus and method are described for managing data which is biased towards a processor or a GPU. For example, an apparatus comprises a processor comprising one or more cores, one or more cache levels, and cache coherence controllers to maintain coherent data in the one or more cache levels; a graphics processing unit (GPU) to execute graphics instructions and process graphics data, wherein the GPU and processor cores are to share a virtual address space for accessing a system memory; a GPU memory addressable through the virtual address space shared by the processor cores and GPU; and bias management circuitry to store an indication for whether the data has a processor bias or a GPU bias, wherein if the data has a GPU bias, the data is to be accessed by the GPU without necessarily accessing the processor's cache coherence controllers.

185.

Published invention application
PROGRAMMABLE COARSE GRAINED AND SPARSE MATRIX COMPUTE HARDWARE WITH ADVANCED SCHEDULING (under examination, published)

Publication number: US20230394616A1

Publication date: 2023-12-07

Application number: US18334733

Application date: 2023-06-14

Applicant: Intel Corporation

Inventor: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao

IPC: G06T1/20 , G06N3/063 , G06F9/38 , G06F9/30 , G06N3/084 , G06N3/044 , G06N3/045 , G06N3/04 , G06N3/08

CPC classification number: G06T1/20 , G06N3/063 , G06F9/3887 , G06F9/3895 , G06F9/3001 , G06F9/3851 , G06F9/3017 , G06N3/084 , G06N3/044 , G06N3/045 , G06N3/04 , G06N3/08

Abstract: One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.

186.

Granted invention patent
Autonomous vehicle advanced sensing and response (in force)

Publication number: US11810405B2

Publication date: 2023-11-07

Application number: US17539083

Application date: 2021-11-30

Applicant: Intel Corporation

Inventor: Barath Lakshamanan , Linda L. Hurd , Ben J. Ashbaugh , Elmoustapha Ould-Ahmed-Vall , Liwei Ma , Jingyi Jin , Justin E. Gottschlich , Chandrasekaran Sakthivel , Michael S. Strickland , Brian T. Lewis , Lindsey Kuper , Altug Koker , Abhishek R. Appu , Prasoonkumar Surti , Joydeep Ray , Balaji Vembu , Javier S. Turek , Naila Farooqui

IPC: G01C22/00 , G07C5/00 , G05D1/00 , G08G1/01 , H04L67/12 , G06N20/00 , G06F9/50 , G01C21/34 , B60W30/00 , G06N3/063 , G06N3/084 , G06N20/10 , G06N3/044 , G06N3/045 , G08G1/052 , G01S19/13 , H04L43/0852 , G05D1/02 , H04L43/16

CPC classification number: G07C5/008 , B60W30/00 , G01C21/34 , G01C21/3415 , G01C21/3492 , G05D1/0088 , G06F9/5027 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06N20/00 , G06N20/10 , G08G1/012 , H04L67/12 , G01S19/13 , G05D1/0257 , G05D2201/0213 , G06F2209/509 , G08G1/0112 , G08G1/052 , H04L43/0852 , H04L43/16

Abstract: An autonomous vehicle is provided that includes one or more processors configured to provide a local compute manager to manage execution of compute workloads associated with the autonomous vehicle. The local compute manager can perform various compute operations, including receiving offload of compute operations from other compute nodes and offloading compute operations to other compute nodes, where the other compute nodes can be other autonomous vehicles. The local compute manager can also facilitate autonomous navigation functionality.

187.

Granted invention patent
Coordination and increased utilization of graphics processors during inference (in force)

Publication number: US11748841B2

Publication date: 2023-09-05

Application number: US17871781

Application date: 2022-07-22

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Altug Koker , John C. Weast , Mike B. Macpherson , Linda L. Hurd , Sara S. Baghsorkhi , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Liwei Ma , Elmoustapha Ould-Ahmed-Vall , Kamal Sinha , Joydeep Ray , Balaji Vembu , Sanjeev Jahagirdar , Vasanth Ranganathan , Dukhwan Kim

IPC: G06T1/20 , G06N3/063 , G06F9/46 , G06N3/045 , G06N3/08 , G06N3/084 , G06N3/044

CPC classification number: G06T1/20 , G06F9/46 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/044 , G06N3/084

Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning. A method of embodiments, as described herein, includes limiting execution of workloads for the respective contexts of a plurality of contexts to a specified subset of a plurality of processing resources of a processing system according to physical resource slices of the processing system that are associated with the respective contexts of the plurality of contexts.

188.

Granted invention patent
Dynamic precision for neural network compute operations (in force)

Publication number: US11748606B2

Publication date: 2023-09-05

Application number: US17317857

Application date: 2021-05-11

Applicant: INTEL CORPORATION

Inventor: Kamal Sinha , Balaji Vembu , Eriko Nurvitadhi , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Farshad Akhbari , Narayan Srinivasa , Feng Chen , Dukhwan Kim , Nadathur Rajagopalan Satish , John C. Weast , Mike B. MacPherson , Linda L. Hurd , Vasanth Ranganathan , Sanjeev S. Jahagirdar

IPC: G06F7/50 , G06N3/063 , G06N3/08 , G06N3/04 , G06T1/20 , G06F9/30 , G06T15/00 , G06F15/78 , G06F15/76 , G06F1/3287 , G06F1/3293 , G06N3/084 , G06N3/044 , G06N3/045 , G06T1/60

CPC classification number: G06N3/063 , G06F1/3287 , G06F1/3293 , G06F9/30014 , G06F9/30036 , G06F15/76 , G06F15/78 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/08 , G06N3/084 , G06T1/20 , G06T15/005 , G06T1/60

Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.

189.

Granted invention patent
Order independent asynchronous compute and streaming for graphics (in force)

Publication number: US11688122B2

Publication date: 2023-06-27

Application number: US17591166

Application date: 2022-02-02

Applicant: Intel Corporation

Inventor: Devan Burke , Adam T. Lake , Jeffery S. Boles , John H. Feit , Karthik Vaidyanathan , Abhishek R. Appu , Joydeep Ray , Subramaniam Maiyuran , Altug Koker , Balaji Vembu , Murali Ramadoss , Prasoonkumar Surti , Eric J. Hoekstra , Gabor Liktor , Jonathan Kennedy , Slawomir Grajewski , Elmoustapha Ould-Ahmed-Vall

IPC: G06T15/00 , G06F9/48 , G06T15/04 , G06T15/80 , G06T17/10 , G06T17/20

CPC classification number: G06T15/005 , G06F9/4881 , G06T15/04 , G06T15/80 , G06T17/10 , G06T17/20

Abstract: An embodiment of an electronic processing system may include an application processor, persistent storage media communicatively coupled to the application processor, and a graphics subsystem communicatively coupled to the application processor. The system may include one or more of a draw call re-orderer communicatively coupled to the application processor and the graphics subsystem to re-order two or more draw calls, a workload re-orderer communicatively coupled to the application processor and the graphics subsystem to re-order two or more work items in an order independent mode, a queue primitive included in at least one of the two or more draw calls to define a producer stage and a consumer stage, and an order-independent executor communicatively coupled to the application processor and the graphics subsystem to provide tile-based order independent execution of a compute stage. Other embodiments are disclosed and claimed.

190.

Invention application
MACHINE LEARNING SPARSE COMPUTATION MECHANISM (in force)

Publication number: US20230040631A1

Publication date: 2023-02-09

Application number: US17881720

Application date: 2022-08-05

Applicant: Intel Corporation

Inventor: Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Nicolas C. Galoppo Von Borries

IPC: G06T1/20 , G06F9/30 , G06F9/38 , G06F12/0811 , G06F12/0815 , G06F12/0831 , G06F12/0888 , H03M7/30 , G06K9/62 , G06N20/00 , G06F12/02 , G06F9/48 , G06F17/16 , G06N3/04 , G06N3/08 , G06T1/60 , G06T15/00

Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.
