Patent search ap:("Intel Corporation") AND inv:"Balaji Vembu" Page 9

81.

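Granted invention patent
Offloading touch processing to a graphics processor (legal status: in force)

Publication (announcement) number: US08896560B2

Publication (announcement) date: 2014-11-25

Application number: US13785098

Application date: 2013-03-05

Applicant: Intel Corporation

Inventor: Balaji Vembu , David I. Poisner , Arvind Kumar , Chaitanya R. Gandra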

IPC: G06T1/20

CPC classification number: G06T1/20

Abstract: In an embodiment, a processor includes a graphics domain including a plurality of graphics engines each having at least one execution unit. The graphics domain is to schedule a touch application offloaded from a core domain to at least one of the plurality of graphics engines. The touch application is to execute responsive to an update to a doorbell location in a system memory coupled to the processor, where the doorbell location is written responsive to a user input to the touch input device. Other embodiments are described and claimed.

82.

Granted invention patent
Compute optimization mechanism for deep neural networks (legal status: in force)

Publication (announcement) number: US12198221B2

Publication (announcement) date: 2025-01-14

Application number: US18436494

Application date: 2024-02-08

Applicant: Intel Corporation

Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06F8/41 , G06F9/455 , G06F9/50 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084

Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.

83.

Published invention application
BARRIERS AND SYNCHRONIZATION FOR MACHINE LEARNING AT AUTONOMOUS MACHINES (legal status: under examination, published)

Publication (announcement) number: US20240280987A1

Publication (announcement) date: 2024-08-22

Application number: US18595649

Application date: 2024-03-05

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , Balaji Vembu , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Sanjeev Jahagirdar , Vasanth Ranganathan

IPC: G05D1/00 , G06F9/46 , G06F9/48 , G06F9/52 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06T1/20

CPC classification number: G05D1/0088 , G06F9/4881 , G06F9/522 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06F9/46 , G06T1/20

Abstract: A mechanism is described for facilitating barriers and synchronization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting thread groups relating to machine learning associated with one or more processing devices. The method may further include facilitating barrier synchronization of the thread groups across multiple dies such that each thread in a thread group is scheduled across a set of compute elements associated with the multiple dies, where each die represents a processing device of the one or more processing devices, the processing device including a graphics processor.

84.

Published invention application
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING (legal status: under examination, published)

Publication (announcement) number: US20240184572A1

Publication (announcement) date: 2024-06-06

Application number: US18528340

Application date: 2023-12-04

Applicant: Intel Corporation

Inventor: Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar

IPC: G06F9/30 , G06F7/483 , G06F7/544 , G06F9/38 , G06F17/16 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N20/00 , G06T15/00 , G09G5/393

CPC classification number: G06F9/3001 , G06F7/483 , G06F7/5443 , G06F9/30014 , G06F9/30036 , G06F9/3851 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G09G5/393 , G06F9/30025 , G06F9/3013 , G06F17/16 , G06F2207/3824 , G06N20/00 , G06T15/005

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.
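
The abstract above describes a two-dimensional matrix multiply and accumulate in which products of 16-bit operands feed a 32-bit sum. A minimal software sketch of that arithmetic follows, assuming signed integer operands and wrap-around on 32-bit overflow; neither assumption comes from the record, and the names `matmul_accumulate` and `wrap_int32` are hypothetical. It illustrates the data widths only and is not the claimed hardware.

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def wrap_int32(x: int) -> int:
    """Wrap an integer into the signed 32-bit range (assumed overflow behavior)."""
    return ((x + (1 << 31)) % (1 << 32)) - (1 << 31)


def matmul_accumulate(a, b, c):
    """Return a @ b + c with 16-bit operands in a, b and 32-bit sums in the result."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [list(row) for row in c]  # start each accumulator from c
    for i in range(rows):
        if len(a[i]) != inner:
            raise ValueError("inner dimensions of a and b do not match")
        for j in range(cols):
            acc = out[i][j]
            for k in range(inner):
                x, y = a[i][k], b[k][j]
                if not (INT16_MIN <= x <= INT16_MAX and INT16_MIN <= y <= INT16_MAX):
                    raise ValueError("operands must fit in 16 bits")
                # The intermediate product of two 16-bit values fits in 32 bits;
                # only the running sum can overflow and is wrapped.
                acc = wrap_int32(acc + x * y)
            out[i][j] = acc
    return out


result = matmul_accumulate([[1, 2], [3, 4]], [[5, 6], [7, 8]], [[1, 0], [0, 1]])
```

Accumulating in a wider type than the operands is what keeps long dot products from overflowing after only a few terms.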

85.

Published invention application
Regional Adjustment of Render Rate (legal status: under examination, published)

Publication (announcement) number: US20240086138A1

Publication (announcement) date: 2024-03-14

Application number: US18474361

Application date: 2023-09-26

Applicant: Intel Corporation

Inventor: Eric J. Asperheim , Subramaniam Maiyuran , Kiran C. Veernapu , Sanjeev S. Jahagirdar , Balaji Vembu , Devan Burke , Philip R. Laws , Kamal Sinha , Abhishek R. Appu , Elmoustapha Ould-Ahmed-Vall , Peter L. Doyle , Joydeep Ray , Travis T. Schluessler , John H. Feit , Nikos Kaburlasos , Jacek Kwiatkowski , Altug Koker

IPC: G06F3/14 , G06F3/01 , G06F3/0484 , G09G5/391

CPC classification number: G06F3/1438 , G06F3/013 , G06F3/0484 , G09G5/391 , G09G5/001 , G09G2340/0435 , G09G2352/00 , G09G2354/00 , G09G2360/08 , G09G2360/121

Abstract: In accordance with some embodiments, the render rate is varied across and/or up and down the display screen. This may be done based on where the user is looking in order to reduce power consumption and/or increase performance. Specifically the screen display is separated into regions, such as quadrants. Each of these regions is rendered at a rate determined by at least one of what the user is currently looking at, what the user has looked at in the past and/or what it is predicted that the user will look at next. Areas of less focus may be rendered at a lower rate, reducing power consumption in some embodiments.

86.

Published invention application
COORDINATION AND INCREASED UTILIZATION OF GRAPHICS PROCESSORS DURING INFERENCE (legal status: under examination, published)

Publication (announcement) number: US20240013337A1

Publication (announcement) date: 2024-01-11

Application number: US18351898

Application date: 2023-07-13

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Altug Koker , John C. Weast , Mike B. Macpherson , Linda L. Hurd , Sara S. Baghsorkhi , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Liwei Ma , Elmoustapha Ould-Ahmed-Vall , Kamal Sinha , Joydeep Ray , Balaji Vembu , Sanjeev Jahagirdar , Vasanth Ranganathan , Dukhwan Kim

IPC: G06T1/20 , G06F9/46 , G06N3/063 , G06N3/045 , G06N3/08

CPC classification number: G06T1/20 , G06F9/46 , G06N3/063 , G06N3/045 , G06N3/08 , G06N3/084

Abstract: A mechanism is described for detecting, at training time, information related to one or more tasks to be performed by the one or more processors according to a training dataset for a neural network, analyzing the information to determine one or more portions of hardware of a processor of the one or more processors that is configurable to support the one or more tasks, configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks, and monitoring utilization of the hardware via a hardware unit of the graphics processor and, via a scheduler of the graphics processor, adjusting allocation of the one or more tasks to the one or more portions of the hardware based on the utilization.

87.

Published invention application
DYNAMIC PRECISION FOR NEURAL NETWORK COMPUTE OPERATIONS (legal status: under examination, published)

Publication (announcement) number: US20240005136A1

Publication (announcement) date: 2024-01-04

Application number: US18351124

Application date: 2023-07-12

Applicant: Intel Corporation

Inventor: Kamal Sinha , Balaji Vembu , Eriko Nurvitadhi , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Farshad Akhbari , Narayan Srinivasa , Feng Chen , Dukhwan Kim , Nadathur Rajagopalan Satish , John C. Weast , Mike B. MacPherson , Linda L. Hurd , Vasanth Ranganathan , Sanjeev Jahagirdar

IPC: G06N3/063 , G06N3/08 , G06N3/04 , G06T1/20 , G06F9/30 , G06T15/00 , G06F15/78 , G06F15/76 , G06F1/3287 , G06F1/3293 , G06N3/084 , G06N3/044 , G06N3/045

CPC classification number: G06N3/063 , G06N3/08 , G06N3/04 , G06T1/20 , G06F9/30014 , G06T15/005 , G06F15/78 , G06F15/76 , G06F9/30036 , G06F1/3287 , G06F1/3293 , G06N3/084 , G06N3/044 , G06N3/045 , G06T1/60

Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.

88.

Published invention application
PROCESSOR POWER MANAGEMENT (legal status: under examination, published)

Publication (announcement) number: US20230418355A1

Publication (announcement) date: 2023-12-28

Application number: US18339827

Application date: 2023-06-22

Applicant: INTEL CORPORATION

Inventor: Altug Koker , Abhishek R. Appu , Kiran C. Veernapu , Joydeep Ray , Balaji Vembu , Prasoonkumar Surti , Kamal Sinha , Eric J. Hoekstra , Wenyin Fu , Nikos Kaburlasos , Bhushan M. Borole , Travis T. Schluessler , Ankur N. Shah , Jonathan Kennedy

IPC: G06F1/3209 , H04W52/02 , G06F1/324 , G06F1/3203 , G06F1/3212 , G06F1/3218 , G06F1/3231 , G06F3/01 , G06F11/07 , G06F11/30

CPC classification number: G06F1/3209 , H04W52/0258 , G06F1/324 , G06F1/3203 , G06F1/3212 , G06F1/3218 , G06F1/3231 , G06F3/01 , G06F11/0781 , G06F11/3062 , Y02D10/00 , Y02D30/70 , H04M1/72448

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.

89.

Published invention application
COMPUTE OPTIMIZATIONS FOR NEURAL NETWORKS (legal status: under examination, published)

Publication (announcement) number: US20230359461A1

Publication (announcement) date: 2023-11-09

Application number: US18315625

Application date: 2023-05-11

Applicant: Intel Corporation

Inventor: Kevin Nealis , Anbang Yao , Xiaoming Chen , Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha

IPC: G06F9/30 , G06F9/38 , G06N3/063 , G06N3/084 , G06T1/20 , G06N3/044 , G06N3/045

CPC classification number: G06F9/3001 , G06F9/3851 , G06F9/3887 , G06F9/3893 , G06N3/063 , G06N3/084 , G06T1/20 , G06N3/044 , G06N3/045 , G06F2207/4824

Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a one-bit weight associated with a neural network, as well as an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a fused operation including an exclusive not OR (XNOR) operation and a population count operation. The adder is configured to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.
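
The fused XNOR and population count operation named in the abstract above matches the dot-product trick commonly used in binarized neural networks. The sketch below shows that general technique in its simplest form, assuming both the input and the weight are one-bit vectors with bit 1 encoding +1 and bit 0 encoding -1; the abstract itself mentions a multi-bit input, which this sketch does not cover. The helper names `pack`, `xnor_popcount_dot` and `accumulate` are hypothetical and do not come from the record.

```python
def pack(values):
    """Pack a sequence of +1/-1 values into an int (bit i set means values[i] == +1)."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def xnor_popcount_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two length-n {-1, +1} vectors given in packed form."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask   # XNOR, limited to the n valid bits
    matches = bin(agree).count("1")     # population count
    # Each matching position contributes +1, each mismatch -1.
    return 2 * matches - n


def accumulate(acc: int, a_bits: int, w_bits: int, n: int) -> int:
    """Add the intermediate product to an accumulator value and return the update."""
    return acc + xnor_popcount_dot(a_bits, w_bits, n)


a = [1, -1, 1, 1]
w = [1, 1, -1, 1]
dot = xnor_popcount_dot(pack(a), pack(w), len(a))
reference = sum(x * y for x, y in zip(a, w))
```

Here `dot` and `reference` agree, which is the point of the trick: one XNOR and one population count replace n multiplications and additions.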

90.

Granted invention patent
Apparatus and method for dynamic provisioning, quality of service, and prioritization in a graphics processor (legal status: in force)

Publication (announcement) number: US11798125B2

Publication (announcement) date: 2023-10-24

Application number: US17828411

Application date: 2022-05-31

Applicant: INTEL CORPORATION

Inventor: Abhishek R. Appu , Joydeep Ray , Altug Koker , Balaji Vembu , Pattabhiraman K , Matthew B. Callaway

IPC: G06F13/14 , G06T1/60 , G06T15/00 , G06F9/455 , G06F9/50 , G06F9/48

CPC classification number: G06T1/60 , G06F9/45558 , G06F9/4881 , G06F9/5038 , G06T15/005 , G06F2009/45579 , G06F2009/45591

Abstract: An apparatus and method for dynamic provisioning, quality of service, and prioritization in a graphics processor. For example, one embodiment of an apparatus comprises a graphics processing unit (GPU) comprising a plurality of graphics processing resources; slice configuration hardware logic to logically subdivide the graphics processing resources into a plurality of slices; and slice allocation hardware logic to allocate a designated number of slices to each virtual machine (VM) of a plurality of VMs running in a virtualized execution environment, the slice allocation hardware logic to allocate different numbers of slices to different VMs based on graphics processing requirements and/or priorities of each of the VMs.
