-
Publication No.: US20220413869A1
Publication Date: 2022-12-29
Application No.: US17900230
Application Date: 2022-08-31
Applicant: Intel Corporation
Inventor: Balaji Vembu , Abhishek R. Appu , Joydeep Ray , Altug Koker
IPC: G06F9/38 , G06F9/48 , G06F12/0866 , G06F9/46 , G06F9/54 , G06F15/76 , G06F12/0897 , G06F9/52 , G06F15/16 , G06T1/20 , G06F9/50 , G06T1/60
Abstract: An apparatus to facilitate thread scheduling is disclosed. In one embodiment the apparatus includes a processor comprising a plurality of multiprocessors comprising single-instruction multiple thread (SIMT) execution circuitry to simultaneously execute multiple threads, a shared local memory to be shared by the multiple threads, and scheduling hardware logic to schedule the multiple threads in a thread group for execution across the plurality of multiprocessors in accordance with barrier data. The instructions of the multiple threads are to produce shared data to be stored in the shared local memory when executed by the plurality of multiprocessors, wherein additional instructions of at least a first thread of the multiple threads are to use the shared data, and wherein, in accordance with the barrier data, the first thread is to wait for other threads of the multiple threads to finish producing the shared data before executing the additional instructions.
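The producer/barrier/consumer pattern described in this abstract can be illustrated with a minimal, hedged sketch: threads of one group write into a shared buffer (standing in for shared local memory), wait at a barrier, and only then does a consumer thread read the combined data. This is not the patented scheduling hardware; thread counts and names are invented for illustration.

```cpp
// Minimal sketch: producers fill a shared buffer, all threads wait at a
// barrier, then the "first thread" consumes the shared data.
#include <barrier>
#include <cstdio>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    constexpr int kThreads = 8;
    std::vector<int> shared_local(kThreads);   // stands in for shared local memory
    std::barrier sync(kThreads);               // stands in for the barrier data

    std::vector<std::jthread> group;
    for (int tid = 0; tid < kThreads; ++tid) {
        group.emplace_back([&, tid] {
            shared_local[tid] = tid * tid;     // produce shared data
            sync.arrive_and_wait();            // wait for all producers to finish
            if (tid == 0) {                    // first thread uses the shared data
                int sum = std::accumulate(shared_local.begin(), shared_local.end(), 0);
                std::printf("sum of shared data: %d\n", sum);
            }
        });
    }
}
```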
-
Publication No.: US20220405876A1
Publication Date: 2022-12-22
Application No.: US17738254
Application Date: 2022-05-06
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Balaji Vembu , Altug Koker , Bryan R. White , David J. Cowperthwaite , Joydeep Ray , Murali Ramadoss
Abstract: An apparatus to facilitate partitioning of a graphics device is disclosed. The apparatus includes a plurality of engines and logic to partition the plurality of engines to facilitate independent access to each engine within the plurality of engines.
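As a rough host-side illustration of the partitioning idea (not the disclosed hardware logic), the sketch below splits a fixed set of engines between two clients so that each client only sees its own subset. Engine counts and client names are hypothetical.

```cpp
// Illustrative sketch: partition a device's engines so each client gets
// independent access to its own subset.
#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct EnginePartition {
    std::vector<int> engine_ids;   // engines reserved for one client
};

int main() {
    const int total_engines = 8;
    std::map<std::string, EnginePartition> partitions;

    // Split the engines between two clients; neither can touch the other's set.
    for (int e = 0; e < total_engines; ++e)
        partitions[e < 4 ? "clientA" : "clientB"].engine_ids.push_back(e);

    for (const auto& [client, part] : partitions) {
        std::printf("%s:", client.c_str());
        for (int e : part.engine_ids) std::printf(" engine%d", e);
        std::printf("\n");
    }
}
```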
-
Publication No.: US11531510B2
Publication Date: 2022-12-20
Application No.: US17399103
Application Date: 2021-08-11
Applicant: Intel Corporation
Inventor: Eric J. Asperheim , Subramaniam M. Maiyuran , Kiran C. Veernapu , Sanjeev S. Jahagirdar , Balaji Vembu , Devan Burke , Philip R. Laws , Kamal Sinha , Abhishek R. Appu , Elmoustapha Ould-Ahmed-Vall , Peter L. Doyle , Joydeep Ray , Travis T. Schluessler , John H. Feit , Nikos Kaburlasos , Jacek Kwiatkowski , Altug Koker
IPC: G06F3/14 , G06F3/01 , G09G5/391 , G06F3/0484 , G09G5/00
Abstract: In accordance with some embodiments, the render rate is varied across and/or up and down the display screen. This may be done based on where the user is looking in order to reduce power consumption and/or increase performance. Specifically, the screen display is separated into regions, such as quadrants. Each of these regions is rendered at a rate determined by at least one of what the user is currently looking at, what the user has looked at in the past, and/or what the user is predicted to look at next. Areas of less focus may be rendered at a lower rate, reducing power consumption in some embodiments.
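A hedged sketch of the region-based idea follows: the screen is split into quadrants and a per-quadrant render rate is chosen from the current gaze position. The specific rates, the quadrant layout, and the gaze source are placeholders, not values from the patent.

```cpp
// Sketch: pick a render rate per screen quadrant from where the user looks.
#include <array>
#include <cstdio>

struct Gaze { float x, y; };                 // normalized gaze position in [0,1]

std::array<int, 4> quadrant_rates_hz(Gaze g) {
    // Quadrant layout: 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
    int focused = (g.x < 0.5f ? 0 : 1) + (g.y < 0.5f ? 0 : 2);
    std::array<int, 4> rates;
    rates.fill(30);                          // areas of less focus: lower rate
    rates[focused] = 120;                    // focused quadrant: full rate
    return rates;
}

int main() {
    auto rates = quadrant_rates_hz({0.8f, 0.2f});   // user looks at the top-right
    for (int q = 0; q < 4; ++q)
        std::printf("quadrant %d renders at %d Hz\n", q, rates[q]);
}
```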
-
Publication No.: US20220357742A1
Publication Date: 2022-11-10
Application No.: US17750917
Application Date: 2022-05-23
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , Balaji Vembu , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Sanjeev Jahagirdar , Vasanth Ranganathan
Abstract: A mechanism is described for facilitating barriers and synchronization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting thread groups relating to machine learning associated with one or more processing devices. The method may further include facilitating barrier synchronization of the thread groups across multiple dies such that each thread in a thread group is scheduled across a set of compute elements associated with the multiple dies, where each die represents a processing device of the one or more processing devices, the processing device including a graphics processor.
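The key point of the abstract is a single synchronization point spanning compute elements on several dies. The sketch below emulates that with one barrier covering threads grouped per "die"; it is not Intel's mechanism, and die and thread counts are arbitrary.

```cpp
// Sketch: one barrier spans all threads on all "dies", so no thread enters
// phase 2 until every thread on every die has finished phase 1.
#include <barrier>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    constexpr int kDies = 2, kThreadsPerDie = 4;
    std::barrier group_barrier(kDies * kThreadsPerDie);   // spans all dies

    std::vector<std::jthread> workers;
    for (int die = 0; die < kDies; ++die) {
        for (int t = 0; t < kThreadsPerDie; ++t) {
            workers.emplace_back([&, die, t] {
                std::printf("die %d thread %d: phase 1\n", die, t);
                group_barrier.arrive_and_wait();          // cross-die synchronization
                std::printf("die %d thread %d: phase 2\n", die, t);
            });
        }
    }
}
```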
-
Publication No.: US20220335562A1
Publication Date: 2022-10-20
Application No.: US17741934
Application Date: 2022-05-11
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu
Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.
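Purely as an illustrative data model (not the actual hardware), the structure described here can be pictured as a multiprocessor holding a register file of mixed operand types plus two core sets, each bound to its own memory channel. All type names, core counts, and channel numbers below are invented.

```cpp
// Illustrative model: a multiprocessor with a register file of mixed operand
// types and two core sets, each tied to its own memory channel.
#include <cstdio>
#include <variant>
#include <vector>

using Operand = std::variant<float, int>;      // "different types of operands"

struct CoreSet { const char* type; int memory_channel; int core_count; };

struct Multiprocessor {
    std::vector<Operand> register_file;
    CoreSet set_a{"type-A", /*memory_channel=*/0, 8};
    CoreSet set_b{"type-B", /*memory_channel=*/1, 8};
};

int main() {
    Multiprocessor mp;
    mp.register_file = {1.5f, 2, 3.25f};
    std::printf("%s cores use channel %d, %s cores use channel %d\n",
                mp.set_a.type, mp.set_a.memory_channel,
                mp.set_b.type, mp.set_b.memory_channel);
    std::printf("register file holds %zu operands\n", mp.register_file.size());
}
```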
-
Publication No.: US11475623B2
Publication Date: 2022-10-18
Application No.: US17141431
Application Date: 2021-01-05
Applicant: Intel Corporation
Inventor: Joydeep Ray , Abhishek R. Appu , Pattabhiraman K , Balaji Vembu , Altug Koker , Niranjan L. Cooray , Josh B. Mastronarde
IPC: G06T15/00 , G06F9/455 , G06T1/60 , G09G5/36 , G09G5/00 , G09G5/393 , G06F9/48 , G06F9/50 , G06T15/04 , G06T15/80 , G06T17/10 , G06T17/20
Abstract: An apparatus and method are described for allocating local memories to virtual machines. For example, one embodiment of an apparatus comprises: a command streamer to queue commands from a plurality of virtual machines (VMs) or applications, the commands to be distributed from the command streamer and executed by graphics processing resources of a graphics processing unit (GPU); a tile cache to store graphics data associated with the plurality of VMs or applications as the commands are executed by the graphics processing resources; and tile cache allocation hardware logic to allocate a first portion of the tile cache to a first VM or application and a second portion of the tile cache to a second VM or application; the tile cache allocation hardware logic to further allocate a first region in system memory to store spill-over data when the first portion of the tile cache and/or the second portion of the tile cache becomes full.
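A simplified sketch of the allocation idea follows: each VM gets a fixed portion of a tile cache, and tiles that do not fit spill over into a region that stands in for system memory. Capacities and the tile representation are invented; this is not the described hardware logic.

```cpp
// Sketch: per-VM tile cache portions with spill-over to a system-memory region.
#include <cstdio>
#include <vector>

struct TileCachePortion {
    size_t capacity;                 // tiles reserved in the cache for this VM
    size_t used = 0;
    std::vector<int> spill;          // stands in for the system-memory region
    void store_tile(int tile_id) {
        if (used < capacity) ++used;         // fits in the VM's cache portion
        else spill.push_back(tile_id);       // cache portion full: spill over
    }
};

int main() {
    TileCachePortion vm0{/*capacity=*/4}, vm1{/*capacity=*/2};
    for (int t = 0; t < 6; ++t) { vm0.store_tile(t); vm1.store_tile(t); }
    std::printf("VM0: %zu cached, %zu spilled\n", vm0.used, vm0.spill.size());
    std::printf("VM1: %zu cached, %zu spilled\n", vm1.used, vm1.spill.size());
}
```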
-
Publication No.: US11475286B2
Publication Date: 2022-10-18
Application No.: US17558285
Application Date: 2021-12-21
Applicant: Intel Corporation
Inventor: Rajkishore Barik , Elmoustapha Ould-Ahmed-Vall , Xiaoming Chen , Dhawal Srivastava , Anbang Yao , Kevin Nealis , Eriko Nurvitadhi , Sara S. Baghsorkhi , Balaji Vembu , Tatiana Shpeisman , Ping T. Tang
Abstract: One embodiment provides an apparatus comprising an instruction cache to store a plurality of instructions, a scheduler unit coupled to the instruction cache, the scheduler unit to schedule the plurality of instructions for execution, an instruction fetch and decode unit to decode the plurality of instructions to determine a set of operations to perform in response, one or more compute blocks to perform parallel multiply-accumulate operations based on the instruction fetch and decode unit decoding a first instruction of the plurality of instructions, and matrix multiplication logic to perform matrix multiplication operations based on the instruction fetch and decode unit decoding a second instruction of the plurality of instructions.
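The dispatch idea (one decoded instruction drives parallel multiply-accumulate lanes, another drives a matrix multiplication path) can be sketched as a simple opcode switch. The opcodes, data shapes, and values below are made up for illustration and do not reflect the actual instruction encodings.

```cpp
// Sketch: route decoded instructions either to per-lane MAC work or to a
// matrix multiplication path.
#include <array>
#include <cstdio>

enum class Opcode { ParallelMAC, MatMul };

void execute(Opcode op) {
    switch (op) {
    case Opcode::ParallelMAC: {
        std::array<float, 4> a{1, 2, 3, 4}, b{5, 6, 7, 8}, acc{};
        for (size_t i = 0; i < acc.size(); ++i) acc[i] += a[i] * b[i];   // per-lane MAC
        std::printf("MAC lanes: %.0f %.0f %.0f %.0f\n", acc[0], acc[1], acc[2], acc[3]);
        break;
    }
    case Opcode::MatMul: {
        float a[2][2] = {{1, 2}, {3, 4}}, b[2][2] = {{5, 6}, {7, 8}}, c[2][2] = {};
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j)
                for (int k = 0; k < 2; ++k) c[i][j] += a[i][k] * b[k][j];  // matrix path
        std::printf("MatMul: %.0f %.0f / %.0f %.0f\n", c[0][0], c[0][1], c[1][0], c[1][1]);
        break;
    }
    }
}

int main() {
    execute(Opcode::ParallelMAC);
    execute(Opcode::MatMul);
}
```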
-
Publication No.: US11430083B2
Publication Date: 2022-08-30
Application No.: US17193658
Application Date: 2021-03-05
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Nicolas C. Galoppo Von Borries
IPC: G06F17/16 , G06T1/20 , G06F9/30 , G06F9/38 , G06F12/0811 , G06F12/0815 , G06F12/0831 , G06F12/0888 , H03M7/30 , G06K9/62 , G06N20/00 , G06F12/02 , G06F9/48 , G06N3/04 , G06N3/08 , G06T1/60 , G06T15/00
Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.
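A minimal sketch of the fusion idea, not the described kernel interface: the element-wise operation (here ReLU, chosen arbitrarily) is applied to each output value while it is still local to the kernel, before it is written to the result buffer that stands in for higher levels of the memory hierarchy.

```cpp
// Sketch: matrix multiply with a fused element-wise op applied before the
// result is stored to the output buffer.
#include <algorithm>
#include <cstdio>
#include <vector>

void matmul_fused_relu(const std::vector<float>& a, const std::vector<float>& b,
                       std::vector<float>& c, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;                       // stays local to the kernel
            for (int k = 0; k < n; ++k) acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = std::max(acc, 0.0f);     // fused element-wise op before store
        }
}

int main() {
    const int n = 2;
    std::vector<float> a{1, -2, 3, 4}, b{5, 6, -7, 8}, c(n * n);
    matmul_fused_relu(a, b, c, n);
    std::printf("%g %g\n%g %g\n", c[0], c[1], c[2], c[3]);
}
```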
-
Publication No.: US11360808B2
Publication Date: 2022-06-14
Application No.: US15482801
Application Date: 2017-04-09
Applicant: Intel Corporation
Inventor: Joydeep Ray , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Rajkishore Barik , Eriko Nurvitadhi , Nicolas Galoppo Von Borries , Tsung-Han Lin , Sanjeev Jahagirdar , Vasanth Ranganathan
Abstract: A mechanism is described for facilitating intelligent thread scheduling at autonomous machines. A method of embodiments, as described herein, includes detecting dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a processor including a graphics processor. The method may further include generating a tree of thread groups based on the dependency information, where each thread group includes multiple threads, and scheduling one or more of the thread groups associated with a similar dependency to avoid dependency conflicts.
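As an illustrative sketch only, the grouping step can be pictured as bucketing workloads by the resource they depend on and scheduling each bucket as one thread group, so workloads sharing a dependency do not conflict. Workload and dependency names are invented, and the tree structure from the abstract is flattened here into simple buckets.

```cpp
// Sketch: bucket workloads into thread groups by shared dependency, then
// schedule one group at a time.
#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct Workload { std::string name; std::string depends_on; };

int main() {
    std::vector<Workload> workloads = {
        {"conv1", "weightsA"}, {"conv2", "weightsA"},
        {"pool1", "bufferB"},  {"fc1",   "bufferB"},
    };

    // Group by dependency; each bucket becomes one thread group.
    std::map<std::string, std::vector<std::string>> thread_groups;
    for (const auto& w : workloads) thread_groups[w.depends_on].push_back(w.name);

    // Scheduling groups back-to-back avoids conflicts on the shared dependency.
    for (const auto& [dep, members] : thread_groups) {
        std::printf("schedule group for '%s':", dep.c_str());
        for (const auto& m : members) std::printf(" %s", m.c_str());
        std::printf("\n");
    }
}
```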
-
Publication No.: US11354770B2
Publication Date: 2022-06-07
Application No.: US17182256
Application Date: 2021-02-23
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Joydeep Ray , Altug Koker , Balaji Vembu , Pattabhiraman K , Matthew B. Callaway
Abstract: An apparatus and method for dynamic provisioning, quality of service, and prioritization in a graphics processor. For example, one embodiment of an apparatus comprises a graphics processing unit (GPU) comprising a plurality of graphics processing resources; slice configuration hardware logic to logically subdivide the graphics processing resources into a plurality of slices; and slice allocation hardware logic to allocate a designated number of slices to each virtual machine (VM) of a plurality of VMs running in a virtualized execution environment, the slice allocation hardware logic to allocate different numbers of slices to different VMs based on graphics processing requirements and/or priorities of each of the VMs.
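A hedged sketch of the provisioning idea follows: a fixed pool of slices is divided among VMs in proportion to their priorities. VM names, priorities, the slice count, and the leftover-handling rule are placeholders chosen for illustration.

```cpp
// Sketch: allocate a fixed number of slices to VMs in proportion to priority.
#include <cstdio>
#include <string>
#include <vector>

struct Vm { std::string name; int priority; int slices = 0; };

int main() {
    const int total_slices = 16;
    std::vector<Vm> vms = {{"vm-render", 4}, {"vm-compute", 3}, {"vm-background", 1}};

    int total_priority = 0;
    for (const auto& vm : vms) total_priority += vm.priority;

    int assigned = 0;
    for (auto& vm : vms) {
        vm.slices = total_slices * vm.priority / total_priority;   // proportional share
        assigned += vm.slices;
    }
    vms.front().slices += total_slices - assigned;   // give any rounding leftover to the first VM

    for (const auto& vm : vms)
        std::printf("%s (priority %d): %d slices\n", vm.name.c_str(), vm.priority, vm.slices);
}
```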