Patent search ap:("Intel Corporation") AND inv:"Sanjeev Jahagirdar" Page 6

51.

Invention patent application

INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Status: In force

Publication No.: US20230046506A1

Publication date: 2023-02-16

Application No.: US17967283

Application date: 2022-10-17

Applicant: Intel Corporation

Inventor: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

IPC: G06F9/30, G06F7/483, G06N3/063, G06N3/04, G06F9/38, G06N3/08, G09G5/393, G06F7/544, G06T15/00, G06N20/00, G06F17/16

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.
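The arithmetic named in this abstract (products of 16-bit operands accumulated into a 32-bit sum) can be illustrated with a minimal software sketch. This is not the patented hardware or instruction: the function names `to_fp16`, `to_fp32`, and `matmul_accumulate` are hypothetical, and the choice of IEEE 754 half-precision operands with single-precision accumulation is an assumption, since the abstract does not say whether the operands are floating point or integer.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def matmul_accumulate(a, b, c):
    """Return D = A @ B + C using 16-bit operands and a 32-bit running sum.

    a, b, c are lists of lists of floats (hypothetical layout for illustration).
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = to_fp32(c[i][j])  # 32-bit accumulator
            for k in range(inner):
                # Operands are narrowed to 16 bits; the intermediate
                # product is held at 32-bit precision.
                product = to_fp32(to_fp16(a[i][k]) * to_fp16(b[k][j]))
                acc = to_fp32(acc + product)
            d[i][j] = acc
    return d


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[0.5, 0.5], [0.5, 0.5]]
print(matmul_accumulate(a, b, c))
```

Keeping the accumulator wider than the operands limits the rounding error that would otherwise build up when many 16-bit products are summed.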

52.

Invention patent application

DYNAMIC POWER BUDGET ALLOCATION IN MULTI-PROCESSOR SYSTEM

Status: In force

Publication No.: US20230030396A1

Publication date: 2023-02-02

Application No.: US17966151

Application date: 2022-10-14

Applicant: Intel Corporation

Inventor: Nikos Kaburlasos, Iqbal Rajwani, Bhushan Borole, Kamal Sinha, Sanjeev Jahagirdar

IPC: G06F1/28

Abstract: Dynamic power budget allocation in a multi-processor system is described. In an example, an apparatus includes a plurality of processor units; and a power control component, the power control component to monitor power utilization of each of the plurality of processor units, wherein power consumed by the plurality of processor units is limited by a global power budget. The apparatus is to assign a workload to each of the processor units and is to establish an initial power budget for operation of each of the processor units, and, upon the apparatus determining that one or more processor units require an increased power budget based on one or more criteria, the apparatus is to dynamically reallocate an amount of the global power budget to the one or more processor units.
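One way to picture the reallocation step described in this abstract is the sketch below. It is an illustration under stated assumptions, not the claimed method: the function name `reallocate_power`, the 90% utilization threshold used as the "criterion", and the policy of trimming other units down to their measured usage are all hypothetical choices made for the example.

```python
def reallocate_power(budgets, usage, global_budget, threshold=0.9):
    """Shift spare watts toward units running close to their budget.

    budgets: dict of unit name -> current budget in watts
    usage:   dict of unit name -> measured power in watts
    A unit counts as needing more power when its usage reaches
    `threshold` times its budget (an assumed criterion).
    """
    if sum(budgets.values()) > global_budget:
        raise ValueError("initial budgets exceed the global power budget")

    needy = [u for u in budgets if usage[u] >= threshold * budgets[u]]
    if not needy:
        return dict(budgets)  # nothing to rebalance

    new_budgets = dict(budgets)
    # Start with whatever part of the global budget is unallocated.
    pool = global_budget - sum(budgets.values())
    for unit in budgets:
        if unit not in needy:
            # Simplification: reclaim all headroom the unit is not using.
            pool += budgets[unit] - usage[unit]
            new_budgets[unit] = usage[unit]

    share = pool / len(needy)  # split the reclaimed power evenly
    for unit in needy:
        new_budgets[unit] += share
    return new_budgets


budgets = {"gpu0": 100.0, "gpu1": 100.0, "gpu2": 100.0}
usage = {"gpu0": 95.0, "gpu1": 40.0, "gpu2": 60.0}
print(reallocate_power(budgets, usage, global_budget=300.0))
```

The total of the returned budgets never exceeds `global_budget`, which mirrors the abstract's constraint that consumption is limited by a global power budget. A real controller would likely leave donor units some margin rather than trimming them to their exact usage.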

53.

Invention patent application

BARRIERS AND SYNCHRONIZATION FOR MACHINE LEARNING AT AUTONOMOUS MACHINES

Status: In force

Publication No.: US20220357742A1

Publication date: 2022-11-10

Application No.: US17750917

Application date: 2022-05-23

Applicant: Intel Corporation

Inventor: Abhishek R. Appu, Altug Koker, Joydeep Ray, Balaji Vembu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Sanjeev Jahagirdar, Vasanth Ranganathan

IPC: G05D1/00, G06N3/063, G06F9/52, G06N3/04, G06N3/08

Abstract: A mechanism is described for facilitating barriers and synchronization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting thread groups relating to machine learning associated with one or more processing devices. The method may further include facilitating barrier synchronization of the thread groups across multiple dies such that each thread in a thread group is scheduled across a set of compute elements associated with the multiple dies, where each die represents a processing device of the one or more processing devices, the processing device including a graphics processor.

54.

Invention patent application

ENABLING PRODUCT SKUS BASED ON CHIPLET CONFIGURATIONS

Status: In force

Publication No.: US20220188967A1

Publication date: 2022-06-16

Application No.: US17685117

Application date: 2022-03-02

Applicant: Intel Corporation

Inventor: Altug Koker, Lance Cheney, Eric Finley, Varghese George, Sanjeev Jahagirdar, Josh Mastronarde, Naveen Matam, Iqbal Rajwani, Lakshminarayanan Striramassarma, Melaku Teshome, Vikranth Vemulapalli, Binoj Xavier

IPC: G06T1/20, G06F13/40

Abstract: A disaggregated processor package can be configured to accept interchangeable chiplets. Interchangeability is enabled by specifying a standard physical interconnect for chiplets that can enable the chiplet to interface with a fabric or bridge interconnect. Chiplets from different IP designers can conform to the common interconnect, enabling such chiplets to be interchangeable during assembly. The fabric and bridge interconnect logic on the chiplet can then be configured to conform with the actual interconnect layout of the on-board logic of the chiplet. Additionally, data from chiplets can be transmitted across an inter-chiplet fabric using encapsulation, such that the actual data being transferred is opaque to the fabric, further enabling interchangeability of the individual chiplets. With such an interchangeable design, cache or DRAM memory can be inserted into memory chiplet slots, while compute or graphics chiplets with a higher or lower core count can be inserted into logic chiplet slots.

55.

Granted invention patent

Efficient thread group scheduling

Status: In force

Publication No.: US11360808B2

Publication date: 2022-06-14

Application No.: US15482801

Application date: 2017-04-09

Applicant: Intel Corporation

Inventor: Joydeep Ray, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Rajkishore Barik, Eriko Nurvitadhi, Nicolas Galoppo Von Borries, Tsung-Han Lin, Sanjeev Jahagirdar, Vasanth Ranganathan

IPC: G06F9/48, G06T1/20

Abstract: A mechanism is described for facilitating intelligent thread scheduling at autonomous machines. A method of embodiments, as described herein, includes detecting dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a processor including a graphics processor. The method may further include generating a tree of thread groups based on the dependency information, where each thread group includes multiple threads, and scheduling one or more of the thread groups associated with a similar dependency to avoid dependency conflicts.
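The grouping and ordering idea in this abstract can be sketched in a few lines. The sketch is only an illustration under assumptions: the `(thread_id, workload, depends_on)` tuple layout, the function name `schedule_thread_groups`, and the rule that threads of one workload share a single parent dependency are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) stands in for whatever scheduler the patent describes.

```python
from collections import defaultdict
from graphlib import TopologicalSorter


def schedule_thread_groups(threads):
    """Group threads by workload and order the groups by dependency.

    threads: list of (thread_id, workload, depends_on) tuples, where
    depends_on is the workload that must finish first, or None.
    Returns a list of (workload, [thread_ids]) in a conflict-free order.
    """
    groups = defaultdict(list)  # workload -> threads sharing its dependency
    parent = {}                 # workload -> workload it depends on (tree edge)
    for thread_id, workload, depends_on in threads:
        if workload in parent and parent[workload] != depends_on:
            raise ValueError(f"conflicting dependencies for {workload!r}")
        groups[workload].append(thread_id)
        parent[workload] = depends_on

    sorter = TopologicalSorter()
    for workload, dep in parent.items():
        if dep is None:
            sorter.add(workload)       # root of the tree
        else:
            sorter.add(workload, dep)  # dep must be scheduled first

    # static_order() yields each group only after the group it depends on.
    return [(w, groups.get(w, [])) for w in sorter.static_order()]


threads = [
    ("t0", "load", None),
    ("t1", "load", None),
    ("t2", "conv", "load"),
    ("t3", "conv", "load"),
    ("t4", "pool", "conv"),
]
for workload, members in schedule_thread_groups(threads):
    print(workload, members)
```

Threads that share a dependency end up in the same group and are released together, so no group starts before the group it depends on has been scheduled.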

56.

Invention patent application

SYSTEMS AND METHODS FOR CACHE OPTIMIZATION

Status: In force

Publication No.: US20220156202A1

Publication date: 2022-05-19

Application No.: US17590362

Application date: 2022-02-01

Applicant: Intel Corporation

Inventor: Altug Koker, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall, Abhishek Appu, Aravindh Anantaraman, Valentin Andrei, Durgaprasad Bilagi, Varghese George, Brent Insko, Sanjeev Jahagirdar, Scott Janus, Pattabhiraman K, SungYe Kim, Subramaniam Maiyuran, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Xinmin Tian

IPC: G06F12/123, G06F12/0891, G06T1/60, G06F12/0875

Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory is configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.

57.

Granted invention patent

Instructions and logic to perform floating-point and integer operations for machine learning

Status: In force

Publication No.: US11169799B2

Publication date: 2021-11-09

Application No.: US16432402

Application date: 2019-06-05

Applicant: Intel Corporation

Inventor: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

IPC: G06F9/30, G06F9/38, G06F7/483, G06F7/544, G06N3/063, G06N20/00, G09G5/393, G06N3/04, G06N3/08, G06T15/00

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.

58.

Invention patent application

ENABLING PRODUCT SKUS BASED ON CHIPLET CONFIGURATIONS

Status: In force

Publication No.: US20210256654A1

Publication date: 2021-08-19

Application No.: US17161941

Application date: 2021-01-29

Applicant: Intel Corporation

Inventor: Altug Koker, Lance Cheney, Eric Finley, Varghese George, Sanjeev Jahagirdar, Josh Mastronarde, Naveen Matam, Iqbal Rajwani, Lakshminarayanan Striramassarma, Melaku Teshome, Vikranth Vemulapalli, Binoj Xavier

IPC: G06T1/20, G06F13/40

Abstract: A disaggregated processor package can be configured to accept interchangeable chiplets. Interchangeability is enabled by specifying a standard physical interconnect for chiplets that can enable the chiplet to interface with a fabric or bridge interconnect. Chiplets from different IP designers can conform to the common interconnect, enabling such chiplets to be interchangeable during assembly. The fabric and bridge interconnect logic on the chiplet can then be configured to conform with the actual interconnect layout of the on-board logic of the chiplet. Additionally, data from chiplets can be transmitted across an inter-chiplet fabric using encapsulation, such that the actual data being transferred is opaque to the fabric, further enabling interchangeability of the individual chiplets. With such an interchangeable design, higher or lower density memory can be inserted into memory chiplet slots, while compute or graphics chiplets with a higher or lower core count can be inserted into logic chiplet slots.

59.

Invention patent application

INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Status: In force

Publication No.: US20210182058A1

Publication date: 2021-06-17

Application No.: US17169232

Application date: 2021-02-05

Applicant: Intel Corporation

Inventor: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

IPC: G06F9/30, G06N3/04, G06F9/38, G06F7/483, G09G5/393, G06F7/544, G06N3/063, G06N3/08

Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.

60.

Granted invention patent

Resource load balancing based on usage and power limits

Status: In force

Publication No.: US10983581B2

Publication date: 2021-04-20

Application No.: US15859598

Application date: 2017-12-31

Applicant: Intel Corporation

Inventor: Sanjeev Jahagirdar, Altug Koker, Yoav Harel, Kenneth Brand, Chandra Gurram, Eric Finley, Bhushan Borole, Carlos Nava Rodriguez

IPC: G06F1/32, G06F1/324, G06F1/3212, G06F1/3234

Abstract: Methods and apparatus relating to techniques for resource load balancing based on usage and/or power limits are described. In an embodiment, resource load balancing logic causes a first resource of a processor to operate at a first frequency and a second resource of the processor to operate at a second frequency. Memory stores a plurality of frequency values. The resource load balancing logic also selects the first frequency and the second frequency based on the stored plurality of frequency values. Operation of the first resource at the first frequency and the second resource at the second frequency in turn causes the processor to operate under a power budget. The resource load balancing logic causes change to the first frequency and the second frequency in response to a determination that operation of the processor is different than the power budget. Other embodiments are also disclosed and claimed.
