Patent search ap:("Intel Corporation") AND inv:"Kamal Sinha" Page 9

81.

Invention application
DYNAMIC DISTRIBUTED TRAINING OF MACHINE LEARNING MODELS (Status: under examination, published)

Publication (announcement) number: US20180307984A1

Publication (announcement) date: 2018-10-25

Application number: US15494971

Application date: 2017-04-24

Applicant: Intel Corporation

Inventor: Altug Koker , Abhishek R. Appu , Kamal Sinha , Joydeep Ray , Balaji Vembu , Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Anbang Yao , Kevin Nealis , Xiaoming Chen , John C. Weast , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Farshad Akhbari , Nadathur Rajagopalan Satish , Liwei Ma , Jeremy Bottleson , Eriko Nurvitadhi , Travis T. Schluessler , Ankur N. Shah , Jonathan Kennedy , Vasanth Ranganathan , Sanjeev Jahagirdar

IPC: G06N3/08 , G06N99/00

CPC classification number: G06N3/08 , G06F9/28 , G06F9/505 , G06N3/0445 , G06N3/0454 , G06N3/0481 , G06N3/063 , G06N99/005

Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.

82.

Invention application
AVOID CACHE LOOKUP FOR COLD CACHE (Status: under examination, published)

Publication (announcement) number: US20180300251A1

Publication (announcement) date: 2018-10-18

Application number: US15488961

Application date: 2017-04-17

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , Prasoonkumar Surti , Kamal Sinha , Kiran C. Veernapu , Balaji Vembu

IPC: G06F12/0888 , G06F13/42 , G06F13/40 , G06T1/20

CPC classification number: G06F12/0888 , G06F13/4022 , G06F13/4282 , G06F2212/1024 , G06F2212/6032 , G06F2213/0026 , G06T1/60

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive, in a read/modify/write (RMW) pipeline, a cache access request from a requestor, wherein the cache request comprises a cache set identifier associated with requested data in the cache set, determine whether the cache set associated with the cache set identifier is in an inaccessible or invalid state, and in response to a determination that the cache set is in an inaccessible state or an invalid state, to terminate the cache access request. Other embodiments are also disclosed and claimed.
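
The early-termination check described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the `SetState` enum, the `Cache` class, its `fill` and `access` methods, and the `lookups` counter are all hypothetical names introduced here, and the RMW pipeline is reduced to a single function call.

```python
from enum import Enum


class SetState(Enum):
    VALID = "valid"
    INVALID = "invalid"
    INACCESSIBLE = "inaccessible"


class Cache:
    """Toy set-indexed cache; every set starts INVALID (a cold cache)."""

    def __init__(self, num_sets):
        self.state = [SetState.INVALID] * num_sets
        self.data = [dict() for _ in range(num_sets)]
        self.lookups = 0  # number of tag lookups actually performed

    def fill(self, set_id, tag, value):
        """Store a value and mark the set as holding valid data."""
        self.data[set_id][tag] = value
        self.state[set_id] = SetState.VALID

    def access(self, set_id, tag):
        """Return (status, value); skip the tag lookup for cold sets."""
        if self.state[set_id] is not SetState.VALID:
            # Set is invalid or inaccessible: terminate the request early.
            return ("terminated", None)
        self.lookups += 1
        if tag in self.data[set_id]:
            return ("hit", self.data[set_id][tag])
        return ("miss", None)


cache = Cache(num_sets=4)
cold_result = cache.access(1, 0xA)   # set 1 is still cold
cache.fill(1, 0xA, "payload")
warm_result = cache.access(1, 0xA)   # set 1 is now valid
```

In this model the first access returns without touching the tag store, so `lookups` only grows once the set has been filled.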

83.

Invention application
Read/Write Modes for Reducing Power Consumption in Graphics Processing Units (Status: under examination, published)

Publication (announcement) number: US20180300074A1

Publication (announcement) date: 2018-10-18

Application number: US15488723

Application date: 2017-04-17

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Kamal Sinha , Bhushan M. Borole , Altug Koker , Joydeep Ray , Wenyin Fu

IPC: G06F3/06 , G06T1/20

Abstract: Power for on-die heavily used local memories in general purpose graphics processing unit (GPGPU) applications may be reduced by using low latency read and high latency write operations. Power consumption in read heavy graphic operations can be reduced using a small memory footprint design with possible reduction of hot spotting in some embodiments.

84.

Invention application
EFFICIENT THREAD GROUP SCHEDULING (Status: under examination, published)

Publication (announcement) number: US20180293102A1

Publication (announcement) date: 2018-10-11

Application number: US15482801

Application date: 2017-04-09

Applicant: Intel Corporation

Inventor: Joydeep Ray , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Rajkishore Barik , Eriko Nurvitadhi , Nicolas C. Galoppo Von Borries , Tsung-Han Lin , Sanjeev Jahagirdar , Vasanth Ranganathan

IPC: G06F9/50 , G06T1/20 , G06F9/52 , G06F9/48

Abstract: A mechanism is described for facilitating intelligent thread scheduling at autonomous machines. A method of embodiments, as described herein, includes detecting dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a processor including a graphics processor. The method may further include generating a tree of thread groups based on the dependency information, where each thread group includes multiple threads, and scheduling one or more of the thread groups associated with a similar dependency to avoid dependency conflicts.

85.

Invention application
EFFICIENT MULTI-CONTEXT THREAD DISTRIBUTION (Status: under examination, published)

Publication (announcement) number: US20180285110A1

Publication (announcement) date: 2018-10-04

Application number: US15477022

Application date: 2017-04-01

Applicant: Intel Corporation

Inventor: Joydeep Ray , Altug Koker , Balaji Vembu , Abhishek R. Appu , Kamal Sinha , Prasoonkumar Surti , Kiran C. Veernapu

IPC: G06F9/30 , G06F12/0842 , G09G5/393 , G06T1/60 , G06T1/20 , G06T15/00 , G06F9/50

CPC classification number: G06F9/30123 , G06F9/5016 , G06F9/5027 , G06T1/20 , G06T1/60 , G09G5/363 , G09G5/393 , G09G2352/00 , G09G2360/08

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to determine a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system, allocate a second number of streaming multiprocessors (SMs) to the respective plurality of contexts, and dispatch threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts. Other embodiments are also disclosed and claimed.

86.

Invention application
PROCESSOR POWER MANAGEMENT (Status: under examination, published)

Publication (announcement) number: US20180284868A1

Publication (announcement) date: 2018-10-04

Application number: US15477029

Application date: 2017-04-01

Applicant: Intel Corporation

Inventor: Altug Koker , Abhishek R. Appu , Kiran C. Veernapu , Joydeep Ray , Balaji Vembu , Prasoonkumar Surti , Kamal Sinha , Eric J. Hoekstra , Wenyin Fu , Nikos Kaburlasos , Bhushan M. Borole , Travis T. Schluessler , Ankur N. Shah , Jonathan Kennedy

IPC: G06F1/32 , G06F3/01 , G06F11/30 , G06F11/07

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile for a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.

87.

Invention application
PROGRAMMABLE COARSE GRAINED AND SPARSE MATRIX COMPUTE HARDWARE WITH ADVANCED SCHEDULING (Status: in force)

Publication (announcement) number: US20250061534A1

Publication (announcement) date: 2025-02-20

Application number: US18819073

Application date: 2024-08-29

Applicant: Intel Corporation

Inventor: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao

IPC: G06T1/20 , G06F9/30 , G06F9/38 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084

Abstract: One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.

88.

Invention grant
Instructions and logic to perform floating point and integer operations for machine learning (Status: in force)

Publication (announcement) number: US12217053B2

Publication (announcement) date: 2025-02-04

Application number: US18528340

Application date: 2023-12-04

Applicant: Intel Corporation

Inventor: Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar

IPC: G06F9/30 , G06F7/483 , G06F7/544 , G06F9/38 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G09G5/393 , G06F1/16 , G06F17/16 , G06N20/00 , G06T15/00

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.

89.

Invention grant
Dynamic power budget allocation in multi-processor system (Status: in force)

Publication (announcement) number: US11874715B2

Publication (announcement) date: 2024-01-16

Application number: US17966151

Application date: 2022-10-14

Applicant: Intel Corporation

Inventor: Nikos Kaburlasos , Iqbal Rajwani , Bhushan Borole , Kamal Sinha , Sanjeev Jahagirdar

IPC: G06F1/00 , G06F1/28 , G06F1/3203 , G06F1/3206

CPC classification number: G06F1/28 , G06F1/3203 , G06F1/3206

Abstract: Dynamic power budget allocation in a multi-processor system is described. In an example, an apparatus includes a plurality of processor units; and a power control component, the power control component to monitor power utilization of each of the plurality of processor units, wherein power consumed by the plurality of processor units is limited by a global power budget. The apparatus is to assign a workload to each of the processor units and is to establish an initial power budget for operation of each of the processor units, and, upon the apparatus determining that one or more processor units require an increased power budget based on one or more criteria, the apparatus is to dynamically reallocate an amount of the global power budget to the one or more processor units.
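
The allocation flow in this abstract (a global budget, an initial per-unit budget, then reallocation when a unit needs more) can be sketched in a few lines. This is an illustrative model under stated assumptions, not the claimed method: the `ProcessorUnit` class, the `demand_w` metric, the equal initial split, and the "demand exceeds budget" trigger are hypothetical choices made here, since the abstract only refers to "one or more criteria".

```python
from dataclasses import dataclass


@dataclass
class ProcessorUnit:
    name: str
    demand_w: float        # estimated power the workload needs (hypothetical metric)
    budget_w: float = 0.0  # power budget currently granted to this unit


def set_initial_budgets(units, global_budget_w):
    """Give every unit an equal initial share of the global budget."""
    share = global_budget_w / len(units)
    for unit in units:
        unit.budget_w = share


def reallocate(units):
    """Shift unused budget toward units whose demand exceeds their budget.

    The trigger used here (demand > budget) is an assumed criterion.
    Budget is only moved between units, so the global cap still holds.
    """
    donors = [u for u in units if u.budget_w > u.demand_w]
    needy = [u for u in units if u.demand_w > u.budget_w]
    spare = sum(u.budget_w - u.demand_w for u in donors)
    shortfall = sum(u.demand_w - u.budget_w for u in needy)
    moved = min(spare, shortfall)
    if moved == 0:
        return
    # Compute all deltas first so each one is based on the pre-move state.
    deltas = {}
    for u in donors:
        deltas[u.name] = -moved * (u.budget_w - u.demand_w) / spare
    for u in needy:
        deltas[u.name] = moved * (u.demand_w - u.budget_w) / shortfall
    for u in units:
        u.budget_w += deltas.get(u.name, 0.0)


units = [ProcessorUnit("unit0", demand_w=30.0), ProcessorUnit("unit1", demand_w=70.0)]
set_initial_budgets(units, global_budget_w=100.0)  # 50 W each to start
reallocate(units)                                  # unit0 hands 20 W to unit1
```

Donors give up budget in proportion to their headroom and needy units receive it in proportion to their shortfall; other policies (priority-based, first-come) would fit the same outline.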

90.

Invention publication
PROGRAMMABLE COARSE GRAINED AND SPARSE MATRIX COMPUTE HARDWARE WITH ADVANCED SCHEDULING (Status: under examination, published)

Publication (announcement) number: US20230394616A1

Publication (announcement) date: 2023-12-07

Application number: US18334733

Application date: 2023-06-14

Applicant: Intel Corporation

Inventor: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao

IPC: G06T1/20 , G06N3/063 , G06F9/38 , G06F9/30 , G06N3/084 , G06N3/044 , G06N3/045 , G06N3/04 , G06N3/08

CPC classification number: G06T1/20 , G06N3/063 , G06F9/3887 , G06F9/3895 , G06F9/3001 , G06F9/3851 , G06F9/3017 , G06N3/084 , G06N3/044 , G06N3/045 , G06N3/04 , G06N3/08

Abstract: One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.
