-
Publication No.: US10186011B2
Publication date: 2019-01-22
Application No.: US15581182
Filing date: 2017-04-28
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao
Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex machine learning compute operation.
-
Publication No.: US20180314249A1
Publication date: 2018-11-01
Application No.: US15581124
Filing date: 2017-04-28
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Altug Koker , Farshad Akhbari , Feng Chen , Dukhwan Kim , Narayan Srinivasa , Nadathur Rajagopalan Satish , Kamal Sinha , Joydeep Ray , Balaji Vembu , Mike B. Macpherson , Linda L. Hurd , Sanjeev Jahagirdar , Vasanth Ranganathan
CPC classification number: G06F9/5016 , G06F9/5061
Abstract: A mechanism is described for facilitating storage management for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting one or more components associated with machine learning, where the one or more components include memory and a processor coupled to the memory, and where the processor includes a graphics processor. The method may further include allocating a storage portion of the memory and a hardware portion of the processor to a machine learning training set, where the storage and hardware portions are precise for implementation and processing of the training set.
-
Publication No.: US20180308256A1
Publication date: 2018-10-25
Application No.: US15494812
Filing date: 2017-04-24
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , Balaji Vembu , Prasoonkumar Surti , Kamal Sinha , Nadathur Rajagopalan Satish , Narayan Srinivasa , Feng Chen , Dukhwan Kim , Farshad Akhbari
CPC classification number: G06T9/00 , G06N3/0445 , G06N3/0454 , G06N3/0481 , G06N3/063 , G06N3/084
Abstract: An apparatus to facilitate compute compression is disclosed. The apparatus includes a graphics processing unit including mapping logic to map a first block of integer pixel data to a compression block and compression logic to compress the compression block.
-
Publication No.: US20180307983A1
Publication date: 2018-10-25
Application No.: US15494948
Filing date: 2017-04-24
Applicant: Intel Corporation
Inventor: Narayan Srinivasa , Joydeep Ray , Nicolas C. Galoppo Von Borries , Ben Ashbaugh , Prasoonkumar Surti , Feng Chen , Barath Lakshmanan , Elmoustapha Ould-Ahmed-Vall , Liwei Ma , Linda L. Hurd , Abhishek R. Appu , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Chandrasekaran Sakthivel , Farshad Akhbari , Dukhwan Kim , Altug Koker , Nadathur Rajagopalan Satish
CPC classification number: G06N3/08 , G06N3/04 , G06N3/0454 , G06N3/063 , G06N3/082
Abstract: An apparatus to facilitate optimization of a neural network (NN) is disclosed. The apparatus includes optimization logic to define a NN topology having one or more macro layers, adjust the one or more macro layers to adapt to input and output components of the NN and train the NN based on the one or more macro layers.
-
Publication No.: US20170286122A1
Publication date: 2017-10-05
Application No.: US15089232
Filing date: 2016-04-01
Applicant: Intel Corporation
Inventor: Lisa K. Wu , Tae Jun Ham , Nadathur Rajagopalan Satish , Narayanan Sundaram
CPC classification number: G06F9/5027 , G06F9/3877 , G06F2209/509 , G06T1/20 , Y02D10/22
Abstract: A processor includes a front end including circuitry to receive and decode an instruction. The instruction is to perform a graph analytic function and pass the instruction to a graph accelerator. The graph accelerator including circuitry to process graph vertices and graph edges as datatypes, execute the instruction, and pass results of the instruction to a memory subsystem of the processor.
-
Publication No.: US20130297878A1
Publication date: 2013-11-07
Application No.: US13934198
Filing date: 2013-07-02
Applicant: Intel Corporation
Inventor: Christopher J. Hughes , Yen-Kuang Chen , Changkyu Kim , Daehyun Kim , Victor W. Lee , Anthony-Trung D. Nguyen , Nadathur Rajagopalan Satish
IPC: G06F12/08
CPC classification number: G06F12/0811 , G06F9/30043 , G06F12/0802 , G06F12/0897 , G06F2212/62 , Y02D10/13
Abstract: Methods and apparatus relating to gather or scatter operations in a multi-level cache are described. In some embodiments, a logic may determine whether to perform gather or scatter operations at a first memory or a second memory, based in part on a relative performance of performing the gather or scatter operations at the first memory and the second memory. Other embodiments are also described and claimed.
-
Publication No.: US12198221B2
Publication date: 2025-01-14
Application No.: US18436494
Filing date: 2024-02-08
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu
Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.
-
Publication No.: US20250005703A1
Publication date: 2025-01-02
Application No.: US18773094
Filing date: 2024-07-15
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Linda L. Hurd , Dukhwan Kim , Mike B. Macpherson , John C. Weast , Feng Chen , Farshad Akhbari , Narayan Srinivasa , Nadathur Rajagopalan Satish , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman
IPC: G06T1/20 , G06F3/14 , G06F9/30 , G06F9/38 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06T15/00 , G06T15/04 , G09G5/36
Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a mixed precision core including mixed-precision execution circuitry to execute one or more of the mixed-precision instructions to perform a mixed-precision dot-product operation comprising to perform a set of multiply and accumulate operations.
-
Publication No.: US20240005136A1
Publication date: 2024-01-04
Application No.: US18351124
Filing date: 2023-07-12
Applicant: Intel Corporation
Inventor: Kamal Sinha , Balaji Vembu , Eriko Nurvitadhi , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Farshad Akhbari , Narayan Srinivasa , Feng Chen , Dukhwan Kim , Nadathur Rajagopalan Satish , John C. Weast , Mike B. MacPherson , Linda L. Hurd , Vasanth Ranganathan , Sanjeev Jahagirdar
IPC: G06N3/063 , G06N3/08 , G06N3/04 , G06T1/20 , G06F9/30 , G06T15/00 , G06F15/78 , G06F15/76 , G06F1/3287 , G06F1/3293 , G06N3/084 , G06N3/044 , G06N3/045
CPC classification number: G06N3/063 , G06N3/08 , G06N3/04 , G06T1/20 , G06F9/30014 , G06T15/005 , G06F15/78 , G06F15/76 , G06F9/30036 , G06F1/3287 , G06F1/3293 , G06N3/084 , G06N3/044 , G06N3/045 , G06T1/60
Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.
-
Publication No.: US20220327357A1
Publication date: 2022-10-13
Application No.: US17723074
Filing date: 2022-04-18
Applicant: Intel Corporation
Inventor: Liwei Ma , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Eriko Nurvitadhi , Chandrasekaran Sakthivel , Barath Lakshmanan , Jingyi Jin , Justin E. Gottschlich , Michael Strickland
Abstract: An apparatus to facilitate workload scheduling is disclosed. The apparatus includes one or more clients, one or more processing units to process workloads received from the one or more clients, including hardware resources and scheduling logic to schedule direct access of the hardware resources to the one or more clients to process the workloads.
-