-
公开(公告)号:US20240070926A1
公开(公告)日:2024-02-29
申请号:US18466141
申请日:2023-09-13
Applicant: Intel Corporation
Inventor: Joydeep Ray , Ben Ashbaugh , Prasoonkumar Surti , Pradeep Ramani , Rama Harihara , Jerin C. Justin , Jing Huang , Xiaoming Cui , Timothy B. Costa , Ting Gong , Elmoustapha Ould-ahmed-vall , Kumar Balasubramanian , Anil Thomas , Oguz H. Elibol , Jayaram Bobba , Guozhong Zhuang , Bhavani Subramanian , Gokce Keskin , Chandrasekaran Sakthivel , Rajesh Poornachandran
CPC classification number: G06T9/002 , G06F12/023 , G06T15/005 , G06F2212/302 , G06F2212/401
Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.
-
公开(公告)号:US20230376762A1
公开(公告)日:2023-11-23
申请号:US18320385
申请日:2023-05-19
Applicant: Intel Corporation
Inventor: Srinivas Sridharan , Karthikeyan Vaidyanathan , Dipankar Das , Chandrasekaran Sakthivel , Mikhail E. Smorkalov
CPC classification number: G06N3/08 , G06N3/088 , G06F9/5061 , G06F9/50 , G06F9/5077 , G06N3/084 , G06N3/044 , G06N3/045 , G06N3/04 , G06N3/063 , G06N3/048
Abstract: Embodiments described herein provide an apparatus comprising an interconnect switch configured to couple with a plurality of graphics processors via a plurality of point-to-point interconnects and one or more processors including a graphics processor coupled with the interconnect switch via a point-to-point interconnect of the plurality of point-to-point interconnects.
-
公开(公告)号:US20230230289A1
公开(公告)日:2023-07-20
申请号:US18152643
申请日:2023-01-10
Applicant: Intel Corporation
Inventor: Joydeep Ray , Ben Ashbaugh , Prasoonkumar Surti , Pradeep Ramani , Rama Harihara , Jerin C. Justin , Jing Huang , Xiaoming Cui , Timothy B. Costa , Ting Gong , Elmoustapha Ould-ahmed-vall , Kumar Balasubramanian , Anil Thomas , Oguz H. Elibol , Jayaram Bobba , Guozhong Zhuang , Bhavani Subramanian , Gokce Keskin , Chandrasekaran Sakthivel , Rajesh Poornachandran
CPC classification number: G06T9/002 , G06F12/023 , G06T15/005 , G06F2212/401 , G06F2212/302
Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.
-
公开(公告)号:US11669932B2
公开(公告)日:2023-06-06
申请号:US17355267
申请日:2021-06-23
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Sara S. Baghsorkhi , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Joydeep Ray
CPC classification number: G06T1/20 , G06F9/544 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , H04L67/10
Abstract: A mechanism is described for facilitating sharing of data and compression expansion of models at autonomous machines. A method of embodiments, as described herein, includes detecting a first processor processing information relating to a neural network at a first computing device, where the first processor comprises a first graphics processor and the first computing device comprises a first autonomous machine. The method further includes facilitating the first processor to store one or more portions of the information in a library at a database, where the one or more portions are accessible to a second processor of a computing device.
-
公开(公告)号:US11609856B2
公开(公告)日:2023-03-21
申请号:US17206514
申请日:2021-03-19
Applicant: Intel Corporation
Inventor: Chandrasekaran Sakthivel , Prasoonkumar Surti , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Abhishek R. Appu , Nicolas C. Galoppo Von Borries , Joydeep Ray , Narayan Srinivasa , Feng Chen , Ben J. Ashbaugh , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Eriko Nurvitadhi , Balaji Vembu , Altug Koker
IPC: G06F12/08 , G06F12/0837 , G06N3/08 , G06N20/00 , G06T1/20 , G06F12/0815 , G06N3/063 , G06N3/04 , G06N3/084 , G06N3/088
Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US11592817B2
公开(公告)日:2023-02-28
申请号:US15581124
申请日:2017-04-28
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Altug Koker , Farshad Akhbari , Feng Chen , Dukhwan Kim , Narayan Srinivasa , Nadathur Rajagopalan Satish , Kamal Sinha , Joydeep Ray , Balaji Vembu , Mike B. Macpherson , Linda L. Hurd , Sanjeev Jahagirdar , Vasanth Ranganathan
Abstract: A mechanism is described for facilitating storage management for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting one or more components associated with machine learning, where the one or more components include memory and a processor coupled to the memory, and where the processor includes a graphics processor. The method may further include allocating a storage portion of the memory and a hardware portion of the processor to a machine learning training set, where the storage and hardware portions are precise for implementation and processing of the training set.
-
公开(公告)号:US11580361B2
公开(公告)日:2023-02-14
申请号:US15494826
申请日:2017-04-24
Applicant: Intel Corporation
Inventor: Gokcen Cilingir , Elmoustapha Ould-Ahmed-Vall , Rajkishore Barik , Kevin Nealis , Xiaoming Chen , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Abhishek Appu , John C. Weast , Sara S. Baghsorkhi , Barnan Das , Narayan Biswal , Stanley J. Baran , Nilesh V. Shah , Archie Sharma , Mayuresh M. Varerkar
Abstract: An apparatus to facilitate neural network (NN) training is disclosed. The apparatus includes training logic to receive one or more network constraints and train the NN by automatically determining a best network layout and parameters based on the network constraints.
-
公开(公告)号:US20220366527A1
公开(公告)日:2022-11-17
申请号:US17871781
申请日:2022-07-22
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , John C. Weast , Mike B. Macpherson , Linda L. Hurd , Sara S. Baghsorkhi , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Liwei Ma , Elmoustapha Ould-Ahmed-Vall , Kamal Sinha , Joydeep Ray , Balaji Vembu , Sanjeev Jahagirdar , Vasanth Ranganathan , DUKHWAN Kim
Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning. A method of embodiments, as described herein, includes limiting execution of workloads for the respective contexts of a plurality of contexts to a specified subset of a plurality of processing resources of a processing system according to physical resource slices of the processing system that are associated with the respective contexts of the plurality of contexts.
-
公开(公告)号:US11270201B2
公开(公告)日:2022-03-08
申请号:US15859180
申请日:2017-12-29
Applicant: Intel Corporation
Inventor: Srinivas Sridharan , Karthikeyan Vaidyanathan , Dipankar Das , Chandrasekaran Sakthivel , Mikhail E. Smorkalov
Abstract: Embodiments described herein provide a system to configure distributed training of a neural network, the system comprising memory to store a library to facilitate data transmission during distributed training of the neural network; a network interface to enable transmission and receipt of configuration data associated with a set of worker nodes, the worker nodes configured to perform distributed training of the neural network; and a processor to execute instructions provided by the library, the instructions to cause the processor to create one or more groups of the worker nodes, the one or more groups of worker nodes to be created based on a communication pattern for messages to be transmitted between the worker nodes during distributed training of the neural network.
-
公开(公告)号:US20210294649A1
公开(公告)日:2021-09-23
申请号:US17206514
申请日:2021-03-19
Applicant: Intel Corporation
Inventor: Chandrasekaran Sakthivel , Prasoonkumar Surti , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Abhishek R. Appu , Nicolas C. Galoppo Von Borries , Joydeep Ray , Narayan Srinivasa , Feng Chen , Ben J. Ashbaugh , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Eriko Nurvitadhi , Balaji Vembu , Altug Koker
Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.
-
-
-
-
-
-
-
-
-